Author: Harrison, Michael A.

Tags: computer science

ISBN: 0-201-02955-3

Год: 1978

Text
[Cover: a drawing of an input tape of symbols scanned by a finite state control.]
MICHAEL A. HARRISON
University of California
Berkeley, California

Introduction to Formal Language Theory

ADDISON-WESLEY PUBLISHING COMPANY
Reading, Massachusetts · Menlo Park, California · London · Amsterdam · Don Mills, Ontario · Sydney
This book is in the ADDISON-WESLEY SERIES IN COMPUTER SCIENCE
Consulting editor: Michael A. Harrison

Copyright © 1978 by Addison-Wesley Publishing Company, Inc. Philippines copyright 1978 by Addison-Wesley Publishing Company, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada. Library of Congress Catalog Card No. 77-81196.

ISBN 0-201-02955-3
ABCDEFGHIJK-MA-798
To Susan and Craig
Preface

Formal language theory was first developed in the mid 1950's in an attempt to develop theories of natural language acquisition. It was soon realized that this theory (particularly the context-free portion) was quite relevant to the artificial languages that had originated in computer science. Since those days, the theory of formal languages has been developed extensively, and has several discernible trends, which include applications to the syntactic analysis of programming languages, program schemes, models of biological systems, and relationships with natural languages. In addition, basic research into the abstract properties of families of languages has continued. The primary goal of this book is to give an account of the basic theory. My personal interests are in those applications of the subject to syntactic analysis, and some material on that subject is included. The temptation to write about Lindenmayer systems, program schemes, AFL's, and natural-language processing has been resisted. There are now specialized books available on those topics.

This book is written for advanced undergraduates or graduate students. It is assumed that the student has previously studied finite automata, Turing machines, and computer programming. Experience has shown that the material in this book is easily accessible to mathematics students who do not know any computer science. Although no single academic course is prerequisite to the study of the material in this book, it has been my experience that students who have had no background in constructing proofs do have difficulty with the material. Any junior or senior who is majoring in computer science should be able to read the book.

One of the problems facing any person teaching formal language theory is how to present the results. The proofs in language theory are often constructive and the
verifications that the constructions work can be long. One approach is to give the intuitive idea and omit a formal proof. This leads to an entertaining course, but some students are unable to carry out such proofs and soon learn to claim all sorts of false results. On the other hand, if one attempts to prove everything thoroughly, the course goes slowly and becomes one interminable induction proof after another. The few students who survive such a course are capable of doing research and writing proofs. The presentation in this book is designed to help the instructor to strike a compromise between the two extremes. When I teach the course, I try to give the intuitive versions of the constructions and proofs in class and to depend on the text for the details. There are a lot of exercises, particularly of the "prove something" variety, in the text, and I assign a good many of these when I teach the material. The student who wants to be a specialist can become well versed in all the techniques; yet it is still possible to accommodate students who have a more casual interest in the subject. A few sections are marked with an asterisk to indicate that they may be skimmed on a first reading. Just because a section is "starred" does not mean it is difficult. It means that it can be omitted if an instructor becomes pressed for time.

There is enough material in the book for a two-course sequence. In a one-quarter introductory course, the following sections should be covered:

   Chapter 1, Sections 1.1, 1.2, 1.4, 1.5, 1.6.
   Chapter 2, Section 2.5 if the students are well prepared. If not, everything.
   Chapter 3, Sections 3.1-3.4.
   Chapter 4, Sections 4.1-4.4, 4.6.
   Chapter 5, Sections 5.1-5.4.
   Chapter 6, Sections 6.1, 6.2, 6.4, 6.5, 6.7.
   Chapter 7, Sections 7.1, 7.2.
   Chapter 8, Sections 8.1-8.4.
   Chapter 9.
   Chapter 12, Sections 12.1, 12.2, 12.4, 12.6, 12.7.

If an instructor has additional time, a good strategy would be to choose additional material from the non-starred sections. Sometimes a proof will depend on an earlier exercise. However, all the proofs needed to develop basic results are given in the book; consequently, the instructor can feel confident that he or she will not run into an unpleasant surprise while preparing a lecture. Not all of the exercises are straightforward, and particularly difficult ones are starred.

An attempt has been made to credit results to their authors in the historical summaries at the end of each chapter. There are some cases in which priority is in doubt, and then the first published paper or thesis is referenced. The literature on formal language theory is extensive and I have limited the bibliography to papers whose results have been used or which I have read and found interesting and/or useful. No attempt has been made to include papers on program schemes or biologically motivated automata theory. Some papers on AFL theory and Turing complexity have been included. I apologize in advance to those who may feel that their work was inadequately referenced, and hope they understand the limitations of both space and time.
I have had a great deal of help in the preparation of this book. Most of this assistance came from students who took notes, read drafts, corrected errors, or constructed improved proofs. I want to thank them all, but especially Douglas J. Albert, John C. Beatty, John Foderaro, Matthew M. Geller, Ivan M. Havel, Robert Henry, Richard B. Hull, William N. Joy, Kimberly N. King, Arnaldo Moura, John D. Revelle, Walter L. Ruzzo, Donald Sannella, Barbara Simons, Neil Wagoner, Timothy C. Winkler, Paul Winsburg, and Amiram Yehudai. The continuous support of the National Science Foundation is gratefully acknowledged.

I want to express my deep appreciation to the many people at Addison-Wesley who assisted in the preparation of the book. Special thanks must go to Mary A. Cafarella, who edited the manuscript, and to William B. Gruener, who is the finest Computer Science editor that I have ever known. The special assistance of Ralph L. Coffman, Anthony M. Harrison, John E. Hopcroft, and Joan M. Sautter made the whole project even more worthwhile. Susan L. Graham taught from a draft of the manuscript and her suggestions were very helpful. Ruth Suzuki typed the manuscript and course notes. Somehow she converted a nearly illegible handwritten manuscript into beautifully typed notes so that student feedback was possible, all in realtime. Her contribution is greatly appreciated. Both Dorian Bikle and Caryn Dombroski helped with other aspects of the production. Caryn also drew the cover design, which is based on a pun by Susan L. Graham.

Berkeley
May 1978
Michael A. Harrison
Contents

1  FOUNDATIONS OF LANGUAGE THEORY - STRINGS AND HOW TO GENERATE THEM
   1.1   Introduction
   1.2   Basic definitions and notation
   1.3*  Σ* revisited - A combinatorial view
   1.4   Phrase-structure grammars
   1.5   Other families of grammars - The Chomsky hierarchy
   1.6   Context-free grammars and trees
   1.7*  Ramsey's theorem
   1.8*  Nonrepetitive words
   1.9   The Lagrange inversion formula and its application
   1.10  Historical survey

2  FINITE AUTOMATA AND LINEAR GRAMMARS
   2.1  Introduction
   2.2  Elementary finite automata theory - Ultimately periodic sets
   2.3  Transition systems
   2.4  Kleene's theorem
   2.5  Right linear grammars and languages
   2.6  Two-way finite automata
   2.7  Historical survey

3  SOME BASIC PROPERTIES OF CONTEXT-FREE LANGUAGES
   3.1  Introduction
   3.2  Reduced grammars
   3.3  A technical lemma and some applications to closure properties
   3.4  The substitution theorem
   3.5  Regular sets and self-embedding grammars
   3.6  Historical survey

4  NORMAL FORMS FOR CONTEXT-FREE GRAMMARS
   4.1   Introduction
   4.2   Norms, complexity issues, and reduction revisited
   4.3   Elimination of null and chain rules
   4.4   Chomsky normal form and its variants
   4.5   Invertible grammars
   4.6   Greibach normal form
   4.7   2-standard form
   4.8   Operator grammars
   4.9   Formal power series and Greibach normal form
   4.10  Historical survey

5  PUSHDOWN AUTOMATA
   5.1  Introduction
   5.2  Basic definitions
   5.3  Equivalent forms of acceptance
   5.4  Characterization theorems
   5.5  Λ-moves in pda's and the size of a pda
   5.6  Deterministic languages closed under complementation
   5.7  Finite-turn pda's
   5.8  Historical survey

6  THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS
   6.1   Introduction
   6.2   The iteration theorem
   6.3   Context-free languages over a one-letter alphabet
   6.4   Regular sets and sequential transducers
   6.5   Applications of the sequential-transducer theorem
   6.6*  Partial orders on Σ*
   6.7   Programming languages are not context-free
   6.8*  The proper containment of the linear context-free languages in the context-free languages
   6.9   Semilinear sets and Parikh's theorem
   6.10  Historical survey

7  AMBIGUITY AND INHERENT AMBIGUITY
   7.1   Introduction
   7.2   Inherently ambiguous languages exist
   7.3   The degree of ambiguity
   7.4*  The preservation of unambiguity and inherent ambiguity by operations
   7.5   Historical survey

8  DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS
   8.1   Introduction
   8.2   Solvable decision problems for context-free grammars and languages
   8.3   Turing machines, the Post Correspondence Problem, and context-free languages
   8.4   Basic unsolvable problems for context-free grammars and languages
   8.5*  A powerful metatheorem
   8.6   Historical survey

9  CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES
   9.1  Introduction
   9.2  Various types of phrase-structure grammars
   9.3  Elementary decidable and closure properties of context-sensitive languages
   9.4  Time- and tape-bounded Turing machines
   9.5  Machine characterizations of the context-sensitive and phrase-structure languages
   9.6  Historical survey

10  REPRESENTATION THEOREMS FOR LANGUAGES
   10.1   Introduction
   10.2*  Baker's theorem
   10.3   Deterministic languages generate recursively enumerable sets
   10.4   The semi-Dyck set and context-free languages
   10.5   Context-free languages and inverse homomorphisms
   10.6   Historical survey

11  BASIC THEORY OF DETERMINISTIC LANGUAGES
   11.1   Introduction
   11.2   Closure of deterministic languages under right quotient by regular sets
   11.3   Closure and nonclosure properties of deterministic languages
   11.4   Strict deterministic grammars
   11.5   Deterministic pushdown automata and their relation to strict deterministic languages
   11.6*  A hierarchy of strict deterministic languages
   11.7   Deterministic languages and realtime computation
   11.8*  Structure of trees in strict deterministic grammars
   11.9   Introduction to simple machines, grammars, and languages
   11.10  The equivalence problem for simple languages
   11.11  Historical survey

12  RECOGNITION AND PARSING OF GENERAL CONTEXT-FREE LANGUAGES
   12.1   Introduction
   12.2   Mathematical models of computers and a hardest context-free language
   12.3   Matrix multiplication, Boolean matrix multiplication, and transitive closure algorithms
   12.4   The Cocke-Kasami-Younger algorithm
   12.5*  Valiant's algorithm
   12.6   A good practical algorithm
   12.7   General bounds on time and space for language recognition
   12.8   Historical survey

13  LR(k) GRAMMARS AND LANGUAGES
   13.1  Introduction
   13.2  Basic definitions and properties of LR(k) grammars
   13.3  Characterization of LR(0) languages
   13.4  LR-style parsers
   13.5  Constructing sets of LR(k) items
   13.6  Consistency and parser construction
   13.7  The relationship between LR languages and Δ₀
   13.8  Historical survey

REFERENCES
INDEX
one

Foundations of Language Theory - Strings and How to Generate Them

1.1 INTRODUCTION

Formal language theory concerns itself with sets of strings called "languages" and different mechanisms for generating and recognizing them. Certain finitary processes for generating these sets are called "grammars." A mathematical theory of such objects was proposed in the latter 1950's and has been extensively developed since that time for application both to natural languages and to computer languages. The aim of this book is to give a fairly complete account of the basic theory and to go beyond it somewhat into its applications to computer science.

In the present chapter, we begin in Section 1.2 by giving the basic string-theoretic notation and concepts that will be used throughout. Section 1.3 gives a number of useful results for dealing with strings. Many of those results will not be needed until later, and so the section has an asterisk after its number. Sections which are starred in this manner may be read lightly. When references to a theorem from that section occur later, the reader is advised to return to the starred section and read it more thoroughly.

Section 1.4 introduces the family of phrase-structure grammars, which are fundamental objects of our study. A number of examples from natural language and programming languages help to motivate the study. Section 1.5 introduces many other families of grammars and indicates intuitively how their languages may also be characterized by different types of automata. Section 1.6 is devoted to trees and their relation to context-free grammars. The intuition developed in that section will be important in subsequent chapters.
Section 1.7 is devoted to Ramsey's Theorem, which is a deep and very general combinatorial theorem. It is used to show relationships that must hold in any sufficiently long word. Section 1.8 is devoted to showing how to construct words with no repeated subwords. Section 1.9 proves a classical theorem of analysis due to Lagrange. In later sections, this can be applied to power series generated by context-free grammars and can be used to solve for the "degree of ambiguity."

1.2 BASIC DEFINITIONS AND NOTATION

A formalism is required to deal with strings and sets of strings. Let us fix a finite nonempty set Σ called an alphabet. The elements of Σ are assumed to be indivisible symbols.

Examples  Σ = {0, 1} or Σ = {a, b}. In certain applications to compiling, we might have an alphabet which contains begin, end. These are assumed to be indivisible symbols in the applications. A set X = {1, 10, 100} is not an alphabet, since the elements of X are made up of other basic symbols.

Definition  A word (or string) over an alphabet Σ is a finite-length Σ-sequence. A typical word can be written as x = a₁ ⋯ aₖ, k ≥ 0, aᵢ ∈ Σ for 1 ≤ i ≤ k. We allow k = 0, which gives the null (or empty) word, which is denoted by Λ. k is called the length of x, written lg(x), and is the number of occurrences of letters of Σ in x.

Let Σ* denote the set of all finite-length Σ-sequences. Let x and y be in Σ*. Define a binary operation of concatenation, where we form the new word xy.

Example  Let Σ = {0, 1} and

   x = 010,   y = 1.

Then xy = 0101 and yx = 1010.

The following proposition summarizes the basic properties of concatenation.

Proposition 1.2.1  Let Σ be an alphabet.
a) Concatenation is associative; i.e., for each x, y, z in Σ*, x(yz) = (xy)z.
b) Λ is a two-sided identity for Σ*; i.e., for each x ∈ Σ*, xΛ = Λx = x.
c) Σ* is a monoid† under concatenation and Λ is the identity element.
d) Σ* has left and right cancellation. For each x, y, z ∈ Σ*,
   i) zx = zy implies x = y, and
   ii) xz = yz implies x = y.
e) For any x, y ∈ Σ*, lg(xy) = lg(x) + lg(y).

It is necessary to extend our operations on strings to sets. Let X, Y be sets of words. Hereafter, one writes X, Y ⊆ Σ*. Then the product of two sets is:

   XY = {xy | x ∈ X, y ∈ Y}.

Exponent notation can be used in an advantageous fashion for sets. Let X ⊆ Σ*. Define X^0 = {Λ} and X^(i+1) = X^i X for all i ≥ 0. Define the monoid (or Kleene) closure of a set X ⊆ Σ* by

   X* = ∪_{i≥0} X^i,

and the semigroup closure by

   X⁺ = X*X = ∪_{i≥1} X^i.

Examples  Let Σ = {0, 1}. In general, Σ^i = {x ∈ Σ* | lg(x) = i}. Thus Σ* = (Σ)*.‡ Note that Σ⁺ = Σ* − {Λ}. Also, if X = {Λ, 01} and Y = {010, 0}, then XY = {0, 010, 01010}.

Proposition 1.2.2  For each X ⊆ Σ*,
a) for each i ≥ 0, X^i ⊆ X*;
b) for each i ≥ 1, X^i ⊆ X⁺;
c) Λ ∈ X*;
d) Λ ∈ X⁺ if and only if Λ ∈ X.

† A semigroup consists of a set S with a binary associative operation · defined on S. A monoid is a semigroup which possesses a two-sided identity. The set of functions on a set is a monoid under the operation of functional composition.
‡ The set of all Σ-words can be obtained by applying the star operation to Σ.
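The set operations just defined are easy to experiment with. The following sketch is our own illustration (the function names are not from the text): it computes the product XY, the powers X^i, and the members of X* up to a given length, since X* itself is infinite whenever X contains a nonempty word.

```python
from itertools import product as cartesian

def lang_product(X, Y):
    """XY = {xy | x in X, y in Y}."""
    return {x + y for x, y in cartesian(X, Y)}

def lang_power(X, i):
    """X^0 = {Lambda}; X^(i+1) = (X^i)X. Lambda is the empty string."""
    P = {""}
    for _ in range(i):
        P = lang_product(P, X)
    return P

def lang_star_bounded(X, max_len):
    """The words of X* of length <= max_len (X* itself is usually infinite)."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in lang_product(frontier, X)
                    if len(w) <= max_len} - result
        result |= frontier
    return result

X, Y = {"", "01"}, {"010", "0"}
assert lang_product(X, Y) == {"0", "010", "01010"}             # example above
assert lang_power({"0", "1"}, 2) == {"00", "01", "10", "11"}   # Sigma^2
assert lang_star_bounded({"0"}, 3) == {"", "0", "00", "000"}
```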
The product, exponent, and star notation has a counterpart in the monoid of binary relations on a set, which we will mention.

Definition  Let ρ ⊆ X × Y and σ ⊆ Y × Z be binary relations. The composition of ρ and σ is:

   ρσ = {(x, z) | (x, y) ∈ ρ and (y, z) ∈ σ for some y ∈ Y} ⊆ X × Z.

If we specialize to binary relations on a set (that is, ρ ⊆ X × X), we can define ρ^0 = {(x, x) | x ∈ X} (the equality or diagonal relation). For each i ≥ 0, ρ^(i+1) = ρ^i ρ. The reflexive-transitive closure of ρ is:

   ρ* = ∪_{i≥0} ρ^i,

while the transitive closure of ρ is:

   ρ⁺ = ρ*ρ = ∪_{i≥1} ρ^i.

It is easy to verify the following property of ρ*, which is usually taken as the definition of reflexive-transitive closure.

Fact  Let ρ be a binary relation on X and let x, x′ be in X. Then (x, x′) ∈ ρ* if and only if there exist r ≥ 0 and z₀, ..., zᵣ in X such that:
i) x = z₀;
ii) zᵣ = x′;
iii) (zᵢ, zᵢ₊₁) ∈ ρ for each 0 ≤ i < r.

A similar statement holds for transitive closures. Just replace "*" by "+" and "r ≥ 0" by "r ≥ 1."

Another useful operation on strings is transposing or reversing them. That is, if x = 011, then define xᵀ = 110. A more formal definition, useful in proofs, is now given.

Definition  The transpose operator is defined on strings in Σ* as follows:

   Λᵀ = Λ;
   for any x in Σ*, a in Σ, (xa)ᵀ = a(xᵀ).

The notation is extended to sets in Σ* "pointwise": Xᵀ = {xᵀ | x ∈ X}.
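Returning to relations for a moment: when the carrier set is finite, the closures ρ⁺ and ρ* can be computed by iterating composition to a fixed point, which is exactly the union ∪ᵢ ρ^i. A minimal sketch of ours, not from the text:

```python
def compose(rho, sigma):
    """Composition: pairs (x, z) with (x, y) in rho and (y, z) in sigma."""
    return {(x, z) for (x, y1) in rho for (y2, z) in sigma if y1 == y2}

def transitive_closure(rho):
    """rho+ = union of rho^i, i >= 1; terminates since rho is finite."""
    closure = set(rho)
    while True:
        new = compose(closure, rho) - closure
        if not new:
            return closure
        closure |= new

def reflexive_transitive_closure(rho, X):
    """rho* over the carrier X: rho+ plus the diagonal rho^0."""
    return transitive_closure(rho) | {(x, x) for x in X}

rho = {(1, 2), (2, 3)}
assert transitive_closure(rho) == {(1, 2), (2, 3), (1, 3)}
assert (1, 1) in reflexive_transitive_closure(rho, {1, 2, 3})
assert "011"[::-1] == "110"       # the transpose operation on strings
```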
Examples  Let Σ = {a, b} and L = ab* = {ab^i | i ≥ 0}. Then Lᵀ = b*a = {b^i a | i ≥ 0}. Also, the following sequence of containments is a good test of familiarity with the notation:

   {a^i b^i | i ≥ 0} ⊆ a*b* = {a^i b^j | i, j ≥ 0} ⊆ {a, b}*;
   P = {wwᵀ | w ∈ {a, b}*}.

P is the set of even-length "palindromes." A palindrome is a string that is the same whether written forwards or backwards. We conclude this section by giving some natural-language examples:†

   RADAR
   ABLE WAS I ERE I SAW ELBA
   MADAM IM ADAM
   A MAN A PLAN A CANAL PANAMA

† The spaces between words are included only for clarity in the last two examples.

1.3* Σ* REVISITED - A COMBINATORIAL VIEW

In this section, we shall take a brief, more algebraic look at the monoid Σ*. We begin by recalling some concepts from semigroup theory. If we have some semigroup S, let T be a subset of S (not necessarily a subsemigroup). T is contained in at least one subsemigroup, namely S. Let T⁺ be the least subsemigroup containing T (i.e., the intersection of all subsemigroups of S containing T). We say that T⁺ is the semigroup generated by T. If T⁺ = S, we say that T is a set of generators for S. Similar remarks hold for monoids, except that we would write T* instead of T⁺. There is another important type of generation.

Definition  Let T be a set of generators of a semigroup (monoid) S. S is free over T if each element of S has a unique representation as a product of elements from T.

Fact  Σ* is free over Σ. Thus if we have x, y ∈ Σ*, where x = a₁ ⋯ aₘ, aᵢ ∈ Σ, m ≥ 0, and y = b₁ ⋯ bₙ, bᵢ ∈ Σ, n ≥ 0, then x = y if and only if m = n and aᵢ = bᵢ for all i, 1 ≤ i ≤ m = n.

There is an important but simple theorem that lies at the root of the study of equations in Σ*.

Theorem 1.3.1 (Levi)  Let v, w, x, and y be words in Σ*. If vw = xy, then:
i) if lg(v) > lg(x), then there exists a unique word z ∈ Σ* such that v = xz and y = zw;
ii) if lg(v) = lg(x), then v = x and w = y;
iii) if lg(v) < lg(x), then there exists a unique word z ∈ Σ* such that x = vz and w = zy.

Proof  The most intuitive way to deal with general problems of this type is to draw a picture of the strings, as follows:

CASE i)

   |------ v ------|--- w ---|
   |--- x ---|- z -|---- y ----|

From the picture, it is clear that the desired relationships hold. The proof follows by considering whether the line between x and y in the second row of the picture goes to the left or right of the line between v and w. A formal proof of the result would proceed by an induction on lg(vw), and is omitted.  □

Definition  Let w and y be in Σ*. Write w ≤ y, w is a prefix of y, if there exists y′ in Σ* so that y = wy′. Also, w is a suffix of y if y = y′w for some y′ ∈ Σ*.

This allows us to state a corollary to Theorem 1.3.1.

Corollary  Let v, w, x, and y be in Σ*. If vw = xy, then x ≤ v or v ≤ x.

As an example of the usefulness of these little results, we prove the following easy proposition.

Proposition 1.3.1  Let w, x, y, and z be in Σ*. If w ≤ y and x ≤ yz, then either w ≤ x or x ≤ w.

Proof  w ≤ y and x ≤ yz imply that there exist y′ and x′ so that

   y = wy′      (1.3.1)
   yz = xx′     (1.3.2)

Thus (1.3.1) and (1.3.2) imply that yz = wy′z = xx′, but w(y′z) = xx′ implies that x ≤ w or w ≤ x.  □
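Levi's theorem is effectively constructive: given two factorizations vw = xy of the same word, the overlap z can simply be read off by comparing lengths. A small illustrative sketch (ours, not from the text):

```python
def levi(v, w, x, y):
    """Return the word z of Theorem 1.3.1 for a factorization vw = xy."""
    assert v + w == x + y, "Levi's theorem applies only when vw = xy"
    if len(v) >= len(x):
        z = v[len(x):]                 # cases (i)/(ii): v = xz and y = zw
        assert v == x + z and y == z + w
    else:
        z = x[len(v):]                 # case (iii): x = vz and w = zy
        assert x == v + z and w == z + y
    return z

assert levi("aba", "b", "ab", "ab") == "a"   # aba.b = ab.ab, overlap "a"
assert levi("ab", "cd", "ab", "cd") == ""    # equal lengths: z is empty
```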
A more interesting and useful result is the following.

Theorem 1.3.2  Let y ∈ Σ* and x, z ∈ Σ⁺ such that xy = yz. Then there exist u, v ∈ Σ* and p ≥ 0 such that x = uv, z = vu, and y = (uv)^p u = u(vu)^p.

Proof  We induct on the length of y.

Basis. If lg(y) = 0, we have u = Λ = y, v = x = z, and p = 0.

Induction step. There are two cases, depending on the comparative lengths of x and y.

CASE 1. lg(x) ≥ lg(y). By Theorem 1.3.1 applied to xy = yz, we know that there is v ∈ Σ* so that

   x = yv   and   z = vy.

We may choose p = 0 and u = y, and the result is proven. Note that if lg(x) = lg(y), then v = Λ, but the proof is still valid.

CASE 2. lg(x) < lg(y). Again by Theorem 1.3.1, there exists w ∈ Σ* so that

   y = xw   and   y = wz,

and lg(w) < lg(y). Since we have xw = wz with lg(w) < lg(y), we can assert, by the induction hypothesis, that there exist u, v ∈ Σ* and p ≥ 0 so that

   x = uv,   z = vu,   and   w = (uv)^p u.

From the above, we have

   y = xw = uvw = uv(uv)^p u = (uv)^(p+1) u.

The induction has been extended.  □

Now we approach a very useful theorem which has many applications. The sufficient condition is particularly valuable.

Theorem 1.3.3  Let u, v ∈ Σ*. Then u = w^m and v = w^n for some w ∈ Σ*, m, n ≥ 0, if and only if there exist p, q > 0 so that u^p and v^q contain a common prefix (suffix) of length lg(u) + lg(v) − (lg(u), lg(v)), where (i, j) denotes the greatest common divisor of i and j.

Proof  Let lg(u) = s and lg(v) = t, and write

   u = a₀ ⋯ aₛ₋₁,   v = b₀ ⋯ bₜ₋₁,

with aᵢ, bⱼ ∈ Σ, and suppose that s ≤ t. Let us first suppose that the second condition is satisfied; that is, we deal with sufficiency first. Suppose that (s, t) = 1.
Claim  aᵢ = b₀ for each 0 ≤ i < s. That is, all the aᵢ are equal, and thus lg(w) = 1.

Proof  We simply begin to pair off the first s + t − 1 letters of u^p and v^q, and we get (indices taken mod s in the first row and mod t in the second):

   i      0    1    2    ⋯    s + t − 2
   u^p    a₀   a₁   a₂   ⋯
   v^q    b₀   b₁   b₂   ⋯

It is clear that aᵢ = bᵢ for each i, 0 ≤ i < s. Moreover, the following is true.

Fact  For each i, 0 ≤ i ≤ s − 1, a_(it mod s) = b₀.

This fact is easily seen to be true for i = 0. From that, the case of i > 0 follows easily from the form of u^p and v^q. Now let us consider the set of letters

   A = {a_(kt mod s) | 0 ≤ k < s}.

Since (s, t) = 1, we have {kt mod s | 0 ≤ k < s} = {0, 1, ..., s − 1}. By this fact, every letter in u is in A and equals b₀. Thus all the letters in u are identical. Therefore u = b₀^s and v = b₀^t, since every letter in v is equal to some letter in u.

To extend the result to the case where d = (s, t) > 1, simply take words of length d and work over Σ^d, using the previous case.

Necessity. Let u = w^m and v = w^n for some w ∈ Σ*; m, n ≥ 0. Let†

   p = ⌈(m + n − (m, n))/m⌉   and   q = ⌈(m + n − (m, n))/n⌉.

Then we have u^p = w^(mp) and v^q = w^(nq).

† The notation ⌈x⌉ denotes the "ceiling" of x, the smallest integer ≥ x.
But mp ≥ m + n − (m, n) and nq ≥ m + n − (m, n), so that u^p and v^q have a common prefix of length

   ≥ lg(w)(m + n − (m, n)) = lg(u) + lg(v) − (lg(u), lg(v)).  □

Corollary  If uv = vu, where u, v ∈ Σ*, then there is w ∈ Σ* so that u = w^m and v = w^n for some m, n ≥ 0.

Now, an application of the previous theorem will be presented.

Theorem 1.3.4  Let X = {x, y} ⊆ Σ⁺ with x ≠ y. Then X⁺ is free if and only if xy ≠ yx.

Proof  If X⁺ is free, then xy ≠ yx. Conversely, suppose xy ≠ yx with Λ ≠ x ≠ y ≠ Λ. Assume, for the sake of contradiction, that there is a nontrivial relation involving x and y, say,

   x^n₁ y^n₂ ⋯ x^n₍ₖ₋₁₎ y^nₖ = y^m₁ x^m₂ ⋯ y^m₍ₛ₋₁₎ x^mₛ    (1.3.3)

with nᵢ, mᵢ > 0 for all i. [Note that (1.3.3) can be assumed to begin (end) with different words by left (right) cancellation.] We may assume that lg(y) ≤ lg(x). This and (1.3.3) imply that y is a prefix of x. Thus,

   x = y^q t₀    (1.3.4)

for some t₀ ∈ Σ*, where q ≥ 1 is the largest power of y that is a prefix of x. Substituting (1.3.4) in (1.3.3) yields:

   y^q t₀ y ⋯ y^nₖ = y^m₁ y^q t₀ ⋯ x^mₛ.

By left cancellation,

   t₀ y ⋯ y^nₖ = y^m₁ t₀ ⋯ x^mₛ.    (1.3.5)

Note that y cannot be a prefix of t₀ because, if it were, the definition of q would be contradicted. By (1.3.5), we know that t₀ is a prefix of y. Hence, y = t₀t₁ for some t₁ ∈ Σ*. From (1.3.3), we must have that y is a suffix of x. [Recall that lg(y) ≤ lg(x).] Since y = t₀t₁ is a suffix of x and lg(y) = lg(t₀t₁), we must have, from Eq. (1.3.4), that:

   y = t₁t₀ = t₀t₁    (1.3.6)
From the Corollary to Theorem 1.3.3 and Eq. (1.3.6), we know that there exists z ∈ Σ* so that

   t₀ = z^p₀,   t₁ = z^p₁,   y = z^p₂

for some p₀, p₁, p₂ ≥ 0 with p₀ + p₁ = p₂ > 0. But now we know from (1.3.4) that

   x = y^q t₀ = (z^p₂)^q z^p₀ = z^(p₂q + p₀).

Since both x and y are powers of z, we know that:

   xy = yx = z^((q+1)p₂ + p₀),

which is a contradiction.  □

In certain applications, it is useful to order strings.

Definition  Let S be any set. A binary relation, to be written ≤, is a partial order on S if it is reflexive, antisymmetric, and transitive; that is, if for each a, b, c in S,
1) a ≤ a;
2) if a ≤ b and b ≤ a, then a = b;
3) if a ≤ b and b ≤ c, then a ≤ c.
Moreover, ≤ is said to be a total order if it is a partial order which satisfies the following additional property: For each a, b ∈ S, either a ≤ b or b ≤ a. (The relation a < b is an abbreviation for a ≤ b and a ≠ b.)

For example, the "prefix relation" is a partial ordering of Σ*. The ordinary relation ≤ on natural numbers is an example of a total order. Now we introduce a mathematical counterpart to a dictionary ordering.

Definition  Let Σ be any alphabet and suppose that Σ is totally ordered by <. Let x = x₁ ⋯ xᵣ, y = y₁ ⋯ yₛ, where r, s ≥ 0, xᵢ, yⱼ ∈ Σ. We say that x ≺ y if
i) xᵢ = yᵢ for 1 ≤ i ≤ r and r < s; or
ii) there exists k ≥ 1 so that xᵢ = yᵢ for 1 ≤ i < k and xₖ < yₖ.
≺ is referred to as lexicographic order.

This definition is not hard to understand: i) says that if x is a proper prefix of y (that is, y ∈ {x}Σ⁺), then x ≺ y; ii) says that if there exist u, v, v′ ∈ Σ*; a, b ∈ Σ so that x = uav, y = ubv′, and a < b, then x ≺ y. Note that if Σ = {a, b, ..., z} and the total order is alphabetical, then the lexicographic order on Σ* is the usual dictionary order.
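The two clauses of the definition of ≺ translate directly into code. In the sketch below (ours; the function name and the choice of base ordering are assumptions for illustration), a word precedes another exactly when it is a proper prefix or wins at the first differing letter; the last assertion previews the failure of right monotonicity shown in Proposition 1.3.4 below.

```python
def lex_less(x, y, alphabet="012abcdefghijklmnopqrstuvwxyz"):
    """True exactly when x precedes y in the lexicographic order."""
    rank = {c: i for i, c in enumerate(alphabet)}
    for xc, yc in zip(x, y):
        if xc != yc:                    # clause (ii): first difference decides
            return rank[xc] < rank[yc]
    return len(x) < len(y)              # clause (i): x is a proper prefix of y

# Over Sigma = {0, 1}: Lambda < 0 < 00 < 000 < 01 < 1 < 11.
chain = ["", "0", "00", "000", "01", "1", "11"]
assert all(lex_less(a, b) for a, b in zip(chain, chain[1:]))
# Right monotonicity fails (Proposition 1.3.4 below): 0 < 01 but 02 > 012.
assert lex_less("0", "01") and not lex_less("02", "012")
```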
Example  Consider Σ = {0, 1}. We begin to enumerate the elements of Σ* in lexicographic order:

   Λ ≺ 0 ≺ 00 ≺ 000 ≺ ⋯ ≺ 01 ≺ ⋯ ≺ 1 ≺ 11 ≺ ⋯

First we justify that ≼ is a total order on Σ*.

Proposition 1.3.2  Let x, y, z ∈ Σ*.
1) If x ≺ y and y ≺ z, then x ≺ z.
2) If x ≼ y and y ≼ x, then x = y.
3) For every x, y ∈ Σ*, either x = y, x ≺ y, or y ≺ x.

This proposition establishes ≼ as a total order on Σ*. Next we establish the connection between concatenation and lexicographic order.

Proposition 1.3.3  Let x, y, z ∈ Σ*. Then x ≺ y if and only if zx ≺ zy.

Next we show that things are different on the right.

Proposition 1.3.4  Concatenation is not "monotone" on the right. That is, there exist x, y, z ∈ Σ* such that x ≺ y but xz ⊀ yz.

Proof  Let Σ = {0, 1, 2}. Choose x = 0, y = 01, and z = 2. Clearly x ≺ y, but xz = 02 ≻ 012 = yz.  □

Now we will state and prove one of the main properties of lexicographic order.

Theorem 1.3.5  Let x, y ∈ Σ* and x ≺ y. Then either
i) x is a prefix of y, or
ii) for any z, z′ ∈ Σ*, xz ≺ yz′.
Proof  Suppose x is not a proper prefix of y. Then, since x ≺ y, there exist u, v, v′ ∈ Σ*; a, b ∈ Σ, so that x = uav, y = ubv′ with a < b. But xz = uavz and yz′ = ubv′z′, so that xz ≺ yz′.  □

PROBLEMS

1  Two words x and z ∈ Σ* are said to be conjugates if there exists y ∈ Σ* so that xy = yz. Show that x and z are conjugate if and only if there exist u, v ∈ Σ* so that x = uv and z = vu.

2  Show that the conjugate relation is an equivalence relation on Σ*.

3  Show that, if x is conjugate to y in Σ*, then x is obtained from y by a circular permutation of the letters of y.

4  Let x, w ∈ Σ*. Show that, if x = w^n for some n ≥ 0, then every conjugate x′ of x is of the form x′ = (w′)^n for some w′ ∈ Σ* which is conjugate to w.

5  Let w, z ∈ Σ*. z is a root of w if w = z^n for some n ≥ 0. The root of w of minimal length is a primitive root of w, and is denoted by p(w). Show that the following proposition is true.

Proposition  Let x, y ∈ Σ⁺. We have xy = yx if and only if p(x) = p(y).

6  Show that, for any w ∈ Σ*, we have p(w^k) = p(w) for any k ≥ 1, where p(w) is the primitive root of w.

7  Prove Proposition 1.3.2.

8  Let ≺ be the lexicographic order on Σ*. Prove that, if y ≺ x ≺ yz, then there exists w ∈ Σ* such that x = yw, where w ≺ z.

9  Let Σ be of finite cardinality n and let ω be any map of Σ onto {1, ..., n}. This provides a standard way of totally ordering Σ. Extend the order to Σ* as follows: Let x = x₁ ⋯ xₖ, xᵢ ∈ Σ for 1 ≤ i ≤ k. Define

   ω(x) = ω(x₁)(n + 1)⁻¹ + ω(x₂)(n + 1)⁻² + ⋯ + ω(xₖ)(n + 1)⁻ᵏ.

For example, if Σ = {0, 1} and ω(0) = 1 and ω(1) = 2, then we have ω(Λ) = 0, ω(00) = 4/9, ω(01) = 5/9, ω(011) = 17/27. Give maximum and minimum values of ω(x) as a function of n and k.

10  Prove that, for any strings x and y, ω(x) < ω(y) if and only if x ≺ y.

11  Suppose we redefined the ω function to be

   ω(x) = ω(x₁)n⁻¹ + ω(x₂)n⁻² + ⋯ + ω(xₖ)n⁻ᵏ.

Would Problem 10 still be valid?

12  Show that Theorem 1.3.3 is best possible by showing that the bound lg(u) + lg(v) − (lg(u), lg(v)) cannot be improved.
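Several of these results and problems invite experiment. By the corollary to Theorem 1.3.3, u and v are powers of a common word exactly when uv = vu, and the primitive root p(w) of Problem 5 can be found by trying each length that divides lg(w). The sketch below is our own illustration, not part of the text:

```python
def primitive_root(w):
    """p(w): the shortest z with w == z^n; p of the empty word is itself."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w == w[:d] * (n // d):
            return w[:d]
    return w                            # reached only for w == ""

def common_power(u, v):
    """A word w with u and v both powers of w, or None if none exists
    (corollary to Theorem 1.3.3: such a w exists iff uv == vu)."""
    if u + v != v + u:
        return None
    return primitive_root(u or v)

assert primitive_root("ababab") == "ab"
assert common_power("abab", "ab") == "ab"
assert common_power("ab", "ba") is None
# Problem 5: xy == yx iff p(x) == p(y), for nonempty x and y.
x, y = "abab", "ababab"
assert (x + y == y + x) == (primitive_root(x) == primitive_root(y))
```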
1.4 PHRASE-STRUCTURE GRAMMARS

There is an underlying framework for describing "grammars." Our intuitive concept of a grammar is that of a (finite) mechanism for producing sets of strings. The following concept has proved to be very useful in linguistics, in programming languages, and even in biology.

Definition  A phrase-structure grammar is a 4-tuple G = (V, Σ, P, S), where:
V is a finite nonempty set called the total vocabulary;
Σ ⊆ V is a finite nonempty set called the terminal alphabet;
S ∈ V − Σ = N is called the start symbol;†
P is a finite set of rules (or productions) of the form α → β, where α ∈ V*NV* and β ∈ V*.

Some examples will help to motivate and clarify these concepts.

Example of natural language  The alphabet Σ would consist of all words in the language. Although large, Σ is finite. N would contain variables which stand for concepts that need to be added to a grammar to understand its structure, for example, <noun phrase> and <verb phrase>. The start symbol would be <sentence> and a typical rule might be:

   <sentence> → <noun phrase> <verb phrase>

Example from programming languages  Here Σ would be all the symbols in the language. For some reasonable language, it might contain A, B, C, ..., Z, :=, begin, end, if, then, etc. The set of variables would contain the start symbol <program>, as well as others like <compound statement>, <block>, etc. A typical rule might be as follows:

   <for statement> → for <control variable> := <for list> do <statement>

Example 1.4.1 (of a more abstract grammar)  Let Σ = {a, b}, N = V − Σ = {A, S}, with the productions:

   S → ab       AS → bSb
   S → aASb     A → Λ
   A → bSb      aASAb → aa

Note that a phrase-structure grammar has very general rules. The only requirement is that the lefthand side has at least one variable. Grammars are made for "rewriting" or "generation," as given by the following definition.

† N is called the set of nonterminals, or the set of variables. For the moment only, variables are being written in angled brackets to distinguish them.
Definition  Let G = (V, Σ, P, S) be a phrase-structure grammar and let α′, β′ ∈ V*. α′ is said to directly generate β′, written α′ ⇒ β′, if there exist α₁, α₂, α, β ∈ V* such that α′ = α₁αα₂, β′ = α₁βα₂, and α → β is in P. We write ⇒* for the reflexive-transitive closure of ⇒.

Example  We use the grammar of Example 1.4.1:

   S ⇒ ab
   S ⇒ aASb ⇒ abSbb ⇒ ababbb    (1.4.1)

so that S ⇒* (ab)²b².

A sequence like (1.4.1) is called a generation or derivation. It is convenient, in displaying derivations, to underline the subword being rewritten, as is done in (1.4.1).

Certain conventions are adopted in the usage of symbols. Capital letters near the beginning of the alphabet are used for elements of V or N. Lower-case letters like a, b, c designate elements of Σ or Σ_Λ = Σ ∪ {Λ}. One uses α, β, γ, ... for elements of V* and u, v, w, ... for elements of Σ*.

Definition  Let G = (V, Σ, P, S) be a phrase-structure grammar. The set of sentential forms of G, written S(G), is the set:

   S(G) = {α ∈ V* | S ⇒* α}.

Intuitively, a sentential form is something derivable from S. Historically, S stood for "sentencehood" and so, potentially, a sentential form could be expanded to a sentence. The set of sentences can be defined as the language generated by the grammar, as is done in the following definition.

Definition  Let G = (V, Σ, P, S) be a phrase-structure grammar. The language generated by G, written L(G), is the set:

   L(G) = S(G) ∩ Σ* = {w ∈ Σ* | S ⇒* w}.

An important concept is expressed when we say that two grammars are "equivalent." Since there are a number of concepts of equivalence which make sense, the term weak equivalence is sometimes used for this concept.

Definition  If G and G′ are phrase-structure grammars, then G is said to be (weakly) equivalent to G′ if L(G) = L(G′).

Let us work more closely with a grammar in order to get a feeling for what is involved. The example will show that these constructs do possess a certain complexity. Let G = (V, Σ, P, S), where Σ = {0, 1}, V = Σ ∪ {A, B, C, D, E, S}.
The rules of P are listed in Table 1.1, with numbers and explanatory comments.

TABLE 1.1

   Number   Rule         Comments
   1        S → ABC      start
   2        AB → 0AD     add a 0
   3        AB → 1AE     add a 1
   4        DC → B0C     drop a 0
   5        EC → B1C     drop a 1
   6        D0 → 0D      skip right remembering a 0
   7        D1 → 1D
   8        E0 → 0E      skip right remembering a 1
   9        E1 → 1E
   10       AB → Λ       terminate
   11       C → Λ
   12       0B → B0      move B left to continue
   13       1B → B1

What is L(G) for this grammar? At this point, the answer is not clear, and even the intuitive comments given in the grammatical description do not help very much. It turns out that

   L(G) = {xx | x ∈ {0, 1}*}.    (1.4.2)

Even knowing the answer is of little help, since we must prove it. Proofs of this kind are important because we must verify that:
i) xx is in L(G) for each x ∈ {0, 1}*;
ii) every string in L(G) is of the form xx, where x ∈ {0, 1}*.

Step (i) is not too hard in general. Step (ii) is necessary for making sure that a construction is correct.

Let us now work out some of the propositions needed to prove (1.4.2). Consider derivations that start from the string xABxC, where x ∈ {0, 1}*. We will determine all the ways in which the derivation can continue and ultimately be able to generate a sentence. First of all, these considerations suffice to deal with (1.4.2), since ABC = ΛABΛC. Now consider

   xABxC    (1.4.3)

How shall we continue?

CASE 1. Apply production 10 to (1.4.3). We get:

   xABxC ⇒ xxC    (1.4.4)
Our only choice is to use production 11, to get xx ∈ L(G). This is fine, since we need this string in the language.

CASE 2. Apply production 11 to (1.4.3). We get:

   xABxC ⇒ xABx

We could again apply production 10, but that would reproduce Case 1. So we'll use one of productions 2 or 3, say 2. Then we get:

   xABx ⇒ x0ADx

Now rules 6 and 7 are applicable (depending on the structure of x). There is no way that the D can ever be made to go away, because it will never find a C with which to "collaborate" (as in rule 4). Thus the initial choice of rule 11 leads us to block. (The argument would be the same if production 3 instead of production 2 had been employed.)

CASE 3. Apply production 2 to (1.4.3). (There is an analogous argument for production 3 by the symmetry of the rules.)

   xABxC ⇒ x0ADxC
         ⇒* x0AxDC     rules 6 and 7
         ⇒ x0AxB0C     drop a 0
         ⇒* x0ABx0C    continue

In this case, a string was found which is one longer than x and of the same form. The cases we have studied exhaust all possible rewritings of xABxC, and it now follows that (1.4.2) is true.  □

Doing these proofs is important because it helps to prevent errors. In the example, note that rules 12 and 13 occur after the termination rules. The reason is that the grammar was designed first. When the proof was carried out, it became apparent that omissions had been made. Writing a grammar for a problem is very much like composing a program (in a primitive programming language). The techniques for writing an understandable or structured program are even more useful here and should be employed. A mechanical check of a grammar like this one is sketched below, after the problem.

PROBLEM

1  Give a phrase-structure grammar that generates the set

   L = {a^i b^j a^i b^j | i, j ≥ 1}.

Explain clearly how the grammar is supposed to work, and also prove that it generates L; i.e., every string in L(G) is in L and every string in L is in L(G).
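As promised above, a case analysis like this one can be double-checked mechanically. The sketch below is our own illustration: it encodes Table 1.1, enumerates sentential forms of G breadth-first up to a length bound, and confirms that every terminal string reached has the form xx.

```python
from collections import deque

# Table 1.1 as (left, right) pairs; "" plays the role of Lambda.
RULES = [("S", "ABC"), ("AB", "0AD"), ("AB", "1AE"),
         ("DC", "B0C"), ("EC", "B1C"),
         ("D0", "0D"), ("D1", "1D"), ("E0", "0E"), ("E1", "1E"),
         ("AB", ""), ("C", ""), ("0B", "B0"), ("1B", "B1")]

def successors(form):
    """All beta' with form => beta' (rewrite one occurrence of a left side)."""
    for left, right in RULES:
        start = form.find(left)
        while start != -1:
            yield form[:start] + right + form[start + len(left):]
            start = form.find(left, start + 1)

def generated_words(max_form_len):
    """Terminal strings derivable via sentential forms of bounded length."""
    seen, queue, words = {"S"}, deque(["S"]), set()
    while queue:
        form = queue.popleft()
        for nxt in successors(form):
            if len(nxt) <= max_form_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
                if all(c in "01" for c in nxt):
                    words.add(nxt)
    return words

words = generated_words(12)
assert all(w == w[: len(w) // 2] * 2 for w in words)   # every word is xx
assert {"", "00", "0101"} <= words                     # a few expected members
```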
1.5 OTHER FAMILIES OF GRAMMARS - THE CHOMSKY HIERARCHY

We shall now survey some of the other families of grammars which are known, and characterize some of them by automata. This will be a somewhat informal survey. In later chapters, these devices will be studied in great detail and the actual characterizations will be proved. First, we consider some variations on phrase-structure grammars.

Definition  A phrase-structure grammar G = (V, Σ, P, S) is said to be of type 0 if each rule is of the form α → β, where α ∈ N⁺ and β ∈ V*.

We shall see that type 0 and phrase-structure grammars are equivalent. This is, in fact, a general theme that will run throughout the book. For a given family of languages ℒ, we will be interested in very powerful types of grammars that generate ℒ. Moreover, we are very interested in the weakest family of grammars that also generates ℒ. When we have some particular set L and wish to show it is in ℒ, we use a powerful type of grammar. On the other hand, if we wish to show that some given set L′ is not in ℒ, then it is convenient to assume it is in ℒ and is generated by a very constrained type of grammar. It then becomes much easier to find a contradiction.

Definition  A phrase-structure grammar G = (V, Σ, P, S) is said to be context-sensitive with erasing if each rule is of the form

   αAγ → αβγ

where A ∈ N and α, β, γ ∈ V*.

Examples  Consider the grammar G₁ given by the following rules:

   S → ABC
   A → a
   aB → b
   C → c

This grammar is neither type 0 nor context-sensitive with erasing, but it is phrase-structure. Of course, every type 0 grammar and every context-sensitive with erasing grammar must be phrase-structure. Now consider grammar G₂ shown below:

   S → AB
   A → a
   AB → BA
   B → b

G₂ is type 0 but not context-sensitive with erasing. It is easy to modify G₁ to obtain a grammar G₃ which is context-sensitive with erasing but not type 0. This is left to the reader. The next definition will lead to a different family of languages.
Definition  A phrase-structure grammar G = (V, Σ, P, S) is context-sensitive if each rule is of the form
i) αAγ → αβγ, where A ∈ N; α, γ ∈ V*, β ∈ V⁺; or
ii) S → Λ. If this rule occurs, then S does not appear in the righthand side of any rule.

The motivation for the term context-sensitive comes from rules like

   αAγ → αβγ

The idea is that A is rewritten by β only if it occurs in the context of α on the left and γ on the right. The purpose of the restriction β ∈ V⁺ is to ensure that, when A is rewritten, it has an effect. But this condition alone would rule out Λ from being a sentence. For certain technical reasons and for certain applications, we would like to include Λ. This leads to condition (ii) in the definition. We must constrain the appearance of S in the rules; otherwise erasing could be simulated by a construction using rules αAγ → αSγ and S → Λ, to yield

   αAγ ⇒ αSγ ⇒ αγ

Now we give a definition of context-free grammars, which will be one of the main topics of this book.

Definition  A phrase-structure grammar G = (V, Σ, P, S) is a context-free grammar if each rule is of the form A → α, where A ∈ N, α ∈ V*.

The term "context-free" means that A can be replaced by α wherever it appears, no matter what the context. Some authors give an alternative definition in which α ∈ V⁺, which prohibits Λ rules. The presence of Λ rules in our definition must be accounted for eventually, and indeed it will be shown that, for each language L generated by a context-free grammar with Λ rules, there exists a Λ-free, context-free grammar that generates L − {Λ}. In the meantime, Λ rules will play an important role in proofs, since they frequently give rise to special cases that must be carefully checked. Because Λ rules do cause technical problems, their omission is a common suggestion of novices and a temptation to specialists. Systematic omission of Λ rules changes the class of theorems one can prove. We shall include them.

There is an alternative form of context-free grammar which is used in the specification of programming languages. This system is called the Backus normal form or Backus-Naur form; in either case it is commonly abbreviated BNF. The form uses four meta characters which are not in the working vocabulary. These are

   <   >   ::=   |
The idea is that strings (which do not contain the meta characters) are enclosed by < and > and denote variables. The symbol ::= serves as a replacement operator just like →, and | is read "or."

Example  An ordinary context-free grammar for unsigned digits in a programming language might be as follows, where D stands for the class of digits and U stands for the class of unsigned integers:

   D → 0    D → 5
   D → 1    D → 6
   D → 2    D → 7
   D → 3    D → 8
   D → 4    D → 9
   U → D
   U → UD

The example, when rewritten in BNF, becomes

   <digit> ::= 0|1|2|3|4|5|6|7|8|9
   <unsigned integer> ::= <digit> | <unsigned integer> <digit>

In this book, we will adopt a blend of the two notations. We shall use | but not the other metacharacters. This allows for the compact representation of context-free grammars. There are a few special classes of context-free grammars which will now be singled out.

Definition  A context-free grammar G = (V, Σ, P, S) is linear if each rule is of the form

   A → uBv   or   A → u,

where A, B ∈ N and u, v ∈ Σ*.

In a linear grammar, there is at most one variable on the right side of each rule. If the one variable that appears is on the right end of the word, the grammar is called "right linear." A left linear grammar is defined in an analogous fashion.

Definition  A context-free grammar G = (V, Σ, P, S) is right linear if each rule is of the form

   A → uB,   A, B ∈ N,
or
   A → u,    u ∈ Σ*.

Examples

   S → aSa | bSb | Λ

is a linear context-free grammar for the even-length palindromes; that is, L(G) = {wwᵀ | w ∈ {a, b}*}. The grammar G′ given by

   S → aS | a
is a right linear grammar which generates L(G′) = a⁺.

Any of these families of grammars gives rise to a family of languages. The following definition makes the connection more precise.

Definition  A language L is said to be of type X (e.g., phrase-structure, context-sensitive, etc.) if there exists a type-X grammar G such that L(G) = L.

For any language, there will be infinitely many grammars generating it.

Example  Let G₁ be the following grammar:

   S → aSa
   S → aa
   S → a

Clearly, L(G₁) = a⁺. This example shows that a⁺ is a linear language, but we know still more. The previous example establishes that a⁺ is a right linear language.

A great deal is known about these classes of languages and grammars. In the rest of this section, we shall give an outline. To summarize the material, we use Table 1.2. The undefined terms in the table will be explained subsequently.

TABLE 1.2  The Chomsky Hierarchy

   Grammars                          Languages                 Automata

   Phrase-structure, Type 0,         Recursively enumerable    Nondeterministic or deterministic
   Context-sensitive with erasing    sets                      Turing machines

   Context-sensitive, Monotonic      Context-sensitive         Nondeterministic linearly space-
                                                               bounded Turing machines, or
                                                               lba's, for short

   Context-free                      Context-free              Nondeterministic pushdown
                                                               automata

   LR(k)                             Deterministic             Deterministic pushdown automata
                                     context-free

   Linear                            Linear context-free       Two-tape nondeterministic finite
                                                               automata of a special type, or
                                                               1-turn pda's

   Right linear, Left linear         Regular sets              Nondeterministic or deterministic,
                                                               one-way or two-way finite
                                                               automata
We shall now try to give some intuitive meaning to the technical terms in this table. (This organization is sometimes called the Chomsky hierarchy.) The reader who has trouble understanding the table should not be distressed. The definitions will be given later in full detail, and the results will be carefully proven.

A deterministic automaton which is in a given state reading a given input always does the same thing (i.e., transfers to the same new state, writes the same output, etc.). On the other hand, a nondeterministic automaton may "choose" any one of a finite set of actions. A nondeterministic automaton is said to accept a string x if there exists some sequence of transitions which starts from an initial configuration, ends in a final configuration, and is "controlled by x."

A recursive set is a set for which there exists an algorithm for deciding set membership. That is, for any string x, the algorithm will halt in a finite amount of time, either accepting or rejecting x. A recursively enumerable set is a set for which there exists a procedure for recognizing the set. That is, given a string x in the set, the procedure will halt after a finite length of time, accepting x. On the other hand, if x is not in the set, the procedure may either halt, rejecting x, or never halt. Recursive sets are a proper subset of the recursively enumerable sets.

A Turing machine is simply a finite-state control device with a finite but potentially unbounded read-write tape. The sets accepted by deterministic and nondeterministic Turing machines are the same.

A linear bounded automaton (lba) is a Turing machine with a bounded tape. It should be noted that if the automaton is limited to n squares of tape, by suitably enlarging its alphabet to k-tuples, it can increase its effective tape size to kn for any fixed k. Hence the word linear. While many properties of context-sensitive languages and deterministic context-sensitive languages (languages accepted by deterministic lba's) are known, it is an open question whether the two families of languages are equivalent.

A pushdown automaton (pda) has a finite-state control with an input tape and an auxiliary pushdown store (LIFO, or last-in, first-out, store). In a given state, reading a given input and the symbol on the top of its pushdown store, it will make a transition to a new state, writing n ≥ 0 symbols on the top of its pushdown store (n = 0 corresponding to erasing the top symbol of the pushdown store).

LR(k) grammars are grammars that may be parsed from left to right, with k symbols of lookahead, by working from the bottom to the top. This class of grammars is rich enough to contain the syntax of many programming languages but restrictive enough to be parsable in "linear" time. The deterministic context-free languages are characterized by the automata which accept them. It will be shown that they are a proper subset of the family of context-free languages.

PROBLEMS

Definition  A phrase-structure grammar G = (V, Σ, P, S) is called monotonic if each rule is of the form:
i) α → β, where α ∈ V*NV*, β ∈ V*, and lg(α) ≤ lg(β); or
ii) S → Λ, and if this rule occurs, then S does not appear on the righthand side of any rule in P.
1  Given a phrase-structure (context-sensitive with erasing) [context-sensitive] {monotonic} grammar G = (V, Σ, P, S), show that there is a phrase-structure (context-sensitive with erasing) [context-sensitive] {monotonic} grammar G′ = (V′, Σ, P′, S′) with L(G′) = L(G) and, for each rule α → β in P′, α ∈ N⁺.

2  Prove the following statement (if it is true; if not, give a counterexample).

Fact  For each phrase-structure (context-sensitive with erasing) [context-sensitive] {monotonic} grammar G = (V, Σ, P, S), there is a phrase-structure (context-sensitive with erasing) [context-sensitive] {monotonic} grammar G′ = (V′, Σ, P′, S′) such that L(G′) = L(G) and each rule in P′ is of the form α → β with α ∈ N⁺, β ∈ N*, or A → a with A ∈ N and a ∈ Σ ∪ {Λ}.

In the next few problems, we shall show that the family of monotonic languages is coextensive with the family of context-sensitive languages.

Definition  For any phrase-structure grammar G = (V, Σ, P, S), let the weight of G be max{lg(β) | α → β is in P}.

3  Prove the following fact.

Fact  For each monotonic grammar G = (V, Σ, P, S), there is a monotonic grammar G′ = (V′, Σ, P′, S) of weight at most 2, so that L(G′) = L(G).

4  Prove the following statement.

Fact  For each monotonic grammar G = (V, Σ, P, S), there is a context-sensitive grammar G′ = (V′, Σ, P′, S) such that L(G′) = L(G).

5  Consider the following "proof" of Problem 4. Assume without loss of generality that the weight of G is 2.
i) If a production π ∈ P has weight < 2, then π ∈ P′.
ii) If π = AB → CD and C = A or B = D, then π ∈ P′.
iii) If π = AB → CD with C ≠ A and D ≠ B, then the following productions are in P′:

   AB → (π, A)B
   (π, A)B → (π, A)D
   (π, A)D → CD

where (π, A) is a new variable. Surely P′ is a finite set of context-sensitive rules, and since

   AB ⇒ (π, A)B ⇒ (π, A)D ⇒ CD

in G′, we have that L(G′) = L(G).

a) Is this proof correct? If not, explain in detail.
b) If the proof is correct, is the argument incomplete in some way? If there are any gaps, complete the proof.
c) If your answer to (a) was yes and to (b) was no, then what do you think is the point of this problem?

6  Give a context-sensitive or monotonic grammar which generates

   {xx | x ∈ {a, b}*}

Be sure to explain your grammar and prove that it works.

7  Show that the family of languages generated by context-sensitive grammars with erasing is identical to the phrase-structure languages.
1.6 CONTEXT-FREE GRAMMARS AND TREES

The systematic use of trees to illustrate context-free derivations is an important device that greatly sharpens our intuition. Since we use these concepts in formal proofs, it is necessary to have precise formal definitions of the concepts. It turns out that the usual graph-theoretic terminology is insufficient for our purposes. On the other hand, a completely formal treatment becomes tedious and obscures the very intuition we wish to develop. For these reasons, a semiformal approach is developed here, which can be completely formalized. We leave a number of alternative approaches for the exercises.

An example of a tree T is given in Fig. 1.1. It has one or more nodes (x₀, x₁, ..., x₁₀), one of which is the root (x₀). We denote the relation of immediate descendancy by Γ (thus, in Fig. 1.1(a), x₂ Γ x₄ but not x₂ Γ x₁₀). The descendancy relation is the reflexive and transitive closure Γ* of Γ (now x₂ Γ* x₁₀). If x Γ* y, then the path from x to y (or path to y if x is the root) is the sequence of all nodes between and including x and y (for example, (x₀, x₂, x₄, x₈) is the path from x₀ to x₈ in Fig. 1.1(a)).

[FIG. 1.1  (a) An unlabeled tree with nodes x₀, ..., x₁₀; (b) the same tree with labels such as a, b, A, B.]
If n is the number of nodes in one of the longest paths in T, then the height of T is, by definition, n − 1 (the tree in Fig. 1.1(a) has height 4). A node x is a leaf in T if for no y in T, x Γ y. Nodes that are not leaves are called internal.

We must pay attention to the left-right order of nodes. Let

   y₁, y₂, ..., yₘ,   m ≥ 1    (1.6.1)

be a sequence of all the leaves in T without repetition. (There is no other restriction on this sequence except the informal assumption that sequence (1.6.1) represents the intuitive left-to-right order of the leaves.) Now we define a special relation L between certain pairs of nodes in T. For any x, y in T, x L y if and only if
i) x and y are not on the same path, and
ii) for some leaves yᵢ, yᵢ₊₁ (1 ≤ i < m) in (1.6.1) we have x Γ* yᵢ and y Γ* yᵢ₊₁.
Thus, in particular, there is no leaf "between" x and y. (Thus, for instance, we have x₂ L x₆ but not x₉ L x₆ in Fig. 1.1(a).) Note that relation L is determined uniquely by (1.6.1). The left-to-right order is then the reflexive and transitive closure L* of L. Note that we have Γ⁺ ∩ L⁺ = ∅. On the other hand, one of the two relations L⁺ or Γ* (or their inverses) holds between any two nodes of a tree.

Every node x ∈ T has a label λ(x) from a given finite set of labels; in our case it is always the set V_Λ = V ∪ {Λ}, where V is the alphabet of a given grammar. The corresponding function λ: T → V_Λ, x ↦ λ(x), is the labeling (this is what distinguishes Fig. 1.1(a) from Fig. 1.1(b)). Of special importance are the root label of T, rt(T), and the frontier of T, fr(T). The latter is defined as the concatenation of the labels of the leaves in (1.6.1):

   fr(T) = λ(y₁)λ(y₂) ⋯ λ(yₘ) ∈ V*.

Note that some leaves may be labeled by Λ and thus fr(T) may be shorter than sequence (1.6.1) (for instance, in Fig. 1.1, fr(T) = abaΛ = aba).

For any internal node x in T, the set {y | y = x or x Γ y} defines an elementary subtree of T in an obvious way. A cross section in T is any maximal sequence of nodes in T (with respect to the partial ordering obtained from the prefix relation)

   ξ = (x₁, x₂, ..., xₙ)

such that x₁ L x₂ L ⋯ L xₙ (n ≥ 1). Thus, in particular, sequence (1.6.1) is a cross section in T.

Two trees T, T′ are structurally isomorphic, T ≅ T′, if and only if there is a bijection T ↔ T′, x ↦ x′, called a structural isomorphism from T to T′, such that x Γ y (x L y) if and only if x′ Γ y′ (x′ L y′); intuitively, T and T′ are "identical except for the labeling." Where there is no danger of confusion, we write T = T′ when T ≅ T′ and the labeling is preserved (this is in agreement with the customary understanding that T and T′ are "identical," but not with the algebraic definition of trees). Also, we use the same symbols Γ, L, λ even for different trees.
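These notions translate directly into a small data structure. The sketch below is our own illustration, not the text's formalism: it stores a labeled ordered tree and computes the frontier and height, with the label Λ represented by the empty string so that Λ-leaves silently vanish from fr(T), as in the remark above.

```python
class Node:
    """An ordered labeled tree; children are kept in left-to-right order."""
    def __init__(self, label, children=()):
        self.label = label          # a symbol of V, or "" for Lambda
        self.children = list(children)

def frontier(t):
    """fr(T): concatenation of leaf labels in left-to-right order."""
    if not t.children:
        return t.label
    return "".join(frontier(c) for c in t.children)

def height(t):
    """Number of nodes on a longest path, minus one."""
    return 0 if not t.children else 1 + max(height(c) for c in t.children)

# The derivation tree of ab in the grammar S -> AB, A -> a, B -> b:
T = Node("S", [Node("A", [Node("a")]), Node("B", [Node("b")])])
assert T.label == "S"               # rt(T)
assert frontier(T) == "ab"
assert height(T) == 2
```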
We now define certain trees and sets of trees related to grammars. Let G = (V, Σ, P, S) be a context-free grammar. We interpret the productions of G as trees (of height 1) in a natural way: The production

   A₀ → A₁A₂ ⋯ Aₙ

corresponds to the tree shown in Fig. 1.2 (or, formally, x₀ Γ x₁, x₀ Γ x₂, ..., x₀ Γ xₙ, x₁ L x₂ L ⋯ L xₙ, and λ(xᵢ) = Aᵢ, 0 ≤ i ≤ n). We make the convention that, in this correspondence, λ(xᵢ) = Λ if and only if i = n = 1 and A₀ → Λ is in P.

[FIG. 1.2  The elementary tree of the production A₀ → A₁A₂ ⋯ Aₙ: a root labeled A₀ with children labeled A₁, ..., Aₙ in left-to-right order.]

Definition  Let G = (V, Σ, P, S) be a context-free grammar and let T be a tree with labeling λ: T → V_Λ. T is said to be a grammatical tree of G if and only if, for every elementary subtree T′ of T, there exists a production p ∈ P corresponding to T′. Moreover, a derivation tree T is a grammatical tree such that fr(T) ∈ Σ*.

Consequently, the leaves in a derivation tree are precisely the nodes labeled by letters in Σ (called the terminal nodes) or by Λ (the Λ-nodes). All other nodes are labeled by letters in N = V − Σ (and are accordingly called the nonterminal nodes). Note that, in particular, a trivial tree consisting of a single node, labeled by a ∈ Σ or by Λ, is a derivation tree of G.

The tree in Fig. 1.1 may serve as an example of a grammatical tree for any grammar with productions

   S → aAB | bBa
   A → SA
   B → Λ

Now let us turn to the question of relating these trees to the properties of the relation ⇒.

Example  Let G be

   S → AB
   A → a
   B → b

Then the derivation

   S ⇒ AB ⇒ Ab ⇒ ab    (1.6.2)

is associated with the tree

       S
      / \
     A   B
     |   |
     a   b
Note that the derivation

    S ⇒ AB ⇒ aB ⇒ ab      (1.6.3)

is also a derivation of ab, and the same tree is associated with it. Thus there are many different derivations associated with the same tree.

Example If G is the grammar

    S → aSa
    S → bSb
    S → Λ

which may be compactly written

    S → aSa | bSb | Λ

then the derivation

    S ⇒ aSa ⇒ abSba ⇒ abaSaba ⇒ abaaba

is represented by the tree in Fig. 1.3, where the derived string may be read off by concatenating in left-to-right order the symbols on the frontier of the tree; i.e., by "reading the tree leaves."

Since many derivations may correspond to the same tree, we would like to identify one of these as canonical, so as to set up a one-to-one correspondence between trees and derivations.

Definition Let G = (V, Σ, P, S) be a context-free grammar and let α, β ∈ V*. If

    α ⇒ β      (1.6.4)

then there exist α_1, α_2, A, θ so that α = α_1 A α_2, β = α_1 θ α_2, and A → θ is in P. If α_2 ∈ Σ*, then (1.6.4) is called a rightmost (or canonical) generation. If α_1 ∈ Σ*, then (1.6.4) is a leftmost generation. The notation α ⇒_R β or α ⇒_L β is used in the respective cases.

FIG. 1.3
Note that (1.6.2) and (1.6.3) provide examples of rightmost and leftmost derivations, respectively. Now we can state the correspondence between trees and rightmost derivations.

Theorem 1.6.1 Let G = (V, Σ, P, S) be a context-free grammar. There is a one-to-one correspondence between rightmost (leftmost) derivations of a string w ∈ Σ* and the derivation trees of w with root labeled S.

Proof The proof is obvious intuitively, the correspondence being the natural one. A formal proof is omitted, since trees have been introduced only informally. □

The condition in Theorem 1.6.1 that w ∈ Σ* is necessary. To illustrate this point, consider the grammar:

    S → ABC
    A → a
    B → b
    C → c

The tree obtained from the root production by rewriting only B has frontier AbC. There is no rightmost or leftmost derivation that contains that string.

Definition Let G = (V, Σ, P, S) be a context-free grammar; α ∈ V* is a canonical sentential form if S ⇒_R* α.

Not every sentential form is canonical, as the following trivial example shows. Let G be

    S → AA
    A → a

Then aA ∈ S(G) but is not canonical.

Definition A context-free grammar G = (V, Σ, P, S) is ambiguous if there exists x ∈ L(G) such that x has at least two rightmost generations from S. A grammar that is not ambiguous is said to be unambiguous. Alternatively, G is unambiguous if, for any two derivation trees T and T' (from S), fr(T) = fr(T') implies T = T'.
Example Let G be

    S → SbS | a

Then G is ambiguous, since the string (ab)^2 a has the two rightmost generations shown in Fig. 1.4.

FIG. 1.4

Definition A context-free language L is unambiguous if there exists an unambiguous context-free grammar G such that L = L(G).

It can easily be shown that the language generated by the grammar in the preceding example is

    L(G) = {(ab)^n a | n ≥ 0}.

Consider the grammar G':

    S → Ta
    T → abT | Λ

Then we have

    T ⇒ abT ⇒ (ab)^2 T ⇒ ··· ⇒ (ab)^n T ⇒ (ab)^n,

so S ⇒* (ab)^n a for each n ≥ 0. Since G' is clearly unambiguous and L(G) = L(G'), the language L = {(ab)^n a | n ≥ 0} is unambiguous.

This discussion raises a deep question. Can ambiguity always be disposed of by changing grammars? Assuming the worst, the following definition would be useful.

Definition A context-free language L is inherently ambiguous if L is not unambiguous (i.e., there does not exist any unambiguous grammar G such that L = L(G)).

In a later chapter, it will be shown that there exist inherently ambiguous context-free languages. The following is an example:

    L = {a^i b^j c^k | i = j or j = k, where i, j, k ≥ 1}.

Returning to the relationship between trees and grammars, we note that if G = (V, Σ, P, S) is linear, then each rule in G is of the form
    A → a_1 a_2 ··· a_i B a_{i+1} ··· a_k,   B ∈ N, a_i ∈ Σ,

or

    A → a_1 a_2 ··· a_k,   k ≥ 0.

Thus trees associated with a linear grammar have a special form: at each level, at most one node may have descendants. An example of such a tree is shown in Fig. 1.5.

Example Let G be S → aSb | Λ, which is a "minimal linear" grammar in that it has only one variable. Then

    L(G) = {a^i b^i | i ≥ 0}.

FIG. 1.5

A derivation tree in G is of the form shown in Fig. 1.6.

If the grammar is right linear, then the generation trees have an even more restricted form. At each level at most one node may have descendants, and this must be the rightmost node. Thus, Fig. 1.7 would be an example of a derivation tree for a right linear grammar.

FIG. 1.6    FIG. 1.7
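Since the degree of ambiguity of a string can be measured by its number of derivation trees, it is instructive to compute this number for the ambiguous grammar S → SbS | a discussed above. The following Python sketch (an added illustration, not part of the original text) counts the derivation trees of (ab)^n a by a simple recurrence: a tree is either the single production S → a, or has a root S → SbS whose two subtrees derive (ab)^i a and (ab)^j a with i + j = n − 1. The resulting counts 1, 1, 2, 5, 14, 42, ... are the Catalan numbers; a closed form is derived in Section 1.9.

    def tree_counts(n_max):
        """t[n] = number of derivation trees of (ab)^n a in S -> SbS | a.
        A tree is the single rule S -> a (n = 0), or a root S -> SbS whose
        halves derive (ab)^i a and (ab)^j a with i + j = n - 1."""
        t = [0] * (n_max + 1)
        t[0] = 1
        for n in range(1, n_max + 1):
            t[n] = sum(t[i] * t[n - 1 - i] for i in range(n))
        return t

    print(tree_counts(5))   # [1, 1, 2, 5, 14, 42] -- the Catalan numbers

In particular, t[2] = 2 confirms that (ab)^2 a has exactly the two rightmost generations of Fig. 1.4.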
PROBLEMS

1 Let G be the grammar

    S → SbS | a

and consider the tree shown in Fig. 1.8. We shall give a technique for numbering the tree. The root will be numbered (1). The successors are numbered (1,1), (1,2), (1,3) in order, and the numbered tree looks like Fig. 1.9. Using this idea, construct a formal definition of generation trees.

FIG. 1.8    FIG. 1.9

2 Using the definition of Problem 1, prove Theorem 1.6.1.

3 Show that the generation trees, as defined in Problem 1, have the following properties:
i) there is exactly one root which no edge enters; i.e., it has in-degree 0;
ii) for each node N other than the root, there is a unique path from the root to N.

4 Show that if G = (V, Σ, P, S) is a context-free grammar, then, for α ∈ V* and w ∈ Σ*, A ⇒* α ⇒* w if and only if there is a grammatical tree T such that the root is labeled by A, fr(T) = w, and α = λ(ξ) for some cross section ξ in T.

5 Let T be a set. Define a "tree structure" (T, α, λ, x_0) over T by the following axioms:
i) α and λ are partial orders on T.
ii) α ∪ λ ∪ (α ∪ λ)^{−1} = T × T, where, for any relation ρ, ρ^{−1} denotes the inverse (or converse) relation; that is, (x, y) ∈ ρ^{−1} if and only if (y, x) ∈ ρ.
iii) λ' = α^{−1} λ' α, where λ' = λ − {(x, x) | x ∈ T}.
iv) x_0 ∈ T, and for any y ∈ T, x_0 α y.

Intuitively, x α y means that x is an "ancestor" of y, while x λ y means that x is to the left of y. Do our grammatical trees satisfy these axioms? Do these axioms capture the essence of "tree structures"?

6 Let T be any tree. Define a canonical cross section (of level h, for h ≥ 0) in T inductively as follows:
i) The sequence ξ_0 = (x_0), consisting of the root x_0 of T, is a canonical cross section (of level 0) in T.
ii) Let ξ_h = (x_1, ..., x_k, ..., x_m) be a canonical cross section of level h, in which x_k is the rightmost node which is internal in T. Let y_1, y_2, ..., y_r be all the immediate descendants of x_k in T and let y_1 L y_2 L ··· L y_r. Then the sequence ξ_{h+1} = (x_1, ..., x_{k−1}, y_1, ..., y_r, x_{k+1}, ..., x_m) is a canonical cross section (of level h + 1) in T.
a) Show that every canonical cross section is a cross section.
b) Show that no two distinct canonical cross sections in a tree can be of the same level.

7 Let T be a derivation tree of any grammar G, and let ξ = (x_1, ..., x_m) and ξ' = (x_1', ..., x_{m'}') be two canonical cross sections in T, of level h and h', respectively. Define λ(ξ) = λ(x_1) ··· λ(x_m) ∈ V*, and similarly λ(ξ'). Show that λ(ξ) ⇒_R λ(ξ') if h' = h + 1. Also show that if T is a derivation tree of G with rt(T) = S, then λ(ξ) is a canonical sentential form in G; and that, conversely, for every canonical sentential form α, there is a grammatical tree with a canonical cross section ξ such that λ(ξ) = α.

8 Prove the following useful proposition.

Proposition Let T be a derivation tree of some unambiguous grammar G and let ξ, ξ' be two canonical cross sections in T. Assume that if rt(T) = A, then S ⇒* xAy for some x, y ∈ Σ*. Then λ(ξ) = λ(ξ') implies ξ = ξ'.

1.7* RAMSEY'S THEOREM

Let Σ = {a, b}. It turns out that any string in Σ* of length at least 4 has two consecutive identical subwords. To see this, let us try to find a counterexample x. The string x must start with a letter, say a. The second letter cannot be a for, if it were, we would have x = aa··· and that would satisfy the condition. By the same reasoning, the third letter must be a (a third letter b would produce the subword bb), and the fourth letter must then be b. Thus, x begins abab = (ab)^2, and the result is verified.

Now let us consider another, more general situation. Let Σ be a nonempty set and suppose that Σ⁺ is partitioned into two classes; that is, Σ⁺ = A ∪ B with A ∩ B = ∅. Any string of length at least 4 has two consecutive nonempty substrings which are both in A or both in B. It clearly suffices to deal with a string of length 4. Suppose w is a string of length 4 which does not have the desired property,

    w = abcd
with a, b, c, d ∈ Σ. Suppose a ∈ A. Then we must have b ∈ B, c ∈ A, and d ∈ B. To which set does the subword bc belong? If it belongs to A, then the consecutive pair a, bc is a contradiction; and if bc ∈ B, then the consecutive pair bc, d is a contradiction.

We now state a very general theorem due to Ramsey, and examine special cases of it. The theorem will show that these examples are special cases of a more general situation.

Theorem 1.7.1 (Ramsey) Let q_1, ..., q_t, r be given positive integers such that 1 ≤ r ≤ q_1, ..., q_t. There exists a minimal positive integer N(q_1, ..., q_t, r) such that, for all integers n ≥ N(q_1, ..., q_t, r), the following is true. Let S be a set with† |S| = n, and suppose that P_r(S), the set of all subsets of S of size r, is expressed as

    P_r(S) = {U ⊆ S | |U| = r} = A_1 ∪ ··· ∪ A_t

with A_i ∩ A_j = ∅ for i ≠ j. Then there exist F ⊆ S and i such that |F| = q_i and P_r(F) ⊆ A_i.

The theorem is sufficiently complex that some examples may help to clarify the result before we prove it.

Example Consider the statement when r = 1. The statement is: Let q_1, ..., q_t ≥ 1. There exists a minimum positive integer N(q_1, ..., q_t, 1) so that, for all integers n ≥ N(q_1, ..., q_t, 1), the following is true: If S = A_1 ∪ ··· ∪ A_t, |S| = n, A_i ∩ A_j = ∅ if i ≠ j, then S has a subset T such that |T| = q_j and T ⊆ A_j for some j.

This is a familiar statement, in which we can rephrase the second part as follows: If a set of n balls is placed into t boxes with n ≥ N(q_1, ..., q_t, 1), then some box A_j receives a subset of q_j balls. This form is more familiar and is known as Dirichlet's principle (or the "drawer" or "shoe box" principle).

Let us estimate N(q_1, ..., q_t, 1). The largest number of balls for which the conclusion can fail is

    (q_1 − 1) + ··· + (q_t − 1).

(If q_1 = ··· = q_t = q, then we could have q − 1 balls in each box and the principle would not be true.) Adding one more point gives the desired result, so:

    N(q_1, ..., q_t, 1) = 1 + Σ_{i=1}^{t} (q_i − 1) = q_1 + ··· + q_t − t + 1.

† For any set S, the cardinality of S, or the number of elements in S, is denoted by |S|.
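The bound can be checked mechanically for small parameters. The Python sketch below (an added illustration; the function name and the sample thresholds are ours) enumerates all ways of distributing n balls among t boxes and verifies that q_1 + ··· + q_t − t + 1 balls force some box j to receive q_j of them, while one ball fewer does not:

    from itertools import product

    def pigeonhole_holds(n, qs):
        """True iff every way of putting n balls into len(qs) boxes puts
        at least qs[j] balls into some box j (Dirichlet's principle)."""
        t = len(qs)
        # enumerate all occupancy vectors (n_1, ..., n_t) with n_1 + ... + n_t = n
        for occ in product(range(n + 1), repeat=t):
            if sum(occ) == n and all(occ[j] < qs[j] for j in range(t)):
                return False   # a distribution avoiding every threshold
        return True

    qs = (2, 3, 4)
    bound = sum(qs) - len(qs) + 1           # q_1 + q_2 + q_3 - t + 1 = 7
    print(pigeonhole_holds(bound, qs))      # True
    print(pigeonhole_holds(bound - 1, qs))  # False: (1, 2, 3) avoids all thresholds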
Example Consider the theorem with q_1 = ··· = q_t = q ≥ r ≥ 1. The statement is, roughly, that if |S| = n is sufficiently large and P_r(S) = A_1 ∪ ··· ∪ A_t, then there exist F ⊆ S with |F| = q and an i with P_r(F) ⊆ A_i.

The statement is equivalent to Ramsey's theorem. In one direction, this is trivial. Conversely, suppose the assertion of this example to be valid. In the general case 1 ≤ r ≤ q_1, ..., q_t; define q = max {q_i}. Then, by the above statement, there is a subset F with |F| = q and P_r(F) ⊆ A_j. Take any subset F' ⊆ F such that |F'| = q_j, and we have that

    P_r(F') ⊆ P_r(F) ⊆ A_j,

and so Ramsey's theorem is satisfied.

Now we begin the proof of Ramsey's theorem.

Proof We begin by induction on t.

Basis. t = 1. The result is trivial, since P_r(S) = A_1 and we need only take N(q_1, r) = q_1.

Induction step. t ≥ 2. We first show that it suffices to prove the result for t = 2. Suppose that we can prove the theorem in that case, and that the theorem holds for t classes; we now show that the result is true for t + 1. We can write

    P_r(S) = A_1 ∪ ··· ∪ A_{t+1} = A_1 ∪ (A_2 ∪ ··· ∪ A_{t+1}).

Let q_2' = N(q_2, ..., q_{t+1}, r). If n ≥ N(q_1, q_2', r), then S must contain a subset F with the property that:

i) |F| = q_1 and P_r(F) ⊆ A_1, or
ii) |F| = q_2' and P_r(F) ⊆ A_2 ∪ ··· ∪ A_{t+1}.

If the first alternative holds, we are done. Otherwise, take S' = F, so that |S'| = q_2' = N(q_2, ..., q_{t+1}, r) and

    P_r(S') = A_2' ∪ ··· ∪ A_{t+1}',

where A_j' = A_j ∩ P_r(S') for 2 ≤ j ≤ t + 1. By the induction hypothesis, there is a subset G ⊆ S' so that |G| = q_i and P_r(G) ⊆ A_i' ⊆ A_i for some i. This completes the proof of sufficiency.

Next, we return to a proof of the result for t = 2. It has already been established in an example that

    N(q_1, q_2, 1) = q_1 + q_2 − 1.

Claim N(q_1, r, r) = q_1 and N(r, q_2, r) = q_2.

Proof The second result follows from the first by symmetry. To see the first result, let q_2 = r and n ≥ q_1. If A_2 ≠ ∅, then let {a_1, ..., a_r} be an r-combination in A_2. Then F = {a_1, ..., a_r} is a subset of S (since |S| = n ≥ q_1 ≥ r) such that P_r(F) ⊆ A_2. If A_2 = ∅, then P_r(S) = A_1; and if S is any set with |S| = n ≥ q_1, then any subset F of q_1 elements has P_r(F) ⊆ A_1. This completes the proof of the claim.
The main argument with t = 2 is completed by induction. Assume that we have 1 < r ≤ q_1, q_2. The induction hypothesis asserts that there exist integers N(q_1 − 1, q_2, r), N(q_1, q_2 − 1, r), and N(q_1', q_2', r − 1) for all q_1', q_2' which satisfy 1 ≤ r − 1 ≤ q_1', q_2'. We shall show the existence of N(q_1, q_2, r). This is valid because we can fill out the following array for each r; for example, for r = 2, we have:

    N(2, 2, 2)   N(2, 3, 2)   ···
    N(3, 2, 2)   N(3, 3, 2)   ···
    N(4, 2, 2)   N(4, 3, 2)   ···

This will in turn imply the existence of the N(q_1, q_2, 3), etc.

Assume the existence of p_1 = N(q_1 − 1, q_2, r), p_2 = N(q_1, q_2 − 1, r), and N(p_1, p_2, r − 1); we shall prove the existence of N(q_1, q_2, r). Moreover, it will be shown that

    N(q_1, q_2, r) ≤ N(p_1, p_2, r − 1) + 1.      (1.7.1)

Suppose n ≥ N(p_1, p_2, r − 1) + 1. Let S be a set with a ∈ S and |S| = n. Define T = S − {a}. Then P_r(S) = A_1 ∪ A_2, with A_1 ∩ A_2 = ∅. Define

    P_{r−1}(T) = B_1 ∪ B_2,      (1.7.2)

where B_i = {U ∈ P_{r−1}(T) | U ∪ {a} ∈ A_i} for i = 1, 2. Clearly, |T| ≥ N(p_1, p_2, r − 1), since T = S − {a}. By the induction hypothesis, there exists a set F so that

i) |F| = p_1 and P_{r−1}(F) ⊆ B_1, or
ii) |F| = p_2 and P_{r−1}(F) ⊆ B_2.

We assume the first case. (Case (ii) is analogous.) Since p_1 = N(q_1 − 1, q_2, r), F contains a subset G so that

iii) |G| = q_1 − 1 with P_r(G) ⊆ A_1, or
iv) |G| = q_2 with P_r(G) ⊆ A_2,

by the induction hypothesis. If (iv) holds, then the induction is completed, so we assume that (iii) holds. Then H = G ∪ {a} has the property that |H| = q_1; and now we consider the r-combinations of H. If an r-combination of H does not contain a, then it is an r-combination of G and, by (iii), it is in A_1. If an r-combination of H contains a, then it consists of a and an (r − 1)-combination of G. But G ⊆ F has all its (r − 1)-subsets in B_1. By the definition of the B-decomposition, our r-combination is in A_1. Thus, H is a subset of cardinality q_1 and P_r(H) ⊆ A_1. Thus the induction is extended and the theorem is proven. □
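For the smallest nontrivial case, with t = 2 and r = 2, the value N(3, 3, 2) = 6 can be confirmed by exhaustive search: every 2-coloring of the edges of K_6 contains a monochromatic triangle, while K_5 admits a coloring with none. The Python sketch below is an added illustration; it encodes a coloring as a map from 2-subsets to {0, 1}:

    from itertools import combinations, product

    def has_mono_triangle(n, coloring):
        """coloring maps each 2-subset (pair) of range(n) to 0 or 1."""
        return any(coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
                   for a, b, c in combinations(range(n), 3))

    def every_coloring_forced(n):
        """True iff every 2-coloring of the edges of K_n has a monochromatic triangle."""
        edges = list(combinations(range(n), 2))
        return all(has_mono_triangle(n, dict(zip(edges, colors)))
                   for colors in product((0, 1), repeat=len(edges)))

    print(every_coloring_forced(5))   # False: K_5 admits a triangle-free 2-coloring
    print(every_coloring_forced(6))   # True:  N(3, 3, 2) = 6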
There is an interesting application of Ramsey's theorem to graph theory. In the vernacular, the following result asserts that, at a sufficiently large party, there exist m individuals who are all strangers to each other, or a clique of m people all of whom know each other.

Theorem 1.7.2 For any m there exists N(m) so that, if n ≥ N(m), any graph with n vertices either contains a set of m mutually nonadjacent vertices or contains the complete subgraph on m points.

Proof Let N(m) = N(m, m, 2), with P_2(S) = A_1 ∪ A_2, where A_2 is the set of edges and A_1 = P_2(S) − A_2. □

Another important application of Ramsey's theorem concerns repeated patterns in strings.

Theorem 1.7.3 Let Σ⁺ = A_1 ∪ A_2 ∪ ··· ∪ A_t, A_i ∩ A_j = ∅ if i ≠ j. For each natural number n, there exists a minimum positive integer N_t(n) such that each word w, where lg(w) ≥ N_t(n), is in Σ* A_j^n Σ* for some j with 1 ≤ j ≤ t.

Proof Let N_t(n) = N(n + 1, ..., n + 1, 2) (with t arguments equal to n + 1) and let w = a_1 ··· a_m, a_i ∈ Σ, m ≥ N_t(n). Let S = {0, 1, 2, ..., m}. For each pair (i, j), with i, j ∈ S and i < j, associate a unique word w_{i,j} = a_{i+1} ··· a_j. Thus there is a partition on P_2(S) induced by the A_i. By Ramsey's theorem, there is a set F = {p_0, ..., p_n} ⊆ S with p_i < p_{i+1} such that

    {w_{p_i, p_j} | p_i < p_j; i, j ∈ F} ⊆ A_ℓ.

Thus the consecutive subwords of w (the words are consecutive since i, j ∈ F)

    w_{p_0, p_1}, w_{p_1, p_2}, ..., w_{p_{n−1}, p_n}

are all in A_ℓ, and so w ∈ Σ* A_ℓ^n Σ*. □

Now we can estimate the number N_t(n). We begin with examples.

Example 1 N_2(2) ≤ 4. This is the example which was given before.

Example 2 Suppose N_2(2) = 3. Then take Σ = {0, 1}, A_1 = {0}, A_2 = Σ⁺ − {0}. The string w = 010 fails to be in one of the desired sets, since its pairs of consecutive nonempty subwords are

    0 | 1 (A_1, A_2),    1 | 0 (A_2, A_1),    0 | 10 (A_1, A_2),    01 | 0 (A_2, A_1),

and no pair lies in a single class. Thus N_2(2) = 4. One can study this case in more detail and show that N_t(n) = n^t.

PROBLEMS

1 Prove the following geometric proposition: Suppose there are five points in the plane and no three of them are collinear. Then four of the points are the vertices of a convex quadrilateral.
2 Prove the following fact: Let m ≥ 4. If m points in the plane have no three points collinear, and if all the quadrilaterals formed from the m points are convex, then the m points are the vertices of a convex m-gon.

3 Using Problems 1 and 2, prove the following theorem: Let m ≥ 4. There exists a minimal positive integer N_m so that the following is true for all n ≥ N_m: If n points in the plane have no three points collinear, then m of the points are the vertices of a convex m-gon.

*4 Ramsey's theorem becomes easier to prove in the infinite case. Prove the following form of the result: Let t, r ≥ 1. Let S be an infinite set and suppose that P_r(S) = A_1 ∪ ··· ∪ A_t with A_i ∩ A_j = ∅ if i ≠ j. Then there is an infinite set F ⊆ S so that P_r(F) ⊆ A_i for some i.

1.8* NONREPETITIVE WORDS

For the moment, let Σ be a binary alphabet. Any string of length at least 4 must have two identical consecutive subwords, as we argued in the last section. It turns out that, over a three-letter alphabet, one can construct strings of infinite length such that no two consecutive subwords are identical. Such a string will now be constructed. There are many applications of these sequences, particularly in mathematical games.

The construction is made in the following way. Define

    ξ_0 = a   and   ξ_{i+1} = φ(ξ_i) for each i ≥ 0,

where φ is the homomorphism:

    φ(a) = abcab = α,
    φ(b) = acabcb = β,
    φ(c) = acbcacb = γ.

The desired string is the limit of the ξ_i's. For instance,

    ξ_0 = a
    ξ_1 = abcab
    ξ_2 = abcabacabcbacbcacbabcabacabcb

etc. The argument proceeds by examining these words closely.
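The construction is easy to carry out mechanically. The Python sketch below (an added illustration, not part of the original text) iterates the homomorphism φ and checks by brute force that the words ξ_1, ξ_2, ξ_3 contain no repetition tt, which the results of this section establish for every ξ_i:

    def phi(w):
        """The homomorphism of the construction: a, b, c -> alpha, beta, gamma."""
        blocks = {"a": "abcab", "b": "acabcb", "c": "acbcacb"}
        return "".join(blocks[ch] for ch in w)

    def has_square(w):
        """True iff w contains two consecutive identical subwords t t."""
        n = len(w)
        return any(w[i:i + m] == w[i + m:i + 2 * m]
                   for m in range(1, n // 2 + 1)
                   for i in range(n - 2 * m + 1))

    xi = "a"
    for _ in range(3):                   # compute xi_1, xi_2, xi_3
        xi = phi(xi)
        print(len(xi), has_square(xi))   # 5 False, 29 False, 173 False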
Proposition 1.8.1
a) Each of α, β, and γ (which will be called blocks) begins with an a and is a product of two words, each beginning with an a (which will be called half-blocks).
b) All blocks and half-blocks are distinct.
c) For each x in {α, β, γ}*, if x = w_1 a u a w_2, where the number of a's in u is 0 (written #_a(u) = 0), then lg(u) ≤ 3.
d) For any x in {α, β, γ}*, if x = w_1 a u a w_2 and #_a(u) = 0, then u uniquely determines the block to which the first a belongs and whether it is the first or second a in that block.

Proof Only (d) requires explanation. In view of (c), the only cases that may arise are those shown in Table 1.3.

TABLE 1.3

    aua:                   aba     aca    abca   acba    abcba   acbca
    First a belongs to:    α       β      α      γ       β       γ
    First or second a?     Second  First  First  Second  Second  First
                                                                       □

Now we prove a key lemma.

Lemma 1.8.1 If η is a nonrepetitive sequence and θ = φ(η), then θ is a nonrepetitive sequence.

Proof Suppose θ has a repetition; that is,

    θ = φ(η) = w_1 t_1 t_2 w_2,   where t_1 = t_2.

It will be shown that η has a repetition.

CASE 1. Suppose #_a(t_1) ≥ 2. Let X be the block to which the first a belongs. One possible picture is shown in Fig. 1.10. More generally, X need not be contained in t_1.

FIG. 1.10
The block X and its position in t_1 (respectively, t_2) is determined by the string between the first two a's in t_1. (This is cross-hatched in Fig. 1.10.) If the two X-blocks were adjacent, then we would have t_1 = t_2 = X and θ = w_1 X X w_2, which implies

    η = w_1' d d w_2',

where φ(d) = X and φ(w_i') = w_i for i = 1, 2. That would finish the proof.

FIG. 1.11

Now suppose the X-blocks are not adjacent. Then the picture is shown in Fig. 1.11, where Y_1, ..., Y_n are the blocks between the X-blocks. Note that Y_n may overlap t_2, but all the a's in Y_1 ··· Y_n are in t_1, because we singled out the first a in t_1 and t_1 = t_2. These blocks are uniquely determined by the part of t_1 to the right of X. (Compare Figs. 1.10 and 1.11.) The corresponding part of t_2 determines the same sequence of blocks, and

    η = w_1' d y_1 ··· y_n d y_1 ··· y_n w_2',

where φ(w_i') = w_i, φ(d) = X, and φ(y_j) = Y_j for i = 1, 2; j = 1, ..., n. Thus η contains the repetition (d y_1 ··· y_n)(d y_1 ··· y_n).

CASE 2. #_a(t_1) = 1. Then #_a(t_2) = #_a(t_1) = 1, and the two a's are consecutive a's in θ = φ(η). The second a in each block is preceded by c; and, since each block ends in b, the first a of each block is preceded by b. Hence consecutive a's in θ always follow different letters. Since t_1 = t_2, the only way this is compatible with the positions of the a's in t_1 and t_2 is for t_1 and t_2 to begin with a, as is shown in Fig. 1.12.

FIG. 1.12

Thus t_1 is a half-block and t_2 is a subword of the following half-block. In Table 1.4, we enumerate all the possible cases in order to determine which ones are actually feasible.
Inspection of the table shows only two possibilities. Either

1 t_1 is the second half-block of α and t_2 is a subword of the first half-block of α; or
2 t_1 is the second half-block of γ and t_2 is a subword of the first half-block of γ.

In the event that (1) holds, η contains aa; and if (2) holds, η contains cc. In either case, η has a repetition.

TABLE 1.4
Feasible matches between t_1 and t_2 w_2

    t_1     following half-block   Feasible?
    abc     ab                     No
    ab      abc                    Yes
    ab      ac                     No
    ab      acbc                   No
    ac      abcb                   No
    abcb    abc                    No
    abcb    ac                     No
    abcb    acbc                   No
    acbc    acb                    No
    acb     abc                    No
    acb     ac                     No
    acb     acbc                   Yes

CASE 3. #_a(t_1) = 0. Then #_a(t_2) = 0 also. Since there are at most three letters of θ between any two consecutive a's, the only possibility is that lg(t_1) = lg(t_2) = 1. But this is impossible, because no two consecutive letters of θ are equal. □

With the aid of this lemma, the main theorem follows easily.

Theorem 1.8.1 Let Σ = {a, b, c}. There is an infinite-length string over Σ which is nonrepetitive; i.e., has no two consecutive identical subwords.

Proof Define the sequence ξ_0 = a, ξ_{i+1} = φ(ξ_i). Then

    ξ_1 = φ(a) = α = abcab

contains ξ_0 = a as a proper prefix. Repeating, ξ_2 has ξ_1 as a proper prefix and, in general, ξ_n contains ξ_{n−1} as a proper prefix. There is therefore a unique infinite-length Σ-sequence ξ = lim_{n→∞} ξ_n such that each prefix of ξ of length lg(ξ_k) is ξ_k. Now ξ does not contain a repetition of any subword: any repetition would lie within some prefix ξ_n, while ξ_0 = a is nonrepetitive and, by the Lemma, each ξ_n is nonrepetitive. □
PROBLEMS

*1 Generalize the result of this section to show that there exist sequences of unbounded length over an alphabet of five letters which are strongly nonrepetitive; i.e., such that there are no two consecutive subwords t_1 and t_2 such that t_2 is obtained from t_1 by a permutation of the letters. [Hint: Let Σ = {a, b, c, d, e} and let σ be the cycle (a b c d e). Define a homomorphism φ by:

    φ(a) = bacaeacadaeadab,
    φ(b) = σ(φ(a)),
    φ(c) = σ²(φ(a)),
    φ(d) = σ³(φ(a)),
    φ(e) = σ⁴(φ(a)).

Now define

    ξ_0 = a,   ξ_{i+1} = φ(ξ_i),

to obtain the desired result.] Can you take the limit of the ξ_i to obtain an infinite sequence?

2 Taking the sequences constructed in Problem 1, show that asymptotically each symbol occurs with density 1/5.

3 Show that if k ≤ 3, any string over an alphabet of k symbols of length greater than 2^k − 1 has two consecutive subwords, one of which is a permutation of the other.

4 (Open problem) Do there exist infinite sequences over a four-letter alphabet which are strongly nonrepetitive?

5 Let S be a semigroup. An element 0 ∈ S is a zero for S if s0 = 0s = 0 for each s ∈ S. S is nilpotent if there exists an integer k such that S^k = {0}. Construct a semigroup S which is nonnilpotent, has three generators, and in which the square of each element is 0.

1.9* THE LAGRANGE INVERSION FORMULA AND ITS APPLICATION

There is an old inversion formula due to Lagrange which is occasionally useful in language theory. We sketch a derivation of the formula and omit all considerations of convergence, as is customary in the combinatorial applications of generating functions and power series.

Suppose that

    x = y f(x),      (1.9.1)

and that f(x) can be expanded as a power series in x:

    f(x) = Σ_{i=0}^∞ f_i x^i,      (1.9.2)
where f_0 ≠ 0. We wish to solve not just for x as a power series in y, but for some function g(x). That is, one seeks a_0, a_1, ..., such that:

    g(x) = Σ_{i=0}^∞ a_i y^i.      (1.9.3)

Since (1.9.1) implies that x = 0 if y = 0, we have a_0 = g(0). Moreover,

    y = x (f(x))^{−1} = Σ_{i=1}^∞ b_i x^i.      (1.9.4)

If Eq. (1.9.3) is differentiated with respect to x, we get

    g'(x) = Σ_{i=1}^∞ i a_i y^{i−1} (dy/dx).      (1.9.5)

Multiplying (1.9.5) by (f(x))^n yields:

    g'(x)(f(x))^n = Σ_{i=1}^∞ i a_i y^{i−1} (f(x))^n (dy/dx).      (1.9.6)

By using (1.9.1), we get:

    g'(x)(f(x))^n = Σ_{i=1}^∞ i a_i y^{i−n−1} x^n (dy/dx).      (1.9.7)

Let us now evaluate

    (d^{n−1}/dx^{n−1}) [g'(x)(f(x))^n] |_{x=0}

by using Eq. (1.9.7). Clearly, the result is:

    (n − 1)! [coefficient of x^{n−1} in (1.9.7)].      (1.9.8)

This gives rise to two cases.

CASE 1. i = n. The general term of (1.9.7) becomes:

    n a_n x^n y^{−1} (dy/dx).      (1.9.9)

By using (1.9.4), since y = b_1 x + b_2 x² + ··· with b_1 ≠ 0, we have y^{−1}(dy/dx) = x^{−1}(1 + c_1 x + c_2 x² + ···), so that

    n a_n x^n y^{−1} (dy/dx) = n a_n x^{n−1} (1 + c_1 x + c_2 x² + ···).      (1.9.10)

Therefore the coefficient of x^{n−1} is:

    n a_n.      (1.9.11)

CASE 2. i ≠ n. The general term is

    i a_i x^n y^{i−n−1} (dy/dx) = (i/(i − n)) a_i x^n (d/dx)[y^{i−n}].      (1.9.12)
Note that there does not exist any power of x (even a negative one) whose derivative is 1/x; consequently (d/dx)[y^{i−n}] does not contain a term in x^{−1}, and Case 2 contributes no terms in x^{n−1}. Thus

    Σ_{i≥1} i a_i x^n y^{i−n−1} (dy/dx) = ··· + n a_n x^{n−1} + ···,      (1.9.13)

and therefore

    (d^{n−1}/dx^{n−1}) [g'(x)(f(x))^n] |_{x=0} = (n − 1)! n a_n = n! a_n.

The expression for a_n is:

    a_n = (1/n!) (d^{n−1}/dx^{n−1}) [g'(x)(f(x))^n] |_{x=0}      (1.9.14)

if n ≥ 1, and

    a_0 = g(0).      (1.9.15)

Summarizing the above discussion, we have proved a famous theorem of Lagrange. Actually we shall state a slightly stronger theorem than we proved, and leave the generalization to the reader. Instead of forcing our original relation to be homogeneous, we shall allow a constant term. The changes to the statement and proof are minor.

Theorem 1.9.1 Let

    x = c + y f(x).      (1.9.16)

Then

    g(x) = g(c) + Σ_{n=1}^∞ (y^n/n!) (d^{n−1}/dx^{n−1}) [g'(x)(f(x))^n] |_{x=c}.      (1.9.17)

First, we give a purely mathematical example of the use of Lagrange's theorem.

Example Let

    x = y e^x.      (1.9.18)

Then (f(x))^n = e^{nx}; take g(x) = x. Computing, we get:

    (d^{n−1}/dx^{n−1}) e^{nx} |_{x=0} = n^{n−1},

so that a_0 = 0, a_n = n^{n−1}/n!, and

    x = Σ_{n=1}^∞ (n^{n−1}/n!) y^n.      (1.9.19)

One may check that (1.9.19) is a solution for x in terms of y by substituting in (1.9.18).

Now let us consider an example from language theory. Consider the grammar G:

    S → SbS | a
The language L(G) = {(ab)^n a | n ≥ 0} is ambiguously generated by G. We seek the number of derivations of (ab)^n a. Let us rewrite the grammar as an equation:

    S = a + SbS.      (1.9.20)

One may now derive the power-series solution of (1.9.20) iteratively, but we need not do so here. If we map Eq. (1.9.20) into the ring of polynomials in the natural way (actually mapping from the noncommutative case into the commutative one), we get

    S ↦ x,   b ↦ y,   a ↦ 1.

Thus (1.9.20) becomes:

    x = yx² + 1.

Thus f(x) = x² and c = 1. For g(x), we choose g(x) = x. Now we evaluate

    (d^{n−1}/dx^{n−1}) (x^{2n}) = (2n)(2n − 1) ··· (n + 2) x^{n+1},

so that, evaluating at x = c = 1,

    a_n = (1/n!) (2n)(2n − 1) ··· (n + 2) = (2n)! / ((n + 1)! n!) = (1/(n + 1)) C(2n, n).

Therefore

    x = 1 + Σ_{n=1}^∞ (1/(n + 1)) C(2n, n) y^n.      (1.9.21)

This tells us that there are C(2n, n)/(n + 1) derivation trees of (ab)^n a. One could also derive the result from the grammar by generating-function techniques, but the present approach is easier and more general.

PROBLEMS

1 Carry out the nonhomogeneous proof of Theorem 1.9.1.

2 Was it necessary to use the nonhomogeneous form of Lagrange's theorem in working the last example of the section?

3 Let 1 − x + y x^β = 0. Use Lagrange's theorem to solve for x^α as a function of y. By judicious choices of α and β, a number of classical identities involving binomial coefficients may be derived.

4 Consider the grammar

    S → aSa | bSb | aa | bb | Λ

Show directly that the number of strings of length i in this grammar is given by s_i, where

    s_i = 0 if i is odd;   s_i = 2^j if i = 2j.
5 Suppose we have a triangle ABC. By placing a new point in the interior and joining this point to each of the vertices A, B, and C, we can decompose the triangle into three elementary triangles. This construction may be iterated within some other triangle. The diagrams in Fig. 1.13 show the "triangulations of a triangle" with 1, 3, and 5 elementary triangles. Find the number of triangulations of a triangle into n elementary triangles. [Hint: Construct a map from triangulations to parenthesized strings. Construct an appropriate grammar which generates these strings such that the degree of ambiguity is the desired number.]

FIG. 1.13

1.10 HISTORICAL SURVEY

The origins of formal language theory may be found in the work of Chomsky [1959a, 1963, 1964, 1965]. A number of important early contributions are due to Bar-Hillel and his colleagues, and may be found in Bar-Hillel [1964]. There are a number of other general books on this subject. Aho and Ullman [Vol. 1, 1972] cover many of the basic facets of the theory in a book oriented toward theories of translating and compiling. Book and Greibach [1973] present the basic theory in a style oriented towards a general abstract study. Ginsburg [1966] is a text that emphasizes the mathematical aspects of the context-free case. Hopcroft and Ullman [1969] give a very general coverage, as well as a number of related topics. The book by Salomaa [1973] also covers a wide variety of topics concerning formal languages.

Theorem 1.2.1 is from Levi [1944]. Much of the material from Section 1.3 may be found in Lentin [1972] and Schutzenberger [1967]. In particular, Theorem 1.3.3 is from Fine and Wilf [1965]. Theorem 1.7.1 is from Ramsey [1930]. The proof given follows Ryser [1963] and Erdos and Szekeres [1935]. The latter reference contains the geometric results given in the problems. Theorem 1.8.1 is from Pleasants [1970]. Our proof of the Lagrange inversion formula follows Stanton [1961]. Applications of simple language theory to compiling may be found in the textbooks by Gries [1971] and Aho and Ullman [1977].
TWO

Finite Automata and Linear Grammars

2.1 INTRODUCTION

In this chapter, regular events, or the sets accepted by finite automata, are discussed. In particular, they are characterized by a class of context-free grammars as well. This fills in one of the pieces of the Chomsky Hierarchy. A number of simple but useful results are obtained, both in the text and in the exercises.

It has been assumed that the reader has had some familiarity with elementary automata theory, and so the coverage in this chapter is quite concise. Readers who experience difficulty are advised to consult some of the general references given in Section 2.7.

2.2 ELEMENTARY FINITE AUTOMATA THEORY - ULTIMATELY PERIODIC SETS

The techniques and results of finite automata theory are necessary for our development. The subject is treated very concisely here, and the reader is referred to the literature for a more leisurely discussion.

Definition A finite automaton is a 5-tuple A = (Q, Σ, δ, q_0, F), where:

i) Q is a finite nonempty set of states;
ii) Σ is a finite nonempty set of inputs;
iii) δ is a function from Q × Σ into Q called the direct transition function;
iv) q_0 ∈ Q is the initial state;
v) F ⊆ Q is the set of final states.
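As an added illustration (not part of the original text), the definition translates directly into a few lines of Python; δ is a dictionary on Q × Σ, and the extension of δ to input strings, formalized in the next definition, appears here as the method run. The sample machine is one rendering of the a*b* automaton of the example below, with a "dead" state q2 added so that δ is total, as the definition requires:

    class DFA:
        """A = (Q, Sigma, delta, q0, F); delta is a dict: (state, symbol) -> state."""
        def __init__(self, delta, q0, finals):
            self.delta, self.q0, self.finals = delta, q0, finals

        def run(self, x):
            """The extension of delta to strings: delta(q0, x)."""
            q = self.q0
            for a in x:
                q = self.delta[(q, a)]
            return q

        def accepts(self, x):
            return self.run(x) in self.finals

    # One rendering of the automaton of Fig. 2.1, accepting a*b*:
    A = DFA({("q0", "a"): "q0", ("q0", "b"): "q1",
             ("q1", "a"): "q2", ("q1", "b"): "q1",
             ("q2", "a"): "q2", ("q2", "b"): "q2"},
            "q0", {"q0", "q1"})
    print(A.accepts("aabb"), A.accepts("aba"))   # True False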
Next, it is necessary to extend δ to allow A to accept sequences of inputs.

Definition Let A = (Q, Σ, δ, q_0, F) be a finite automaton. For each (q, a, x) ∈ Q × Σ × Σ*, define

    δ(q, Λ) = q   and   δ(q, ax) = δ(δ(q, a), x).

Note the following simple identities: For each q ∈ Q and x, y, z ∈ Σ*,

    δ(q, xy) = δ(δ(q, x), y);      (2.2.1)
    δ(q, x) = δ(q, y) implies δ(q, xz) = δ(q, yz).      (2.2.2)

The notation rp(y) = δ(q_0, y), the "response" to y, is sometimes used, and denotes the state A reaches after reading y. Finally, A accepts a string x ∈ Σ* if A, starting in q_0 at the left end of x, goes through some computation, reads all of x, and stops in some final state. The set of all accepted strings is denoted T(A). More compactly,

    T(A) = {x ∈ Σ* | δ(q_0, x) ∈ F}.

Definition A set L ⊆ Σ* is called regular if there is some finite automaton A such that L = T(A).

Example Let L = {a^i b^j | i ≥ 0, j ≥ 0} = a*b*. We shall show that L is regular by constructing a finite automaton A which accepts L. To avoid excessive formalism, we draw the "state diagram" of A as in Fig. 2.1. The initial state is q_0 and the set of final states is F = {q_0, q_1}.

FIG. 2.1

In our discussions of families of languages, we want to show that certain sets are not definable. For this reason, we need to find a set that is not regular.

Theorem 2.2.1 The set L = {a^i b^i | i ≥ 0} is not regular.

Proof Suppose L = {a^i b^i | i ≥ 0} is regular. Then there exists a finite automaton A = (Q, Σ, δ, q_0, F) so that L = T(A). Consider the set {δ(q_0, a^i) | i ≥ 0}. There are infinitely many tapes a^i, i ≥ 0, and only finitely many states in Q. Therefore, there exist i, j, with i > j, so that

    δ(q_0, a^i) = δ(q_0, a^j).
By setting z = b^i in Eq. (2.2.2), we have:

    δ(q_0, a^i b^i) = δ(q_0, a^j b^i).

If δ(q_0, a^i b^i) = q is in F, then a^j b^i is in T(A), which is a contradiction, since i > j implies a^j b^i ∉ L. If q ∉ F, then a^i b^i ∉ T(A), which is again a contradiction. Therefore, L cannot be regular. □

We shall now establish a necessary condition on strings in a regular set. This theorem, and its generalizations for more powerful classes of sets, is an important tool in proving that certain sets do not belong to the family in question.

Theorem 2.2.2 (Iteration theorem for regular sets) Let A = (Q, Σ, δ, q_0, F) be a finite automaton with n states. Let w ∈ T(A) with lg(w) = m. If m ≥ n, then there exist x, y, z such that w = xyz, y ≠ Λ, and x y^k z ∈ T(A) for each k ≥ 0.

Proof Let w = a_1 ··· a_m, and consider the responses rp(a_1 ··· a_i) for 0 ≤ i ≤ m. Since there are m + 1 ≥ n + 1 prefixes and only n states, there must be p, r, with 0 ≤ p < r ≤ n, so that q_p = q_r, where q_i = rp(a_1 ··· a_i) for i ≥ 0. Take

    x = a_1 ··· a_p,   y = a_{p+1} ··· a_r,   z = a_{r+1} ··· a_m.

Note that lg(y) = r − p > 0. Thus w = xyz ∈ T(A). Also, rp(x) = rp(xy). We claim that rp(x) = rp(x y^k) for each k ≥ 0.

Proof of claim, by induction on k. The case k = 0 is trivial. For the induction step,

    rp(x y^{k+1}) = δ(rp(x y^k), y) = δ(rp(x), y) = rp(xy) = rp(x),

by the induction hypothesis, and the claim is verified. Therefore

    rp(x y^k z) = δ(rp(x y^k), z) = δ(rp(xy), z) = rp(xyz) = rp(w).

Thus, if w ∈ T(A), then x y^k z ∈ T(A) for each k ≥ 0. □

Next we turn to the question of dealing with regular sets over a one-letter alphabet.

Definition 1 A set X of natural numbers is said to be ultimately periodic if X is finite or if there exist two integers N_0 ≥ 0, p ≥ 1, so that if x ≥ N_0, then x ∈ X if and only if x + p ∈ X.

Example The sets X_1 and X_2 are clearly ultimately periodic:

    X_1 = {10, 12, 14, 16, 18, ...},   N_01 = 10, p_1 = 2;
    X_2 = {5, 8, 11, 14, 17, ...},     N_02 = 5,  p_2 = 3.
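Definition 1 can be tested mechanically on an initial segment of the natural numbers. The following Python sketch (an added illustration; the truncation bound "limit" is ours) checks the defining condition for X_1 and X_2 above:

    def ultimately_periodic_on_range(X, N0, p, limit):
        """Check Definition 1 on {N0, ..., limit}: above N0, membership
        in X is invariant under adding the period p."""
        return all((x in X) == (x + p in X) for x in range(N0, limit + 1))

    X1 = {10 + 2 * i for i in range(200)}   # {10, 12, 14, ...}
    X2 = {5 + 3 * i for i in range(200)}    # {5, 8, 11, ...}
    print(ultimately_periodic_on_range(X1, 10, 2, 300))   # True
    print(ultimately_periodic_on_range(X2, 5, 3, 300))    # True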
The union X_3 = X_1 ∪ X_2 is also ultimately periodic. Find N_03 and p_3.

It is useful to establish a canonical form for ultimately periodic sets.

Theorem 2.2.3 Let X be an ultimately periodic set with N_0 and p as in the above definition. Let

    A = X ∩ {i | 0 ≤ i < N_0}

and

    B = ⋃_{j=1}^{s} {g(j) + pi | i ≥ 0},

where

    G = {g(1), ..., g(s)} = X ∩ {i | N_0 ≤ i < N_0 + p}.

Then X = A ∪ B.

Proof First we show the following proposition. Write X_{N_0} = {x ∈ X | x ≥ N_0}.

Claim 1 B = ⋃_{j=1}^{s} {g(j) + pi | i ≥ 0} = X_{N_0}.

Proof that B ⊆ X_{N_0}. Note that, for any j, g(j) ∈ X and N_0 ≤ g(j) < N_0 + p. Then g(j) ∈ X_{N_0}; that is, g(j) + p·0 ∈ X_{N_0}.

Claim 2 If y ∈ X and y ≥ N_0, then y + ip ∈ X for all i ≥ 0.

Proof of Claim 2. The argument is a trivial induction on i, using the definition of ultimate periodicity.

Returning to the proof of Claim 1, note that, by Claim 2, g(j) + ip ∈ X_{N_0} for every i ≥ 0, and the first part of the proof is complete, because we have now shown that B ⊆ X_{N_0}.

Conversely, suppose that x ∈ X_{N_0}. Then x ∈ X and x ≥ N_0. Consider

    {x | x ≥ N_0} = ⋃_{i≥0} Y_i,   where Y_i = {y | N_0 + ip ≤ y < N_0 + (i + 1)p}.

This can always be done, as we are dividing an interval into a countable number of subintervals which do not overlap.

Claim 3 If x ∈ X_{N_0} ∩ Y_i, then x = g(j) + ip for some j, 1 ≤ j ≤ s. Moreover, x ∈ B.

Proof of Claim 3. The argument is an induction on i.

Basis. i = 0. Then x ∈ X_{N_0} and N_0 ≤ x < N_0 + p. Thus x = g(j) for some j, 1 ≤ j ≤ s. Therefore x = g(j) + 0·p, and the basis is complete.

Induction step. Assume the result for i = k. Suppose that x ∈ X_{N_0} and N_0 + (k + 1)p ≤ x < N_0 + (k + 2)p. Subtracting p from all parts of the inequality
gives N_0 + kp ≤ x − p < N_0 + (k + 1)p. Since x − p ≥ N_0, the definition of ultimate periodicity shows that x − p ∈ X; thus x − p ∈ X_{N_0} ∩ Y_k and, by the induction hypothesis, x − p = g(j) + kp for some j. Now we know that x = g(j) + (k + 1)p, and the induction has been extended. Note that x ∈ B, because B contains all numbers of this form.

Now the proof of Claim 1 can be finished. Recall that we assumed that x ∈ X and x ≥ N_0. Then x ∈ Y_i for some i and, by Claim 3, x = g(j) + pi for some j, so that x ∈ B.

The main proof is now easy. Clearly, if x ∈ A ∪ B, then x ∈ X, since A ⊆ X and B ⊆ X by Claim 1. On the other hand, suppose x ∈ X. If x < N_0, then x ∈ A. If x ≥ N_0, then x ∈ Y_i for some i, and Claim 3 implies that x ∈ B. □

We can also discuss ultimately periodic sequences as well as sets.

Definition An infinite sequence {x_n} of natural numbers is ultimately periodic if there exist integers N_0' ≥ 0 and p' ≥ 1 such that x_{n+p'} = x_n for each n ≥ N_0'.

Example (3, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, ...) is an ultimately periodic sequence.

This allows us to give a second definition of ultimately periodic sets.

Definition 2 A set X of natural numbers is said to be ultimately periodic if X is finite or if the sequence

    (y_1, y_2 − y_1, ..., y_{n+1} − y_n, ...)

is an ultimately periodic sequence, where (y_1, y_2, ...) are the elements of X in increasing order.

Example Consider the set X_3 defined at the beginning of this section. A simple calculation shows that the desired difference sequence is essentially the sequence of the previous example, and so X_3 is ultimately periodic.

Of course, it must be shown that these definitions are equivalent.

Theorem 2.2.4 Definition 1 is equivalent to Definition 2.

Proof The argument is left to the reader. □

Definition 2 is more convenient than Definition 1 for showing that particular sets are not ultimately periodic. Now we are ready to prove the main result about this topic.

Theorem 2.2.5 A set L ⊆ {a}* is regular if and only if X is ultimately periodic, where X = {i | a^i ∈ L}.

Proof Assume that X is an ultimately periodic set of natural numbers. By Theorem 2.2.3, X = A ∪ B, where A is finite and B = ⋃_{j=1}^{s} {g(j) + pi | i ≥ 0}. Thus

    L = {a^i | i ∈ X} = {a^i | i ∈ A} ∪ ⋃_{j=1}^{s} {a^{g(j)+ip} | i ≥ 0}.
The first set is regular, since A is finite. The second set is a finite union of sets of the form

    {a^{g(j)+ip} | i ≥ 0} = a^{g(j)} (a^p)*,

and so these are regular sets and L is regular.

Conversely, suppose that L ⊆ {a}* is regular, with L = T(A), where A = (Q, {a}, δ, q_0, F). Write L = {a^i | i ∈ X}. If L is finite, then X is finite and hence ultimately periodic, and we are done. Suppose L is infinite. Then let N_0 be the least number such that a^{N_0} ∈ L and N_0 ≥ |Q|. By the iteration theorem, we may write a^{N_0} = a^r a^p a^s, p ≥ 1, and moreover

    rp(a^{N_0}) = rp(a^r (a^p)^n a^s) ∈ F for any n ≥ 0.

When n = 2, we have rp(a^{N_0}) = rp(a^{N_0 + p}).

To show that X is ultimately periodic, let i ≥ N_0, so that i = N_0 + s' for some s' ≥ 0. We must show that i ∈ X if and only if i + p ∈ X. We compute

    rp(a^i) = rp(a^{N_0 + s'}) = δ(rp(a^{N_0}), a^{s'}) = δ(rp(a^{N_0 + p}), a^{s'}) = rp(a^{N_0 + s' + p}) = rp(a^{i+p}).

Therefore i ∈ X if and only if i + p ∈ X. Thus X is ultimately periodic. □

One important use of this result is to show that certain sets cannot be regular.

Example {a^{n²} | n ≥ 1} is not regular. Let X = {n² | n ≥ 1}. Suppose that X were ultimately periodic. Then the difference sequence (y_1, y_2 − y_1, ...) would be ultimately periodic. But this sequence is

    (1, 3, 5, 7, 9, 11, ..., 2n + 1, ...).

In order for the sequence to be ultimately periodic, we would need y_{n+1} − y_n = y_{n+p'+1} − y_{n+p'} for some p' ≥ 1 and all sufficiently large n. This implies 2n + 1 = 2(n + p') + 1, which implies p' = 0, which contradicts p' ≥ 1.

PROBLEMS

1 Design a finite automaton that accepts the set

    L = {w ∈ {a, b}* | w does not contain the subword bab}.

2 Let x ∈ Σ*. Construct a finite automaton that accepts {x}. How many states are there in the minimal finite automaton for this set?

3 Let R ⊆ Σ × Σ be given. Define

    H_R = {a_1 ··· a_k | k ≥ 2, (a_i, a_{i+1}) ∈ R for each i, 1 ≤ i < k}.

Show that H_R is regular.

4 Let X, Y ⊆ Σ*. Define

    XY^{−1} = {w | wy ∈ X for some y ∈ Y}.

Show that if X is regular and Y is arbitrary, then XY^{−1} is regular.

5 Show that the set {xx | x ∈ {a, b}*} is not regular.

6 Show that the following sets are not regular.
a) {a^{n³} | n ≥ 1}
b) {a^p | p is prime}
c) {a^{n!} | n ≥ 1}
7 Let R be an equivalence relation on a set. Let rk(R), the rank of the equivalence relation R, be the number of equivalence classes induced by R. Let E_1 and E_2 be equivalence relations on X. We say E_1 refines E_2 if E_1 ⊆ E_2 (that is, x E_1 y implies x E_2 y). Show that if E_1 and E_2 are equivalence relations on X and E_1 ⊆ E_2, then rk(E_1) ≥ rk(E_2).

8 Let R be an equivalence relation on Σ*. R is a right congruence relation if x R y implies that, for any word z ∈ Σ*, (xz) R (yz); that is, (x, y) ∈ R implies (xz, yz) ∈ R for all z ∈ Σ*. Furthermore, let L ⊆ Σ*. Then R refines L iff (x, y) ∈ R implies that x ∈ L if and only if y ∈ L. That is, if x R y, then either x and y are both in L, or x and y are both not in L. Show that R refines L if and only if L is a union of some of the equivalence classes of R.

9 Let L ⊆ Σ*. Define R_L, the right congruence relation induced by the set L, as follows: (x, y) ∈ R_L if and only if, for each z ∈ Σ*, [xz ∈ L if and only if yz ∈ L]. Prove the following proposition.

Theorem
a) R_L is a right congruence relation.
b) R_L refines L.
c) If R is any right congruence relation that refines L, then R ⊆ R_L.

10 Prove the following proposition:

Theorem Let L ⊆ Σ*. The following statements are equivalent:
a) L is regular.
b) There is a right congruence relation on Σ* of finite rank which refines L.
c) R_L has finite rank.

11 Prove Fermat's little theorem. That is, for any prime p and nonnegative integer a,

    a^p ≡ a mod p.

Let us fix Σ = {0, 1} and define a mapping from Σ* into the natural numbers, as follows: x ↦ x̄, where x̄ is the integer represented by x as a binary string. If x = a_{n−1} ··· a_0, then

    x̄ = Σ_{i=0}^{n−1} a_i 2^i.

For example, the strings 1, 10, 101, and Λ represent 1, 2, 5, and 0, respectively.
12 Prove the following facts:
a) The map x ↦ x̄ is a one-to-one map from {1}Σ* onto the positive integers.
b) For any x, y ∈ Σ*, (xy)‾ = 2^{lg(y)} x̄ + ȳ.
c) For any x ∈ Σ⁺ and k ≥ 1,

    (x^k)‾ = x̄ (2^{k·lg(x)} − 1) / (2^{lg(x)} − 1).

13 Prove the following result.

Theorem Let Σ = {0, 1} and let p be an odd prime. If x, y, z ∈ Σ* are such that 2^{lg(y)} ≢ 1 mod p, then

    (x y^{p−1} z)‾ ≡ (xz)‾ mod p.

14 Prove the following result: Let Σ = {0, 1}, with x, y, z ∈ Σ*. Let p = (x y^k z)‾ be an odd prime and assume 2^{lg(y)} ≢ 1 mod p. Then

    (x y^{k+p−1} z)‾ ≡ (x y^k z)‾ ≡ 0 mod p.

15 Is the hypothesis that 2^{lg(y)} ≢ 1 mod p actually needed in Problems 13 and 14?

Now the previous problems can be combined to show that no infinite set of prime numbers can be regular if the primes are represented in this fashion.

16 Prove the following result.

Theorem Let Σ = {0, 1}. Let P be any infinite subset of {1}Σ* such that w ∈ P implies that w̄ is prime. Then P is not regular.

2.3 TRANSITION SYSTEMS

For our purposes it is convenient to consider a more complex type of device than a finite automaton. Let us describe our new device by a state graph; i.e., a finite directed labelled graph. Consider the graph shown in Fig. 2.2. Note that this graph has two arrows labelled 0 leaving q_0, and it even has two initial states. Formally, we have the following definition:

Definition A transition system is a 5-tuple A = (Q, Σ, δ, Q_0, F), where Q, Σ, and F are as in a finite automaton; ∅ ≠ Q_0 ⊆ Q is the set of initial states; and δ is a finite subset of Q × Σ* × Q.

Note that every finite automaton is a transition system.

Definition A transition system A = (Q, Σ, δ, Q_0, F) accepts w ∈ Σ* if there exist r ≥ 0; w = u_1 ··· u_r, u_i ∈ Σ*; and q_0, ..., q_r ∈ Q such that

i) q_0 ∈ Q_0,
ii) (q_i, u_{i+1}, q_{i+1}) ∈ δ for 0 ≤ i < r,
iii) q_r ∈ F.
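Acceptance by a transition system is just a path search, and a direct implementation is short. In the Python sketch below (an added illustration; the machine of Fig. 2.2 is not reconstructed here, so a small artificial δ is used instead), a triple (q, u, q') carries a word u, with Λ written as the empty string, and a breadth-first search over pairs (state, input position) decides acceptance:

    from collections import deque

    def ts_accepts(delta, Q0, F, w):
        """delta: set of triples (q, u, q2) with u a word ("" for Lambda).
        Search over configurations (state, position in w)."""
        seen, queue = set(), deque((q, 0) for q in Q0)
        while queue:
            q, i = queue.popleft()
            if (q, i) in seen:
                continue
            seen.add((q, i))
            if i == len(w) and q in F:
                return True
            for (p, u, r) in delta:
                if p == q and w.startswith(u, i):
                    queue.append((r, i + len(u)))
        return False

    # A tiny example (not Fig. 2.2):
    delta = {("q0", "", "q1"), ("q0", "10", "q0"), ("q1", "1", "q2")}
    print(ts_accepts(delta, {"q0"}, {"q2"}, "10101"))   # True: 10, 10, Lambda, 1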
FIG. 2.2 (F = {q_3}; Q_0 = {q_0, q_1})

Write T(A) = {w | A accepts w}. Intuitively, A accepts w if there is a path in the graph from some initial state to some final state such that the concatenation of all the labels on the path is w. Note that if A is a finite automaton, T(A) is the same as when A is regarded as a transition system.

Example In the previous example, 101011 ∈ T(A). There are infinitely many different paths from q_0 to q_3 with that label.

Since every finite automaton is a transition system, these devices accept all the regular sets, and possibly more. We now show that these devices are no more powerful than finite automata.

Theorem 2.3.1 (Myhill) For each transition system A = (Q, Σ, δ, Q_0, F), T(A) is regular.

Proof We may assume that each element of δ is a member of Q × (Σ ∪ {Λ}) × Q. [For if (q, x, q') ∈ δ with lg(x) > 1, we write x = a_1 ··· a_k, k ≥ 2. Delete (q, x, q') from δ and add to δ the triples

    (q, a_1, q_1), (q_1, a_2, q_2), ..., (q_{k−2}, a_{k−1}, q_{k−1}), (q_{k−1}, a_k, q'),

where the q_i are all new state symbols.]

Define B = (2^Q, Σ, δ_B, {Q_0'}, F_B), where

    F_B = {X ⊆ Q | X ∩ F ≠ ∅}.

Define the direct spontaneous transition relation P_0 on Q by:

    (q, q') ∈ P_0 if and only if (q, Λ, q') ∈ δ;
then let P = P_0* be the reflexive transitive closure of P_0. Define

    Q_0' = {q | q_0 P q for some q_0 ∈ Q_0}.

Finally, for each V ⊆ Q, define

    δ_B(V, a) = {q' | (v, a, w) ∈ δ and w P q' for some (w, v) ∈ Q × V}.

To complete the proof, we must argue that T(B) = T(A). It suffices to show the following:

Claim q ∈ δ_B(Q_0', x) if and only if there exist r ≥ 0; q_0, ..., q_r in Q; b_1, ..., b_r in Σ ∪ {Λ} such that:

i) x = b_1 ··· b_r,
ii) q_0 ∈ Q_0,
iii) (q_i, b_{i+1}, q_{i+1}) ∈ δ for 0 ≤ i < r,
iv) q_r = q.

The argument is an induction on m = lg(x).

Basis. m = 0, x = Λ. Then q ∈ δ_B(Q_0', Λ) if and only if q ∈ Q_0'; that is, if and only if q_0 P q for some q_0 ∈ Q_0, which unwinds into a sequence of the required form with all b_i = Λ, and the basis is established.

Induction step. Suppose the result true for all strings x of length k. We show it true for xa, a ∈ Σ, which has length k + 1. Let V = δ_B(Q_0', x). Then

    q' ∈ δ_B(Q_0', xa) = δ_B(δ_B(Q_0', x), a) = δ_B(V, a) = {q' | (v, a, w) ∈ δ, w P q' for some (w, v) ∈ Q × V}.

But v ∈ V implies that claims (i) through (iv) hold for v. Note that w P q' corresponds to a finite number of additional Λ-moves (all of type (iii)) ending at q'. Thus there exists a sequence of the desired type. Conversely, if such a sequence occurs ending at q', we have q' ∈ δ_B(Q_0', xa). Thus the claim is verified.

Finally, we verify that T(B) = T(A): x ∈ T(B) if and only if δ_B(Q_0', x) ∈ F_B. This holds if and only if δ_B(Q_0', x) ∩ F ≠ ∅. In turn, this is equivalent to the existence of r ≥ 0; b_1, ..., b_r in Σ ∪ {Λ}; q_0, ..., q_r ∈ Q such that:

i) x = b_1 ··· b_r,
ii) q_0 ∈ Q_0,
iii) (q_i, b_{i+1}, q_{i+1}) ∈ δ for 0 ≤ i < r,
iv) q_r ∈ F.

Therefore, x ∈ T(B) if and only if x ∈ T(A). □

Some applications of transition systems will now be given.

Theorem 2.3.2 If X is regular, then X^T (the reversal of X) is regular.

Proof Let X = T(A), where A = (Q, Σ, δ, q_0, F) is a finite automaton. Define

    B = (Q, Σ, δ_B, F, {q_0}),

where

    δ_B = {(q', a, q) | q' = δ(q, a)}.

Note that B is not necessarily a finite automaton, but it is a transition system. We claim that, for each x = a_1 ··· a_m, a_i ∈ Σ for 1 ≤ i ≤ m, and each q ∈ Q, δ(q, x) = q' if and only if there exist q_1, ..., q_{m+1} so that:

i) q_{m+1} = q',
ii) (q_{i+1}, a_i, q_i) ∈ δ_B for 1 ≤ i ≤ m,
iii) q_1 = q.

Proof of claim, by induction on lg(x).

Basis. x = Λ, so δ(q, Λ) = q. But q' = q_1 = q_{0+1} = q, and the basis is verified.

Induction step. Assuming the result for all tapes of length m, we have that δ(q, xa) = δ(δ(q, x), a). By the induction hypothesis applied to x = a_1 ··· a_m, and by the definition of δ_B, there exist q_1, ..., q_{m+1} so that:

i) q_{m+1} = δ(q, x),
ii) (q_{i+1}, a_i, q_i) ∈ δ_B for 1 ≤ i ≤ m,
iii) q_1 = q,
iv) (δ(q, xa), a, δ(q, x)) ∈ δ_B.

Taking q_{m+2} = δ(q, xa) and a_{m+1} = a, the result follows.

It is now easy to verify that T(B) = (T(A))^T = X^T. Let x = a_1 ··· a_m with a_i ∈ Σ for 1 ≤ i ≤ m. Then x ∈ T(B) if and only if there exist q_1, ..., q_{m+1} ∈ Q such that

i) q_1 ∈ F,
ii) (q_i, a_i, q_{i+1}) ∈ δ_B for 1 ≤ i ≤ m,
iii) q_{m+1} = q_0.

This holds if and only if δ(q_0, x^T) = q_1 ∈ F, if and only if x^T ∈ X, if and only if x ∈ X^T. Thus T(B) = X^T is regular. □
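The construction in this proof is mechanical, and the following Python sketch (an added illustration, reusing ts_accepts from the sketch above; the sample automaton is partial, which suffices for a transition system) builds B from A by reversing every arrow and exchanging the roles of initial and final states:

    def reverse_automaton(delta, q0, finals):
        """Build the transition system B of Theorem 2.3.2 from a finite
        automaton A: reverse every arrow, make F the set of initial
        states and {q0} the final state; then T(B) is T(A) reversed."""
        delta_B = {(q2, a, q1) for ((q1, a), q2) in delta.items()}
        return delta_B, set(finals), {q0}

    # A accepts {a}*{b}; B then accepts {b}{a}*.
    delta = {("p", "a"): "p", ("p", "b"): "f"}
    delta_B, Q0, F = reverse_automaton(delta, "p", {"f"})
    print(ts_accepts(delta_B, Q0, F, "baa"))   # True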
FIG. 2.3 (the finite automaton A, with T(A) = X)    FIG. 2.4 (the transition system B = A^T, with T(B) = X^T)

Example X = {a}*{b}, X^T = {b}{a}*. X is regular, since it is recognized by the finite automaton A of Fig. 2.3. Construct the transition system B from A by converting all final states to initial states, converting the initial state to a final state, and reversing all arrows. (See Fig. 2.4.)

Next we show closure of the regular sets under product and star.

Theorem 2.3.3 If X and Y are regular sets, then XY is regular and X* is regular.

Proof Suppose X = T(A_1) and Y = T(A_2), where each A_i = (Q_i, Σ, δ_i, q_{0i}, F_i) for i = 1, 2 denotes a finite automaton. Assume, without loss of generality, that Q_1 ∩ Q_2 = ∅.

First we construct a new transition system A, as follows:

    A = (Q_1 ∪ Q_2, Σ, δ_A, {q_{01}}, F_2),

where

    δ_A = δ_1 ∪ δ_2 ∪ {(q, Λ, q_{02}) | q ∈ F_1}.

The intuitive picture of this construction is shown in Fig. 2.5. There is a path from q_{01} to a final state in F_2 under w = xy if and only if there is a path from q_{01} to a final state of A_1 under x and a path from q_{02} to some final state of F_2 under y. Formally, it is easily seen that T(A) = T(A_1) T(A_2) = XY, and therefore XY is regular.

FIG. 2.5
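The product construction is equally mechanical. The Python sketch below (an added illustration, in the triple form used in the earlier sketches; it assumes the two state sets are disjoint) forms the transition system for XY by adding a Λ-move from each final state of A_1 to the start state of A_2:

    def concat_ts(delta1, q01, F1, delta2, q02, F2):
        """Transition system for XY (cf. Fig. 2.5): run A1, then jump by a
        Lambda-move ("" label) from each final state of A1 to the start of A2.
        The state sets of A1 and A2 are assumed disjoint."""
        delta = set(delta1) | set(delta2) | {(q, "", q02) for q in F1}
        return delta, {q01}, set(F2)

The result can be fed directly to ts_accepts above.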
Next we construct a transition system B which can accept X*. Let

    B = (Q_1 ∪ {q_0'}, Σ, δ_B, {q_0'}, {q_0'}),

where q_0' is a new state not in Q_1, and

    δ_B = δ_1 ∪ {(q_0', Λ, q_{01})} ∪ {(q, Λ, q_0') | q ∈ F_1}.

The intuitive picture is shown in Fig. 2.6. It is easily verified that T(B) = (T(A_1))* = X*, and hence X* is regular. □

FIG. 2.6

PROBLEMS

1 It is easier to design a transition system to recognize a regular set than to do it with a finite automaton. In this problem, we shall attempt to see how much easier. Find a regular set X_n, n ≥ 1, such that X_n can be accepted by a transition system with n states, but any ordinary finite automaton requires 2^n states.

2 Show that if L ⊆ Σ* is regular and φ: Σ* → Δ* is a homomorphism, then φL = {φx | x ∈ L} is also regular.

3 Show that if L ⊆ Δ* is regular and φ: Σ* → Δ* is a homomorphism, then φ^{−1}L = {x ∈ Σ* | φx ∈ L} is also regular.

4 Show that the family of regular sets is closed under union.

5 Are the following propositions true or false? Support your answer by giving proofs or counterexamples.
a) If L_1 ∪ L_2 is regular and |L_1| = 1, then L_2 is regular.
b) If L_1 ∪ L_2 is regular and L_1 is finite, then L_2 is regular.
c) If L_1 ∪ L_2 is regular and L_1 is regular, then L_2 is regular.
d) If L_1 L_2 is regular and |L_1| = 1, then L_2 is regular.
e) If L_1 L_2 is regular and L_1 is finite, then L_2 is regular.
f) If L_1 L_2 is regular and L_1 is regular, then L_2 is regular.
g) If L* is regular, then L is regular.

2.4 KLEENE'S THEOREM

The family of regular sets has a lovely set-theoretic characterization due to Kleene.

Theorem 2.4.1 The family of regular sets (over Σ) is the smallest family containing {a} for each a ∈ Σ and which is closed under union, concatenation, and *.
FIG. 2.7

Proof Let ℛ be the family of regular sets and let 𝒦 be the least family containing the sets {a}, a ∈ Σ, and closed under ∪, ·, and *. Clearly 𝒦 ⊆ ℛ, since every set {a} is regular and the regular sets are closed under the three operations, by Theorem 2.3.3 and Problem 4 of Section 2.3.

Conversely, one must show that ℛ ⊆ 𝒦. One technique for doing this is to let A = (Q, Σ, δ, q_1, F) be a finite automaton with Q = {q_1, ..., q_n}, Σ = {0, 1}, and F = {q_{s_1}, q_{s_2}, ..., q_{s_r}}.

Definition For all i, j, k, with 1 ≤ i, j ≤ n and 0 ≤ k ≤ n, define the set α_{ij}^{(k)} inductively by:

    α_{ij}^{(0)} = {a ∈ Σ ∪ {Λ} | δ(q_i, a) = q_j},
    α_{ij}^{(k)} = α_{ij}^{(k−1)} ∪ α_{ik}^{(k−1)} (α_{kk}^{(k−1)})* α_{kj}^{(k−1)}.

Example See Fig. 2.7. For the automaton shown there, one such computation gives

    a ∪ (b ∪ Λ)(b ∪ Λ)* a = a ∪ b* a = b* a.

Claim 1 Let

    R_{ij}^{(k)} = {x ∈ Σ* | δ(q_i, x) = q_j, and for each y, z ∈ Σ⁺ with x = yz, δ(q_i, y) = q_ℓ implies ℓ ≤ k}.

Then α_{ij}^{(k)} = R_{ij}^{(k)}. That is, α_{ij}^{(k)} is the set of all words which take A from state q_i to state q_j without going through (i.e., both entering and leaving) any state q_ℓ with ℓ > k.

Proof By induction on k.

Basis. k = 0. The definition gives α_{ij}^{(0)} = {a ∈ Σ ∪ {Λ} | δ(q_i, a) = q_j}. Since it is impossible to factor a ∈ Σ ∪ {Λ} into a = yz with y, z ∈ Σ⁺, we have α_{ij}^{(0)} ⊆ R_{ij}^{(0)}. Conversely, let x ∈ R_{ij}^{(0)}. Then δ(q_i, x) = q_j. Assume that x = yz with y, z ∈ Σ⁺. Then δ(q_i, y) = q_ℓ with ℓ ≤ 0. But there is no q_ℓ ∈ Q with ℓ ≤ 0. Thus it is impossible to factor x as above, so x must be in Σ ∪ {Λ}. Therefore x ∈ {a ∈ Σ ∪ {Λ} | δ(q_i, a) = q_j} = α_{ij}^{(0)}. Thus R_{ij}^{(0)} ⊆ α_{ij}^{(0)}, so α_{ij}^{(0)} = R_{ij}^{(0)}.

Induction step. Assume α_{ij}^{(k−1)} = R_{ij}^{(k−1)} for all 1 ≤ i, j ≤ n. Then

    α_{ij}^{(k)} = α_{ij}^{(k−1)} ∪ α_{ik}^{(k−1)} (α_{kk}^{(k−1)})* α_{kj}^{(k−1)}
                 = R_{ij}^{(k−1)} ∪ R_{ik}^{(k−1)} (R_{kk}^{(k−1)})* R_{kj}^{(k−1)}   (by the induction hypothesis).

To show that α_{ij}^{(k)} ⊆ R_{ij}^{(k)}, assume that x ∈ α_{ij}^{(k)}. Then x ∈ R_{ij}^{(k−1)} or x ∈ R_{ik}^{(k−1)} (R_{kk}^{(k−1)})* R_{kj}^{(k−1)}. In the first case, x ∈ R_{ij}^{(k)}, since R_{ij}^{(k−1)} ⊆ R_{ij}^{(k)} (an easy consequence of the definition of R_{ij}^{(k)}). In the second case, x = yzw, where y ∈ R_{ik}^{(k−1)},
z ∈ (R_{kk}^{(k−1)})*, and w ∈ R_{kj}^{(k−1)}. Then

    q_j = δ(δ(δ(q_i, y), z), w) = δ(q_i, yzw) = δ(q_i, x).

Now suppose x = x'x'', with x', x'' ∈ Σ⁺. An analysis of the cases lg(x') < lg(y); lg(x') = lg(y); lg(y) < lg(x') < lg(yz); lg(x') = lg(yz); and lg(yz) < lg(x') < lg(yzw) indicates that δ(q_i, x') = q_ℓ with ℓ ≤ k. Thus x ∈ R_{ij}^{(k)}, which means that α_{ij}^{(k)} ⊆ R_{ij}^{(k)}.

We must now show that R_{ij}^{(k)} ⊆ α_{ij}^{(k)}. Assume x ∈ R_{ij}^{(k)}. Then δ(q_i, x) = q_j and, for every y, z ∈ Σ⁺ such that x = yz, δ(q_i, y) = q_ℓ with ℓ ≤ k. If, for every such factorization, this holds with ℓ < k, then x ∈ R_{ij}^{(k−1)} = α_{ij}^{(k−1)} by the induction hypothesis. Suppose this stronger condition does not hold. Then there exist y, z ∈ Σ⁺ such that x = yz and δ(q_i, y) = q_k. Let us further restrict y and z so that y is minimal, in the sense that if y = uv, u, v ∈ Σ⁺, δ(q_i, u) = q_ℓ, then ℓ < k. This simply means that y ∈ R_{ik}^{(k−1)}. Now factor z into two words z = z'z'' with δ(q_k, z') = q_k and z'' ≠ Λ (we can always do this at first by picking z' = Λ, z'' = z). Furthermore, choose z' and z'' so that z' is the maximal prefix of z such that δ(q_k, z') = q_k and lg(z') < lg(z). Thus z' ∈ (R_{kk}^{(k−1)})* and z'' ∈ R_{kj}^{(k−1)}.

Summarizing, we have shown that if x ∈ R_{ij}^{(k)}, then either x ∈ R_{ij}^{(k−1)}, or else x can be factored into x = yz'z'' with y ∈ R_{ik}^{(k−1)}, z' ∈ (R_{kk}^{(k−1)})*, and z'' ∈ R_{kj}^{(k−1)}. But this means that

    x ∈ R_{ij}^{(k−1)} ∪ R_{ik}^{(k−1)} (R_{kk}^{(k−1)})* R_{kj}^{(k−1)} = α_{ij}^{(k)}.

Thus R_{ij}^{(k)} ⊆ α_{ij}^{(k)}, implying R_{ij}^{(k)} = α_{ij}^{(k)}, and the claim follows by induction.

The situation can be summarized by the diagram in Fig. 2.8.

FIG. 2.8 (x ∈ R_{ij}^{(k)} factored as y ∈ R_{ik}^{(k−1)}, z' ∈ (R_{kk}^{(k−1)})*, z'' ∈ R_{kj}^{(k−1)})

Claim 2 Let A be a finite automaton as above. Then

    T(A) = ∅ if r = 0 (i.e., there are no final states);
    T(A) = α_{1 s_1}^{(n)} ∪ α_{1 s_2}^{(n)} ∪ ··· ∪ α_{1 s_r}^{(n)} otherwise.

Proof If A has no final states, then T(A) = ∅. Suppose r ≠ 0. By Claim 1,

    α_{1j}^{(n)} = {x ∈ Σ* | δ(q_1, x) = q_j, and for each y, z ∈ Σ⁺ with x = yz, δ(q_1, y) = q_ℓ implies ℓ ≤ n}
                = {x ∈ Σ* | δ(q_1, x) = q_j}   (since ℓ ≤ n for all q_ℓ ∈ Q).
Now x ∈ T(A) if and only if δ(q_1, x) = q_{s_i} for some i, 1 ≤ i ≤ r; if and only if x ∈ α_{1 s_i}^{(n)} for some i, 1 ≤ i ≤ r; if and only if

    x ∈ ⋃_{1≤i≤r} α_{1 s_i}^{(n)} = α_{1 s_1}^{(n)} ∪ ··· ∪ α_{1 s_r}^{(n)}.

Therefore T(A) is a finite union of the sets α_{1 s_i}^{(n)}, each of which belongs to 𝒦 by its inductive definition. Note that the form of the regular set obtained by this method is not unique, since it depends upon the numbering of the states. This argument shows that each L ∈ ℛ is also in 𝒦, so that ℛ = 𝒦, and the proof is complete. □

PROBLEMS

1 Show that:
a) α_{kj}^{(k)} = (α_{kk}^{(k−1)})* α_{kj}^{(k−1)};
b) α_{ik}^{(k)} = α_{ik}^{(k−1)} (α_{kk}^{(k−1)})*;
c) α_{kk}^{(k)} = (α_{kk}^{(k−1)})*.

2 Give a procedure for constructing a finite automaton directly from a "regular expression," i.e., a description of a regular set in terms of Kleene's theorem. [Hint: Use transition systems.]

3 Let L be a regular set. Let

    M_L = {y | xyz ∈ L for some x, z ∈ Σ* such that lg(x) = lg(y) = lg(z)}.

Show that M_L, the set of middle thirds of L, is regular.

2.5 RIGHT LINEAR GRAMMARS AND LANGUAGES

We are already in a position to characterize the right linear languages. The argument can be broken down into two lemmas. The first deals with the "power" of this class of grammars, and shows that any regular set can be generated by some right linear grammar.

Lemma 2.5.1 If L is a regular set, then L = L(G) for some right linear grammar G.

Proof Since L is regular, there exists a finite automaton A = (Q, Σ, δ, q_1, F) such that L = T(A). (q_1 has been chosen as the initial state for technical convenience.)
Since, by a suitable renaming of Q, we can make Σ ∩ Q = ∅, we can assume, without loss of generality, that Σ and Q are disjoint. Define

    G = (V, Σ, P, q_1),

where

    V = Σ ∪ Q,
    P = {q → a δ(q, a) | (q, a) ∈ Q × Σ} ∪ {q → Λ | q ∈ F},

the set of variables being Q and the start symbol being q_1 ∈ Q. Since δ: Q × Σ → Q, we have immediately that G is a right linear grammar. All that must be shown is that L(G) = T(A).

Claim For all y ∈ Σ*, there exists a unique derivation of the form q_1 ⇒* yq with q ∈ Q. Moreover, q = δ(q_1, y).

Proof This is an obvious consequence of the fact that y determines a unique path through the automaton. A formal proof is included as an example of how such proofs are carried out. Hereafter, proofs of this kind, when straightforward, will be left to the reader. The proof is by induction on lg(y).

Basis. lg(y) = 0 implies y = Λ; and, by definition, δ(q_1, Λ) = q_1. Therefore

    q_1 ⇒* q_1 = Λ q_1 = Λ δ(q_1, Λ)

in zero steps, and this derivation is unique, since every rule of P whose righthand side contains a variable also adds a terminal letter.

Induction step. Assume that, for all y ∈ Σ*, if lg(y) ≤ n, then there exists a unique derivation q_1 ⇒* yq, and q = δ(q_1, y). Let w ∈ Σ* with lg(w) = n + 1. Then w can be written

    w = ya,   a ∈ Σ, y ∈ Σ*, lg(y) = n.

By the induction hypothesis, there exists a unique derivation

    q_1 ⇒* y δ(q_1, y).

But

    δ(q_1, y) → a δ(δ(q_1, y), a)

is in P, and it is the unique rule that extends the derivation by the terminal a, since δ is a function. Therefore,

    q_1 ⇒* y δ(q_1, y) ⇒ ya δ(δ(q_1, y), a).

Now, using the fact that

    δ(δ(q_1, y), a) = δ(q_1, ya)
and w = ya, we have

q_1 ⇒* wδ(q_1, w)

uniquely, which proves the claim.

Therefore, q_1 ⇒* yδ(q_1, y) ⇒ y if and only if δ(q_1, y) → Λ is in P. But, from the definition of P, δ(q_1, y) → Λ is in P if and only if δ(q_1, y) ∈ F. Therefore, q_1 ⇒* y if and only if δ(q_1, y) ∈ F, or, equivalently, y ∈ L(G) if and only if y ∈ T(A), which proves the lemma. □

Corollary Every regular set is an unambiguous context-free language.

Example Let G be

S → SbS | a

We have seen that G is ambiguous. But L(G) = (ab)*a is regular, and hence there exists an unambiguous right linear grammar G′ such that L(G′) = (ab)*a. For instance, G′ could be taken to be

S → aM | a
M → bS

Now we turn to the reverse direction of simulating a right linear grammar by a finite automaton. Such a direct construction would be tedious, because there is much more "freedom" in what the grammar may do than there is in a finite automaton; but the proof becomes transparent with the aid of transition systems.

Lemma 2.5.2 If L is generated by a right linear grammar, then L is regular.

Proof Let L = L(G), where G = (V, Σ, P, S) is a right linear grammar. Now define a transition system

C = (Q, Σ, δ, {S}, F),
Q = (V − Σ) ∪ {q}, q ∉ V,
F = {q},
δ = {(q_i, u, q_j) | q_i → uq_j is in P} ∪ {(q_i, u, q) | q_i → u is in P}.

Clearly, C is a transition system, and all that must be shown is L(G) = T(C). As is typical in these constructions, it is easier to prove more.
Claim Let r ≥ 1. S = w_0 ⇒ w_1 ⇒ w_2 ⇒ ⋯ ⇒ w_r = w ∈ L(G) if and only if there exist u_i ∈ Σ* for 0 < i ≤ r and q_i ∈ Q for 0 ≤ i < r, such that:

i)′ w_i = u_1 ⋯ u_i q_i for 0 ≤ i < r,
ii)′ w_r = u_1 ⋯ u_r,
iii)′ (q_i, u_{i+1}, q_{i+1}) is in δ for 0 ≤ i < r − 1, and
iv)′ (q_{r−1}, u_r, q) is in δ.

Proof of the claim It is easy to see, by induction on r, that S = w_0 ⇒ w_1 ⇒ ⋯ ⇒ w_r = w ∈ L(G) if and only if there exist u_i ∈ Σ* for 0 < i ≤ r and q_i ∈ N for 0 ≤ i < r, such that

i) w_i = u_1 ⋯ u_i q_i for 0 ≤ i < r,
ii) w_r = u_1 ⋯ u_r,
iii) q_i → u_{i+1}q_{i+1} is in P for 0 ≤ i < r − 1, and
iv) q_{r−1} → u_r is in P.

But, by the construction, (iii)′ and (iv)′ are equivalent to (iii) and (iv). The second half of the biconditional, together with the fact that the initial set of states is {S} and q ∈ F, gives us S ⇒* w, w ∈ Σ*, if and only if w ∈ T(C). Therefore, L(G) = T(C). Since C is a transition system, L(G) is regular. □

Combining this with Lemma 2.5.1 leads to the main result of this section.

Theorem 2.5.1 L is a right linear language if and only if L is a regular set.

Corollary Every regular set is a context-free language.

Consider the grammar G with production

S → aSb | Λ

G is a linear grammar with only one variable, and L(G) = {aⁱbⁱ | i ≥ 0}. We know that L(G) is not regular. Therefore, there are linear languages which are not right linear languages.

Theorem 2.5.2 The family of right linear languages (regular sets) is a proper subfamily of the family of linear context-free languages.

Corollary The family of regular sets is properly contained in the family of context-free languages.

We will establish other properties of these classes of grammars in later sections when more techniques are available.
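The construction in the proof of Lemma 2.5.1 is completely mechanical, and it can help to see it spelled out as a program. The following sketch is ours, not the book's: it assumes the automaton is given as a set of states, an alphabet, a transition dictionary, a start state, and a set of final states, and it returns the productions of the right linear grammar of the lemma.

def dfa_to_right_linear(Q, Sigma, delta, q1, F):
    """Lemma 2.5.1: the variables are the states; the start symbol is q1."""
    P = []
    for q in Q:
        for a in Sigma:
            P.append((q, [a, delta[(q, a)]]))  # q -> a delta(q, a)
    for q in F:
        P.append((q, []))                      # q -> Lambda, for q final
    return P, q1

Since each state has exactly one production per input symbol and at most one Λ-rule, the resulting grammar is unambiguous, which is the content of the corollary above.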
PROBLEMS

1 Show that every linear context-free language may be generated by a context-free grammar G = (V, Σ, P, S), where each rule is of the form

A → a, A → aB, or A → Ba,

where A, B ∈ N, a ∈ Σ ∪ {Λ}. This is sometimes called linear normal form.

2 Prove the following fact:

Fact L ⊆ Σ* is a linear context-free language if and only if there exist an alphabet Σ′, a regular set U ⊆ (Σ′)*, and two homomorphisms φ_1 and φ_2 from (Σ′)* into Σ* so that L = {φ_1(w)φ_2(w^T) | w ∈ U}.

3 Show that a language L is left linear if and only if L is regular.

4 A context-free grammar is metalinear if each rule is of the form:

S → A_1 ⋯ A_m, A_i ∈ N − {S};
A → u_1Bu_2, B ∈ N − {S}, u_1, u_2 ∈ Σ*;
A → u, u ∈ Σ*.

A set L is metalinear if L = L(G) for some metalinear grammar. Show that L is metalinear if and only if L is a finite union of products of linear context-free languages.

5 Consider an automaton which has a nondeterministic finite-state control, two read-only input tapes equipped with right endmarkers, and which computes as follows: It starts with input pair (w_1$, w_2$), reading the left end of both tapes. At each instant of time, the read head of one tape (assume it is associated with the current state) advances one square to the right and a state change occurs. The device accepts by final state when one tape has been entirely read. We say the device accepts (x, y) if (x$, y$) leads to an accepting configuration. Such a device accepts a binary relation on words. Formalize such devices. Show that if L is a relation accepted by such a device, then Π_1(L) = {x ∈ Σ* | (x, y) ∈ L for some y ∈ Σ*} is regular.

6 Show that L is a linear context-free language if every string z ∈ L can be represented as z = xy and the set {(x, y^T) | z = xy ∈ L} is accepted by a nondeterministic two-tape finite automaton. (Cf. Problem 5.)

7 Let L ⊆ Σ*cΣ* with c ∉ Σ. Is it true that if L is a linear context-free language, then {x | xcy ∈ L for some y ∈ Σ*} is a regular set?

8 Show that, for k ≥ 1, L_k = a_1* ⋯ a_k* may be generated by a right linear grammar which has k variables. Show that there is no right linear grammar with fewer than k variables that generates L_k.
2.6 TWO-WAY FINITE AUTOMATA

We now consider finite automata which can move both ways on their input tapes. Intuitively, such devices should be able to accept more than just regular sets, since they can revisit certain parts of the input.

Definition A two-way finite automaton is a 5-tuple A = (Q, Σ, δ, q_0, F), where Q, Σ, q_0, F are as in a finite automaton, and δ is a map from Q × Σ into {−1, 0, +1} × Q.

Intuitively, δ(q, a) = (d, q′) means that, if A is in state q reading input a, then A goes to the left when d = −1, remains stationary if d = 0, or goes to the right if d = +1, and enters state q′.

Definition Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. Assume, without loss of generality, that Q ∩ Σ = ∅. An instantaneous description (or ID) of A is an element of Σ*QΣ*.

Example The ID a_1 ⋯ a_{i−1}qa_i ⋯ a_n describes the machine with its finite-state control in state q, reading the ith square of the tape a_1a_2 ⋯ a_n.

Definition The move relation ⊢ on IDs of a two-way finite automaton A = (Q, Σ, δ, q_0, F) is defined as follows: For any q ∈ Q and a_1, …, a_n ∈ Σ,

a_1 ⋯ a_{i−1}qa_i ⋯ a_n ⊢ a_1 ⋯ a_{i−1+d}q′a_{i+d} ⋯ a_n if δ(q, a_i) = (d, q′),

where n ≥ 1, 1 ≤ i ≤ n, and a_{n+1} = Λ. Moreover, if i = 1, then d ≥ 0. The convention a_{n+1} = Λ allows A to leave the righthand end of its input tape. The last condition prevents A from going left from the first square of the tape. As usual, ⊢* denotes the reflexive–transitive closure of ⊢.

It is interesting to compare the following two relationships:

a) qz ⊢* q′z,
b) xqz ⊢* xq′z.

If (a) holds, it means that A starts on the lefthand symbol of z, computes within z, and returns to the same place. Thus, if (a) holds, one can say that (b) holds. On the other hand, if (b) holds, A starts at the same place but the computation may involve visiting x as well as z, and one cannot conclude that (a) holds.

Next, we can define acceptance.

Definition Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. Define

T(A) = {x ∈ Σ* | q_0x ⊢* xq for some q ∈ F}.
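Because δ is a total function, a computation either leaves the right end of the tape or eventually repeats a pair (square, state) and loops forever. That observation yields a terminating simulator. The sketch below is ours (Python; the dictionary encoding of δ is an assumption, not the book's notation); it treats a left move attempted on the first square, for which no move is defined, as a rejecting computation.

def accepts(delta, q0, F, x):
    """Two-way finite automaton: delta maps (state, symbol) to (d, state)
    with d in {-1, 0, +1}.  Accept iff the head leaves the right end of x
    in a final state."""
    pos, q, seen = 0, q0, set()
    while pos < len(x):
        if (pos, q) in seen:
            return False              # repeated ID: A is in an infinite loop
        seen.add((pos, q))
        d, q2 = delta[(q, x[pos])]
        if pos == 0 and d == -1:
            return False              # no move left from the first square
        pos, q = pos + d, q2
    return q in F                     # the head has left the right end

For the automaton of the next example, accepts reports that 011 is accepted, while on input 00 the seen set detects the infinite loop.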
Example Let A be the automaton whose transition table is shown in Fig. 2.9.

FIG. 2.9 [Transition table of A: states q_0, q_1, q_2, inputs 0 and 1; each entry is a pair (d, q′); F = {q_2}.]

Clearly,

q_0011 ⊢ 0q_111 ⊢ ⋯ ⊢ 011q_2.

Thus, 011 ∈ T(A). The reader should verify that 00 ∉ T(A), and that 00 causes A to go into an infinite loop.

Since every finite automaton is a special case of a two-way finite automaton, these new devices can accept every regular set and possibly more. Our next definition is useful in analyzing the behavior of two-way finite automata.

Definition Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton and suppose 0 ∉ Q. For each x ∈ Σ*, define a map τ_x from Q ∪ {0} into Q ∪ {0}, where:

i) If x = x′a for some a ∈ Σ and x′qa ⊢* x′aq′, then τ_x(q) = q′. Otherwise, τ_x(q) = 0.
ii) If q_0x ⊢* xq′, then τ_x(0) = q′.
iii) Otherwise, τ_x(0) = 0.

In words, condition (i) means that A is started on the rightmost symbol of an input x in state q. If A ultimately leaves the entire input word on the right in a state q′, then τ_x(q) = q′. Otherwise, τ_x(q) = 0. Condition (ii) means that if A is started on the left end of input x in state q_0 and it finally goes off the right end of x in a state q′, then τ_x(0) = q′. Otherwise, τ_x(0) = 0.

In order to analyze the computations, we shall examine computations which "return to the same place."

Definition Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. For each x, z ∈ Σ* and q, q′ ∈ Q, define xqz ⊩ xq′z if there exist n > 1 and x_i, z_i ∈ Σ*, q_i ∈ Q, where 1 ≤ i ≤ n and

x_1q_1z_1 ⊢ x_2q_2z_2 ⊢ ⋯ ⊢ x_nq_nz_n

with:

i) x_iz_i = xz for all i, 1 ≤ i ≤ n,
ii) x_1q_1z_1 = xqz,
iii) x_nq_nz_n = xq′z, and
iv) for all i, 1 < i < n, x_i ≠ x and z_i ≠ z.

Thus, xqz ⊩ xq′z describes the shortest non-null computation which starts with the input head on the first square of z and returns to that position. We now begin to relate the τ functions to computations.

Lemma 2.6.1 Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. For all x, y ∈ Σ*, if τ_x = τ_y, then, for any z ∈ Σ* and q, q′ ∈ Q,

xqz ⊩ xq′z if and only if yqz ⊩ yq′z.

Proof Figure 2.10 illustrates computations on xz and yz if we assume that τ_x = τ_y. Note that xq_2z ⊩ xq_3z and yq_2z ⊩ yq_3z, even though the computations are different.

FIG. 2.10 [The traces of a computation on xz and on yz if τ_x = τ_y.]
If x = Λ, then τ_x(0) = q_0 and τ_x(q″) = 0 for all q″ ∈ Q, by the definition of τ_x. Assume that y ≠ Λ; that is, y = y′a for some a ∈ Σ. Since τ_x = τ_y, we know that τ_y(0) = q_0, and q_0y ⊢* yq_0 implies q_0y′a ⊢* y′aq_0, which implies

q_0y′a ⊢* y′q″a ⊢* y′aq_0 for some q″ ∈ Q.

But then

τ_y(q″) = q_0 ≠ 0 = τ_x(q″),

which is a contradiction. Thus, y = Λ. Combining the previous argument with symmetry, we see that the following is true:

Fact x = Λ if and only if y = Λ.

By the symmetry between x and y, it suffices to prove the result in one direction. Suppose xqz ⊩ xq′z. Then, by definition, there is a computation

x_1q_1z_1 ⊢ x_2q_2z_2 ⊢ ⋯ ⊢ x_nq_nz_n (2.6.1)

satisfying conditions (i) through (iv). At the first step of the computation, we move left or we do not. From Eq. (2.6.1) and this observation, there are two disjoint cases:

i) All the x_i are prefixes of x, at least one of which is proper (the first move went left);
ii) all the z_i are suffixes of z (the first move had d ≥ 0).

CASE i. Assume that all the x_i are prefixes of x and at least one of them is proper. Then x, y ≠ Λ, so that x = x′a and y = y′b for some a, b ∈ Σ. Using the assumption of Case (i) and (2.6.1), we have:

x′aq_1z ⊢ x′q_2az (2.6.2)

and

x′q_2a ⊢* x′aq_n = xq_n. (2.6.3)

From (2.6.2), it clearly follows that:

y′bq_1z ⊢ y′q_2bz, (2.6.4)

since A is deterministic and the first move depends only on q_1 and the first symbol of z. Since τ_x(q_2) = τ_y(q_2), it follows from (2.6.3) that:

y′q_2b ⊢* y′bq_n = yq_n = yq′. (2.6.5)

Combining (2.6.4) and (2.6.5), we have:

yqz ⊩ yq′z,

and Case (i) is proven.
CASE ii. All the z_i are suffixes of z. Then the head never enters x, so the computation (2.6.1) does not depend on x at all, and it remains valid when x is replaced by y:

yq_1z ⊢ ⋯ ⊢ yq_nz;

that is,

yqz ⊩ yq′z. □

In our next result, we extend the computations beyond merely ⊩:

Lemma 2.6.2 Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. For each x, y ∈ Σ*, if τ_x = τ_y, then, for each z ∈ Σ* and q ∈ Q,

q_0xz ⊢* xqz if and only if q_0yz ⊢* yqz.

Proof By the symmetry between x and y, it suffices to prove one direction. Suppose q_0xz ⊢* xqz. Then either

q_0x ⊢* xq (the computation never uses the interior of z) (2.6.6)

or

q_0x ⊢* xq_1 and xq_1z ⊩ xq_2z ⊩ ⋯ ⊩ xq_nz (2.6.7)

for some n > 1 and some q_i ∈ Q, with q_n = q. To see that these are the only two cases, examine the original computation sequence; that is,

q_0xz ⊢ x′_1q′_1z′_1 ⊢ x′_2q′_2z′_2 ⊢ ⋯ ⊢ x′_mq′_mz′_m ⊢ xqz. (2.6.8)

If no ID x′_iq′_iz′_i occurs with x′_i = x and z′_i = z, then (2.6.6) holds. Otherwise, we may consider only the IDs in (2.6.8) with x′_i = x and z′_i = z and connect these by ⊩ to get (2.6.7). Now we analyze the two cases.

CASE 1. Assume (2.6.6) holds. Then τ_y(0) = τ_x(0) = q, so that q_0y ⊢* yq, which implies that

q_0yz ⊢* yqz.

CASE 2. Assume that (2.6.7) holds. By an argument identical to Case 1,

q_0yz ⊢* yq_1z. (2.6.9)

By use of Lemma 2.6.1 applied to (2.6.7), we have:

yq_1z ⊩ yq_2z ⊩ ⋯ ⊩ yq_nz = yqz. (2.6.10)

Combining (2.6.9) and (2.6.10) produces:

q_0yz ⊢* yqz,

and the proof is complete. □
Now we are ready to relate a two-way finite automaton A to the induced right congruence relation R_{T(A)} of Problem 9 of Section 2.2.

Lemma 2.6.3 Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. If τ_x = τ_y, where x, y ∈ Σ*, then (x, y) ∈ R_{T(A)}, where R_{T(A)} is the right congruence relation induced by T(A).

Proof Assume τ_x = τ_y. Then, for any z ∈ Σ*,

xz ∈ T(A) ⟺ q_0xz ⊢* xzq_f for some q_f ∈ F,
         ⟺ q_0xz ⊢* xqz and qz ⊢* zq_f for some q ∈ Q, q_f ∈ F,
         ⟺ q_0yz ⊢* yqz and qz ⊢* zq_f for some q ∈ Q, q_f ∈ F (by Lemma 2.6.2),
         ⟺ q_0yz ⊢* yzq_f for some q_f ∈ F,
         ⟺ yz ∈ T(A).

Thus, (x, y) ∈ R_{T(A)}. □

Now we present the main theorem of this section.

Theorem 2.6.1 For each two-way finite automaton A = (Q, Σ, δ, q_0, F), T(A) is a regular set.

Proof Since τ_x = τ_y implies (x, y) ∈ R_{T(A)}, we have

rk(R_{T(A)}) ≤ |{τ_x | x ∈ Σ*}|.

If |Q| = n, then the number of different τ_x functions is at most (n + 1)^{n+1}. Thus R_{T(A)} has finite rank, and T(A) is regular. □

Remark 1. The family of sets accepted by two-way finite automata with endmarkers is the same as the family accepted without endmarkers.

Proof ¢X$ is regular if and only if X is regular. The "if" part is obvious, and the "only if" part follows from closure under homomorphism. □

Remark 2. Let A = (Q, Σ, δ, q_0, F) be a two-way finite automaton. Consider the sets

R(A) = {x ∈ Σ* | q_0x ⊢* xq for some q ∈ Q − F}

and

L(A) = {x ∈ Σ* | q_0x ⊢* xq for no q ∈ Q}.

R(A) is the set of tapes rejected by the two-way finite automaton, in that the device leaves the right end of the input in a nonfinal state. L(A) is the set of tapes that cause A never to leave the input tape. Since Σ* = T(A) ∪ R(A) ∪ L(A) and the sets are disjoint, we have L(A) = Σ* − (T(A) ∪ R(A)). Therefore, both R(A) and L(A) are regular sets.

Remark 3. If A is a nondeterministic two-way finite automaton, then T(A) is regular and, moreover, Remarks 1 and 2 hold for A.
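The finiteness argument in Theorem 2.6.1 can be made constructive: the (at most (n + 1)^{n+1}) functions τ_x can themselves serve as the states of an ordinary one-way automaton, with x accepted exactly when τ_x(0) ∈ F. The sketch below is ours, not the book's (Python, with 0 represented by None and δ as a dictionary). The function extend_tau computes τ_{xa} from τ_x, resolving each excursion into x through τ_x itself and detecting loops on the last square.

def extend_tau(tau, a, delta, Q):
    """Compute tau_{xa} from tau_x.  tau maps Q ∪ {0} to Q ∪ {0},
    with 0 encoded as None; delta maps (state, symbol) to (d, state)."""
    def from_last_square(q):
        # Head on the last symbol a in state q; run until the word is
        # left on the right (return that state) or the computation dies.
        seen = set()
        while q is not None and q not in seen:
            seen.add(q)
            d, q2 = delta[(q, a)]
            if d == 1:
                return q2                      # left xa on the right
            q = q2 if d == 0 else tau[q2]      # d == -1: tau_x resolves the trip into x
        return None                            # loop detected, or tau_x gave 0
    new = {q: from_last_square(q) for q in Q}
    q = tau[None]                              # tau_{xa}(0): cross x, then continue from a
    new[None] = from_last_square(q) if q is not None else None
    return new

def two_way_to_one_way(Q, Sigma, delta, q0, F):
    """Build a one-way DFA whose states are the reachable tau functions;
    x is accepted iff tau_x(0) ∈ F (cf. Lemma 2.6.3 and Theorem 2.6.1)."""
    freeze = lambda t: tuple(sorted(t.items(), key=repr))
    start = {q: None for q in Q}
    start[None] = q0                           # tau for the empty word
    states, trans, work = {freeze(start)}, {}, [start]
    while work:
        t = work.pop()
        for a in Sigma:
            t2 = extend_tau(t, a, delta, Q)
            trans[(freeze(t), a)] = freeze(t2)
            if freeze(t2) not in states:
                states.add(freeze(t2))
                work.append(t2)
    finals = {s for s in states if dict(s)[None] in F}
    return states, trans, freeze(start), finals

The number of states of the constructed automaton is bounded by (n + 1)^{n+1}, matching the counting bound in the proof; Problem 2 below shows that an exponential blowup cannot always be avoided.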
PROBLEMS

1 Let us change the formal definition of the manner in which a two-way finite automaton A works, so that it may leave the input tape at either end. Define T_L(A) as the set of tapes A accepts by starting at the left end and going off the left end in some final state. Show that T_L(A) is regular. Can you give an algorithm for constructing an ordinary finite automaton for this set?

2 Find a regular set Y_n (n ≥ 1) such that Y_n is accepted by a two-way finite automaton which has a linear number of states, while the reduced connected automaton for Y_n has an exponential number of states.

3 Find a family of sets L_n such that:
i) each L_n is accepted by a cn-state transition system, where c is some constant independent of n; and
ii) for each n ≥ 2, no L_n is accepted by any two-way deterministic finite automaton with fewer than dn²/log n states, where d is a constant independent of n.

2.7 HISTORICAL SURVEY

There are a number of texts that cover finite automata theory in some detail. Salomaa [1969a] is an excellent general text. Eilenberg [1974] gives an elegant mathematical treatment, although his terminology is unique. The paper by Rabin and Scott [1959] is the source for the classical treatment of finite automata and is still worth reading. Exercises 9 and 10 of Section 2.2 are based on the work of Nerode [1958]. The material on the nonrecognizability of the primes follows Hartmanis and Shank [1968]. Transition systems follow an idea of Myhill [1957], and are a variation of the nondeterministic automata of Rabin and Scott [1959]. Kleene's theorem may be found in Kleene [1956]. Our proof uses techniques due to McNaughton and Yamada [1960]. Theorem 2.5.1 may be essentially found in Chomsky and Miller [1958], and Bar-Hillel and Shamir [1960]. Exercise 6 of Section 2.5 is from Rosenberg [1967]. The results of Section 2.6 follow the paper by Shepherdson [1959].
three

Some Basic Properties of Context-Free Languages

3.1 INTRODUCTION

A number of basic results will be established here. It is possible to eliminate variables that can never occur in a sentential form, as well as variables that can never generate a terminal string. A "factorization lemma" is proved for derivations, and this lemma is used to prove a number of simple but useful little theorems. It may be shown that substituting context-free languages into context-free languages gives context-free languages. Another characterization of regular sets is presented that involves self-embedding grammars.

3.2 REDUCED GRAMMARS

It can sometimes happen that grammars contain variables that cannot generate any terminal strings. Consider the following example:

Example

E → E + E | T | F
F → F*E | (T) | a
T → E − T

where Σ = {a, +, *, (, ), −}. The only rule with T on the lefthand side has T also on the righthand side, so it appears that T will never derive a terminal string. Surely such a situation is undesirable, and an algorithm for eliminating such variables would be useful.
Theorem 3.2.1 For each context-free grammar G = (V, Σ, P, S), one can effectively construct a context-free grammar G′ = (V′, Σ, P′, S) such that L(G) = L(G′). Moreover, if L(G) = ∅, then P′ = ∅; and if L(G) ≠ ∅, then for each A ∈ N′ there exists x ∈ Σ* such that A ⇒*_{G′} x.

Proof The first part of the proof consists of a construction that determines the set of variables which derive terminal strings. The proof is a paradigm for future proofs and is given in (tedious) detail. Define

W_1 = {A ∈ N | A → x is in P for some x ∈ Σ*},
W_{k+1} = W_k ∪ {A ∈ N | A → α is in P for some α ∈ (Σ ∪ W_k)*}.

Intuitively, W_k contains those variables that will derive some string of terminals in a derivation tree of height less than or equal to k.

i) W_k ⊆ W_{k+1}, by construction.

ii) If there exists an i such that W_i = W_{i+1}, then W_i = W_{i+m} for all m ≥ 0.

Proof By induction on m.

Basis. W_i = W_{i+0}.

Induction step. Suppose W_i = W_{i+m}. Then

A ∈ W_{i+m+1} ⟺ A ∈ W_{i+m} or A → α is in P for some α ∈ (Σ ∪ W_{i+m})*
           ⟺ A ∈ W_i or A → α is in P for some α ∈ (Σ ∪ W_i)*
           ⟺ A ∈ W_{i+1}
           ⟺ A ∈ W_i.

This completes the proof of (ii). Note that the above proof would generalize for any sequence of sets defined in this manner and for any predicate that depends on its predecessors in a similar fashion.

iii) W_n = W_{n+1}, where n = |N|.

Proof From (i) we have W_1 ⊆ W_2 ⊆ W_3 ⊆ ⋯ ⊆ W_n ⊆ W_{n+1} ⊆ ⋯. But, for all i, W_i ⊆ N implies |W_i| ≤ |N|. Therefore, by a trivial counting argument, W_n = W_{n+1}.

iv) A ∈ W_i if and only if A ⇒* x, where x ∈ Σ*, by a derivation tree of height ≤ i.
Proof The argument is an induction on i.

Basis. i = 1. The result is immediate.

Induction step. Assume the result for i. Then, by the construction, A ∈ W_{i+1} if and only if A ∈ W_i or A → α is in P for some α ∈ (Σ ∪ W_i)*. The induction hypothesis takes care of the first possibility. In the other case, let

α = u_0C_1u_1 ⋯ u_{k−1}C_ku_k,

where k ≥ 0, C_j ∈ W_i, and u_j ∈ Σ* for all j. By the induction hypothesis, there exist z_j ∈ Σ* such that C_j ⇒* z_j in a tree of height ≤ i. If one notes that the tree in Fig. 3.1 is of height ≤ i + 1, then we can conclude the following: A ∈ W_{i+1} if and only if A ⇒* x for some x ∈ Σ* by a tree of height ≤ i + 1. This completes the proof of (iv).

v) W_n = {A | A ⇒* x for some x ∈ Σ*}, where n = |N|.

Proof If A ∈ W_n, then A is in the righthand side of (v). Conversely, suppose A ⇒* x. Let i be the height of a derivation tree of x from A. If i ≤ n, then A ∈ W_i ⊆ W_n. If i > n, write i = n + m. Then, by (ii) and (iii), W_i = W_n, and the proof of (v) is complete.

Now we can finish the proof by constructing G′. If L(G) ≠ ∅, define

G′ = (V′, Σ, P′, S),
V′ = W_n ∪ Σ ∪ {S},
P′ = {A → α | α ∈ (Σ ∪ W_n)*, and A → α is in P}.

FIG. 3.1 [A derivation tree of height ≤ i + 1: A at the root with children u_0C_1u_1 ⋯ C_ku_k, and a tree of height ≤ i deriving z_j from each C_j.]
In the extreme case where L(G) = ∅, we will have S ∉ W_n. By convention, G′ will not have any rules; that is, P′ = ∅. If L(G) ≠ ∅, then, for any A ∈ N′, there is x ∈ Σ* so that A ⇒*_{G′} x. This follows from the construction and (v).

All that remains for us to do is to show that L(G) = L(G′). Since P′ ⊆ P, every G′ derivation is a G derivation, and L(G′) ⊆ L(G). Conversely, suppose w ∈ L(G). Then S ⇒* w and w ∈ Σ*. For every variable A_j that appears in the derivation tree, A_j ⇒* u_j, u_j ∈ Σ*. Thus A_j ∈ W_n by (v), and if the first production used at A_j was π_j = A_j → α_j, it must be the case that α_j ∈ (Σ ∪ W_n)*. Therefore each such π_j ∈ P′, and hence we have a G′ derivation of w. Thus L(G) ⊆ L(G′), and therefore

L(G) = L(G′). □

Example Consider the grammar

E → E + E | T | F
F → F*E | (T) | a
T → E − T

The algorithm works in a bottom-up fashion:

W_1 = {F}, since F → a is a production;
W_2 = {F, E}, since E → F is a production;
W_3 = W_2.

The new grammar has rules:

E → E + E | F
F → F*E | a

There is another problem to be considered in examining the structure of a grammar. The following example illustrates the problem.

Example

S → aBa
B → Sb | bCC
C → abb
D → aC

If S is the start symbol, then D cannot occur in a sentential form; i.e., it is not "reachable" from S. We begin our consideration of this phenomenon by defining reachability.
Definition Let G = (V, Σ, P, S) be a context-free grammar, and let A ∈ N and B ∈ V. B is said to be reachable from A if A ⇒* αBβ for some α, β ∈ V*.

A construction is now given to eliminate unreachable symbols.

Theorem 3.2.2 For each context-free grammar G = (V, Σ, P, S), one can effectively construct G′ = (V′, Σ′, P′, S) such that L(G′) = L(G) and each symbol in V′ is reachable from S.

Proof For each A ∈ N, define

W_1(A) = {A},
W_{k+1}(A) = W_k(A) ∪ {C ∈ V | there exist α, β ∈ V* and B ∈ W_k(A) so that B → αCβ is in P}.

Intuitively, W_k(A) contains all symbols reachable from A by a derivation of fewer than k steps. The W_k(A) have the following properties:

i) W_k(A) ⊆ W_{k+1}(A);
ii) if W_k(A) = W_{k+1}(A), then, for all m ≥ 0, W_k(A) = W_{k+m}(A);
iii) W_n(A) = W_{n+1}(A), where n = |N|;
iv) W_n(A) = {B ∈ V | A ⇒* αBβ, α, β ∈ V*}.

The proof of these facts completely parallels that used in Theorem 3.2.1 and is omitted. To obtain G′, we use W_n(S) in the following way. Let

G′ = (V′, Σ′, P′, S),
V′ = W_n(S), Σ′ = V′ ∩ Σ,
P′ = {A → α in P | A ∈ W_n(S)}.

Every symbol in V′ is in W_n(S) and, by (iv), is reachable from S. To complete the proof, it must only be checked that G and G′ are equivalent. Since P′ ⊆ P, any G′ derivation is a G derivation, and so L(G′) ⊆ L(G). Conversely, if w ∈ L(G), then

S ⇒* w. (3.2.1)

Every symbol in each step is in W_n(S), and so each production used in (3.2.1) is in P′. Thus, (3.2.1) may be written S ⇒*_{G′} w, and L(G) ⊆ L(G′), which completes the proof. □

Example Let us return to the previous example and consider G, where P consists of the following productions:
S → aBa
B → Sb | bCC
C → abb
D → aC

Let us compute W_4(S):

W_1(S) = {S},
W_2(S) = {S, B, a},
W_3(S) = {S, B, C, a, b} = W_4(S).

Thus D cannot be reached from S, and the construction produces G′, where:

Σ′ = {a, b},
N′ = {S, B, C},
S → aBa
B → Sb | bCC
C → abb

In our discussion about reachability, our examples stressed the case of unreachable variables. Actually, the discussion is equally relevant to terminal symbols that are not reachable from S. The reader should now recheck the previous discussion to see how that case was accommodated.

We may combine the ideas of Theorems 3.2.1 and 3.2.2 to yield the following concept of a reduced grammar:

Definition A context-free grammar G = (V, Σ, P, S) is reduced if P = ∅ or, for every A ∈ V,

S ⇒* αAβ ⇒* w for some α, β ∈ V*, w ∈ Σ*.

The main result is that every language has a reduced grammar.

Theorem 3.2.3 For each context-free grammar G = (V, Σ, P, S), one can effectively construct a context-free grammar G′ such that L(G′) = L(G) and G′ is reduced.

Proof Given a grammar G, we may use Theorem 3.2.1, and we may assume that every variable (except possibly S) produces a terminal string. Next we apply the construction of Theorem 3.2.2. It is easily checked that this transformation does not undo the effects of the first transformation, since we are just deleting rules. Now the resulting grammar is equivalent to G and is reduced. There is only one case that has not been explained. Suppose that the original grammar G generated the empty set. Then all variables except S would be eliminated by these transformations, and all S-productions would be eliminated by the first transformation. G′ would end up with P′ = ∅, and that is desirable, since ∅ is a context-free language. □

By an earlier remark, a reduced grammar with P ≠ ∅ has no unused symbols, whether they are terminals or variables.
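Both constructions are short fixpoint computations, and they compose into a reduction procedure. The following sketch is ours (Python; a grammar is represented by a set of variables N, terminals Sigma, productions P as (A, rhs) pairs with rhs a sequence of symbols, and start symbol S):

def generating(N, Sigma, P):
    """W_n of Theorem 3.2.1: variables that derive a terminal string."""
    W, changed = set(), True
    while changed:
        changed = False
        for A, rhs in P:
            if A not in W and all(s in Sigma or s in W for s in rhs):
                W.add(A)
                changed = True
    return W

def reachable(P, S):
    """W_n(S) of Theorem 3.2.2: symbols reachable from S."""
    R, work = {S}, [S]
    while work:
        B = work.pop()
        for A, rhs in P:
            if A == B:
                for s in rhs:
                    if s not in R:
                        R.add(s)
                        work.append(s)
    return R

def reduce_grammar(N, Sigma, P, S):
    """Theorem 3.2.3: drop non-generating variables first, then
    unreachable symbols (the order matters)."""
    W = generating(N, Sigma, P)
    P1 = [(A, rhs) for A, rhs in P
          if A in W and all(s in Sigma or s in W for s in rhs)]
    R = reachable(P1, S)
    return [(A, rhs) for A, rhs in P1 if A in R]

On the example above, generating returns {S, B, C, D}, the D-rule then disappears in the reachability pass, and the result is exactly the production set displayed for G′.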
PROBLEMS

1 Let us reconsider the proof of Theorem 3.2.1. Define a sequence of sets

W′_0 = ∅,
W′_{k+1} = {A ∈ N | A → α is in P for some α ∈ (Σ ∪ W′_k)*}.

Relate the sets W′_k to the W_k defined in the original proof.

2 Construct a reduced grammar equivalent to the grammar shown below. Let G be

S → aBa
B → Sb | bCC | DaB
C → abb | DD
E → aC

3 Construct a reduced grammar for the following grammar G, where G = (V, Σ, P, S) and P is:

S → aSbA
A → bBSa | bBaB
B → ASaA | C

4 Can you reformulate the construction given in Theorem 3.2.2 as a transitive closure problem?

5 Give an algorithm for deciding whether an arbitrary context-free grammar generates the empty set.

6 Let G = (V, Σ, P, S) be a context-free grammar and let α, β ∈ V*. Give an algorithm for deciding whether α ⇒* β. Note that when α = S and β = w ∈ Σ*, this gives an algorithm for checking whether w ∈ L(G), and shows that context-free languages are recursive sets.

7 Show that a reduced grammar for the context-free language {Λ} cannot have any rule of the form A → αaβ with a ∈ Σ, α, β ∈ V*.

3.3 A TECHNICAL LEMMA AND SOME APPLICATIONS TO CLOSURE PROPERTIES

There is a technical lemma which is very useful in writing proofs in context-free language theory. We will prove this easy lemma now, and use it in showing that the context-free languages are closed under transpose.

Lemma 3.3.1 Let G = (V, Σ, P, S) be a context-free grammar and let α, β ∈ V*. If α ⇒ʳ β for some r ≥ 0 and if α = α_1 ⋯ α_n for some n ≥ 1, α_i ∈ V* for
1 ≤ i ≤ n, then there exist t_i ≥ 0 and β_i ∈ V* for 1 ≤ i ≤ n such that β = β_1 ⋯ β_n, α_i ⇒^{t_i} β_i, and

∑_{i=1}^{n} t_i = r.

Proof We induct on r, the length of the derivation.

Basis. r = 0. Let α = α_1 ⋯ α_n ⇒⁰ β. Then α = β, so let β_i = α_i and t_i = 0. Hence α_i ⇒^{t_i} β_i and

∑_{i=1}^{n} t_i = 0 = r.

Induction step. Our induction hypothesis is that if α = α_1 ⋯ α_n ⇒ʳ β, then there exist t_i ≥ 0, β_i ∈ V* such that β = β_1 ⋯ β_n, α_i ⇒^{t_i} β_i, and ∑_{i=1}^{n} t_i = r. Now consider

α = α_1 ⋯ α_n ⇒^{r+1} β.

Then there exists γ ∈ V* such that α = α_1 ⋯ α_n ⇒ γ ⇒ʳ β. Because G is a context-free grammar, the rule used in α ⇒ γ was of the form A → ξ. Let α_k, 1 ≤ k ≤ n, contain the A that was rewritten in α ⇒ γ. Then α_k = α′Aβ′ for some α′, β′ ∈ V*. Let

γ_i = α_i if i ≠ k,
γ_k = α′ξβ′.

Then α_i ⇒⁰ γ_i if i ≠ k, and α_k ⇒ γ_k. Thus

α = α_1 ⋯ α_n ⇒ γ_1 ⋯ γ_n ⇒ʳ β.

By the induction hypothesis, there exist t_i, β_i so that β = β_1 ⋯ β_n, γ_i ⇒^{t_i} β_i, and ∑_{i=1}^{n} t_i = r. Combining the derivations, we find

α_i ⇒⁰ γ_i ⇒^{t_i} β_i for i ≠ k, and α_k ⇒ γ_k ⇒^{t_k} β_k.

Let

t′_i = t_i if i ≠ k,
t′_k = t_k + 1.

Thus we know t′_i ≥ 0 and β_i ∈ V*, so that β = β_1 ⋯ β_n and α_i ⇒^{t′_i} β_i. We compute ∑_{i=1}^{n} t′_i = 1 + ∑_{i=1}^{n} t_i = r + 1. □

Next we turn to proving a closure property of context-free languages.

Theorem 3.3.1 If L is a context-free language, then L^T = {x^T | x ∈ L} is a context-free language.
Proof Let L = L(G), where G = (V, Σ, P, S) is a context-free grammar. Define G^T = (V, Σ, P^T, S), where

P^T = {A → α^T | A → α is in P}.

Claim For each A ∈ N, A ⇒*_G α, α ∈ V*, if and only if A ⇒*_{G^T} α^T.

Proof of claim We will show by induction on the length r of a derivation in G that if A ⇒ʳ_G α, α ∈ V*, then A ⇒*_{G^T} α^T. The reverse direction follows from the observation that we could replace G by G^T, and vice versa, in the proof. Note that (G^T)^T = G.

Basis. For any A ∈ N, A ⇒⁰_G A implies A ⇒⁰_{G^T} A.

Induction step. Assume that, for all A ∈ N, if A ⇒ᵗ_G α, α ∈ V*, and t ≤ r, then A ⇒*_{G^T} α^T. If A ⇒^{r+1}_G β, then A ⇒_G α ⇒ʳ_G β. Let

α = u_1A_1u_2A_2 ⋯ u_nA_nu_{n+1}, u_i ∈ Σ*, A_i ∈ N.

Now, since A → α is in P implies A → α^T is in P^T,

A ⇒_{G^T} α^T = u_{n+1}^T A_n u_n^T A_{n−1} ⋯ u_2^T A_1 u_1^T.

But u_1A_1u_2A_2 ⋯ u_nA_nu_{n+1} ⇒ʳ β. By the preceding lemma, there exist γ_i ∈ V* and t_i ≥ 0 so that A_i ⇒^{t_i} γ_i,

β = u_1γ_1u_2γ_2 ⋯ γ_nu_{n+1},

and ∑_{i=1}^{n} t_i = r. Thus each t_i ≤ r, and the induction hypothesis applies to A_i ⇒^{t_i} γ_i, so that A_i ⇒*_{G^T} γ_i^T. Therefore

A ⇒_{G^T} u_{n+1}^T A_n ⋯ A_1 u_1^T ⇒*_{G^T} u_{n+1}^T γ_n^T ⋯ γ_1^T u_1^T = β^T,

which extends the induction and proves the claim.

In particular, for w ∈ Σ*, S ⇒*_G w if and only if S ⇒*_{G^T} w^T. Therefore

(L(G))^T = L(G^T). □

Corollary L is a regular set if and only if L is a left linear language.

Proof This is an immediate consequence of the following observations.

a) L is regular if and only if L is a right linear language. Cf. Theorem 2.5.1.
b) If G is right linear, then G^T is left linear.
c) If L is regular, then L^T is regular. Cf. Theorem 2.3.2. □

PROBLEMS

1 Show that the language L = {w ∈ {a, b}* | #_a(w) = #_b(w)} is an unambiguous context-free language.
2 Is L, given in Problem 1, also linear and unambiguous?

3 Show that Lemma 3.3.1 becomes false for transitive closure (⇒⁺) rather than ⇒*. That is, if we assume r ≥ 1 and require t_i ≥ 1 for each 1 ≤ i ≤ n, then the result is no longer true.

3.4 THE SUBSTITUTION THEOREM

Intuitively we know that if a context-free grammar G generates {aⁿbⁿ}, then {0ⁿ1ⁿ} is also context-free, because we could go through the grammar and replace a by 0 and b by 1. This would suggest that if we replaced a and b by any strings x, y, and possibly by any two "reasonable" sets of strings, we would still have a context-free language. In this section we will treat this situation in its full generality.

Definition Let Σ be a fixed alphabet. For any a ∈ Σ, let Σ_a be an alphabet and let φ(a) ⊆ Σ_a* be a set. Such a mapping φ is a substitution mapping if φ(Λ) = {Λ} and if, for all a_i ∈ Σ, 1 ≤ i ≤ k,

φ(a_1 ⋯ a_k) = φ(a_1) ⋯ φ(a_k).

Note. (i) φ is a set-valued function. (ii) If each φ(a) is a singleton, then φ reduces to the usual definition of a semigroup homomorphism.

We extend φ to sets in an obvious way, namely, if L ⊆ Σ*,

φ(L) = ⋃_{x ∈ L} φ(x).

Let us work an example in order to understand this operation. Let

L = {aⁿbcⁿ | n ≥ 1}.

Define φ(a) = L_1, φ(b) = L_2, and φ(c) = L_3, where L_1, L_2, and L_3 are any three fixed context-free languages. Then

φ(L) = ⋃_{n ≥ 1} L_1ⁿ L_2 L_3ⁿ.

Now, we shall give the main theorem of this section.

Theorem 3.4.1 If L is a context-free language and φ is a substitution map such that, for every a ∈ Σ, φ(a) is a context-free language, then φ(L) is a context-free language.

Proof Let L = L(G_L), G_L = (V_L, Σ_L, P_L, S), and, for every a ∈ Σ_L, since L_a = φ(a) is a context-free language, let L_a = L(G_a), G_a = (V_a, Σ_a, P_a, S_a).
By appropriate renaming of variables we can ensure that, for each a, b ∈ Σ_L, a ≠ b, N_a ∩ V_b = ∅, N_b ∩ V_L = ∅, and Σ_a ∩ N_L = ∅. Therefore we can assume, without loss of generality, that these conditions hold. Now define a mapping

τ: V_L → N_L ∪ ⋃_{a ∈ Σ_L} {S_a}

by

τ(α) = α if α ∈ N_L,
τ(a) = S_a if a ∈ Σ_L.

Next extend τ to strings by letting τ(Λ) = Λ and, if x ∈ V_L*, x = x_1 ⋯ x_n, x_i ∈ V_L, then τ(x) = τ(x_1)τ(x_2) ⋯ τ(x_n). Now let

G_{L′} = (V_{L′}, Σ_{L′}, P_{L′}, S),
Σ_{L′} = ⋃_{a ∈ Σ_L} {S_a},
V_{L′} = N_L ∪ Σ_{L′},
P_{L′} = {A → τ(α) | A → α is in P_L}.

Claim 1 For each A ∈ N_L, A ⇒*_{G_L} α if and only if A ⇒*_{G_{L′}} τ(α).

Proof This is intuitively obvious, since τ corresponds to a simple renaming of Σ_L. A formal proof by induction on the length of the derivation, using Lemma 3.3.1, is straightforward and is left as an exercise.

In particular, we have S ⇒*_{G_L} a_1 ⋯ a_n ∈ L(G_L) if and only if S ⇒*_{G_{L′}} S_{a_1} ⋯ S_{a_n}, a_i ∈ Σ_L, 1 ≤ i ≤ n.

Now let

G = (V, Σ, P, S),
V = V_{L′} ∪ ⋃_{a ∈ Σ_L} V_a,
Σ = ⋃_{a ∈ Σ_L} Σ_a,
P = P_{L′} ∪ ⋃_{a ∈ Σ_L} P_a.

Claim 2 S_{a_1} ⋯ S_{a_n} ⇒* w ∈ L(G) ⊆ Σ* if and only if there exist w_i (1 ≤ i ≤ n) such that S_{a_i} ⇒* w_i ∈ L(G_{a_i}) and w = w_1 ⋯ w_n.
Proof This is a direct consequence of Lemma 3.3.1 and the fact that, since all the grammars are disjoint, S_{a_i} ⇒*_G w_i if and only if S_{a_i} ⇒*_{G_{a_i}} w_i and, for each a ∈ Σ_L, N_a ∩ Σ = ∅.

Claim 3 S ⇒* w, w ∈ L(G), if and only if there exists a derivation of the form

S ⇒*_{G_{L′}} S_{a_1} ⋯ S_{a_n} ⇒* w.

Proof Since P_{L′} ⊆ P, if S ⇒*_{G_{L′}} S_{a_1} ⋯ S_{a_n} ⇒* w, then S ⇒*_G w. Conversely, let S ⇒*_G w. Since all the grammars involved are disjoint, rules from different grammars commute. Therefore we may "rearrange" the derivation S ⇒* w in such a way that all rules from P_{L′} precede those from the P_{a_i}. Thus there exists a derivation of the form

S ⇒*_{G_{L′}} α ⇒* w

such that α ⇒* w uses no rule from P_{L′}. Since V_{L′} ∩ Σ = ∅ and w ∈ Σ*, all of α must be rewritten in the derivation α ⇒* w. Now, since this derivation contains no rules from P_{L′}, α must be of the form

α = S_{a_1}S_{a_2} ⋯ S_{a_n},

which proves the claim.

Combining these three facts, we have: S ⇒* w ∈ L(G) if and only if S ⇒*_{G_{L′}} S_{a_1}S_{a_2} ⋯ S_{a_n} ⇒* w, if and only if there exist w_i such that S_{a_i} ⇒* w_i ∈ L(G_{a_i}) and S ⇒*_{G_L} a_1 ⋯ a_n ∈ L(G_L) (by Claims 1 and 2), if and only if w ∈ φ(L). Therefore φ(L) = L(G), which proves the theorem. □

Corollary 1 If L_1 and L_2 are context-free languages, then

i) L_1 ∪ L_2,
ii) L_1L_2,
iii) L_1*

are context-free languages.

Proof (i) Let L = {a, b}, which is context-free. (Any finite set is context-free.) Let φ(a) = L_1, φ(b) = L_2. Then φ(L) = L_1 ∪ L_2.
ii) Let L = {ab} and φ be as in (i). Then φ(L) = L_1L_2.

iii) Let L = {a}*. Since L is regular, it is context-free. Again let φ(a) = L_1. Then φ(L) = L_1*. □

Corollary 2 If L is a context-free language and φ is any homomorphism, then φ(L) is a context-free language.

We already know that every regular set is context-free, by the characterization theorem for right linear languages (Theorem 2.5.1). The substitution theorem gives us another proof.

Corollary 3 Every regular set is context-free.

Proof By Kleene's theorem (Theorem 2.4.1), the family of regular sets is the least collection which contains ∅ and {a}, a ∈ Σ, and is closed under finite applications of ∪, ·, and *. Since ∅ is context-free, and so is {a} for any a ∈ Σ, we know that every regular set is context-free, from Corollary 1. □

Now we can also show that the substitution theorem holds for the family of regular sets.

Theorem 3.4.2 If L is a regular set and φ is a substitution map such that, for every a ∈ Σ, φ(a) is regular, then φ(L) is regular.

Proof Using Kleene's theorem, we know that every regular set can be obtained from ∅ and {a}, a ∈ Σ, by a finite number of applications of union, product, and star. Since φ maps each singleton into a set that can be obtained by a finite number of applications of union, product, and Kleene closure, replacing each {a} in the regular expression for L by φ(a) still gives only a finite number of applications of union, product, and closure. Hence, φ(L) is regular. □

3.5 REGULAR SETS AND SELF-EMBEDDING GRAMMARS

Another grammatical characterization of regular sets can be given.

Definition A context-free grammar G = (V, Σ, P, S) is self-embedding if there exists A ∈ N such that

A ⇒⁺ αAβ for some α, β ∈ V⁺.

A context-free language L is self-embedding if every context-free grammar for L is self-embedding.

This definition is equivalent to: A context-free language L is not self-embedding if and only if there exists a context-free grammar G for L that is not self-embedding.
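Whether a given grammar is self-embedding is decidable: it is another W_k-style closure computation, this time over pairs of variables. The sketch below is ours (Python; P is a list of (A, rhs) pairs with rhs a sequence of symbols). For each pair it records the achievable "side flags" of a derivation A ⇒⁺ αBβ, namely whether α and whether β can be made nonempty.

def is_self_embedding(N, P):
    # (A, B, l, r) in reach: A =>+ alpha B beta is possible with
    # alpha nonempty iff l and beta nonempty iff r.
    reach = set()
    for A, rhs in P:
        for i, B in enumerate(rhs):
            if B in N:
                reach.add((A, B, i > 0, i < len(rhs) - 1))
    changed = True
    while changed:
        changed = False
        for (A, B, l1, r1) in list(reach):
            for (B2, C, l2, r2) in list(reach):
                t = (A, C, l1 or l2, r1 or r2)
                if B2 == B and t not in reach:
                    reach.add(t)
                    changed = True
    return any(A == B and l and r for (A, B, l, r) in reach)

Note that the test applies to a grammar, not to a language: for S → aSa | a | Λ it correctly reports self-embedding, even though, as the example below shows, the generated language a* is regular.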
If a variable is self-embedding, then

A ⇒⁺ αAβ (3.5.1)

and (3.5.1) may be iterated to yield

A ⇒⁺ αⁱAβⁱ for all i ≥ 0. (3.5.2)

If one further assumes a reduced grammar, then strings with coordinated substrings, as in v_2ⁱv_3v_4ⁱ, are produced. Sets that contain such strings give the appearance of nonregular sets. This provides some insight for the belief that self-embedding grammars may generate nonregular sets. That is not true in general, as the following example shows:

S → aSa | a | Λ

This grammar is self-embedding, yet it generates L(G) = a*. Our intuition is not completely wrong, however, as the following theorem shows.

Theorem 3.5.1 A set L is regular if and only if L is a context-free language and is not self-embedding.

Proof If L is regular, then L = L(G) for some right linear grammar G. Since every production of G is of one of the forms

A → uB, A → u,

where A, B ∈ N, u ∈ Σ*, G is not self-embedding.

Assume L = L(G), where G = (V, Σ, P, S) is a non-self-embedding, context-free grammar. Further, without loss of generality, we may assume that G is reduced. Consider the following two cases.

CASE 1. For each A ∈ N, there exist α, β in V* such that A ⇒* αSβ. Note that, since S ⇒* S is always true, any grammar with exactly one variable satisfies Case 1. Now every production in P is of at least one of the following forms:

i) A → αBβ
ii) A → αB
iii) A → Bβ
iv) A → B
v) A → u

where A, B ∈ N, α, β ∈ V⁺, and u ∈ Σ*. Rules of type (iv) and type (v) cause no difficulty and will not be considered further.
We now prove a series of facts which show that G is either right linear or left linear.

Fact 1 P contains no rules of type (i).

Proof Assume that P contains a production of type (i). Then A ⇒ αBβ, α, β ∈ V⁺. Now, since we are in Case 1,

B ⇒* α_1Sβ_1 for some α_1, β_1 ∈ V*.

Since G is reduced, there exist α_2, β_2 in V* such that

S ⇒* α_2Aβ_2.

Combining these, we get

A ⇒ αBβ ⇒* αα_1Sβ_1β ⇒* αα_1α_2Aβ_2β_1β.

But α, β ∈ V⁺ implies αα_1α_2, β_2β_1β ∈ V⁺. The situation is depicted by Fig. 3.2. But this is a contradiction, since G is not self-embedding. Hence P contains no productions of type (i).

FIG. 3.2 [The derivation A ⇒⁺ αα_1α_2Aβ_2β_1β, exhibiting self-embedding.]

Fact 2 P does not contain productions of both type (ii) and type (iii).

Proof Assume

A → αB and C → Dβ, α, β ∈ V⁺,

are in P. Since we are in Case 1, there exist α_1, β_1 in V* such that

B ⇒* α_1Sβ_1.

But G is reduced, and hence there exist α_2, β_2 in V* such that

S ⇒* α_2Cβ_2.

Similarly, there exist α_3, α_4, β_3, β_4 in V* so that

D ⇒* α_3Sβ_3 and S ⇒* α_4Aβ_4.
FIG. 3.3 [The combined derivation tree exhibiting self-embedding of A.]

Combining these, we have

A ⇒ αB ⇒* αα_1Sβ_1 ⇒* αα_1α_2Cβ_2β_1 ⇒ αα_1α_2Dββ_2β_1
  ⇒* αα_1α_2α_3Sβ_3ββ_2β_1 ⇒* αα_1α_2α_3α_4Aβ_4β_3ββ_2β_1.

But again α, β ∈ V⁺ implies αα_1α_2α_3α_4 and β_4β_3ββ_2β_1 ∈ V⁺. The tree is shown in Fig. 3.3. This contradicts the fact that G is not self-embedding, and hence G cannot have productions of both type (ii) and type (iii).

Fact 3 If P contains a rule of type (ii) (type (iii)), A → αB (respectively, A → Bα), then α ∈ Σ*.

Proof We prove this only for rules of type (ii), since the proof for rules of type (iii) is completely symmetric. Let A → αB and assume α ∉ Σ*. Then there exist γ_1, γ_2 ∈ V* and C ∈ N such that

α = γ_1Cγ_2.

There are two cases to consider:

a) γ_1 ≠ Λ. Then A → γ_1Cγ_2B is in P;
but γ_1, γ_2B ∈ V⁺, and thus P would have a rule of type (i), which contradicts Fact 1.

b) γ_1 = Λ. Then A → Cγ_2B is in P; but this is a rule of both type (ii) and type (iii), which contradicts Fact 2.

Therefore α ∈ Σ*.

Combining these three facts, we see that G is either a right linear grammar or a left linear grammar. In either case, L is regular, which completes Case 1.

CASE 2. There is some A ∈ N such that for no α, β ∈ V* does

A ⇒* αSβ.

We will show, by induction on the number of variables in G, that L(G) is regular.

Basis. |N| = 1. Any grammar with one variable satisfies Case 1, and the basis is vacuously satisfied.

Induction step. Assume that if |N| = k and there exists A ∈ N such that for no α, β ∈ V* does

A ⇒* αSβ,

then L(G) is regular. Now consider a grammar G in which |N| = k + 1. Consider the following grammars:

G_1 = (V, Σ ∪ {A}, P_1, S),
P_1 = {B → α in P | B ∈ (N − {A})}.

In G_1, we treat A as a terminal. Now we define G_2 to be like G, but S is dropped and A is the new start variable:

G_2 = (V − {S}, Σ, P_2, A),
P_2 = {B → α in P | B ∈ (N − {S}), α ∈ (V − {S})*}.

Claim L(G_1) and L(G_2) are regular.

Proof If G_1 (G_2) satisfies the condition of Case 1, then L(G_1) (respectively, L(G_2)) is regular. Otherwise Case 2 applies. But |N_1| (and also |N_2|) ≤ k, and hence, by the induction hypothesis, L(G_1) (respectively, L(G_2)) is regular.

Now let φ be the substitution map

φ(A) = L(G_2),
φ(a) = a, a ∈ Σ.

Since L(G_1) and L(G_2) are regular, φ(L(G_1)) is regular, and all that must be shown is

φ(L(G_1)) = L(G).

Fact 4 For any α ∈ V*, A ⇒*_G α if and only if A ⇒*_{G_2} α. (Recall that this is the fixed A in the definition of Case 2.)
Proof This is a direct consequence of the following.

a) If α ∈ V* and A ⇒*_G α, then α does not contain S, since we are in Case 2.
b) P_2 consists of precisely those rules that do not have an S in them.

Fact 5 x ∈ L(G) if and only if there exists a derivation of the form

S ⇒*_{G_1} u_1Au_2A ⋯ u_nAu_{n+1} ⇒* u_1w_1u_2 ⋯ u_nw_nu_{n+1},

where x = u_1w_1u_2 ⋯ u_nw_nu_{n+1} and A ⇒*_{G_2} w_i.

Proof Clearly, if such a derivation exists, x ∈ L(G). Now suppose x ∈ L(G). Then

x ∈ Σ* and S ⇒*_G x.

Consider a new derivation obtained from this by:

a) First rewrite all variables except A and its descendants, in each case using the same rule as was used in the original derivation.
b) Complete the derivation by using the remaining steps in exactly the same order as they occurred in the original derivation.

Since in (a) no rules with A on the lefthand side are used, they are all in P_1. Therefore,

S ⇒*_{G_1} u_1Au_2 ⋯ Au_{n+1} ⇒* x, u_i ∈ Σ*.

Now by Lemma 3.3.1 there exist w_i in Σ* such that

x = u_1w_1u_2 ⋯ u_nw_nu_{n+1}

and A ⇒*_G w_i or, equivalently, by Fact 4, A ⇒*_{G_2} w_i. This completes the proof of Fact 5.

To complete the proof, we have x ∈ L(G) if and only if there exists a derivation of the form

S ⇒*_{G_1} u_1Au_2 ⋯ u_nAu_{n+1} ⇒* x = u_1w_1u_2 ⋯ u_nw_nu_{n+1}

with A ⇒*_{G_2} w_i, u_i, w_i ∈ Σ*, from Fact 5. This holds if and only if

S ⇒*_{G_1} u_1Au_2 ⋯ u_nAu_{n+1} ∈ L(G_1) and A ⇒*_{G_2} w_i ∈ L(G_2),

using Fact 4. In turn, this holds if and only if x ∈ φ(L(G_1)), by the definition of φ. Thus L(G) = φ(L(G_1)), and therefore L(G) is regular, by the substitution theorem. □
3.6 HISTORICAL SURVEY 91 PROBLEMS 1 Explain why the case structure of the proof of Theorem 3.5.1 is necessary. That is, why not just induct on the number of variables, as in Case 2? 3.6 HISTORICAL SURVEY Almost all of the results given here may be found in the fundamental paper by Bar-Hillel, Perles, and Shamir [1961]. Theorem 3.5.1 is from Chomsky [1959a, 1959b].
four

Normal Forms for Context-Free Grammars

4.1 INTRODUCTION

This chapter is devoted to showing that there are many different types of context-free grammars which can generate all of the context-free languages. Results of this type are of general interest because, if one wishes to show that some set is context-free, it is easier if one can employ the full generality of context-free grammars, so that one can exploit Λ-rules, chain rules, left recursion, etc. On the other hand, if one wishes to show that some property does not hold, it is convenient to be able to restrict the type of grammar severely, since the complexity of a proof can be reduced.

We shall be interested in exhibiting a number of normal forms which generate all the context-free languages. Moreover, we are interested in transformations for putting a given grammar into these forms, as well as analyzing how much time and/or space is required. As one might expect, there are some algorithms that are easy to describe but very inefficient. A number of the normal forms used here are important in practical applications to parsing, while almost all of them will occur later in this book in simplifying some proof or other.

4.2 NORMS, COMPLEXITY ISSUES, AND REDUCTION REVISITED

As we consider various normal forms for grammars, constructions will be given for achieving these transformations. One question of interest is the efficiency of these methods. A full discussion of these matters would take us into the theory
of computational complexity and the differences between Turing machines, random-access machines (RAMs, for short), and bit vector machines. While this is not the appropriate place for such a treatment (cf. Chapter 9), we do wish to mention enough of the ideas to permit the reader to be aware of the issues. In order to discuss these concepts, we need to clarify the notion of the size of a context-free grammar.

Definition Let G = (V, Σ, P, S) be a context-free grammar. Define

|G| = ∑_{A→α in P} lg(Aα),
‖G‖ = |G| · log₂|V|.

|G| is read as the size of G, and ‖G‖ the norm of G. The size of G is simply the number of characters involved in productions. The norm of G is a more realistic measure, because it not only counts the size of the grammar but includes a measure of the size of the alphabet used. The log occurs because one should count the amount of information in V, or the number of bits required to encode V. For example, suppose we have some algorithm that scans the entire production set and checks, for each production A → α in P, whether or not A ∈ N. Should we use |G| or ‖G‖ in evaluating the work done? Naturally the second measure is more realistic, since it accounts for the size of the alphabet, and if N is large, checking to see whether A ∈ N should enter into the accounting. On the other hand, for many purposes, |G| will suffice.

First, we estimate the relationship between |V| and the size of a grammar G = (V, Σ, P, S).

Lemma 4.2.1 For any context-free grammar G = (V, Σ, P, S) such that L(G) ≠ ∅, L(G) ≠ {Λ}, and every letter in V occurs in at least one production:

i) 2 ≤ |V| ≤ |G|; (4.2.1)
ii) |V| ≤ 2n/(log n), where n = ‖G‖ = |G| log |V|. (4.2.2)

(All logs are to the base 2.)

Proof Since S ∈ N, |N| ≥ 1. Since Σ ≠ ∅, we have

|V| = |N| + |Σ| ≥ 1 + 1 = 2.

The upper bound of (i) follows from the assumption that each symbol of V appears in at least one production. From (i), we have

n = |G| log |V| ≥ |V| log |V| ≥ 2 log 2.

Consider the function

f(x) = 2√x − log x. (4.2.3)
If (4.2.3) is differentiated,

f′(x) = 1/√x − c/x, where c = 1/ln 2 = 1.44269…

Thus f′(x) > 0 if x > c² = 2.0813… The minimum of the function is at c² and is f(c²) = 1.8278… The function f is monotonically increasing for all x > c². Since our concern is only with positive integer values, the above argument shows that f(n) ≥ f(3) for all integers n ≥ 3. It is also true that

f(3) = 1.8791… > 1.8284… = f(2).

Hence f(n) ≥ f(2) = 2√2 − 1 > 0, or, for all n ≥ 2,

2√n − log n > 0. (4.2.4)

From (4.2.4), for all n ≥ 2,

2√n > log n.

Taking logs,

log(2√n) > log log n,
log 2 + ½ log n > log log n,
log 2 + log n − log log n > ½ log n,
log(2n/log n) > ½ log n.

Multiplying by 2n/(log n) (which is always positive in this range), we get:

(2n/log n) · log(2n/log n) > n;

but n ≥ |V| log |V|, so

(2n/log n) · log(2n/log n) > |V| log |V|.

It follows immediately that if x, y > 1 and x log x > y log y, then x > y. Taking x = 2n/(log n) and y = |V|, we have shown that

|V| ≤ 2n/(log n). □

Many of the current investigations into the complexity of computations give us only "order of magnitude" information about the time or the space required. Such statements are useful, and so we introduce some terminology which is convenient.
Definition Let

𝒪(f(n)) = {g(n) | there exist positive constants c, N_0 such that |g(n)| ≤ cf(n) for all n ≥ N_0};
Ω(f(n)) = {g(n) | there exist positive constants c and N_0 such that g(n) ≥ cf(n) for all n ≥ N_0};
Θ(f(n)) = {g(n) | there exist positive constants c, c′, and N_0 such that cf(n) ≤ g(n) ≤ c′f(n) for all n ≥ N_0}.

One can read 𝒪(f(n)) as "order at most f(n)"; Ω(f(n)) as "order at least f(n)"; Θ(f(n)) as "order exactly f(n)". These definitions are used in a peculiar but customary way. One writes

g(n) = 𝒪(n²) instead of g(n) ∈ 𝒪(n²).

Using equality signs in this context requires care, because = is no longer symmetric. For instance, it is true that

1 + 𝒪(n⁻¹) = 𝒪(1),

but 𝒪(1) = 1 + 𝒪(n⁻¹) is false.

Now, let us examine the computation of W_n in the reduction algorithm, Theorem 3.2.1.

Theorem 4.2.1 Let G = (V, Σ, P, S) be a context-free grammar. The computation of W = W_n given in Theorem 3.2.1 requires time 𝒪(‖G‖²/log ‖G‖).

Proof It is clear that computing W_{i+1} from W_i requires time proportional to the size of G. That means |G| if that measure is used, or ‖G‖ if that is used. We will work with ‖G‖. Moreover, the computation may take at most |N| iterations, so the total bound is 𝒪(|N| · ‖G‖). By Lemma 4.2.1,

𝒪(|N| · ‖G‖) ⊆ 𝒪(|V| · ‖G‖) ⊆ 𝒪(‖G‖²/log ‖G‖). □

Now we shall show how to do better. It will be assumed that we work on a computer with a random-access memory, which we refer to as a RAM. For the reader unfamiliar with this concept, imagine a typical general-purpose computer with an unbounded random-access memory. Each word of memory can hold an arbitrary number, and the instruction set is sufficient to compute any partial recursive function. It suffices to write programs for such a machine in a high-level language. (One can read Chapter 9 for more details about this class of machines.) Our program will be organized as follows.
Data Structures W is a subset of N; stack will denote a typical pushdown store, and we will have the usual push and pop procedures. For each B ∈ N, position(B) is a list of positions where B appears in the righthand side of a production; nongen(A → α) is an integer for each production. The program now follows; see Fig. 4.1.

begin
  W := ∅; stack := ∅;
  for all B ∈ N do position(B) := ∅;
  for all A → α in P do
    begin
      if α ∈ Σ* then
        begin
          if A ∉ W then
            begin W := W ∪ {A}; push(A) end
        end
      else
        comment α = α₁B₁α₂ ⋯ α_nB_nα_{n+1}, where n ≥ 1, α_i ∈ Σ*, and B_i ∈ N;
        begin
          nongen(A → α) := number of variables in α;
          for all B_i, 1 ≤ i ≤ n, do
            position(B_i) := position(B_i) ∪ {(A → α, i)}
        end
    end;
  while stack ≠ empty do
    begin
      B := pop;
      for all (A → α, i) in position(B) do
        begin
          nongen(A → α) := nongen(A → α) − 1;
          if nongen(A → α) = 0 and A ∉ W then
            begin W := W ∪ {A}; push(A) end
        end
    end
end.

FIG. 4.1 An improved program to calculate W.
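For readers who prefer running code, here is the same worklist idea in Python. The sketch is ours, not the book's: the data-structure names follow Fig. 4.1, but the representation of productions as (A, rhs) pairs is an implementation choice.

def generating_linear(N, Sigma, P):
    """Fig. 4.1 as a Python worklist algorithm; time proportional to |G|.
    P is a list of (A, rhs) pairs, rhs a sequence over N ∪ Sigma."""
    W, stack = set(), []
    position = {B: [] for B in N}   # production indices where B occurs on the right
    nongen = {}                     # count of right-side variables not yet in W
    for idx, (A, rhs) in enumerate(P):
        vars_in_rhs = [s for s in rhs if s in N]
        if not vars_in_rhs:
            if A not in W:
                W.add(A)
                stack.append(A)
        else:
            nongen[idx] = len(vars_in_rhs)
            for B in vars_in_rhs:
                position[B].append(idx)
    while stack:
        B = stack.pop()
        for idx in position[B]:
            nongen[idx] -= 1
            A = P[idx][0]
            if nongen[idx] == 0 and A not in W:
                W.add(A)
                stack.append(A)
    return W

Each production is scanned once to set up position and nongen, and each occurrence of a variable is decremented at most once, which is the source of the linear bound discussed next.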
It is not difficult to see that the program of Fig. 4.1 always terminates and calculates

W = {A ∈ N | A ⇒* x for some x ∈ Σ*}.

Moreover, the time required is proportional to the size of G, but that argument is more complicated.

PROBLEMS

1 Prove the claims made about the program given in this section.

2 In the argument of Lemma 4.2.1, Part (i), where are the hypotheses that L(G) ≠ ∅, L(G) ≠ {Λ} needed? Show that (i) is false without the hypothesis. Explain.

3 Let k ≥ 1 and L_k = {a^{4^k}}. Show that, for any context-free grammar G such that L(G) = L_k, we must have |G| ≥ 5k. The argument is rather complex, and so a sketch is given here. If the reader can prove the claims, the result will follow. Let G = (V, Σ, P, S) be a context-free grammar and let P = {A_i → α_i | 1 ≤ i ≤ p}. Assume G minimizes |G|.

a) If L(G) is finite, show that there is no loss of generality in assuming that there are no recursive variables. (A ⇒⁺ αAβ never occurs.)
b) Show that we may assume that G is Λ-free if L(G) has cardinality one.
c) Show that, if L(G) = {x}, then

lg(x) ≤ ∏_{i=1}^{p} lg(α_i).

d) Show that there is no context-free grammar G_1 which can generate {a⁴} with |G_1| < 5.
e) Show that there do not exist nonnegative integers p and x_i, 1 ≤ i ≤ p, such that

∏_{i=1}^{p} x_i ≥ 4^k and ∑_{i=1}^{p} (x_i + 1) < 5k.

4.3 ELIMINATION OF NULL AND CHAIN RULES

Our first goal is to show that null rules may be eliminated from a context-free grammar. If we do this carefully, the language L(G) will not be altered if Λ ∉ L(G). If Λ ∈ L(G), then we could eliminate all null rules and get a grammar for L(G) − {Λ}. Instead, we'll eliminate all null rules except S → Λ and further restrict the new grammar so that S appears on no righthand side of a production. An essential part of a proof of this result is an algorithm to determine which variables can generate Λ. The first algorithm is the "classical" one due to Bar-Hillel, Perles, and Shamir [1961].

Theorem 4.3.1 Let G = (V, Σ, P, S) be a context-free grammar. There is an algorithm for producing a context-free grammar G′ = (V′, Σ, P′, S′), so that
i) L(G′) = L(G);
ii) A → Λ is in P′ if and only if Λ ∈ L(G) and A = S′;
iii) S′ does not occur on the righthand side of any production in P′.

Proof Let n = |N| and W_1 = {A | A → Λ is in P}. For each k ≥ 1, let

W_{k+1} = W_k ∪ {A | A → α is in P for some α ∈ W_k*}.

It is easy to see that:

1 W_i ⊆ W_{i+1} for each i ≥ 1;
2 if W_i = W_{i+1}, then W_i = W_{i+m} for each m ≥ 1;
3 W_n = W_{n+1};
4 W_n = {A ∈ N | A ⇒* Λ};
5 Λ ∈ L(G) if and only if S ∈ W_n.

To construct the new grammar, define G′ = (V ∪ {S′}, Σ, P′, S′), where

P′ = {S′ → S} ∪ {S′ → Λ | S ∈ W_n} ∪ {A → A_1 ⋯ A_k | k ≥ 1; A_i ∈ V; there exist α_1, …, α_{k+1} ∈ W_n* so that A → α_1A_1 ⋯ α_kA_kα_{k+1} is in P}.

Clearly, S′ does not appear on the righthand side of any production in P′. Moreover, Λ ∈ L(G′) if and only if S′ → Λ is in P′, if and only if S ∈ W_n, if and only if S ⇒* Λ, if and only if Λ ∈ L(G).

It only remains for us to show that L(G′) = L(G). First, we see that, for each production A → A_1 ⋯ A_k in P′, there must have been a production A → α_1A_1 ⋯ α_kA_kα_{k+1} in P, where α_i ∈ W_n* for 1 ≤ i ≤ k + 1. Thus α_i ⇒* Λ, and so

A ⇒ α_1A_1 ⋯ α_kA_kα_{k+1} ⇒* A_1 ⋯ A_k.

Thus every production in G′ can be simulated in G by a derivation, so that L(G′) ⊆ L(G). To see that L(G) ⊆ L(G′), we shall prove, by induction on the length h of a derivation, that if A ⇒ʰ w, w ∈ Σ⁺, then A ⇒*_{G′} w.

Basis. h = 1. If A → w is in P, then, since w ≠ Λ, A → w is in P′.

Induction step. Suppose A ⇒ A_1 ⋯ A_k ⇒* w, w ∈ Σ⁺, by a derivation of length h + 1. By Lemma 3.3.1, there exist w_i ∈ Σ* so that w = w_1 ⋯ w_k and A_i ⇒^{h_i} w_i for some h_i ≤ h. If w_i ∈ Σ⁺, then A_i ⇒*_{G′} w_i by the induction hypothesis. If w_i = Λ, then A_i ∈ W_n, and A → A_{j_1} ⋯ A_{j_p} is in P′, where 1 ≤ p ≤ k and A_{j_1} ⋯ A_{j_p} is the subsequence of A_1 ⋯ A_k obtained by deleting those A_i for which w_i = Λ. Therefore,

A ⇒_{G′} A_{j_1} ⋯ A_{j_p} ⇒*_{G′} w_{j_1} ⋯ w_{j_p} = w. □

Corollary There is an algorithm which decides whether Λ ∈ L(G) for any context-free grammar G.
Proof From (5), Λ ∈ L(G) if and only if S ∈ W_n. □

Example We can illustrate the construction with the following grammar, where Σ = {a, b, x}:

S → aOb
O → P | aOb | OO
P → x | E
E → Λ

The grammar is part of ALGOL. To see this, make the following associations:

S stands for ⟨string⟩;
O stands for ⟨open string⟩;
P stands for ⟨proper string⟩;
E stands for ⟨empty⟩;
a and b stand for the opening and closing string quotes;
x stands for ⟨any sequence of basic symbols not containing string quotes⟩.

Here we regard x as a terminal, for simplicity. We compute:

W_1 = {E}, W_2 = {E, P}, W_3 = {E, P, O}, and W_n = W_3.

The new grammar is:

S′ → S
S → aOb | ab
O → P | aOb | OO | O | ab
P → x | E

Note that the new grammar contains many redundancies. When it is reduced, it may not even resemble the original one. Note that these changes all came merely from the single rule E → Λ. In more complex examples, the grammars are changed drastically.

Now that we have developed this grammatical transformation, it is easy to see how to pass from any context-free grammar G to a Λ-free grammar which generates L(G) − {Λ}.

Example The next example will help us understand the complexity of this algorithm.

S → A_1 ⋯ A_k
A_i → a_i | Λ for all i, 1 ≤ i ≤ k.
Note that |Σ| = k, |V| = 2k + 1, |G| = 4k + 1, and ‖G‖ = (4k + 1) log(2k + 1). If we compute G′, we find that P′ is as follows:

S′ → S | Λ
S → X_1 ⋯ X_k, where X_i ∈ {A_i, Λ} and X_1 ⋯ X_k ≠ Λ,
A_i → a_i, 1 ≤ i ≤ k,

where |V′| = 2k + 2. To compute |G′|, we merely count the symbols in each production. This results in

|G′| = 2 + 1 + 2k + ∑_{i=1}^{k} (i + 1)·(k choose i)
     = 2k + 2 + 2^k + k·2^{k−1}
     = (k + 2)·2^{k−1} + 2k + 2.

This algorithm takes exponential time no matter how the set of variables that derive Λ is computed, because it takes that long even to write down G′. It is possible to do much better: one can in fact show that this can be done in essentially linear time.

The next case to be considered is the elimination of "chain rules."

Definition Let G = (V, Σ, P, S) be a context-free grammar. A production A → B in P is called a chain rule if A, B ∈ N.

Now we present an algorithm to eliminate both chain and Λ-rules.

Theorem 4.3.2 For each context-free grammar G = (V, Σ, P, S), there is a grammar G′ = (V, Σ, P′, S) so that L(G′) = L(G) and P′ has only productions of the form

S → Λ,
A → α, α ∈ (V − {S})⁺, lg(α) ≥ 2,
A → a, a ∈ Σ.

Proof We may assume, without loss of generality, that G = (V, Σ, P, S) satisfies the conditions of Theorem 4.3.1. Let A ∈ N; we shall eliminate all chain rules A → B, B ∈ N, by the following construction. Define W_0(A) = {A} and, for each i ≥ 0,

W_{i+1}(A) = W_i(A) ∪ {B ∈ N | C → B is in P for some C ∈ W_i(A)}.
The following facts are immediate:

i) for each i ≥ 0, W_i(A) ⊆ W_{i+1}(A);
ii) if W_i(A) = W_{i+1}(A), then, for all m ≥ 0, W_i(A) = W_{i+m}(A);
iii) W_n(A) = W_{n+1}(A), where n = |N|;
iv) W_n(A) = {B ∈ N | A ⇒* B}.

Next, we define G′ = (V, Σ, P′, S), where

P′ = {A → α | α ∉ N, and B → α is in P for some B ∈ W_n(A)}.

It is clear that G′ satisfies the conditions of the theorem. It is straightforward to verify that L(G′) = L(G). □

Example

E → E + T | T
T → T*F | F
F → (E) | a

Clearly, W_3(F) = {F}, W_3(T) = {T, F}, W_3(E) = {E, T, F}. The new grammar is:

E → E + T | T*F | (E) | a
T → T*F | (E) | a
F → (E) | a

Let us also consider the grammar shown below.

Example Consider the grammar

A_i → A_{i+1} | a_iA_{i+1} for each i, 1 ≤ i < k,
A_k → a_k.

The start variable is A_1. For this grammar G, |G| = 5k − 3, and so ‖G‖ = 𝒪(k log k). If we apply the transformation given in Theorem 4.3.2, we get G′, as shown below:

A_i → a_jA_{j+1} for each i, j, 1 ≤ i ≤ j < k,
A_i → a_k for each i, 1 ≤ i ≤ k.

Clearly,

|G′| = 2k + 3·∑_{j=1}^{k−1} j = k(3k + 1)/2,

or

‖G′‖ = 𝒪(k² log k) = 𝒪(‖G‖²/log ‖G‖).

This example shows that the transformation can enlarge G′ by a factor of ‖G‖/(log ‖G‖).
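The sets W_n(A) and the new production set can be computed directly from the definitions. The sketch below is ours (Python; P is a list of (A, rhs) pairs with rhs a sequence of symbols), and it assumes the input grammar already satisfies Theorem 4.3.1:

def eliminate_chains(N, P):
    """Theorem 4.3.2: chain[A] plays the role of W_n(A)."""
    chain = {A: {A} for A in N}
    changed = True
    while changed:
        changed = False
        for C, rhs in P:
            if len(rhs) == 1 and rhs[0] in N:        # a chain rule C -> B
                for A in N:
                    if C in chain[A] and rhs[0] not in chain[A]:
                        chain[A].add(rhs[0])
                        changed = True
    # A inherits every non-chain production of every B in chain[A].
    return [(A, rhs)
            for A in N
            for B, rhs in P
            if B in chain[A] and not (len(rhs) == 1 and rhs[0] in N)]

Applied to the expression grammar of the first example, the sketch reproduces exactly the nine productions displayed there; applied to the second example, it exhibits the quadratic growth just computed.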
PROBLEMS

1 Show that the nested-set algorithm given in Theorem 4.3.1 to compute W_n takes time less than a constant times ||G||²/log ||G||.

2 Devise a new algorithm which computes W_n of Theorem 4.3.1 and which takes a length of time that is linear in the size of the grammar. [Hint: One approach requires a stack and, for each B ∈ N, a list of all places in the righthand sides of productions at which B occurs. One also needs to keep a count of how many symbols in a righthand side cannot produce Λ.]

3 Devise a new algorithm for computing a Λ-free equivalent of G which works in linear time (using the |G| measure), assuming that W_n is computable in linear time.

4 Prove (if it is true) that the constructions given in this section preserve the unambiguity of the original grammar. Do these constructions change the "degree of ambiguity" of a grammar?

5 State and prove a result for chain-rule elimination in the presence of null rules.

6 Consider the last example of Λ-rule elimination in this section. Estimate ||G'|| as a function of ||G||.

Let us extend our notions of complexity from grammars to languages. Define |L|, for L a context-free language, to be:

|L| = min{|G| : G is a context-free grammar and L(G) = L}.

Although the same notation is used for the cardinality of L, we shall distinguish between the two concepts whenever the context is not clear. A similar definition holds for ||L||.

7 Let us define

|L|_Λ = min{|G| : G is a Λ-free, context-free grammar and L(G) = L},

where L is any context-free language with Λ ∉ L. Show that, for any context-free language L with Λ ∉ L, we have |L|_Λ ≤ c|L| for some constant c, where |L| is defined above. Would a similar result be true if we changed our definition of |G| to be the number of productions?

4.4 CHOMSKY NORMAL FORM AND ITS VARIANTS

In a context-free grammar, there is no a priori bound on the size of a righthand side of a production. Many proofs can be simplified if these righthand sides are bounded to be of length at most 2; then no grammatical tree ever has more than binary branching. There are many ways to achieve this effect, and different normal forms arise.
We mention several of these here. There exist applications for each in which none of the others would suffice.

Definition A context-free grammar G = (V, Σ, P, S) is said to be in canonical two form if each rule is of the form

i) A → BC with B, C ∈ N,
ii) A → B with B ∈ N,
iii) A → a with a ∈ Σ,
iv) S → Λ.

Furthermore, if S → Λ is in P, then B, C ∈ N − {S} in (i) and (ii).

This form guarantees that we have at most binary branching, but we may still have chains, as shown in Fig. 4.2. There is another normal form which eliminates that possibility.

Definition A context-free grammar G = (V, Σ, P, S) is in Chomsky normal form if each rule is of the form

i) A → BC with B, C ∈ N,
ii) A → a with a ∈ Σ,
iii) S → Λ.

Furthermore, if S → Λ is in P, then B, C ∈ N − {S} in (i).

Next we show that any context-free language may be generated by a grammar in Chomsky normal form.

Theorem 4.4.1 For each context-free grammar G = (V, Σ, P, S), there is a grammar G' = (V', Σ, P', S) so that L(G') = L(G) and G' is in Chomsky normal form.

Proof We may assume that G satisfies the conditions of Theorem 4.3.2. Furthermore, for each production A → B_1 ⋯ B_r in P, with B_i ∈ V and r ≥ 2, we may assume that each B_i ∈ N; for, if some B_j ∈ Σ, replace B_j by a new nonterminal symbol C and add a production C → B_j. (Recall that B_j ∈ Σ.) This is repeated for all such instances in all the rules. Now we can construct P'.

1 If A → α is in P and lg(α) ≤ 2, then A → α is in P'.

FIG. 4.2 (a derivation tree exhibiting a chain of unary branches)
2 Let A → B_1 ⋯ B_r, r ≥ 3, B_i ∈ N, be in P. Then P' contains:

A → B_1C_1
C_1 → B_2C_2
⋮
C_{r−3} → B_{r−2}C_{r−2}
C_{r−2} → B_{r−1}B_r

where C_1, …, C_{r−2} are all new symbols. [Note that if r = 3, the rules are A → B_1C_1 and C_1 → B_2B_3.] This transformation is illustrated in Fig. 4.3.

FIG. 4.3 (the tree for A → B_1 ⋯ B_r and its equivalent binary-branching tree)

It is now a straightforward matter to verify that L(G') = L(G). □

Example Consider the following grammar:

S → TbT
T → TaT | ca

First, we eliminate terminals on the right, and produce:

S → TBT
T → TAT
T → CA
B → b
A → a
C → c

Only the first two rules need modification to achieve Chomsky normal form. Thus we have

S → TD_1
D_1 → BT
T → TD_2
D_2 → AT
T → CA
B → b
A → a
C → c
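The two steps in the proof of Theorem 4.4.1 translate directly into code. The sketch below is our own illustration, not part of the text; the names to_cnf, wrap, and the D-counter are invented. It assumes the input grammar already satisfies Theorem 4.3.2, i.e., has no chain rules and no Λ-rules except possibly at the start symbol.

    def to_cnf(P, N, Sigma):
        # P: set of (A, rhs-tuple) pairs; N: nonterminals; Sigma: terminals.
        newP, counter = set(), 0
        term_var = {}                        # terminal b -> its wrapper variable
        def wrap(b):
            if b not in term_var:
                term_var[b] = "C_" + b
                newP.add((term_var[b], (b,)))
            return term_var[b]
        for A, rhs in P:
            if len(rhs) <= 1:                # A -> a or S -> Λ: already legal
                newP.add((A, rhs))
                continue
            # replace terminals inside long righthand sides by wrapper variables
            rhs = tuple(wrap(s) if s in Sigma else s for s in rhs)
            while len(rhs) > 2:              # step 2: peel off the leftmost variable
                counter += 1
                c = "D_%d" % counter
                newP.add((A, (rhs[0], c)))
                A, rhs = c, rhs[1:]
            newP.add((A, rhs))
        return newP

Each rule of width r contributes r − 2 new variables, which is the source of the linear size relationship explored in Problems 6 through 8 below.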
A similar theorem may be stated for the canonical two form.

Theorem 4.4.2 For each context-free grammar G, one may effectively construct an equivalent context-free grammar in canonical two form.

Another form may be employed in which the generation trees always have binary branching.

Definition Let G = (V, Σ, P, S) be a context-free grammar. G is said to be in binary standard form if each rule is of the form

A → BC, where A ∈ N and B, C ∈ V.

We leave to the reader (Problem 2, below) the task of formulating and proving the theorem which tells how general the binary standard form is.

PROBLEMS

1 Prove Theorem 4.4.2.

2 State as general a theorem as possible about the binary standard form.

3 Place the following grammar into Chomsky normal form, canonical two form, and binary standard form.

S → aAC
A → aB | bAB
B → b
C → c

4 Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form. Suppose w ∈ L(G) and S ⇒^i w. Find a relationship between i and lg(w).

5 Show that if G is an unambiguous context-free grammar and if G' is the Chomsky normal-form grammar for G constructed by Theorem 4.4.1, then G' is unambiguous.

6 Suppose G = (V, Σ, P, S) is a context-free grammar. Show that one can find an equivalent grammar G' which is in canonical two form, where every variable in G is in G', and where |G'| is proportional to |G|. Is the result also true for ||G||?

7 Is the result of Problem 6 also valid for Chomsky normal form?

8 Work out the relationship between |G| (and ||G||) and |G'| (respectively, ||G'||), where G' is the binary standard form grammar of G.
4.5 INVERTIBLE GRAMMARS

The motivation for studying the next class of grammars comes from the theory of bottom-up parsing. Crudely speaking, bottom-up parsing consists of (i) successively finding phrases and (ii) reducing them to their parents. In a certain sense, each half of this process can be made simple, but only at the expense of the other. Invertible grammars allow reduction decisions to be made simply.

Definition A context-free grammar G = (V, Σ, P, S) is said to be invertible if A → α and B → α in P implies A = B.

Thus invertible grammars have unique righthand sides of the productions, and the reduction phase of parsing becomes a matter of table lookup. The reason for this name is that P⁻¹ is a function exactly when G is invertible.

Next we show that each context-free language can be given an invertible grammar.

Theorem 4.5.1 For each context-free grammar G = (V, Σ, P, S), there is an invertible context-free grammar G' = (V', Σ, P', S) such that L(G') = L(G).

Proof The proof is an example of trivialization. That is, there is an unnatural argument which proves the result in such a way as to destroy the potential utility of the theorem. Let G = (V, Σ, P, S), where N = {A_1, …, A_n}. Let R be a new symbol not in V and define G' = (V', Σ, P', S), where N' = N ∪ {R} and

P' = {A_i → αRⁱ | A_i → α is in P} ∪ {R → Λ}.

Clearly, G' is invertible and L(G') = L(G). □

The trivialization occurs in the previous proof because null rules are used to encode the index of the variable in unary. We prefer to allow Λ-rules only if absolutely necessary. Now we show how to prove the previous theorem without this trick. The argument becomes more complex, but the result is more useful.

Theorem 4.5.2 For each context-free grammar G = (V, Σ, P, S), there is an invertible context-free grammar G' = (V', Σ, P', S') so that L(G') = L(G). Moreover,

i) A → Λ is in P' if and only if Λ ∈ L(G) and A = S'.
ii) S' does not appear on the righthand side of any rule in P'.

Proof Let us assume, without loss of generality, that G = (V, Σ, P, S) is Λ-free and chain-free. [If Λ ∈ L(G), then L_1 = L(G) − {Λ} has a grammar G' = (V', Σ, P', S') which is Λ-free and chain-free. If the result has been proved for G', take G'' = (V' ∪ {S''}, Σ, P'', S''), where P'' = P' ∪ {S'' → Λ, S'' → S'}. Clearly, L(G'') = L(G), and G'' will be invertible if G' is.] Let G' = (V', Σ, P', S') and let N' = {S'} ∪ N'', where N'' consists of all nonempty subsets of N.
P' is defined to include exactly:

1 S' → A is in P', where A is any subset of N containing S.

2 For each production B → x_0B_1x_1 ⋯ B_nx_n in P, with n ≥ 0, B_1, …, B_n ∈ N, and x_0, …, x_n ∈ Σ*, and for each A_1, …, A_n ∈ N' − {S'}, P' contains

A → x_0A_1x_1 ⋯ A_nx_n, where A = {C | C → x_0C_1x_1 ⋯ C_nx_n is in P for some C_1, …, C_n with each C_i ∈ A_i}.

If C → y_0C_1y_1 ⋯ C_ny_n is in P, with y_0, …, y_n ∈ Σ* and C_i ∈ V − Σ, we call the string y_0 − y_1 − ⋯ − y_n the stencil of the production (variables replaced by dashes). Note that P and P' have the same set of stencils if we exclude the rules of the form (1) from P'.

Assume without loss of generality that G' is reduced. Before embarking on a proof that L(G') = L(G), we give an example of the construction.

Example Consider the following grammar

S → 0A | 1B
A → 0A | 0S | 1B
B → 1 | 0

Applying the construction of the theorem leads to the following grammar:

{B} → 1 | 0
{A} → 0{S} | 0{S,B}
{A,S} → 0{A} | 0{A,B} | 0{A,S} | 0{A,S,B} | 1{B} | 1{B,A} | 1{B,A,S} | 1{B,S}
S' → {S} | {A,S} | {B,S} | {A,B,S}

Reducing the grammar leads to

S' → {A,S}
{B} → 1 | 0
{A,S} → 0{A,S} | 1{B}

Note that G' is invertible.
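Before verifying correctness, it may help to see the construction run mechanically. The following Python sketch is our own illustration, with invented names make_invertible and split: it generates only those subset-variables that derive some terminal string, so on the example above it produces the reduced grammar directly. Subset-variables are represented as frozensets, and a rule is a (lefthand side, righthand-side tuple) pair.

    from itertools import product

    def make_invertible(P, N, S):
        # Subset construction of Theorem 4.5.2 (G assumed Λ-free and chain-free).
        def split(rhs):                    # stencil and variable sequence of a rule
            segs, vs, cur = [], [], []
            for s in rhs:
                if s in N:
                    segs.append(tuple(cur)); cur = []; vs.append(s)
                else:
                    cur.append(s)
            segs.append(tuple(cur))
            return tuple(segs), tuple(vs)
        by_stencil = {}
        for A, rhs in P:
            st, vs = split(rhs)
            by_stencil.setdefault(st, []).append((A, vs))
        subsets, rules = set(), set()
        changed = True
        while changed:
            changed = False
            snapshot = list(subsets)
            for st, prods in by_stencil.items():
                n = len(prods[0][1])
                for choice in product(snapshot, repeat=n):
                    lhs = frozenset(C for C, vs in prods
                                    if all(vs[i] in choice[i] for i in range(n)))
                    if not lhs:
                        continue
                    rhs = list(st[0])
                    for i in range(n):
                        rhs.append(choice[i]); rhs.extend(st[i + 1])
                    rules.add((lhs, tuple(rhs)))
                    if lhs not in subsets:
                        subsets.add(lhs); changed = True
        for A in subsets:                  # rules of form (1): S' → A with S ∈ A
            if S in A:
                rules.add(("S'", (A,)))
        return rules

As the last example of this section shows, the number of subset-variables can be exponential in |N|, so no implementation of this construction can be efficient in general.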
Now we begin the proof that L(G') = L(G).

Claim 1 For each A ∈ N' and each x ∈ Σ*, A ⇒* x in G' implies B ⇒* x in G for each B ∈ A.

Proof The argument is an induction on ℓ, the length of a derivation in G'.

Basis. Suppose ℓ = 1. Then A ⇒ x ∈ Σ* in G', and A → x is in P'. By the construction, A = {C ∈ N | C → x is in P}. Thus A → x is in P' if and only if B → x is in P for each B ∈ A.

Induction step. Suppose ℓ ≥ 2 and Claim 1 holds for all derivations of length less than ℓ. Then suppose

A ⇒ x_0A_1x_1 ⋯ A_nx_n ⇒* x in G'.

This implies that, for each i, 1 ≤ i ≤ n, A_i ⇒* y_i ∈ Σ* and x_0y_1x_1 ⋯ y_nx_n = x. By the construction, for each B ∈ A there exist B_i ∈ A_i so that B → x_0B_1x_1 ⋯ B_nx_n is in P. Moreover, the induction hypothesis implies that B_i ⇒* y_i in G, and therefore

B ⇒ x_0B_1x_1 ⋯ B_nx_n ⇒* x_0y_1x_1 ⋯ y_nx_n = x. □

Note that Claim 1 implies that L(G') ⊆ L(G). To complete the proof, the following result is needed.

Claim 2 For each x ∈ Σ*, let X_x = {C ∈ N | C ⇒* x in G}. If B ⇒* x in G, then A ⇒* x in G' for some A such that B ∈ A ⊆ X_x.

Proof The argument is an induction on ℓ, the length of a derivation in G.

Basis. ℓ = 1. Suppose B ⇒ x ∈ Σ*, so B → x is in P. Then A → x is in P' with B ∈ A = {C ∈ N | C → x is in P}.

Induction step. Suppose

B ⇒ x_0B_1x_1 ⋯ B_nx_n ⇒* x_0y_1x_1 ⋯ y_nx_n = x ∈ Σ*

in G is a derivation of length ℓ, where B, B_1, …, B_n ∈ N and x_0, …, x_n, y_1, …, y_n ∈ Σ*. There are derivations B_i ⇒* y_i, all of which have length less than ℓ. By the induction hypothesis, there are A_i ∈ N' so that A_i ⇒* y_i in G' and B_i ∈ A_i. By the construction, A → x_0A_1x_1 ⋯ A_nx_n is in P' with B ∈ A. Thus

A ⇒ x_0A_1x_1 ⋯ A_nx_n ⇒* x_0y_1x_1 ⋯ y_nx_n = x in G'. □

By Claim 2, L(G') ⊇ L(G), and hence L(G') = L(G). □

The construction given seems to be exponential in nature. The following example will confirm it.

Example Let k be a positive even integer. Let Σ = {a, b, c} and N = {A_1, …, A_k}. Let σ_a = (1, 2, …, k) and σ_b = (1, 2) be two permutations written in cyclic notation. These two permutations generate the symmetric group on k letters; that is, the group of all permutations of the letters {1, …, k}. Define P by:

A_i → aA_{σ_a(i)} | bA_{σ_b(i)} for each i, 1 ≤ i ≤ k;
A_i → c for each odd i, 1 ≤ i ≤ k.
For G = (V, Σ, P, A_1), we have |V| = k + 3 and |G| = 7k. Now the variables of G' will be sets of variables such as {A_1, A_3, …}. To simplify the notation, we will write {1, 3, …} instead. If the computation of G' is carried out and the grammar is reduced, we get

N' = {B ⊆ N : |B| = k/2} ∪ {S'}

and

P' = {S' → B | A_1 ∈ B ⊆ N, |B| = k/2}
   ∪ {B → aB', B → bB'' | B ⊆ N, |B| = k/2, B' = σ_a(B), B'' = σ_b(B)}
   ∪ {{1, 3, …, k−1} → c}.

Clearly,

|V'| = C(k, k/2) + 4 > 2^{k/2},

and one can show that

||G'|| = O((n/log n)·2^{n/log n}), where n = ||G||.   (4.5.1)

PROBLEMS

1 Why did we assume that G was chain-free in the proof of Theorem 4.5.2?

2 If we assume that G is unambiguous, then does the following stronger correspondence hold in the proof of Theorem 4.5.2?

Proposition B ⇒* x in G if and only if {B} ⇒* x in G'.

3 Show that, for any context-free grammar G, there is a context-free grammar G' such that L(G) = L(G') which is invertible and chain-free.

4 Show that there exist context-free grammars G with Λ ∈ L(G) for which there do not exist equivalent grammars that are invertible, chain-free, and Λ-free.

5 Continuing the last example of Section 4.5,
a) Prove Eq. (4.5.1).
b) Find an equivalent invertible grammar of size k².
c) Note that L(G) is regular. Find a finite automaton for L(G) of "size" proportional to k. [The size of a finite automaton may be thought of as the length of all its transitions when written out as a string.]

6 There is an intimate relationship between invertibility and determinism in the case of one-sided linear grammars. Can you expand on this insight?
4.6 GREIBACH NORMAL FORM

In this section, another normal form is considered which has important implications in both theory and practice. The most important practical applications of this form are to top-down parsing. A problem occurs in top-down parsing when we have "left recursive" variables.

Definition Let G = (V, Σ, P, S) be a context-free grammar. A variable A ∈ N is said to be left recursive (right recursive) if A ⇒⁺ Aα for some α ∈ V* (respectively, A ⇒⁺ αA). A grammar is left (right) recursive if it has at least one left (right) recursive variable.

In "recursive descent" parsers, the presence of left recursion causes the device to go into an infinite loop. Thus the elimination of left recursion is of practical importance in such parsers. It will turn out that a grammar in the form we are now introducing is never left recursive. Thus, the main theorem of this section will solve this looping problem in top-down parsers. We begin by considering two lemmas, even before the normal form is introduced.

Lemma 4.6.1 Let G = (V, Σ, P, S) be a context-free grammar. Let π = A → α_1Bα_2 be a production in P, and let B → β_1 | ⋯ | β_r be all the rules in P with B on the lefthand side. Define G_1 = (V, Σ, P_1, S), where

P_1 = (P − {π}) ∪ {A → α_1β_1α_2 | α_1β_2α_2 | ⋯ | α_1β_rα_2}.

Then L(G_1) = L(G).

Proof Clearly, L(G_1) ⊆ L(G), since if A → α_1β_iα_2 is used in a G_1-derivation, then A ⇒ α_1Bα_2 ⇒ α_1β_iα_2 can occur in G. Conversely, note that A → α_1Bα_2 is the only production of G not in G_1. If A → α_1Bα_2 is used in a G-derivation of a terminal string, then B must be rewritten by some B → β_i. These two steps are combined in G_1. Therefore, L(G) ⊆ L(G_1). □

The next lemma involves certain regular sets.

Lemma 4.6.2 Let G = (V, Σ, P, S) be a Λ-free, context-free grammar. Let

A → Aα_1 | ⋯ | Aα_r, where α_i ≠ Λ for all i, 1 ≤ i ≤ r,

be all the rules with A on the left such that the leftmost symbol of the righthand side of the rule is A. Let

A → β_1 | ⋯ | β_s

be the remaining rules with A on the left. Let G_1 = (V ∪ {Z}, Σ, P_1, S), where Z is a new nonterminal symbol and P_1 is the set of productions in P with all the productions having A on the left replaced by

A → β_1 | ⋯ | β_s | β_1Z | ⋯ | β_sZ,
Z → α_1Z | ⋯ | α_rZ | α_1 | ⋯ | α_r.

Then L(G_1) = L(G).
Proof The effect of this construction is to eliminate the left recursion in the variable A. In each place, we have a new right recursive variable Z. Note that none of the new A-rules are directly left recursive, because none of the β_i begin with A. Since β_i ≠ Λ, we cannot get left recursion on A by "going through" Z. Also, note that Z is not a left recursive variable, because α_i ≠ Λ for all i and because none of the α_i begin with Z.

To complete the proof, note that the original productions in G with A on the left generate the regular set

{β_1, …, β_s}{α_1, …, α_r}*.

Moreover, this is the set generated by A (with the help of Z) in G_1. With the aid of this remark, it is straightforward to verify that L(G_1) = L(G). We omit the details. □

Now the grammatical form is introduced.

Definition A context-free grammar G = (V, Σ, P, S) is said to be in Greibach normal form if each rule is of one of the forms

A → aB_1 ⋯ B_n,
A → a,
S → Λ,

where B_1, …, B_n ∈ N − {S} and a ∈ Σ. A grammar is said to be in m-standard form if G is in Greibach form, as above, and n ≤ m.

Thus a grammar is in Greibach form if and only if it is in m-standard form for some m. There are a number of alternative definitions of Greibach form. In certain textbooks, all rules are required to be of the form A → aα with a ∈ Σ, α ∈ N*. Such a definition rules out Λ from being in the language. In other places, the rules may be of the form A → aβ, where a ∈ Σ and β ∈ V*; that is, terminals as well as variables may occur in β. One must be quite careful in quoting theorems from the literature that involve this grammatical form.

This form has a number of important implications. Ignoring S → Λ, all other rules have righthand sides that begin with a terminal. If G is in Greibach normal form and w ∈ L(G), w ≠ Λ, then every derivation of w from S in G has exactly lg(w) steps. The following proposition generalizes this fact and states it precisely.

Theorem 4.6.1 Let G = (V, Σ, P, S) be a context-free grammar in Greibach normal form. If w ∈ Σ*, α ∈ N*, and S ⇒⁺ wα, then every derivation of S ⇒⁺ wα has exactly max{1, lg(w)} steps.

We are now in a position to prove the main result.
Theorem 4.6.2 Every context-free language may be generated by a context-free grammar in Greibach normal form.

Proof Suppose, without loss of generality, that L = L(G), where G = (V, Σ, P, S) is reduced and in Chomsky normal form, and assume (for the moment) that G is Λ-free. Let N = {A_1, …, A_m} with S = A_1.

We begin by modifying the grammar so that if A_i → A_jα is in P, then j > i. Suppose that this has been done for all rules, starting with A_1 and proceeding through A_k. Inductively, we may suppose that A_i → A_jα for 1 ≤ i ≤ k implies j > i. We now deal with the rules that have A_{k+1} on the left. If A_{k+1} → A_jα is a production with j < k + 1, we generate a new set of productions by substituting for A_j the righthand side of each of the A_j-rules, according to Lemma 4.6.1. By repeating this operation at most k times, we will obtain rules of the form A_{k+1} → A_ℓα with ℓ ≥ k + 1. The rules with ℓ = k + 1 are replaced according to Lemma 4.6.2, in which we introduce a new nonterminal Z_{k+1}. Repeating the construction for each original nonterminal gives rules of the following forms:

1 A_k → A_ℓα with ℓ > k, α ∈ ((N − {S}) ∪ {Z_1, …, Z_n})⁺;
2 A_k → aα, where a ∈ Σ, α ∈ ((N − {S}) ∪ {Z_1, …, Z_n})*;
3 Z_k → α, where α ∈ ((N − {S}) ∪ {Z_1, …, Z_n})⁺.

The reader should verify that the exact quantification given above is correct. This requires a separate proof and employs the fact that G was in Chomsky normal form.

Note that the leftmost symbol on the righthand side of any A_m-production must be terminal, by rule (1) and the fact that there is no higher-indexed nonterminal. For A_{m−1}, the leftmost symbol generated is terminal or A_m. If it is A_m, generate new rules by use of Lemma 4.6.1. These new rules all begin with a terminal symbol. Repeat the process for A_{m−2}, …, A_1. Finally, examine the Z_1, …, Z_n rules. These rules begin with an A_i. For each A_i, use Lemma 4.6.1 again. Because G was in Chomsky normal form, the only terminal on the righthand side of any rule is the first one.

To complete the proof, we must deal with the case where Λ ∈ L. If this occurs, let G = (V, Σ, P, S) be the Greibach normal-form grammar for L − {Λ}. Define G' = (V ∪ {S'}, Σ, P', S'), where P' = P ∪ {S' → S} ∪ {S' → Λ}. Use Lemma 4.6.1 once on the rule S' → S. Clearly, the resulting grammar G' is in Greibach normal form and generates L. □

Corollary Every context-free language may be given a context-free grammar which is not left recursive.

Example Consider G as shown below:

A_1 → A_2A_3
A_2 → A_1A_2 | 1
A_3 → A_1A_3 | 0
Applying Lemma 4.6.1 to A_2 yields:

A_1 → A_2A_3
A_2 → A_2A_3A_2 | 1
A_3 → A_1A_3 | 0

Now Lemma 4.6.2 can be applied to the second production:

A_1 → A_2A_3
A_2 → 1Z_1 | 1
Z_1 → A_3A_2Z_1 | A_3A_2
A_3 → A_1A_3 | 0

Focussing on A_3, we use Lemma 4.6.1 to obtain:

A_1 → A_2A_3
A_2 → 1Z_1 | 1
Z_1 → A_3A_2Z_1 | A_3A_2
A_3 → A_2A_3A_3 | 0

Now Lemma 4.6.1 is used again on A_3:

A_1 → A_2A_3
A_2 → 1Z_1 | 1
Z_1 → A_3A_2Z_1 | A_3A_2
A_3 → 1Z_1A_3A_3 | 1A_3A_3 | 0

Now it is possible to begin the process of back-substitution. A_3 and A_2 are already in Greibach form, but A_1 is not:

A_1 → 1Z_1A_3 | 1A_3
A_2 → 1Z_1 | 1
A_3 → 1Z_1A_3A_3 | 1A_3A_3 | 0
Z_1 → A_3A_2Z_1 | A_3A_2

If we substitute for A_3 in the Z_1 rules, we obtain:

A_1 → 1Z_1A_3 | 1A_3
A_2 → 1Z_1 | 1
A_3 → 1Z_1A_3A_3 | 1A_3A_3 | 0
Z_1 → 1Z_1A_3A_3A_2Z_1 | 1A_3A_3A_2Z_1 | 0A_2Z_1 | 1Z_1A_3A_3A_2 | 1A_3A_3A_2 | 0A_2

This grammar is in Greibach form.
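The sequence of substitutions above is mechanical, and a rough implementation of the two lemmas may make it easier to follow. The sketch below is our own illustration, not the text's algorithm verbatim, and the function names are invented: substitute_leading applies Lemma 4.6.1 to every A-rule whose righthand side begins with B, as in the proof of Theorem 4.6.2, and remove_left_recursion implements Lemma 4.6.2. A Λ-free grammar is assumed, with rules as (lhs, rhs-tuple) pairs.

    def substitute_leading(P, A, B):
        # Lemma 4.6.1: in every rule A -> B γ, replace the leading B
        # by each righthand side of B.
        b_rules = [rhs for lhs, rhs in P if lhs == B]
        newP = []
        for lhs, rhs in P:
            if lhs == A and rhs and rhs[0] == B:
                newP.extend((A, beta + rhs[1:]) for beta in b_rules)
            else:
                newP.append((lhs, rhs))
        return newP

    def remove_left_recursion(P, A, Z):
        # Lemma 4.6.2: replace A -> A α_i | β_j by
        # A -> β_j | β_j Z and Z -> α_i Z | α_i.
        alphas = [rhs[1:] for lhs, rhs in P if lhs == A and rhs[:1] == (A,)]
        betas  = [rhs     for lhs, rhs in P if lhs == A and rhs[:1] != (A,)]
        if not alphas:
            return list(P)              # A is not left recursive: nothing to do
        newP = [(lhs, rhs) for lhs, rhs in P if lhs != A]
        for beta in betas:
            newP += [(A, beta), (A, beta + (Z,))]
        for alpha in alphas:
            newP += [(Z, alpha + (Z,)), (Z, alpha)]
        return newP

For the running example, applying remove_left_recursion(P, 'A_2', 'Z_1') after the first substitution step yields exactly the A_2 and Z_1 rules of the third grammar displayed above.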
Now consider the following grammar, which will help us to estimate the size of a grammar in Greibach normal form. Let G be:

B_i → B_{i+1}B_{k+1} | B_{i+1}B_{k+2} for all i, 1 ≤ i < k,
B_k → a | b,
B_{k+1} → a,
B_{k+2} → b,

where Σ = {a, b}, N = {B_1, …, B_{k+2}}, and S = B_1. Then |V| = k + 4 and |G| = 6k + 2. If the construction is carried out, we get G' with productions

B_{k+1} → a,
B_{k+2} → b,
B_{k+1−j} → α for each α in {a, b}{B_{k+1}, B_{k+2}}^{j−1} and for each j, 1 ≤ j ≤ k.

Clearly, for each j, 1 ≤ j ≤ k, we will have 2^j righthand sides (with B_{k+1−j} on the left), and each such rule has length j + 1. Thus:

|G'| = 4 + Σ_{j=1}^{k} (j + 1)2^j = 2k·2^k + 4.

The latter equality may be easily verified by induction. Note that G' is exponentially larger than the original grammar. Moreover, G was not even left recursive originally. If we reduce G', we get G'' shown below:

B_{k+1} → a,
B_{k+2} → b,
B_1 → α for each α ∈ {a, b}{B_{k+1}, B_{k+2}}^{k−1}.

Thus:

|G''| = (k + 1)2^k + 4,

which is still exponential in the size of G.

PROBLEMS

1 Give an algorithm for detecting left-recursive variables.

2 Although the algorithm given in the proof of Theorem 4.6.2 starts from Chomsky normal form, show that chain rules may occur in the middle of the computation. Would the algorithm work if we started from canonical two form?

3 Does the construction given in this section preserve unambiguity?

4 Place the following grammar in Greibach normal form:

E → E + T | T
T → T * F | F
F → (E) | a
5 A context-free grammar G = (V, Σ, P, S) is in reverse Greibach form if all rules are of the forms

A → αa, α ∈ (N − {S})*, a ∈ Σ,
A → a, a ∈ Σ,
S → Λ.

Show that any context-free language has a grammar in reverse Greibach form.

6 A context-free grammar G = (V, Σ, P, S) is in double Greibach form if each rule is in one of the following forms:

S → Λ,
A → a, a ∈ Σ,
A → aαb, a, b ∈ Σ, α ∈ (N − {S})*.

Show that every context-free language has a grammar in double Greibach form.

7 The following "proof" of Problem 6 has been proposed. If the sketch presented here is correct, write a complete proof. If there is a flaw in the approach, explain the problem in detail. Suppose we are given a context-free grammar G; place G in Greibach normal form and call your new grammar G_1. Define G_2 as the reversal of G_1; that is, if G_1 = (V_1, Σ, P_1, S_1), define G_2 = (V_1, Σ, P_1^T, S_1), where P_1^T = {A → α^T | A → α is in P_1}. Place G_2 in Greibach normal form and call the resulting grammar G_3. Now form G_4 = G_3^T. It is claimed that L(G_4) = L(G) and that G_4 is in double Greibach form.

4.7 2-STANDARD FORM

In the previous section, we showed that any context-free grammar could be placed in Greibach form or, equivalently, in m-standard form for some m. In the present section, we shall attempt to minimize m and yet still generate all the context-free languages. If m = 1, we would have rules of the forms

A → aB, B ∈ N − {S},
A → a, a ∈ Σ,
S → Λ.

But these are right linear rules, and so we can generate only regular sets. Thus, we must have m ≥ 2 if we are to succeed in generating all the context-free languages.

Theorem 4.7.1 For each context-free grammar G = (V, Σ, P, S), there exists a context-free grammar G' such that G' is in 2-standard form and L(G) = L(G').

Proof We may assume, without loss of generality, that G is in m-standard form for some m; that is, G is in Greibach form where:
A → aα, α ∈ (N − {S})*, lg(α) ≤ m,
A → a,
S → Λ.

Thus it suffices to show that if G is in m-standard form for some m ≥ 3, then we can find a grammar G' in (m − 1)-standard form such that L(G) = L(G').

Let G = (V, Σ, P, S) be in m-standard form, m ≥ 3. Define G' = (V', Σ, P', S), where V' = V ∪ {(A, B) | A, B ∈ N} and P' is given by

P' = {A → α | A → α is in P and lg(α) ≤ m}
   ∪ {A → aB_1 ⋯ B_{m−2}(B_{m−1}, B_m) | A → aB_1 ⋯ B_m is in P}
   ∪ {(A, B) → αB | A → α is in P and lg(α) ≤ m − 1}
   ∪ {(A, B) → aB_1 ⋯ B_{m−2}(B_{m−1}, B) | A → aB_1 ⋯ B_{m−1} is in P}
   ∪ {(A, B) → aB_1 ⋯ B_{m−3}(B_{m−2}, B_{m−1})(B_m, B) | A → aB_1 ⋯ B_m is in P}.

Definition If A → α is a rule, then we shall call lg(α) the width of the rule.

The intuition behind the construction is not difficult. If we have a rule of width m + 1, we replace it by a rule of width m whose last variable is a pair; see Fig. 4.4. Now, however, the rules with paired variables must be defined. If B_{m−1} → α is in P with lg(α) ≤ m − 1, then clearly we will have

(B_{m−1}, B_m) → αB_m.

If the righthand side α is too long, more paired variables are introduced. It is certainly clear that G' is in (m − 1)-standard form, but it is less clear that L(G') = L(G).

FIG. 4.4 (the tree for A → aB_1 ⋯ B_m and its counterpart for A → aB_1 ⋯ B_{m−2}(B_{m−1}, B_m))
Claim 1
a) If γ, α ∈ V* and γ ⇒* α in G using only rules of width less than m, then

γ ⇒* α in G'.   (4.7.1)

b) If A ⇒* α with α ∈ V*, using only rules of width less than m, then (A, B) ⇒* αB in G' for any B ∈ N.

Proof To prove (a), it suffices to note that if A is some variable occurring in γ, then, since each rule used in the derivation A ⇒* α is of width less than m, there is a corresponding rule in P', and hence A ⇒* α in G'. The generalization to (4.7.1) then follows directly from Lemma 3.3.1.

To prove (b), let the derivation A ⇒* α be of the form

A ⇒ γ ⇒* α, lg(γ) < m.

Thus the rule A → γ is in P and lg(γ) ≤ m − 1. Therefore the rule (A, B) → γB is in P'. Furthermore, since γ ⇒* α, we have, by (4.7.1),

γB ⇒* αB in G'.

Combining these results gives

(A, B) ⇒ γB ⇒* αB in G'. □

Claim 2 If A ⇒* x in G, with x ∈ Σ*, then

i) A ⇒* x in G';
ii) (A, B) ⇒* xB in G' for any B ∈ N.

Proof Let p be the number of rules of width greater than m − 1 in the derivation A ⇒* x. We induct on p.

Basis. p = 0. Claim 2 reduces to a special case of Claim 1, which we have proven.
Induction step. Assume that if ℓ ≤ p and A ⇒* x in G using ℓ rules of width greater than m − 1, then

A ⇒* x in G' and (A, B) ⇒* xB in G'.

Now let A ⇒* x in G using p + 1 rules of width greater than m − 1. Then the derivation is of the form

A ⇒* γA'η   (4.7.2)
  ⇒ γaB_1 ⋯ B_kη, k ≥ m − 1,
  ⇒* x_0ax_1 ⋯ x_kx_{k+1} = x,

where the derivation (4.7.2) uses no rules of width greater than m − 1, and hence, by Claim 1,

A ⇒* γA'η in G'.

Moreover,

γ ⇒* x_0, B_i ⇒* x_i for each i, 1 ≤ i ≤ k, and η ⇒* x_{k+1} in G,

each with at most p rules of width greater than m − 1. Therefore,

γ ⇒* x_0, B_i ⇒* x_i, 1 ≤ i ≤ k, and η ⇒* x_{k+1} in G'

by the induction hypothesis. There are now two cases.

CASE 1. k = m − 1. In this case, the rule A' → aB_1 ⋯ B_k is in P', and thus, by combining the above results, we have

A ⇒* γA'η ⇒ γaB_1 ⋯ B_kη ⇒* x_0ax_1 ⋯ x_kx_{k+1} = x in G'.

CASE 2. k = m. Then the rule A' → aB_1 ⋯ B_{m−2}(B_{m−1}, B_m) is in P'. Now, since

B_{m−1} ⇒* x_{m−1} and B_m ⇒* x_m in G
using at most p rules of width greater than m − 1, parts (ii) and (i) of the induction hypothesis give us

(B_{m−1}, B_m) ⇒* x_{m−1}B_m ⇒* x_{m−1}x_m in G'.

Combining these results gives

A ⇒* γA'η ⇒ γaB_1 ⋯ B_{m−2}(B_{m−1}, B_m)η ⇒* x_0ax_1 ⋯ x_mx_{m+1} = x in G',

which completes the proof of (i).

To prove (ii), let the derivation A ⇒* x be of the form

A ⇒ γ ⇒* x.

There are three cases to consider.

CASE 3. lg(γ) ≤ m − 1. Then the rule (A, B) → γB is in P'. Using Lemma 3.3.1 and part (i) of Claim 2, we have

γ ⇒* x in G'.

Thus

(A, B) ⇒ γB ⇒* xB in G'.

CASE 4. lg(γ) = m. Then the derivation is of the form

A ⇒ aB_1 ⋯ B_{m−1} ⇒* ax_1 ⋯ x_{m−1} = x.

Thus we have:

1 B_i ⇒* x_i, 1 ≤ i ≤ m − 1, using at most p rules of width greater than m − 1, since the first rule is of width m.
2 The rule (A, B) → aB_1 ⋯ B_{m−2}(B_{m−1}, B) is in P'.
3 By the induction hypothesis, B_i ⇒* x_i in G', 1 ≤ i ≤ m − 2, and (B_{m−1}, B) ⇒* x_{m−1}B in G'.

Combining gives

(A, B) ⇒ aB_1 ⋯ B_{m−2}(B_{m−1}, B) ⇒* ax_1 ⋯ x_{m−1}B = xB in G'.
CASE 5. lg(γ) = m + 1. Then, similarly to Case 4, we have that

(A, B) → aB_1 ⋯ B_{m−3}(B_{m−2}, B_{m−1})(B_m, B)

is in P', and

B_i ⇒* x_i in G', 1 ≤ i ≤ m − 3,
(B_{m−2}, B_{m−1}) ⇒* x_{m−2}B_{m−1} ⇒* x_{m−2}x_{m−1} in G',
(B_m, B) ⇒* x_mB in G'.

Thus

(A, B) ⇒ aB_1 ⋯ B_{m−3}(B_{m−2}, B_{m−1})(B_m, B) ⇒* ax_1 ⋯ x_{m−2}x_{m−1}x_mB = xB in G',

which completes the induction.

Now, taking A = S in Claim 2 gives:

Claim 3 For each x in Σ*, if S ⇒* x in G, then S ⇒* x in G'. Thus

L(G) ⊆ L(G').

Claim 4 L(G') ⊆ L(G).

Proof Define a map φ on V' by:

φ(a) = a for each a ∈ V,
φ((A, B)) = AB for each (A, B) ∈ N × N.

Now extend φ to a homomorphism from (V')* to V* in the natural way. It is then straightforward to prove by induction that, for each α, β in (V')*,

if α ⇒ β in G', then φ(α) ⇒* φ(β) in G.

Thus if S ⇒* x in G', x ∈ Σ*, then

φ(S) = S ⇒* φ(x) = x in G,

or, equivalently, L(G') ⊆ L(G), which completes the proof of the theorem. □

PROBLEMS

1 Estimate the size of G' as a function of the size of G. Can you improve the construction given in Theorem 4.7.1?

2 Put the following grammar into 2-standard form:

S → SaA | SA | b
A → AaA | Sa | aS | c
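To see the width-reduction step of Theorem 4.7.1 concretely, the following sketch is our own illustration, with the invented helper name reduce_width. It performs one pass from m-standard to (m − 1)-standard form; iterating down to m = 2 gives the 2-standard grammar. A rule is a (lhs, rhs-tuple) pair whose rhs is a terminal followed by variables, and a pair variable (A, B) is represented as a Python tuple.

    def reduce_width(P, N, m):
        # One step of Theorem 4.7.1: from m-standard to (m-1)-standard (m >= 3).
        # A real implementation would generate only reachable pairs, and when
        # iterating, N must be extended with the pair variables produced.
        newP = set()
        for A, rhs in P:
            if len(rhs) <= m:                  # width <= m: keep as is
                newP.add((A, rhs))
            else:                              # width m+1: pair the last two
                newP.add((A, rhs[:m - 1] + ((rhs[m - 1], rhs[m]),)))
        for A in N:
            for B in N:
                for lhs, rhs in P:
                    if lhs != A:
                        continue
                    if len(rhs) <= m - 1:      # (A,B) -> alpha B
                        newP.add(((A, B), rhs + (B,)))
                    elif len(rhs) == m:        # (A,B) -> a B_1..B_{m-2}(B_{m-1},B)
                        newP.add(((A, B), rhs[:m - 1] + ((rhs[m - 1], B),)))
                    else:                      # width m+1: two pairs
                        newP.add(((A, B),
                                  rhs[:m - 2] + ((rhs[m - 2], rhs[m - 1]),
                                                 (rhs[m], B))))
        return newP

The five branches correspond one for one to the five sets in the definition of P' in the proof.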
4.8 OPERATOR GRAMMARS

One of the most important normal forms used in precedence analysis is the operator form.

Definition A context-free grammar G = (V, Σ, P, S) is said to be in operator form if

P ⊆ N × (V* − V*NNV*).

This means that each rule in P is of one of the following forms:

i) A → B for A ∈ N, B ∈ V ∪ {Λ}, or
ii) A → γ, where whenever γ = αBCβ with α, β ∈ V* and B, C ∈ V, either B ∈ Σ or C ∈ Σ.

The main point is that there are never two adjacent variables in the righthand side of a rule in an operator grammar. We can now show that every context-free language has a grammar in this form. It is not too difficult to prove this result with the aid of Greibach normal form [2-standard form is particularly convenient]. We shall give a proof that does not use this form because, in other applications, the construction will be needed. Note that the case in which Λ ∈ L is handled somewhat separately, because it does not generalize in the applications to precedence parsing.

Theorem 4.8.1 Let G = (V, Σ, P, S) be a Λ-free, context-free grammar. There is a context-free grammar G' = (V', Σ, P', S) in operator form such that L(G') = L(G).

Proof There is no loss of generality in assuming that G = (V, Σ, P, S) is Λ-free and in canonical two form. Define G' = (V', Σ, P', S), where V' = Σ ∪ {S} ∪ (N × Σ). Define P' = P_1 ∪ P_2 ∪ P_3 ∪ P_4, where

P_1 = {S → (S, a)a | a ∈ Σ},
P_2 = {(A, a) → Λ | A ∈ N; a ∈ Σ; A → a in P},
P_3 = {(A, a) → (B, a) | A, B ∈ N; a ∈ Σ; A → B in P},
P_4 = {(A, a) → (B, b)b(C, a) | A, B, C ∈ N; a, b ∈ Σ; A → BC in P}.

The idea of the construction is to imagine that we are at some internal node of a derivation tree of G which is labelled by A. The leading terminal of the subtree properly to the right of A is encoded into the variable (A, a) (cf. Fig. 4.5). Note that G' is in operator form.

FIG. 4.5 (corresponding derivation trees in G and in G')
Let us define a mapping φ from P_2 ∪ P_3 ∪ P_4 into P as follows:

φ((A, a) → Λ) = A → a if (A, a) → Λ is in P_2,
φ((A, a) → (B, a)) = A → B if (A, a) → (B, a) is in P_3,
φ((A, a) → (B, b)b(C, a)) = A → BC if (A, a) → (B, b)b(C, a) is in P_4.

Extend φ to a homomorphism from (P_2 ∪ P_3 ∪ P_4)* into P*. We can now state the exact correspondence between derivations in G and in G'.

Proposition For each a ∈ Σ, x ∈ Σ*, A ∈ N, and for each sequence π_1, …, π_n of productions in P,

A ⇒* xa in G by canonical derivation (π_1, …, π_n)

if and only if there exist π'_1, …, π'_n in P' such that φ(π'_i) = π_i for all i, 1 ≤ i ≤ n, and

(A, a) ⇒* x in G' by canonical derivation (π'_1, …, π'_n).

The proof of the proposition involves an induction on n and is left to the reader. From the proposition it follows that, for any x ∈ Σ* and a ∈ Σ, S ⇒* xa in G by canonical derivation (π_1, …, π_n) if and only if (S, a) ⇒* x in G' by canonical derivation (π'_1, …, π'_n), with φ(π'_i) = π_i. Using a rule from P_1, it follows easily that L(G) = L(G'). □

Now we consider the case in which G is not Λ-free.

Theorem 4.8.2 If G = (V, Σ, P, S) is any context-free grammar, then there is another grammar G' = (V', Σ, P', S') so that G' is in operator form and L(G') = L(G).

Proof If Λ ∉ L(G), there is nothing new to prove. If Λ ∈ L(G), then L − {Λ} is context-free. Apply Theorem 4.8.1 to a Λ-free grammar for L − {Λ}, and let G'' = (V'', Σ, P'', S) be the resulting operator grammar. Define G' = (V', Σ, P', S'), where V' = V'' ∪ {S'} and S' is a new symbol. Then P' = P'' ∪ {S' → Λ, S' → S}. Clearly, G' is in operator form and L(G') = L. □

Example Let G be given as follows:

S → SA | SB | a
A → a
B → b
If we apply the construction of Theorem 4.8.1 to G and then reduce it, we get the following grammar:

S → (S, a)a | (S, b)b
(S, a) → (S, a)a(A, a) | (S, b)b(A, a) | Λ
(S, b) → (S, a)a(B, b) | (S, b)b(B, b)
(A, a) → Λ
(B, b) → Λ

For example, we show the derivation of aab in both grammars G and G' in Fig. 4.6.

FIG. 4.6 (derivation trees for aab: (a) in G, (b) in G')
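The construction of Theorem 4.8.1 is easy to mechanize. The sketch below is our own illustration, with the invented name operator_form, for grammars already in canonical two form without Λ-rules (rules A → BC, A → B, or A → a); a pair variable (A, a) is represented as a tuple.

    def operator_form(P, N, Sigma, S):
        # Theorem 4.8.1: encode, in each variable, the terminal that follows it.
        newP = set()
        for a in Sigma:
            newP.add((S, ((S, a), a)))               # P1: S -> (S,a) a
        for A, rhs in P:
            if len(rhs) == 1 and rhs[0] in Sigma:    # P2: A -> a
                newP.add(((A, rhs[0]), ()))          #     gives (A,a) -> Λ
            elif len(rhs) == 1:                      # P3: A -> B
                B = rhs[0]
                for a in Sigma:
                    newP.add(((A, a), ((B, a),)))
            else:                                    # P4: A -> BC
                B, C = rhs
                for a in Sigma:
                    for b in Sigma:
                        newP.add(((A, a), ((B, b), b, (C, a))))
        return newP

Applied to the example and reduced, this yields exactly the grammar displayed above.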
PROBLEMS

1 Prove the proposition stated in the proof of Theorem 4.8.1.

2 The construction used in proving Theorem 4.8.1 uses Λ-rules even when G is Λ-free. Give a new construction that does not have that property.

3 Can you give a simpler proof of Theorem 4.8.1 with the aid of the 2-standard form?

4 One can strengthen the definition of operator form by demanding that the rules be of the following forms:

S → Λ,
A → B, B ∈ N − {S},
A → a, a ∈ Σ,
A → αBCβ, where α, β ∈ (V − {S})*, B, C ∈ V − {S}, and either B ∈ Σ or C ∈ Σ.

Modify the construction to achieve this form as well.

4.9 FORMAL POWER SERIES AND GREIBACH NORMAL FORM

It is possible to view a context-free grammar in terms of some traditional mathematical concepts, such as equations and formal power series. For example, consider a grammar with the productions shown below:

S → SbS | a

Let us rewrite this grammar as an equation:

S = a + SbS.   (4.9.1)

In some sense, we want to solve (4.9.1) for S. To do this, we approximate by taking S to be empty. Then we get a first approximation: σ_0 = 0. A second approximation may be made by substituting σ_0 into (4.9.1) to obtain σ_1:

σ_1 = a.

Continuing, we get

σ_2 = a + aba,
σ_3 = a + (a + aba)b(a + aba) = a + aba + 2ababa + abababa.

Now, these operations have been carried out under tacitly assumed properties of + and ·, such as distributivity. These were fortuitous choices, since these conventions led to a situation where the coefficient of the term x in σ_i is the number of derivation trees of x of height at most i. In the limit, we could write

σ = Σ_x σ(x)x,

where σ(x) is the degree of ambiguity of x, that is, the number of canonical derivations of x.

The example indicates that these ideas give a useful interpretation to the action of a grammar. It is therefore worthwhile to make these definitions precise and to understand any underlying assumptions.

Definition Let Σ be an alphabet. A formal power series is a map from Σ* into the integers, written σ: x → σ(x), or as

σ = Σ_{x ∈ Σ*} σ(x)x.

This is a power series in variables which are the elements of Σ; they are associative but not commutative. The support of a formal power series σ is written supp(σ) and defined by

supp(σ) = {x ∈ Σ* | σ(x) ≠ 0}.

Thus, the support gives a language from a formal power series.
Because the underlying set is the integers, it is easy to define useful operations such as sum and product on formal power series. (This is done mainly in the exercises.) Many other classes of power series can be defined. For instance, a power series arising from a grammar will have coefficients that are nonnegative integers. It is worthwhile to study power series which have negative coefficients, or even to generalize to power series whose coefficients are in an arbitrary ring. Further applications may be found in the exercises and the literature.

It is useful to explain the process of associating a formal power series with a grammar. Our first example, while very intuitive, hid several issues because of its simplicity. Let G = (V, Σ, P, S) be a context-free grammar and let N = {A_1, …, A_k} with S = A_1. For each A_i ∈ N, let α_ℓ^i, 1 ≤ ℓ ≤ m_i, be all the righthand sides of A_i-rules. G may be specified by the equations:

A_i = α_1^i + ⋯ + α_{m_i}^i for 1 ≤ i ≤ k.

If we assume G is Λ-free and chain-free, then for each x ∈ Σ* there will be only a finite number of derivation trees for x. Let σ(x) be this number, and define

σ = Σ_{x ∈ Σ*} σ(x)x.

Here σ may be computed by simultaneous approximation of all variables. We shall illustrate this by an example.

Example

A_1 → A_1a | A_2b
A_2 → a | b

In the form of equations, we have:

A_1 = A_1a + A_2b,
A_2 = a + b.

We approximate with ordered pairs of series:

σ_0 = (0, 0) = (σ_01, σ_02),
σ_1 = (0, a + b) = (σ_11, σ_12),
σ_2 = (ab + bb, a + b) = (σ_21, σ_22),
σ_3 = (aba + bba + ab + bb, a + b) = (σ_31, σ_32),
σ_4 = (abaa + bbaa + aba + bba + ab + bb, a + b) = (σ_41, σ_42),
⋮

To obtain σ, we compute

σ = lim_{k→∞} σ_k = (lim σ_k1, lim σ_k2).
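The successive-approximation scheme just illustrated is easy to carry out by machine if the series are truncated at a fixed word length. The Python sketch below is our own illustration, not from the text; the names cauchy and approximate are invented. A truncated power series is a dict from words to integer coefficients.

    def cauchy(p, q, max_len):
        # Cauchy product of two truncated series (dicts: word -> coefficient).
        r = {}
        for u, c in p.items():
            for v, d in q.items():
                if len(u) + len(v) <= max_len:
                    r[u + v] = r.get(u + v, 0) + c * d
        return r

    def approximate(eqs, N, max_len, rounds):
        # Successive approximation of the grammar equations A_i = sum of rhs's,
        # with every series truncated at words of length <= max_len.
        sigma = {A: {} for A in N}
        for _ in range(rounds):
            new = {}
            for A, rhss in eqs.items():
                series = {}
                for rhs in rhss:
                    term = {"": 1}
                    for s in rhs:
                        factor = sigma[s] if s in N else {s: 1}
                        term = cauchy(term, factor, max_len)
                    for w, c in term.items():
                        series[w] = series.get(w, 0) + c
                new[A] = series
            sigma = new
        return sigma

    # For the example above:
    # approximate({"A1": [("A1", "a"), ("A2", "b")], "A2": [("a",), ("b",)]},
    #             {"A1", "A2"}, max_len=4, rounds=4)["A1"]
    # yields {"ab": 1, "bb": 1, "aba": 1, "bba": 1, "abaa": 1, "bbaa": 1},
    # exactly the words of sigma_41 displayed above.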
It is easy to prove that σ exists, and a procedure like this may be used in a more general context, where the underlying equations were not generated by a grammar.

Formal power series, and the associated equations that they satisfy, are useful in deriving an alternative method for placing a grammar into Greibach normal form. Let G = (V, Σ, P, S) be a context-free grammar. Assume that N = {A_1, …, A_k}, S = A_1, and that G is Λ-free and chain-free. We take variable A_i and partition the righthand sides associated with A_i into those in NV* and those in ΣV*. That is, we write

A_i → a_1^i | ⋯ | a_{n_i}^i,

where 1 ≤ i ≤ k, n_i ≥ 0, and a_ℓ^i ∈ ΣV* for 1 ≤ ℓ ≤ n_i; and, for each 1 ≤ i, j ≤ k, we write

A_j → A_iα_1^{ij} | ⋯ | A_iα_{m_ij}^{ij},

where m_ij ≥ 0 and α_ℓ^{ij} ∈ V⁺ for 1 ≤ ℓ ≤ m_ij. To simplify the notation, we write

r_ij = α_1^{ij} + ⋯ + α_{m_ij}^{ij},
b_i = a_1^i + ⋯ + a_{n_i}^i,

where 1 ≤ i, j ≤ k. If (say) m_ij = 0, then r_ij = 0. We shall also speak of the individual words of r_ij and of b_i. Now we may represent G as

A = AR + b,   (4.9.2)

where A = [A_1, …, A_k], b = [b_1, …, b_k], and R = (r_ij). An example will show that the process is less arcane than the notation might suggest.

Example Consider a grammar G given by:

A_1 → A_1aA_2 | A_2A_2 | b
A_2 → A_2d | A_2A_1a | aA_1 | c

In matrix form, this becomes

[A_1 A_2] = [A_1 A_2] | aA_2       0      | + [b   aA_1 + c].
                      | A_2    d + A_1a   |

Now we are ready to state an algorithm for transforming a grammar into Greibach normal form.

Algorithm 4.9.1 Let G = (V, Σ, P, S) be a Λ-free and chain-free context-free grammar. Assume N = {A_1, …, A_k} and S = A_1.

1 Express G by equations of the form (4.9.2); that is, A = AR + b.

2 Let H = (H_ij) be a k × k matrix, each of whose entries is a new variable.
Construct a new system of equations:

A = bH + b,   (4.9.3a)
H = RH + R.   (4.9.3b)

3 Replace every word of R that is in NV* by substituting for its leading variable. That is, if A_pα is a word of R for some 1 ≤ p ≤ k, then, from (4.9.3a), the A_p-productions are

A_p → a_ℓ^j H_jp for 1 ≤ ℓ ≤ n_j, 1 ≤ j ≤ k, and
A_p → a_ℓ^p for 1 ≤ ℓ ≤ n_p,

and the word A_pα is replaced by all the corresponding words a_ℓ^j H_jp α and a_ℓ^p α. This results in a new system of equations:

A = bH + b,   (4.9.4a)
H = KH + K,   (4.9.4b)

where K is derived from R by making the above substitution.

4 The Greibach-form grammar of G is now given by Eqs. (4.9.4a) and (4.9.4b).

If we apply this algorithm to our running example, step 2 gives us

[A_1 A_2] = [b   aA_1 + c] | H_11  H_12 | + [b   aA_1 + c]
                           | H_21  H_22 |

and

| H_11  H_12 | = | aA_2      0     | | H_11  H_12 | + | aA_2      0     |.
| H_21  H_22 |   | A_2   d + A_1a  | | H_21  H_22 |   | A_2   d + A_1a  |

Note that the (2,1) and (2,2) entries of R require further modification. To derive K from R, we carry out the substitutions, noting in advance that H_12 must be a useless variable. We get H = KH + K, where

K = | aA_2                                          0                                 |.
    | aA_1H_22 + cH_22 + aA_1 + c    d + bH_11a + aA_1H_21a + cH_21a + ba  |

Now the Greibach form is obtained from Eqs. (4.9.4) and gives us:

A_1 → bH_11 | aA_1H_21 | cH_21 | b
A_2 → aA_1H_22 | cH_22 | aA_1 | c
H_11 → aA_2H_11 | aA_2
H_21 → aA_1H_22H_11 | cH_22H_11 | aA_1H_11 | cH_11 | dH_21 | bH_11aH_21 | aA_1H_21aH_21 | cH_21aH_21 | baH_21 | aA_1H_22 | cH_22 | aA_1 | c
H_22 → dH_22 | bH_11aH_22 | aA_1H_21aH_22 | cH_21aH_22 | baH_22 | d | bH_11a | aA_1H_21a | cH_21a | ba

(The useless variable H_12 and the rules involving it have been dropped.)
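Algorithm 4.9.1 manipulates only finite sets of words, so it can be sketched compactly. The following Python is our own illustration, with the invented name gnf_by_matrices, and is not the book's notation: words are tuples over V together with H-symbols, R and b are read off the rules, and K is obtained by the leading-variable substitution of step 3. No reduction of useless H-variables is attempted, so it also emits the words involving H_12 that the text discards.

    def gnf_by_matrices(P, N):
        # Sketch of Algorithm 4.9.1 (G assumed Λ-free and chain-free).
        k = len(N)
        idx = {A: i for i, A in enumerate(N)}
        R = [[set() for _ in range(k)] for _ in range(k)]  # words of r_ij
        b = [set() for _ in range(k)]                      # words of b_j
        for A, rhs in P:
            j = idx[A]
            if rhs[0] in idx:
                R[idx[rhs[0]]][j].add(rhs[1:])             # A_j -> A_i α
            else:
                b[j].add(rhs)                              # A_j -> a α
        H = [[("H", i + 1, j + 1) for j in range(k)] for i in range(k)]
        def expand(word):          # step 3: substitute for the leading variable
            if word and word[0] in idx:
                p = idx[word[0]]; rest = word[1:]
                new = {w + (H[j][p],) + rest for j in range(k) for w in b[j]}
                return new | {w + rest for w in b[p]}
            return {word}
        K = [[{w2 for w in R[i][j] for w2 in expand(w)} for j in range(k)]
             for i in range(k)]
        newP = []
        for j in range(k):                                 # (4.9.4a)
            newP += [(N[j], w + (H[i][j],)) for i in range(k) for w in b[i]]
            newP += [(N[j], w) for w in b[j]]
        for i in range(k):                                 # (4.9.4b)
            for j in range(k):
                newP += [(H[i][j], w + (H[t][j],)) for t in range(k)
                         for w in K[i][t]]
                newP += [(H[i][j], w) for w in K[i][j]]
        return newP

Run on the example above, the A_1-rules come out as bH_11, aA_1H_21, cH_21, and b, matching the display.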
The method for achieving Greibach form developed in Section 4.6 led to grammars which were exponential in the size of G. We shall now show that this method does much better.

Theorem 4.9.1 Let G' be the Greibach normal-form grammar produced from G by Algorithm 4.9.1. Then

|G'| ≤ c|G|³

for some constant c.

Proof Assume that the number of R words is u and that their total length is ℓ_u. Similarly, suppose the total length of the b words is ℓ_v and that v is the number of such words. The rules which arise from Eq. (4.9.2) are

A_i → A_jα, where α is a word of r_ji, and
A_i → β, where β is a word of b_i.

There are u rules of the first type, so the symbols A_i and A_j contribute 2u to |G|, while the words of the r_ji contribute ℓ_u. A similar analysis for the second type of rules yields

|G| = ℓ_u + 2u + ℓ_v + v.   (4.9.5)

Now the productions arising from (4.9.4a) are

A_i → βH_ji, where β is a word of b_j, 1 ≤ i, j ≤ k, and
A_i → β, where β is a word of b_i.

Each b word may appear in at most k rules of the first type and one rule of the second type. The contribution of (4.9.4a) is therefore at most

k(ℓ_v + 2v) + ℓ_v + v.   (4.9.6)

Now suppose K has w words, of total length ℓ_w, and recall that K is derived from R. In particular, each word A_pα of R is replaced by the summation of all words βH_jpα, where β is a word of b_j, 1 ≤ j ≤ k, together with the words βα, where β is a word of b_p. The number of words in each category is at most v, because each b word can appear at most once in each type of rule. The total length of the words of the first type (involving β, H_jp, and α) is at most

ℓ_v + v(1 + lg(α)).   (4.9.7)

A similar analysis for the second type yields the total bound

ℓ_v + v(1 + lg(α)) + ℓ_v + v·lg(α) ≤ 2ℓ_v + 2v·lg(A_pα),   (4.9.8)

using the bound 2v for the number of new words. Since such substitutions can be made for at most u words of R,

w ≤ 2uv   (4.9.9)

and

ℓ_w ≤ Σ (2ℓ_v + 2v·lg(A_pα)) = 2ℓ_vu + 2vℓ_u ≤ 4ℓ_uℓ_v,   (4.9.10)

where the sum extends over all words A_pα of R, and we have used u ≤ ℓ_u and v ≤ ℓ_v.
The contribution to |G'| from (4.9.4b) is at most

k(ℓ_w + 2w) + ℓ_w + w,   (4.9.11)

as for (4.9.4a). Finally, the size of G' is determined by (4.9.4a) and (4.9.4b), so (4.9.6) and (4.9.11) yield

|G'| ≤ k(ℓ_v + 2v) + ℓ_v + v + k(ℓ_w + 2w) + ℓ_w + w.   (4.9.12)

Now, using (4.9.9) and (4.9.10), we have

|G'| ≤ k(ℓ_v + 2v) + ℓ_v + v + k(4ℓ_uℓ_v + 4uv) + 4ℓ_uℓ_v + 2uv.

It follows easily that there is some constant c so that

|G'| ≤ c·k·ℓ_uℓ_v.

Now ℓ_u ≤ |G|, ℓ_v ≤ |G|, and k = |N| ≤ |V|, so that

|G'| ≤ c|G|²·|V| ≤ c|G|³. □

To see that the bound given in the previous theorem is achieved, consider G = (V, Σ, P, S), where Σ = {a_1, …, a_k}, k ≥ 1, N = {A_1, …, A_k}, S = A_1, and

P = {A_i → A_{i+1}A_{i+1} | 1 ≤ i < k} ∪ {A_k → a_j | 1 ≤ j ≤ k}.

Note that |V| = 2k and |G| = 5k − 3. Algorithm 4.9.1 will produce the grammar:

A_i → a_jH_ki, 1 ≤ j ≤ k, 1 ≤ i ≤ k,
A_k → a_j, 1 ≤ j ≤ k,
H_ij → a_ℓH_kiH_{i−1,j}, 2 ≤ i ≤ k, 1 ≤ ℓ, j ≤ k,
H_kj → a_ℓH_{k−1,j}, 1 ≤ j ≤ k, 1 ≤ ℓ ≤ k,
H_{i,i−1} → a_ℓH_ki, 2 ≤ i ≤ k, 1 ≤ ℓ ≤ k,
H_{k,k−1} → a_ℓ, 1 ≤ ℓ ≤ k.

Clearly, |V'| = k² + 2k. After some work,

|G'| = 4k³ + 5k² + k.

Let us reduce this grammar to see whether that changes its size substantially. We get:

A_1 → a_ℓH_k1, 1 ≤ ℓ ≤ k,
H_ij → a_ℓH_kiH_{i−1,j}, j + 1 < i ≤ k, j ≥ 1, 1 ≤ ℓ ≤ k,
H_kj → a_ℓH_{k−1,j}, 1 ≤ j ≤ k − 1, 1 ≤ ℓ ≤ k,
H_{i,i−1} → a_ℓH_ki, 2 ≤ i ≤ k, 1 ≤ ℓ ≤ k,
H_{k,k−1} → a_ℓ, 1 ≤ ℓ ≤ k.

Now we find that |V'| = (k² + k + 2)/2 and |G'| = 2k³ − 4k² + 5k.
Thus there are grammars for which this algorithm produces results on the order of k³.

Our study of the complexity of grammatical transformations has led to a number of good algorithms, but we have not yet proved a lower bound on the complexity of a transformation. Such a bound is now derived.

Theorem 4.9.2 There is a context-free language L_k with Λ ∉ L_k such that, for any context-free grammar G in Greibach form with L(G) = L_k, it must be the case that

|G| ≥ 5k².

Moreover, there is a grammar G_k for L_k such that |G_k| = 7k.

Proof Let L_k = {a_1, …, a_k}^{4^k}. Let G_k have the productions:

A_i → A_{i+1}A_{i+1}A_{i+1}A_{i+1} for 1 ≤ i ≤ k,
A_{k+1} → a_1 | ⋯ | a_k.

Clearly, |G_k| = 5k + 2k = 7k. This proves the last statement of the theorem.

Let G = (V, Σ, P, S) be a reduced grammar in Greibach normal form which generates L_k, and assume that G is a minimal grammar generating L_k with respect to |G|. The k strings a_1^{4^k}, …, a_k^{4^k} are all in L_k, so simultaneously consider a derivation of each string. Because G is in Greibach form, the productions used in any pair of these derivations are disjoint, since the terminal strings produced involve different letters. Let P_i be the productions used in the derivation of a_i^{4^k}. Define G_i = (V, Σ, P_i, S). It is easy to see that

L(G_i) = {a_i^{4^k}}.

Thus each G_i is a Greibach-form grammar generating {a_i^{4^k}}. By Problem 3 of Section 4.2, we have

|G_i| ≥ 5k.

Now, since the k grammars G_i have pairwise disjoint sets of productions, we know that

|G| ≥ Σ_{i=1}^{k} |G_i| ≥ 5k². □

PROBLEMS

Let σ = Σ_x σ(x)x and τ = Σ_x τ(x)x be formal power series, and let n be an integer. Define

nσ = Σ_{x ∈ Σ*} nσ(x)x,
σ + τ = Σ_{x ∈ Σ*} (σ(x) + τ(x))x,
and

στ = Σ_{x ∈ Σ*} ρ(x)x, where ρ(x) = Σ_{x=yz} σ(y)τ(z).

Thus στ is the product (or Cauchy product) of σ and τ.

1 Show that the set of all formal power series is a ring with respect to + and · of power series.

2 Show that, if σ and τ are formal power series,

supp(σ + τ) = supp(σ) ∪ supp(τ),
supp(στ) = supp(σ)supp(τ).

3 If σ is a formal power series and k ≥ 1, verify that σ^k is well defined. Using that notation, and assuming that σ(Λ) = 0, define:

σ⁺ = lim_{m→∞} Σ_{i=1}^{m} σ^i.

Here σ⁺ is called the quasi-inverse of σ. Show that:
a) σ⁺ exists;
b) σ⁺ satisfies σ + σ⁺σ = σ + σσ⁺ = σ⁺;
c) supp(σ⁺) = (supp(σ))⁺.

A collection ℋ of formal power series is said to be rationally closed if, for every σ ∈ ℋ such that σ(Λ) = 0, we have σ⁺ ∈ ℋ. The family of (positive) rational power series is the least rationally closed family which contains Λ (the power series where Λ(x) = 1 if x = Λ and Λ(x) = 0 if x ≠ Λ) and each a ∈ Σ (regarded as the power series a(x) = 1 if x = a and a(x) = 0 if x ≠ a), and which is closed under sum, product, and multiplication by (nonnegative) integers n.

4 Show that a language L is regular if and only if L = supp(ρ), where ρ is in the family of positive rational power series.

Let μ be any map from Σ into n × n matrices whose entries are nonnegative integers. Extend μ to a homomorphism by defining:

μ(Λ) = I, the n × n identity matrix,
μ(ax) = μ(a)μ(x), where a ∈ Σ and x ∈ Σ*.

A power series σ is said to admit a matrix representation if there exist n ≥ 1 and a map μ such that, for all x ∈ Σ⁺, σ(x) = (μ(x))_{1n} (that is, the (1, n) entry of μ(x)).

5 Characterize the family of formal power series that admit matrix representations.

6 Prove that Algorithm 4.9.1 works.

7 Show that, if one starts from a grammar in Chomsky normal form and applies Algorithm 4.9.1, then the resulting grammar is in 2-standard form.
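A small example may make the matrix-representation notion of Problems 4 and 5 concrete. The following sketch is our own illustration, not from the text, with invented names mat_mul, mu, and coeff: it represents the power series σ(x) = #_a(x), the number of occurrences of a in x, with 2 × 2 matrices, so σ(x) is read off as the (1, 2) entry of μ(x).

    def mat_mul(X, Y):
        n = len(X)
        return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
                for i in range(n)]

    # mu assigns a matrix to each letter; it extends to words by multiplication.
    mu = {"a": [[1, 1], [0, 1]],    # bumps the (1,2) entry once per 'a'
          "b": [[1, 0], [0, 1]]}    # identity: 'b' contributes nothing

    def coeff(x):
        # sigma(x) = (mu(x))_{1,2} = number of a's in x
        M = [[1, 0], [0, 1]]        # mu(Λ) = I
        for c in x:
            M = mat_mul(M, mu[c])
        return M[0][1]

    # coeff("abaab") == 3

Since the product of two upper-triangular matrices [[1, p], [0, 1]] and [[1, q], [0, 1]] is [[1, p + q], [0, 1]], the (1, 2) entry accumulates exactly one unit per a, which is the claimed series.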
4.10 HISTORICAL SURVEY

A general reference on algorithmic complexity is Aho, Hopcroft, and Ullman [1974]. The approach adopted here is related to work by Hunt, Szymanski, and Ullman [1974], who have been studying the complexity of many problems related to grammars. Theorem 4.2.1 and Problem 2 of Section 4.3 are due to A. Yehudai. Theorems 4.3.1 and 4.3.2 are from Bar-Hillel, Perles, and Shamir [1961]. Chomsky normal form appears in Chomsky [1959a]. Canonical two form first appeared in Gray and Harrison [1969, 1972]. Invertibility was first studied in McNaughton [1967] for parenthesis grammars, and in Fischer [1969] for operator-precedence grammars. The fact that it holds for all context-free grammars first appears in Gray and Harrison [1969]. Greibach normal form and Theorem 4.6.2 are from Greibach [1965]. Our proof of Theorem 4.7.1 follows Hopcroft and Ullman [1969]. Other proofs and generalizations may be found in Wood [1969a, 1970]. 2-standard form is mentioned in Greibach [1965]. Operator form first occurs in Floyd [1963]. The material in Section 4.9 on power series is from the work of Schützenberger. The matrix technique for placing a grammar in GNF is due to Rosenkrantz [1967]. Problem 3 of Section 4.2 and Problem 7 of Section 4.3 are due to Gruska. The lower bound on achieving GNF is due to Pirická-Kemenová [1975] and to Yehudai.
five

Pushdown Automata

5.1 INTRODUCTION

The major result of this chapter is the characterization of context-free languages in terms of pushdown automata. This is an important and satisfying theorem, in that our intuition about which sets are context-free will be greatly strengthened. Whenever one is faced with some new context-free set, it is generally much easier to describe a pushdown automaton to accept it than to try to produce a grammar.

Another important family is introduced, the deterministic context-free languages. These languages are all unambiguous and, moreover, are recognizable in linear time. It has turned out that this family is a very practical one for compiler writers and those interested in fast parsing techniques. The relationship between fast parsing and this family is taken up again in Chapter 13.

An interesting family of pushdown automata is defined by bounding the number of times the direction of a pushdown can change. Thus "finite-turn pushdown automata" are introduced in Section 5.7 and some basic results obtained. Others will occur later, when we have additional tools at our disposal.

5.2 BASIC DEFINITIONS

There is a class of devices, called pushdown automata, whose study leads to an important characterization of the context-free languages. Intuitively, a pushdown automaton consists of three parts: a read-only input tape, a read-write pushdown store, and a finite-state control. (See Fig. 5.1.)
FIG. 5.1 (a pushdown automaton: input tape, finite-state control, and pushdown store)

Depending on the state the automaton is in, the input symbol it is reading, and the symbol on the top of the pushdown store, the automaton may make a transition to any one of a finite set of states, writing a finite number of symbols on its pushdown store. (Writing the null string corresponds to erasing the topmost symbol.) In addition, the automaton may make transitions and write on its store without reading an input symbol. Such transitions will be called Λ-moves. Later we will prove that Λ-moves do not increase the power of the automaton, in the sense that, for each automaton with Λ-moves, there exists an automaton without Λ-moves which accepts the same set.

We will assume that there must be some symbol on the pushdown in order for the automaton to make a move (i.e., the automaton must halt when it empties its store). We will show that, in exactly the same sense as with Λ-moves, this assumption leads to no loss of generality.

It should be noted that the automaton has a potentially unbounded memory, and it is only the fact that its access to this memory is highly restricted that prevents it from being equivalent to a Turing machine. We now proceed to make these ideas precise.

Definition A pushdown automaton (pda for short) is a 7-tuple

A = (Q, Σ, Γ, δ, q_0, Z_0, F),

where

Q is a finite nonempty set of states,
Σ is a finite nonempty set of input symbols,
Γ is a finite nonempty set of pushdown symbols,
q_0 ∈ Q is the initial state,
Z_0 ∈ Γ is the initial symbol on the pushdown store,
F ⊆ Q is a set of final states,
δ: Q × (Σ ∪ {Λ}) × Γ → finite subsets of Q × Γ*.

We will write (q', α) ∈ δ(q, a, Z) to indicate a particular transition. To describe the automaton at any given time, we need to know three things: the state it is in, what remains on the input tape, and what is on the pushdown store.
Definition Let A = (Q, Σ, Γ, δ, q_0, Z_0, F) be a pda. An instantaneous description (abbreviated ID) of A is an element of Q × Σ* × Γ*. An initial ID is a member of {q_0} × Σ* × {Z_0}.

Thus (q, ax, αZ), with q ∈ Q, x ∈ Σ*, a ∈ Σ ∪ {Λ}, α ∈ Γ*, and Z ∈ Γ, would be an ID for an automaton in state q, currently reading input a, and having Z on the top of the stack. Note that a may be Λ, in which case the device will make a Λ-move; that is, the input is not advanced. Also, the topmost symbol on the stack will be written at the righthand end of the third coordinate of the ID's.

Definition Let A = (Q, Σ, Γ, δ, q_0, Z_0, F) be a pda. Define a relation ⊢, called the move relation, by:

(q, ax, αZ) ⊢ (q', x, αβ) if (q', β) ∈ δ(q, a, Z),

for any q, q' ∈ Q; any a ∈ Σ ∪ {Λ}; any x ∈ Σ*; α, β ∈ Γ*; and Z ∈ Γ. Let ⊢* be the reflexive transitive closure of ⊢.

Note that lg(αZ) ≥ 1, which corresponds to the assumption that the automaton cannot make a move if the pushdown store is empty. Moreover, the case β = Λ allows the top symbol on the pushdown to be erased. Note that if, for some q, q' ∈ Q, x ∈ Σ*, and α, β ∈ Γ*,

(q, x, α) ⊢* (q', Λ, β),

then, for any y ∈ Σ*,

(q, xy, α) ⊢* (q', y, β).

The converse is also true. For the pushdown, the situation is slightly different. For any q, q' ∈ Q; β, γ ∈ Γ*; and x, y ∈ Σ*, if

(q, xy, β) ⊢* (q', y, γ),

then, for any α ∈ Γ*,

(q, xy, αβ) ⊢* (q', y, αγ).

Here the converse is false. The first situation is shown pictorially in Fig. 5.2, and the failure of the converse in Fig. 5.3. For future reference, we summarize these observations in a formal statement.

Proposition 5.2.1 Let A = (Q, Σ, Γ, δ, q_0, Z_0, ∅) be a pda. Let q, q' ∈ Q; β, γ ∈ Γ*; x ∈ Σ*. If

(q, x, β) ⊢* (q', Λ, γ),

then, for any y ∈ Σ*, α ∈ Γ*,

(q, xy, αβ) ⊢* (q', y, αγ).
FIG. 5.2 (stack height as a function of time for a computation driven by x: (a) starting from β alone, (b) the same computation carried out above α)

It is important to observe that the converse of Proposition 5.2.1 is false. We illustrate this in Fig. 5.3. The idea of Fig. 5.3 is that, in a computation (q, xy, αβ) ⊢* (q', y, αγ), it is possible that some of α is destroyed and then rewritten. If we ran the same computation without α on the pushdown, then the computation would "block" at the point marked * in Fig. 5.3.

FIG. 5.3 (a computation from αβ to αγ that dips down into α and would block at * if α were absent)

We must now define what we mean by saying that a pda accepts a set.
Definition Let A = (Q, Σ, Γ, δ, q_0, Z_0, F) be a pda. Define:

T(A) = {w ∈ Σ* | (q_0, w, Z_0) ⊢* (q, Λ, α) for some q ∈ F, α ∈ Γ*},
N(A) = {w ∈ Σ* | (q_0, w, Z_0) ⊢* (q, Λ, Λ) for some q ∈ Q},
L(A) = {w ∈ Σ* | (q_0, w, Z_0) ⊢* (q, Λ, Λ) for some q ∈ F}.

T(A) is the set accepted by automaton A by final state. N(A) is the set accepted by automaton A by empty store. L(A) is the set accepted by both final state and empty store.

Intuitively, A accepts by final state if there exists some finite sequence of moves that exhausts the input and leaves A in a final state. (Note that, since A may have Λ-moves, it may not halt even though it accepts.) On the other hand, A accepts by empty store if there exists some finite sequence of moves that exhausts the input and empties the pushdown store. (In this case the automaton will stop, since it cannot make any move if its store is empty.)

Finally, we would like to define a "deterministic pda." To ensure that a pda is deterministic, not only must we restrict the "choice" of a next state to at most one state, but we must also take into account Λ-moves. In order to permit the automaton to perform operations on its pushdown store without reading its input, instead of simply disallowing Λ-moves, we permit a Λ-move from a given state if and only if there is no "regular" (non-Λ) move. Hence,

Definition A pda A = (Q, Σ, Γ, δ, q_0, Z_0, F) is deterministic (dpda for short) if, for all (q, a, Z) ∈ Q × (Σ ∪ {Λ}) × Γ,

1. |δ(q, a, Z)| ≤ 1;
2. If δ(q, Λ, Z) ≠ ∅, then δ(q, a, Z) = ∅ for each a ∈ Σ.

Definition A set L is called a deterministic context-free language if there is some dpda A such that L = T(A). (It may be premature to call a set a "deterministic" context-free language before the connection between pda's and these languages has been established, but we shall do so.) We say that two dpda's A and A' are equivalent if T(A) = T(A').

Before proceeding to a mathematical analysis of pda's, we give two examples to illustrate the ideas and notation.

Example 1 Let G be the grammar

S → aSa | bSb | c.

Then

L_c = L(G) = {wcw^T | w ∈ {a, b}*}.

That is, L_c is the set of palindromes with a center marker. We want to construct a pda A which accepts L_c.
Intuitively, A will copy each symbol of its input onto its pushdown until it encounters the center marker. It will then go into a new state, in which it will compare the top symbol on the pushdown with the input. If they fail to match, it will halt; otherwise it will erase the top symbol on the pushdown and repeat the process with the next symbol of the input. Finally, if it exhausts its input and only the start symbol remains on the pushdown, it will go into an accepting state.

Let A = (Q, Σ, Γ, δ, q_0, Z_0, F), where

Q = {q_0, q_1, q_2},
Σ = {a, b, c},
Γ = {a, b, Z_0},
F = {q_2},

and δ is given by:* For all d ∈ {a, b},

δ(q_0, d, Z_0) = (q_0, Z_0d)                       Stack first input letter.
δ(q_0, d, A) = (q_0, Ad) for each A ∈ {a, b}       Continue stacking.
δ(q_0, c, B) = (q_1, B) for all B ∈ Γ              Center found.
δ(q_1, d, d) = (q_1, Λ)                            Compare and erase.
δ(q_1, Λ, Z_0) = (q_2, Z_0)                        Accept.

To show how the automaton works, consider the input aabcbaa:

(q_0, aabcbaa, Z_0) ⊢ (q_0, abcbaa, Z_0a)
                   ⊢ (q_0, bcbaa, Z_0aa)
                   ⊢ (q_0, cbaa, Z_0aab)
                   ⊢ (q_1, baa, Z_0aab)
                   ⊢ (q_1, aa, Z_0aa)
                   ⊢ (q_1, a, Z_0a)
                   ⊢ (q_1, Λ, Z_0)
                   ⊢ (q_2, Λ, Z_0),

at which point the machine halts, since q_2 is not in the domain of δ. Note that c ∈ L_c and (q_0, c, Z_0) ⊢* (q_2, Λ, Z_0). The automaton is clearly deterministic, and hence L_c = T(A) is a deterministic context-free language.

* Note that in a dpda we write δ(q, a, Z) = (q', α) instead of δ(q, a, Z) = {(q', α)}.
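Example 1 translates almost line for line into code. The following Python sketch is our own illustration of the dpda A, with the invented name accepts_Lc: the branches mirror the rules of δ above, and the ID (state, remaining input, stack) is stepped until no move applies.

    def accepts_Lc(w):
        # Deterministic pda for L_c = {w c w^T : w in {a,b}*}, as in Example 1.
        state, stack = "q0", ["Z0"]
        i = 0
        while True:
            top = stack[-1]
            if state == "q0" and i < len(w) and w[i] in "ab":
                stack.append(w[i]); i += 1        # stack the letter
            elif state == "q0" and i < len(w) and w[i] == "c":
                state = "q1"; i += 1              # center marker found
            elif state == "q1" and i < len(w) and w[i] == top:
                stack.pop(); i += 1               # compare and erase
            elif state == "q1" and top == "Z0":
                state = "q2"                      # Λ-move into the final state
            else:
                break                             # no move applies: halt
        return state == "q2" and i == len(w)

    # accepts_Lc("aabcbaa") == True;  accepts_Lc("abba") == False

Because at most one branch applies to any ID, the loop is a faithful rendering of determinism: no search is ever needed.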
Example 2  Let G be the grammar

S → aSa | bSb | Λ.

Then P = L(G) = {ww^T | w ∈ {a, b}*} is clearly context-free. P is the set of even-length palindromes (without a center marker). P differs from L_c only in that there is no center marker. Hence, if we could construct a pda which could find the center of the input, we could proceed as before. A little thought should convince the reader that, since the input is one-way and the length of the input is unbounded, a deterministic pda cannot find the center of the input. (Can you prove that rigorously?) Therefore we will construct a pda that will "guess" at where the center is and then proceed as before. Clearly, if the pda "guesses" correctly, it will work properly. All that must be checked is that, when the automaton makes a wrong guess, it does not accept the input.

CASE 1. If the automaton "guesses" too soon, then, although the pda could empty its pushdown, the input would not be exhausted, and hence the string will not be accepted.

CASE 2. If the automaton "guesses" too late, then it could exhaust its input, but the pushdown would not be empty, and therefore the string will not be accepted.

Formally, let

C = (Q, Σ, Γ, δ, q₀, Z₀, ∅),

where Q = {q₀, q₁}, Σ = {a, b}, Γ = {a, b, Z₀}, and δ is given by:

For all c, d ∈ {a, b}:

δ(q₀, c, Z₀) = {(q₀, Z₀c)}   Stack first letter.
δ(q₀, c, d) = {(q₀, dc)} if c ≠ d   Continue stacking.
δ(q₀, Λ, Z₀) = {(q₁, Λ)}   Accept Λ.
δ(q₀, c, c) = {(q₀, cc), (q₁, Λ)}   Guess what to do.
δ(q₁, c, c) = {(q₁, Λ)}   Compare and pop.
δ(q₁, Λ, Z₀) = {(q₁, Λ)}   Accept.

To show how the pda works, consider the input abba. There are two possible sequences of ID's:
(q₀, abba, Z₀) ⊢ (q₀, bba, Z₀a)
⊢ (q₀, ba, Z₀ab)
⊢ (q₀, a, Z₀abb)
⊢ (q₀, Λ, Z₀abba)

and

(q₀, abba, Z₀) ⊢ (q₀, bba, Z₀a)
⊢ (q₀, ba, Z₀ab)
⊢ (q₁, a, Z₀a)
⊢ (q₁, Λ, Z₀)
⊢ (q₁, Λ, Λ).

Since the second of these ends with an empty pushdown and null input, the string is accepted. (The reader should verify that a string like abbb is not accepted.) Thus C accepts P by empty store. It should be noted that C is nondeterministic. Later we will prove that there is no deterministic pda that accepts P.

Before embarking on the proof of equivalence between pda's and context-free grammars, we associate timing conditions with these devices.

Definition  A pda A = (Q, Σ, Γ, δ, q₀, Z₀, F)

i) operates with delay k, k ≥ 0, if for all q, q′ ∈ Q and α, β ∈ Γ*, (q, Λ, α) ⊢ⁿ (q′, Λ, β) implies n ≤ k. [Intuitively, A can have at most k consecutive Λ-moves.]

ii) operates in quasi-realtime (is quasi-realtime) if there is k ≥ 0 such that A operates with delay k. [Intuitively, A can have at most a bounded number of consecutive Λ-moves.]

iii) operates in linear time if there exists k > 0 such that for all q, q′ ∈ Q, α, β ∈ Γ*, and w ∈ Σ⁺, if (q, w, α) ⊢ⁿ (q′, Λ, β), then n ≤ k·lg(w). [Intuitively, every non-null input can be processed in time proportional to its length.]

iv) is Λ-free if for all q ∈ Q, Z ∈ Γ, δ(q, Λ, Z) = ∅.

v) operates in realtime if A is deterministic and operates with delay 0.

vi) is unambiguous if, for each w ∈ Σ*, there is at most one computation C₀ ⊢ ⋯ ⊢ Cₙ, where C₀ is an initial ID involving w and Cₙ is an accepting ID. [Note that the form of a final ID depends on whether we are dealing with L(A), N(A), or T(A).]
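Returning to Example 2, nondeterministic acceptance can be checked mechanically by exploring every computation. The sketch below (Python again; an illustration, not part of the text) transcribes δ of the pda C and searches the tree of ID's for one that ends with exhausted input and an empty store; on abba it finds the second, accepting computation above:

    # Acceptance by empty store for the pda C of Example 2
    # (illustrative sketch).  Stacks are strings with the top symbol
    # last; "Z" plays Z0, and None marks a Lambda-move.
    DELTA = {
        ("q0", "a", "Z"): [("q0", "Za")], ("q0", "b", "Z"): [("q0", "Zb")],
        ("q0", "a", "b"): [("q0", "ba")], ("q0", "b", "a"): [("q0", "ab")],
        ("q0", "a", "a"): [("q0", "aa"), ("q1", "")],   # stack, or guess
        ("q0", "b", "b"): [("q0", "bb"), ("q1", "")],   # that this is the center
        ("q0", None, "Z"): [("q1", "")],                # accept Lambda
        ("q1", "a", "a"): [("q1", "")],                 # compare and pop
        ("q1", "b", "b"): [("q1", "")],
        ("q1", None, "Z"): [("q1", "")],                # accept
    }

    def accepts_by_empty_store(w):
        todo = [("q0", 0, "Z")]                  # a worklist of ID's to explore
        while todo:
            state, i, stack = todo.pop()
            if i == len(w) and not stack:
                return True
            if not stack:
                continue            # a pda cannot move on an empty store
            top, rest = stack[-1], stack[:-1]
            for nstate, push in DELTA.get((state, None, top), []):
                todo.append((nstate, i, rest + push))           # Lambda-moves
            if i < len(w):
                for nstate, push in DELTA.get((state, w[i], top), []):
                    todo.append((nstate, i + 1, rest + push))   # input moves
        return False

    assert accepts_by_empty_store("abba") and accepts_by_empty_store("")
    assert not accepts_by_empty_store("abbb") and not accepts_by_empty_store("aba")

The search terminates here because every Λ-move of C pops a symbol and every other move consumes an input letter; for any pda that is quasi-realtime in the sense just defined, the tree of computations on a fixed input is finite for the same reason.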
Looking ahead to Lemma 5.6.5, we will prove the following result.

Proposition 5.2.2  Every dpda is equivalent to an unambiguous pda.

Every Λ-free pda operates with delay 0. Each pda that operates in quasi-realtime operates in linear time. The converse is false. To see this, imagine a dpda A that accepts a*b*c. As this set is regular, the dpda need not even consult its stack, but let us suppose that A stacks all its input up to the c. Then, if the input is of the proper form, A will accept as L(A) by erasing its pushdown. This will take a number of consecutive Λ-moves that is linear in the length of the input. Thus A would be a linear-time dpda but not quasi-realtime.

Example  Returning to Example 1 and examining the pda A, we note that A is deterministic and operates with delay one. What can you say about Example 2?

PROBLEMS

1  Show that, if L ⊆ Σ*, then L = L(A) for some pda A if and only if L = T(B) for some pda B if and only if L = N(C) for some pda C.

2  Can you modify the pda in Example 1 to work in realtime?

3  Show that the set L = {w ∈ {a, b}* | #_a(w) = #_b(w)} can be accepted by a dpda.

4  Show that the set L = {aⁿbᵐ | 1 ≤ n ≤ m ≤ 2n} can be accepted by a pda.

5  Given any pda A, one can construct a new pda B such that, if (q′, α) ∈ δ_B(q, a, Z), then lg(α) ≤ 2. That is, B never tries to add more than one character at a time. In what sense is B "equivalent" to A? That is, is N(B) = N(A)? If A is realtime, is B? Etc.

6  There is another way in which dpda's can be used to accept languages. Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda. If (q, x, α) is an ID, we say that (q, α⁽¹⁾) is the mode. We now let F be a set of modes. Show that

L = {w ∈ Σ* | (q₀, w, Z₀) ⊢* (q, Λ, α) for some mode (q, α⁽¹⁾) ∈ F}

is a deterministic context-free language.

7  Show that every regular set accepted by a finite automaton with n states may be accepted by
i) a dpda with n states and one pushdown symbol, or
ii) a dpda with one state and n pushdown symbols.
Be sure to state which type of acceptance is being used.

8  Show that a dpda A, with q states, t pushdown symbols, and maximum length r to be stacked, has an equivalent dpda with q states and at most one symbol added to the stack at a time.

5.3 EQUIVALENT FORMS OF ACCEPTANCE

Given a pda A, a set may be accepted as T(A), N(A), or L(A). We shall study the relation between sets defined in these ways.
Definition  For a fixed alphabet Σ, let:

𝒯_Σ = {L ⊆ Σ* | L = T(A) for some pda A},
𝒩_Σ = {L ⊆ Σ* | L = N(A) for some pda A},
ℒ_Σ = {L ⊆ Σ* | L = L(A) for some pda A}.

𝒯_Σ is the set of languages accepted by final state, 𝒩_Σ is the set of languages accepted by empty store, and ℒ_Σ is the set of languages accepted by both final state and empty pushdown.

In this section, we prove that 𝒯_Σ = 𝒩_Σ = ℒ_Σ. We begin by showing that 𝒩_Σ ⊆ 𝒯_Σ.

Theorem 5.3.1  For each (deterministic) pda A = (Q, Σ, Γ, δ, q₀, Z₀, ∅), there exists a (deterministic) pda B = (Q′, Σ, Γ′, δ_B, q₀′, Y₀, {f}) such that N(A) = T(B) = N(B) = L(B). Moreover, (q₀, x, Z₀) ⊢*_A (q, Λ, Λ) for some q ∈ Q if and only if (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ).

Proof  Let B = (Q ∪ {q₀′, f}, Σ, Γ ∪ {Y₀}, δ_B, q₀′, Y₀, {f}), where q₀′, f ∉ Q, Y₀ ∉ Γ, and δ_B is given by:

1  δ_B(q₀′, Λ, Y₀) = {(q₀, Y₀Z₀)}   Add a new bottom symbol.
2  For all (q, a, Z) ∈ Q × (Σ ∪ {Λ}) × Γ, δ_B(q, a, Z) = δ(q, a, Z).   Simulate A.
3  δ_B(q, Λ, Y₀) = {(f, Λ)} for all q ∈ Q.   Erase Y₀ and accept.

Note. B is deterministic if and only if A is deterministic.

Intuitively, B enters Z₀ on the pushdown, simulates A until A's stack is empty, and then makes a transition to f, erasing its start symbol. We now proceed to show N(A) = T(B).

Claim 1  If x ∈ N(A), then (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ).

Proof  If x ∈ N(A), then (q₀, x, Z₀) ⊢*_A (q, Λ, Λ) for some q ∈ Q. But, since δ_A ⊆ δ_B, we have (q₀, x, Z₀) ⊢*_B (q, Λ, Λ). Now, since the pushdown could only have been emptied on the last move, we have

(q₀′, x, Y₀) ⊢_B (q₀, x, Y₀Z₀) ⊢*_B (q, Λ, Y₀) ⊢_B (f, Λ, Λ),

or, equivalently: if x ∈ N(A), then (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ).
Claim 2  If (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ), then x ∈ N(A).

Proof  If (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ), then, from the construction of B, the computation must be of the form

(q₀′, x, Y₀) ⊢_B (q₀, x, Y₀Z₀) ⊢*_B (q, Λ, Y₀) ⊢_B (f, Λ, Λ)

for some q ∈ Q. Now, since Y₀ could not have been rewritten in the computation (q₀, x, Y₀Z₀) ⊢*_B (q, Λ, Y₀), we must have (q₀, x, Z₀) ⊢*_B (q, Λ, Λ). This computation uses only states from Q. Therefore (q₀, x, Z₀) ⊢*_A (q, Λ, Λ), or x ∈ N(A). Thus, (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ) implies x ∈ N(A).

Also, since f is a final state, x ∈ L(B) and x ∈ T(B). B empties its stack if and only if B enters a final state. Together with Claims 1 and 2, we have

N(A) = T(B) = N(B) = L(B),

and (q₀, x, Z₀) ⊢*_A (q, Λ, Λ) if and only if (q₀′, x, Y₀) ⊢*_B (f, Λ, Λ). □

Corollary  (a) 𝒩_Σ ⊆ 𝒯_Σ; (b) 𝒩_Σ ⊆ ℒ_Σ.

Next we show that 𝒯_Σ ⊆ 𝒩_Σ.

Theorem 5.3.2  If L = T(A) for some pda A, then there exists a pda B such that L = N(B) = L(B).

Proof  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F). We will construct B from A as follows: B has a special stack marker Y₀ (its start symbol), which can only be erased from a special state d. On a given input, B will

a) place A's start symbol on the pushdown (using a Λ-move) from a special initial state;
b) simulate A until A enters a final state;
c) if A enters a final state, B may either (i) go into the special state d, erase its store, and halt, or (ii) continue simulating A until another final state is reached.

Formally, we have

B = (Q ∪ {d, q₀′}, Σ, Γ ∪ {Y₀}, δ_B, q₀′, Y₀, {d}),

where d, q₀′ ∉ Q, Y₀ ∉ Γ, and δ_B is given by:

1  δ_B(q₀′, Λ, Y₀) = {(q₀, Y₀Z₀)}   Add Z₀ to the stack.
2  δ_B(q, a, Z) = δ(q, a, Z) for all (q, a, Z) with q ∉ F or a ≠ Λ.   Simulate A.
3  If q ∈ F, δ_B(q, Λ, Z) = δ(q, Λ, Z) ∪ {(d, Λ)} for all Z ∈ Γ ∪ {Y₀}.   In a final state, guess whether or not it is the end.
4  δ_B(d, Λ, Z) = {(d, Λ)} for all Z ∈ Γ ∪ {Y₀}.   Erase everything from d.

Note. B may be nondeterministic even if A is deterministic.

We now must show that N(B) = T(A).

Claim 1  T(A) ⊆ N(B).

Proof  If x ∈ T(A), then (q₀, x, Z₀) ⊢*_A (q, Λ, α) for some q ∈ F, α ∈ Γ*. Since δ ⊆ δ_B, we have (q₀, x, Z₀) ⊢*_B (q, Λ, α) for some q ∈ F, α ∈ Γ*. Now, since, in the above computation, B could not have annihilated its store (except possibly on the last step), we have, by (1) above,

(q₀′, x, Y₀) ⊢_B (q₀, x, Y₀Z₀) ⊢*_B (q, Λ, Y₀α).

But

(q, Λ, Y₀α) ⊢_B (d, Λ, Λ) if α = Λ,
(q, Λ, Y₀α) ⊢_B (d, Λ, Y₀α′) if α = α′Z for some Z ∈ Γ,

whence (d, Λ, Y₀α′) ⊢*_B (d, Λ, Λ). Combining, we have: if x ∈ T(A), then (q₀′, x, Y₀) ⊢*_B (d, Λ, Λ), or T(A) ⊆ N(B).

Claim 2  N(B) ⊆ T(A).

Proof  If x ∈ N(B), then (q₀′, x, Y₀) ⊢*_B (q, Λ, Λ). The above computation must end in d, since d is the only state that can erase Y₀. From the construction of B, this is possible only if the computation is of the form

(q₀′, x, Y₀) ⊢_B (q₀, x, Y₀Z₀) ⊢*_B (q, Λ, Y₀α) ⊢*_B (d, Λ, Λ)

for some q ∈ F and α ∈ Γ*. But since Y₀ could not have been rewritten in the computation (q₀, x, Y₀Z₀) ⊢*_B (q, Λ, Y₀α), we must have (q₀, x, Z₀) ⊢*_A (q, Λ, α), q ∈ F. Therefore x ∈ N(B) implies x ∈ T(A), or T(A) ⊇ N(B).
Combining Claims (1) and (2), we have T(A) = N(B), which proves the theorem. Also, N(B) = L(B). □

Combining these theorems with Problem 5, we have:

Theorem 5.3.3  𝒯_Σ = 𝒩_Σ = ℒ_Σ. That is, acceptance by final state, or by empty store, or by both simultaneously leads to the same family of accepted sets.

Proof  We have actually shown that 𝒩_Σ ⊆ 𝒯_Σ, 𝒩_Σ ⊆ ℒ_Σ, and 𝒯_Σ ⊆ 𝒩_Σ. The proof that ℒ_Σ ⊆ 𝒩_Σ is left for Problem 5. □

PROBLEMS

1  Suppose that the proof of Theorem 5.3.2 were changed and a new start state q₀′ were not added. [Change q₀′ to q₀ in (1) above.] Would the construction still work?

2  Congratulations! You have just purchased a standard pda or dpda. The following options are available at a slight extra cost. Which of these features increase the power of your pda (dpda)? Prove your answers.
a) The ability to move on input strings rather than on elements of Σ ∪ {Λ}.
b) The ability to define δ on strings from Γ⁺ instead of just Γ, and hence to erase strings in one move.
c) The ability to move on Λ on the pushdown as well as on elements of Γ.

3  Design a dpda to accept the following set by empty store:

L = {a^{i₁}b a^{i₂}b ⋯ a^{i_{r−1}}b a^{i_r} c^s a^{i_s+1} | r ≥ 1; 1 ≤ i_j for all j, 1 ≤ j ≤ r; 1 ≤ s ≤ r}.

4  Design the simplest Λ-free pda you can which accepts L as defined in Problem 3. Can you do it with a realtime dpda? Can you do it with a deterministic multitape Turing machine that works in realtime?

5  Explain why it does not follow immediately that ℒ_Σ ⊆ 𝒩_Σ. Now give the necessary construction.

6  A pushdown automaton A = (Q, Σ, Γ, δ, q₀, Z₀, F) is said to be a counter if |Γ| = 1. Show that, for each counter A, there exists a counter B such that N(A) = L(B).

7  Let L = {aⁿbᵐc | n ≥ m ≥ 1}. Construct a counter A such that L(A) = L. Show that there is no counter B such that N(B) = L.

8  Show that, for counters, the families 𝒯 and 𝒩 are incomparable.

For future reference, a counter language is any set L ⊆ Σ* which is accepted as L(A) for some counter A. Now, counter languages are very restricted, because a pda may not move on an empty store. Thus the only information we can get from the stack is when it is empty, and then the computation stops. To remedy this, we allow detection of a zero count as well.
A pushdown automaton A = (Q, Σ, Γ, δ, q₀, Z₀, F) is an iterated counter if Γ = {Z₀, a}, and

δ(q, b, a) ⊆ Q × {a}*  and  δ(q, b, Z₀) ⊆ (Q × Z₀a*) ∪ (Q × {Λ})

for each q ∈ Q and b ∈ Σ ∪ {Λ}. Thus, an iterated counter may count down to zero more than once. Of course, every counter language is an iterated counter language.

9  Show that, for the family of iterated counters, 𝒯 = 𝒩 = ℒ.

5.4 CHARACTERIZATION THEOREMS

Our purpose in studying pushdown automata is to characterize context-free languages in terms of them. Without further ado, we begin this task.

Definition  Let 𝒞_Σ = {L ⊆ Σ* | L is a context-free language}. Then 𝒞_Σ is the family of context-free languages (over Σ).

In the last section, we showed that 𝒯_Σ = 𝒩_Σ = ℒ_Σ. In this section we will prove that 𝒩_Σ = 𝒞_Σ; that is, the family of context-free languages is the same as the family of languages accepted by empty store by pda's. Thus we will have

𝒞_Σ = 𝒯_Σ = 𝒩_Σ = ℒ_Σ,

giving a very nice characterization of the context-free languages, which we will use extensively. The proof that 𝒩_Σ = 𝒞_Σ is very important, not only for the result obtained, but also for the technique employed, since similar constructions occur elsewhere in both the theory and the applications of context-free languages.

Theorem 5.4.1  For each context-free language L, there exists a pda A such that L = N(A).

Proof  Let L = L(G) for some context-free grammar G = (V, Σ, P, S). Now construct a pda

A = ({q}, Σ, V, δ, q, S, ∅),

where δ is given by

i) δ(q, Λ, Z) = {(q, α^T) | Z → α is in P};
ii) for each a ∈ Σ, δ(q, a, a) = {(q, Λ)}.

Intuitively, A performs a standard top-down parse. There is no left-recursion problem, because A is nondeterministic. To illustrate this, consider:
Example  Let G be the grammar with S → aSb | Λ as productions. Then

δ(q, Λ, S) = {(q, bSa), (q, Λ)},
δ(q, a, a) = δ(q, b, b) = {(q, Λ)}.

Now consider the string aab. The reader may easily verify that the possible computations are

i) (q, aab, S) ⊢³ (q, ab, b);
ii) (q, aab, S) ⊢ (q, aab, Λ);
iii) (q, aab, S) ⊢⁵ (q, b, bbbSa);
iv) (q, aab, S) ⊢⁶ (q, Λ, b).

Thus aab ∉ N(A). On the other hand,

(q, ab, S) ⊢ (q, ab, bSa) ⊢ (q, b, bS) ⊢ (q, b, b) ⊢ (q, Λ, Λ).

Thus, ab ∈ N(A).

We now proceed to prove the theorem.

Claim 1  If S ⇒ⁿ_L uα, with u ∈ Σ* and α ∈ NV* ∪ {Λ}, then, for each w ∈ Σ*, (q, uw, S) ⊢* (q, w, α^T).

Proof  By induction on the length of the derivation.

Basis. S ⇒⁰ S, so u = Λ and α = S. Then (q, w, S) ⊢⁰ (q, w, S), and the basis is complete.

Induction step. Suppose that we have a leftmost derivation of length n + 1,

S ⇒ⁿ_L uBγ ⇒ uxβγ,

where B → xβ is in P, u, x ∈ Σ*, γ ∈ V*, and β ∈ NV* ∪ {Λ}. By the induction hypothesis, we have, for any w ∈ Σ*, (q, uxw, S) ⊢* (q, xw, γ^T B). Since B → xβ is in P, there is a rule of type (i), so that, for any w ∈ Σ*,

(q, uxw, S) ⊢* (q, xw, γ^T B) ⊢ (q, xw, γ^T β^T x^T) ⊢* (q, w, γ^T β^T).
We were able to conclude the above computation using rules of type (ii), since x ∈ Σ*. Thus the induction is extended and the claim is proven.

In particular, if S ⇒* x ∈ Σ*, then, by taking u = x, w = Λ, α = Λ, we have (q, x, S) ⊢* (q, Λ, Λ). Therefore L ⊆ N(A).

Claim 2  If (q, uw, S) ⊢ⁿ (q, w, α^T), with u, w ∈ Σ* and α ∈ V*, then S ⇒*_L uα.

Proof  By induction on the number of steps in the computation.

Basis. If (q, w, S) ⊢⁰ (q, w, S), then u = Λ and α = S. Therefore, S ⇒⁰ S completes the basis.

Induction step. Assume that, if (q, uw, S) ⊢ⁿ (q, w, α^T) with u, w ∈ Σ* and α ∈ V*, then S ⇒*_L uα. Now let (q, uw, S) ⊢ⁿ⁺¹ (q, w, α^T). There are two cases to consider.

CASE 1. The last move is of type (ii). Then, if we let u = u′a, with a ∈ Σ,

(q, uw, S) = (q, u′aw, S) ⊢ⁿ (q, aw, α^T a) ⊢ (q, w, α^T).

By the induction hypothesis, S ⇒*_L u′aα = uα.

CASE 2. The last move is of type (i). Then

(q, uw, S) ⊢ⁿ (q, w, β^T B) ⊢ (q, w, β^T γ^T) = (q, w, α^T).

By the construction, B → γ is in P and α = γβ. By the induction hypothesis, S ⇒*_L uBβ. Now, using B → γ, we have

S ⇒*_L uBβ ⇒ uγβ = uα,

which completes the proof of Claim 2.

In particular, if x ∈ Σ* and (q, x, S) ⊢* (q, Λ, Λ), then, by taking u = x, w = Λ, α = Λ, we have S ⇒*_L x.
Therefore, N(A) ⊆ L. Now, using the result of the previous claim, we get L = N(A), which proves the theorem. □

The construction that we have given allows us to draw some even stronger conclusions. We shall state them here as a theorem and leave the proof for the exercises. We will feel free to employ these facts in the future.

Theorem 5.4.2  For each context-free language L, there is a pda A = (Q, Σ, Γ, δ, q₀, S, F) such that N(A) = L(A) = L, and

i) |Q| = 2;
ii) for all q, q′ ∈ Q, a ∈ Σ ∪ {Λ}, Z ∈ Γ, α ∈ Γ*, if (q′, α) ∈ δ(q, a, Z), then lg(α) ≤ 2;
iii) for all w ∈ Σ*, α ∈ Γ*, q ∈ Q, (q₀, w, S) ⊢⁺ (q, Λ, α) implies q ≠ q₀;
iv) if L is an unambiguous context-free language, then A is an unambiguous pda;
v) A operates in linear time.

We shall shortly investigate the role of Λ-moves in pda's. Next we turn to showing that N(A) is context-free. This construction has many important applications.

Theorem 5.4.3  If A is an (unambiguous) pda, then N(A) is an (unambiguous) context-free language.

Proof  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F). We may assume, without loss of generality, that there exists a unique state f ∈ Q such that F = {f} and

(q₀, x, Z₀) ⊢* (q, Λ, Λ) if and only if q = f,

since, if A does not satisfy this condition, we can construct

A′ = (Q ∪ {q₀′, f}, Σ, Γ ∪ {Y₀}, δ′, q₀′, Y₀, {f}),

where δ ⊆ δ′ and

δ′(q₀′, Λ, Y₀) = {(q₀, Y₀Z₀)},
δ′(q, Λ, Y₀) = {(f, Λ)} for all q ∈ Q.
Clearly, A′ satisfies the condition and N(A) = N(A′) = L(A′) = T(A′). Now define G = (V, Σ, P, S), where V = Σ ∪ (Q × Γ × Q), S = (q₀, Z₀, f), and P is given by:

i) For all k ≥ 1, each a ∈ Σ ∪ {Λ}, each Z, Z₁, …, Z_k ∈ Γ, and each q, p, q₁, …, q_k ∈ Q,

(q, Z, q_k) → a(p, Z₁, q₁)(q₁, Z₂, q₂) ⋯ (q_{k−2}, Z_{k−1}, q_{k−1})(q_{k−1}, Z_k, q_k)

if (p, Z_k ⋯ Z₁) ∈ δ(q, a, Z).

ii) (q, Z, p) → a if (p, Λ) ∈ δ(q, a, Z).

The idea of the construction is to encode information into the variables (q, Z, p). In (q, Z, p), the first coordinate tells us that the pda is currently in state q. The second coordinate indicates that the top of the pushdown is Z. Let us imagine that the pda A is in state q, reading Z at the top of the pushdown, and let us consider strings w ∈ Σ* which cause the pda ultimately to erase Z from the top of the pushdown. The third component p is a "guess" that we will be in state p when that Z is erased.

Before giving a proof that L(G) = N(A), we work an example.

Example  We will construct A and G for the set

L = {aⁿbᵐaⁿ | n, m ≥ 1} ∪ {aⁿbᵐcᵐ | n, m ≥ 1}.

The idea behind A's processing of the set is simple.

a) Stack the a's and then the b's.
b) If the next input is a, erase all b's from the stack and match up the remaining a's in the input with the stacked a's.
c) If the next input is c, match up the c's with the b's and then erase the a's from the stack.

A can accept this set as L = L(A). Formally,

A = (Q, Σ, Γ, δ, 0, c, {2}),

where Q = {0, 1, 2} and Σ = Γ = {a, b, c}. Table 5.1 gives δ: Q × (Σ ∪ {Λ}) × Γ → Q × Γ*.
The second half of the table gives the productions associated with each move of A. (Some of the unnecessary productions have been omitted.) Here q₀ = 0, Z₀ = c, and a row q, a, Z, q′, α of the table means that (q′, α) ∈ δ(q, a, Z), where q, q′ ∈ Q, α ∈ Γ*, Z ∈ Γ, and a ∈ Σ ∪ {Λ}.

TABLE 5.1

q  a  Z   q′  α    Productions
0  a  c   0   ca   (0,c,2) → a(0,a,1)(1,c,2) | a(0,a,2)(2,c,2)
0  a  a   0   aa   (0,a,1) → a(0,a,1)(1,a,1); (0,a,2) → a(0,a,2)(2,a,2)
0  b  a   0   b    (0,a,1) → b(0,b,1); (0,a,2) → b(0,b,2)
0  b  b   0   bb   (0,b,1) → b(0,b,1)(1,b,1); (0,b,2) → b(0,b,2)(2,b,2)
0  a  b   1   Λ    (0,b,1) → a
1  Λ  b   1   Λ    (1,b,1) → Λ
1  a  a   1   Λ    (1,a,1) → a
1  Λ  c   2   Λ    (1,c,2) → Λ
0  c  b   2   Λ    (0,b,2) → c
2  c  b   2   Λ    (2,b,2) → c
2  Λ  a   2   Λ    (2,a,2) → Λ
2  Λ  c   2   Λ    (2,c,2) → Λ

Note. The start symbol in G is (0, c, 2).

Now consider the computation

(0, abba, c) ⊢ (0, bba, ca)
⊢ (0, ba, cb)
⊢ (0, a, cbb)
⊢ (1, Λ, cb)
⊢ (1, Λ, c)
⊢ (2, Λ, Λ).

The corresponding (leftmost) derivation of abba in G is shown in Fig. 5.4. The correspondence between a partial computation of A and its associated generation can best be seen by example:

(0, abba, c) ⊢³ (0, a, cbb).

On the other hand, concatenating the frontier of the generation tree at the third step in the derivation gives

abb(0,b,1)(1,b,1)(1,c,2).

The string abb is that part of the input string used, and cbb (the reversal of the string obtained by concatenating the center symbol of each variable) is the contents of the pushdown at the corresponding point in the computation.

[Fig. 5.4: the derivation tree of abba in G, with root (0,c,2), frontier abba, and internal nodes labeled by the triples appearing in Table 5.1.]
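Rules (i) and (ii) are entirely mechanical, and a table like Table 5.1 can be reproduced by a program. The sketch below (Python; an illustration, not part of the text) generates the productions of the triple construction from a transition table; the guesses q₁, …, q_k are supplied by itertools.product, and a reduction step (not shown) would discard the useless productions that the table omits:

    # The grammar of Theorem 5.4.3 from a transition table
    # (illustrative sketch).  delta maps (q, a, Z) to a list of
    # (p, gamma); a is None for a Lambda-move, and gamma is the string
    # pushed in place of Z, written with the top symbol last.
    from itertools import product

    def triple_grammar(Q, delta, q0, Z0, f):
        P = []
        for (q, a, Z), moves in delta.items():
            prefix = () if a is None else (a,)
            for p, gamma in moves:
                if not gamma:                      # rule (ii): (p, Lambda)
                    P.append(((q, Z, p), prefix))
                    continue
                tops = gamma[::-1]                 # Z1 ... Zk, top of stack first
                for qs in product(Q, repeat=len(tops)):
                    body, prev = list(prefix), p   # rule (i): linked triples
                    for Zi, qi in zip(tops, qs):
                        body.append((prev, Zi, qi))
                        prev = qi
                    P.append(((q, Z, qs[-1]), tuple(body)))
        return (q0, Z0, f), P                      # start symbol, productions

    # The first row of Table 5.1: (0, "ca") in delta(0, "a", "c")
    start, P = triple_grammar((0, 1, 2), {(0, "a", "c"): [(0, "ca")]}, 0, "c", 2)
    assert ((0, "c", 2), ("a", (0, "a", 1), (1, "c", 2))) in P

Note how the reversal gamma[::-1] mirrors the condition (p, Z_k ⋯ Z₁) ∈ δ(q, a, Z) of rule (i): the triple for the current top of the stack must come first in the production body.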
A similar statement holds in general, and it is just this statement which we must make precise and prove in order to show that L(G) = N(A). Hence, we define

H = {α ∈ (Q × Γ × Q)* | if α = α₁(q₁, Z₁, q₁′)(q₂, Z₂, q₂′)α₂, then q₁′ = q₂}.

Thus H is a set of strings of triples in which the triples are "internally linked," as shown below.

Example  Continuing from the previous example, let α = (0,b,1)(1,b,1)(1,c,2). Clearly, α ∈ H.

Next, we define a mapping φ: H → Γ*, where

φ(Λ) = Λ,
φ(α(q, Z, q′)) = Zφ(α) for all α(q, Z, q′) ∈ H.

Note that, for any α, β, we have φ(αβ) = φ(β)φ(α). The purpose of φ is to decode the center components of its argument and to reverse them.

Example  If α = (0,b,1)(1,b,1)(1,c,2), then φ(α) = cbb.

Claim  For each w, y ∈ Σ*, each p, q, q′ in Q, each Z ∈ Γ, each α ∈ (Q × Γ × Q)*, and for all n ≥ 1,

(q, Z, q′) ⇒ⁿ wα if and only if (q, wy, Z) ⊢ⁿ (p, y, φ(α))
and either†

α ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H, or α = Λ and p = q′.

Proof  By induction on n.

Basis. n = 1; then (q, Z, q′) ⇒ wα if and only if (q, Z, q′) → wα is in P. There are two possible cases.

CASE 1. The rule (q, Z, q′) → wα is of type (i). This can occur if and only if

α ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H.

Suppose α = (p, Z₁, q₁) ⋯ (q_{k−1}, Z_k, q′) for some q₁, …, q_{k−1} ∈ Q and Z₁, …, Z_k ∈ Γ. Thus, (q, Z, q′) ⇒ wα

if and only if (q, Z, q′) → w(p, Z₁, q₁) ⋯ (q_{k−1}, Z_k, q′) is in P
if and only if (p, Z_k ⋯ Z₁) ∈ δ(q, w, Z)
if and only if (q, wy, Z) ⊢ (p, y, Z_k ⋯ Z₁).

But

φ(α) = φ((p, Z₁, q₁)(q₁, Z₂, q₂) ⋯ (q_{k−1}, Z_k, q′)) = Z_k Z_{k−1} ⋯ Z₁.

Thus (q, Z, q′) ⇒ wα by a rule of type (i) if and only if (q, wy, Z) ⊢ (p, y, φ(α)) and

α ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H.

† This condition means that either α = (p, Z₁, q₁)β(q₂, Z₂, q′) ∈ H for some q₁, q₂ ∈ Q, Z₁, Z₂ ∈ Γ, and some β ∈ H (that is, the first component of the first triple is p, while the last component of the last triple is q′), or α = Λ and p = q′.
CASE 2. The rule (q, Z, q′) → wα is of type (ii). This can happen if and only if α = Λ. Thus (q, Z, q′) ⇒ w

if and only if (q, Z, q′) → w is in P
if and only if (q′, Λ) ∈ δ(q, w, Z)
if and only if (q, wy, Z) ⊢ (q′, y, Λ).

Now, since φ(α) = φ(Λ) = Λ, we have (q, Z, q′) ⇒ wα if and only if (q, wy, Z) ⊢ (p, y, φ(α)) and α = Λ and p = q′.

Induction step. Assume the claim holds for all k ≤ n. Note that

(q, Z, q′) ⇒ⁿ⁺¹ wα

will hold if and only if

(q, Z, q′) ⇒ⁿ w₁Yα₂ ⇒ w₁aα₁α₂,

where w = w₁a, α = α₁α₂, and Y → aα₁ is in P. Let Y = (q″, Z′, p′). Now, applying the induction hypothesis, we have (q, Z, q′) ⇒ⁿ w₁Yα₂ if and only if

(q, w₁ay, Z) ⊢ⁿ (q″, ay, φ(Yα₂)) = (q″, ay, φ(α₂)Z′),

where the equality follows from φ(Yα₂) = φ(α₂)φ(Y) = φ(α₂)Z′. By the induction hypothesis,

Yα₂ ∈ ({q″} × Γ × Q)H ∩ H(Q × Γ × {q′}),

and hence

a) α₂ = Λ and p′ = q′, or α₂ ≠ Λ and Yα₂ ∈ H(Q × Γ × {q′}) ∩ H.

But this gives

b) α₂ ∈ ({p′} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H, since Y = (q″, Z′, p′).

Now, since Y → aα₁ is in P,

Y = (q″, Z′, p′) ⇒ aα₁.
Applying the induction hypothesis a second time gives (q″, Z′, p′) ⇒ aα₁ if and only if (q″, ay, Z′) ⊢ (p, y, φ(α₁)) with either

c) α₁ = Λ and p = p′, or
d) α₁ ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {p′}) ∩ H.

Now, combining, we have

(q, Z, q′) ⇒ⁿ⁺¹ wα if and only if (q, w₁ay, Z) ⊢ⁿ (q″, ay, φ(α₂)Z′) ⊢ (p, y, φ(α₂)φ(α₁)),

where we may combine the computations by Proposition 5.2.1. Now, using the facts that w = w₁a, α = α₁α₂, and φ(α₂)φ(α₁) = φ(α₁α₂) = φ(α), we have

(q, Z, q′) ⇒ⁿ⁺¹ wα if and only if (q, wy, Z) ⊢ⁿ⁺¹ (p, y, φ(α)).

To get the conditions on α, there are four cases to consider.

CASE 3. α₁ = α₂ = Λ. Then, from (a) and (c) above, we have p′ = q′ and p = p′. Thus α = α₁α₂ = Λ and p = q′.

CASE 4. α₁ ≠ Λ, α₂ = Λ. Then, from (a) and (d), α = α₁ ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H.

CASE 5. α₁ = Λ, α₂ ≠ Λ. Then, from (b) and (c), α = α₂ ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H.

CASE 6. α₁ ≠ Λ and α₂ ≠ Λ. Then, from (b) and (d),

α = α₁α₂ ∈ [({p} × Γ × Q)H ∩ H(Q × Γ × {p′}) ∩ H][({p′} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H],

whence α ∈ ({p} × Γ × Q)H ∩ H(Q × Γ × {q′}) ∩ H, which proves the claim.
Now let q = q₀, Z = Z₀, α = y = Λ, and p = q′ = f in the above claim, and we get

S = (q₀, Z₀, f) ⇒ⁿ w ∈ Σ* if and only if (q₀, w, Z₀) ⊢ⁿ (f, Λ, Λ).

Therefore, L(G) = N(A) = L(A) = T(A). By the claim, if A is unambiguous, then so is G. □

We may now combine several of our previous results to state one of the main results of this book.

Theorem 5.4.4  Let an alphabet Σ be fixed. Then 𝒞_Σ = 𝒯_Σ = 𝒩_Σ = ℒ_Σ.

Corollary  Every deterministic context-free language is unambiguous.

PROBLEMS

1  Prove Theorem 5.4.2.

2  Is Theorem 5.4.2, part (v), true if we replace the linear-time condition by the condition of operating with delay 0; that is, if we require a Λ-free pda? Explain in detail.

3  Show that the following sets are context-free:
a) L = {aⁱbʲ | i, j ≥ 1} ∪ {aⁱbⁱcʲ | i, j ≥ 1};
b) L = {aⁱbʲ | i ≥ 1, i ≤ j ≤ 2i};
c) L = {xcy | x, y ∈ Σ*, x ≠ y}, where c ∉ Σ. Hint. Find suitable context-free languages L₁, L₂, L₃ so that L = L₁ ∪ L₂ ∪ L₃, where the sets are not pairwise disjoint.

Let us generalize the concept of a pushdown automaton to allow two-way motion on the input. Formally, a two-way nondeterministic pushdown automaton is a 7-tuple A = (Q, Σ, Γ, δ, q₀, Z₀, F), where Q, Γ, q₀, Z₀, and F have their customary meanings. Σ, the input alphabet, is now assumed to contain two distinguished symbols ¢ and $, which serve as the left and right endmarkers, respectively. δ is a mapping from Q × Σ × Γ into the finite subsets of {−1, 0, 1} × Q × Γ*. The idea is that (d, q′, γ) ∈ δ(q, a, Z) means that the device may go into state q′, write γ on the pushdown, and move right if d = 1, move left if d = −1, and remain stationary on the input if d = 0. If (d, q′, γ) ∈ δ(q, a, Z) always implies d ≥ 0, then A is a one-way pda. If |δ(q, a, Z)| ≤ 1 for all (q, a, Z), then A is deterministic. As an abbreviation, we can write 1-pda or 2-pda to stand for a one-way or two-way pda, and 1-dpda or 2-dpda to stand for one-way or two-way deterministic pushdown automata. An ID of a 2-pda is a quadruple of the form

(q, ¢w$, i, γ),
where q ∈ Q, w ∈ (Σ − {¢, $})*, 1 ≤ i ≤ lg(w) + 2, and γ ∈ Γ*. Thus an ID tells us the state of the device, the location of the read head on the input, and the complete contents of the pushdown store. One defines

(q, ¢w$, i, αZ) ⊢ (q′, ¢w$, i + d, αγ)

if (d, q′, γ) ∈ δ(q, a, Z), where a is the ith symbol of ¢w$ and 1 ≤ i + d ≤ lg(w) + 2. A 2-pda A accepts w ∈ (Σ − {¢, $})* if

(q₀, ¢w$, 2, Z₀) ⊢* (q, ¢w$, n, α)

for some q ∈ F, where n = lg(w) + 2. Let T(A) = {w | A accepts w}.

4  Show that any 2-pda A′ may be simulated by a 2-pda A with the following properties. On any move which adds to the stack (there is also no loss of generality in assuming that additions to the stack are of length 1), a) the input head is not moved, and b) the move is independent of the input. More formally, (d, q′, γ) ∈ δ(q, a, Z) implies
c) γ = Λ, or
d) γ = ZZ′ with Z, Z′ ∈ Γ, d = 0, and (d, q′, γ) ∈ δ(q, b, Z) for every b ∈ Σ.

5.5 Λ-MOVES IN PDA'S AND THE SIZE OF A PDA

We shall briefly consider the role of Λ-moves in pda's. One might conjecture that every context-free language can be accepted by a Λ-free pda. If "accepted" is used to mean by N(A) or L(A), then there is a minor problem with Λ. If Λ ∈ L, then the pda must have at least one Λ-move in order to erase Z₀ from the pushdown store. This argument is one justification for those authors who allow pda's to move on empty pushdowns. We begin by assuming that Λ ∉ L.

Theorem 5.5.1  L ⊆ Σ⁺ is a context-free language if and only if L = N(A) for some Λ-free pda A.

Proof
a) If L = N(A) for some Λ-free pda, then the result follows directly from Theorem 5.4.4.
b) Suppose L = L(G), where G = (V, Σ, P, S). We may assume, without loss of generality, that G is in Greibach normal form. Let

A = ({q}, Σ, V, δ, q, S, ∅),

where δ(q, a, Z) = {(q, α^T) | Z → aα is in P}. Clearly, A is Λ-free, and the proof that N(A) = L(G) is exactly the same as in Theorem 5.4.1. □
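Part (b) is short enough to carry out literally. The sketch below (Python, illustrative only) builds the one-state Λ-free pda from a grammar in Greibach form (every production Z → aα with a a terminal) and runs the resulting empty-store acceptor by exhaustive search; since every move consumes an input letter, the search always terminates:

    # Theorem 5.5.1(b): a Greibach-form grammar to a Lambda-free,
    # one-state pda with N(A) = L(G) (illustrative sketch).
    def greibach_to_pda(productions):
        delta = {}
        for Z, rhs in productions:             # rhs = a + alpha, a a terminal
            a, alpha = rhs[0], rhs[1:]
            # push alpha reversed, so that its first symbol ends up on top
            delta.setdefault((Z, a), []).append(alpha[::-1])
        return delta

    def accepts_empty_store(delta, start, w):
        todo = [(0, start)]                    # (input used, stack, top last)
        while todo:
            i, stack = todo.pop()
            if i == len(w) and not stack:
                return True
            if i == len(w) or not stack:
                continue
            for push in delta.get((stack[-1], w[i]), []):
                todo.append((i + 1, stack[:-1] + push))
        return False

    # S -> aSB | aB, B -> b is a Greibach-form grammar for {a^n b^n | n >= 1}:
    delta = greibach_to_pda([("S", "aSB"), ("S", "aB"), ("B", "b")])
    assert accepts_empty_store(delta, "S", "aaabbb")
    assert not accepts_empty_store(delta, "S", "aab")
    assert not accepts_empty_store(delta, "S", "")     # Lambda is excluded, as L is a subset of Sigma+

The absence of Λ-moves is visible in the code: the only way to shorten or change the stack is to read a letter, which is exactly why Λ can never be accepted by empty store here.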
Now we turn to acceptance by final state. Final states easily eliminate the previous problem with Λ.

Theorem 5.5.2  L ⊆ Σ* is a context-free language if and only if L = T(A) for some Λ-free pda A.

Proof  If L = T(A), then L is a context-free language, by Theorem 5.4.4. Conversely, let L = L(G), where G = (V, Σ, P, S) is in Greibach form. Define

A = ({q₀, q, f}, Σ, Γ′, δ, q₀, S′, F),

where Γ′ = V ∪ {A′ | A ∈ N}. Then

i) δ(q, a, Z) = {(q, α^T) | Z → aα is in P};
ii) δ(q, a, Z′) = {(q, B′α^T) | Z → aαB is in P} ∪ {(f, Λ) | Z → a is in P};
iii) δ(q₀, a, S′) = {(q, B′α^T) | S → aαB is in P} ∪ {(f, Λ) | S → a is in P};

and F = {f} ∪ {q₀ | S → Λ is in P}. (The primed variables mark the symbol currently at the bottom of the stack, so that A can detect when only one symbol remains, erase it, and enter the final state f.)

Note. The state q₀ allows Λ ∈ L. The proof that L(G) = T(A) is straightforward and is left as an exercise. □

Corollary  Every context-free language may be accepted by a nondeterministic pda with delay 0.

We may restate these results in a still more vivid form.

Theorem 5.5.3  For each pda A, one can construct a Λ-free pda B so that T(A) = T(B).

Proof  Given a pda A, we use Theorem 5.4.4 to get a context-free grammar G₁ for T(A). By Theorem 5.5.2, we can find a Λ-free pda B so that T(B) = T(A). □

The theorems of this section are really nontrivial, in spite of their short proofs. The reason that the proofs are so simple is that the hard work was done in the grammatical transformations. The skeptical reader may try to prove Theorem 5.5.3 directly, without the aid of Greibach form. Many pda constructions are greatly simplified by the presence of null moves; in the future, we shall use them extensively.

In Chapter 4, we discussed the size of a grammar, and now we wish to consider the size of a pda. There are at least two different ways to define the size. By analogy with |G|, we could define |A| to be the length of the string which results if all of the entries in δ are written out as a word. The formal definition now follows.
Definition  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a pda. Define

|A| = Σ (lg(a) + lg(α) + 3),

where the sum extends over all q, a, Z and all (q′, α) ∈ δ(q, a, Z).

While this is a natural definition, it is cumbersome to compute. One would like to use a measure like |Q| × |Σ| × |Γ|, but this measure seems too crude. For example, we could write down two pda's having the same Q, Σ, and Γ, and yet one writes long strings on each stack move while the other does not. For this reason, it is necessary to impose a restriction on the class of pda's being considered.

Definition  A pda A = (Q, Σ, Γ, δ, q₀, Z₀, F) is moderate if (q′, α) ∈ δ(q, a, Z) implies lg(α) ≤ 2.

We already know something about moderate pda's.

Theorem 5.5.4  For each (deterministic) pda A, there is a moderate (deterministic) pda B which is equivalent to A in whichever form of acceptance is being used.

Proof  The technique of adding states works in the deterministic case as well. □

For moderate pda's, |A| actually grows proportionally to the number of rules, as the following result shows.

Theorem 5.5.5  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a moderate pda. Then 3|δ| ≤ |A| ≤ 6|δ|, and hence |A| = O(|δ|).

Proof  Since A is moderate, (q′, α) ∈ δ(q, a, Z) implies lg(α) ≤ 2. Therefore

|A| = Σ (lg(aα) + 3) ≤ 6|δ|.

The lower bound is obvious also. □

Now we can consider another definition of size.

Definition  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a moderate pda. Let Size(A) = |Q × Σ × Γ|.

This is an attractive measure of size, since it is independent of δ. Now we can begin to relate these measures to each other.

Theorem 5.5.6  There exists a constant c such that, for any moderate pda A = (Q, Σ, Γ, δ, q₀, Z₀, F) with m = |Q × Σ × Γ| and n = |A|, we have n ≤ cm³, where c depends only on the coding of the rules.
Proof  Let |Q| = q, |Σ| = s, and |Γ| = t. Then m = qst. To prove that n ≤ cm³, we construct the largest possible moderate pda, including every allowable type of rule; i.e., for all q, q′ ∈ Q, Z ∈ Γ, a ∈ Σ ∪ {Λ}, and α ∈ {Λ} ∪ Γ ∪ Γ², we include the rule (q′, α) ∈ δ(q, a, Z). Since A is a moderate pda,

n ≤ |δ|(2 + 1 + 3) = 6|δ|,

by the definition of n = |A|. Thus

n ≤ 6|δ| ≤ 6q(1 + t + t²)·q(s + 1)t = 6q²(t + t² + t³)(s + 1) ≤ 6q³(3t³)(2s³) = 36m³,

so c = 36 will do. □

PROBLEMS

1  Show that, for any context-free language L, there is a deterministic pda A and a homomorphism φ such that A operates in realtime and φ(L(A)) = L.

2  Consider the following "proof" of Theorem 5.5.2, and explain in detail exactly what goes wrong. Let L = L(G), where G = (V, Σ, P, S) is in 2-standard form. Let A = ({q₀, q₁}, Σ, V, δ, q₀, S, F), where F = {q₁} ∪ {q₀ | S → Λ is in P}. Here, δ is defined as follows:

δ(q₀, a, A) = δ(q₁, a, A) = {(q₁, Λ) | A → a is in P}
∪ {(q₁, B) | A → aB is in P}
∪ {(q₁, CB) | A → aBC is in P}.

Clearly, A is Λ-free, and it is straightforward to verify that T(A) = L(G).

3  Prove the following result.

Proposition  For each context-free language L there is a pda A = (Q, Σ, Γ, δ, q₀, Z₀, F) such that:
a) L = {w ∈ Σ* | (q₀, w, Z₀) ⊢* (q, Λ, Λ) for some q ∈ F};
b) if Λ ∈ L, then δ(q₀, Λ, Z₀) = {(q₀, Λ)} and q₀ ∈ F; if Λ ∉ L, then δ(q₀, Λ, Z₀) = ∅;
c) if q ∈ Q − {q₀} or Z ∈ Γ − {Z₀}, then δ(q, Λ, Z) = ∅;
d) for each (q, a, Z) ∈ Q × Σ × Γ, if (q′, α) ∈ δ(q, a, Z), then lg(α) ≤ 2.
4  Show that, if a finite set is recognized by a deterministic (nondeterministic) pda with q states and t pushdown symbols, then it can be recognized by a (non)deterministic finite automaton with a number of states proportional to t^{q²}.

5  Show that, for each n > 0, there is an infinite regular set Lₙ which is accepted by a reduced finite automaton with at least 2^{2ⁿ} states, but which can be accepted by a dpda of size proportional to n³.

6  Find a lower bound for m in terms of n for moderate pda's, which would be a counterpart of Theorem 5.5.6. Are additional hypotheses required on the pda?

5.6 DETERMINISTIC LANGUAGES CLOSED UNDER COMPLEMENTATION

Although the full family of context-free languages is not closed under complementation (as we shall see shortly), the deterministic languages are. This important result will often be used in the sequel, even before dpda's are studied in detail. The proof of the result is fairly complicated, although our intuition about how to prove it is strong: one wishes to replace F by Q − F and design a "complementary machine." The problem is that Λ-moves complicate the action of dpda's. A dpda might go into an infinite Λ-cycle in the middle of processing an input.

First we ensure that our dpda's always have a next move. This could fail to happen either if a transition were not defined or if the pushdown were emptied prematurely. The idea of the first lemma is to make δ as "total as possible."

Lemma 5.6.1  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda. There is an equivalent dpda A′ = (Q′, Σ, Γ′, δ′, q₀′, Z₀′, F) such that

1  for all (q, a, Z) in Q′ × Σ × Γ′, |δ′(q, a, Z)| + |δ′(q, Λ, Z)| = 1;
2  if δ′(q, a, Z₀′) = (q′, α) for some a ∈ Σ ∪ {Λ}, then α = Z₀′β for some β ∈ (Γ′)*.

Proof  We construct A′, in which Z₀′ acts as a bottom-of-stack marker, so that the pushdown is never erased, and we add any missing transitions. Define A′ = (Q′, Σ, Γ′, δ′, q₀′, Z₀′, F), where Γ′ = Γ ∪ {Z₀′}, Q′ = Q ∪ {q₀′, d}, and δ′ is defined by five cases:

1  δ′(q₀′, Λ, Z₀′) = (q₀, Z₀′Z₀).
2  For all (q, a, Z) ∈ dom δ, δ′(q, a, Z) = δ(q, a, Z).
3  If δ(q, Λ, Z) = ∅ and δ(q, a, Z) = ∅ for some a ∈ Σ, then δ′(q, a, Z) = (d, Z).
4  For all Z ∈ Γ′, a ∈ Σ, δ′(d, a, Z) = (d, Z).
5  For all q ∈ Q, a ∈ Σ, δ′(q, a, Z₀′) = (d, Z₀′).

Clearly, A′ is a dpda which satisfies conditions (1) and (2). For any x ∈ Σ* and q ∈ Q,

(q₀, x, Z₀) ⊢*_A (q, Λ, α) if and only if (q₀′, x, Z₀′) ⊢_{A′} (q₀, x, Z₀′Z₀) ⊢*_{A′} (q, Λ, Z₀′α).

Therefore, T(A′) = T(A). □

One of the problems to be dealt with concerns the possibility that a dpda will get into a Λ-loop, so that the computation will not terminate.

Definition  An ID (q, Λ, α) of a dpda A = (Q, Σ, Γ, δ, q₀, Z₀, F) is immortal if, for every integer i, there is an ID (qᵢ, Λ, βᵢ) such that lg(βᵢ) ≥ lg(α) and (q, Λ, α) ⊢ⁱ (qᵢ, Λ, βᵢ).

Intuitively, an ID is immortal if A can make an infinite number of Λ-moves without ever shortening its pushdown below its initial height. This would mean that the pushdown might grow infinitely, or might cycle among different strings of the same length. Some of the basic facts about immortal ID's are summarized in the following proposition.

Proposition 5.6.1  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda, and let q ∈ Q, Z ∈ Γ, α, α′ ∈ Γ*. Then:

a) If (q, Λ, α) is immortal, then for any w ∈ Σ* and i ≥ 0, there exist qᵢ ∈ Q, βᵢ ∈ Γ* such that (q, w, α) ⊢ⁱ (qᵢ, w, βᵢ).
b) For any α ∈ Γ*, (q, Λ, Z) is immortal if and only if (q, Λ, αZ) is immortal.
c) If (q₁, w₁, α₁) ⊢* (q, w, α) ⊢* (q′, w′, α′) and (q, Λ, α) is immortal, then w′ = w.
d) If there exists a ∈ Σ such that (q, a, Z) ⊢⁺ (q′, Λ, α₁), then (q, Λ, Z) is not immortal.

Proof
a) If (q, Λ, α) is immortal, then (q, Λ, α) ⊢ⁱ (qᵢ, Λ, βᵢ) for some qᵢ ∈ Q and βᵢ ∈ Γ*. By Proposition 5.2.1, (a) follows.
b) The argument is trivial, since the immortal computation never empties the pushdown, so α would not even be read.
c) This follows from the determinism of A and the definition of immortality, since, once an immortal ID is reached, no additional input is read.
d) This is a variant of (c), since, if a true input gets read, the ID that started the computation could not have been immortal, by the determinism of A. The formal details are left to the reader. □

Because of (a), it is convenient to refer to ID's of the form (q, w, α) as being immortal also. Before deriving an algorithm for detecting immortal ID's, we need more definitions.

Definition  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a pda, and let q ∈ Q, Q₁ ⊆ Q, and α ∈ Γ*. Define:

T(A, q, α, Q₁) = {w ∈ Σ* | (q, w, α) ⊢* (q′, Λ, β) for some β ∈ Γ* and some q′ ∈ Q₁}

and

L(A, q, α, Q₁) = {w ∈ Σ* | (q, w, α) ⊢* (q′, Λ, Λ) for some q′ ∈ Q₁}.

These sets are context-free, in general.

Lemma 5.6.2  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a pda and let q ∈ Q, Q₁ ⊆ Q, and α ∈ Γ*. Then T(A, q, α, Q₁) and L(A, q, α, Q₁) are context-free languages.

Proof  Define B_{q,α,Q₁} = (Q ∪ {q₀′}, Σ, Γ, δ′, q₀′, Z₀, Q₁), where δ′ is defined as follows:

1  δ′(q₀′, Λ, Z₀) = (q, α);
2  δ′(p, a, Z) = δ(p, a, Z) for all (p, a, Z) in the domain of δ.

B is deterministic if A is. Clearly, T(B) = T(A, q, α, Q₁) and L(B) = L(A, q, α, Q₁) are context-free languages. □

Now T(A, q, Z, Q) and L(A, q, Z, Q) are used to provide a test for immortal configurations.

Lemma 5.6.3  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda and let q ∈ Q, Z ∈ Γ. If A satisfies condition (1) of Lemma 5.6.1, then (q, Λ, Z) is immortal if and only if

T(A, q, Z, Q) = {Λ}  and  Λ ∉ L(A, q, Z, Q).

Moreover, this is decidable in polynomial time.
Proof  Suppose (q, Λ, Z) is immortal. Then Λ ∉ L(A, q, Z, Q) because, if it were, then (q, Λ, Z) ⊢* (q′, Λ, Λ), which violates the immortality of (q, Λ, Z). Also, if there were some x ∈ Σ⁺ in T(A, q, Z, Q), then

(q, x, Z) ⊢* (q′, Λ, β).   (5.6.1)

But, since at least one true input was used in Computation (5.6.1), (q, x, Z) could not have initiated an infinite Λ-loop; that is, (q, Λ, Z) is not immortal. Cf. Proposition 5.6.1(d).

Conversely, assume that (q, Λ, Z) is not immortal; then there exist i ≥ 0, q′ ∈ Q, and γ ∈ Γ* such that (q, Λ, Z) ⊢ⁱ (q′, Λ, γ), and there are no q″ ∈ Q, γ′ ∈ Γ* with (q′, Λ, γ) ⊢ (q″, Λ, γ′). This can occur in one of two cases, namely:

1  γ = Λ, or
2  γ = αZ′ and δ(q′, Λ, Z′) = ∅.

If (1) holds, then Λ ∈ L(A, q, Z, Q). If (2) holds, then, since A satisfies the first condition of Lemma 5.6.1, there exist a ∈ Σ, q‴ ∈ Q, γ‴ ∈ Γ* such that δ(q′, a, Z′) = (q‴, γ‴), and consequently

(q, a, Z) ⊢⁺ (q‴, Λ, αγ‴).

Then a ∈ T(A, q, Z, Q). In either case, one of the conditions is violated.

To analyze the running time of the algorithm, let us proceed as follows. The size of a pda for L(A, q, Z, Q) is proportional to the size of A. Assume that all our pda's never add more than one symbol to the stack at a time ((q′, α) ∈ δ(q, a, Z) implies lg(α) ≤ 2). There is no loss of generality in this assumption, because we may break a long addition of symbols down into many additions of length 1. To check whether Λ ∈ L(A, q, Z, Q), convert from the dpda to a context-free grammar for the same set. The grammar G will be of size polynomial in the size of the pda. One can test whether the grammar generates Λ in time proportional to ‖G‖.

For T(A, q, Z, Q), pass to a pda C (not necessarily deterministic) which accepts this set as N(C). The size of C is polynomial in the size of A. Again pass to a grammar for this language; this grammar will be polynomial in the size of C.
Reduce the grammar in linear time. There exists x ≠ Λ in T(A, q, Z, Q) if and only if some righthand side contains a terminal. (Cf. Problem 3.2.7.) □

Now we are in a position to detect which immortal ID's go through final states.

Lemma 5.6.4  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda. Let

C₁ = {(q, Z) | q ∈ Q, Z ∈ Γ, (q, Λ, Z) is an immortal ID, and there exists no q′ ∈ F such that (q, Λ, Z) ⊢* (q′, Λ, α) for any α ∈ Γ*}

and

C₂ = {(q, Z) | q ∈ Q, Z ∈ Γ, (q, Λ, Z) is an immortal ID, and (q, Λ, Z) ⊢* (q′, Λ, α) for some q′ ∈ F and α ∈ Γ*}.

Let (q, Λ, Z) be an immortal ID. Then

a) (q, Z) ∈ C₁ if and only if T(A, q, Z, F) = ∅;
b) (q, Z) ∈ C₂ if and only if T(A, q, Z, F) = {Λ}.

Proof  Consider T(A, q, Z, F). Since (q, Λ, Z) is immortal,

T(A, q, Z, F) = ∅ if (q, Z) ∈ C₁,  and  T(A, q, Z, F) = {Λ} if (q, Z) ∈ C₂. □

Corollary  There is an algorithm for testing membership in C₁ or C₂ in time which is polynomial in the size of A.

Proof  The proof of Lemma 5.6.3 gives a polynomial-time test for the emptiness of T(A, q, Z, F) and for membership of Λ. □

Before continuing, a different form of acceptance is required.

Definition  Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda. We define a relation ⊩ between ID's, a restriction of ⊢*, as follows: for all q, q′ ∈ Q, w, w′ ∈ Σ*, and γ, γ′ ∈ Γ*,

(q, w, γ) ⊩ (q′, w′, γ′)

if and only if

i) (q, w, γ) ⊢* (q′, w′, γ′), and
ii) δ(q′, Λ, γ′⁽¹⁾) is undefined, where γ′⁽¹⁾ denotes the top symbol of γ′ (in particular, (ii) holds if γ′ = Λ).

Intuitively, (q, w, γ) ⊩ (q′, w′, γ′) if and only if (q, w, γ) ⊢* (q′, w′, γ′) in a computation which cannot be extended by Λ-moves. That is, A has gone as far as it can in computing on (q, w, γ) without reading the first symbol of w′. The computations defined with ⊩ are sometimes called d-computations (d for deterministic). We can define "d-acceptance" by

T_d(A) = {w ∈ Σ* | (q₀, w, Z₀) ⊩ (q, Λ, α) for some q ∈ F, α ∈ Γ*}.

Note that a dpda A may have many different opportunities to accept w, since w = wΛⁱ for any i, but only one opportunity to d-accept w.
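For a dpda, the relation ⊩ is computed by simply running Λ-moves until none applies. The sketch below (Python, illustrative; it presumes that no immortal ID is reachable, so that the inner loop terminates — cf. Lemma 5.6.5) makes d-acceptance concrete:

    # d-computations of a dpda (illustrative sketch).  delta maps
    # (q, a_or_None, Z) to a single (state, push) pair; stacks are
    # strings with the top symbol last.
    def d_step(delta, state, stack):
        """Exhaust the Lambda-moves available from (state, stack)."""
        while stack and (state, None, stack[-1]) in delta:
            state, push = delta[(state, None, stack[-1])]
            stack = stack[:-1] + push
        return state, stack        # delta(state, Lambda, top) is now undefined

    def d_accepts(delta, q0, Z0, F, w):
        """True exactly when w is in Td(A)."""
        state, stack = d_step(delta, q0, Z0)
        for a in w:
            if not stack or (state, a, stack[-1]) not in delta:
                return False       # the unique computation blocks
            state, push = delta[(state, a, stack[-1])]
            stack = stack[:-1] + push
            state, stack = d_step(delta, state, stack)
        return state in F

Determinism is what makes this a simple loop: condition (2) in the definition of a dpda guarantees that a Λ-move and an input move are never both available, so each input symbol is consumed at exactly one, uniquely determined ID.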
Next we introduce a special type of dpda.

Definition  A dpda A = (Q, Σ, Γ, δ, q₀, Z₀, F) is loop-free if, for every w ∈ Σ*, there exist q ∈ Q and α ∈ Γ* such that (q₀, w, Z₀) ⊩ (q, Λ, α).

The q and α in the definition of loop-free are uniquely determined by w. Thus, a loop-free dpda always exhausts its input string and stops. Note that a loop-free dpda cannot go into an immortal ID after the input is processed. That is, suppose (q₀, w, Z₀) ⊩ (q, Λ, α). Could (q, Λ, α) be immortal; in other words, could this computation continue indefinitely? It could not, for if it did, then strings like wa, where a ∈ Σ, would violate the definition of loop-freedom, since (q₀, wa, Z₀) ⊩ (q′, Λ, β) would hold for no q′ ∈ Q and β ∈ Γ*.

Lemma 5.6.5  For any dpda A, there is a dpda A′ which is loop-free and T(A′) = T(A). Moreover, A′ may be effectively constructed from A.

Proof  We may assume, without loss of generality, that A = (Q, Σ, Γ, δ, q₀, Z₀, F) satisfies the conditions of Lemma 5.6.1. (Intuitively, A always has a next move.) Define A′ = (Q′, Σ, Γ, δ′, q₀, Z₀, F′), where Q′ = Q ∪ {f, d} and F′ = F ∪ {f}. δ′ is defined by five cases.

1  For all (q, a, Z) ∈ Q × Σ × Γ, δ′(q, a, Z) = δ(q, a, Z). (All moves on true inputs are simulated.)
2  For all q, Z such that (q, Λ, Z) is not an immortal ID, let δ′(q, Λ, Z) = δ(q, Λ, Z). (All Λ-moves that terminate are included.)
3  For all (q, Z) ∈ C₁ of Lemma 5.6.4, let δ′(q, Λ, Z) = (d, Z). (We break an infinite Λ-loop in which the stack never shortens and which never goes through a final state, and go directly to a new nonfinal state instead.)
4  For all (q, Z) ∈ C₂ of Lemma 5.6.4, let δ′(q, Λ, Z) = (f, Z). (In this case we break an infinite Λ-loop in which the stack never shortens and which involves a final state, and go directly to a new final state instead.)
5  For all (a, Z) ∈ Σ × Γ, δ′(f, a, Z) = (d, Z) and δ′(d, a, Z) = (d, Z).

By Lemmas 5.6.3 and 5.6.4, one can decide whether an ID is immortal and test membership in C₁ or C₂, and so A′ may be effectively constructed from A. It is straightforward to check that A′ is a dpda, because A is and because C₁ ∩ C₂ = ∅.

Claim 1  A′ is loop-free.

Proof  Suppose A′ is not loop-free. Then there exists w₀ ∈ Σ* such that, for all q ∈ Q′ and α ∈ Γ*, it is not the case that (q₀, w₀, Z₀) ⊩_{A′} (q, Λ, α). Since A′ satisfies the conditions of Lemma 5.6.1, this means that, for all i ≥ 0, there exist qᵢ ∈ Q′, αᵢ ∈ Γ⁺, and wᵢ, a suffix of w₀, such that

(q₀, w₀, Z₀) ⊢ⁱ_{A′} (qᵢ, wᵢ, αᵢ).

Now lg(wᵢ) ≥ 0 for all i, and lg(wᵢ) is nonincreasing, so let i₀ be chosen such that, for all i ≥ i₀, lg(wᵢ) = lg(w_{i₀}). Thus, for all i ≥ i₀, wᵢ = w_{i₀} and

(q_{i₀}, Λ, α_{i₀}) ⊢*_{A′} (qᵢ, Λ, αᵢ).

Now lg(αᵢ) > 0 for all i ≥ i₀, so there is i₁ ≥ i₀ such that, for all i ≥ i₁, lg(αᵢ) ≥ lg(α_{i₁}). Let Z ∈ Γ and α ∈ Γ* be such that αZ = α_{i₁}, and let βᵢ be such that αβᵢ = αᵢ for all i ≥ i₁; it is clear that

(q_{i₁}, Λ, Z) ⊢*_{A′} (qᵢ, Λ, βᵢ) and lg(βᵢ) ≥ 1.

Since δ′(d, Λ, Z) = δ′(f, Λ, Z) = ∅ for all Z ∈ Γ, it must be true that qᵢ ∈ Q for all i ≥ i₁. This, in turn, implies that, for all i ≥ i₁, if βᵢ = β̄Zᵢ and βᵢ₊₁ = β̄β′ for some β̄, then

(qᵢ₊₁, β′) ∈ δ(qᵢ, Λ, Zᵢ),

because the only entries in δ′ − δ involve d and f. Thus

(q_{i₁}, Λ, Z) ⊢*_A (qᵢ, Λ, βᵢ).
This implies that (q_{i₁}, Z) ∈ C₁ ∪ C₂, whence

δ′(q_{i₁}, Λ, Z) ∈ {(d, Z), (f, Z)},

which contradicts (q_{i₁}, Λ, Z) ⊢_{A′} (q_{i₁+1}, Λ, β_{i₁+1}), since q_{i₁+1} ∈ Q. Therefore A′ is loop-free after all.

Claim 2  If

(q₀, w₀, γ₀) ⊢_A (q₁, w₁, γ₁) ⊢_A ⋯ ⊢_A (q_k, w_k, γ_k),   (5.6.2)

where γ₀ = Z₀ and no immortal ID occurs in (5.6.2) (except possibly (q_k, w_k, γ_k)), then

(q₀, w₀, γ₀) ⊢_{A′} ⋯ ⊢_{A′} (q_k, w_k, γ_k).

Proof  The argument is a straightforward induction on k, because the construction of A′ includes all the moves used in (5.6.2).

Now we consider "successful" computations of A involving immortal ID's.

Claim 3  If there exists some k ≥ 0 such that

(q₀, w₀, γ₀) ⊢_A (q₁, w₁, γ₁) ⊢_A ⋯ ⊢_A (q_k, w_k, γ_k) ⊢_A ⋯,   (5.6.3)

where γ₀ = Z₀, w_k = Λ, and (q_k, w_k, γ_k) is immortal but (q_j, w_j, γ_j) is not, for j < k, then

(q₀, w₀, γ₀) ⊢_{A′} ⋯ ⊢_{A′} (q, Λ, γ),

where q = f if qᵢ ∈ F for some i ≥ k, and q = d otherwise.

Proof  By Claim 2, (q₀, w₀, γ₀) ⊢*_{A′} (q_k, w_k, γ_k). By the construction of A′,

(q_k, Λ, γ_k) ⊢_{A′} (f, Λ, γ) for some γ ∈ Γ* if q_j ∈ F for some j ≥ k,
(q_k, Λ, γ_k) ⊢_{A′} (d, Λ, γ′) for some γ′ ∈ Γ* if q_j ∉ F for all j ≥ k.

The proof is completed by pasting these A′-computations together. □

Our next task is to relate computations in A′ to those in A.

Claim 4  If there exists k ≥ 0 such that

(q₀, w₀, γ₀) ⊢_{A′} ⋯ ⊢_{A′} (q_k, w_k, γ_k),   (5.6.4)

where γ₀ = Z₀, w_k = Λ, and q_k ∈ F ∪ {f}, then either

i) (q₀, w₀, γ₀) ⊢_A ⋯ ⊢_A (q_k, w_k, γ_k) with q_k ∈ F and w_k = Λ [i.e., no immortal ID is encountered in A], or
ii) there exist q ∈ F and γ ∈ Γ* such that

(q₀, w₀, γ₀) ⊢*_A (q_{k−1}, w_{k−1}, γ_{k−1}) ⊢*_A (q, Λ, γ)

and w_{k−1} = Λ [an immortal ID in C₂ is encountered by A].

Proof of Claim 4  Consider (5.6.4). All the ID's of A′ are ID's of A unless an immortal ID was encountered by A′ when started on (q₀, w₀, γ₀). If none was encountered, then (i) holds. If there exists 0 ≤ j ≤ k such that (q_j, w_j, γ_j) is immortal in A, then clearly q_k = f and j = k − 1. But, by rule (4) of the construction, we have

(q₀, w₀, γ₀) ⊢^{k−1}_{A′} (q_{k−1}, Λ, γ_{k−1}) ⊢_{A′} (f, Λ, γ_k).

Then (q_{k−1}, γ_{k−1}⁽¹⁾) is in C₂, and so there exist q ∈ F, γ ∈ Γ* such that (q_{k−1}, Λ, γ_{k−1}) ⊢*_A (q, Λ, γ). Therefore,

(q₀, w₀, γ₀) ⊢^{k−1}_A (q_{k−1}, Λ, γ_{k−1}) ⊢*_A (q, Λ, γ).

Our last claim relates the behaviors of A and A′.

Claim 5  T(A′) = T(A).

Proof  By Claims 2, 3, and 4. □

Note that Proposition 5.2.2 follows from the preceding lemma. We can now put together all our results into the main theorem of this section.

Theorem 5.6.1  If L is a deterministic context-free language, then L̄ = Σ* − L is also a deterministic context-free language.

Proof  Suppose L = T(A), where A = (Q, Σ, Γ, δ, q₀, Z₀, F) is a dpda. By Lemma 5.6.5, there is no loss of generality in supposing that A is loop-free. Define A′ = (Q′, Σ, Γ, δ′, q₀′, Z₀, F′), where

Q′ = {(q, i) | q ∈ Q, i ∈ {0, 1, 2}};
q₀′ = (q₀, i₀), where i₀ = 0 if q₀ ∉ F and i₀ = 1 if q₀ ∈ F;
F′ = Q × {2};

and δ′ is defined by cases:

1  If (q, a, Z) ∈ Q × Σ × Γ, then δ′((q, 1), a, Z) = δ′((q, 2), a, Z) = ((q′, i), α), where δ(q, a, Z) = (q′, α) and i = 0 if q′ ∉ F, i = 1 if q′ ∈ F.
2  If (q, Z) ∈ Q × Γ and δ(q, Λ, Z) = (q′, α), then

δ′((q, 1), Λ, Z) = ((q′, 1), α) and δ′((q, 0), Λ, Z) = ((q′, i), α),

where i = 0 if q′ ∉ F and i = 1 if q′ ∈ F.

3  If δ(q, Λ, Z) = ∅, then δ′((q, 0), Λ, Z) = ((q, 2), Z).

The idea of the construction is to use three copies of the set of states Q. A state of the form (q, 0) means that A has not been in a final state since it last made a move on a true input. A state of the form (q, 1) indicates that A has entered a final state in that time interval. The states of the form (q, 2) are used only as final states. The detailed argument will proceed by consideration of a number of claims.

Claim 1  A′ is a dpda, and T_d(A′) = T(A′).

Proof  It is clear from the construction that A′ is a dpda. It is always true that T_d(A′) ⊆ T(A′). If w ∈ T(A′), then (q₀′, w, Z₀) ⊢*_{A′} ((q, 2), Λ, α) for some q ∈ Q and α ∈ Γ*. But, for all q ∈ Q, Z ∈ Γ, δ′((q, 2), Λ, Z) = ∅, so

(q₀′, w, Z₀) ⊩_{A′} ((q, 2), Λ, α),

and therefore w ∈ T_d(A′).

Claim 2  For any q, q′ ∈ Q, γ, γ′ ∈ Γ*, and i ≥ 0,

((q, 1), Λ, γ) ⊢ⁱ_{A′} ((q′, 1), Λ, γ′) if and only if (q, Λ, γ) ⊢ⁱ_A (q′, Λ, γ′).

Proof  The argument is an induction on i. The basis, i = 0, is trivial.

Induction step. Assume the result for all computations of length k. Now

((q, 1), Λ, γ) ⊢^{k+1}_{A′} ((q′, 1), Λ, γ′)   (5.6.5)

if and only if there exist Z ∈ Γ and γ″ ∈ Γ* such that

((q, 1), Λ, γ) ⊢^k_{A′} ((q″, 1), Λ, γ″Z) ⊢_{A′} ((q′, 1), Λ, γ′).   (5.6.6)

Note that we know the intermediate state (q″, 1) has second component 1, because no Λ-move can change the 1 in (q, 1) to a 0 or a 2. By the induction hypothesis applied to (5.6.6) and by the definition of δ′, this condition holds if and only if
(q, Λ, γ) ⊢^k_A (q″, Λ, γ″Z), δ(q″, Λ, Z) = (q′, α), and γ′ = γ″α.   (5.6.7)

Thus, (5.6.7) holds if and only if

(q, Λ, γ) ⊢^k_A (q″, Λ, γ″Z) ⊢_A (q′, Λ, γ′), or (q, Λ, γ) ⊢^{k+1}_A (q′, Λ, γ′).

The induction has been extended.

Claim 3  For any q, q′ ∈ Q, γ, γ′ ∈ Γ*, and i ≥ 0,

((q, 0), Λ, γ) ⊢ⁱ_{A′} ((q′, 0), Λ, γ′) if and only if (q, Λ, γ) ⊢ⁱ_A (q′, Λ, γ′).

Proof  The argument is similar to the proof of Claim 2. The difference is that we may observe that an intermediate state must have second component 0, for if it were a 1 or a 2 it could not be reconverted to a 0 in a Λ-computation.

Now we consider how the second component in the states of A′ might change.

Claim 4  For any q, q′ ∈ Q, γ, γ′ ∈ Γ*, and i ≥ 0, ((q, 0), Λ, γ) ⊢ⁱ_{A′} ((q′, 1), Λ, γ′) if and only if there exist j ≤ i, q″ ∈ F, and γ″ ∈ Γ* such that

(q, Λ, γ) ⊢^j_A (q″, Λ, γ″) ⊢*_A (q′, Λ, γ′).

Proof  The argument is another induction on i, in which the basis is omitted.

Induction step. Suppose the result holds for k, and consider

((q, 0), Λ, γ) ⊢^{k+1}_{A′} ((q′, 1), Λ, γ′).   (5.6.8)

This holds if and only if there exist q″ ∈ Q, ζ ∈ {0, 1}, γ″ ∈ Γ*, and Z ∈ Γ such that

((q, 0), Λ, γ) ⊢^k_{A′} ((q″, ζ), Λ, γ″Z) ⊢_{A′} ((q′, 1), Λ, γ′).   (5.6.9)

Note that ζ ≠ 2, since A′ cannot move from a 2-state to a 1-state under Λ. In turn, (5.6.9) holds if and only if

(q, Λ, γ) ⊢^k_A (q″, Λ, γ″Z)   (5.6.10)

and either

ζ = 0, δ(q″, Λ, Z) = (q′, α), q′ ∈ F, and γ′ = γ″α,

or

ζ = 1, δ(q″, Λ, Z) = (q′, α), γ′ = γ″α, and (q, Λ, γ) ⊢^j_A (q‴, Λ, γ‴) for some q‴ ∈ F, j ≤ k.
The equivalence of (5.6.9) and (5.6.10) uses the induction hypothesis and Claim 3. Using (5.6.10), we see that (5.6.8) holds if and only if

(q, Λ, γ) ⊢^m_A (q₁, Λ, β) ⊢*_A (q′, Λ, γ′),

where m = j if ζ = 1 and m = k + 1 otherwise. The values of q₁ and β may be obtained from (5.6.10), and q₁ ∈ F. This extends the induction and completes the proof of Claim 4.

Claim 5  For any q, q′ ∈ Q with q′ ∉ F, any γ, γ′ ∈ Γ*, and any i ≥ 0,

((q, 0), Λ, γ) ⊢^{i+1}_{A′} ((q′, 2), Λ, γ′)

if and only if

(q, Λ, γ) ⊢ⁱ_A (q′, Λ, γ′), δ(q′, Λ, (γ′)⁽¹⁾) = ∅, and for no p ∈ F, γ″ ∈ Γ* do we have (q, Λ, γ) ⊢*_A (p, Λ, γ″).

Proof  By induction on i.

Basis. i = 0. Then ((q, 0), Λ, γ) ⊢_{A′} ((q′, 2), Λ, γ′) if and only if (by rule (3)) q = q′, γ = γ′, and δ(q, Λ, γ⁽¹⁾) = ∅, which holds if and only if

(q, Λ, γ) ⊢⁰_A (q′, Λ, γ′),

and, since q′ = q ∉ F and no Λ-move applies, (q, Λ, γ) ⊢*_A (q″, Λ, γ″) holds for no q″ ∈ F.

Induction step. Assume the result for i = k, and consider

((q, 0), Λ, γ) ⊢^{k+2}_{A′} ((q′, 2), Λ, γ′).

This holds if and only if

((q, 0), Λ, γ) ⊢_{A′} ((q″, 0), Λ, γ″) ⊢^{k+1}_{A′} ((q′, 2), Λ, γ′).   (5.6.11)

The second state has second component 0 because, if it were a 2, we could not move further on Λ, and if it were a 1, it would not be possible ultimately to reach a 2-state on Λ's. Now (5.6.11) holds if and only if

q″ ∉ F, δ(q, Λ, γ⁽¹⁾) = (q″, α), γ = βγ⁽¹⁾, γ″ = βα,

and, by the induction hypothesis,

(q″, Λ, γ″) ⊢^k_A (q′, Λ, γ′), q′ ∉ F, δ(q′, Λ, (γ′)⁽¹⁾) = ∅,

and for no p ∈ F, γ‴ ∈ Γ* does (q″, Λ, γ″) ⊢*_A (p, Λ, γ‴) hold.
This, in turn, holds if and only if

(q, Λ, γ) ⊢_A (q″, Λ, γ″) ⊢^k_A (q′, Λ, γ′), δ(q′, Λ, (γ′)⁽¹⁾) = ∅,

and for no p ∈ F, γ‴ ∈ Γ* do we have (q, Λ, γ) ⊢*_A (p, Λ, γ‴). This extends the induction.

Claim 6  For any q, q′ ∈ Q, γ, γ′ ∈ Γ*, and j, j′ ∈ {0, 1, 2}, if ((q, j), Λ, γ) ⊩_{A′} ((q′, j′), Λ, γ′), then j′ ∈ {1, 2}.

Proof  If j = 2, then j′ = 2, because on the 2-states ⊩ reduces to ⊢⁰. If j = 1, then j′ = 1 under any number of Λ-moves. If j = 0, then the second component may be changed to a 1, and it would remain 1, because a 1-state cannot change to a 0-state under Λ's; or it may be changed to a 2. In the only remaining case, we would have ((q, 0), Λ, γ) ⊩_{A′} ((q′, 0), Λ, γ′). This holds if and only if

((q, 0), Λ, γ) ⊢*_{A′} ((q′, 0), Λ, γ′) and δ′((q′, 0), Λ, (γ′)⁽¹⁾) = ∅.   (5.6.12)

But if δ(q′, Λ, (γ′)⁽¹⁾) = ∅, then (5.6.12) is contradicted by part (3) of the construction; and if δ(q′, Λ, (γ′)⁽¹⁾) ≠ ∅, then (5.6.12) is contradicted by part (2) of the construction.

Claim 7  For any w ∈ Σ*, q, q′ ∈ Q, γ₀, γ ∈ Γ*, and i₀ ∈ {0, 1}, if

((q, i₀), w, γ₀) ⊩_{A′} ((q′, i), Λ, γ),

then (q, w, γ₀) ⊢*_A (q′, Λ, γ); and if q = q₀ and γ₀ = Z₀, then i = 1 holds if and only if w ∈ T(A).

Proof  The argument is an induction on lg(w).

Basis. lg(w) = 0, so w = Λ, and we have ((q, i₀), Λ, γ₀) ⊩_{A′} ((q′, i), Λ, γ). There are several possibilities.

CASE 1. If i₀ = 1, then i = 1, since a "1-state" cannot be changed to a "0-state" under Λ's, nor can it go to a "2-state" without becoming a "0-state" first. By Claim 2, (q, Λ, γ₀) ⊢*_A (q′, Λ, γ). Since q = q₀, γ₀ = Z₀, and i₀ = 1 together imply q₀ ∈ F, this in turn implies Λ ∈ T(A). This completes the proof of Case 1.
CASE 2. i₀ = 0. By Claim 6, i = 1 or i = 2. Assume i = 1. Then, for some k ≥ 0,

((q, 0), Λ, γ₀) ⊢^k_{A′} ((q′, 1), Λ, γ).

By Claim 4, there exist j ≤ k and q″ ∈ F such that

(q, Λ, γ₀) ⊢^j_A (q″, Λ, γ′) ⊢*_A (q′, Λ, γ).

Thus, if q = q₀ and γ₀ = Z₀, then w = Λ ∈ T(A).

Now assume i = 2. Then ((q, 0), Λ, γ₀) ⊩_{A′} ((q′, 2), Λ, γ) implies that there is some k ≥ 0 such that

((q, 0), Λ, γ₀) ⊢^{k+1}_{A′} ((q′, 2), Λ, γ).

Note that q ∉ F, since i₀ = 0 and q = q₀. By Claim 5,

(q, Λ, γ₀) ⊢^k_A (q′, Λ, γ),

and for no p ∈ F, γ′ ∈ Γ* do we have (q, Λ, γ₀) ⊢*_A (p, Λ, γ′). Therefore, q = q₀ and γ₀ = Z₀ imply that w = Λ ∉ T(A). The proof of the basis is now complete.

Induction step. Assume the result true for all w ∈ Σ* such that 0 ≤ lg(w) ≤ k. Let

((q, i₀), w, γ₀) ⊩_{A′} ((q′, i), Λ, γ),   (5.6.13)

where lg(w) = k + 1 ≥ 1. Then we factor this computation as

((q, i₀), w, γ₀) ⊩_{A′} ((q₁, i₁), a, γ₁)   (5.6.14)

and

((q₁, i₁), a, γ₁) ⊢_{A′} ((q₂, i₂), Λ, γ₂) ⊢*_{A′} ((q′, i), Λ, γ),   (5.6.15)

where w = w′a with w′ ∈ Σ*, a ∈ Σ, and δ′((q′, i), Λ, γ⁽¹⁾) = ∅, because the original computation (5.6.13) involved ⊩.

Consider computation (5.6.14). We have ((q, i₀), w′, γ₀) ⊢*_{A′} ((q₁, i₁), Λ, γ₁), where γ₁ = γ₁′Z with γ₁′ ∈ Γ* and Z ∈ Γ. Furthermore, δ′((q₁, i₁), a, Z) ≠ ∅, so that δ′((q₁, i₁), Λ, Z) = ∅, since A′ is deterministic. Then

((q, i₀), w′, γ₀) ⊩_{A′} ((q₁, i₁), Λ, γ₁),

where lg(w′) ≤ k, so the induction hypothesis applies, to yield

(q, w′, γ₀) ⊢*_A (q₁, Λ, γ₁)   (5.6.16)

and, when q = q₀ and γ₀ = Z₀: w′ ∈ T(A) if and only if i₁ = 1.
5.6 DETERMINISTIC LANGUAGES CLOSED UNDER COMPLEMENTATION 177 Now, we consider the computation of (5.6.15). By the construction of A', fai>«>7i) fj fa2>A,72). (5.6.17) Since (fo2.»2).A,72) Ifr (fo'.O. A, 7), (5-6.18) which follows from (5.6.15), and by the construction of A', we have that i2 = 1 if and only if q2 G F. By the induction hypothesis applied to (5.6.18), we have: fa2,A,72) £fa\A,7). (5.6.19) But (5.6.16) and (5.6.17), together with (5.6.19), yield fa, w, 7o) tj fa', A, 7). It follows that q = qo,lo = Zo, and * — 1 hold if and only if w G T(A). D Claim 8 For any w G 2*, ^ GQ, and 7 G T*, fao.w, Z0) ffj fa, A, 7) implies (fao,'o),w, Z0) Hj, ((?,/), A, 7) for some /. /¥oo/ The argument is an induction on lg(w). Basis, lg(vv) = 0. From the definition of A', /0 = 0 or /0 = 1. By Claim 6, / = 1 or /= 2. Now, every possible case that can arise is dealt with by Claims 2, 4, or 5. Induction step. Assume the result for all w G 2 * such that 0 < lg(vv) < k. If (qo, w,Z0) tA fa, A, 7), where lg(vv) = k + 1, then (q0,w,Z0) f£ fa,,tf, 7i) (j fa2,A,72) (£ fa, A, 7) for some w' G 2*, a G 2, so that w = w'a, some qx, q2 G 2, and some yi, 72 G T . This implies (fao.''o).w',Z0) H^. (fai.ii), A, 71) and z'i G {1, 2}, by the induction hypothesis. Hence (faw'oXw.Zo) H^. (fai,/i),a,7i) lj< (fa2.'2).A,72) with/2 = 0 or 1, by the construction of A'. Moreover, (fo. h), A,72) ifr (fa.0,A>7) by Claims 2,4, 5, or 6. Combining the subcomputations, we have: (fao.'o).w,Z0) Hjt (fa,/), A, 7).
178 PUSHDOWN AUTOMATA We now arrive at our last proposition. Claim 9 7X4') =2* -T(A) Proof Since A is loop-free, for any w£X , there exist q and 7 such that {q0,w,Z0)VrA (q,A,y) Hence, by Claim 8 ) |£ (0?,/), A, 7) By Claim 7,/=2 if and only if w$ T(A). Thus w<$ T(A) if and only if w G Td(A') = T(A'). Therefore, 7X4') = 2*-7X4). D PROBLEMS Let A = (Q, 2, T, 5, q0, ZQ, F) be a dpda and let q E Q, Qx c Q, for any aG r*. Define: L(A,q, a, Qx) = {wG 2*| (q, w, a) £- (<?', A, A) for some q G Qx\, Ld(A,q,a, Qx) = {wG 2*| (q, w, a) W1- {q, A, A) for some q'G Qx}; Td(A,q,a, Qt) = {wG 2*| (q, w, a) [^ (q', A, |3) for some q G Qi,|3e T*}. 1 Show that, if A is a dpda, then £d04, q, a, Qi) and Td(A, q, a, Qx) are context-free languages. Show that one can decide whether they are nonempty and whether a given string belongs to them in time that is polynomial in the size of A. 2 Let A = (Q, 2, T, 6, q0, Z0, F) be a dpda. Show that A is loop-free if and only if 2* = (Td(A,q0,Z0,F) U Td(.A,qQ,ZQ,Q-F)). 3 Let A = (Q, 2, T, 5, q0, Z0, F) be a loop-free dpda. Is it true or false that Td(A, q0, Z0, Q-F) = 2* - 7X4)? Justify your answer. 4 Let A = (Q, 2, T, 5, q0, F) be a dpda. A pair (q, a) with <? G Q, aG T*, is accessible if there exists w G 2 such that (q0, w,Z0) f— (<?, A, a). Is there an algorithm to decide which configurations are accessible? Suppose A were nondeterministic. Would the result still hold? *5 Let A = (Q, 2, r, 5, q0, Z0, 0) be a dpda. Let r = max{l,lg(7)|5(Q, a, Z) = (<?', 7)} and let ( (/•ioi|r|-1 )/(/•-1) if r > 1, /V = llQliri if r= 1.
5.7 FINITE-TURN PDA'S 179 Show that (q, A, Z) is an immortal ID if and only if there exists j> N such that (q,A,Z) f(q',h,y) for some q' G Q, y G T*. 6 Show that in a dpda, A-moves can be essential only in computations that cause the stack to decrease in height. 7 Show that it is never necessary for a dpda to change states in computations where the stack is increasing in height. 8 In Lemma 5.6.5, is it true or false that: a) Td(A') = Td(A)1 b) T(A') = Td(A)l c) T(A') = Td(A')1 9 Let L £ 2 be a deterministic context-free language and let c ^ 2. Construct a deterministic loop-free dpda A such that L(A) = idU) = rU) = Td(A) = £c. 10 Show that any deterministic context-free language may be accepted by a dpda A such that for each input string w of length n, the computation of A on w takes time proportional to n. 11 Define a pda ,4 to be a deterministic (iterated) counter if A is deterministic and is a(n) (iterated) counter. (Counters and iterated counters are defined in the exercises in Section 5.3.) Show that the family of deterministic iterated counter languages is closed under complementation. Is the same result true for counter languages? 5.7 FINITE-TURN PDA'S We have seen that restricting pda's to be deterministic has led to a useful family of languages. Another restriction will be considered here and some simple implications will be derived. In later sections when more tools are available, we shall return to this family and characterize it in several simple and elegant ways. To illustrate the restriction we have in mind, consider the set L = {a"b"\ n>0}{c}{anbn\ « > 0>. A pda accepting this set might stack a's and pop on ft's and then repeat this strategy for the second half of the set. A graph of the pushdown would look like Fig. 5.5. We shall say that, in this computation, there were 3 "turns" or places where the length of the pushdown reversed. The restriction we shall study is to bound the number of turns in a computation. Formalism is needed if we are to be precise about this restriction. Pushdown FIG. 5.5 Turns -^•Time
180 PUSHDOWN AUTOMATA Definition Let A = (Q, 2, T, 6, q0, Z0, 0) be a pda. A sweep is a computation fao, w, Z0 ) Pfo.A.A) for any w£E and q E Q. Thus a sweep is any computation that empties the pushdown. Next we classify the moves of the pda. Definition Let A = (Q, 2, T, 5, q0, Z0, 0) be a pda and let (q, 0) E 8(q,a,Z) for some g G Q, a E 2 U {A}, and ZEY.k move (q, ax, aZ) {~(q', x, a/3) is called Nondecreasing if lg (|3) > 1, Nonincreasing if lg (|3) < 1, Increasing if lg (|3) > 1, or Decreasing if |3 = A. Next the notion of "increasing" is applied to sweeps. Definition Let (q0, x0, y0) \ h(qk, xk, yk), where y0 = Z0, yk = A, and xk = A, be a sweep. The concept of an increasing (decreasing) sweep is defined recursively as follows: i) The length of(q0, x0, y0) is defined to be increasing; ii) Inductively, if the length at (qu xt, yt) is increasing and if the move (47> xi> Ji) \~(qu xi+i> 7i+i) is nondecreasing (decreasing), then the length at (^i+1, xi+l, yi+l) is said to be increasing (decreasing); iii) If the length at (qh xv yt) is decreasing and the move (q[t xit 7,) (— (qi+i, xi+1, 71+1) is nonincreasing (increasing), then the length at (^i+1> xi+l, 7,-+1) is said to be decreasing (increasing). Now, a "turn" can be precisely defined. Definition Let A be a pda and suppose we have (<7;. xh 7,0 h (qi+i, xi+l, y!+l). If the length at (q{, xh 7,) is increasing (decreasing) while the length at (^,-+1, xi+i, 7i+1) is decreasing (increasing), then the length has had a turn at (qt, xh yt). Definition A sweep is a (2k — l)-turn sweep if it has exactly (2k — 1) turns. A pda is a (2k — \)-turn pda if every sweep has at most 2k — 1 turns. A pda A is said to be a finite turn if there is some k > 1 so that A is a (2k — 1) turn pda. Let us single out the sweeps with a bounded number of turns. Definition Let A be a pda and letk> 1. Define N2k~i (A) = {xG 2*I (q0,x, Z0) \— (q, A, A) for some q E Q is a (2/ — 1) sweep for some / < k}.
5.7 FINITE-TURN PDA'S 181 Our first little result is to show that N2k-i(A) is always a context-free language. Lemma 5.7.1 For each pda A and any k > 1, there is a (2k — l)-turn pda A' such that N(A') = N2k_M). Proof Let A = (Q, 2, T, 5, q0, Z0, F) be a pda. We shall construct a pda A' with 2k copies of the states of A, which will simulate A and keep track of the (2fc-l)-turn sweeps. Define /T = (Q', Z, T, 5',fa0, 0, 0), where Q'=Qx {/|1< / < 2k}. S' is given by cases: CASE 1 ((q, i), a) E S'((q, i), a, Z) if and only if (q, a) E 8(q, a, Z) and either / is odd and lg(a) > 1 or / is even and lg (a) < 1. (Thus a move from (q, /') to (q, i) is nondecreasing if /' is odd and nonincreasing if i is even.) CASE 2 ((q', 2i), A) £ S'((q, 2/ - 1), a, Z) if and only if (q',A)E8(q,a,Z). (Moves from a state (g, 2/ — 1) to (q', 2i) are decreasing moves.) CASE 3 ((q, 2/ + 1), a) E 5 \(q, 2i), a, Z) if and only if (q, a) £ 5 (q, a, Z) and lg (a) > 2. (Moves from (g, 2/') to fa', 2/ + 1) are increasing moves.) It is now a straightforward matter to verify that each turn in a sweep of A' corresponds to a move from a state of the form (q, i) to one of the form (q', i + 1). Therefore, A' is a (2k — l)-turn pda. Note that any (2/— 1) sweep of A (where 1 < /'<£) corresponds to a sweep of A'. Since the converse of this is also true, it follows that yV2fe_,C4) = N(A'). □ Corollary N2k-\ (A) is a context-free language for each pda A. One of our ultimate goals will be to characterize the finite-turn languages grammatically. The following definition will prove to be equivalent to these languages but it will take some effort before the characterization is achieved. Definition A context-free grammar G = (V, 2, P, S) is ultralinear if there exists a finite number of (possibly empty) pairwise disjoint sets X0,.. . ,Xn such that n N = V-T, = Ul; 1 = 0
182 PUSHDOWN AUTOMATA and for each Xt and each A £1,, either 1 A -»■ w is \nP where w£pUI0U'"U *,-_,)*, or 2 ^ -+uBv is inP withw, i)£ 2* and B G AV Note that every linear language is ultralinear by taking X0 = 0 and A^ = A'. We are now ready to prove half of our characterization theorem. Theorem 5.7.1 If A is a finite-turn pda, then N(A) is an ultralinear language. Proof Let A = (2, 2, T, 5, ^0, Z0, 0) be a finite-turn pda. There is no loss of generality in assuming that .4 satisfies the following properties: 0 Q = Qi u---uQ2fc with<jr0eGi- ii) Every move from a state of (?2,-i is either a nondecreasing move to a state of (?2i-i or a decreasing move to a state of (?2j- iii) Every move from a state of Q21 is either a nonincreasing move to a state of Qii or an increasing move to a state of Qi^x • iv) fa', a) G 5fa, a, Z) for any q G g, a G 2 U {A}, and Z G T implies lg(a) <2. (i), (ii) and (iii) follow from Lemma 5.7.1. Property (iv) may be assumed because we could add extra states to achieve it if necessary. Such additions would result in a new pda A' such that N(A') = N(A), and if A is a (2k — l)-turn pda, then so is A'. Now for any state q G Q, define v(q) — i, where q G Qr Note that v(q0) = 1 and i^fa) < 2k for all 4 G 0. Define G - (V, 2, P, 5), where V = 2 U {5} U U (2; x T x 2; x (r U {A})). 1 «!</«2fe and P is given by the following productions: For all q, q , /,pGg;a£2U {A}; Y, Z, Z' G T; X G T U {A}. Then, 1 5^fa0,Z0,^', A); 2 fa.Z,?', A)^a iffa', A)G5fa,a,Z); 3 fa, Z, 4", Af) -> afa', F, 4", AT) if fa', F) G 5 fa, a, Z); 4 fa", X ?', Y) -» fa", X 4, Z)a if fa', r) e 8 fa, a, Z); 5 fa, Z, 4", Z') -» afa', F, 4", A) if fa', Z'Y) G 5 fa, a, Z); 6 fa", Z', <?', A) -> fa", Z', 4, Z)a if fa', A) G 5 fa, a, Z); 7 fa, Z, 4', AO -> fa, Z, p, F)(p, F, q, X) for all p6g such that v(q) < v(p)<v(q')
5.7 FINITE-TURN PDA'S 183 The basic idea of the construction is embodied in the following observations and the claim. The first rule indicates a goal of obtaining all words in N(A). Rule 2 is a terminal rule, which corresponds to a decreasing move of the pda and will only be applicable when v(q) is odd and v(q') — v(q) = 1. Productions (3) and (4) correspond to length-preserving moves of the pda. Production (5) corresponds to an increasing move of the pda. It can occur only when v(q) is odd and v(q') <v(q"). Similarly, (6) represents a decreasing move of the pda, which can only apply when v(q') is even and v(q") < v(q). Rule 7 corresponds to a sequence of moves of the pda such that there is a state p such that v{q)<v(p)<v(q), at which the length of the pushdown is exactly one. The following claim summarizes the correspondence between the variables and the pda's computation. Claim For each q, q G Q, Z G T, and Y G V U {A}, (q,Z,q', Y)^w wG 2* if and only if (q,w,Z) \^(q',A,Y). Proof The argument involves an induction and case analysis in each direction and is omitted. It follows from the claim that L(G) = N(A). To complete the proof, we need only verify that G is ultralinear. Define X[ = {(q, Z, q',Y)\ v(q) - v(q) = i} for 1 < / < 2k - 1 and X2H = {SI It is now easy to verify that G is ultralinear. □ Corollary If A is a one-turn pda, then N(A) is a linear language. Proof If k = 1, then the grammar G uses no productions of type (7). Thus G is in linear normal form (Problem 2.5.1). □ PROBLEMS 1 Complete the proof of the claim in Theorem 5.7.1. A context-free grammar G = (V, 2, P, S) is said to be nonterminal bounded if there exists an integer k > 1 such that, for any derivation A =^a, aG V*, the number of occurrences of variables in a is at most k. A language is called nonterminal bounded if it is generated by a nonterminal bounded grammar. If G is such a grammar and a G V define p(a), the rank of a, to be the largest integer r such that there is a word |3 G V with r occurrences of variables such that *
184 PUSHDOWN AUTOMATA 2 Prove that: a) For w e 2*, p(w) = 0. b) pC4, ■■■A„)= 2 p(^i) for >!!,.. . ,^„ £ V,n>0. 1 = 1 c) p(/4)> 1 for each A G N. d) If A is a variable of rank r, there exists a word a £ F such that: i) A 4- a ii) a has /■ occurrences of variables, and iii) each variable in a has rank 1. e) Every nonterminal bounded grammar contains variables of rank 1. 3 Prove that a context-free grammar G is ultralinear if and only if G is nonterminal bounded. 4 Is the set L = {{anbn \ n > 0}c)* nonterminal bounded or not? Can you prove your answer? 5 Show that if Ll and L2 are finite-turn languages, then so are £i U L2 and LtL2- 6 Show that every regular set is a finite-turn language. 5.8 HISTORICAL SURVEY The close connection between context-free languages and pushdown automata was established by Chomsky and Schiitzenberger. Cf. Chomsky [1962] and Schiitzenberger [1963]. Another early reference is Evey [1963]. Theorem 5.6.1 is due to Haines [1965]. The theory of deterministic context-free languages originated in Haines [1965] and Ginsburg and Greibach [1966a]. Related results are in Schiitzenberger [1963]. The theory of finite-turn pda's is due to Ginsburg and Spanier [1966].
six The Iteration Theorem, Nonclosure and Closure Results 6.1 INTRODUCTION This chapter begins with an iteration theorem for context-free languages. Use of this result allows us to give direct proofs that certain sets are not context-free. The theorem also shows that, over the one-letter terminal alphabet, there is no difference between context-free and regular sets. Next, the closure properties of the context-free languages are explored. One particularly important result is that context-free languages are closed under "sequential transducer mappings." Section 6.6 digresses by giving a closure result whose proof is not constructive. Section 6.7 discusses the relationship of programming languages to context-free languages and indicates the programming language features that are incompatible with the context-free languages. A theorem is established in Section 6.8 that enables us to prove that there are "nonlinear" context-free languages. The proof is rather a nice blend of the different techniques used in this chapter. A natural map of context-free languages into n-ary relations on natural numbers is given in Section 6.9, and Parikh's Theorem is proved. 6.2 THE ITERATION THEOREM We will now establish one of the most important theorems of context-free language theory. It is a necessary condition on the set of strings of a language. The notation of Section 1.6 will be needed in our development. 185
186 THE ITERATION THEOREM. NONCLOSURE AND CLOSURE RESULTS Definition Let wG2*. Any sequence \p = (vu . . . , vn)£ (2*)" such that w = vt---v„ is called a factorization of w. Any integer /, 1 < / < lg(vv) is called a position of w. Let ^ be a set of positions in w. Any factorization \p induces a "partition" of A, which we write as Kfr= (Ku...,Kn), where for each /, 1 < / < n, K, = {k£K\ lgfa, . . . , q_,) < * < lg(»,,. . . , t*)}. Note that some A,- may be empty, so this is not a true partition. Examples a) Let w = ai ■ ■ • a„,at€ 2, and A = {1, . . . ,«}. i) If we take \p = (a1,a2, ■ ■ . ,an), then A,V = {A-,} where A", = {«}. ii) If we take <p = (^1^2'"' anX then A/^ = A, = A. b) Let w = ff" 6P <■",</>= (a", 6p,c") and A = {p + 1,... ,2p};then A, = A3 = 0, A2 = A. Notation. To mark a set of positions we will underline them. Thus, in (b) above, the set of positions would be written as: ap .b". c". We can now state and prove an iteration theorem for context-free languages. The theorem we prove is stronger than the "pumping lemma" ("xuwvy theorem") found in some standard texts. The additional power of the iteration theorem stems from the use of positions. This forces the factorization of a string to be restricted and cuts down on the number of cases that can arise. Theorem 6.2.1 (The Iteration Theorem) Let L = L(G) be a context-free language and G = (V,~L,P,S) be a grammar for L. There exists a number p(G) such that for each w£L and any set A of positions in w, if |A| >p(G), then there is a factorization <fi = (vu ... ,vs) of w such that: i) There exists A&N such that S ^> ViAvs A => v2Av4 A => v3
6.2 THE ITERATION THEOREM 187 if) For each^XD,^ v%v3vivs€-L. iii) lfK/*=lKl,...,Ks},tben a) eitherKUK2,K^<}> or K3,KA,Ks=£ 0;and b) |tf2Utf3Utf4|<p(G). Part (iii) means, in part, that K3¥=0 and either K2 or ^f4=£0. Thus either z>2 =£ A or z>4 =£ A. Condition (Hi) (b), |tf2Utf3Utf4|<p(G), is necessary in order to prohibit trivial factorizations in which w = v2v3v4. Proof Let r = max {2, lg(ct)| ^ -*■ a is in P}, and define v = |7V|, p(G) = r2"+3. Observe that one can actually compute p(G) and that it depends on the grammar. This will be needed later when this theorem arises in certain decidability questions. Let w be in L and let K be any set of positions in w such that \K\ >p(G). Let T be a generation tree for S=>w and let _y1?... ,yigiW) be the sequence of non- null terminal nodes in T. That is, using the notation from Section 1.6, yi\lyi+i for l </<lg(w), w = XCi)---XOig("'>)> and Now define two sets: D = {jc in r| x ry-t for some / G /T}, 5 = {^inr| xrx1andxrx2 for somexl,x2^D,xl ¥=x2}. Intuitively, a node is in D if there is a path from the node to a position. A node is in B if it has at least two immediate descendants, each of which has a path to a position. (Nodes inZ) (B) will be referred to asD- (2?)-nodes.) Claim 1 If every path in T, the tree for w, contains </ 5-nodes, then w has < r' positions in K. Proof The maximum number of positions would occur if each 5-node had maximum "fan out" and each of the successors of a 5-node was a Z)-node. Since the maximum "fan out" of any node in the tree is r, the maximum number of positions in w is /■'. Now choose a path s = (sq, . .. , st) in T such that: a) s0 is the root of T; b) st is a leaf; c) s contains the maximum number of 5-nodes taken over all such paths.
188 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Claim 2 s contains at least (2v + 3) fi-nodes. Proof Assume s has < (2v + 2) 5-nodes. Then, since s has a maximal number of 5-nodes, every path in T from s0 to a leaf must have < (2v + 2) fi-nodes. By Claim 1, w would have 02"+2 positions in K; but w was chosen to have at least r20*3 positions in K. Hence s must have at least (2v + 3) 5-nodes. □ Now define a set Cs of nodes on s by. * ii) For each* &Cs,x\~ y,y &B imply_y S Cs; iii) \Cs\ = 2v+3. Intuitively, Cs contains the "lowest" (2v + 3) fi-nodes on s. Claim 3 If Cs = CL U CR for any sets CL and CR, then either \C^\>v+2 or \CR\>v+2. Proof Assume that this is not the case. Then I Ca\ < | CL| + | CR| < 2v+ 2 < 2v + 3. But this is a contradiction. Hence, either | CR | or | CLI > v + 2. We now define two sets CL = {xeCglxr^ and y\lst for some y&D] CR = {x€Ca\xFy and st\ly for somey €Z)}. Intuitively, *GCL (respectively, CR) if it is in Cs and has an immediate descendant y which has a descendant that is a position strictly to the left (right) of st. Example See Fig. 6.1. Note. CL and CR are not necessarily disjoint. Claim 4 CS=CL UCR. Proof a) CL U CR 9 Cs, by definition of CL and CR. and D = fi-nodes x = positions in K Cs = \x0,xx) CL={xx) CR= {x0,X\) FIG. 6.1
6.2 THE ITERATION THEOREM 189 b) On the other hand, if x 6CS, then x has at least two descendants which are positions in K, since x is a B-node. One of these might be st, but at least one must lie either to the left or to the right of st. Hence, x is an element of either CL or CR. Thus C £ CL U CR, and therefore Cs = Cl U Cr , which completes the proof of Claim 4. By Claim 3 there are now two dual cases to consider, namely \C,\>v+2 \CR\>v+2. We will assume that \C,\>v+7. The argument in the other case completely parallels this one and is omitted. Thus we have Cl — v*i, • ■ • ,xm\, where + xt r *,-+, for 1 < / < m. Now, since m> (v+ 2), there exists/, k, and there is ,4 S TV such that \<j<k<m X(xj) = X(xk) = A. and But this gives S = A -- A -- This is illustrated in Fig. 6.2. = position □ = CL-node ViAvs for some Vi,vs£ 2*, v2Av4 for some v7, v4 S 2*, v3 for some v3G 2*. FIG. 6.2
190 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Now, by iterating the second sequence of productions q^O times, we get S => vxAvs =*■ v^Avlvs ^ vxv\vzv%v^L. SincexuXj,xk 6CL, there existyhyj, and_yfe such that i,j,k&K,and * * * Xiry,, XjFyj, xkFyk, and yt\ly}\lym\lst (where we have also used the fact that Tis a tree). Letting y?be the factorization <fi = (Vi,V2,V3,Va,Vs) and Kh= {Kl3...,Ks}, we have /€*,, JGK2, k&K3. Thus, KuK2,K3^0. Finally, since s was chosen to have the maximum number of S-nodes and that part of the path starting at xx has exactly (2v + 3) fi-nodes, no other path starting at xx can have more than (2v+ 3) fi-nodes. Further, any path starting at Xj can be extended to be a path starting at xu and hence must have <(2z> + 3) B-nodes. Thus, by Claim 1, the subtree rooted at Xj can have at most /•w+3 = p(G) = p positions in K. Since this subtree derives v2v3v4, we must have \K2UK3LlK4\<p. □ Corollary {The Pumping Lemma) Let L be a context-free language generated by G = (V, 2, />, S). There exists a number p(G) such that, if w G L and lg(w) >p(G), then there exists a factorization P = (Vl,V2,V3,V4,Vs) Ofw such that: i) There exists ,4 GN so that S ^ ViAvs A => z>2^4 ii) For each q>0,vlv^v3v^vsG L; iii) vu v2, v3¥=Aot v3, v4, vs¥=A and \g(v2v3vA)<p(G). Proof Choose every position to be distinguished. □ A careful study of these proofs reveals that we never used the fact that the leaves were labeled by 2. (We did require that positions not be labeled by A.) Therefore, the result is valid for sentential forms.
6.2 THE ITERATION THEOREM 191 Theorem 6.2.2 (The Iteration Theorem for Sentential Forms) Let L(G) £ S* be a context-free language generated by G = (V, I,,P,S). There exists a number p(G) such that, for any sentential form a of G and any set K of positions in a, if | K |> p(G), then there is a factorization V = (01,02,03,04, Us) °fw such that: i) There exists ,4 S^Vsuch that: S 4 ft^ A 4 02,404 ^ ^03 ii) For each g >0, 010203040s is a sentenial form of G; hi) If */<* = {*,,..., Ks}, then: a) either £ l, tf2, £3 =£ 0 or £3, tf4, £s =£ 0; and b) |/:2u/:3u/:4i<p(G). Using the iteration theorem, we can now show that there exist languages that are not context-free. Theorem 6.2.3 TV = {a"b"a" | n > 1} is not context-free. Proof Suppose TV is context-free. Then, from the iteration theorem, there exists a p such that conclusions of the theorem hold. Let w = apj£ a" where the underlined bs are distinguished. That is, K = {p+l,...,2p}. Then there exists a factorization V = (Pi, v2, ■ ■ . , 0s), and we may assume, without loss of generality, from the symmetry of w that Kx, K2,K3¥=<}. Thus, vx = apb', i>0 v2 = bJ, />0 z>3 = bkv, k>0, veb*a*. Now consider w2 = ^^2030405 = a"bv' for some z/ S b+a* because w2STV;
192 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS But lg(w2) = \g(apbpap) = 3p. On the other hand, lg(w2) = lg(w) + lg(w2^4) > 3P + l, which is a contradiction. Hence TV is not context-free. □ It is of some interest to compare the pumping lemma with the iteration theorem, As we shall later see, the use of positions greatly restricts the number of possible factorizations of a string and reduces the labor involved in showing a set to be noncontext-free. We shall now give an example of a noncontext-free language that can be shown to be noncontext-free by the iteration theorem but not by the pumping lemma. Theorem 6.2.4 Let L = {a*bc}U {a"banca"\ p prime, n>0}. L is not a context-free language, but the conclusions of the pumping lemma hold for L. Proof The iteration theorem may be used to show that L is not context-free. Let p be any prime larger than the constant p0 of the iteration theorem. Consider „ „ „ w = a^ bapcap, where the underlined a's are distinguished. All possible factorizations lead to a contradiction. We omit the details because a more elegant proof that L is not context-free will be given in Section 6.5. Now let us show that the pumping lemma is too weak to show that L is not context-free. CASE 1. Let w = ambc, where lg(w)>p0, the constant of the pumping lemma. Then we have a factorization: », = a"1-1 v2 — a v3 = be vA = A vs = A Clearly, VxV%v3vivs&L for all g 5*0. CASE 2. Let w = apba"ca", where lg(w) >p0,n> 1. Take the factorization u, = apba"-1 v2 = a v3 = c vA = a vs = a"-1 Then ViV^v3vivs G L for all q > 0. Thus the pumping lemma is satisfied. □
6.2 THE ITERATION THEOREM 193 PROBLEMS 1 Use the iteration theorem to show that the set L given in Theorem 6.2.4 is not context-free. 2 If you try to prove an iteration theorem for regular sets using right linear grammars, what is your strongest result? 3 Prove the strongest iteration theorem you can get for the family of linear context-free languages by specializing the argument in this section. 4 In Problem 8 of Section 2.5, it was shown that every linear context-free language was representable as where <pu ip2 are homomorphisms and R is regular. Can you use this characterization to obtain an iteration theorem for the linear context-free languages? 5 Give a pda proof of the iteration theorem. 6 Using the result of Problem 5, can you find an iteration theorem for deterministic context-free languages? 7 By the use of the iteration theorem, show that the set L given below is noncontext-free. Show that this result cannot be obtained with the pumping lemma. L = {aib'ck\ i^j^k^i}. 8 Is the set L given below context-free or not? L = {albJck | i,j>\,k> max{/, /}}. 9 Let G = (V, 2,P, S) be a reduced context-free grammar. A variable A 6Af is said to be recursive HA =±>uAvfox some u,vE2,*, where uv¥= A. Show that L(G) is infinite if and only if G has a recursive variable. 10 Are the following sets context-free or not? Prove your answer. L, = {aWbJ\ lj>\}\ L2 = {xx\ xe{a,b}*}; L3 = {wicw^l «i, «2e l{°> 1}*, c^{0, l}, and «i<«2 where «,• denotes the binary number represented by «,- for i = 1, 2}; L4 = {anbna'\ n<j<2n,n>\}. *11 Find an iteration theorem for the family of counter languages. 12 Recall from problems 13 through 16 in Section 2.2 that, if 2 = {0, l}, the map x ^x is a one-to-one map from {l}£* onto the positive integers. Prove the following result. Proposition Let 2 = {0, 1}, u,€{l}S*, v2, v3, vA, fls€E*, be such that 21e("2) ^ j moc[ p an(j 21b("4> ^ i moci p^ where p is an odd prime. Then V\V2~lV3,V%~lVs = I^|^3^5 mod p. Hint: Recall Problem 14 of Section 2.2.
194 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS 13 Prove the following result. Proposition Let 2 = {0, 1}, z;, G{l}2*, v2, v3, v4, vs be in 2*. Let k > 0 be such that ViV2v3v4vs = p is an odd prime, 2lg("2> ^ 1 mod p and 2lg("4> ^ 1 mod p. Then , , , _ V\v\v\ v3v%v4 vs = V\v\v$v*Vs = 0 mod p. 14 Let 2 = {0, 1}. If & is an infinite subset of {l}2* which represents primes (i.e., w£# implies w is a prime), then ^ is not context-free. 6.3* CONTEXT-FREE LANGUAGES OVER A ONE-LETTER ALPHABET The following theorem characterizes the context-free language over a one- letter alphabet. Theorem 6.3.1 Every context-free language L <L {a}* is regular. Proof Let L 9 {a}* be a context-free language generated by some grammar G, and let N0 = p(G) be the constant given by the iteration theorem. Then, for each w e z,n2yv»2+, there exists a factorization V = (vuv2,v3,v4,vs) such that a) w = ViV2v3v4vs, b) \g(v2v3v4)<N0, c) v2v4 ¥= A. From (b) and (c) we have v2v4 = a' for some/, 1 </<jV0. Further, since 2 = {a}, any two words commute: Wk = vxv\v3vh4v% = ViV3vs(aJ)k GZ,, 0<&. Claim 1 Let p be the least common multiple of {1,... ,N0}. Then, for all w G Z, n 2N» 2+, we have that w(ap)* 9 Z,. ZV00/ Each word in w(a")* is of the form w(ap)' = ViV2v3v4vsam = ViV3v5aJap' = ViV3vsaJ+pt. Now, since p is the least common multiple of {1,. . . ,jV0} and 1 </<jV0, / divides p. Therefore, / + p/ = /(l + (p//)0, where the second factor is an integer. Thus w(al7" = ViVjVsa^^^ = ViV3vs(v2VAy*wiii, which is an element of L by the iteration theorem.
6.3* CONTEXT-FREE LANGUAGES OVER A ONE-LETTER ALPHABET 195 Now define A, = aN°+i(ap)* nL, Ki<p. Note that A; may be empty for some /. Let yt be a word of minimal length in At if Aj ¥= 0. Otherwise, let_y,- be undefined. Claim2 L = Z, n U Z') Ju U y^a*)* I. a) in |Ur|u U^TCi. Since: i)in(Ul')Ci; ii) ^,- GL and, by Claim 1 ,.y,(ap)* 9 Z,. b) To show that letx be any element ofZ,. There are two cases. CASE 1. lflg(x)^N0, then jcGLnlU 2'j. CASE 2. Iflg(x)>7V0,then lg(x) = N0 + k for some fc. Let k = qp + r, 1 <r<p. Then x = aN"+r(a"f. Thus i4r = (aN°+r(ap)* nL)*Q, and^r has a minimal element^. Let JV = aN»+r(ap)s for some s>0. This gives x = jvW*, where q — sX) since j>r was minimal, which completes the proof of Claim 2. Therefore Z, is regular, since it can be written as a regular expression. □ Recalling the definition of an ultimately periodic set from Section 2.2, we can prove the following theorem.
196 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Theorem 6.3.2 For L 9 E*, where | E | = l, the following conditions are equivalent: 1 L is a context-free language. 2 L is a regular set. 3 IL = {i\ a' G L} is an ultimately periodic set. /Voo/ (1) implies (2) by Theorem 6.3.1. (2) implies (3) by Theorem 2.2.5. If IL is ultimately periodic, then L is regular, by Theorem 2.2.5, and context-free by the Corollary to Theorem 2.5.1. □ PROBLEMS 1 Show that, if L 9 E* is any context-free language, then {lgOOl xGL} is an ultimately periodic set of natural numbers. 2 Show that L = {af(-n)\ n> 1}, where f(n) is a polynomial of degree at least 2, is not an ultimately periodic set. 3 If L 9 a* ■ ■ ■ a^, where E = {at, . . . , a„), define I(L) = {(/!, ...,in)\a\<-ai?eL}. Show that, if Z, is a context-free language, then I(L) may be expressed as a finite union of sets of the form: {c + ilpl + • • • + impm hi, ... , im > 0}. The symbols c, Pi,. . ., pm are n-tuples of nonnegative integers and none of the p,- contain more than two nonzero components. 4 Let /be a function on the natural numbers and i9S*. Define Uf) = {x&T,*\xy&L for some y G E* and lgOO = /(lg(x))}. Here / is said to be a context-free preserving function if, for each Z, that is context- free, /.(/) is also context-free. Show that if/is a context-free preserving function, then: a) for each r > 0, {i'|/(0 = r} is ultimately periodic; b) for each d> 1 and for each r<c/, {*1(/(0)mOdd = r) is ultimately periodic; and c) for each k > 1, {;'| /(i) < fc;'} is ultimately periodic. 5 Show that if /is a context-free preserving function, then for k > 1, {w|/(«) = w and f(n)*^kn for somen} is a finite set. 6 Show that /is a context-free preserving function if and only if: a) for each r > 0, {i'|/(0 = r} is ultimately periodic; b) for each d > 1 and each r<d, {*1(/(0)mOd d = r} is ultimately periodic; c) for each k > 1, {m\ f(n) = m and f(n) < fc« for some n} is finite. 7 Using the result of Problem 6, show that the class of context-free preserving functions is closed under composition, addition, multiplication, and exponentiation. 8 Is every context-free preserving function also a total recursive function?
6.4 REGULAR SETS AND SEQUENTIAL TRANSDUCERS 197 6.4 REGULAR SETS AND SEQUENTIAL TRANSDUCERS In studying the closure properties of context-free languages, it is possible to give very simple proofs by using pushdown automata. We have already shown that the context-free languages are closed under union, product, Kleene closure, substitution mapping, homomorphism, and reversal. We now show that they are closed under intersection with a regular set. Later it will be shown that the intersection of two context-free languages is not necessarily a context-free language. Theorem 6.4.1 If L is a (deterministic) {unambiguous} context-free language and R is a regular set, then L n R is a (deterministic) {unambiguous} context-free language. Proof Let L = T(A) for some pda A: A = (Q,X,r,8A,q0A,Z0,FA) and R = T(B) for some finite automaton B = (QB,X,8B,q0B,FB). Let C be a pda defined by C = (QA*QB,°Z.,T,Z,(q0A,q0B),Z0,FA xFB), where, for each (q\,q2)&QA x 2B,eachaGSA and each ZGT, 8(fai,flfa),a,Z) = {((q',8B(q2,a)),a)\ (q ,a)e8A(q,a,Z)}. Intuitively, C simulates^ and B in parallel and accepts if and only if they both accept. Note that, since for each q2 6 Qb, 5b(^2, A) = q2, the two machines remain in "synchronization." Claim For each (qi,q2), (q\-,q2)^ QA x QB, each wGE*, and each a, per*, ((qi,q2),w,a) £ ((q\,q2), A,0) if and only if (Qi,w,<x) f^" (<7i,A,0) and q2 = &B(q2,w). Proof The proof, by induction on the number of moves, is left as an exercise. Thus, xe7\C) if and only if xGT(A) and xGT(B). Therefore, T(Q = LHR, which proves the theorem. Note that Cis deterministic {unambiguous} if A is. □ Corollary If Z, is a (deterministic) {unambiguous} context-free language and R is regular, then L —R is a (deterministic) {unambiguous} context-free language.
198 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Proof If/? is regular, then E* — R is regular. But we have L-R = Z,n (£*-/?). □ This result does not generalize to the intersection of two deterministic context-free languages, as the following example shows. Let Li = {a'fcV'l i,j>\} and . . . L2 = {a'b'aJ\ i,j>\}. Clearly,Lx and L2 are deterministic context-free languages. Then L,nz,2 = {«VV| i>\) would be context-free but this is not so. Thus we have shown: Theorem 6.4.2 The family of (deterministic) context-free languages is not closed under intersection. The family of context-free languages is not closed under complement. There is a context-free language which is not deterministic. Proof The second sentence follows because LinL2= 2*-((2*-Z,i)U(2*-Z,2)). If every context-free language were deterministic, the family <g would be closed under complementation, by Theorem 5.6.1. □ In Section 3.4, the idea of a homomorphism was generalized to the concept of a substitution. Another direction would be to have a powerful kind of transducer into which a context-free language could be sent. Such devices will now be introduced. Definition A sequential transducer is a 5-tuple S = (2,2, A, 6,<?<,), where Q is a finite nonempty set of states, E is a finite nonempty input alphabet, A is a finite nonempty output alphabet, q0 G Q is the start state, and 6 is a finite subset of Q x E* x A* x Q. Intuitively, if (q,u,v,q')G8, the machine in state q under the influence of input u G E* will make a transition to state q , outputting v&A*. This may be represented graphically as:
6.4 REGULAR SETS AND SEQUENTIAL TRANSDUCERS 199 A/1 FIG. 6.3 Thus there is a natural correspondence between sequential transducers and labeled directed graphs. Example See Fig. 6.3. The sequential transducer functions like a nondeterministic finite automaton (i.e., given an input, one can follow any path through the automaton which corresponds to that input). Thus, in the example, S(ab) = {A}U{01*}. We now give formal definition of the output of S. Definition Let Q = (Q, E, A, 8,q0) be a sequential transducer. For u62*, we say vGS(u) if and only if there exist uu ... ,uk GE*;^,... ,vk G A* ,qu .. . ,qk G Q, such that i) u = ul- ■ -uk, ii) v=vx ■■■vk, iii) (fli, ui+l, vi+l, qi+l) G 6 for each /, 0 < / < k. IfZ,9 E*,then S(L) = U S(tt). u E L Finally, if£9 A*, then S~\L) = {wGE*| S(w)9Z,}. Note that, if S = (Q, Z,A,8,q0)isa sequential transducer, then: a) IfS^QxExAxQ and satisfies the condition that whenever (q, a, v, q) G6 and (q,a,v',q")e.8, then v=v and q = q", then S corresponds to a "sequential machine." b) If ip: E->-A* is any homomorphism, then, if one takes the one-state machine defined by 8 = {(<7o,a,¥>(a),<7o)l for each a G EA}, then S(x) = <p(x) for each x G E*.
200 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS We now prove the following important theorem. Theorem 6.4.3 (The Sequential Transducer Theorem) If L is a context-free language (regular set) and S is a sequential transducer, then S(L) is a context-free language (regular set). Proof Let L = T(A) for some pda A = (Gi,Z,rj8ljqf0jZ0,fi) satisfying the following condition: Forall(qr,a)G(2xEA, if (q' ,a)G8l(q,a,Z0), then a ¥= A. The condition ensures that the start symbol Z0 is never erased. The method of Lemma 5.6.1 may be employed here to ensure that this condition is satisfied. Let S = (G2,2,A,62,s0). Now let m = max{l,lg(M),lg(z;)| (q,u, v,q) G52} and Sm = U Sf, Am = U A\ 1 = 0 1=0 Define a new pda B, which simulates both the original pda A and the sequential transducer S. B = (e,A,r,6,(<7o,s0,A,A),Zo,F), where Q = QixQ2xZmxAm, F = F1 x Q2 x {A} x {A}, and 6 has three types of rules: Type 1. For all (q, Z) G Ql x T and for all s G Q2, 8((q,s,A,A),A,Z) = {(fa.s'.u,*;), Z)| (s,u,u,s')G62}. 7)'pe 2. For all (q, s, a) G Qi x Q2 X (2 U {A}), and for all (x,y) G Em x Am such that lg(ox) < m, 8((q,s,ax,y),A,Z) = {((q',s,x,y),a)\ (q ,a)e8l(q,a,Z)}. Type 3. For a\\(q,s,b)GQ1 xQ2x A and for ally G Am such that lg(»<m, 8((q,s,A,by),b,Z) = {((q,s, A,y),Z)}. Although B looks rather formidable, the idea involved is simple. 1 f (q, s, u, v) is a state, then q and s correspond to the state of A and S, respectively, at that point in the computation, and u and v are input and output "buffers" for S (used in a manner described below). In accepting a string, B performs the following sequence of steps.
6.4 REGULAR SETS AND SEQUENTIAL TRANSDUCERS 201 Step 1 (Move of type 1). B "guesses" that S in state s will map u into v and transfer to state s': ((q,s,A,A),x,P) k ((q,s',u,v),x,j}). Note that a Type-1 move occurs only when both the input and output buffers are empty. Also, the state of the pda, the pushdown, and the input all remain unchanged. Step 2 (Move of type 2). B now simulates A started in state q with 0 on the pushdown and input u: ((q,s',u,v),x,(i) £ ((q ,s',A,v),x,$'). The state of the transducer, the output buffer and the input remained unchanged during this computation. Step 3 (Moves of type 3). B then checks whether the output v of S generated in Step 1 corresponds to the input string: ((q',s',A,v),vx,j}') \^ ((q',s',A,A),x,j}'). If x ¥= A, then B goes back to Step 1; otherwise the input string is accepted if and only if q GFt. Note that the states of A andS and the pushdown store remain unchanged. We now prove a series of claims which show that B operates as described. Claim 1 For each q, q G<2i, each s,s' GQ2, eachz/G Sm,each vGAm, each w, w' G A*, each a', 0 G T*, and each Z G T, (G7,s,A,A),w,0Z) t-g- ((q',s',u,v),w\a') by a move of Type 1 if and only if q = q, w' = w, a' = 0Z, and (s,u,v, s')G62. Proof Follows directly from the definition of moves of Type 1. Claim 2 For each q,q & Qx, each s, s' G Q2; each w£Sm, each v, t'eAm, each w, w' G A*, each 0 G T*, 0' G T+, and Z G T, ((q,s,ux,v),w,pZ) \j ((q',s',x,v'),w',P'), using only moves of Type 2 if and only if s' = s, v = v, w' = w, and (q,u,(lZ) |j (^',A,0'). Proof We prove the only if direction by induction on n. Basis, n = O.Then, ((q,s,ux, v),w,PZ) [j «y,s',x,z/),w',0'), q = q, s' = s, u = A, v = v, w = w, and 0' = 0Z. But (<7,M,0Z) ^ (^?,u,0Z).
202 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Induction step. Assume that the only if direction of the claim holds for all computations of length n. Let ((q,s,ux,v),w,pZ) Fir ((q',s',x,v'),w',p'). Then ((q,s,u'ax,v),w,pZ) |j ((<7i,s,,ax,z>i),wi,0iZ,), (6.4.1) \B(ftl',s,x,v'),w',PM, (6.4.2) where u = u'a, «eZU{A}, and 0' = 0,02. Applying the induction hypothesis to (6.4.1) gives: si = s, Vi = v, Wi = w, (q,u',j3Z) \^ (<7i,A,0iZ). Now (6.4.2) is a move of Type 2. Hence, s = si, v = vu w' = Wi, («7',j32)G6(^i,a,Z). Combining these results gives: s' = s, v = v, w' = w, and (q.u.pZ) = (q,u'a,jiZ) \j (ql,a,plZl) |j (fl',A,AW = fo',A,A which completes the induction. The j/ direction follows directly from the fact that, for each move in A, there is a corresponding Type-2 move in B. A formal proof by induction is left as Exercise 1. Claim 3 For each q,q' GQU each s,s' G Q2, each uxG 2m,each w,w' G A*, zyeAm,|3'er+,|3er\andzer, ((<7, s, «x, ly), w, j3Z) £ (fa', s,x,y), w', j3'), with no moves of Type 1 if and only if s = s,w = vw',and (q,u,pZ) \j fa', A, A Proof The ow/j> if direction will be proved by induction on n. Basis. \g(v) = 0. Then v = A and ((q,s,ux,y),w,pZ) £ ((q ,s',x,y),w',P'). Since only moves of Type 2 could be used, we have, by Claim 2, that s = s, w' = w, and fa,«,0Z) |j fa',A,0'). Induction step. Assume that the only if direction of the claim holds for \g(v) = n. Letlg(z;) = H + l.Then, v = v'a, aGA lgO>') = n.
6.4 REGULAR SETS AND SEQUENTIAL TRANSDUCERS 203 The computation of B must be of the form: ((q,s,UiU2x,v'ay),w,^Z) (j (07i,Si,u2*,<y), vt>i,0iZ,) (6.4.3) Ij ((<72,s2,"2x,.y),w2,02Z2) (6.4.4) £ te's'^Jb'.U'), (6.4.5) where u = UyU2. In line (6.4.3), only moves of Type 2 and Type 3 are used and lg(w') = n. Thus, by the induction hypothesis, Si = s, w = v'wi, and (q,uu(iZ) |j (qiy A,0iZi). Move (6.4.4) is a Type-3 move and thus, Q2 - Qu s2 = Si, Wi = «w2, 02Z2 = |3iZi. Finally, (6.4.5) consists of only moves of Type 2. Again applying Claim 2 gives s = s2, w' = w2, (q2,u2,j}2Z2) [j (q',A,j}'). Combining these results gives s' = s, w = z/«w' = ww and * * , , (<7,"i"2,0Z) \j (<72,"2,|32Z2) fj (<7,A,0), proving this direction of Claim 3. Conversely, it suffices to prove that, for all <7 G Qu s G Q2, u G Am, lg(«) < m, if w = i>w', then: (fa,s,u,vy),w,pZ) \j ((q,s,u,y),w',(}Z). This is a straightforward proof by induction on \g(v). Then the desired result follows directly from Claim 2. Claim 4 For each?,?' G Qu s, s' G Q2, w,y G A*, 0 G r*,0' G r+, and Z G r, ((<7,s,A,A),wy,0Z) £ ((<?', s', A, A)^,0'), using k > 1 moves of Type 1 if and only if there exist wu ... , wfe in A*, there exist s0, ■ ■ ■, Sfe in Q2, there exist uu .. ., uh in E* such that w = Wi • • • wfe, t S0 — S, Sfr — S , (s(, ui+1, w,+1, s,+1) G S2, 0 < / < k, fa.K,•••«*,0Z) \j (<7',A,0'). Proof The argument in the only if direction is an induction on k. Basis. k = 1. If ((<7,s,A,A),wy,0Z) £ ((<?',s', A, A),/,0'),
204 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS using only one move of Type 1, then the computation must be of the form: (fy,s,A,A),wy,pZ) |j(fai,Si,A,A),*i,0iZD (6-4.6) Ij ((Q2,S2,u,v),x2J2Z2) (6.4.7) ft(ffl',s',A,A),y,P'), (6.4.8) where si,s2GQ2,x1,x2GA*. The computation (6.4.6) has only Type-2 moves, since no Type-3 moves are possible when the output buffer is empty. By Claim 2. Si = s, Xi — wy, and . fa,A,0Z) [jtoi.A.ftZO. Now (6.4.7) is the single Type-1 move, and hence, by Claim 1, <72 = Qi, *2 = *i, j32Z2 = PiZu (Sl,«,fl,S2)G62- Finally, (6.4.8) has no Type-1 moves and thus, by Claim 3, s' = s2, x2 = vy, (q2,u,a2Z2) [j fa'.A,/}'). Combining these results gives V = W, St = S, S2 = S , (s,m, w,s') 662, (<?,«, j3Z) £ (flfi,«,PiZ,) £ fo', A,0'). Thus, by taking s0 = s, Si = s', «i = m, and wt = w, we will have proved the basis. Induction step. Assume that the only if direction of the claim holds for a computation with k > 1 moves of Type 1. Let ((<7,s,A,A),wy,0Z) £ ((<?',s', A, A),y,p'), with A: + 1 moves of Type 1. Then (iq,s,A,A),w'w"y,pZ) £ ((<?„/•,, A, A), w"y,^Z,) (6.4.9) ft«tl',s,A,A),y,p'), (6.4.10) where w = w'w" and (6.4.9) uses only k moves of Type 1. Such a "factorization" of the computation must exist since the last move of Type 1 cannot be made unless both the input and output buffers are empty.
6.4 REGULAR SETS AND SEQUENTIAL TRANSDUCERS 205 Thus the induction hypothesis applies to (6.4.9) and there exist w1(. .. ,wk in A*, s0,. . . ,sk in Q2, «i,.. . ,uk in E*, such that i w = Wi ■ ■ ■ wk, s0 = s, sfe = ru (s,-, ui+u wi+l, s,+1) G 62, 0<i<k, (q>Ul--uk,pZ) \j (q1,A,P1Z0. But (6.4.10) uses only one move of Type 1 and thus, by the basis of the induction, there is some u£Z* such that (rl,u,w",sl) G62 and foi.K.frzo [j («', a, A Therefore, if we let sfe+1 = s', wfe+1 = w", z/fe+1 = u, we will have w = w w' = wi"--w^+i, s0 — s, Sk+} — s , (s,-,«,+1,w,+1,s,+i)G62, 0</<&+ 1, (<7,«i- ■ukuk+1,pZ) ^ (fli.Ufc+i.jSiZ,) £(<?', A, 0'), which completes the proof of the necessity of Claim 4. Using Claims 1, 2, and 3, the proof of the converse is straightforward and is left as an exercise. Now, taking q = q0, s = s0, 0 = y = A, Z = Z0 in Claim 4 gives (G7o,So,A,A),w,Zo) fc (to',s',A,A), A,0') if and only if there exist wi,. .. , wfe in A*,s0,... ,sfe in Q2, Ui,... ,uk in E* such that W = W] • • • Wk, (sf, «,+1, w,+1, s,+1) G 62 for 0 < i < k. (q0,Ui ■ -Mfe.Zo) [^ to',A,0'). Thus, wG 7(5) if and only if w &S(L), which proves the theorem for context-free languages. For regular sets, it suffices to note that exactly the same construction works without using the pushdown store. In this case, B will be a nondeterministic finite automaton but T(B) is still a regular set. □
206 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS PROBLEMS 1 Give a formal proof of the if direction in the proof of Claims 2 and 3 in the proof of Theorem 6.4.3. 2 Show that, for each sequential transducer S = (Q, 2, A, 6, q0), there is a sequential transducer S' - (£?', 2, A, 6', q'0) such that 6' 9 Q' x Ea x A* x Q' and 5(u) = 5'(u)foralluGE*. It is often useful to discuss a-transducers or accepting transducers. The idea is to define S = (Q, E, A, 6, q0, F), where F'-Q and the rest is as usual in a sequential transducer. We change the definition to define Siw) as before, except that the computation must end in a final state. 3 Show that, if S is an a-transducer and L is a context-free language, then S(L) is a context-free language. 4 Show that if L is a linear context-free language and S is a sequential transducer, then S(L) is a linear context-free language. 5 Extend the previous problem to the case where L is metalinear. Cf. Problem 4 of Section 2.5 for the definition of metalinear. 6 Give a context-free grammar proof of the sequential-transducer theorem. 7 Let G be the following linear grammar: S -+ aSa | bSb | dSa | dSb | dS | c Describe i(G) and show that the complement of L(G) is not context-free. 8 Let L\, i29 E*, and define the set of shuffles of Lx and L2as shufCZ,!,/^) = {x1y1 ■ ■ -xkyk\ k> \,xy ■ ■ ■ xk &Luy1 ■ ■ ■ yk GL2, ^^eX'). Show that shuf(Z,1UZ,2,Z,'1UZ,'2) = U shuf(/,,-, /,'•). Ue{i,2} Prove or disprove: shuf(Z,!, L2) = shuf(L2,Z,i). 9 Show that, if L is context-free (regular) and R is regular, then shut\L,R) is context-free (regular). 10 Find two context-free languages Ll and L2 such that shuf(Z-!, L2) is not context-free. Make Ll and L2 as "simple" as possible. 6.5 APPLICATIONS OF THE SEQUENTIAL-TRANSDUCER THEOREM Using the sequential-transducer theorem from the previous section, we can obtain further closure properties of context-free languages. We first define a restricted form of the sequential transducer which is sufficiently powerful for many applications. Definition A generalized sequential machine (gsm) is a 6-tuple S = «2,2,A,6,A,<70)
6.5 APPLICATIONS OF THE SEQUENTIAL-TRANSDUCER THEOREM 207 where Q is a finite nonempty set of states, E is a finite nonempty input alphabet, A is a finite nonempty output alphabet, q0 G Q is the start state, 8: Q x E ->■ Q is the transition function, and X:Qx2^A* is the output function. The functions 6 and X are extended to 2* in the usual way, namely: For each<7G0,xG£*,aGE, 8fa,A) = Q, 8(q,ax) = 8(8(q,a),x), Xfa.A) = A, \(q,ax) = Xq,a)\(8(q,a),x). The definition of a gsm is not standardized in the literature. One must exercise care because some authors define this device with final states. The gsm can be thought of as a sequential machine with not necessarily length-preserving outputs. Although many of the results for sequential machines carry over to gsm's, some do not. For instance, given two sequential machines Afi and Af2, then it is decidable whether or not there is an xG E+ such that Mi(x) = M2(x) (where M(x) designates the response of Af to input x). If Mi and M2 are gsm's, then this is no longer the case. Also, every homomorphism can be represented by a one-state gsm. Definition Let L £ E* and let S be a gsm. We define two sets S(L) = {S(x)\ x&L), and S-\L) = {x\ S(x)eL}. Theorem 6.5.1 If Z, is a context-free language (regular set) and S is a generalized sequential machine, then S(L) is a context-free language (regular set). Proof Let S = (Q, E, A,6,,0) be a gsm. Define a sequential transducer S' = (2,S,A,6',<70), where 6' = {(q,a,Uq,a),8(q,a))\ for each (q,a)GQ x S}U {(q0, A, A,q0)}. Claim S(L) = S'(L). Proof By induction on the length of the input. The argument is left as an exercise. Thus, by the sequential-transducer theorem, we have completed the proof. □
208 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Corollary If L is a context-free language (regular set) and ip is a homomor- phism, then \p(L) is a context-free language (regular set). Example Let L = {a*be} U {apba"ca" \ p prime, n>0}. L is the noncon text-free set used in Theorem 6.2.4 to distinguish between the iteration theorem and the pumping lemma. Now an elegant proof that L is not context-free will be given. Suppose that L were context-free. Then Li = L-{a*bc} would be context-free. Define the gsm S by: a/a{ { q0 l Then S(L{) = {ap\ p prime} would be a context-free language. This is a contradiction of Theorem 6.3.1, and shows L to be noncontext-free. Now, let us consider inverse gsm mappings. Theorem 6.5.2 If L is a context-free language (regular set) and S is a gsm mapping, then S~l(L) is a context-free language (regular set). Proof Let S = (Q,Z,A,8,\,q0). Define a sequential transducer S' = (Q,A,Z,8',q0), where 6' = {(q,\(q,a),a,8(q,a))\ (q,a) GQ x E}u {fa0, A, A,q0)}. Claim S'(L) = S~1(L). Proof By induction. The argument is left as an exercise. Thus, again applying the sequential-transducer theorem, we have the desired result. □ Corollary If L 9 A* is a context-free language (regular set) and ip: S* ->■ A* is a homomorphism, then ip"'(Z.) is a context-free language (regular set).
6.5 APPLICATIONS OF THE SEQUENTIAL-TRANSDUCER THEOREM 209 It turns out that the family of context-free languages is not closed under S_1 if S is a sequential transducer. Theorem 6.5.3 There is a deterministic context-free language L and a sequential transducer S such that S~l(L) is not context-free. The family of context- free languages is not closed under inverse sequential transducer mappings. Proof Let S be the following sequential transducer. Then, for all we{a,bf, S(w) = {cw,dw}. Now let L = {ozW| i,j> l}u{c?«W| i,j>\). Clearly,/- is a deterministic context-free language. But S~\L) = {w\ S(w)£L} ~ {w\dwGL and cw&L} = {aibiah\ i = /and/ = k} = {a"b"a"\ n>\) is not context-free. □ Next we turn to a set-theoretic operation on sets. Definition Let A', 79 2*. Define X~lY = {wGE*| xwGY for some xG X}, YX~l = {wGE*|wxGy for some xG X}. YX'1 (X'1 Y) is the right (left) quotient of Y by X. We now prove that context-free languages are closed under right quotient with regular sets. The proof is interesting since it is based only on the closure properties of context-free languages and hence would apply to any family of languages that satisfy the necessary closure properties. Theorem 6.5.4 If L is a context-free language and R is a regular set, then LR'1 is a context-free language. Proof Let L, R 9 E* and c $ 2. Let S = (teo,<7i},2U{c},Z,8,flf0),
210 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS where 6 = {(q0,a,a,q0)\aei:}u{((i0,c,A,qi)}utoi1,a,A,qd\aei:u{c}}. Graphically, S is given by: a denotes any letter in Z It is easily seen that, for each xe2*jS(SU {c})*, S(x) = x, S(xcy) = x. Now let ifi: (S U {c})* -> S* be the homomorphism given by a, a 6 2, A, a = c. Claim S(y~l(L) n E*c/?) is context-free. vO) = i) E*c/? is regular. ii) Since ip is a homomorphism, ip-1 is an inverse homomorphism. Thus ip~l(L) is context-free. iii) <p-1(Z.) H E*c/? is context-free, since the intersection of a context-free language with a regular set is context-free. iv) S($~l{L) n E*c/?) is context-free, by the sequential-transducer theorem. Claim LR'1 = S(<p'\L) n ?*cR). Proof x eSQp~\L) n Z*cR) if and only if there is.y G S* so that xcyGvJ_1(Z,)n2*c/?. This holds if and only if there isy G'E* such that xcy&tp'l{L) and y&R. But this holds if and only if there isy G'E* such that y&R and *p{xcy) = xyGL and y€R; i.e., if and only if xGLR-1. O Corollary If Z, is a context-free language and R a regular set, then R'1L is a context-free language. RTlL = (LT(/?T)-')T, and both regular sets and context-free languages are closed under reversal.
6.5 APPLICATIONS OF THE SEQUENTIAL-TRANSDUCER THEOREM 211 Corollary If L is a context-free language, then a) init(Z,) = {u\ uv&L for some v£ E*}, b) fin(Z,) = {u\ vu&L for some !)£2*}, and c) subwords(Z,) = {v\ uvw&L for some u,w6J*} are context-free languages. Proof a) init(Z,) = Z,(E*)-1, b) fm(L) = (E*r1L, c) subwords(Z,) = fin(init(Z,)). D Now we shall show that the full family of context-free languages is not closed under quotient. Theorem 6.5.5 There exist two deterministic context-free languages Lx and L2 such that L[lL2 is not a context-free language. Proof Let , . . J Lx = a{blal\ i>\f and . . L2 = {a'b2,\ i>\}*. Clearly, Lx and L2 are deterministic context-free languages, and L\lL2 = {w\ ywGL2 for someyGLi}. Now weL?L2 (6.5.1) if and only if there exist k > £ > 0, i\,. .. , ih > 1 ,j\, . . . ,/g > 1 such that yw = a'-b^b21* - ■ •fl'*fc2i* (6.5.2) and y = ab'iailbi*ai* ■ ■ ■ b^a^. (6.5.3) In turn this holds if and only if there exist k > Z > 0, ilt. . . , 4 > 1, j\,. . . ,/g > 1, such that jr = 2r for each 1>r>\, (6.5.4) and ir = 2r'1 foreach£>r> 1. (6.5.5) It now follows easily from these propositions that (L?L2)nb+ = {b2m\ m>\). If L\lL2 were context-free, then, by Theorem 6.4.1, {b2 \ m~> 1} would be regular; but that would contradict Theorem 6.3.1. The result about right quotient follows from the identity. lxl? = (L?(LTyy. a We shall later see that Theorem 6.5.5 is a trivially weak result. Actually any recursively enumerable set may be obtained as the quotient of two context-free languages.
212 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS PROBLEMS 1 Show that, if L 9 A* is a deterministic context-free language and S is a gsm, then S~l(L) is also a deterministic context-free language. Hence, the deterministic context-free languages are closed under inverse gsm mappings. 2 Let L 9 A* and let S = {Q, 2, A, 6, q0) be a sequential transducer. By definition, S~l(L) = {w\ S(w) 9 L} and it has been shown that *& is not closed under S'1. Suppose we had defined S'1 by the following relationship: S'HL) = {wGE*| S(w)r\L¥=Q}. Is & now closed under 5"1? 3 Let X, Y,Z^ E* .'Show that: a) (XZ~l)Y~l = X(YZy\ b) (X'1Y)Z~1 = X~1(YZ~1), c) (XY)~1Z = Y~l(X~lZ). Now we consider a-transducers, which were defined in the problems at the end of Section 6.4. Definition An a-transducer S — (Q, 2, A, 5, q0, F) is 1-bounded if 6 9 Q x (Z U {A}) x (A U {A}) x <2. 4 Show that, for each a-transducer S = (Q, 2, A, 6, q0, F), there is a 1- bounded a-transducer S' = (£)', 2, A, 6 ,qo,F), such that S'(u>) = S(u>) for each w 6 2*. Let 5 = (Q, 2, A, 6, q0, F) be an a-transducer and let X be any function from F into subsets of 2*. Define: ■S(X) = {«!-•• urX(qr)i>r -I'll There exist r>0,<?i, . . . ,qr,qrGF, such that (<?,-, Mi+i,u,-+1,<jf+1)G 6 forO<i'<r}. 5 Show the following useful result. Lemma Let S = (Q, 2, A, 8,q0, F) be an a-transducer. If for each q&F, \(q) is a context-free language, then 5(X) is also context-free. Moreover, if each \(q) is ultralinear, then so is 5(X). 6 Show that the class of ultralinear languages is the least family which contains the finite sets and is closed under finitely many applications of union, product, and 5(X) (where the image of 5(X) is contained in the family). 7 Show that a set L is ultralinear if and only if there is a finite-turn pda A such that L = N(A). 8 Show that L is linear if and only if L is accepted by a 1-turn pda. Let 2 and A be two alphabets and let Ri,R2 be subsets of 2* X A*. Define RiR2 = {(x1x2,y1y2)\ (.x1,yi)eR1,(x2,y2)GR2). Define R° = {(A, A)},
6.5 APPLICATIONS OF THE SEQUENTIAL-TRANSDUCER THEOREM 213 and, for each k > 0, define Rk+i = RkR Then write R* = U R*. A relation R is said to be regular if R is obtained from the set {(a, A), (A, b)\ aGZ,fcGA} by finitely many applications of U, •, and *. Also define a transduction to be any mapping from E* into A*, but we shall write it as a relation. 9 Prove the following important theorem. Theorem (Nivat) A transduction from E* into A* is regular if and only if there exists an alphabet V, a regular set R 9 T*, a homomorphism <p from T* into E*, a homomorphism i// from T* into A* , such that, for each x G E*, 7* = iMYJ-1(x)n/0. Let S and A be alphabets and let the members of E be interpreted as elementary messages. The coding problem is to find suitable mappings tp from E+ into A* so that right inverses exist, i.e., so that vV~ = 1. For our purposes, define an encoding as any one-to-one homomorphism from E+ into A+. Define A0 = V>(2) and A = ^2+). A0 is called the set of all code words or just the code, while A is the set of all encoded messages. Recall that a semigroup S is freely generated by a set S0 if every element of S has a unique representation as a product of elements of S0. S is free if there is a set So which freely generates S. 10 Show that a) A=A% b) A is a free semigroup of A+ generated by A0. 11 Show that, if A is an arbitrary free semigroup of A+, then there exists a set 2, and an encoding \p: S+ -»■ A+ such that A = v>(2+). Let S be any semigroup and /1,B9 S. Define ^_1fl = {sGS\as&B forsomeaG/l} and AB 1 = {seS\saGB for some a G A }. For Problems 12 through 14, let .40^ A+ and A = {A0)+. Define ^i = A^Aq and i -l An+i = A0 A„ UA„ A0 for each n > 0.
214 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS 12 Show that A is freely generated by A0 if and only if, for each n > 1, AnHA0 = 0. 13 Show that, if An — An+k for some n, k > 1, then An+jk + r An + r for all/> 0 and all r, 0 < r < k. 14 Show that, if A0 is finite, then there exists a natural number 7V0 such that, for all n > N0, there exists m<N0 such that An — Am . 15 Show that there is an algorithm for deciding whether a regular set is a code. Does your method work for context-free sets? 6.6* PARTIAL ORDERS ON I* By introducing partial orders on 2*, some surprising properties will be derived. The methods of this section will be nonconstructive and the reader should watch for the occurrence of such techniques. In Chapter 1, we studied various partial orders on 2*. For example, we might write x<1j' if and only if y =yixy2, which is the familiar relation "x is a subword ofy." (2*,<!) is a partially ordered set. Recall that, in any partially ordered set (5,<), x and y are said to be incomparable if x ^y and j *fex. Example 1 The set {ab"a\ n > 1} is an infinite set of pairwise incomparable elements of the partially ordered set (2*, <!). Next we shall consider another partial ordering, namely that of subsequences. Definition Let x and y be in 2*. We write x^y if x = Xi • ■ ■ xn and y =y\X\' • 'ynxnyn-n for somen >Q,xhyj E 2*,where 1 </<« and 1 </<« + 1. Lemma 6.6.1 (2*, <) is a partially ordered set. Proof It is clear that < is reflexive and transitive. To prove the antisymmetric property, assume that x <y, which implies that X = X\ • ■ ■Xn, and (6.6.1) y = >'i*i • ■■ynxnyn*u and hence that lgC) = lg(*) + lgO'i"->'n+i) (6-6.2) and, a fortiori, lg00>lg(x). (6.6.3)
6.6* PARTIAL ORDERS ONZ* 215 Also assume that y < x, so that y = y\---y'k, II III x — xiyi■ ■ ■*fej;fe*fe+i> and hence that lg(x) = \g(y) + \g(x'r--x'k+l), and, a fortiori, lg(x)>lgO>). (6.6.4) But (6.6.3) and (6.6.4) imply \g(x) = lg(y), which implies (using (6.6.2)) that lg(ji"-"J'n+i) = 0, so thatyi • ■ -yn+i = A, and, by (6.6.1),j> =x. The following observation will also turn out to be useful. Fact 1 For each u E 2*, there are only finitely many x E 2* such that x < u. Next we consider a lemma which will be applied in our main theorem. Lemma 6.6.2 If 2 is an alphabet which has the property that every set of pairwise incomparable elements of 2* is finite, then every infinite subset of 2* has an infinite chain. Proof Let L be an infinite subset of 2* and suppose that every chain in L is finite. Consider the set X of maximum elements of maximal chains. If x, y GI, then x and y are incomparable. By assumption X is finite and \X\ = the number of maximal elements of L. But L is infinite, so that infinitely many distinct chains have the same maximal element u. But we then have x<=u for infinitely many x, which contradicts Fact 1. □ We are now ready to prove an important result. Theorem 6.6.2 Every set of pairwise incomparable elements of (2*,<) is finite. Proof We induct on |2|. Basis. | 21 = 1. The result is trivial. Induction step . Suppose the result holds for all alphabets A with |A| = n. Let | 21 = n + 1, and let Y = {j>1;y2,. . .} be an infinite set of pairwise incomparable elements. Claim 1 There is x E 2* such that x ^yt for all /. Proof Suppose the contrary; i.e., that for each x E 2*, there is an / so that x <yt. Consider a string x =yxa, where a E 2*. Then x ^y\ so there must be some />1 so that yla=x<=yj. But then J>i<*<j>;-, which contradicts the incompar- ability of yx andy}-.
216 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS We may assume there is a shortest x which satisfies Claim 1. Among all infinite sets of pairwise incomparable elements, suppose that Y is chosen so that x has minimum length. Let x = xt ■■-xk, where xj 62 for 1 </ < k. Note that k =£ 0, for then x would be null and A = x <>>,■ for all /. If k= 1, then yt E (2 — {xi})* for all i> 1 [by the incomparability condition] . But this would contradict the induction hypothesis for 2 — {xx}. Claim 2 xx ■ ■ • xfe_i <j>; for all but finitely many /. Proof Supposexx ■ • -xk^ <j>;- for infinitely many/, say/i,.. . Define Yo = {yj^yj,, ■ ■ -I yH e Y and jf! • • • jcfc_! <yh). Then y0 is an infinite set of pairwise incomparable elements which has a string x = Xi ■ ■ -xk-i satisfying Claim 1, where \g(x') < lg(x), and that would be a contradiction, since x was chosen to be minimal. By relabeling subscripts of thejvt-, we may assume that xi '' 'xk-i ^yi for all i >C. Moreover, for each i> £, there exist unique words, jv,i, ■ ■ . ,ytk such that yi = ynxlynx1 ■ ■ ■yik-lxk.lyik and Xj ^yfj for all 1 </ < k; that is,ytj S (2 — {Xj})*. Moreover, we have xk ^ytk for all i > £. [Suppose xk <yik for some OS.; then*! • • -xk =x <yt which would contradict our choice of x.] Claim 3 There exist infinite index sets A^,.. . ,Nk such that a) Nj+1c Nj for Kj<k. b) If p,q GA) for some/, 1 </<& andp <q, thenj>p;- <j>q/. Proof Let N0 = {/| / > £}. The existence of Nj will follow from the existence of the A^.! for 1 </<&. Let Yj = {y^iENj.i} for K/</c. (6.6.5) CASE 1. Yj is finite. If Yj is finite, then, since Nj-i is infinite, we have that Iw = {i GA^I ytj = w} for some fixed wG 2* must be infinite. In other words, from (6.6.5), as / runs through Nj_i, infinitely many of the ytj must be the same. Define Nj to be any such Iw. Clearly, this Nj is infinite and satisfies (a) and (b) since Nj^ did. CASE 2. Yj is infinite. In this case, since Yj c (2 — {xj})* the induction hypothesis applies. By Lemma 6.6.2, Yj has an infinite chain y.j<y.j<---. (6.6.6)
6.6* PARTIAL ORDERS ON Z* 217 Let ti, t2, . . . be an infinite strictly increasing subsequence of slt s2, ■ ■ ■ and define Nj = {tj\ i > l}. Clearly, Nj is infinite and Nj £ Nj-i. To see (b), suppose p, q EWy and p<q. By (6.6.6), we know that yps <>'<„. This completes the proof of Claim 3. If p < q and p, q&Nk, then p, q EVV,- for l <f<k so that for each/, 1 </ < k,ypj ^y<u- Therefore yP = yPiXiyP2X2 ■ • -yPk-iXk-iyPk < yq\Xiyq2x2- ■ yqk-ixk-iyqk = yq- But yp <j>g contradicts the definition of Y. □ Corollary (The Konig Infinity Lemma) Each set of pairwise incomparable elements of (Nfe,<) is finite where N is the set of natural numbers and (n1;... ,nk) < («i,.. . , n'k)if nt < n'j for each i, 1 < i < k. Proof Consider the mapping tp: (n1;. . . ,nk)^>-a"1 ■ ■ -a1^. <p is an isomorphism. If X is an infinite set of pairwise incomparable elements of (Nk, <) then <p(X) is an infinite subset of {fli}* • • • {an}* <~{al,..., an}*. But if u and v are incomparable elements of Nk, then i^(u) and ip(v) are incomparable elements of ({a,,.. . ,a„}*,<). This contradicts Theorem 6.6.2. D We introduce the operations of taking subsequences and supersequences of sets. Definition Let L £ 2* and < be the "subsequence" ordering defined at the start of this section. Define L = {xGI,*\ y<:X for some y in L} and L = {xG'E*\x<y for some y in L}. In words, L is called the set of supersequences of L while L is the set of subsequences of L. Some algebraic identities concerning these sets are listed below. Lemma 6.6.3 For any L £ 2*, 1 L£L, 2 L£L; 3 L=L. Proof (1) and (2) are immediate from the definitions. For (3), note that L £ L by (2). Suppose x&L. Then x <j> for some y£L. Again by the definition, y <z for some z EL. By transitivity,x <z, so that x EL. □ Before passing to our main results, we need one more definition. Definition Let (5, <) be any partially ordered set. x E S is said to be minimal if y < x and y E 5 imply j> = x.
218 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS We recall a trivial fact about minimal elements. Fact Let (5,<) be any partially ordered set. If x and y are distinct minimal elements, then x and y are incomparable. Proof Immediate Now we are ready to prove a useful lemma. Lemma 6.6.4 Let L £ 2*. a) There exists a finite subset F'k 2* so that L = F. b) There exists a finite subset G £ 2* so that L = 2*-G. Proof a) Let F be the set of minimal elements of L. By the previous fact, the elements of F are pairwise incomparable. By Theorem 6.6.1, F is finite. We must prove that L = F. Suppose x£F. Then y < x for some y&F^L. Thus, x EL and F 5 L. Suppose xEL. Then j><x for some j>EL. There is y' £F so that j> <j><;t. [If j' is minimal, then y'=y. If not, the definition of minimality guarantees the existence ofy'.] Thus* GF andL 9 F. Therefore,L =F. b) LetM=2*-L. Claim M = M. Proof We already know that M£M. Suppose that M%^M; i.e., there is x EM and x ^M = 2* — L. That is, x EM and * EL. Since x EM,x >j> for some y EM. Since x EL and x >jv, we conclude that y EL = L. But we have established that j> E L and j>EM=2*— L, which is a contradiction. This completes the proof of the claim. By part (a) of the Lemma, there is a finite subset G so that M = G. However, G = M = 2*-L = G, so that Z L = 2*-G. □ Now we come to our main result. Theorem 6.6.3 For any set L 9 2*, L and L are regular sets. Proof For any word, w E 2*, write where w, E 2 U {A} for l < i < n. Clearly, w is regular since vv = 2*{w!}2* • • • 2* {w„}2*. Moreover, for any finite set W^ 2*, W is regular since W = U {w}, w <EW and a finite union of regular sets is regular.
6.7 PROGRAMMING LANGUAGES ARE NOT CONTEXT-FREE 219 Let L £ 2* and let F and G be as given by the previous lemma. Then L = F is regular by the above remark and L = 2*-G is regular since the complement of a regular set is regular. □ This theorem is amazing because there are no constraints on the set L. It might not even be recursive, and still L and L are regular. PROBLEMS Definition Let k > 1 and let x, y E 2*. We define x <fe y if x = xt ■ ■ ■ xk and y =y\Xi ■ ■ • >'feXfej'fe+1 for somex,- andy3 in 2* where 1 </ < k and 1 </ < k + 1. Thus x <fe y if a factorization of x into k parts occurs in y. 1 Is (2*, <fe) a partially ordered set? 2 Is a constructive proof of Theorem 6.6.3 possible if L is recursively enumerable? If L were context-sensitive, could one construct a right linear grammar fori? 3 Show that no matter how one partitions an infinite «-ary expansion of any real number into blocks of finite length, one block is necessarily a subsequence of another. 6.7 PROGRAMMING LANGUAGES ARE NOT CONTEXT-FREE Since much of the motivation for studying context-free languages came from programming languages, it is desirable to ask whether existing languages are context- free or not. To understand this question, one must ask "What is a programming language?" Let us take ALGOL as an example because it was one of the earliest languages that was carefully designed. The set of valid ALGOL programs is the set of strings that (i) satisfies the syntactic conditions of ALGOL and (ii) satisfies the semantic conditions. Since the syntax is written in BNF, there is no doubt that the set of strings which satisfy (i) is context-free. The semantic conditions for ALGOL were written in English and it is not immediately clear whether the set of strings which satisfy (i) and (ii) is context-free. Note that there is a distinction between the specification of a language and some implementation restriction of a compiler. In ALGOL, an identifier has unbounded length while, in a particular compiler, the length may be at most 10 characters. Our interest is in the language definition, not a particular implementation. Theorem 6.7.1 ALGOL 60 is not a context-free language. Proof Let L be the set of (syntactically and semantically) valid ALGOL 60 programs. If L were context-free, then so would M = Ln({beginKtealK*n;K*}*{:=K*r{end}) since it would be the intersection of a context-free language and a regular set. But M = {begin realxl;xJ : = xk end| i = j = k,i> 1},
220 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS x/a x/b X/a begin/A end/A real/A FIG. 6.4 where the condition i=j = k follows from the semantic rule that every variable that occurs in a valid ALGOL program must be previously declared. Also, /> 1, since there is no bound on the length of identifiers in ALGOL. Let S be the gsm shown in Fig. 6.4. Then S(M) = {a"b"a"\ n~> 1}, which is a contradiction since context-free languages are closed under sequential-transducer maps. Thus L is not context-free. □ Now let us turn to another language. Theorem 6.7.2 FORTRAN, as specified below, is not a context-free language. Proof Let F be the set of all valid ASA FORTRAN programs with the assumption that there is no bound on the number of continuation cards. Let R be the following set, where the representation of R is written on separate lines to make it more intuitive .t R = { WRITE (6, 1) 1 FORMAT(1}{0}*{//}{X}*{) END} Clearly, R is a regular set. If F were context-free, then L = FC\R would be, but we have that L = { WRITE (6, 1) 1 FORMAT(10"HX10") END|w>0} Define the homomorphism tp by <p(X) = X and <p(a) = A otherwise. Then XL) = {x10"\ n>0} would be context-free but it isn't. This contradiction establishes that F is not context- free. □ Now we turn to PL/I, which is a descendant of both ALGOL and FORTRAN. The argument used for ALGOL will not work for PL/I, since it is not necessary to ;/A t The characters {and} are meta characters not in the FORTRAN character set.
6.8* PROPER CONTAINMENT OF LINEAR LANGUAGES IN 6 221 declare every identifier. However, the generality of structures in PL/I allows the number of dimensions of an array to be unbounded. Theorem 6.7.3 PL/I is not context-free. Proof Let P be the class of valid PL/I programs and suppose P were context- free. Consider M = Pn ({MAH: PROCEDURE OPTIONS (MAIN): DECLARE A(l :4}{,l :4}* {); A(2}{,2}*{) = A(3}{,3}*{;ENDMAH;}) The quantity in parentheses is regular so that M is context-free. Thus M = {MAH: PROCEDURE OPTIONS (MAIN); DECLARE A(l:4{, 1:4}'); A(2{,2}') = A(3{,3};); END MAH;| i > 0} This follows because (i) there is no bound on the number of dimensions in PL/I and (ii) each reference to the array must have the same number of indices. Define a homomorphism <p as follows: tp4 = a <p2 = b tp3 = c ipx = A in all other cases. Clearly, ipM= {a'b'c'\ /> 1}, which would be context-free if P were. This contradiction establishes that PL/I is not context-free. □ PROBLEMS 1 In Theorem 6.7.2, no bound was placed on the number of continuation cards in FORTRAN. Suppose this quantity were bounded. Would FORTRAN be context-free? Explain. 2 Is ALGOL 68 a context-free language? 3 Is PASCAL a context-free language? 4 Is COBOL a context-free language? 6.8* THE PROPER CONTAINMENT OF THE LINEAR CONTEXT-FREE LANGUAGES INC The techniques of using closure properties which have been presented in this chapter are of great value. To illustrate this, we return to the case of linear languages and prove a theorem from which it follows that there are context-free languages which are not linear.
222 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS ,, „Q0 >QQ S; -Q0 ^ ^O a/A c/A a/a c/A FIG. 6.5 The definition of Sj and S2; a denotes any letter of 2. Theorem 6.8.1 Let 2 be a finite nonempty set, c$ 2 andi?,X9 I,*.\fRcX is a linear context-free language, then either R or X is regular. #•00/ Let L(G) = i?cX = L, where G = (K, 2U{c},i>,5) is a linear context-free grammar. Assume, without loss of generality, that G is reduced. Now define two gsm's Si and S2 by Fig. 6.5. The reader may easily verify that Si(RcX) = R S2(RcX) = X. Now define a set of variables C = {A E7V| A ^w for some w E 2*c2*}. Since each wGL contains exactly one c,ifA =>aE V*, then either 1 a contains no c and at most one variable which belongs to C; or 2 a contains a c and no variables which belong to C. Now we partition P into four disjoint sets P0 = {A^u\ ueZ*}U{A^uBv\ BeN, u,uE2*}, Pi = {A ->ucv\ u,uE2*}, P2 = {A^UiCU2Bv\ BeN,uuu2,vej:*}, P3 = {A^-uBviCV2\ BeN,u,vuv2ei:*}. The variables that occur on the righthand side of rules in P2 L)P3 are not in C, since, if they were, one could derive a word in L with two c's. Define three grammars: Gi = (K,2U{c},i>0Ui'1,5), G2 = (V,i:u{c},P0L)P2,S) G3 = (K,2U{c},i>0Ui>3,S) and let ^ = L(Gi), 1<*<3.
6.8* PROPER CONTAINMENT OF LINEAR LANGUAGES IN Q. 223 Again, using the fact that if w E L, it has exactly one c, we have that every derivation in G uses exactly one rule in Px U P2 U P3. Thus L = LiUL2UL3. Claim 1 Si(Li), Si(L2), and S2(L3) are regular. Proof We give a right linear grammar G\ = (K, 2,/>;,£) such that S^LO = L(G\). Let Pi = {A^u\ ,4 ^u is inland «E 2*} = L){A^uB\ A ^ uBv is in P0,BeN} = U {A->-u\ A^- ucv is in Pi}. To show that L(G[) = Si(Li) it suffices to prove: Claim S^ucv if and only if S=>u. Proof The argument is omitted. Thus L(G\) = Sl(Ll). Since G\ is right linear, L{G'{) is regular and therefore Si(Li) is regular. Similar proofs show that Sl(L2) and S2(L3) are regular, which proves the claim. Now, since L = RcX = LiUL2UL3, then R = SiiL) = SiiLJUSiiLjUSiiLs). (6.8.1) Also since X = S2{L) = S2(Li)US2(L2)US2(L3), then S2{L^X. (6.8.2) It will suffice to prove the next claim because then we will have that R is regular orXis. Claim 2 EitherR=Si(L1)USi(L2) orX = S2(LZ). Proof From (6.8.1) we have Si(Li)USi(L2)^R. Now suppose Si(Li) U Si(L2) J R. Then there exists a w such that wGR-(Si(Li)USi(L2)). Since wGR, wcX9RcX = L. (6.8.3)
224 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS But wcXn^ULi) = 0, (6.8.4) since Si(wcX) = w$(S1(L1)USi(L2)). From (6.8.3) and (6.8.4) and the fact that L = i1UL2UL3we have wcX£L2. Therefore, S2(wcX)= X9 52(L3). Thus, from (6.8.2), we get: S2(L3) = X, which proves Claim 2. Claims 1 and 2 show that either R or X is regular, and this proves the theorem. □ Corollary Let 2 be a finite nonempty set, c^2, and R £ 2*. If RcR is a linear context-free language, theni? is regular. Theorem 6.8.2 There exist context-free languages which are not linear. Proof Let G be the context-free grammar given by S -> TcT T -> aTb\A Then L{G) = {fl'ft'l i>0}{c}{aibi\j>0}. By the above corollary, if L(G) were linear, then R = {flWl i>0} would be regular. But, from Theorem 2.2.1, R is not regular. Therefore L(G) is not linear. □ PROBLEMS 1 Let R 9 2* be a regular set and c $■ 2. Show that L = {xcxT\ xGR} is a linear context-free language 2 i is said to be a ucv-language if L 9 2*c2*, where c $■ 2. If L is such a language, let u, vE. 2*, and define /l(") = {w| «w6£}, gL(v) = {w\ wcvGL}, U(L) = {w| fL(w) ¥=9], V(L) = {w\ gL(w) *9}.
6.9* SEMILINEAR SETS AND PARIKH'S THEOREM 225 Prove the following result. Proposition Let L be a ucv context-free language. If, for each u E2*,/I/(w) is finite, then V(L) is regular. If, for each uE2*, gL(v) is finite, then U(L) is regular. 3 Use the result of Problem 2 to show that if a regular set gets run into a pda, the contents of the pushdown store is regular. More formally, let A = (Q, 2, T, 5, q0, Z0, 0) be a pda and let R £ 2* be a regular set. Show that PR = {aer*| (q0, w,Z0) f^-fa.A.a) for someq EQ and w GR}. [Hint: Consider Lq = {wca\ (q0, w, Z0) r~(q, A, a), where u>E2*} but watch out for the possibly infinitely many A-moves.] 4 Show that the linear languages are not closed under product. 5 Let R, X£ 2*, and c £ 2. Show that, if R and X are linear context-free languages and either R or X is regular, then RcX is a linear context-free language. 6 Let X'- 2* and cf£. Show that AcA" is a linear context-free language if and only if X is regular. 7 Let G = (K, 2,/", 5) be a metalinear grammar, as defined in Problem 4 of Section 2.5. The width of G is defined as max {m \ S -»• .4 i ■ ■ • .4 m is in P}. L is said to be metalinear of width k if L = L(G) for some metalinear grammar of width k. Let X^ X* for 1 </< m,m > 2, and let c<£ 2. Show that if A^cA^ • • • cXm is metalinear of width k > 1 and if X\ is not regular, then X2cX3 ■ ■ • cXm is metalinear of width at most k — 1. 8 Let Xjt 2* for 1 </<w and let c<£2. Show that if X^ ■ ■ -cXm is metalinear and of width k, then at most k of the A*,- are not regular. 9 Let X^ 2* and c<£2. Show that (A"c)*X is metalinear if and only if AT is regular. 10 Let 2 be an alphabet and c $ 2. If L *~ 2*, then define N(L) = {xcy\ x,y&L,x±y}. Show that, if L is regular, then N(L) is context-free. 11 Is it true that N(L) context-free implies L is regular? 12 Show that an affirmative answer to Problem 11 gives an elegant proof that English is not context-free. [Hint: Consider sentences like "John is more successful as a V than Bill is as a 'z'."] 6.9* SEMILINEAR SETS AND PARIKH'S THEOREM In this section, a natural mapping from sets of strings into sets of «-tuples of natural numbers will be explored. The resulting theorem is a classical one in language theory. The applications of this theorem will be found in the exercises in future sections. We approach the proof by first stating a result which follows easily from the Iteration Theorem.
226 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Lemma 6.9.1 (The Pairing Lemma) For each context-free grammar G = (V, 2, P, S) there is an integer p such that, for any k > 1, if w E L(G) and lg(w) > pk, then there exist strings vu v3, vs, v2l,. .. , v2k, v4l,... , v4k E 2* and A GN such that: i) S =>v1Avs=>vlv2iAv4lvs ^■vlv2lv22Av42v4lv5 =^1^21 • --v2kAv4k ■ --v^Vs ^Vi v2l ■ ■ ■ v2kv3v4k ■■•v/nvs = w ii) v2iv4i =£ A for each /, 1 < i < k; and iii) lg(u21 • • • v2kv3v4k ■ ■ ■ v4l)<pk. Proof When k = 1, the lemma is a minor modification of the pumping lemma. The proof of the Pairing Lemma merely follows that of the Iteration Theorem except that, because lg(w)> pk, the crucial path in the generation tree for w has k repetitions of some variable A. It is not necessary to even worry about positions. □ A fair amount of notation must be built up in order to state the main theorem. Let H be the set of natural numbers as well as the commutative monoid consisting of (^ under +. |\|" denotes the cartesian product of py with itself n times. Let* = (jti,. . . ,xn) andj> = (yu ... ,yn)be inN" and define x+y = (*i+j'i,...,*n+J'n) and for m > 0, define mx = (mxu ..., mxn). Definition A set of the form {a0 + n!<*! + ■ ■ • + nmam\ rij>0 for 1 </<m}, where a0,..., am are elements of N", is said to be a linear subset of Nn. A semi- linear set is a finite union of linear sets. Let us fix 2 = {«!, .. . ,an} and define a mapping from 2* into N" given by HW) = (#ai(w),...,#an(w)), where #a.(w) denotes the number of occurrences of at in the word w. Note that, for anyx,j>£2*, Mxy) = «h*) + My) = My) + <H*) = My*)- \p is sometimes called a Parikh mapping. For any L 9 2*, define i,(L) = {4/(x)\xGL}. Sometimes </>(X) is referred to as the "commutative image" of L.
6.9* SEMILINEAR SETS AND PARIKH'S THEOREM 227 Example Let L = {a\a\\ i > 0}. Then «H£) = {{iJ)\i>0} = {i(\,\)\i>0}. Thus, </>(L) is a linear subset of N2. We sometimes say that a language L 9 2* has a semilinear image if t/>(L) is semilinear. A mapping like \p can induce an equivalence relation on languages, as follows: Definition Two languages Ll and L2 contained in 2* are called letter- equivalent if i/>Li = ^L2. The languages with a semilinear image can be characterized quite simply. Theorem 6.9.1 A set L 9 2* has a semilinear image if and only if L is letter- equivalent to a regular set. Proof Suppose L has a semilinear image; that is, \pL = MiU---UMm, where each Mi = {ai0 + nnan+ ■ ■■ + nir.\ ni}>0 for 1 </</•,}. Let yi0,yn, ■ ■ ■ ,yir- be strings in 2* whose images under \p are, respectively, a,0, an,..., air.. Then define G by the productions: S ■+Al\---\Am At ^ Aiyij\yi0 for all i,/, \<i<m, 1 </</•;. Since G is left linear, L(G) is regular and, clearly, «H£(C)) = «/,(£)• Conversely, we shall show that any regular set has a semilinear image by Kleene's Theorem. It is clear that *0 = 0 and that ^{a,}is semilinear. CASE 1. If 4/Li and t/>L2 are semilinear, then \[/Ll U t/>L2 is semilinear. Proof If t//L,- = Ln U • • • U L,fe. for i = 1, 2, then «H£iUL2) = U Ui,7 1 = 1 ;'=l is a semilinear representation of Lx UL2. CASE 2. If Li and L2 have semilinear images, then so doesLiL2.
228 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Proof Since concatenation distributes over union and we have closure under union, it suffices to assume that \pLi = Mi and \jjL2 = Mi, where Mi and M2 are linear. Let M{ = {at + nnaji + ■•■ + nim.aim.\ nt > 0} for i = 1,2. It is easy to verify that 4/(LiL2) = Mi+M2 = {o^ + a-2 + «i1a11+ ••• + «imialmi+«21a21+ ■ ■ ■ + n2m2a2m2\nij>0}. The proof for * is a little tricky so it is convenient to do a special case first. CASE 3. If 4/L is linear, then \[/L* is semilinear. Proof Suppose 4/L = M = {a0 + nidi + ■ ■ • + nmam\ nt>0}. Note that 4/(L*) = {0}u{ao + noao + "iai + ■ • ■ + nmam\ nt>0}, where 0 denotes the zero vector in Nn. CASE 4. If \pL has a semilinear image, then so does \[/L*. Proof Let L - Lx U • • • U Lk, where each L; £ 2* is a set whose image is a linear set in Nn. By Cases 2 and 3, I* • ■ • I* has a semilinear image. The proof of Case 4 now follows easily from the observation that «/,L* = «K(£i U • • • U Lkf) = i,{LX ■ • • Li). The converse now follows from Kleene's theorem. □ We are now ready to prove the main result of the section. Theorem 6.9.2 (Parikh's Theorem) For each context-free language/,, \pL is semilinear. Equivalently, each context-free language is letter-equivalent to a regular set. Proof Let G-(V, Y.,P,S) be a context-free grammar such that L(G) = L, and let p be the constant from the pairing lemma. For any ae V*, define v(a) = {A eN\ a = (3Ay for some j3, y e V*}. That is, v(a) is the set of variables that occur in a. We extend v to derivations. That is, if 6 is a derivation, 6: a! =>•••=> a„, define n K0) = U K<*<)-
6.9* SEMILINEAR SETS AND PARIKH'S THEOREM 229 Now let U be any set of variables containing S. Define Lv to be the subset of L consisting of words derivable from S in which the only variables used were from U. More precisely, Lv = {weL\ Forsomefl: S =*• w,v{6) = U}. Claim 1 It suffices to prove that each Lv has a semilinear image or, equiva- lently, that each Lv is letter-equivalent to a regular set. Proof There are only finitely many sets U and hence sets Lv. Moreover; L = U Lv, u so that Claim 1 follows. Consider a set Lv. Let k= \U\ and define F = {weLu\ lg(w)<pfe}. Now consider the set {xy e 2* 16: A ^ xAy, v(6) 9 U, lg(xy) <pk}. Because the length of xy is bounded, this is a finite set, so let ^1> . . . »Sm be any fixed enumeration of the elements of the set. Claim 2 WLu)^ UK* ■■■**„)• Proof The argument proceeds by strong induction on lg(w), where w&Lv. Let n0> 0 be fixed and suppose that *(z)e *(/*:•■■**) for all z&Lv, with lg(z)<w0. Now suppose w&Lv and lg(w) = n0. If lg(w)<pfe, then w E F, so that \jj{w) E \jj(F,s* ■ ■ • s^). Suppose lg(w) > p . Since w £ L,fj, there is some derivation 6: S 2* w with v(Q)= U. Since lg(w)>pfe, the pairing lemma yields a derivation 0' such that =* ViV2lAv4lv5 => vxv2l ■ ■ • v2kAv4k ■ ■ ■ v4lvs => vlv2l • • • v2kv3v4k • ■ ■ v4lv5 = w 'fe-n
230 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS where A G U and 1 < \g(v2l •" ■ v2kv3v4k ' ' ' v4i)^Pk ■ Certain subderivations of 9' have been distinguished. Note that v(8') = v(d) = U. Since fe + i v(8') = U ii8i) = U, each variable in U — {A} is in at least one vfOf) for some i, 0 < i < k + 1. For each variable in U — {A }, arbitrarily choose some i, 0 < i < k + 1, such that y(0,) contains that variable. Some values of / may be chosen more than once. Since there are only k — 1 variables in U — {A} and k values between 1 and k, some value of i,Ki<k is not chosen at all. Let i0 be such an index. That is, p(9t ) contains only^l. Now delete the subdrivation 0; from 0' to yield 0": S ^ vxv2l • ■ • v2io.iV2io+1 • ■ ■ v2kv3v4k ■ ■ ■ i>4,yh*>4;0-i • • • v41v5 = w'. From the choice of i0, every variable in U — {A} is in v(Q"). Clearly, A E v(8"), so v(0") = U. Hence, w E Lv. Since lg(w') < lg(w) = n0, 4>(w')e4,(Fst ■■■**„) by the induction hypothesis. Since w'v2i v4i is a permutation of w, \l/(w) = ^(w'v2iov4io)G\l/(Fs* ■ • ■ s*mv2iv4i). But v2iov4io is in {su . . . ,sm}, so iKw) S ^(Fs? • • • s*mv2iv4ii) = \l/(Fs*i ■ • ■ s^n). Claim 3 *(/*!• ••s£)c,KLu). Proof Let wSLy, and we prove that i//(ws,-)E t//(Xi/) for any i. From the definition of thes,-, we can write s,- = v2v4 for some v2 and u4, and there is some B&N and some derivation t 6: B =" u25u4 with ^(0) = U. Since w E L^, there is a derivation of w in which 5 occurs; for example, 5 =*■ xBz => xyz = w but then t S =*■ xBz =*■ xv2Bv4z =*■ jo^j^z = w Now w' E Lv and tZ'(w') = t/'Cws,). Thus, ^(L[;S,)9 iHLfj) for each s,- and since this shows that Mwsi) = *(w') G *(/*! • • • C) ^ <H^i • • • C) ^ Wu). By Claims 3 and 4, *(/.„) = *(/*?•••«;,). Since Fs* • • • s^ is clearly regular, the proof is complete. D
6.10 HISTORICAL SURVEY 231 PROBLEMS 1 Show that a set of ^-tuples of natural numbers is linear if and only if it is a coset of a finitely generated subsemigroup of N". 2 Show that the family of semilinear sets of N" is closed under intersection and complementation. Show that if Li'- Nm and Z,2-^™ are semilinear sets, then L i x L 2 is a semilinear subset of Nn * m. 3 Find a set 1^2* which is not a context-free language but which has a semilinear image. 4 Use Parikh's Theorem to show that every context-free language L 9 a* is regular. We begin to build a logical system involving addition and the logical connectives. The formulas will express statements about natural numbers. The set of Presburger formulas, &, is the least class of formulas satisfying the following conditions. a) For given nonegative integers nt, n\, 0 < ; < m, m m "0+ E"i*i = "0+ Y*n'lxi is a formula in & with free variables x\,... , xn. b) If/>1,/>2e.#8,thensois/>1n/>2. c) If Pu P2 S & , then so is Pi U P2. d) If P is in &, then so is ~l P. e) If P(xi,. .. ,xn) is in & and 1 <<<«, then (Vx,)/>(xi,. ..,*„)£.#*. f) lfP(xi,. . . ,x„) is in & and 1 <i<«, then the formula (3xi)P(xu. ..,xn) is in &. A formula with no free variables is called a sentence. *5 Show that it is decidable whether or not a Presburger sentence is true. If P(xi,. .. , xn) is a Presburger formula, it defines a relation as follows: Ap = {(al5. . ., a„)\ P(ai, ... ,an) is true}. A Presburger set A £ N" is of the form Ap for some Presburger formula P(xu ... , xn). 6 Show that the family of Presburger sets in Nn is identical with the family of semilinear sets in Nn. 6.10 HISTORICAL SURVEY The pumping lemma first appears in Bar-Hillel, Perles, and Shamir [1961]. The iteration theorem is from Ogden [1968]. The example used in Theorem 6.2.4 is due to Wise. Problems 10 through 12 in Section 6.2 are from Hartmanis and Shank [1968]. The L nR theorem is due to Bar-Hillel, Perles, and Shamir [1961], although our machine proof is much simpler than the original argument. Theorem 6.3.1 is from
232 THE ITERATION THEOREM, NONCLOSURE AND CLOSURE RESULTS Ginsburg and Rice [1962]. While gsm's and context-free languages were first studied in Ginsburg and Rose [1963], our development follows Ginsburg, Greibach, and Harrison [1967]. Problems 4-7 of Section 6.3 are from Kosaraju [1975]. Theorem 6.6.3 is from Haines [1969]. Our notation for quotient follows that of the "French school"; cf. Eilenberg [1974]. Exercise 9 from Section 6.3 is fromNivat [1967]. Theorem 6.8.1 is from Greibach [1966]. At the end of Section 6.8, Problem 2 is from Bar-Hillel, Perles, and Shamir [1961], and Problem 3 is from Greibach [1967]. Parikh's theorem occurred originally in Parikh [1961] and was republished in Parikh [1966] due to its importance and the inaccessibility of the original note. Our proof follows Goldstine [1977].
seven Ambiguity and Inherent Ambiguity 7.1 INTRODUCTION \ In earlier sections, ambiguous context-free grammars have been introduced, and we defined a language to be inherently ambiguous if every one of its infinitely many context-free grammars was ambiguous. Either such languages exist or for every ambiguous grammar there is an equivalent unambiguous grammar. In either event, this is an interesting phenomenon. It turns out that the former case holds, and in Section 7.2, this is proved with the aid of the iteration theorem. In Section 7.3, we strengthen the above result by showing that there are inherently ambiguous languages that have an exponential number of derivation trees in the length of the string. Since the number of factorizations of a string of length n is exponential in n, this is as many as possible. Section 7.4 is devoted to showing that certain operations preserve or do not preserve unambiguity or inherent ambiguity. While those results are simple to prove, they turn out to be very useful. 7.2 INHERENTLY AMBIGUOUS LANGUAGES EXIST Our earlier discussion of inherently ambiguous languages was very brief and consisted merely of a definition. Now we shall use the iteration theorem to prove that they exist. The first step in the proof is to examine phrases. 233
234 AMBIGUITY AND INHERENT AMBIGUITY Definition Let G = (V,X,P,S) be a context-free grammar and suppose 17, j3G V*. ]3is said to be a phrase of 17 if there exist a, 76 V*,A EN, so that * n S => aAy A =*/? for some n > 1 and 17 = a/37 /3 is a simple phrase of 17 if (3 is a phrase of 17 as above, in which n = 1. Definition Let a and /3be subwords of 7, where a,/3, 76 K .aand /3 are said to be disjoint if there exist p,a,r& V* so that 7 = pao/Jr or 7 = pfioca. The following simple result is needed. Theorem 7.2.1 Let G = (V,X,P,S) be an unambiguous context-free grammar, let x 6 L(G), and let xx and x2 be phrases of x. Either: i) xl andx2 are disjoint, ii) Xi is a subword of x2, or iii) x2 is a subword of xl. Proof Consider the (unique) rightmost derivation for x: Since j^ in 2* is a phrase of x, there is an index i such that ]3,- = atAVi and ^4 j**!. Since the derivation is rightmost, v{ 6 2*. Similarly, there is a number / so that j3y = oijBVj, B=>x2 and ^ 6 2*. There are 3 cases: /=/, />/, and /</. The third case will follow from the second by symmetry. Let z, be the node of the tree labeled/I while z2 has label B. If i = /', then zl =z2 and at = as,A = B, and p,- = v-r Thus xt = x2. Now assume / < i. In the derivation for x, we have either one of two choices. Since j3y- j*/3; or a,ify- ^> <XiAvh our two choices for subderivations of the derivation in question are B ^ yAy for some 7 6 K* and y 6 2* or (fylj^yUy for some 7' 6 K* and y 6 2*. These cases are represented by Fig. 7.1. CASE 1. Suppose B => yAy for some 7 e V* and .y e 2*. Then we have Bj>yAy=> yx^^ x2 because B ^> x2 and A J> xi and both are in 2*. Therefore, Xi is a subword of x2. CASE 2. Suppose a,- s> 7'^' for some 7' e K* and some j'' e 2*. Then S=>oijBVj =>UjX2Vj => y'Ay'x2Vj=>y'xly'x2vj =>x Thus,*! andx2 are disjoint phrases of x. □ Corollary Let G = (V, 2,i>,5') be a context-free grammar. Let* &L{G) and Xi,x2 be phrases of x. If *! and x2 are incomparable (with respect to <, that is, one
7.2 INHERENTLY AMBIGUOUS LANGUAGES EXIST 235 AT] isasubwordof x2 (a) x\ x2 a,±> y'Ay' AT] and x2 are disjoint (by FIG. 7.1 is not a subword of the other) and they are not disjoint, then G is ambiguous; that is, if*! andx2 "overlap," then G is ambiguous. Now the idea of our proof that there exist inherently ambiguous languages can be explained. We shall take two languages L0 and L, and show that L0 u^i is inherently ambiguous. We will study a string z0 in L0 —Lx and determine the form of its factorization. Another string zx in L\ — Lo will be studied in its factorization determined. By pumping these strings we shall obtain overlapping phrases of a string in L0 U Lx and conclude that the grammar must be ambiguous.
236 AMBIGUITY AND INHERENT AMBIGUITY Notation LetL0 = {alblc'\i,j> ljandLj = {a'b'c1] i,j>l}- Theorem 7.2.2 L0 U Lx = {a'b'ck \i=j or/ = fc}is an inherently ambiguous context-free language. Proof Since Z,0 u L\ is context-free, let p0 be the number give by the iteration theorem. Let p = p0\ and let z0 = a2pb2pc3p = a2"^ ^ c3p = x0u0w0v0y0 where the underlined ft's are distinguished; that is, K = {3p + I,... ,4p}, and x0UowoVoyo is a factorization given by the iteration theorem. We must determine the factorization of z0 6i0 ~~L\. There are two cases according as Kx, K2, K3 # 0 or K3, K4, Ks =£ 0. CASE 1. Suppose KX,K2,K3±$. This leads to: x0 = a2pbp*1 />0, u0 = bk k>0, w0 = b"'a+k)w' k <p0<p andf+ k<p. CASE la. Suppose v0 6 ft*. Then we would have: x0 = a2pb"+1 "o = bh w0 = ft8 »b = ftm = j»-<^ C+m)_3p />o, ifc>0, £>0, m>0, But then But^oH'oj'o f L0, since k>0andxowoyo ^ Lx, because £> 0 implies that: fc + w</ + fc + e + w<p. CASE lb. v0 6 ft+c+. The factorization is: *o = a2pbp+1 />0, u0 = bh k>0, w0 = bs £>0, _ f,P-U+h + Z)nm y0 Vo = frP-U^Ky" B1>0,/ + ik + B<Po<P, „3p-m
7.2 INHERENTLY AMBIGUOUS LANGUAGES EXIST 237 Then x0ulw0vly0 = a2pbp+ib2kbibp-u+k+Z)cmbp-<-i+k+Z)cmc3p-m Ea+b+c+b+c+ and thereforex0u%w0v%y0 £ L0 UL,. CASE lc. v0 Ec*. Then the factorization becomes: x0 = a2pbp+i />0, u0 = bk p-2>k>0, w0 = ip-<J'+fc)cc k<po<p,fl>0, v0 = cm m>0, „ _ „3p-(C+m) Consider x0w0y0 = a2P^p-kc^p-m ^ Xqw0y0 $ L0, since 2p — k < 2p because k > 0. But x0w0y0 EL0 Ui, by the iteration theorem, so that x0w0y0 6L,. This implies that 2p—k~3p—m,ov that m = p + k. However, let us consider x0ulw0v20y0 = a2pb2p+kc3p+m But x0ulw0vly0 $L0 ULj, since fc > 0, and2p + k ¥^4p+k since p >0. This shows that Case 1 cannot hold. CASE 2. K3,K4,KS^ 0. We must have that: y0 = blc3p i>0, Vo = iJ' 7>0, w0 = w'bk k>0. CASE 2a. u0 Eb*. Then the factorization would be: .Vo = b'c3p Vo = b'1 w0 = bk u0 - b x0 = a2pb2p-^+h+i+i) />0, />o, Jfc>0: Z>0, But then x0w0y0 = a2pb2p-«+i+k+i)bkb'c3p _ a2PA2p-(y+E)c3p Since/ > 0, XoWcVo £ £o u £ i , which contradicts that w0 e A*
238 AMBIGUITY AND INHERENT AMBIGUITY CASE 2b. u0 Ga+b+. In this case,x0«Jwo v\yoea+b+a+b+c+, which would be a contradiction. CASE 2c. u0 Ga*. This leads to the factorization: y0 = b'c3" i>0, v0 = b' />0, w0 = aeb2p-u+i) i+j<p,V>0, h0 = ah 0<Jfc<2p-£, x0 — a To complete our picture of the factorization, suppose k #/. But then x0U(jW0v0y0 —a d c But x0u%w0v\y0 $.L0, since &=£/, andx0u\w0v\y0 f /,,, since 2p + j <3p. Therefore, we have the following factorization, in which we have subscripted all the indices for future applications: x0 = «w-<A> + E.> "o = a1" /o >0, w0 = a^b^-Vo+M p-(i0 +/o)<Po <P.«o >0> »o = *'° po>/o>0, y0 = b'oc3" p-l>io>0. Note that/o <Po, since otherwise \K^\ >Po would contradict the iteration theorem. Thus/0|p and q0 = (p//0) + 1 is an integer. We have that S^>x0A0y0£>x0uQ0"A0vS'>y0UXoU9oWoV9oy0 = a3pb3pc3p = z and that ^0^m3»Wo* = ap+J° + s°b3p-'° (7.2.1) Next we consider the word z, = a3p£2V = a3p £ bpc2p, where K = {3p + 1,. .. , 4p}. The problem of obtaining a factorization of z, is "dual" to the problem of factoring z0. To see this duality, we reverse z0 and interchange a's and c's. (Note that the symmetry of the iteration theorem under reversal plays a role here.) Thus it follows that zx =xlulwlvlyl and Kl,K2,Ki #0. The factorization is: *i = a3pb'> p-i >il >0, "i = bj> p0 >/, ^ 1,
7.2 INHERENTLY AMBIGUOUS LANGUAGES EXIST 239 Let qi = (p//j) + 1. Since/, <p0,Qi is an integer. We have that S^XiA^i ^>xlu1'Alv9^yl £■ xlu\<wlvq<yl = a3pb3pc3p and * ... Al=>u91>wlvV = b3p-l'Cp+1'+k' (7.2.2) Now let us consider the string a3pb3pc3p. We know that we may "pump up" our factorization of z0 and z1 to get factorizations of a3pb3pcip (see Fig. 7.2). We must now see how the phrases that derive from^40 and/I, fit into the picture. Recall that A0^ap*i°+*°b3p-i° (7.2.3) and A0Z>ap+i°+*°b3p-t° Al±>b3p-i>cp+}''+k' S (7.2.4) FIG. 7.2 But it is clear that the phrases (7.2.3) and (7.2.4) overlap, since the minimum length of the b's in the righthand side of (7.2.3) is: 3p-z'0 >3p~P = 2p, while the minimum length of the b's on the lefthand side of (7.2.4) is: 3p — il >3p— p = 2p. There is no way to partition (3p) i's into two classes each of which has size greater than 2p without overlap, as can be seen from Fig. 7.2. Therefore the arbitrary grammar G is ambiguous, and hence L0 UL, is an inherently ambiguous context-free language. □
240 AMBIGUITY AND INHERENT AMBIGUITY PROBLEMS 1 Show that {aaT| a 6 {a, b}*}2 is an inherently ambiguous context-free language. (Hint. Start withz0 = aba2pbba2pbaaba3pbba3pba.) 2 Find a minimal linear language L which is inherently ambiguous among the class of minimal linear grammars yet L has an unambiguous linear grammar. 3 Let M={a'bai+lb\i>0} and define L0 = abM* and LX =M*a*b. Show that L0 UI, is inherently ambiguous. 4 A set L 9 2* is bounded if there exist w,, .. . , wr 6 2* so that L 9 w* • • • w*. Settle the following question by providing a proof or a counterexample. For each inherently ambiguous language L, is there some bounded regular set R £ w* • • • w* such that L O R is inherently ambiguous? 7.3 THE DEGREE OF AMBIGUITY Now that we know that inherently ambiguous languages exist, let us take a more refined view of ambiguous generation. Definition Let k > 1. A context-free grammar G is ambiguous of degree k if each string in L(G) has at most k distinct derivation trees (equivalently, leftmost derivations). If k > 2, then L is called inherently ambiguous of degree k if L cannot be generated by any grammar that is ambiguous of degree less than k but can be generated by a grammar which is ambiguous of degree k. L is finitely inherently ambiguous if there is some k and some G so that G is inherently ambiguous of degree k. For example, L0 Ui, = {a'b'ck\ i = j or /= k} is inherently ambiguous of degree 2. Since a grammar which is ambiguous of degree 1 is unambiguous, it only makes sense to use the phrase "inherently ambiguous of degree k" when k>2. Also note that if G is ambiguous of degree k, then k = sup{0(x)|x 6 2*},t where a(x) is the coefficient of* in the formal power series, as described in Section 4.9. Definition A context-free grammar G is infinitely ambiguous if, for each i > 1, there exists a string in L(G) which has at least i leftmost derivations. A language L is infinitely inherently ambiguous if every grammar generating L is infinitely ambiguous. Example Consider the grammar G: s^ssw where L(G) = {A}. G is infinitely ambiguous but L(G) is an unambiguous language. We begin our development by reconsidering L0 U L j of Section 7.2. t Suppose S is a partially ordered set with respect to <and S0 - S. An element sES is an upper bound for So if So **= * for every s0 &S0- An element s* 6 S is a least upper bound for S0 if s* <s for every upper bound s of So. We also write s* = sup(S). A similar definition holds for lower bounds and greatest lower bounds, and the notation used is inf(S). If i^ is a function with domain X, one writes sup ip(x) for sup{<p(x)\x EX}.
7.3 THE DEGREE OF AMBIGUITY 241 Theorem 7.3.1 For any n > 1,(L0 U L i)" is inherently ambiguous of degree 2". Proof Let p0 be the constant of the iteration theorem for (L0 u^i)" and let p — p0!; by a straightforward generalization of the proof of Theorem 7.2.2, we can prove the following: If w2a2pbplb^ c3pw3 6(L0 UL,)n with wf+,€(£,„ UZ,,)^ for some kh i= 1,2, then: w2a2p62pc3pw3 = x0u0w0v0y0, where x0 = w2a2p-(yo+«o) m0 = a/o u>0 = aco62p-<i0+;0) £0>0, Wo = *y° p>po>jo>0, y0 = &''°c3pw3 p-1 >/0>0. Moreover, if we take q0 = (p//) + 1, then xottfrwovfrx, = w2a3pb3pc3pw3e(L0^>L1)n. Similarly, if we consider w2a33plb^ibpc2pw3 =x1«1w1d,}'1, then we find that Xl = w2a3pb^ p-1 >/, >0, "i = b1'' P>Po>h >0, w, = b2p-^+>i)ch> ix +/', <p,fc, >0, Taking g j = (p//,) + 1, we have that je,ii?'w,»f'^, = u>2a3p£3pc3pw3e(Z,0U/,,)". Using these factorizations, we are ready to prove the following result. Claim For any w,+ 1 £ (/,<, ULi)fc', i = 1,2, &i, fc2 > 0, the string w2a3p63pc3pvv3 € (/,<, U L2)" has at least two derivations. Proof We have already seen (in the proof of Theorem 7.2.2) that the two phrases ul°w0v%« and «?'w,i;f' overlap, so that w2a3pb3pc3pw3 has at least two derivations.
242 AMBIGUITY AND INHERENT AMBIGUITY If we consider the string (a3pb3pc3p)n, then we see that this string has at least 2" distinct derivations since we may generate any of the n subwords a3pb3pc3p in at least two ways. A straightforward argument shows that L is of degree exactly 2". □ Next we turn to showing that there is a context-free language with an unbounded degree of inherent ambiguity. Theorem 7.3.3 (L0UL])* is a context-free language which is infinitely inherently ambiguous. Proof Let p0 he the constant of the iteration theorem for (L0 u L j )* and let P = PoU we note that the claim of the previous theorem applies in this case as well. Therefore, for any «>0, the string (a3pb3pc3p)n has 2" trees. Since (/,<, UL,)* contains (L0 UZ,])" for any n, we can proceed as follows. Let i be any fixed integer and let n be such that 2" >i. Then (L0 U L2)" 9 (L0 UL,)* and has a degree of ambiguity that is greater than i. D PROBLEMS 1 Show that L = {wwT \ w 6 {a, b}*}2 is infinitely inherently ambiguous. 2 Let 2 be an alphabet with 121 > 2 and c $ 2. Show that Z, = {ucvxuTv2\ u, vx, v2 6 2*} is infinitely inherently ambiguous. 3 Is there a (minimal) linear language that is infinitely inherently ambiguous? 4 Show that, for each k>7, there exists an inherently ambiguous grammar of degree k. 5 Let G = (V, 2, P, S) be a reduced context-free grammar. Let it denote the production A -*Bi ■ ■ ■ Bk and letx62*. The degree of direct ambiguity of (7T, x) is the number of different factorizations such that A =>Bi ■Bk^>x1-xk where for each /, 1 < / < k, Bj 6 V, and * Bj=>xj and X = X i ' ' ' Xfc The degree of direct ambiguity of G = (V, 2, P, S) is k > 1 if k is the least integer such that any pair (n, x) with ttEP and xEL(G) has degree of direct ambiguity < k. A grammar has an infinite degree of direct ambiguity if for each ;', there is some pair (n, x) with degree of direct ambiguity 5s j. Similar definitions hold for languages. Show that the degree of direct ambiguity of a grammar is less than or equal to the degree of ambiguity. 6 Find a linear context-free language (i.e., give its grammar) which is infinitely ambiguous but has degree of inherent direct ambiguity equal to 1. 7 Let 2 be an alphabet, |2| > 2, and let c, d $ 2. Show that L = {ulCVxUiU2V2du1\ul,U2,Vl,V2&^*}
7.4* PRESERVATION OF UNAMBIGUITY AND AMBIGUITY 243 is a language which is infinitely inherently directly ambiguous; i.e., every grammar for L has an unbounded degree of direct ambiguity. 8 (Open question due to Shamir). Are there any languages whose degree of direct ambiguity is not 1 and not infinite? 9 Extend the concept of an unambiguous pda to obtain a family of pda's which correspond to the finitely inherently ambiguous languages. 7.4* THE PRESERVATION OF UNAMBIGUITY AND INHERENT AMBIGUITY BY OPERATIONS For a number of applications, it is necessary to know which operations do or do not preserve unambiguity and inherent ambiguity. Many of the basic facts about this have been studied when we studied the relevant closure operations, and it is a comparatively simple problem to combine our results. Our first proposition restates some facts which have been proven earlier in a different form. Proposition 7.4.1 a) If L is a context-free language whose intersection with some regular set is inherently ambiguous, then L is inherently ambiguous. b) If L\ and L2 are disjoint unambiguous context-free languages, then L\ U/,2 is unambiguous. c) If L is an unambiguous context-free language and R is a regular set, then L~ R and L U R are both unambiguous. d) If L is an inherently ambiguous context-free language and R is a regular set with L n R — 0, then L U R is inherently ambiguous. Now we study gsm's. Theorem 7.4.1 If L is an unambiguous context-free language and S is a gsm which is one-to-one on L, then S(L) is also an unambiguous context-free language. Proof The argument is merely a careful re-examination of the proof of Theorem 6.4.3 in the case where S is a gsm. Assume thatL is unambiguous so that A (using the notation of the proof of Theorem 6.4.3) is an unambiguous pda. Moreover, S is a gsm which is one-to-one on L. Now Qaim 4 in the proof of Theorem 6.4.3 becomes the following proposition. Claim 4' For each q, q eQus,s' 6Q2, w,y 6 A*,(36 P\ & 6 T+, and (Om, A, A), wy, m \j ((?', s', A, A),.y, 0'), using k > l moves of Type 1 if and only if there exist Mj, .. ., uh in 2 such that s0 = s, s' = SjCs.m, •• Mfe) S("i •••"*) = XjO.Mi •••"fe) = w, and + fa.ii, ■■•uk,0Z)\I(fi',A,&').
244 AMBIGUITY AND INHERENT AMBIGUITY Using Claim 4' and the facts that 5" is one-to-one on L and that A is unambiguous, one can show that B is an unambiguous pda so that S(L) is unambiguous. We omit the details. □ Next, we turn to a consideration of inverse gsm's. Theorem 7.4.2 If L is an unambiguous context-free language and 7Ms a gsm, then r_1(L) is an unambiguous context-free language. Proof Again, we use Theorem 6.4.3. LetL 9 A* and let the sequential transducer of Theorem 6.4.3 be T~x as defined in the proof of Theorem 6.5.2. Claim 4, from Theorem 6.4.3, becomes: Claim4" For each q, q'&Qx, s,s'eQ2, wj6S*, /36T*, 0'er+, and zer, ((<7,s,A,A),wjM3Z)£ (.(q',s',A,A),y,l5'), using k > 1 moves of Type 1 if and only if there exist u i,.. . , uk in A* such that s' = S2(s,u1 ■■■uk) T(w) = \2(s,w) = «! • • -uh and (q,Ul •■•uk,0Z)\I(g',A,&'). From Claim 4" and the unambiguity of the pda A, it follows that B is an unambiguous pda, so T~l (L) is unambiguous. □ Corollary 1 If L is an unambiguous context-free language and <p is a homo- morphism, then <p~lL is an unambiguous context-free language. Proof Every homomorphism is a one-state gsm mapping. □ Corollary 2 If L 9 2* is inherently ambiguous and S is a gsm which is one-to-one on 2*, then S(L) is inherently ambiguous. Proof If S(L) were unambiguous, then S-l{S{L)) = L would be unambiguous and that would be a contradiction. □ Corollary 3 Let L 9 S+ be an unambiguous (inherently ambiguous) context- free language and let wGS*. Then wL and Lw are both unambiguous (inherently ambiguous). Proof One can design a gsm S which maps L into wL and S is one-to-one on 2*. By Theorem 7.4.1, wL is unambiguous if L is and by Corollary 2 of Theorem 7.4.2, inherent ambiguity is preserved also. The case of Lw follows that of wL by the preservation of these properties under transpose. (Cf. Problem 1 of Section 7.4.) □ The next theorem will have an application in proving a metatheorem about decidability.
7.4* PRESERVATION OF UNAMBIGUITY AND AMBIGUITY 245 Theorem 7.4.3 a) The finitely inherently ambiguous languages are closed under inverse gsm mappings. b) The finitely inherently ambiguous languages are closed under quotients with singletons on both the left and right. Proof The argument for (a) is an extension of the proof of Theorem 7.4.2. If L is finitely inherently ambiguous, the pda A has a bounded number of accepting sequences and so does the pda B. The argument for (b) involves giving a machine construction for L{w}~1. It will be clear that this set is finitely inherently ambiguous if L is. The details are left to the reader. □ The previous theorem is rather weak. Since an unambiguous language is finitely inherently ambiguous, the result only says that these operations will not produce an infinitely inherently ambiguous language. The theorem does not even imply that these operations preserve inherent ambiguity. (Cf. Problem 2 of Section 7.4.) In general, any form of "recoding" destroys ambiguity and inherent ambiguity. We prove a simple result now and leave similar cases for the problems. Theorem 7.4.4 Neither unambiguity nor inherent ambiguity is preserved under projections (i.e., homomorphisms <p which map 2 into A) nor under product by a two-word set. Proof Let M= {da'b'c'li, j> 1}U {ea'tf cf\ i,j>\}. M is clearly unambiguous. Define a projection ip by the condition that, for all x£S, e if* = d or x = e, x otherwise. Clearly, . ,_ <p(M) = e(L0 ULO = {ea'b'ck\i = j or / = k). By Corollary 2 of Theorem 7.4.2, M is inherently ambiguous. On the other hand, define \p such that \px = a for all x 6 2. Then 1KI0UIO = {a'\i>3} = a3a*. Thus L0Ui, is inherently ambiguous and ii{L0 UL)) is unambiguous because it is regular. Turning to the second operation, define TV = {a'W| i, j > 1} U {cta'Vc'l i, j > 1}. N is surely unambiguous. Now we form {d,d2}N <px =
246 AMBIGUITY AND INHERENT AMBIGUITY and assume it to be unambiguous. Note that since d2a*b*c* is a regular set, then {d,d2}NC\(d2a*b*c*) = d2(L0^Ll) would be unambiguous. But by Corollary 2 of Theorem 7.4.3, it is inherently ambiguous. Let L' = a*b*c*Ud2d*a*b*c*Ud(L0UL1). L' is inherently ambiguous because i) d(L0 U Li) is inherently ambiguous, by Corollary 2 of Theorem 7.4.3; ii) a*b*c* V d2d*a*b*c* is a regular set disjoint from d(L0 ULi);and iii) by Proposition 7.4.1(d). But {d,d2}L' = d+a*b*c* is regular and hence unambiguous. This shows nonclosure of inherent ambiguity under product with a two-word set. □ PROBLEMS 1 Show that L is an unambiguous context-free language if and only if L is. 2 Show that, if L is inherently ambiguous and <p is a homomorphism, then <fl (L) is not necessarily inherently ambiguous. 3 Show that there is an inherently ambiguous language L and a gsm S which is one-to-one on L, such that S(L) is unambiguous. Actually S can even be a sequential machine, which means that A: Q x 2 -*■ A. 4 Show that neither unambiguity nor inherent ambiguity is preserved by init or by subwords. 5 Show that the set L = {flp6acrrfse'| (p = q and r = s) or (q = r and s = t);p,q,r,s, r 5s l} is an inherently ambiguous language. Moreover, L = {a, b, c, d, e}* — L is also a context-free language. 6 Find an unambiguous context-free language L whose complement is not context-free. 7.5 HISTORICAL SURVEY Parikh [1961] first showed the existence of inherently ambiguous languages. Our argument derives from the iteration theorem and uses techniques of Ogden [1968]. The notion of direct ambiguity is due to Earley [1968]. Problem 7 of Section 7.4 is from Shamir [1971]. Most of the results from Section 7.4 are due to Ginsburg and Ullian [1966]. Problems 5 and 6 are from Hibbard and Ullian [1966].
eight Decision Problems for Context-Free Grammars 8.1 INTRODUCTION In this chapter, we look at context-free grammars and languages from the point of view of actually computing certain information. Do there exist algorithms to determine (say) whether L(G) = 0 or L{G{)= 2*? Section 8.2 is devoted to some solvable problems, i.e., ones for which algorithms exist. Regrettably, that is a short section because most of the questions that arise do not have algorithmic solutions. In Section 8.3, the halting problem for Turing machines is reviewed informally. The Post correspondence problem is defined and shown to be unsolvable by reduction to the halting problem. Then this problem is encoded into some context-free languages. In Section 8.4, it is shown that one can't decide whether LlC\L2 = 9 for two context-free languages L\ and L2. One cannot decide whether L(G)= 2*. It is not possible to decide whether a context-free grammar G is ambiguous or not. A similar result holds for deciding whether L(G) is inherently ambiguous. In Section 8.5, we consider problems like "given a context-free language L{G), can we decide whether L(G) is deterministic?" That problem is unsolvable. If we replace "deterministic" by (say) "finitely inherently ambiguous," then the answer is the same. A general theorem is proved which deals with the decision question concerning whether L(G)6y, (or not), where Sf is an appropriate family of sets. 247
248 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS 8.2 SOLVABLE DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS AND LANGUAGES A number of basic decision questions about context-free grammars and languages will be investigated. In this section, the decidable questions are considered. It is somewhat discouraging that most natural questions are undecidable, but these negative results are important, because they indicate which problems can have no algorithmic solutions. Among the solvable questions, the complexity of carrying out the basic computations will be studied. First, we turn to the very important question of "recognition." Given a context-free grammar G = (V, 2, P, S) and a string w 6 2", is w 6 L(G) or not? This problem will be analyzed in great detail in a later chapter, where a variety of algorithms will be given. These algorithms run in under cubic time as a function of n, the length of the string to be recognized. For the moment, we content ourselves with proving that the problem is solvable and we give a simple (but computationally terrible) algorithm. Theorem 8.2.1 There is an algorithm to determine, for a context-free grammar G = (V, 2,P, S) and a string w 6 2*, whether or not w 6 L{G). Proof Assume, without loss of generality, that G is in Chomsky normal form. (Cf. Theorem 4.4.1.) If w = A, then w 6 L(G) if and only if S -> A is in P. Assume w6 2+, where lg(vv) = n > 1. Then S =>w implies t <2n ~ I, since G is in Chomsky normal form. (Cf. Problem 4 of Section 4.4.) We can decide whether w&L{G) by enumerating all derivations of length < 2« — 1. □ We now show that there is an algorithm that can be used to decide whether a context-free language is empty, finite, or infinite. In later sections, it will be shown that almost all other general questions of interest about context-free languages are not decidable. Theorem 8.2.2 There is an algorithm to determine whether an arbitrary context-free language is empty, finite, or infinite. Proof Let L = L(G), where G = (V, 2, P, S); L(G) is nonempty if and only if S 6 Wn, where Wn is the set constructed in the proof of Theorem 3.2.1. From the pumping lemma, we know that there exists a computable number p = p{G) such that L{G) is infinite if and only if there exists w 6 L(G) such that lg(w)>p. Let , « Lp = in/ (J 2'j = {weL\ lg(w)<p}. Lp is computable since U,- < p 2' is finite. We also know that L~Lp is a context-free language since Lp is finite. But I L(G) | = °° if and only if L - Lp # 0. From the first part of this theorem, we have an algorithm for determining whether L~LP is empty. □
8.3 TURING MACHINES 249 We can also decide whether a deterministic context-free language equals a regular set. Theorem 8.2.3 Let L be a deterministic context-free language and let R be a regular set. It is decidable whether or not L = R. Proof L = R if and only if L £ R (8.2.1) and R 5 L. (8.2.2) That is, if and only if LHR = 0 (8.2.3) and Lr\R = 0. (8.2.4) But L C\R is a context-free language, since R is regular if i? is. On the other hand, L is a deterministic context-free language, since L is, and hence LC\R\% context-free. By Theorem 8.2.1, (8.2.3) and (8.2.4) are solvable. □ PROBLEMS 1 Let L = T(A), where A is a dpda, and let R = T(B), where B is a finite automaton. Show that the following problems are solvable in polynomial time as a function of I ^4 | and | B \. a)L^R b)R£L c) L = R d)Lr\R = Q 2 Let G = (V, X,P,S) be a context-free grammar and let a, (36 V*. Show that it is decidable whether or not a =*ft and whether or not a t*j3. 3 Let G = (V, 2,/\ S) be a context-free grammar. A variable A 6 TV is recursive if * for some «, V6S* and uv=£ A. Give an algorithm which determines whether a given variable is recursive. 4 Show directly (without using grammars) that there is an algorithm for testing whether or not T(A) = (3, where A is a pda. 8.3 TURING MACHINES, THE POST CORRESPONDENCE PROBLEM, AND CONTEXT-FREE LANGUAGES Although it is assumed that Turing machines are familiar to the reader,' we shall give a quick sketch of a proof of the halting problem for Turing machines. Although this is a familiar example of an unsolvable problem, there is another problem called the Post Correspondence Problem which is more natural in the context of language theory because it involves strings of symbols. t Chapter 9 contains some material on Turing machines and computational complexity. It is not needed here.
250 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS For the purposes of the present discussion, imagine a Turing machine working over some fixed alphabet, and let us assume that the input is regarded as a nonnegative integer. Let us assume that we have some mapping g which is a Godel numbering; that is, g maps from the positive integers onto the set of all Turing machines. Moreover, g is a totally computable function. Thus, given n,g(n) denotes the nth Turing machine, which will also be written An. If machine An is run on input*, there are two possible outcomes: Either An(x) eventually halts and gives some output, or it runs forever. Definition The halting problem is the problem of deciding, for any number n, whether An(n) halts or not. Theorem 8.3.1 The halting problem is recursively unsolvable. Proof Suppose it were solvable. Then there would be an algorithm P which decides the halting problem. We could define a new algorithm Q based on P as follows: For a given n, decide whether An(n) halts, using P. If An(n) halts, then go into an infinite loop; else stop. This procedure Q is clearly an algorithm, so that there is a Turing machine Ak which computes Q; that is, Q(x) = Ak(x) for all x. Let us diagonalize and examine Ak(k): Ak(k)ha\ts implies Q(k) loops implies Ak(k) loops. On the other hand, Ak(k) loops implies Q(k) halts implies Ak(k) halts. Therefore, Ak(k) halts if and only if Ak(k) loops. Clearly, this is a contradiction. □ Now we turn our attention to a problem which is very useful in language theory. Definition Let x = (xu... ,xn) and y = (.Vi,. . . ,yn) be two lists of nonnull words from a common alphabet 2. The Post Correspondence Problem (PCP for short) is to determine whether or not there exist i{,.. . ,ik, where 1 < /,- < n such that *'.■■■*»* =yt>'--ytk If there exists a solution to a PCP, then there are infinitely many, since one may iterate a solution any number of times to obtain another solution. Examples x = (a,b2a,a2b), 1. y = (ba,a3,ba). In this case there is no solution, since any solution must start with some index / and X; and^,- must have a common nonnull initial subword. x = (b3,ab2), 2. y = (b2,bab3). Then (1,2, 1) is a solution since b3ab2b3 = b2bab3b2
8.3 TURING MACHINES 251 3. x = (ba,ab2,bab), y = (bab,b2,abb). Any solution must have /, = 1. Then Xi = ba y{ = bob Now the choice for xt must begin with a b. Therefore, i2 = 1 or 3. If i2 = 1, then x{x{ — baba y{yx = babbab which fails. Thus, i2 =£ 1. If h — 3, one gets x{x3 = babab y^3 = bababb By exactly the same reasoning as above, z3 = 3, z4 = 3, etc. But the sequence will continue indefinitely and hence there can be no solution, because the x-word cannot catch up with the .y-word. It is convenient to use an intermediate problem to prove that the correspondence problem is unsolvable. Definition Let x = (j1, ... ,xn) and y = (yu . .. ,yn) be two lists of non- null words from some common alphabet 2. The Modified Post Correspondence Problem (MPCP, for short) is to determine whether or not there exist i\,... ,ik where 1 <t)<n, such that The only difference between the MPCP and the PCP is the condition that the first strings used in the MPCP be xx and yx. Lemma 8.3.1 If the Post Correspondence Problem were solvable, then the Modified Post Correspondence Problem would also be solvable. Proof Let X = (*!,. .. ,Xn) and y = Oi, • • • ,yn) be two lists of words over some alphabet 2. Assume that this is the instance of MPCP that we wish to solve. Now an "equivalent" PCP will be constructed. Let 2' = 2 U {$, $} where <t and $ are new symbols. Define two homomorphisms ipL and ipR by the conditions (pL(a) = <ta <pR(a) = a<t for each a € 2. Define i , i i -. X — [X\, . . . ,Xn+2),
252 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS where x't+i = *Pr(*,-) for each /, 1 < / < n, and xn+2 — 3> Also, define y' = (y\,---,y'n+J, where y\ = <PlOi) y'ui = *h(yt) for each /, 1 < i < n, and yn+2 = *$ The lists x' and y' represent an instance of a PCP. We shall show that the PCP for x' and y' has a solution if and only if the MPCP for x and y has a solution. It is clear that if 1, il}... ,ik is a solution to the MPCP with lists x and y, then 1, z'i + 1,. . . , ik + 1, n + 2 is a solution to the PCP with lists x' and y'. Suppose ii,. .. , ik is a solution to the PCP with lists x' and y'. Then clearly, i{ = 1, since that is the only index for which both terms start with <t. Also ik =n + 2, since that is the only index for which both terms end in the same character. Let / be the least integer such that i, = n + 2. Then, it is not hard to see that ix,...,ij is also a solution. It should now be clear that: 1, *2 — 1 .... , //_! — 1 is a solution to the MPCP for lists x and y. Therefore, if there were an algorithm to solve the PCP, then we could also solve the MPCP. □ Now we are ready to prove the main theorem. Theorem 8.3.2 For | 21 > 2, the Post Correspondence Problem is unsolvable. That is, there is no algorithm which takes x and y as input and determines whether or not an index sequence iu . . ., ik exists such that *f, ■■■*»* =ytl---ytk Proof It suffices to show that if the MPCP were solvable, then the halting problem would be solvable, by Lemma 8.3.1. Given a Turing machine A and an input w, an instance of the MPCP will be constructed that has a solution if and only if A halts on w. Let A — {Q, 2, T, B, 5, q0, {^})be a Turing machine, and assume that A has a unique "halting state," say q. The idea of the construction is that .4's halting computation on w will look like'.t t aqfl denotes an ID of the Turing machine A and indicates that A is in internal state q, the tape is a|3, and A is reading the first character of |3.
8.3 TURING MACHINES q0w h otiqiPi h • • • h OLkqk$k, where qk = q. In that case, a solution of the MPCP will be as follows: #q0w#alqrf1#- ■■#akqkpk # The construction proceeds by giving the pertinent lists in Table 8.1. TABLE 8.1 Lists for the MPCP Number 1 2 3 4 For each q G Q - X # q## # a -{q}, eachq' qa qa cqa q# cq# q# aqb aq# #qb #q0w# # # y a for each a G 2— {B} G Q, each a. q'b bq' q'cb bq# q'cb# q'b# q q# #q ,6,cG2-{5}, if S(q,a) =(q',b,0) if 8(q,a) =(q',b,l) if S{q, a) =(q',b,-\) if5(<?,5) = (q',b, 1) if 5 fa, B) = {q',b,- 1) if 8(q, B) =(<?', b,0) In order to analyze the MPCP, the following definition is helpful. Definition Let x and y be two lists over an alphabet 2 for an MPCP. A pair of strings (u, v), where u, v& 2*, is said to be a partial solution to the MPCP if u is a prefix of v, and u and v are formed by concatenating corresponding strings from x and y, respectively. If v= uu , then u is said to be the remainder of the partial solution (u, v). Claim If q0w h<*i<7i!3i h--- hafe<7fe|3fe, (8.3.1) where none of q0,. . . , qk^ are equal to q, then there is a partial solution (u,v) = {#q0w#alq^l#- ■ • #afe_1^fe_1|3fe-1#,#q0w#alqlPl#■ ■ ■ #akqkPk#). (8.3.2) Moreover, this is the only partial solution whose second coordinate is as long as lg(z>). Proof The argument is an induction on k. Basis. k = 0. Since this is a MPCP, we must choose the first pair and we get (#,#q0w#).
254 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS Induction step. Suppose the claim is true for some k and qk ¥= q. We will show it true for k + 1. The remainder of the partial solution (u, v) is u = ctkqkQk#. We must match this from list x in the next move. Note that there can be at most one pair in Table 8.1 that is consistent with this requirement. Moreover this pair corresponds to a move of the Turing machine. The other symbols of u come from items 3 and 4 in the lists. This provides a new partial solution («",""afe+i<7fe+i!3fe+1#), where this is the configuration A can reach from OLkqk$k. There is no other partial solution whose second component is so long. This completes the induction proof. If qk = q, then pairs from 2, 3, 4, and the last 3 items on the list may be added to the partial solution to give a total solution to the MPCP. On the other hand, if A never reaches q, then there are partial solutions which are not total solutions. Therefore there is a solution to the MPCP if and only if M halts on w. But this is unsolvable. □ Our next task is now to "code" the PCP into some languages. For ease in the construction, we use a three-letter alphabet 2 = {a,b,c}. All of the results may also be proved for a two-letter alphabet by suitably encoding the three-letter alphabet. The encoding to a binary alphabet is left for the problems. First we define three languages which will be used extensively. Let x = (xitx2, ■ ■. ,xn), y = (yi,y2, ■ ■ ■ ,yn), xhyte{a,b}+. Let L(x) = {ba'kb ■ ■ ■ ba^cxix ■■■xik\k>\, 1< /,- < w, Kf<k}. The idea behind L(x) is simple. The integers ilt. .. ,ik are encoded as blocks of a's separated by 6's in reverse order; c acts as a "center" marker and is followed by XhXh ' ' X'k- Proposition 8.3.1 a) Let w be a word in Z,(x); given the part of the word up to c in w, the remainder is uniquely determined. b) It is possible that different index sequences give rise to the same* andy words. That is, it is possible that ba'k . . . faix =£ fair ■ ■ ■ fai, and yet •*■', '' x'k ~~ xii '' xir' Example x = {a,b, b2)- Then ba2ba2 ¥= ba3 but x2x2 = b1 = x3. Next let L(x,y) = L(x){c}(L(y)f = {ba'k ■ ■ ba'^cx^ ■ ■ -x^cyj^ ■ ■ -yjca^b ■ ■ ■ aJ'*b\ where M> 1, 1 <W« <n, 1 <p<k, 1 <q <il},
8.3 TURING MACHINES 255 and Ls = {vt^cwzcw^cwfl wu w2G {a,b}+}. Theorem 8.3.3 L(x,y) and Ls are deterministic context-free languages. Proof We give an informal description of dpda's which accept L(x,y) and Ls, and leave as an exercise the formal proof that the constructions are correct. The construction for L(x, y) = L(x){c}(L(y))T now follows". 1 Copy everything up to the first c onto the stack. 2 Erase the top block of a's from the stack, counting them. This is possible since the number of a's in a valid block is<«. If more than n a's are found, the string is rejected. 3 Now advance on the input, checking to see whether the input is*,-., where /} is the number determined in Step 2. Again this is possible since there are only a finite number of x;.. 4 If input symbol is not a c, go back to Step 2. 5 Copy everything up to the next c onto the stack. 6 Count the number of a's in the current block of a's, from the input. 7 Now check to see whether the top of the stack is yf., where i} is the number of a's counted in Step 6. 8 If the stack is 20, then accept by erasing Z0 and going to a final state under A; else return to Step 6 and continue. If the input is not exhausted, go back to Step 6. Now the technique for Ls = {wycw^cw^cwjl wl,w2E {a, b}+}is given: 1 Copy everything up to the second c onto the stack. 2 Compare the rest of the input with the stack. □ The next lemma plays a key role in subsequent applications. Lemma 8.3.2 L(x, y) n Ls contains no infinite context-free language. Proof L(x,y)nLs = {tiCtictfctf] t{ = ba'k ■ ■ ba^;t2 =xh ■ ■ -xik =y^ ■ ■ -yik}. Fact tx uniquely determines tlct1ct^ct{. Proof tx uniquely determines t2 by the encoding of L(x, y). Now suppose L is an infinite context-free language and Z,^i(x,y)n/,s. Then, by the iteration theorem, there exists a constant p such that the conclusions of that theorem hold. Let w = txctici{ct\&L
256 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS with \g(t2)>p. This is possible since, for L to be infinite, it must contain arbitrarily long words and if tx = ba'p ■ ■ -ba'', then \g(t2) > p, since xt ¥= A, 1 < / < n. Now let all the positions in t2 be distinguished as shown below: w = txcti.ct\ci['. Then, by the iteration theorem, there exists a factorization <P = (Vl,V2,V3,V4,VS) such that ViV^v3v%Vs&L forall^^O, and either KuK2, K3±0 orK3,K4,Ks^0. AssumeKuK2,K3 =£0. (The case A:3,K4, Ks ¥= 0 is similar to this one.) Then Vi = ^cfli for some v\, v2 * A, and vxv3vs&L. Thus ^i^3^s = fiCTl^^ei. But, by the fact proved above, If tiCv\v3vseL, then / „ „ .t /r Therefore, ViV2V3V4Vs = W = V{V3VS This is a contradiction, since v2¥=A, and therefore Z, cannot be both infinite and context-free. □ PROBLEMS 1 Extend the theorems of this section to the case of a two-letter alphabet. Encode Z,(x, y) and Ls into a binary alphabet. 2 Show that the Post Correspondence Problem is solvable if | 2 | = 1. Let |S | > 2 and let x = (x{, . . . ,xn) and y = (y\,...,yn) be two lists of nonnull S-words. 3 Is the Post Correspondence Problem solvable a) if n = 1? b) if/? = 2? c) if n > 1 and for each i, 1 </</?, lg(x;) = lg(.y,-)? 4 Show that there exists some positive integer n0 such that the Post Correspondence Problem with n = n0 is unsolvable.
8.3 TURING MACHINES 257 5 Is the following problem unsolvable or not? Given | 21 > 2 and let x = (*!, . . . ,xn) and y = (yu . . . ,yn) be lists of nonnull 2-words; do there exist k~^ 1, %> 1, ilt . . . , ik, /i,. . . ,/s, such that Kip, jq <n for 1 <p < A:, 1 < <? < 8, and *!,■■■*«■* = >vV 6 Is the following problem unsolvable or not? Given | 21 > 2 and let x = (xx,. . . , xn) and y = (yu . . . ,yn) be two lists of nonnull 2-words; do there exist k~> 1, iu . . . ,ik, ju . . . ,/fe, such that 1 </p, jq <« for 1 <p, q^k, and **.■■■*«* = >v^k? 7 Let /I = (0, 2, r, B, 5, q0, {/}) be a Turing machine, given that B is the blank symbol and is in 2; / is the unique "stopping state." Assume Q C\ 2 = 0. A computation oi A is a sequence «o h «i h ■ • ■ han, where a; € 2*G2+ and a; f-a,-+1, as specified by 6. Let c £ 2 and define Z,, = {ac/3Tc| a k ^}*c and i2 = {<?0wc| wG2*}{aTc^c| a fI0}*{W»Tl x,>> G 2*}{cc}. Show that Lx and Z,2 are deterministic context-free languages. Are they real-time as well? 8 Show that there is an algorithm which takes as input a Turing machine A and produces a context-free grammar GA such that L(GA) = 2* — {oilca2c • • • ca%kcc\ ax = q0w for some w G 2*, a2fc = xfy and a! h a2 h • " h a2fe}. 9 Show that there is a context-free language L such that the set of prefixes of 2* — L is not recursive but is recursively enumerable. 10 Fix 2 = {<7,6}. Construct a family {G„| n > 0}of context-free grammars such that: i) For each n > 0, Gn may be effectively constructed from n; ii) A GL(Gn) for each/?; iii) |2*-L(G„)| <l; iv) {n| L(Gn)¥= 2*}is not recursive. ///«?: Take a Turing machine /I which accepts a nonrecursive set over 2*. Encode (non)computations such that there is a set J(A,n) which lacks one member from {a, b}* if and only if A accepts/?. 11 Show that every recursively enumerable set L may be represented as Z, 1Z-2x for some context-free languages Lx and L2. Hint: Represent L = T(A) for some Turing machine A = (Q, 2, T, B, 8, q0, {/}). 12 Refer to the proof of Theorem 8.3.2. Show that A halts on w G 2* if and only if there exists k > 0, 3 < /[, .. . ,ik<n such that "■i*i, ■ • -xikx2 = yiyix-- ■J',-fc>'2
258 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS 13 Show that, for any recursive function f and for arbitrarily large integers n, there is a context-free grammar G so that \G\ = n, L(G) is a regular set, and the reduced finite automaton acceptingL(G) has at least f(n) states. [Hint. Cf. Problem 8.] 8.4 BASIC UNSOLVABLE PROBLEMS FOR CONTEXT-FREE GRAMMARS AND LANGUAGES In Section 8.3, we worked out the properties of two deterministic context- free languages. With the aid of these sets, we shall very easily develop the basic negative results about the context-free languages and grammars. In what follows, the results are claimed for all alphabets with at least two symbols. In fact, the proofs are given only for alphabets with at least three symbols. In Problem 1, the reader is asked to extend the result to a binary alphabet. We begin our development by noting the form of strings in i(x,y)nZ,s. Lemma 8.4.1 It is recursively unsolvable to determine for arbitrary lists of words x and y whether L(x,y)nLs * 0. Proof L(x,y)nLs = {txct2ct\ctTx\ tx = ba'k ■ ••bai>; t2 = *,-_ ■ • *;fe = yi% ■ ■ -yik) Therefore, L(x,y)C\Ls =£0 if and only if there exists a solution to the Post Correspondence Problem. But the Post Correspondence Problem is recursively unsolvable, which proves the lemma. □ This establishes the following result. Theorem 8.4.1 Let | 21 > 2. It is recursively unsolvable to determine whether L\ n L2 = 0 for (deterministic) context-free languages Lx and L2. Proof Lemma 8.4.1 gives the result for | S| > 3 by taking Ll = L(x, y) and L2 = Ls. The case | 21 > 2 is left to Exercise 1. □ Next we learn that we cannot even decide whether Lt nz,2 is context-free. Theorem 8.4.2 Let | 21 > 2. It is recursively unsolvable to determine whether LXC\L2 is a context-free language for arbitrary context-free languages Lx and L2. Proof L(x, y)nis # 0 if and only if there exists a solution to the Post Correspondence Problem. By an earlier observation, we know that if there is one solution to the Post Correspondence Problem, there are infinitely many. But we have shown that L(x, y) n Ls contains no infinite context-free language. Therefore Z,(x,y)Ois is a context-free language if and only if it is empty. But by Lemma 8.4.1 this is undecidable. □ Now, we wish to show that one cannot decide whether a context-free language is regular. To do this, we need a simple lemma.
8.4 BASIC UNSOLVABLE PROBLEMS 259 Lemma 8.4.2 Let | 21 > 2. Let M(x, y) = 2* - (L(x, y) n Z,5). Vkf(x, y) is a context-free language and it is recursively unsolvable to determine whether or not M(x,y) = 2*. Proof M(x, y) = 2* - (Z.(x, y) n /.,) = (2*-L(x,y))U(2*-Ls) = £(x,y)U£,. We know that Z,(x, y) and Z,s are deterministic context-free languages and so are T(x,y) and Ls, by Theorem 5.6.1. Thus M(x,y) is a union of two deterministic context-free languages and hence is a context-free language. But M(x,y) = 2* if and only if L{x,y)(Ms = 0. Now, using the fact that Z,(x, y) n Ls = 0 is undecidable, we have that M(x, y) = 2* is undecidable. □ Theorem 8.4.3 Let jW(x,y) be as in Lemma 8.4.2 above. M{x,y) is regular if and only if M(x,y) = 2*. Thus, it is recursively unsolvable to determine whether an arbitrary context-free language over an alphabet 2,121 > 3, is regular. Proof Again we use the fact that £(x,y)nz,s is either empty or infinite. Since L(x,y)C\ Ls contains no infinite context-free languages, it contains no infinite regular sets. Recall that M(x,y) = r-(£(x,y)n£s) and Z*-M(x,y) = L(x,y)nLs. We claim that M(x, y) is regular if and only if M(x, y) = 2*. The //direction is trivial. If M(x, y) is regular, then 2* —M(x, y) = L(x, y) n Ls is regular. Since L(x, y) n Ls cannot be infinite, we must have £(x,y)n£s = 0, which implies M(x,y)= 2*. By Lemma 8.4.2, one cannot decide whetherM(x, y) = 2*. □ Corollary For |2|>2, it is recursively unsolvable whether or not an arbitrary context-free language is regular. Proof Cf. Problem 1 at the end of this section. □ Now we shall consider the containment and equality problems for context- free languages. Theorem 8.4.4 Let | 21 > 2. It is recursively unsolvable to determine whether, for arbitrary context-free languages Lx and L2, L\ — Z,2.
260 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS Proof Recall that M(x,y) = r-(I(x,y)ny. Let L1 = 2* L2 = S*-(i(x,y)ny = M(x,y). Both Lx and Z,2 are context-free languages, and L^L2 ifandonlyifAf(x,y) = 2*. But it is recursively unsolvable to determine whether M(x, y) = 2*. Take Lt=M(x,y) and L2= 2*. The unsolvability of the equality problem follows from Lemma 8.4.2. □ Next we turn to a potentially useful idea. Could we find an algorithm to decide whether a context-free grammar is ambiguous? Theorem 8.4.5 Let | 21 > 4. It is recursively unsolvable to determine whether an arbitrary context-free grammar over 2 is ambiguous or not. Proof Let X = (Xi, . . . ,Xn), y = (yu...,y„) be two lists of nonnull words over {a, b}*. Consider G where S -> Si\S2 Si -> X;Sidc' I Xidc' for 1 < /' < n, S2 -> ytS2dc'' | ytdc'' for 1 < / < n. Clearly, Z,(G)= {jcji---jc1-mdc''"' ■■■dci'\m>l,xje{a,b}*, 1</,-<«, for all 1 <f<m} u fa, • • ->V*C'P '' ■dcJ'\P> l.^fc e{fl,6}*. 1 </b<«. for all 1 <C<p}. Moreover, the context-free languages generated by G,- = (K, 2,/*, 5,) are both unambiguous. Thus, G is ambiguous if and only if {*,-_ • • x^dc'™ ■ ••dc^}n{yJi ■ ■ -yjpdc'p ■ ■ -de1'} * 0. Thus, G is ambiguous if and only if m—p\ (z'i,. . . ,zm) = Q\,. . . ,jm), and X; ■ ■ ■ xt —y: ■ ■ ■ y-t , that is, if and only if the Post Correspondence Problem has a solution; but the Post Correspondence Problem is recursively unsolvable. Therefore it is recursively unsolvable to determine whether an arbitrary context-free grammar is ambiguous. D Lastly, we consider inherent ambiguity. Theorem 8.4.6 There is no algorithm to decide whether a context-free language is inherently ambiguous.
8.4 BASIC UNSOLVABLE PROBLEMS 261 Proof Let x, y be nonnull lists of words over {a, b }*. Define L = {a'Vc'l i,J>l}{d}L(x,y)U{aibici\ i,j>\}{d}Ls. L is surely a context-free language. Moreover, each term is an unambiguous context- free language. Claim L is inherently ambiguous if and only if i(x,y)ni, # 0. Proof of the claim If L is inherently ambiguous, then there must be some string in each term and surely L(x,y)nZ,s#(J. If L(x, y) HLS =£0, then let z€/,(x,y)nis. Define R — a*b*c*dz, which is a regular set. Suppose that L were unambiguous. Then Lr\R = L2 = {a'Vc'l i,j>\}{dz}U{atb'c'\ i,j>\}{dz} = (L0\JL{)dz would be unambiguous, by Theorem 6.4.1. But, by Theorem 7.2.2, ijUi, is inherently ambiguous. Applying Theorem 7.4.3, (L0UL{){dz} = L2 is also inherently ambiguous. But this contradicts that L2 is unambiguous. It now follows that we can decide whether L is inherently ambiguous if and only if L(x, y)CiLs¥z0. But this is undecidable, by Lemma 8.4.1. □ There are additional questions about context-free languages which are recursively unsolvable but which are not included here. The above theorems should be sufficient to provide the reader with a "feel" for the types of questions which are undecidable and the methods used in proving that a given question is undecidable. PROBLEMS 1 Extend all the results of this section to a binary alphabet. 2 Show that, given a context-free grammar G, there is no effective way to construct G' such that L(G') = L(G)-{A) and \G'\ = min{\Gl\: Z,(G,) = Z,(G)-{A}, G, is A-free}. 3 Show that it is undecidable to determine, of two context-free grammars, whether or not they have the same set of sentential forms. 4 There is an interesting class of decision problems called "birdie" problems." We already know that one cannot decide whether a context-free language is regular. Suppose a birdie tells us that Z-(G) is regular. Show that one cannot construct a finite automaton^ so that L(G) = T(A). [Hint: Cf. Problem 11 of Section 8.3.] 5 Is it decidable whether or not L 9 R for a context-free language L and a regular set R1
262 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS 6 A context-free language L is prefix-free if, for any u,i)6S*,we have that uvGL and « G Z, imply V = A. Show that it is undecidable whether or not L is prefix- free. [Hint: Reduce to Lx C\ L2 — 0.] 7 Show that there is an algorithm to decide whether an arbitrary pda A is finite-turn or not (cf. Section 5.7). Show that if A is finite-turn, it is solvable to find the least integer k0 such that each sweep of A has (2k — 1) turns for some k <k0. 8.5* A POWERFUL METATHEOREM In the previous section, certain problems were proved to be unsolvable. Given a context-free grammar, we cannot decide whether it is unambiguous or regular, etc. We will consider many other questions such as, "is L(G) a deterministic context- free language," or "unambiguous," or "precedence." These problems are often undecidable, and the proofs are sufficiently similar that one suspects that the subject can be unified so that proving one theorem will distill the essence of the argument. This section is devoted to such a theorem. We will need some properties of the "bounded sets." These sets were first studied because it was noticed that many of our examples were of the form of subsets of a*b*c*. It is useful to generalize the concept as follows: Definition A set L 9 2* is bounded if there exist k> 0, wh . . . , wfe G 2*, such that: L £ w* ■ ■ ■ w*.. A set that is not bounded is called unbounded. For example, LoULi = {a'b'ck\ i = / or / = k, i,f,k>\} is bounded. Also, {(aba)lc(aba)l\ i>0) is bounded but {a,b}* is unbounded. We need the following lemma about unbounded regular sets and finite automata. Lemma 8.5.1 Let a, b G 2 and let L 9 2*, where L = T(A) for some reduced finite automaton A = (Q, 2, 5, q0, F). Either: 1 L is bounded, or 2 There exist a ¥=b in 2,x1,x2,x3,x4in 2*, and^ EQ such that 5(<7o,*i) = Q, (8-5.1) 8(q,ax2)=q, (8.8.2) S(q,bx3) = q, (8.8.3) and 8(q,x4) G F. (8.5.4)
8.5» A POWERFUL METATHEOREM 263 Proof Let \Q\ = k and we induct on k. Basis, k = 1. T(A) = 0 or T(A) = 2*. If L = T(A) = 0, then L is bounded. If T(A) = 2*, then q0&F and .x^ = x2 = x3 = x4 = A satisfy Condition 2 of the lemma. Induction step. Assume the result for all finite automata with k or fewer states and assume \Q\ — k + 1. Suppose (2) is false. That is, for all xlt.. . ,x4E 2*, the conjunction of (8.5.1), (8.5.2), (8.5.3), and (8.5.4) is false. Now choose x{ = A, which forces qo = q- Choose any x4 such that (8.5.4) is true (why does such anx4 exist?). This means that at most one of (8.5.2) and (8.5.3) can be true; i.e., there exists a € 2,,x = x2,such that S(q,ax2) = 8(q0,ax) = q0. Moreover, by choosing x to be minimal among the set of all nonnull y, so that 8(q0,y) = q0, the string ax is unique. For each c € 2, define Ac = (Q-fao},2,6',5(?o,c),F-te0}), where 5' = sn((Q-{?0})xSx(s-y)). If 6(<7o>c) = <7o> define Ac to be a one-state finite automaton which accepts 0. Now {(ax)* U cT(Ac) Xqo$F, I c fc 2 r04) = ((«)* c U£ c7T>U) u («)* ^ <lo e F- By the induction hypothesis, T(AC) is a bounded set. But the bounded sets are closed under finite union and products, so that T(A) is bounded. □ The previous lemma is employed to prove the following useful theorem. Theorem 8.5.1 Let L be any regular set over 2 = {a, b}. The following three conditions are equivalent. a) L is unbounded. b) There exist u, v,x,y in 2* such that x{au,bv}*y 9 L. c) There exist u, v,x,y in 2* such that x{ua,vb}*y 9 L. Proof Parts (b) and (c) are clearly equivalent, since L is (un)bounded if and only if LT is (un)bounded. (Cf. Problem 1 at the end of this section.) Next, we show that (a) implies (b). Since L is regular and unbounded, it follows, from Lemma 8.5.1, that there exist x = xu u = x2, v = x3, andy = x4, so that
264 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS (8.5.1) through (8.5.4) of the lemma are satisfied. Pasting together the computation of A in Lemma 8.5.1 gives that x{au,bv}*y£ L. To show that (b) implies (a), we proceed by a sequence of claims. Claim 1 a) If L contains an unbounded set, then L is unbounded. b) If L is unbounded, then so is {x}L, where x € 2*. c) If Z, is unbounded, then so is {x}L{y} for x,y £ Z*. Proof (a) follows from Problem 8.5.1 by contraposition, (b) follows by induction on lg(x). (c) follows from (b) and the closure of the bounded sets under reversal (Problem 1 again). Claim 2 In order to show that (b) implies (a), it suffices to show that {au,bv}* is unbounded. Proof Suppose that {au, bv}* were unbounded. By Claim 1(c), x{au,bv}*y is unbounded. But, by (b), if x{au, bv}*y 9 L, then Claim 1(a) implies that L is unbounded and we would be done. Suppose now {au,bv}* 9 w*- ■ ■ Wfe for some w1;. .., wk G 2*. Let £; = lg(w,-) and let £ be the least common multiple of £1;. . . , £fe. Consider the string daufibvff = wj> • • • wjfe (8.5.5) and let us call a string {auf{bvf a block. Claim 3 Each block of (8.5.5) must contain (as a subword) W; and Wj, where Proof If not, (aufibvf = wp for some py; but lg(W;)| £ lg(fl!i) and lg(w,-)| £ Igfbv). Thus Fig. 8.1 describes the factorizations. But it is clear that the first character of ws must be an a (cf. the left arrow in Fig. 8.1). On the other hand, the first character of w; must be a b, by the second arrow in Fig. 8.1. This contradiction establishes Claim 3.
8.5* A POWERFUL METATHEOREM 265 * I 1 a i u i Wj au au . . . Wj b i v i Wj . . . bv . . . Wj FIG. 8.1 To finish the proof, Claim 3 establishes the need for two distinct words in each block. So we have k blocks, and we must assign wx- ■ -wk to the word {{auf{bvff, using two different wt in each block. The best way we could do this is shown below. Block 1 Block 2 Block k -1 Block k Assignment 1,2 2,3 k — \,k k, k + \ But we need k + 1 words w; and we have only k of them. This contradiction establishes the result. □ In order to state our main result, it will be necessary to introduce predicates on languages. Definition Let X be any set. A predicate & is any function from X into {true, false). & is nontrivial if there exist xux2&X so that ^"(x,) = true and &(x2) = false. Now some special predicates are introduced on a family of languages over the alphabet {a, b). Definition Let 2 = {a,b} and let & be a predicate on a class of languages over 2*. Let u, v,x,y € 2*. Define the one-to-one homomorphism <p where ifi(a) = au ip(6) = bv Then define ^f(^",x,u,v,y) = {L'\ L' = ip'i(W-l(x-1Ly-l)),we{au,bv}+, &>(L) = true). Define the one-to-one homomorphism tp by \l/(a) = ua \l/(b) = vb Then &(&,x,u,v,y) = {L'\ L' = \p'l(x'lLy'l)w'l,we {ua,vb}+, 3»{L) = true). Note that £f{JP,x,u,v,y) and ■$?(&,x,u,v,y) are in fact families of languages over {a, b).
266 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS Example Suppose one has a predicate on languages, which is always false except for the language L = abb{aba,b4,aa}*ab. Here* = abb,u = ba,v = b, andy = ab. Then x~lLy-x = {aba,b\aa}*. Then M = {L'\ L' = w'\x'1 Ly'l),wG {aba,bb}+} = {{aba,b\aa}*,bb{aba,b\aa}*}. Putting the pieces together yields &(&,x,u,v,y) = {{a,bb}*,b{a,bb}*}. Our next technical lemma contains the essence of the proof of the main result. Lemma 8.5.2 Let & be any predicate on the family of context-free languages over 2 = {a, b). If there exists a context-free language L0 and strings x, u, v, y in {a, b}* such that either: a) £>(L0) = true, b) x{au, bv}*y £ L0, and c) J?f(JP,x,u, v,y) is a proper subfamily of the context-free languages or overE*; d) ^{L0) = true, e) x{ua, vb}*y 9 L0, and f) &(&,x,u,v,y) is a proper subfamily of the context-free languages over 2*; then, for any arbitrary context-free grammar G, the predicate ^(L(G)) is undecidable. Proof Let & be any predicate over the context-free languages that satisfies (a), (b), and (c). (There is a dual proof for (d), (e), and (f), which will be omitted.) From (c), there is a context-free language L' £ 2* that is not in £f(& ,x,u, v,y). Define two one-to-one homomorphisms yx and <p3 by: ifiia = au yxb - bv <p3a — auau <p3b = aubv Let G be an arbitrary context-free grammar over 2: Lx = x*p3(L(G))bvau{au, bv}*y L2 = x {auau, aubv}* bvauiPx(L')y L3 - L0r\(E*—{x {auau, aubv}* bvau{au,bv}*y})
8.5* A POWERFUL METATHEOREM 267 Clearly, Lu L2, and L3 are context-free languages. Then i1UZ,2U L3is a context-free language and let Gj be a context-free grammar such that Ud) = iiU/,2Ui3. (8.5.6) Since G is given, Gx can be effectively constructed.t Claim ^(UGJ) = true if and only if L{G) = 2*. Proof of claim Suppose L(G) = {a, b}*. Then if3(L(G)) = {auau, a«6p}* and LiVL3 = L0. To see (8.57), note that, if z G L3, then z G Z,0. If z G Z,1; then, since Z,j 9 x{aw, 6^}*^ £ Z-o by (b) of the lemma, so zEL0. Thus LiC\L3£ L0. Conversely, assume zGZ,0. If z is the form in Lx, then we are done; otherwise it is in L0 n (2* — Z,j) = Z,3. This proves (8.5.7). From (8.5.7) and the fact that L2 £ L0, we have that LiG^ = Z,,UZ,2U/,3 = /,0U/,2 = /,„, and so &(L(Gi)) = true. Conversely, assume that L(G) $ {a,b}*. Then there exists ze{a,b}*-L(G). (8.5.8) Then tfi3(z)G{auau,aubv}* —<p3(L(G)). Let w = tfi3(z)bvau. Clearly, JcwE*nZ,! = 0, (8.5.9) because if xws€.Lu then, since w = y3{z)bvau,we would have zGZ,(G), which contradicts (8.5.8). Define r = {r| xwry g/.3}. We claim that TC\{au,bv}* = 0, (8.5.10) because f£Tn {aw, 6p}* implies that xw/y ex{auflu,au6^}*6^au{flu,ftz;}*j; = M, so that xwfy<£z,3 = L0r\(Z*-M), which contradicts the definition of T. t That is, there is some recursive function that produces Gj from G. However, we don't know which recursive function does so unless we have an effective presentation ofZ,'.
268 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS We claim that w'\x''lL(Gl)y-'1) = ^(Z,')ur. (8.5.11) To see this, suppose t € w~i(x~1L(Gl)y~i). Then xwtyEKG^. (8.5.12) If xwty E L3, then t&T and we are done. Assume that xwty£.L3. By (8.5.6) and (8 5 12^ v ' ' '' xwtyeLiULi. (8.5.13) But xwty £LX because of (8.5.9). Thus (8.5.13) leads to xwtyGL2. (8.5.14) From the definition of L2 and (8.5.14), we have te^(L'). The argument can be reversed and this completes the proof of (8.5.11). Applying, <£i' to (8.5.11) yields: *71(w"VI/.(Gi)J'~1)) = ri'iML') U T). (8.5.15) Since ipj is one-to-one, <PiVi(£') = £', and moreover ^(^ = 0, since rG^^T) would imply <p{t e{au,bv}* n T, which contradicts (8.5.10). Thus (8.5.15) becomes ^1(w-1(x-1L(G1)>'"1)) = L'. If ^(^(G!)) = ftw, then L' would be in &{& ,x,u, v,y), which would be a contradiction. Therefore, ^(L(Gi)) = false, and the proof of the claim is complete. The predicate &(L{Gd) is undecidable because it is undecidable whether or not L{G)— {a, b}*, by Lemma 8.4.2. But given G, one can effectively obtain Gx, so &(L(G)) is undecidable. □ Now, we are able to state the main result of this section. Theorem 8.5.2 Let y be any subfamily of the finitely inherently ambiguous context-free languages over 2 = {a, b} such that there is some language L0 in Sf which has an unbounded regular subset. For an arbitrary context-free grammar, the predicate L{G) e ¥ is undecidable. Proof By Theorem 7.4.3, the finitely inherently ambiguous context-free languages are closed under inverse homomorphisms and quotients with singleton sets on both the left and the right. Define the predicate & by (true if Ley, P{L) = [false UL^y
8.5* A POWERFUL METATHEOREM 269 Then, for a\\x,u, v,y €2*, Sf(& ,x,u,v,y)is a subset of the context-free languages of finite degree of inherent ambiguity. There exist context-free languages L' which are infinitely inherently ambiguous, by Theorem 7.3.3; and so (c) of Lemma 8.5.2 is satisfied. Moreover, by assumption, there exists L0E J/'with an unbounded regular set. By Theorem 8.5.1, (b) is satisfied. Since L0E y, ^(Z,0) = true and (a) is satisfied. Since the hypotheses of Lemma 8.5.2 are satisfied, the conclusion holds. □ We list a few corollaries of this theorem. In the future, we shall make extensive use of the result. Note that the simplest example of an unbounded regular subset is {a,b}*. Theorem 8.5.3 For the following families Y of context-free languages over 2 = {a, b} and for arbitrary context-free grammars G, it is undecidable whether or not L(G) e y. 1 the finitely inherently ambiguous context-free languages', 2 the finitely directly inherently ambiguous context-free languages; 3 for any k > 1, the context-free languages of degree of inherent ambiguity <k; 4 the unambiguous context-free languages; 5 the regular sets; 6 the deterministic context-free languages; 7 the transpose of any of the above classes. PROBLEMS 1 Show that the collection of bounded sets over 2 is closed under product, transpose, and union. Show that a subset of bounded sets is bounded. 2 Show that if | 21 > 2, then 2* is not bounded. 3 Verify that Theorem 8.5.2 holds if the languages are specified by pushdown automata rather than by context-free grammars. 4 Show that, if L0 is any fixed context-free language which contains an unbounded regular subset, then it is undecidable whether or not L(G) = L0. 5 Show that, for any arbitrary context-free grammar G, one can decide whether or not L(G) = {wcwT\ we{a,b}*}. 6 Show that one cannot decide whether a given context-free language can be parsed in linear time. 7 Show that R is a bounded regular set if and only if R belongs to the smallest family of sets which contains all regular subsets of w*, w G 2*, and which is closed under finite union and finite product. 8 A set L is commutative if, for all wordsx,y EL,xy = yx. Show that a set is commutative if and only if /. 9 w*.
270 DECISION PROBLEMS FOR CONTEXT-FREE GRAMMARS 9 Let u, DGE*. Show that uv=fcvu implies {u, v}* is unbounded. This gives another proof of Problem 2. 10 Show that every context-free language L is commutative or else L* contains an unbounded regular set. 11 Show that it is undecidable whether or not. L(G) is linear, or metalinear, or ultralinear. 8.6 HISTORICAL SURVEY The basic decision questions in this chapter date back to Bar-Hillel, Perles, and Shamir [1961]. The unsolvability of the halting problem is due to Turing [1936]. The Post Correspondence Problem is from Post [1946]. Our proof follows Hopcroft and Ullman [1969]. Exercises 7,8, and 9 of Section 8.3 are based on Hartmanis [1967a] .problems 10 and 11 are from Ullian [1967]. Problem 2 of Section 8.4 is from Gruska [1975], while Problem 3 was proved by Blattner [1973], Salomaa [1973b], and Rozenberg [1972]. "Birdie problems" like Problem 4 of Section 8.4 were systematically studied by Ullian [1967]. Problem 6 of Section 8.4 is due to Englefriet [1973], and Problem 7 is due to Ginsburg and Spanier [1966]. The main results of Section 8.5 are derived from the work of Hunt and Rozenkrantz [1978]. Bounded languages were introduced by Ginsburg and Spanier [1964]. Lemma 8.5.1 is due to Hopcroft [1969]. Problem 4 is also from Hopcroft [1969].
nine Context-Sensitive and Phrase-Structure Languages 9.1 INTRODUCTION It is possible to characterize both the context-sensitive languages and the phrase-structure languages in terms of Turing machines. For the context-sensitive case, one needs to restrict attention to nondeterministic, linearly space-bounded devices. This focuses attention on both time- and space-bounded computations of nondeterministic as well as deterministic Turing machines. While the general topic of machine complexity is outside the scope of this book, some of the open questions in language theory are related to questions in machine complexity theory. The serious student of language theory is well advised to learn this subject in more depth than can be provided here. In Section 9.2, we begin by considering a number of alternative forms of phrase-structure grammars. All of these are useful in one application or another. Section 9.3 is devoted to a study of the basic properties of the context-sensitive languages. For instance, every context-sensitive language is recursive but there are recursive sets that are not context-sensitive. Section 9.4 is devoted to time-and tape- bounded Turing machines. A few of the basic theorems of machine-based complexity theory are proved there, such as Savitch's Theorem (9.4.3), which relates space- bounded nondeterministic and deterministic computations. In Section 9.5, machine characterizations of the context-sensitive and phrase-structure languages are given. 9.2 VARIOUS TYPES OF PHRASE-STRUCTURE GRAMMARS In Sections 1.4 and 1.5, a number of types of grammars were introduced for both the context-sensitive languages and the phrase-structure languages. We shall 271
272 CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES now introduce one more grammatical definition. This family of grammars also generates the phrase-structure languages. Definition A nonterminal rewriting grammar (or NR grammar) is a 4-tuple G = (V, 2, P, S), where V, 2, and S are as in a phrase-structure grammar. P is a finite set of productions of the form *0^1*l ' ' ' xn-\AnXn ~*x0&\x\ ' ' 'xn-lfinxn with«>0,;c,G 2*,/3,-G F*,and4,. G7V. Note that, in a nonterminal rewriting grammar, the terminals are not displaced once they are generated. If we abbreviate each class of grammars by its initials, Fig. 9.1 displays the grammar class containments. PSG s' \ ps / \ TO TOG NRG \ NR MG CSE CSEG / M / CS CSG FIG. 9.1 We shall now quickly show that the following classes of languages are coextensive. PSL = TOL = NRL = CSEL and CSL = ML. To show how these families interrelate, a sequence of easy results will be proven. Lemma 9.2.1 For each phrase-structure (respectively, context-sensitive with erasing, context-sensitive, monotonic,NR) grammar G= (V, 1,,P, S), there is a phrase- structure (respectively, context-sensitive with erasing, context-sensitive, monotonic, NR) grammar G' = (V', 2,P', S) such that UP') = UG) and each rule in P' is of the form a ->■ (3 withaG7V+,/3G7V*,or A -* a with A G TV and a G 2. = phrase structure = type 0 = nonterminal rewriting = context-sensitive with erasing = monotonic = context-sensitive
9.2 VARIOUS TYPES OF PHRASE-STRUCTURE GRAMMARS 273 Proof Let G = (V, 2, P, S) be a grammar of the desired type. Define a map tp: v^ V' =VV{Aa\aE 2}, such that: <*?(4) = A if A G7V, *(«) = Aa if a G 2, and extend «p to be a homomorphism. Define G' = (F', 1,,P', S), where i>' = {<p(a)^<p(0)\a^0ishiP}U{Aa^a\ae2}. It is a straightforward matter to verify that L(G') = Z,(G) and G' is of the same type asG. D Corollary 1 The class of phrase-structure languages equals the class of type 0 languages. Proof G' is a type 0 grammar. □ In order to deal with the other classes, we need a reduction lemma. Definition For any phrase-structure grammar G = (V, 2, P, S), let the weight of a production -n = a -+ (} be lg(/3) and let the weight of G be max{lg(/3) \a -+ P is in i>}. Lemma 9.2.2 For each type 0 (monotonic) grammar G = (V, 2,P,S), there is a type 0 (monotonic) grammar G' = (V',2,P',S) of weight at most 2, so that L(G') = £(G). Furthermore, if G is of type 0, then the only length-decreasing rules in G' have the form A-*A for A G F' — 2; if G is monotonic, then every rule in G' has the form a -► (3 for a GN+ and 0 G V*. Proof First we must modify G so that it has no length-decreasing rules except A-rules of the form A-*A. This step is not necessary for monotonic grammars. Define a new grammar G' = (V\ 2, P', S), where V' = F U {i?} and i>' is defined as follows. 1 If a -»(3 is in i> and lg (a) < lg (0), then a -> (3 is in P'. 2 UAi ■■■Am^B1 ■■■Bn is in P with Ai,..., Am &N, Bi,. . .Bn e V, and« <w,then A, ■■■Am^B1 ■■•BnRm-n is in P. 3 P' contains i?-» A. Note that a rule 45C-» A in G is transformed to ABC^-RRR in G'. Clearly, L(G') = £(G), G' is type 0, and G' has context-free A-rules but no other length-decreasing rules. Now assume that we have a grammar G = (V, 2, P, S) which is monotonic or is the result of the previous construction. In the former case, we may assume (by Lemma 9.2.1) that every rule in P has the form a -> (3 where aEN+,0G V*.
274 CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES Let 7r be a production in P. If the weight of -n is less than 3, then -n is in P'. If the weight of -n is at least 3, then * = Ai ■■■Am^B1 ■■■Bn where m<n and n>3. Perform the following steps. 1 If m — n, then go to step 2. Let P' contain: Xn_! -*■ Bm • • • Bn This rule provides either rules whose left- and righthand sides have the same length, or rules of reduced weight. 2 For rules a = Cx ■ ■ • Cm -»£>! • ■ ■ Dm, where m>3 and eachD,-G V — 2, P' contains: C,C2 -> DyYoA Ya,iCi+2 -yDi+1 YaJ+1 for 1 <i<m-2 All of these rules are of weight 2. Thus steps 1 and 2 start with a production of weight n and produce only productions of lesser weight. Repeat the construction until all the rules associated with the original production have weight 2. The argument is repeated for all productions whose weight is at least 3. □ Now we may apply this result to establish an equivalence between context- sensitive and monotonic languages. Theorem 9.2.1 For each type 0 (monotonic) grammar G = (V, 2, P, S), there is a context-sensitive grammar with erasing (context-sensitive grammar) G' = (V, 2, P', S) such that L(G') = L(G). Proof By Lemma 9.2.2, assume without loss of generality that the weight of G is at most 2, that G contains no length-decreasing rules except those of the form A -> A, and that every rule in P has the form a -> (3 for some a GN+ and (3 G V ■ i) If a production n GP has only one nonterminal on its lefthand side, then ttGP'; ii) If Tr=AB^CDandC = A oiD = B, then tt G P'\ iii) If 7r = AB -> CD with (7^4 and D^B, then the following productions are in/'': ,45 -► (tt, A)B (tt, 4)5 -► (tt, A)(v, B) (7T, 4)(7T, 5) -► C(7T, 5) C(tt,.8) -► CD where (w, 4) and (w, 5) are new variables.
9.3 ELEMENTARY PROPERTIES OF CONTEXT-SENSITIVE LANGUAGES 275 It is clear that G' is context-sensitive (with erasing if G was not monotonic). An easy induction verifies that L(G') = L(G). □ Corollary 1 The family of context-sensitive languages with erasing equals the family of phrase-structure languages. Corollary 2 The family of monotonic languages equals the family of context-sensitive languages. Corollary 3 The family of nonterminal rewriting languages equals the family of phrase-structure languages. PROBLEMS 1 Show by examples that Fig. 9.1 cannot have any lines added. That is, there is a line between two grammar classes XG and YG in Fig. 9.1 if and only if XG £ YG where X, Y e {PS, TO, NR, CSE, M, CS}. 93 ELEMENTARY DECIDABLE AND CLOSURE PROPERTIES OF CONTEXT-SENSITIVE LANGUAGES There are some very simple properties of the context-sensitive grammars that will be developed here. Since only grammatical arguments are now available, we shall use them as much as is reasonable. There are a number of closure results that are intuitively clear when an automaton theoretic characterization is available. These results could be proven with grammars but would be less intuitive. Our first goal will be to give an algorithm that tests whether a given string belongs to a context-sensitive language. Definition Let G = (V, 2, P, S) be a context-sensitive grammar and let w G 2" for some n > 1. Define a sequence of sets Wt £ V* as follows: Wo = {S} and for each / > 0, Wi+l = W,-U{0G F+|a=>j3inG,aGH/,-,andlg(|3)<«}. The next claim summarizes the properties of the Wt. Proposition 9.3.1 Let G = (V, 2, P, S), w, and the Wt be as in the previous definition. i) For each / > 0, W,- c wi+l; ii) If Wk = Wk + 1 for some k > 0, then Wk = Wk+m forallin>l; iii) For each/^0, Wt = {peV+\S 2>p,]g(p)<n,Z<i\;
276 CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES iv) There exists k < max {2\V\",n + 1} such that Wk = Wk+l; v) Let k be the least integer such that Wh = Wk+l. Then Proof Part (i) is trivial. Part (ii) is a trivial induction on m. Part (iii) is clear, or could be verified by induction on /. To prove (iv), note that, from (iii), n W, c (J Vs (9.3.1) for each /. Taking cardinalities m < U v' y=i £ Wi < max{2|Fr,« + 1}. y=i Cf. Problem 1. Since \V\ and « are fixed, the Wt are of uniformly bounded size; and since W{ 9 Wi+l, it is clear that: Wk = Wk+l for some k where k< max {21V\", n+ l}. B Part (v) follows easily because, if S^-p, lg(/3)<«, then 5=>|3 for some fi > 0. If fi < k, then (3 G WB c i|/fe. if 8 > jfc, then, since WB = Wk by (ii), we know that pew„. □ We can now show that there is an algorithm to decide whether w G L(G), where G is an arbitrary context-sensitive grammar. Theorem 9.3.1 Let G=(V, 2,P, S) be a context-sensitive grammar. Then L(G) is recursive; i.e. there is an algorithm which, given any w G 2*, decides whether or not wGZ,(G). /W/ Let w = ai ■■ -an, n>0, a; G 2 for l</<«. If « = 0, AGZ,(G) if and only if 5 -► A is in P. Now assume that n>\. Let W= Wk of Proposition 9.3.1, where k is the smallest nonnegative integer such that Wk = Wh+l. By (v) of Proposition 9.3.1, w G L(G) if and only if w G PV. Since W is finite and has been effectively constructed, the result follows. □ Now we shall show that the context-sensitive languages are a proper subset of the class of recursive sets. The argument is a straightforward "diagonalization." The reader not familiar with the technique should study it carefully, because it is very powerful and widely applicable. Theorem 9.3.2 There are recursive sets which are not context-sensitive languages.
9.3 ELEMENTARY PROPERTIES OF CONTEXT-SENSITIVE LANGUAGES 277 Proof Let 2 be an alphabet. Let xt,x2,.. .be an effective enumeration (without repetition) of 2*. Let Gx, G2,... be an effective enumeration (without repetition) of the context-sensitive grammars with terminal alphabet 2. Define L = {x^x^UGd}. (9.3.2) Claim 1 L is recursive. Proof Given wG2*, by using the effective enumeration of 2*, we can determine the / for which xt = w. Now using the effective enumeration of the grammars, we can find Gt. By Theorem 9.3.1 it is decidable whether or not xt G L(Gt). Claim 2 L is not a context-sensitive language. Proof Assume that L is a context-sensitive language. Then L = L(Gj) for some/ (9.3.3) But now consider string xf, Xj<=L if and only if (by (9.3.3)) XjGLiGj), which holds if and only if (by (9.3.2)) xj £ L. We have shown that xj G L if and only if jcy ^ Z,, which is a contradiction. Therefore, Z, is not a context-sensitive language. □ Now we turn to the second main topic of this section, elementary closure properties of the context-sensitive languages. Our purpose here is to establish a few easy results which are best proven by elementary grammatical constructions. Since some of the same constructions work for phrase-structure languages, those will be proved at the same time. Theorem 9.3.3 The families of context-sensitive and phrase-structure languages are closed under U, ■, *, and transpose. Proof The arguments are all quite trivial and are omitted. Cf. Theorem 3.3.1 for similar constructions in the context-free case. □ In a later section, it will be seen that the context-sensitive languages are not closed under homomorphisms. Moreover, it is the case where A G ip(2) that causes problems. We now consider a restriction on homomorphisms which does allow preservation of context-sensitive languages. Definition A homomorphism ip on a set L £ 2* is said to be k-limited (k > 1 is an integer), if for each wGZ,, if w =xyz for some x, y,z G2* and ip(y) = A then lgO0<*.
278 CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES The intuition behind this definition is that for any w G L, $ never erases more than k adjacent letters within w. Example Let Lx = {apbqcr\p,q,r>\}, L2 = {apb2cq\p,q>l} and define ip by (A ifd = b <pd = {d otherwise. Now ip is not /c-limited on Lx for any k. To see that, suppose it were. Then consider w = abk*xc^Lx. Let x = a, y = bh+l, and z = c. Then tf(y) = A but lg (j>) > k. On the other hand, V is 2-limited for L2 since no word of L2 has more than two b's. Theorem 9.3.4 The family of context-sensitive languages is closed under k -limited homomorphisms. Proof Let G = (V, 1,,P, S) be a context-sensitive grammar. There is no loss of generality in assuming that G is in the form specified by Lemma 9.2.1. Let ip be a &-limited homomorphism on L(G). Note that ip: 2* -*■ A*. Define r = max{&+ l,lg(0)|a->-0is inP}. Define G = {V,1,\P',S'), where 2' - {b G A| <p(2) n A*bA* ± 0}; that is, 2' consists of all letters of A that occur in ip(2). Then, TV' = {[a]|aGF*,lg(a)<2r}, S' = [5], V' = N'U E', i>' is defined as follows: 1 If S -*■ A is in P or if there exists x G L (G) such that «p (x) = A then [5] - A ■ • n* isinr . 2 If a =>0 in G and [a], [|3] G N', then [a] - \p] isinP'.
9.3 ELEMENTARY PROPERTIES OF CONTEXT-SENSITIVE LANGUAGES 279 3 For any [a], [0, ]...., [0m ] G N1 such that a =* /J, ■ • ■ /3m in C where lg(/3;) = r for each /, 1 <i<m, and r < lg(/3m) < 2r, then [«] - Wi] •■•[/3m] isinP'. 4 For any [a,], [a2], [/J,],..., [0m] in N1 such that <*i<*2 ^ Ui ■■■ftn where lg(ft-) = r for each /', 1 <i<m, and r<lg(/3;)<2r and r<lg(aI),lg(a2)<2r, then [«i][«a] - [Ui] •■[/3m] isinP'. 5 For each [jc] G TV', jc G E*, v?(x) ¥= A, [x] -► ^.(ac) isinP'. All the rules of P' are context-free except those of (4). G' is monotonic, so by Theorem 9.2.1, L(G'), is context-sensitive. Moreover, G' is effectively constructable since, in part (1), we can check' to see whether there exists xEL(G) such that ip(x) = A, because ip is k-limited on L(G) and we need check only those strings x where lg(x) < k. This can be done by Theorem 9.3.1. The tests required in parts (2), (3), and (4) can be carried out, so G' is effectively constructable from G. We leave to the reader the verification that L(G') = ^p(L(G)). □ Corollary The family of context-sensitive languages is closed under A-free homomorphisms. PROBLEMS 1 Show that n . [2\V\n if|K|>l, I IKK < hi [n+l if|K|=l. 2 In subsequent sections, we shall learn that there are phrase-structure languages which are not recursive. Where do the proofs of Theorem 9.3.1 and Proposition 9.3.1 break down if G was assumed to be an arbitrary phrase-structure grammar? t Later we will see that the question L (G) = 0 is unsolvable for context-sensitive grammars G.
280 CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES 3 What is the recognition time of the algorithm given in Theorem 9.3.1? Give this as a function of n, the length of the input. 4 Show that the family of phrase-structure languages is closed under a) substitution (and hence homomorphisms); b) gsm mappings; c) inverse gsm mappings. 5 Show that the family of context-sensitive languages is closed under a) A-free substitution; [ip is a A-free substitution if for each s6E, A^ b) inverse gsm mappings. 6 Complete the proof of Theorem 9.3.4 by showing that Z-(G') = *p(L(G)). 7 Formulate suitable restrictions on the type of substitution mappings under which the context-sensitive languages are closed. 8 Give effective enumerations of 2 and of the context-sensitive grammars over 2, as described in the proof of Theorem 9.3.2. 9 Does it matter whether or not there are repetitions in the enumerations used in the proof of Theorem 9.3.2? 9.4 TIME-AND TAPE-BOUNDED TURING MACHINES- 9 AND JV3> In this section, we shall take a close look at Turing machines and examine some implications of nondeterminism versus determinism. We shall also consider time- and tape-bounded computations, as well as two particularly important classes of sets called & and ^V&. Our coverage of these topics will not be thorough, as we need only a few results for our purposes. This general topic is important and textbooks should become available on this subject soon. We begin with an intuitive version of a fc-tape nondeterministic Turing machine. Such a device is illustrated in Fig. 9.2. Finite - state control Y I • • • Tape 1 ••• Tape 2 ; : • • • Tape k FIG. 9.2
9.4 TURING MACHINES 281 In this version, the fc-work tapes are semi-infinite (to the right) and the first tape will contain the input. It can be shown that the results to be obtained are very insensitive to these details. It is now necessary to give a precise definition. Definition A k-tape nondeterministic Turing machine is a 7-tuple A = (Q, 2, T, B, S,q0,F), where: i) Q is a finite nonempty set of states; ii) 2 is a finite nonempty set of input letters; iii) T is a finite nonempty set of work symbols and 2 Q V; iv) B € r — 2 is the blank symbol; v) Qo e Q is the initial state; vi) F£ Q is the set of final or accepting states; vii) 5 is a mapping from Q x Tfe into subsets of Q x (r x {—1,0, l})fe. 5 is called the transition function. Such a device is called a k-tape deterministic Turing machine if \8(q, ai,... ,ak)\ < 1 foreach^G2;a1;... ,ak G r. Next, we define an instantaneous description or ID of a k-tape Turing machine. Intuitively, an ID must tell us what state the device is in, the contents of each work tape, and the head position on each tape. Definition Let A = (Q, 2, T, B, 5, q0,F) be a fc-tape nondeterministic Turing machine. An instantaneous description or ID of A is any element of Q(T*{\}T+)k, where we assume that Q n r = 0 and that I" is a new symbol not in Q U T. Thus a typical ID is of the form ^(air|31,...,afer/3fe), which indicates that the contents of the /th tape is a,-/3,- and that the read-write head on this tape is scanning the first letter of ft. For single-tape Turing machines, one sometimes writes the ID as otq0. Now we are ready to indicate how the device works. This is done by defining a "move" relation on ID's. To avoid a surplus of notation, an informal definition is given. Definition Let A = (Q, 2, r,B,8,q0,F) be a fc-tape nondeterministic Turing machine. If a is an ID of A, where a = q{ux \alv'[,...,uk\akvl) and (q', (bud!),..., (bk, dk)) isin5(q,a1,... ,ak), then A rewrites each a,- by bj and moves one position to the right (left) on tape / if dj = + 1 (—1), while, if d{ = 0, the head position on tape /' does not change. If any v'j = A and dt = + 1, then the contents of tape / is represented by utbi tB in 0. If 0 is
the resulting ID, we write α ⊢ β. As usual, ⊢* is the reflexive-transitive closure of ⊢.

An ID is called initial if it is of the form q₀(↑w, ↑B, ..., ↑B), where† w ∈ Σ⁺ ∪ {B}. An accepting ID is any element of F(Γ*{↑}Γ⁺)^k.

Now we can discuss acceptance by a nondeterministic Turing machine.

Definition Let A = (Q, Σ, Γ, B, δ, q₀, F) be a k-tape nondeterministic Turing machine and let w ∈ Σ* ∪ {B}. A accepts w (or, when w = B, A accepts Λ) if
q₀(↑w, ↑B, ..., ↑B) ⊢* q(α₁, ..., α_k)
for some q ∈ F and some α₁, ..., α_k ∈ Γ*{↑}Γ⁺. We also write
T(A) = {w ∈ Σ* | A accepts w}.

Next, we name the sets defined by a Turing machine.

Definition A set L is said to be recursively enumerable (or r.e., for short) if L = T(A) for some Turing machine A.

It can easily be seen that this definition is insensitive to whether or not A is deterministic or has multiple tapes, etc. Thus an r.e. set has the property that there is a procedure which returns the value true if the string is in the set; but, if the string is not in the set, the procedure may return the value false or it may fail to halt. Of course, we can't determine whether a Turing machine will stop or not.

Definition A set L ⊆ Σ* is recursive if both L and Σ* − L are r.e.

Thus, a recursive set is one for which we can determine whether or not a given string is a member. This corresponds to our previous informal usage of the phrase.

This formalism and the ideas involved can be appreciated by studying a detailed example.

Example Let L = {xzyz^T | x, y, z ∈ {a, b}*, z ≠ Λ}. We shall design a 2-tape nondeterministic Turing machine which accepts L. The idea is quite simple. We shall read the input, which is on the first tape. We shall guess when z begins and ends, and during z, we write it on tape 2. After this is done, we skip merrily along, ignoring y. When we guess that z^T begins, we compare the input against tape 2. We have Σ = {a, b}, Γ = Σ ∪ {c, B}, F = {q₅}, and δ is defined as follows.
i) For each d ∈ Σ,
δ(q₀, d, B) = {(q₁, (d, 0), (c, +1))}.
Mark the left end of tape 2 with a c.

† An initial ID in which w = Λ is represented by q₀(↑B, ..., ↑B).
ii) For each d ∈ Σ,
δ(q₁, d, B) = {(q₁, (d, +1), (B, 0)), (q₂, (d, +1), (d, +1))}.
The first alternative is a guess that the input letter is part of x and should be skipped over. The second alternative is to assume that it is part of z and should be written on tape 2.
iii) For each d ∈ Σ,
δ(q₂, d, B) = {(q₂, (d, +1), (d, +1)), (q₃, (d, 0), (B, −1))}.
The first alternative continues to write z on tape 2, while the second one prepares to consider that y is next.
iv) For each d, e ∈ Σ,
δ(q₃, d, e) = {(q₃, (d, +1), (e, 0)), (q₄, (d, 0), (e, 0))}.
Either skip over y or prepare to check z^T against tape 2.
v) For each d ∈ Σ,
δ(q₄, d, d) = {(q₄, (d, +1), (d, −1))}.
Go right on tape 1 and left on tape 2 to check for z^T.
vi) δ(q₄, B, c) = {(q₅, (B, 0), (c, 0))}. Stop and accept.

The reader should follow several computations of this Turing machine on a given input. For instance, it is worth checking that there is no computation sequence that accepts ab.
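Since the guesses made in rules (ii) through (iv) are finite in number, they can be resolved by exhaustive search. The following sketch is ours, not the text's (plain Python, with strings over {a, b}): it tries every choice of where z begins and ends, which is exactly the set of computations of the machine above.

    # Illustrative sketch (ours): deterministic search over the guesses made
    # by the 2-tape nondeterministic machine above.
    def accepts(w):
        """True iff w = x z y z^T for some x, y, z in {a,b}* with z nonempty."""
        n = len(w)
        for i in range(n):                  # guess: z starts at position i
            for j in range(i + 1, n + 1):   # guess: z = w[i:j], nonempty
                z = w[i:j]                  # tape 2 now holds z
                rest = w[j:]                # the reversed copy must end the input
                if len(rest) >= len(z) and rest[len(rest) - len(z):] == z[::-1]:
                    return True
        return False

    assert accepts("abba")      # x = y = Lambda, z = "ab"
    assert not accepts("ab")    # no computation sequence accepts ab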
There are many variants of Turing machines. One of the most common is a Turing machine with a separate input tape which is delimited by endmarkers. This input tape may be read but not written. Another variation concerns whether or not a Turing machine must halt in order to accept. In some instances this is desirable, while for other results just passing through a final state is sufficient. Halting is represented by having a final state q_f such that δ(q_f, a₁, ..., a_k) = ∅ for all a₁, ..., a_k ∈ Γ.

Another variant in the way in which a Turing machine accepts concerns "on-line" versus "off-line" computation. Let A be a Turing machine with a separate read-only input tape. We equip the device with an output tape on which it may write a zero or a one. A computation of A on input a₁···a_n is on-line if A never goes left on the input and writes a one or a zero on the output after reading a_i (indicating whether a₁···a_i was accepted or rejected) but before reading a_{i+1}. A computation is sometimes said to be off-line if it is not on-line. It is much easier to prove lower bounds on on-line computations, because the form that the computation takes is so restricted.

Next we deal with measuring the time or space complexity of a computation.

Definition Let A be a k-tape Turing machine and let S and T be functions from ℕ into ℕ, where ℕ denotes the set of natural numbers. A is said to be of time complexity T(n) if, for each accepted input of length n, there exists some computation sequence of at most T(n) moves which leads to an accepting ID. Similarly, A is of space or tape complexity S(n) if, for every accepted input of length n, there is some accepting computation in which at most S(n) different squares are scanned by any read-write head.

Example If we return to the previous example, it is clear that T(n) = n + 4 and S(n) = n + 1.

When we have a Turing machine with a separate read-only input, we measure only the space used on the work tapes, as the input tape cannot be changed. Thus we may talk about accepting a set in, say, log n space.

One may wonder which functions could be the space bound of a computation.

Definition A function S from ℕ into ℕ is a measurable space function, or measurable, if there is a Turing machine A which has one work tape and a read-only input tape and which visits exactly S(n) squares on its work tape, where n is the length of the input.

For example, it is easy to see that log n is measurable. So is any polynomial in n or in log n, as well as functions like 2^n. Of course, nonmeasurable functions exist, by a counting argument. Techniques for exhibiting them will occur in the exercises.

The first result to be proved here will be useful in estimating the tape complexity of language-recognition problems.

Theorem 9.4.1 If a set L is accepted by a deterministic k-tape Turing machine A with space bound S(n), then L is accepted by some deterministic one-tape Turing machine C with space bound S(n).

Proof The proof is quite intuitive, so we content ourselves with a sketch. In Fig. 9.3, we show what A looks like at some given time. Figure 9.4 shows how C represents all of this information on a single tape.

[FIG. 9.3: A three-tape Turing machine; tape 1 holds a₁a₂···a₉, tape 2 holds b₁···b₅, tape 3 holds c₁c₂, with one read-write head per tape.]
[FIG. 9.4: A single-tape Turing machine which simulates a three-tape machine. The single tape carries one track per simulated tape, holding a₁a₂···a₉, b₁···b₅, and c₁c₂, together with a pointer marking each simulated head position; q is only one component of the state.]

To simulate a move of A, C must visit each of the cells which are pointed at, recording in its state the symbol scanned by each of A's heads. When all this information is available, C determines A's move. C then goes back to each tape, changes the symbol scanned, and moves the pointers as A would require. It should be clear that T(C) = T(A). □

Our next development shows that constant factors do not matter in computing space bounds. The analogous result for time is left for Exercise 2.

Theorem 9.4.2 Let ε > 0 be any real number. If L is accepted by a deterministic k-tape Turing machine A with space complexity S(n), then there is a deterministic k-tape Turing machine C which accepts L and has space complexity εS(n).

Proof The argument merely involves recoding the tape alphabet of A to use, as basic symbols, blocks of length r, where r is the least integer such that rε > 1. □

In preparation for the next theorem, a lemma about space-bounded Turing machines is useful.

Lemma 9.4.1 Let A = (Q, Σ, Γ, B, δ, q₀, F) be a k-tape Turing machine of tape complexity S(n), and let w ∈ Σ* be an input of length n.
i) The number of different ID's which can arise from w is at most c^{S(n)}, where c is a constant independent of n.
ii) If A accepts w, then there is an accepting computation
α₀ = q₀(↑w, ↑B, ..., ↑B) ⊢ α₁ ⊢ ··· ⊢ α_{k−1},
where (1) α_i ≠ α_j for all i ≠ j and (2) k ≤ c^{S(n)}.
Proof Since lg(w) = n, A can visit at most S(n) squares on any work tape. Thus the total number of ID's which can occur during the computation is
|Q| (|Γ| + 1)^{kS(n)} (S(n))^k,
because there are |Q| choices for the state and, for each one of the k tapes, there are at most (|Γ| + 1)^{S(n)} tape contents† and S(n) places for the head to be (and so a total of (S(n))^k head positions). It is now clear that any accepting computation of A on w can be converted to one which satisfies (ii), because any duplicate ID's involve a "loop," which can be omitted. Note that
|Q| (|Γ| + 1)^{kS(n)} (S(n))^k ≤ c^{S(n)}
for a suitable constant c independent of n. The same constant may be used in both parts of the lemma. □

It is a well-known fact that nondeterministic and deterministic Turing machines are equivalent in accepting power, and this is proved in any course on the subject. We prove a stronger result, namely, a relationship between the space complexities of the nondeterministic and deterministic machines.

Theorem 9.4.3 (Savitch's Theorem) Let S(n) ≥ n be a measurable space function. If a set L is accepted by a nondeterministic Turing machine A of tape complexity S(n), then L is accepted by some deterministic Turing machine C of tape complexity (S(n))².

Proof Suppose we have w ∈ T(A) and lg(w) = n. Then α ⊢^k γ, where α is the initial ID, γ is some accepting ID, and k ≥ 0. Now this computation takes place if and only if there exists an ID θ so that
α ⊢^{⌈k/2⌉} θ ⊢^{⌊k/2⌋} γ.
Recall that ⌈k/2⌉ + ⌊k/2⌋ = k for all k. The idea of the proof is to check for such a θ (there are only finitely many of them, by part (i) of Lemma 9.4.1). If we were working on a model of computation which allowed us to have recursive programs, the result would be trivial. As it is, we must carry out the analysis far enough to see that all the computations can be done on a Turing machine.

Let us define a predicate P(η, α, k), which is supposed to be true if there is a computation of A from η to α of length k which uses at most space S(n). P is defined for all ID's η and α which use at most space S(n), and all k ≥ 0, as follows.

† If we used exactly S(n) squares of tape, the bound would be |Γ|^{S(n)}; but we could use less, so the bound is |Γ| + |Γ|² + ··· + |Γ|^{S(n)} ≤ (|Γ| + 1)^{S(n)}.
P(η, α, k) =
  true if k = 0 and η = α,
  true if k = 1 and η ⊢ α,
  true if k > 1 and, for some ID θ which uses no more than tape S(n), both P(η, θ, ⌈k/2⌉) and P(θ, α, ⌊k/2⌋),
  false in all other cases.

It is easy to see that P may be calculated, because all of the ID's which use at most space S(n) may be enumerated. Let f(η, α, k) be the number of intermediate ID's that must be stored in evaluating P(η, α, k). It is easy to check by induction that this quantity is ≤ ⌈log₂ k⌉; and when storage is also provided for η and α, the bound is 2 + ⌈log₂ k⌉.

The strategy for constructing the deterministic simulating machine C now becomes clear. In view of Lemma 9.4.1, we can restrict ourselves to accepting computations of length ≤ c^{S(n)}. C will deterministically compute P(α, γ, c^{S(n)}) for all possible accepting configurations γ which use at most S(n) tape. (We know the number of these, by Lemma 9.4.1 again.) The number of ID's to be stored is at most
2 + ⌈log c^{S(n)}⌉ = O(S(n)).
On the other hand, the length of an ID is O(S(n)), and the total space which will be used is O((S(n))²). Use of Theorem 9.4.2 allows all constants to be reduced to unity. □
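The recursion defining P transliterates directly into a program. The sketch below is ours (Python); the names ids and step are assumed interfaces, an enumeration of the ID's that fit in space S(n) and the single-move relation ⊢, which a real simulator would generate in place rather than store.

    # Sketch (ours) of the predicate P from Savitch's proof.  `ids` is the
    # finite collection of ID's using at most space S(n); step(x, y) holds
    # iff x |- y.  If step is also allowed to "stay" (step(x, x) true), P
    # decides reachability in at most k moves.
    def P(eta, alpha, k, ids, step):
        if k == 0:
            return eta == alpha
        if k == 1:
            return step(eta, alpha)
        half = -(-k // 2)                                  # ceil(k/2)
        return any(P(eta, theta, half, ids, step) and
                   P(theta, alpha, k - half, ids, step)    # floor(k/2)
                   for theta in ids)

Evaluating P this way keeps one intermediate ID per level of recursion, and the depth is ⌈log₂ k⌉; with k = c^{S(n)}, this is the 2 + ⌈log c^{S(n)}⌉ = O(S(n)) count in the proof, each ID of length O(S(n)), giving space O((S(n))²).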
Some notation is useful in dealing with the various classes of languages accepted by time- and tape-bounded Turing machines. The notation facilitates the classification of languages and the study of relationships between classes.

Definition Fix some countably infinite alphabet and assume that all the machines which are discussed have input alphabets which are finite subsets of this alphabet.
a) DTIME(T(n)) (respectively, NTIME(T(n))) is the family of all sets accepted by deterministic (respectively, nondeterministic) Turing machines which operate within time bound T(n).
b) DSPACE(S(n)) (respectively, NSPACE(S(n))) is defined analogously to part (a), except that "time" is replaced by "space."
c) 𝒫 = ∪_{k≥1} DTIME(n^k)
d) 𝒩𝒫 = ∪_{k≥1} NTIME(n^k)
e) 𝒫-SPACE = ∪_{k≥1} DSPACE(n^k)

For instance, Savitch's theorem asserts that NSPACE(S(n)) ⊆ DSPACE((S(n))²). From this, it follows that
𝒫-SPACE = ∪_{k≥1} NSPACE(n^k).

One reason for the importance of these concepts is that various problems can be coded into languages, and then we may speak of reducing one problem to another. More precisely, we have the following.

Definition A language L₁ ⊆ Σ* is polynomially reducible or p-reducible to a language L₂ ⊆ Δ*, written L₁ ≤_p L₂, if there is a function f: Σ* → Δ*, computed by a deterministic polynomially time-bounded Turing machine, such that x ∈ L₁ if and only if f(x) ∈ L₂.

Proposition 9.4.1
a) If L₂ ∈ 𝒫 and L₁ ≤_p L₂, then L₁ ∈ 𝒫.
b) The relation ≤_p is transitive; that is, if L₁ ≤_p L₂ and L₂ ≤_p L₃, then L₁ ≤_p L₃.

Next, we define the important notions of "hard" and "complete."

Definition A set L is 𝒩𝒫-hard if, for each L′ ∈ 𝒩𝒫, we have L′ ≤_p L. L is 𝒩𝒫-complete if L ∈ 𝒩𝒫 and L is 𝒩𝒫-hard.

Proposition 9.4.2
a) Let L₁ be 𝒩𝒫-complete. If L₂ ∈ 𝒩𝒫 and L₁ ≤_p L₂, then L₂ is also 𝒩𝒫-complete.
b) Let L be 𝒩𝒫-complete. Then L ∈ 𝒫 if and only if 𝒫 = 𝒩𝒫.

Part (b) focuses attention on an open question, namely, whether 𝒫 = 𝒩𝒫. It turns out that a large number of important problems can be shown to be 𝒩𝒫-complete. If any of these problems can be shown to be in 𝒫, then they all will be.
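A toy instance may make ≤_p concrete. Everything in the sketch below (Python) is our own illustration, not material from the text: L₁ = {a^n b^n | n ≥ 0}, L₂ = {c^n d^n | n ≥ 0}, and f is a letter-renaming map, trivially computable in polynomial time.

    # Toy illustration (ours) of a p-reduction f witnessing L1 <=p L2.
    def f(x):
        """Letter-for-letter renaming: linear, hence polynomial, time."""
        return x.replace("a", "c").replace("b", "d")

    def in_L2(y):
        n = len(y) // 2
        return y == "c" * n + "d" * n

    def in_L1(x):
        return in_L2(f(x))          # x in L1  iff  f(x) in L2

    assert in_L1("aabb") and not in_L1("aba")

Composing two such maps gives Proposition 9.4.1(b): a polynomial of a polynomial is again a polynomial, so ≤_p is transitive.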
PROBLEMS

1 Design a deterministic Turing machine to accept
L = {xzyz^T | x, y, z ∈ {a, b}*, z ≠ Λ}.
What time and tape complexity does your machine have?
2 Given a deterministic k-tape Turing machine A with k > 1, which accepts a set L with time complexity T(n), and given any real number ε > 0, show that there is a k-tape Turing machine B which accepts L and has time complexity n + εT(n).
3 Does the speed-up theorem of Problem 2 imply that a linear-time computation (T(n) = c₁n + c₂) can be speeded up to a real-time computation (T(n) = n)?
4* Speed-up theorems have been extensively studied and proved in abstract form for arbitrary measures of computation that include time and space. We state a general version of the result below for space. Prove the result.

Theorem Let A₀, A₁, ... be an enumeration of deterministic Turing machines with a fixed input alphabet. For any recursive function r, there exists a recursive set L such that, if A_i (the ith Turing machine) accepts L in space S_i(n), then there exists a Turing machine A_j which accepts L in space S_j(n) and
r(S_j(n)) ≤ S_i(n)
almost everywhere (i.e., with only finitely many exceptions).

5 Show that, for each recursive function r, there is a recursive function T which is monotone increasing, T(n) ≥ r(n), such that any language L recognizable in time h(T(n)), where h is any recursive function, is also recognizable in time T(n). This result is called the "gap theorem" because it shows that there are arbitrarily large gaps among the recursive functions which contain no running times.
6 Give an example of a nonmeasurable space function.
7 The proof of Savitch's theorem is merely a sketch. Complete the argument by giving a detailed construction and the necessary proof. Extend the theorem to the case where S(n) ≥ log n.
8 Extend Savitch's theorem to nonmeasurable space functions.

A Turing machine is said to be realtime if it is a deterministic Turing machine with time bound T(n) = n and has a separate input tape. In Problems 9 through 13, the input tape is not counted among the work tapes.

9 Can a realtime Turing machine accept any deterministic context-free language?
10 Find large classes of functions computed by realtime Turing machines.
11* Show that the set L ⊆ Σ*, where Σ = {a, b, c, d, e, f}, given by
L = {uveu^T | u ∈ {a, b}*, v ∈ {c, d}*} ∪ {uvfv^T | u ∈ {a, b}*, v ∈ {c, d}*},
can be accepted by a realtime Turing machine with two work tapes, but by no realtime Turing machine with one work tape.
12* (Very difficult) Find a family of sets {L_k} such that L_k can be recognized in real time by a k-tape Turing machine but not by any (k − 1)-tape realtime Turing machine.
13* Let Σ_n = {a₁, ..., a_n, ā₁, ..., ā_n}. Consider the set O_n, which encodes the "origin-crossing problem," defined by
O_n = {w ∈ Σ_n* | for each i, 1 ≤ i ≤ n, #_{a_i}(w) = #_{ā_i}(w)}.
The name of this set derives from thinking of a_i (respectively, ā_i) as an instruction to take one step to the right (left) along the ith axis in n-space. Show that O_n is realtime recognizable by a Turing machine with one work tape.
14 Consider the following set L ⊆ Σ*, where Σ = {a, b, c}:
L = {w₁c···cw_{2^k}cu₁c···cu_l | w_i, u_j ∈ {a, b}*, lg(w_i) = k = lg(u_j) for all i, j; l, k ≥ 1; there exists some j, 1 ≤ j ≤ 2^k, such that u_l = w_j}.
Show that any on-line deterministic multitape Turing machine that accepts L requires time T(n), where T(n) ≥ c(n²/(log n)²) infinitely often, for some constant c. [Hint: Count the number of different ID's which could exist after all the w_i's are read.]
15 Show that if there exists a real number r ≥ 1 such that DSPACE(n^r) ⊆ 𝒫, then 𝒫 = 𝒩𝒫 = 𝒫-SPACE.
16 Show that there exists no pair of real numbers r and s, 1 ≤ r ≤ s, such that
DSPACE(n^r) ⊆ 𝒫 ⊆ DSPACE(n^s) or NSPACE(n^r) ⊆ 𝒫 ⊆ NSPACE(n^s).
A similar result holds for 𝒩𝒫.
17 Show that, for any real number r ≥ 1,
a) 𝒫 ≠ DSPACE(n^r)
b) 𝒫 ≠ NSPACE(n^r)
c) 𝒩𝒫 ≠ DSPACE(n^r)
d) 𝒩𝒫 ≠ NSPACE(n^r)

9.5 MACHINE CHARACTERIZATIONS OF THE CONTEXT-SENSITIVE AND PHRASE-STRUCTURE LANGUAGES

Similar characterizations of the phrase-structure and context-sensitive languages can be given. The former family is exactly the class of sets accepted by unrestricted Turing machines. The other family is accepted by nondeterministic linearly space-bounded Turing machines. The proofs of the characterizations are so similar that we shall consider them at the same time.

Definition A linear bounded automaton, or lba for short, is a one-tape nondeterministic Turing machine which is (S(n) = n + 1)-space bounded, and which accepts by leaving the input tape at the right end in a final state.

The fact that S(n) = n + 1 instead of S(n) = n in the space bound arises from the fact that a Turing machine must scan the blank to the right of the last symbol of the input in order to detect the right end of the input string. Since we know that constants do not change a space-complexity class (by Theorem 9.4.2), an lba has a linear amount of space at its disposal.

Our first result will be to show that each set accepted by an lba is a context-sensitive language.

Lemma 9.5.1 For each lba C, there is a monotonic grammar G such that L(G) = T(C).

Proof Let L = T(C), where C = (Q, Σ, Γ, B, δ, q₀, F) is an lba. Construct G = (V, Σ, P, S), where
V = {S, A} ∪ Σ ∪ ((Γ ∪ Γ′) × Σ) ∪ ((Q ∪ {T}) × (Γ ∪ Γ′) × Σ),
with T a new symbol not in Q; Γ′ denotes a primed copy of Γ (a prime marks the rightmost tape square). The rules of P are given by cases.
(Tape generation) For each a ∈ Σ, P contains:
S → (q₀, a, a)A
A → (a, a)A
A → (a′, a)
S → (q₀, a′, a)
S → Λ if q₀ is in F.
(Stay) If (q, b, 0) ∈ δ(p, a), then, for each c ∈ Σ, P contains
(p, a, c) → (q, b, c)
(p, a′, c) → (q, b′, c)
(Move left) If (q, b, −1) ∈ δ(p, a), then, for each c, e ∈ Σ and d ∈ Γ, P contains
(d, e)(p, a, c) → (q, d, e)(b, c)
(d, e)(p, a′, c) → (q, d, e)(b′, c)
(Move right) If (q, b, +1) ∈ δ(p, a), then, for each c, e ∈ Σ and d ∈ Γ, P contains
(p, a, c)(d, e) → (b, c)(q, d, e)
(p, a, c)(d′, e) → (b, c)(q, d′, e)
(Accept) If (q, b, +1) ∈ δ(p, a) and q ∈ F, then, for each c ∈ Σ, P contains
(p, a′, c) → (T, a, c).
(Retrieve original input) For each b, d ∈ Σ and a, c ∈ Γ, P contains
(a, b)(T, c, d) → (T, a, b)(T, c, d) and (T, a, b) → b

Rather than give a formal proof that the construction works and L(G) = T(C), we shall explain each phase of the construction.
1. S generates a string (q₀, a₁, a₁)(a₂, a₂)···(a_{n−1}, a_{n−1})(a_n′, a_n). In this string, (q₀, a₁, a₁) denotes that C is scanning a₁ in state q₀.
2-4. These rules simply simulate the action of C on the tape. The first coordinate of each ordered pair is used to record the changes that C makes to its tape.
5. When the device has entered a final state, we have
S ⇒* (b₁, a₁)···(b_{n−1}, a_{n−1})(T, b_n, a_n).
6. Then T sends a signal to the left, and we get T's throughout; that is,
(T, b₁, a₁)···(T, b_n, a_n).
Finally, a₁···a_n is produced as the terminal string.
Note that the construction is still valid when n ≤ 1. Note also that if C "guessed wrong," the grammar will not generate anything, since it will have a string with variables in it that cannot be rewritten as terminals. □
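The rule schemata in this proof are produced mechanically from δ, and it may help to see that done. The following sketch is ours (Python): it emits only the three simulation schemata (Stay), (Move left), and (Move right); the encoding of triples as tuples and of primed symbols by the suffix "'" is our own choice, not the text's.

    # Sketch (ours): generating the simulation rules of Lemma 9.5.1.
    # delta maps (p, a) to a set of moves (q, b, d) with d in {-1, 0, +1};
    # SIGMA and GAMMA are the input and work alphabets.
    def rules(delta, SIGMA, GAMMA):
        P = []
        for (p, a), moves in delta.items():
            for (q, b, d) in moves:
                for c in SIGMA:
                    if d == 0:                      # (Stay)
                        P.append(([(p, a, c)], [(q, b, c)]))
                        P.append(([(p, a + "'", c)], [(q, b + "'", c)]))
                    elif d == -1:                   # (Move left)
                        for e in SIGMA:
                            for g in GAMMA:
                                P.append(([(g, e), (p, a, c)],
                                          [(q, g, e), (b, c)]))
                                P.append(([(g, e), (p, a + "'", c)],
                                          [(q, g, e), (b + "'", c)]))
                    else:                           # (Move right)
                        for e in SIGMA:
                            for g in GAMMA:
                                P.append(([(p, a, c), (g, e)],
                                          [(b, c), (q, g, e)]))
                                P.append(([(p, a, c), (g + "'", e)],
                                          [(b, c), (q, g + "'", e)]))
        return P

Every emitted production has sides of equal length, so the grammar assembled from them is monotonic, as the lemma requires.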
Now, we prove a similar result for r.e. sets.

Lemma 9.5.2 If L = T(A) for some Turing machine A, then there is a phrase-structure grammar G such that L = L(G).

Proof Let L = T(A), where A = (Q, Σ, Γ, B, δ, q₀, F) is a Turing machine. There is no loss of generality in assuming that A is deterministic and has one tape. We construct a phrase-structure grammar G = (V, Σ, P, S) such that L(G) = L. G will nondeterministically generate some word in Σ* and simulate the action of A on it. Let
V = Σ ∪ ((Σ ∪ {Λ}) × Γ) ∪ Q ∪ {S, T, U}.
P consists of the following productions.
1 S → q₀T
2 T → [a, a]T for each a ∈ Σ
3 T → U
4 U → [Λ, B]U (Recall that B is the blank.)
5 U → Λ
6 p[a, C] → [a, D]q for each a ∈ Σ ∪ {Λ}, q ∈ Q, and C ∈ Γ, such that δ(p, C) = (q, D, +1)
7 p[a, C] → q[a, D] for each a ∈ Σ ∪ {Λ}, q ∈ Q, and C ∈ Γ, such that δ(p, C) = (q, D, 0)
8 [b, E]p[a, C] → q[b, E][a, D] for each a, b ∈ Σ ∪ {Λ}, E, C ∈ Γ, and q ∈ Q, such that δ(p, C) = (q, D, −1)
9 q[a, C] → qaq and q → Λ for each a ∈ Σ ∪ {Λ}, C ∈ Γ, and q ∈ F.
A generation must begin with rules (1) and (2), and yields
S ⇒* q₀[a₁, a₁][a₂, a₂]···[a_n, a_n]T
with each a_i ∈ Σ. If A accepts a₁···a_n, then it stops and so never uses more than some m cells to the right of the input. Next, use rule (3), then rule (4) m times, and then rule (5). This yields
S ⇒* q₀[a₁, a₁]···[a_n, a_n][Λ, B]^m.
Only rules (6), (7), and (8) are used until a final state is reached. Note that the first components of the tape symbols are never changed. We claim the following.

Claim If q₀a₁···a_n ⊢* X₁···X_{r−1}qX_r···X_{n+m}, then
q₀[a₁, a₁][a₂, a₂]···[a_n, a_n][Λ, B]^m ⇒* [a₁, X₁][a₂, X₂]···[a_{r−1}, X_{r−1}]q[a_r, X_r]···[a_{n+m}, X_{n+m}],
where a_i = Λ for i > n and X_j ∈ Γ for 1 ≤ j ≤ n + m.

Proof of claim The argument is a straightforward induction on the number of moves and is omitted.
By (9), q ∈ F implies that
[a₁, X₁]···[a_{r−1}, X_{r−1}]q[a_r, X_r]···[a_{n+m}, X_{n+m}] ⇒* a₁···a_n.
Thus, T(A) ⊆ L(G). The reverse argument is again an induction, which is omitted. □

Now we turn to the arguments which take us from grammars to automata.

Lemma 9.5.3 If L is a context-sensitive language, then L = T(A) for some lba A.

Proof Let L = L(G), where G = (V, Σ, P, S) is context-sensitive. The lba A is constructed to work in the following way. Suppose, without loss of generality, that A has endmarkers. This is justified in Problem 1 at the end of this section. A will work as follows:
1 (Initialization) A divides the input into two channels. Channel 1 contains the original input string, while channel 2 is for scratch work. A writes S on channel 2 at the lefthand end (next to the endmarker).
2 (Search) A slides back and forth across the tape until it finds some string α on channel 2 such that α → β is in P.
3 (Replace) Replace α by β and slide the symbols on the rest of channel 2 to the right if necessary. It will not be necessary to move symbols to the left. (Why?) If it is necessary to move a symbol past the right endmarker, A goes to a "dead" state and halts without accepting.
4 (Decide) A guesses whether to continue or quit. If it continues, it goes back to 2 and repeats the cycle. Otherwise it goes to 5.
5 (Compare) A moves to the left end of the input and then compares channels 1 and 2. If they agree, A accepts. Otherwise A halts without accepting.
It is easily seen that this construction works and that T(A) = L. □

Combining our results leads to the following important theorem.

Theorem 9.5.1 L is a context-sensitive language if and only if there is an lba A such that L = T(A).

We can define a deterministic lba in the obvious way, and define a deterministic context-sensitive language to be a set accepted by such a device. It can easily be seen that the family of deterministic context-sensitive languages has many attractive features. For example, it is a boolean algebra. It is not known whether or not the deterministic and nondeterministic lba's are equivalent in computational power.

Now we finish our simulations of grammars by automata.

Lemma 9.5.4 If L is generated by a phrase-structure grammar, then L = T(A) for some Turing machine A.

Proof The argument is quite similar to the proof of the previous lemma and may be safely omitted. □
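Resolving the guesses of the lba in Lemma 9.5.3 deterministically gives a simple, if slow, membership test for monotonic grammars; every sentential form explored fits within the length of the input, which is exactly what channel 2 holds. The sketch is ours (Python), with rules given as pairs of strings.

    # Sketch (ours) of the search behind Lemma 9.5.3: for a monotonic
    # grammar G (rules (alpha, beta) with len(alpha) <= len(beta)), w is in
    # L(G) iff S derives w through forms of length <= len(w).
    def derives(G, start, w):
        n, seen, stack = len(w), set(), [start]
        while stack:
            form = stack.pop()
            if form == w:
                return True
            for alpha, beta in G:            # try every rule at every position
                i = form.find(alpha)
                while i != -1:
                    new = form[:i] + beta + form[i + len(alpha):]
                    if len(new) <= n and new not in seen:
                        seen.add(new)
                        stack.append(new)
                    i = form.find(alpha, i + 1)
        return False

    G = [("S", "ab"), ("S", "aSb")]          # {a^n b^n : n >= 1} as a test
    assert derives(G, "S", "aabb") and not derives(G, "S", "abab")

The deterministic version pays for determinism with the table of visited forms; the nondeterministic lba instead guesses a single derivation path and needs only the tape itself.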
Combining the previous results leads to a characterization of phrase-structure languages.

Theorem 9.5.2 The following conditions are equivalent.
1 L is a phrase-structure language.
2 L is accepted as T(A) for some nondeterministic (or deterministic) multitape (or single-tape) Turing machine A.
3 L is recursively enumerable.

Now that both the phrase-structure languages and the context-sensitive languages have been given machine characterizations, our coverage of the Chomsky hierarchy is complete.

There are two classical open questions about the context-sensitive languages which should be stated. Is every context-sensitive language accepted by a deterministic lba? In other words, is it the case that
NSPACE(n) = DSPACE(n)?
Another question of interest is whether the context-sensitive languages are closed under complementation. We know that DSPACE(n) is closed under complementation, so that, if NSPACE(n) = DSPACE(n), then we have closure of NSPACE(n) under complementation. It could be the case that the context-sensitive languages are closed under complementation but that DSPACE(n) ⊊ NSPACE(n). While these problems are interesting in their own right, they can be viewed as special cases of open problems in machine-based complexity theory. For instance, we know of no function S(n) ≥ log n for which the question
DSPACE(S(n)) = NSPACE(S(n))?
has been resolved.

PROBLEMS

1 Show that, for any (deterministic) lba A with endmarkers, there is an equivalent lba C without endmarkers.
2 Show that every context-free language is accepted by some deterministic lba; symbolically, that the context-free languages are contained in DSPACE(n). A stronger result will be proved later, namely, containment in DSPACE((log n)²).
3* (Open question) Are the context-free languages contained in NSPACE(log n)?
4 Show that, for each lba A, there is an lba B which is equivalent and which always halts.
5 Work out the closure properties of the r.e. sets, the context-sensitive languages, and the deterministic context-sensitive languages. Be sure to include
negative results such as the fact that the context-sensitive languages are not closed under homomorphism.

Let us fix an alphabet Σ with Σ = {a₀, ..., a_{m−1}}, and use the notation xy = z to stand for the concatenation predicate
{(x, y, z) | x, y, z ∈ Σ* and xy = z}.
We construct a formal logical system ℛ, starting from the concatenation predicate and allowing the following operations:
a) If P₁ and P₂ are in ℛ, then P₁ ∨ P₂ is in ℛ (disjunction).
b) If P is in ℛ, then ¬P is in ℛ (negation).
c) If P(x₁, ..., x_n) is in ℛ, then Q(y₁, ..., y_m) ⇔ P(z₁, ..., z_n) is in ℛ, where each z_i is either one of the y_j or a string from Σ*, or even a concatenation of such terms. This operation is called explicit transformation and allows us to form a new predicate from an old one by interchanging variables, identifying them, replacing a variable by a constant, etc.
d) If P(x₁, ..., x_n, y) is an (n + 1)-ary predicate in ℛ, then
(∃z)_{z≤y} P(x₁, ..., x_n, z) (*)
is in ℛ. This is called bounded existential quantification and means that (*) is true if and only if there is a string z ∈ Σ* such that z ≤ y for which P(x₁, ..., x_n, z) holds. We interpret z ≤ y as follows: since z, y ∈ Σ*, regard them as numbers in m-adic notation; ≤ is then the usual "less than or equal to" relationship between natural numbers.
e) Nothing else is in ℛ.
Thus ℛ is a formal logical system, and we can define relations or sets on Σ*. These sets or relations are called rudimentary.

6 Show that each rudimentary set can be accepted by a deterministic lba.
7 Show that every regular set is rudimentary.
8 Show that any context-free language is rudimentary.
9 Show that the rudimentary sets (i.e., those definable by 1-place rudimentary predicates) are coextensive with the least family which contains the context-free languages and is closed under boolean operations and length-preserving homomorphisms.
10 Give a machine characterization of the rudimentary sets.
11 Show that there is no algorithm which can determine, of a given context-sensitive grammar G, whether or not L(G) = ∅. (Thus the problem is unsolvable for phrase-structure grammars also.)
12 Show that there is no algorithm which can determine, of a context-sensitive grammar G, whether or not L(G) is infinite.
13 Show that there is an algorithm which takes a given context-sensitive grammar G = (V, Σ, P, S) and two strings α and β in V* and which decides whether or not α ⇒* β.
14 Show that the problem mentioned in the previous exercise becomes unsolvable if G is a phrase-structure grammar.
15 Show that there is no algorithm which can decide whether a given production is ever used in a derivation of a string in a context-sensitive language.
16 We can effectively enumerate the lba's over an alphabet Σ as A₁, A₂, ... Let φ be a fixed homomorphism which maps strings in Σ* into a fixed alphabet, say {0, 1, #}*, such that lg(φ(a)) ≥ |Γ_i| for each a ∈ Σ, where Γ_i is the work alphabet of A_i. Let
L₀ = {#^i#φ(w)# | w ∈ Σ*, # ∉ Σ, and w ∈ T(A_i)}.
Prove the following three statements.
a) L₀ is a context-sensitive language.
b) L₀ ∈ DSPACE(n) if and only if NSPACE(n) = DSPACE(n).
c) The complement of L₀ is in NSPACE(n) if and only if the family of context-sensitive languages is closed under complementation.
The previous problem asserts that L₀ could be thought of as a "hardest context-sensitive language" with respect to space. It is also a hardest language with respect to time.
17 Show that every context-sensitive language is in 𝒫 if and only if L₀ ∈ 𝒫.
18 Let LINEAR denote the set of linear context-free languages. Show that LINEAR ⊆ NSPACE(log n).
19 Define E₁ = L(G), where G is given by
S → a₁Sā₁ | a₂Sā₂ | Λ.
Define
L(E₁) = {x₁y₁z₁cx₂y₂z₂c···cx_ky_kz_kc | k ≥ 1, Σ₂ = {a₁, a₂, ā₁, ā₂}, y₁···y_k ∈ E₁, x_i ∈ (Σ₂ ∪ {c})*{c} ∪ {Λ}, z_i ∈ {c}(Σ₂ ∪ {c})* ∪ {Λ}}.
Show that L(E₁) is a linear context-free language.
20* Show that, for any ε, 0 < ε < 1, the following statements are equivalent:
a) LINEAR ⊆ DSPACE((log n)^{1+ε})
b) L(E₁) ∈ DSPACE((log n)^{1+ε})
c) NSPACE(S(n)) ⊆ DSPACE([S(n)]^{1+ε}) for all S(n) ≥ log n.
21 Can you find a recursive set of phrase-structure grammars 𝒢 = {G_i} such that {L(G) | G ∈ 𝒢} is exactly the family of recursive sets?

9.6 HISTORICAL SURVEY

The basic material on phrase-structure and context-sensitive grammars is due to Chomsky [1959]. Linear bounded automata were studied by Myhill [1960]. Connections with context-sensitive languages were established by Landweber [1963] and Kuroda [1964]. Some closure results about context-sensitive languages were
derived by Ginsburg and Greibach [1966b]. The importance of time- and space-complexity classes of Turing-machine computations was first noted by Hartmanis and Stearns [1965]. Theorem 9.4.3 is from Savitch [1970]. Cook [1971] was the first to realize the importance of 𝒫 and 𝒩𝒫. Karp [1972] systematically reduced many combinatorial problems to the 𝒫 = 𝒩𝒫 question. Speed-up theorems such as Problem 4 at the end of Section 9.4 were first proved by Blum [1967]. Problem 5 is from Borodin [1972]. Problem 11 is from Rabin [1963], while Problem 12 is from Aanderaa [1974]. Problem 13 is from Fischer and Rosenberg [1968]. Problem 14 is from Hennie [1966], while Problems 15 through 17 are from Book [1972b]. Problem 6 of Section 9.5 is from Myhill [1960]. Problems 7 and 8 are from the work of Jones [1969], while Problems 9 and 10 are due to Yu [1970]. A number of decision questions about context-sensitive languages come from Landweber [1964]. Problems 16 and 17 are from Hartmanis and Hunt [1973]. Problems 18 through 20 are from the work of Sudborough [1975ab, 1976ab].
ten
Representation Theorems for Languages

10.1 INTRODUCTION

In this section, a number of characterization theorems for classes of languages will be proved. All of the results are important and are used extensively.

There have been a number of papers in the literature which place various restrictions on phrase-structure grammars or on their derivations, so that the languages that are generated are exactly the context-free languages. Section 10.2 is devoted to proving one theorem which captures the essential idea of such constructions. Then, we shall obtain a number of the results from the literature as corollaries.

Section 10.3 is devoted to showing that any r.e. set L may be represented as L = φ(L₁ ∩ L₂), where L₁ and L₂ are deterministic context-free languages and φ is a homomorphism. Some applications of this result are given.

In Section 10.4, the semi-Dyck and Dyck sets are introduced. These are sets with several types of left and right brackets, which are balanced according to one-sided or two-sided cancellation rules. It will be shown that each context-free language L is of the form L = φ(D ∩ R), where D is a semi-Dyck set, R is a regular set, and φ is a homomorphism.
Section 10.5 is devoted to proving that there is one context-free language L₀ such that any other context-free language L may be written L = φ⁻¹(L₀), where φ is a homomorphism. The importance of this result will be clear in Chapter 12, where it will be shown that L₀ is a "hardest" context-free language with respect to both time and space.

10.2* BAKER'S THEOREM

In this section, we shall start from the NR grammars of Section 9.2, which are known to generate exactly the family of phrase-structure languages. By placing suitable restrictions on the rules, one can get context-free languages. The main theorem in this section has, as corollaries, almost every other result of this type that is known.

Next we introduce an important property of NR grammars.

Definition Let G = (V, Σ, P, S) be an NR grammar. G is terminal bounded if each rule in P is of the form
x₀A₁x₁···x_{n−1}A_nx_n → y₀B₁y₁···y_{m−1}B_my_m
where each x_i, y_i ∈ Σ*, A_i, B_i ∈ N, and either n = 1 or there is some j, 0 ≤ j ≤ m, such that, for every k, 1 ≤ k ≤ n − 1, lg(y_j) > lg(x_k).

This definition says that either a rule is context-free (n = 1) or there is some terminal string y_j which is strictly longer than any of the terminal strings x₁, ..., x_{n−1}; that is, y_j is longer than any string that can appear between two variables in the lefthand side.

Example AaB → BaCbb
This is a terminal bounded rule, since lg(bb) = 2 > lg(a) = 1.

Notation. For any NR grammar G = (V, Σ, P, S), define
N(G) = Σ_{α→β in P} (lg(α) − 1).
N(G) measures the excess of symbols on the lefthand sides of rules over that which would be present in a context-free grammar with the same number of rules.

Fact Let G = (V, Σ, P, S) be an NR grammar. N(G) = 0 if and only if G is context-free.

Example If we consider the rule in the previous example as an entire grammar, then N(G) = 2.
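Both definitions are easy to operationalize. In this sketch (ours, Python), uppercase letters are variables and lowercase letters are terminals, as in the example AaB → BaCbb; this string representation is an assumption of the sketch, not of the text.

    # Sketch (ours): terminal-boundedness of a rule and the measure N(G).
    import re

    def terminal_segments(s):
        """Maximal terminal substrings of s (empty segments included)."""
        return re.split("[A-Z]", s)

    def is_terminal_bounded(rule):
        # Assumes the lefthand side contains at least one variable,
        # as in an NR grammar.
        lhs, rhs = rule
        if sum(ch.isupper() for ch in lhs) == 1:     # context-free rule: n = 1
            return True
        inner = terminal_segments(lhs)[1:-1]         # the x_1, ..., x_{n-1}
        return any(len(y) > max(map(len, inner)) for y in terminal_segments(rhs))

    def N(G):
        return sum(len(lhs) - 1 for lhs, _ in G)

    G = [("AaB", "BaCbb")]
    assert is_terminal_bounded(G[0]) and N(G) == 2   # lg(bb) = 2 > lg(a) = 1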
We need one more definition before we can begin to prove a lemma.

Definition Let G = (V, Σ, P, S) be an arbitrary NR grammar. Define
L_G = max{0, lg(x) | xγ → β is in P for some γ, β ∈ V*, x ∈ Σ*};
R_G = max{0, lg(x) | γx → β is in P for some γ, β ∈ V*, x ∈ Σ*};
B_G = max{0, lg(x) | γ₁A₁xA₂γ₂ → β is in P for some x ∈ Σ*, A₁, A₂ ∈ N, γ₁, γ₂ ∈ V*}.

Thus L_G (respectively, R_G) is the largest length of a terminal string appearing at the left (right) end of the lefthand side of a rule in P. B_G is the maximum length of a string appearing between two variables in the lefthand side of a rule in P.

We are now ready for a nontrivial lemma.

Lemma 10.2.1 Let G = (V, Σ, P, S′) be a terminal bounded grammar with N(G) > 0. Then there exists a terminal bounded grammar G₁, a regular set R, and a gsm S such that
L(G) = S(L(G₁) ∩ R) and N(G₁) < N(G).

Proof There are basically two cases in the proof.
CASE 1. L_G > B_G or R_G > B_G. The idea will be to take a rule of G which has in its left side a terminal string that is too long to appear between variables in any rule, and to shorten the terminal context of this rule.

Assume, without loss of generality, that R_G ≥ L_G and R_G > B_G. (For L_G > R_G, the construction would yield a G₁ symmetric to the present one.) The idea is to consider a string AwB, where w ∈ Σ* and lg(w) = R_G > B_G, A, B ∈ V. No rule of P can span AwB, so that, if some initial part of a derivation produces w, then any later step must apply a rule to the left of w independent of what occurs to the right of w, or to the right of w independent of what occurs to the left of w.

We shall replace a single rule in P whose lefthand side contains a terminal string of length R_G by a rule with shorter terminal context. But this could allow derivations in which the new rule could be applied when the old rule was not applicable. Accordingly, a special symbol $ will be used with the new rule to flag unwanted derivations.

Define G₁ = (V₁, Σ ∪ {$}, P₁, S′), where V₁ = V ∪ {$}. Choose a rule γw → βw in P, where w ∈ Σ* and lg(w) = R_G. Define
P₁ = (P − {γw → βw}) ∪ {γ → βw$}.
G₁ is terminal bounded since G was. Since lg(w) = R_G > 0, lg(γ) < lg(γw), so that N(G₁) < N(G).

Given a string σγwρ, we can use γw → βw in G to produce σβwρ. w is available as context for the further rewriting of β or of ρ. In G₁, if we start with σγwρ and use γ → βw$ in P₁, we get σβw$wρ, and again both β and ρ have the context w
with which to continue. But we have to restrict the context in which γ → βw$ is applied, and to erase the extra w$'s. Let R = (V ∪ {$w})*, and define the gsm S as follows: let w = w₁···w_n, w_i ∈ Σ; S copies every symbol it reads, except that, upon reading $, it reads the following letters w₁, ..., w_n and emits nothing for them. Thus S erases each marked block and leaves everything else unchanged.

To complete the proof, we must prove the following claim.

Claim 1 L(G) = S(L(G₁) ∩ R).

Proof We shall only sketch the argument here, but will give the essential parts, so that the details can be easily supplied. The argument proceeds by first showing that L(G) ⊆ S(L(G₁) ∩ R). To prove this, it suffices to prove Claim 1(a).

Claim 1(a) For any y ∈ Σ*, if S′ ⇒ⁿ_G y, where n ≥ 0, k ≥ 0, and
y = u₁wu₂w···u_kwu_{k+1}, u_i ∈ V* − V*wV*,
then there exist i₁, ..., i_k ≥ 0 such that
S′ ⇒*_{G₁} z = u₁(w$)^{i₁}wu₂(w$)^{i₂}···u_k(w$)^{i_k}wu_{k+1}.

Proof The argument is an easy induction on n and is omitted. Note that Claim 1(a) is sufficient to prove that L(G) ⊆ S(L(G₁) ∩ R), since z ∈ R and y = S(z).

The reverse direction is more interesting. There are derivations of strings in L(G₁) ∩ R which cannot be simulated exactly in G [γ → βw$ may be applicable to a string while γw → βw is not]. One can show that, for each y ∈ L(G₁) ∩ R, there is some derivation of y in which γ → βw$ is applied only when context w has already been generated to the right of γ. This derivation is "almost" a G-derivation, and it is easy to construct a G-derivation of S(y) which imitates it. The argument proceeds in a sequence of steps.

Claim 1(b) If y ∈ L(G₁) ∩ R, then G₁ has a derivation
S′ = α₀ ⇒ α₁ ⇒ ··· ⇒ α_n = y
and each α_i ∈ R.

Proof The argument involves labeling all occurrences of $ in y and considering where each subscripted $ was introduced. A straightforward case analysis proves the result, and details are left to the reader.
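On inputs from R, the effect of S is simply to delete the marked blocks; a one-line sketch (ours, Python, with w passed as a parameter) may make the bookkeeping above concrete.

    # Sketch (ours): the gsm S of Case 1 erases every block "$w" and copies
    # everything else, so on z = u1 (w$)^i1 w u2 ... it returns u1 w u2 ... .
    def S(z, w):
        return z.replace("$" + w, "")

    assert S("ab" + "ba$" * 2 + "ba" + "cd", "ba") == "abbacd"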
10.2* BAKER'S THEOREM 303 Claim 1 (c) If, for some n > 0, and each a: G R, then S' => an =* ai => • • • => a G, ° G, l G, G, " ST = S(a0) f S(ai) t •'' t S(an) Proof The argument is a straightforward induction on n. Claims 1(a) and 1(b) prove that S(L(Gi)nR)cL(G) and that completes the proof of Claim 1. CASE 2 LG <BG and/?G <BG. The idea is similar to Case 1 except that we must consider a string w of length BG + 1. Choose a rule XoAiX! ■ ■ -Xn-iAnXn -+ XoJlXi ■ ■ -Xn-iJnX,, where each xt G S*, At G N, j; G F*, and there exists/, 1 </ < n — 1, so that \g{xs) = BG. We claim n > 2. Otherwise, n = 1 and all rules have one variable on the left, so that BG = LG = RG = 0, and thus N(G) — 0. Since G is terminal bounded, the righthand side of this rule has a terminal substring w of length >BG. Since BG >LG and BG > RG, lg(vv) > lg(jc,) for any /, 0 < / < n. Thus some part of the -y's must be included in w. Let us say that yk must be included in w. (By symmetry, assume k < n.) We rewrite the rule as °AkxkAk+lp ■+ 0iw02 where lg(0,) < lg(x07i ■ ■ ■ 7k-i^fe-i7fe)- Define G, = (K,, 2 U {«, $},P, ,5") where K, = Kujt, $},andletP' = {a>lfe^/31w$,xfeylfe + 1p^xfe<|rw/32}. Then Pi =(P-{oAkxkAk+lP^PlWp2})UP'. Gx is surely terminal bounded. Clearly,N(Gi )<N(G). Let R = (V U {$xfe<tw})* and define a gsm as indicated in Fig. 10.1. Let xk =x\ ■ ■ ■ x'p, and w = Wi • • • wr, when jcJ, Wj G 2. FIG. 10.1
304 REPRESENTATION THEOREMS FOR LANGUAGES Again the completion of the argument depends on the following result. Claim 2 L(G) = S(L(Gl)nR). Proof The argument will be sketched by giving an outline. r n Claim 2(a) For any y, if S =j» y then there exists z G R such that and S(z) = y Proof The argument is a straightforward induction on n. As in Gaim 1, it is the reverse direction that is more complex. Take y in L(Gi)C\R, and index occurrences of <t and $, so that y = 7o$i^k<t:i7i$2^'fe<te • • • 7p-i%pXk<tp7p Claim 2(b) If>> GL(Gi) HR, then s' t ai t • • • t an = y such that for all/, oj first appears in ar+1 if $y first appears in o^.. Proof The argument is a case study of how symbols may be generated. The reader should provide the details. Claim 2(c) If . * * * "° G, ' G, G, m y where, for each / <m,OLi=> ai+l by a single rule of P or via consecutive applications of the two rules of/1', then S' = %)^''f%)= S(y). Proof Induction on m. The argument above completes the proof of Claim 2. The main proof is now complete. □ We are now going to show that terminal bounded grammars can generate only context-free languages. Intuitively, this seems clear since the long terminal context prevents reapplication of a rule and hence destroys the ability to transmit information. Theorem 10.2.1 (Baker's theorem) If G = (V, 2,P, S') is a terminal bounded grammar, then L(G) is context-free. Proof We induct on N(G). Basis. If N(G) = 0, then G is context-free. Induction step. Assume the result for all grammars G' such that N(G')< k. Consider G = (V, 2, P, S") with N(G) = k. By Lemma 10.2.1, there exists a regular
10.2* BAKER'S THEOREM 305 set R, a gsm S, and a terminal bounded grammar G\ such that N(Gi )<k and L(G) = S(L(Gi) n /{). By the induction hypothesis, L(Gi) is context-free. Therefore L(G) = S(L(Gi)DR) is also context-free and the proof is complete. □ Now we begin a sequence of corollaries of this theorem. Each of the corollaries is nontrivial and interesting in its own right. Corollary 1 If G = (V, 2, P, S) is a phrase-structure grammar with each rule of the form a ->■ 0 with a &N* and 0 G F* 2 F*, then £(G) is context-free. Corollary 2 If G = (V, 2, P, S) is a context-sensitive grammar with erasing, such that each rule is of the form a -*■ j3 with a G 2*N 2*, then £(G) is context-free. Corollary 3 If G = (V, 2, P, S) is a context-sensitive grammar with erasing, in which each rule is of the form xA(l-*■ xy(3, where A EN, x G 2*, 0G V*, and lg(x) > lg(/3), then L(G) is context-free. Corollary 4 If G = (V, 2, P, S) is a context-sensitive grammar with erasing, such that each rule is of the form aAy -*■ o#y, where either i) aG2* and lg(a)>lg(7) or ii)7G2* and lg(7)>lg(a), then L(G) is context-free. Corollary 5 Let G = (V, 2,/\ S) be an NR grammar and let < be a partial ordering on V. Suppose that each rule is of the form Ax ■■■An+Bl ■Bm where each AtGN and there is some k, 1 <A:</w, such that for all /, 1 </<«, A;<Bk. Then L(G) is context-free. Proof The argument is involved and is left for Problem 2 at the end of the section, where some hints are given. Definition Let G = (V, 2, P, S) be an NR grammar. If x G 2*, a ->• 0 is in P, and 7 G V*, write xorf^x&y and -yax ?f-y/fr. Define L „ R Left (G) = {w G 2* | S f w} and Two-way (G) = {wG 2*15 L=JR w}, where L^R means that each rewriting is either a leftmost one (by =*■) or a rightmost one (by ^). Corollary 6 If G = (V, 2, P, S) is an NR grammar, then Left (G) and Two-way (G) are context-free.
306 REPRESENTATION THEOREMS FOR LANGUAGES Proof We shall first construct a new terminal bounded grammar G' such that L(G')= Two-way (G1) and $L(G')= Two-way (G), where <p is a homomorphism. Let <7 = max {lg(a)| a ->■ |3 is in P}. Let d, £, /?, and Si be new symbols. Define G' = (V, 2', P', 51!), where 2' = 2 U {d}, and K' = K U {d, £, R, Si}. P' is defined as follows. i) L -» A R -» A Si -» dqLSRd9 are all in P. ii) If uoz;-»-w/k is in P and w, t>, w, *,>> G (2')*, a, 0G {A}UJV(2*A0* with lg(>0 = q, then yuLuv -*■ ywLfbc uaRvy^ w$Rxy yuLaRv ->■ ywLQRx are all in/1'. The idea is to add markers L and R to designate the leftmost and rightmost variables at any point in a derivation. To also make G' terminal bounded, extra terminal context is added on the left or right side of each rule; this terminal context does not interfere with the application of rules, since each rule can be applied only where there are no nonterminals on at least one side. It is now easy to see that L(G') = Two-way (C). Let ip be the homomorphism given by yd = A ipa = a iia^d. A straightforward argument shows that <p(L(G')) = Two-way (G). Since G' is terminal bounded, Two-way (G) is context-free. □ PROBLEMS 1 Show that if G is an NR grammar, then Left (G) is context-free. 2 Prove Corollary 5. Hint. Assume that < is a total order and that S is the least element in this ordering. Also assume that, if a-*- 0 is a rule, then either 0G N* or 0 G 2. Let d, E, and ^ be new symbols. Define 2j = 2 U {d}, Vx = V U {d, E, Si}, and let p = max {lg (0) I a ->■ 0 is in P}.
10.3* DETERMINISTIC LANGUAGES 307 Define </> to be a homomorphism given by ipA = Ad"' if A G V and A is greater than exactly i elements in V. Define Pi to consist of the following rules: a) Si -» SdE E^d b) If a ->■ 0 is in P and Y G N, then add to Pi <p(a)Y -» ^(/3)r Show that Gi = (Vi, Si, Pi, Si) is an NR grammar and is terminal bounded. Define a homomorphism \p: Vi -> V* by iiA Show that A if.4 = d, A otherwise. \p(.L(Gi)) = L(G) and hence L(G) is context-free. 10.3 DETERMINISTIC LANGUAGES GENERATE RECURSIVELY ENUMERABLE SETS Any recursively enumerable (or phrase-structure) language L may be generated as L = y(Li C\L2), where Li and £2 are deterministic context-free languages and </> is a homomorphism. This result turns out to be extremely useful when investigating new families of languages. For example, if one is investigating a new family of languages that is sub- recursive, and one knows that the family is closed under homomorphism and contains the deterministic context-free languages, then the new family cannot be closed under intersection. The construction is not very hard at all. Let G = (y,-L,P,S) be a phrase-structure grammar. Assume that G has k rules for some k. Define L\(G)= bioidfocylPfyfcl 7,, y2G V*, a,- -»• ft is inP, 1 </ <k}\ where dit.. . ,dk and c are new symbols, not in V. Lemma 10.3.1 L\{G) is a deterministic context-free language for each phrase-structure grammar G. Proof For at ->■ ft in P, 1 </ < k, let at = aa • • • a,m(0 and ft = fti • • • ft„(0, each au and ft,- in V. Note. By the definition of a phrase-structure grammar, m(i) > 1 and «(/) > 0.
308 REPRESENTATION THEOREMS FOR LANGUAGES Let A = (Q, 2', T, 6, Z0,{/})be the deterministic pda with 2' = VU {c, di, ...,dk}.Then Q = {<7o, <7i. «2, d,f}U{Sij\ 1 </ <*, 0 </< w(0} U{rtf| 1 </ <*, 0 </ <«(/)}; and r = 2' U {Z0}, Z0 being a new symbol, and 6 defined as follows: 0 8(<7o, a, Z0) = (<7o. Z0a) for eacha in V; 6(<70, a, &) = (<70, &a) for eacha and b'mVU {d,| 1 </ <A}; 8(q0, c, b) = (q!, b) for each b in V U K11 </ < A:}. ii) 6(<7i ,a,a) = (qi, A) for each a in V; 6 (<71, A, d,-) = (s,-0, A) for each 1 < /' < k. ui) 6 (sij, A, a,(m (0.j)) = (x,-0+ d, A) for each 1 < / < A:, 0 </ < m(i) - 1; 8fe(m(o-i),A- aa) = 0;o, A) for each 1 </ < k. iv) 6(r;/, ft(„(0-y),a) = (r,0+1))a) for each a in VU{Z0}, Ki<k, 0</<n(0-l; ^(?i(n(0-1)' fti» a) = (<l2, a) for each 1</<A:, a in KU{Z0}, If n(i) = 0, this becomes 6(ti0, A,a) = (q2, a). v) 6 (q2, a, a) = (q2, A) for each a in V; 8(q2,c,Z0) = (f,Z0). vi) S(f,A,Z0) = (q0,Z0). vii) S(<7, a, &) = (d, b) for all (<7, a, b) not previously specified. The dpda j4 operates as follows: 1 In (i), A copies the input onto the stack until the symbol c is read. 2 In (ii), it checks the input against the stack until some dt is read on the stack. The d; is then erased. 3 Without moving the input, A checks in (iii) to see whether the word a,- is on the stack (erasing a,- in the process). 4 In (iv),A sees whether ft is on the input while not altering the stack. 5 In (v), A matches the input against the stack until c is read. 6 In (v), A goes to an accepting state if c is read on the input and Z0 on the stack. 7 In (vi), A then goes to the start state, the cycle (beginning at step (1)) being repeated. It is a straightforward matter to verify that T(A) = L\ (G). □ The following notation is useful. Notation. For a in 2, let a be a new symbol. Let A = A and w = Si ■ ■ ■ ak for each w = a\ ■ ■ ■ ak, all a,- in 2. For each U9 2*, let U = {«| u in £/}. For each phrase-structure grammar G, let Li(G) = £i (G)(2)*.
10.3* DETERMINISTIC LANGUAGES 309 Corollary L\(G) is a deterministic context-free language. Proof Since strings over (2)* involve a new alphabet, it is clear that L\(G) (2)* is a deterministic context-free language. Now, a second encoding must be prepared. Notation. Let V be a finite set containing S, k ^ 1, and let c, di,. .., dh be k + 1 new symbols. Let £(A:, P0={.Stf,-c|l <i<k}{ula2ca2rdJa[c\ \<j<k,ax anda2 in K*}* {/3c0Tl0in K*}. Lemma 10.3.2 £(&, F) is a deterministic context-free language. Proof Let ,4 be the deterministic pda (Q, 2', T, 6, <70, Z0, {/j ,/2}), where 2 = teo.?i.?2,rf,/i./2}U{s/|0</<4}, 2' = KU2U{c,rfi,...,rffc}, r = fufu{z0}, Z0 being a new symbol, and 6 is defined as follows (a and b are arbitrary elements in vy. i) 8(q0, S, Z0) = (qi,Z0); 8(qi,d.,Z0) = (q2,Z0) forl<i<k; 8(q2,c,Z0) = (s0,Z0). ii) S(s0) a, Z0) = (s0, Z0a); S(s0> a, b)=(s0,ba); 8(s0,c,a) = (si,a). iii) 5(s1(a,a) = (s2,A); 8(s1,a,a) = (s3,A). iv) S(s2)a,a) = (s2) A); 6 (s2, dy, a) = (s4, a), 8 (s2, djt Z„ ) = (s4> Z0). v) 603,5)fl) = 03, A); 8(s3,A,Z0) = (f1,Z0). vi) 5(s4, a, a) = (s4, A); S(s4) c,Z0) = (x0,Z0). vii) 8(s0,c,Z0) = (f2,Z0); 8(f2,d„ Z0) = 04, Z0) for Ki<k. viii) Sfa, a', Z) = (d, Z) for all (q, a', Z) in Q X 2' X Y not previously defined. Note that the rules in (vii) are needed in case aj a2 = A or /3 = A. Intuitively,^ operates as follows: 1 In (i), A checks to see whether the first three input symbols are Sdtc for some /.
310 REPRESENTATION THEOREMS FOR LANGUAGES 2 In (ii), A copies the input symbols onto the stack until the symbol c is read. 3 In (iii) and (iv), it matches the input against the stack until some d}- is read on the input. 4 In (iv), the dj is read on the input without altering the stack. 5 In (vi), the input is again matched against the stack until the symbol c is read. 6 A then goes to s0 and cycles. 7 If, at step (3), a symbol 2 is read, then A goes to s3, where it compares, in (v), S on the input with a on the stack. 8 A accepts if, after reading the input, the stack is empty except for Z0. It is a straightforward matter to formally verify that T(A) = L(k, V). The details are omitted. □ We now obtain the desired representation theorem for recursively enumerable sets. Theorem 10.3.1 For each recursively enumerable set E£ £*, there exist deterministic context-free languages L\ and L2»and a homomorphism </> such that E = tp(Lx C\L2). Proof Let E = L(G), where G = (V, 2, P, S) is a phrase-structure grammar. Let L\ = L\(G) and L2 = L(k, V), where k is the number of productions in P. Let tp be the homomorphism defined by ${fi.) = a for each a in £ and <p(a) = A for each a in V U {c, d\,..., dk}. Let x be a word in E. Then there exist y^, ~fn'ae(i)^e(i^or 0</<w, such that 701 = 702 = A, Ofto) = S, 7ml0*(m)7m2 = *, and T7i "tfOVto implies that 7/1 Pe<JyYj2, with as(/) ->■ j3s(y) inP, for 0<j<m,and for 1 </ < 7n — 1. For each 71, /, and y2, let s(7iW', 72) = Jtj'i cyxdfoc and K7i.'. 72) = 7iUidnicylpT-fic. Then z = K7oi,^(0),7o2)K7ii»^(l),7i2)---K7mi.^('"),7m2)^ is in Z/i(G). However, z = Sdgi0)cs(yn<xgm,g(l),712) • • • s(ymlag(m),g(m), ym2)xTcx.
10.3* DETERMINISTIC LANGUAGES 311 Therefore, z is in L(k, V); that is, z is in L\ n L2• Then x = </>(z) is in </>(/,! n£2); that is, E£<p(L 1 n/,2). Suppose that X is in </>(£i n£2). Thus x = tp(z) for some z in L\ n£2. Then z = t(y0i,g(0),y02)---t(yml,g(m),ym2)X for some 701,7o2, • ■ ■ ,7mi, 7m2,,?(0),... ,g(m). Similarly, z = &?W)«(|„,/(1), |12) • • ■ sttnl,/(«), |„2);cTc* for some |n, |12,..., |nl, |„2, /(0),...,/(«). Then w = n (since there are exactly 2m + 2 and 2n + 2 occurrences of c in z). By observing the occurrences of dj in z, we see that/0') = g(i) for 0 < /' <m. We also see that _, 7oi = 7o2 = A, as(o) = ■?, 7n<W) = In, 7i2 = 1,2, and T(i-i)iPg(i-i)7(/-»2 = lnl/2 = 7ti<*gii)7i2 for 0 <i <m, and ymiPgim)ym2 = *.Thus S =»• 7oife(o)7o2 =»• 7n^(i)7i2 =>•••=> 7miftr(m)7m2 = *; that is,* is in E. Then ip(£i O L2) E £ and the proof is complete. □ The application that was discussed earlier can now be formalized. Theorem 10.3.2 Let Jzf be any family of recursive sets that contains the family of deterministic context-free languages. a) If Jzf is closed under intersection, then Jzf is not closed under homomorphism. b) If -Sfis closed under homomorphism, then jgf is not closed under intersection. Other variations are possible as well. Theorem 10.3.3 Let .S^be any family of recursive sets such that there is a nonempty language inJzf. IfJzfis closed under intersection and homomorphism, then Jzfdoes not contain the family of deterministic context-free languages. PROBLEMS 1 Extend Theorem 10.3.1 to the following: Given 2, let S0 = 2 U {«,*}, a and b being two new symbols. There exists a homomorphism <p of Sj onto 2* with the following property: For each recursively enumerable set E£ X*, there exist deterministic context-free languages L\ *=. Sj and L2 E So such that E = v(Li n L2). 2 Give a Turing-machine proof of Theorem 10.3.1. 3 Let ,4 =(Q, S, A, S, <70, Z0, F) be a pdaand let L £ S*. Define HL = {aGT*| (q0,w,Z0) h1 (<7,A,a) forsomewGL}. Thus HL is the set of all pushdown contents that can occur when L is fed into A. If L is context-free, is HL regular or context-free? In general, describe what kind of sets can occur as HL.
4 Show that an arbitrary deterministic Turing machine may be simulated by a deterministic device with two pushdown stores.
5 Show, with the help of the previous problem, that any deterministic Turing machine may be simulated by a deterministic device with four iterated counters (i.e., counters that can count to zero more than once).
6 Show that a deterministic Turing machine may be simulated by a deterministic device with two iterated counters.
7 Show that, for each Turing machine A, there exists a finite alphabet T and two deterministic iterated counter languages L₁ and L₂ over T such that A halts on some input if and only if L₁ ∩ L₂ ≠ ∅.
8 Show that it is undecidable whether or not L = Σ*, where L is an iterated one-counter language. Thus the equality problem is unsolvable for this family of languages.
9 Show that it is undecidable whether or not L₁ ⊆ L₂ for deterministic iterated counter languages L₁ and L₂.

10.4 THE SEMI-DYCK SET AND CONTEXT-FREE LANGUAGES

An algebraic characterization of the context-free languages will be obtained in this section. It hinges on the semi-Dyck set, which is a particular deterministic context-free language.

To motivate the semi-Dyck set, let us imagine a terminal alphabet consisting of left and right parentheses, ( and ). Define D₁ to be the set of all well-balanced strings of parentheses; for example,
() ∈ D₁, ()() ∈ D₁, and (()(())) ∈ D₁, but )( ∉ D₁.
D₁ is a (deterministic) context-free language, as we shall see very soon. An alternative way to describe D₁ is to code ( as a₁ and ) as ā₁ [or a₋₁]. Then D₁ is the set of all strings of {a₁, ā₁}* which reduce to Λ under the relation a₁ā₁ = Λ [respectively, a₁a₋₁ = Λ]. Correspondingly, one could allow two-sided cancellation, which would give us the "two-sided Dyck set," or simply the "Dyck set," D'₁.

Example )( ∈ D'₁, and )(())( ∈ D'₁. The last relation is more natural if we write
ā₁a₁a₁ā₁ā₁a₁ ∈ D'₁.
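Both kinds of cancellation are easy to test for r = 1. The sketch below is ours (Python), writing 'a' for a₁ and 'A' for ā₁; note that the counting test in the two-sided case works only because there is a single pair of letters.

    # Sketch (ours): one-sided vs. two-sided cancellation for r = 1.
    def in_D1(word):
        """Balanced under a1 abar1 -> Lambda only."""
        depth = 0
        for ch in word:
            depth += 1 if ch == "a" else -1
            if depth < 0:                 # a right bracket with nothing open
                return False
        return depth == 0

    def in_D1_two_sided(word):
        """Under both a A -> Lambda and A a -> Lambda, only counts matter."""
        return word.count("a") == word.count("A")

    assert in_D1("aAaA") and not in_D1("Aa")
    assert in_D1_two_sided("Aa") and in_D1_two_sided("AaAAaa")

For r ≥ 2, the two-sided case is no longer a counting condition, and a genuine reduction process is needed.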
If we think of ā₁ as the inverse of a₁, it is like group theory. In the exercises, we shall see that the analogy is appropriate.

We shall soon introduce the semi-Dyck set on r letters. Many of the results to be given hold for D′ᵣ as well, with similar proofs, but these will be omitted and left for the problems.

The terminal alphabet of Dᵣ, the semi-Dyck set on r letters, will be Σᵣ ∪ Σ̄ᵣ, where Σᵣ = {a₁, …, a_r} and Σ̄ᵣ = {āᵢ | aᵢ ∈ Σᵣ}. We may define Σ = Σᵣ ∪ Σ̄ᵣ and consider Σ*, which is a free monoid. We can define relations on Σ*, first defining

ρ₀ = {(aᵢāᵢ, Λ) | aᵢ ∈ Σᵣ} ∪ {(Λ, aᵢāᵢ) | aᵢ ∈ Σᵣ}.

Let ≡ be the least two-sided congruence relation that contains ρ₀. Not only is ≡ an equivalence relation on Σ*, but xaᵢāᵢy ≡ xy for all i, 1 ≤ i ≤ r, and all x, y ∈ Σ*. Intuitively, if x, y ∈ Σ*, then x ≡ y if and only if x can be obtained from y by introducing or cancelling adjacent pairs aᵢāᵢ, or vice versa (meaning that y can be obtained from x in this fashion).

Example Let r = 2. Then

a₁a₂ā₂ā₁a₂ ≡ a₁ā₁a₂ ≡ a₂.

It is convenient to have a normal form for such words.

Definition Let Σ = Σᵣ ∪ Σ̄ᵣ for some r ≥ 1. A word w ∈ Σ* is reduced (for Dᵣ) if w contains no consecutive pair aᵢāᵢ for any i, 1 ≤ i ≤ r. A word w is reduced for D′ᵣ if w contains no consecutive pair aᵢāᵢ or āᵢaᵢ for any i, 1 ≤ i ≤ r.

Given any word in Σ*, we may reduce it by cancelling consecutive pairs. We shall now carry out this process carefully.

Definition Let Σ = Σᵣ ∪ Σ̄ᵣ for some r ≥ 1. Define a mapping μ from Σ* into Σ* as follows:

μ(Λ) = Λ;
μ(xaᵢ) = μ(x)aᵢ for each i, 1 ≤ i ≤ r;
μ(xāᵢ) = μ(x)āᵢ if μ(x) ∉ Σ*aᵢ, and μ(xāᵢ) = x′ if μ(x) = x′aᵢ, for each i, 1 ≤ i ≤ r.

Examples
μ(Λ) = Λ, μ(aᵢ) = aᵢ, μ(āᵢ) = āᵢ,
μ(a₁a₂ā₂a₁ā₁a₂) = μ(a₁a₂ā₂a₁ā₁)a₂ = μ(a₁a₂ā₂)a₂ = μ(a₁)a₂ = a₁a₂.
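The defining recursion for μ scans its argument from the left and is exactly a stack discipline, so it can be transcribed directly. A minimal sketch (assuming a hypothetical encoding aᵢ ↦ 'a'+i and āᵢ ↦ 'A'+i, with words given as token lists):

    def mu(word):
        # mu(x a_i)    = mu(x) a_i
        # mu(x abar_i) = mu(x) abar_i, unless mu(x) ends in a_i, which cancels
        out = []
        for t in word:
            if t[0] == 'A' and out and out[-1] == 'a' + t[1:]:
                out.pop()            # cancel an adjacent pair a_i abar_i
            else:
                out.append(t)
        return out

    # The worked example above:
    assert mu(['a1', 'a2', 'A2', 'a1', 'A1', 'a2']) == ['a1', 'a2']

The output is always reduced, and, as will emerge shortly, a word w lies in Dᵣ exactly when mu returns the empty list.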
We summarize the pertinent properties of μ in the following proposition.

Proposition 10.4.1 Let Σ = Σᵣ ∪ Σ̄ᵣ for some r ≥ 1 and let u, v, w, x, y ∈ Σ*.
a) μ(xaᵢāᵢ) = μ(x) for any i, 1 ≤ i ≤ r.
b) μ(w) is reduced.
c) μ(w) ≡ w.
d) If u is reduced, μ(u) = u.
e) μ(μ(w)) = μ(w).
f) μ(xy) = μ(μ(x)y).
g) μ(xy) = μ(μ(x)μ(y)).
h) If μ(x) = μ(y), then μ(uxv) = μ(uyv).
i) μ(xaᵢāᵢy) = μ(xy) for any i, 1 ≤ i ≤ r.
j) μ(x) = μ(y) if and only if x ≡ y.
k) Let w ∈ Σ*. If y ∈ Σ*, y is reduced, and y ≡ w, then y = μ(w).

Proof The proof of (a) is a straightforward induction on the length of x. The proofs of (b) and (c) are easy inductions on the length of w. A trivial induction on the length of u proves (d). To show (e), let w ∈ Σ*. By (b), μ(w) is reduced. By (d), μ(μ(w)) = μ(w).

Now (f) will be shown by induction on y.

Basis. y = Λ. Then μ(x) = μ(μ(x)) and the basis follows from (e).

Induction step. Assume the result for all words y and consider ya with a ∈ Σ. There are two cases, depending on whether a ∈ Σᵣ or a ∈ Σ̄ᵣ.

CASE 1. a = aᵢ ∈ Σᵣ.
μ(xya) = μ(xyaᵢ) = μ(xy)aᵢ  by definition of μ,
= μ(μ(x)y)aᵢ  by the induction hypothesis,
= μ(μ(x)yaᵢ)  by the definition of μ.
The induction is now extended in Case 1.

CASE 2. a = āᵢ ∈ Σ̄ᵣ. There are two subcases here, depending on whether or not μ(xy) ends in aᵢ; that is, whether μ(xy) = zb for some b ≠ aᵢ or μ(xy) = zaᵢ.
CASE 2(a). μ(xy) = zb, where b ≠ aᵢ. Then
μ(xyāᵢ) = μ(xy)āᵢ  by definition of μ,
= μ(μ(x)y)āᵢ  by the induction hypothesis.
Since μ(μ(x)y) = μ(xy) = zb with b ≠ aᵢ, we may invoke the definition of μ again, and
μ(μ(x)y)āᵢ = μ(μ(x)yāᵢ),
and the induction is extended in Case 2(a).

CASE 2(b). μ(xy) = zaᵢ.
μ(xyāᵢ) = μ(μ(xy)āᵢ)  by the induction hypothesis,
= μ(zaᵢāᵢ) = μ(z) = z.
Note that zaᵢ and z are reduced, because zaᵢ is in the range of μ. The only way that z could fail to be reduced would be to have some cancellation at the right end, but that is impossible, by the one-sided nature of our system. On the other hand, μ(μ(x)yāᵢ) = z because μ(μ(x)y) = μ(xy) = zaᵢ; the induction is extended and the entire proof of (f) is complete.

The proofs of (g) and (h) are left to the reader; (i) follows from (g) and (a), because
μ(xaᵢāᵢy) = μ(μ(xaᵢāᵢ)μ(y)) = μ(μ(x)μ(y)) = μ(xy).
The proof of (j) follows in a straightforward manner from (i). To show (k), note that y ≡ w implies μ(y) = μ(w); but, since y is reduced, μ(w) = μ(y) = y. □

The μ function is quite helpful in dealing with the semi-Dyck set.

Definition Let r ≥ 1 and define Gᵣ = (Vᵣ, Σ, P, S), where Σ = Σᵣ ∪ Σ̄ᵣ, Vᵣ = Σ ∪ {S}, and P consists of the following productions:

S → SaᵢSāᵢS | Λ  for each i, 1 ≤ i ≤ r.

The semi-Dyck set is formally defined by Dᵣ = L(Gᵣ).
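It may help to see the grammar Gᵣ in action. The sketch below (again an illustration, not part of the text) expands S → SaᵢSāᵢS randomly, so every string it emits lies in Dᵣ; applying the function mu from the previous sketch to any output yields the empty list.

    import random

    def sample_Dr(r=2, p=0.4, depth=0):
        # With probability p apply S -> S a_i S abar_i S for a random i;
        # otherwise apply S -> Lambda.  The depth bound keeps samples finite.
        if depth > 8 or random.random() > p:
            return []
        i = str(random.randint(1, r))
        return (sample_Dr(r, p, depth + 1) + ['a' + i] +
                sample_Dr(r, p, depth + 1) + ['A' + i] +
                sample_Dr(r, p, depth + 1))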
We may think of the aᵢ, 1 ≤ i ≤ r, as r different kinds of left brackets, and the āᵢ, 1 ≤ i ≤ r, as their respective right brackets. It is clear that

Dᵣ = {w ∈ Σ* | μ(w) = Λ}.

Our first result is that Dᵣ is a deterministic context-free language.

Theorem 10.4.1 Dᵣ is a deterministic context-free language.

Proof Let A = ({q₀, q₁}, Σ, Σ ∪ {Z₀}, δ, q₀, Z₀, {q₀}), where Σ = Σᵣ ∪ Σ̄ᵣ. Define:
δ(q₀, aᵢ, Z₀) = (q₁, Z₀aᵢ)  for each i, 1 ≤ i ≤ r  (stack the first letter);
δ(q₁, aᵢ, aⱼ) = (q₁, aⱼaᵢ)  for all i, j, 1 ≤ i, j ≤ r;
δ(q₁, āᵢ, aᵢ) = (q₁, Λ)  for each i, 1 ≤ i ≤ r  (cancel aᵢāᵢ);
δ(q₁, Λ, Z₀) = (q₀, Z₀)  (accept and prepare to repeat).
A is deterministic, and it is straightforward to verify that T(A) = Dᵣ. □

Corollary Dᵣ is an unambiguous context-free language.

Now we list some basic properties of Dᵣ.

Proposition 10.4.2 Let Dᵣ be the semi-Dyck set.
a) If x, y ∈ Dᵣ, then xy ∈ Dᵣ.
b) If x ∈ Dᵣ, then aᵢxāᵢ ∈ Dᵣ for all i, 1 ≤ i ≤ r.
c) For each word x ∈ Dᵣ, either x = Λ or x = aᵢyāᵢz for some i, 1 ≤ i ≤ r, and some y, z ∈ Dᵣ.
d) If aᵢāᵢz ∈ Dᵣ, then z ∈ Dᵣ.
e) If yz ∈ Dᵣ and y ∈ Dᵣ, then z ∈ Dᵣ.
f) x ∈ init(Dᵣ) = {w ∈ Σ* | wy ∈ Dᵣ for some y ∈ Σ*} if and only if μ(x) ∈ Σᵣ* (that is, μ(x) has no letters in Σ̄ᵣ).

Proof Since w ∈ Dᵣ if and only if μ(w) = Λ, Proposition 10.4.1 can be used to prove (a) through (e) elegantly. As an example, consider (e). y ∈ Dᵣ implies μ(y) = Λ; yz ∈ Dᵣ implies μ(yz) = Λ. But
Λ = μ(yz) = μ(μ(y)z) = μ(z),
so z ∈ Dᵣ.

To see (f), suppose x = x₁ ⋯ x_s, μ(x) = x_{i₁} ⋯ x_{i_k}, s ≥ k ≥ 0, and each x_{i_j} ∈ Σᵣ. Define y = x̄_{i_k} ⋯ x̄_{i₁}, where x̄_{i_j} is the barred copy of x_{i_j}. Clearly,
μ(xy) = μ(μ(x)y) = μ(x_{i₁} ⋯ x_{i_k} x̄_{i_k} ⋯ x̄_{i₁}) = Λ,
so xy ∈ Dᵣ, which means that x ∈ init(Dᵣ). This proves the "if" direction.
Conversely, suppose x ∈ init(Dᵣ). There exists y ∈ Σ* so that xy ∈ Dᵣ. We show that μ(x) ∈ Σᵣ* by induction on the length of x. The basis is trivial.

Induction step. Assume the result for a string x. If y = Λ, we are done. Let a be the first letter of y.

CASE 1. a ∈ Σᵣ. By the induction hypothesis, μ(x) ∈ Σᵣ*; and by the definition of μ, μ(xa) = μ(x)a ∈ Σᵣ*, which extends the induction.

CASE 2. a ∈ Σ̄ᵣ. Let a = āᵢ. By the induction hypothesis, μ(x) ∈ Σᵣ*. Since μ(xy) = Λ, it follows that μ(x) = x′aᵢ, with aᵢ ∈ Σᵣ and x′ ∈ Σᵣ*. But, clearly,
μ(xa) = μ(xāᵢ) = μ(x′aᵢāᵢ) = μ(μ(x′)μ(aᵢāᵢ)) = x′ ∈ Σᵣ*.
The induction has been extended and the proof is complete. □

We are now in a position to state and prove the main theorem of this section, which is a characterization of the context-free languages in terms of Dᵣ. The result says that every context-free language L may be written as L = φ(Dᵣ ∩ R), where Dᵣ is the semi-Dyck set on r letters, R is a regular set, and φ is a homomorphism. Of course, for any homomorphism φ and any regular set R, it follows that φ(Dᵣ ∩ R) is a context-free language, by our closure properties.

Theorem 10.4.2 (Chomsky–Schützenberger Theorem) For each context-free language L, there is an integer r, a regular set R, and a homomorphism φ such that

L = φ(Dᵣ ∩ R).

Proof Let L = L(A), where A = (Q, Σ, Γ, δ, q₀, Z₀, F) is a pda satisfying the conditions of Problem 3 of Section 5.5. That is,
a) L = L(A).
b) If Λ ∈ L, then δ(q₀, Λ, Z₀) = {(q₀, Λ)} and q₀ ∈ F; otherwise, δ(q₀, Λ, Z₀) = ∅.
c) If q ∈ Q − {q₀} or Z ∈ Γ − {Z₀}, then δ(q, Λ, Z) = ∅.
d) If (q′, α) ∈ δ(q, a, Z), then lg(α) ≤ 2.

Assume that Q, Σ, and Γ are pairwise disjoint and that |Σ ∪ Γ| = r. The alphabet for the semi-Dyck set will be Σᵣ ∪ Σ̄ᵣ; but in order to retain notational consistency, we will write

Σ₀ = (Σ ∪ Γ) ∪ (Σ̄ ∪ Γ̄).
Define φ as the homomorphism from Σ₀* into Σ* determined by

φ(a) = a, φ(ā) = Λ  if a ∈ Σ,
φ(Z) = φ(Z̄) = Λ  if Z ∈ Γ.

Let Dᵣ be the semi-Dyck set over Σ₀. Define a grammar

G = (Q ∪ Σ₀ ∪ {S}, Σ₀, P, S),

where S is a new symbol and P is given as follows:
i) S → Z₀q₀ is in P.
ii) If δ(q₀, Λ, Z₀) = {(q₀, Λ)}, then S → q₀ is in P.
iii) If (q′, α) ∈ δ(q, a, Z) for q, q′ ∈ Q, a ∈ Σ, Z ∈ Γ, α ∈ Γ*, then
q → aāZ̄αq′
is in P.
iv) If q ∈ F, then q → Λ is in P.

Since G is right linear, R = L(G) is regular. It only remains for us to show that L = φ(Dᵣ ∩ R). Define Gq = (Q ∪ Σ₀ ∪ {S}, Σ₀, P, q), and then

L(Gq) = {x ∈ Σ₀* | q ⇒* x}.

First we prove that L ⊆ φ(Dᵣ ∩ R). Suppose x ∈ L. If x = Λ, then, by our assumption about the pda A, δ(q₀, Λ, Z₀) = {(q₀, Λ)} with q₀ ∈ F. By rules (ii) and (iv) in the definition of P, we have S ⇒ q₀ ⇒ Λ, so Λ ∈ R. Since Λ ∈ Dᵣ, we have that Λ ∈ φ(Dᵣ ∩ R).

If x ≠ Λ, then (q₀, x, Z₀) ⊢⁺ (q_f, Λ, Λ) with q_f ∈ F. To establish the desired result, we need the following result.

Claim 1 For any q ∈ Q, x ∈ Σ⁺, α ∈ Γ⁺, and q_f ∈ F, if (q, x, α) ⊢⁺ (q_f, Λ, Λ), then

x ∈ φ(Dᵣ ∩ αL(Gq)).
Proof The argument is an induction on the length of the computation sequence.

Basis. If (q, x, α) ⊢ (q_f, Λ, Λ), then x ∈ Σ ∪ {Λ} and α ∈ Γ. Then A contains the rule (q_f, Λ) ∈ δ(q, x, α) with q_f ∈ F. By construction, P contains the rules
q_f → Λ  and  q → xx̄ᾱq_f.
Clearly,
q ⇒ xx̄ᾱq_f ⇒ xx̄ᾱ.
Hence, αxx̄ᾱ ∈ Dᵣ ∩ α(L(Gq)) and x = φ(αxx̄ᾱ) ∈ φ(Dᵣ ∩ α(L(Gq))).

Induction step. Suppose that (q, x, α) ⊢ⁿ (q_f, Λ, Λ) with q_f ∈ F implies x ∈ φ(Dᵣ ∩ α(L(Gq))). Assume (q, ax, αZ) ⊢ (q′, x, αα′) by the rule (q′, α′) ∈ δ(q, a, Z), and let (q′, x, αα′) ⊢ⁿ (q_f, Λ, Λ) with n ≥ 1, q_f ∈ F, a ∈ Σ (by our assumption about the pda). Then q → aāZ̄α′q′ is in P by part (iii). Clearly, aāZ̄α′L(Gq′) ⊆ L(Gq), which implies that

αZaāZ̄α′L(Gq′) ⊆ αZL(Gq).

By intersecting with Dᵣ and applying φ, we have

φ(Dᵣ ∩ αZaāZ̄α′L(Gq′)) ⊆ φ(Dᵣ ∩ αZL(Gq)).

By the induction hypothesis, x ∈ φ(Dᵣ ∩ αα′L(Gq′)). So ax ∈ aφ(Dᵣ ∩ αα′L(Gq′)). Now, we claim the following is true.

Claim 1(a) For each q′ ∈ Q, Z ∈ Γ, α ∈ Γ⁺, α′ ∈ Γ*, a ∈ Σ,

aφ(Dᵣ ∩ αα′L(Gq′)) ⊆ φ(Dᵣ ∩ αZaāZ̄α′L(Gq′)).

Proof of Claim 1(a) If y ∈ aφ(Dᵣ ∩ αα′L(Gq′)), then there is a z such that y = aφ(αα′z), with αα′z ∈ Dᵣ and z ∈ L(Gq′). Moreover, note that y = aφ(αα′z) = aφ(z), since φ(α) = φ(α′) = Λ, because φ(Γ) = Λ. Consider αZaāZ̄α′z. This string is obviously in αZaāZ̄α′L(Gq′), since z is in L(Gq′). It is surely in Dᵣ, since αα′z is. But

φ(αZaāZ̄α′z) = φ(α)aφ(α′)φ(z) = aφ(z) = aφ(αα′z) = y.
Thus y is in the righthand side. □

Returning to the main proof, we have that

ax ∈ φ(Dᵣ ∩ αZaāZ̄α′L(Gq′)) ⊆ φ(Dᵣ ∩ αZL(Gq)),

which completes the induction proof of Claim 1. □

Using Claim 1, we can show that L ⊆ φ(Dᵣ ∩ R) as follows: If x ∈ L and lg(x) > 0, then (q₀, x, Z₀) ⊢⁺ (q_f, Λ, Λ) with q_f ∈ F. By Claim 1, x ∈ φ(Dᵣ ∩ Z₀L(Gq₀)). Since S → Z₀q₀ is always a rule of P, Z₀L(Gq₀) ⊆ L(G). Therefore,

x ∈ φ(Dᵣ ∩ Z₀L(Gq₀)) ⊆ φ(Dᵣ ∩ L(G)) = φ(Dᵣ ∩ R).

We must now deal with the reverse direction. Suppose x ∈ φ(Dᵣ ∩ R); then x = φ(y), where y ∈ Dᵣ ∩ R. Since y ∈ R, S ⇒⁺ y. There are two possibilities for the generation of y. Either
1 S ⇒ q₀ ⇒* y, or
2 S ⇒ Z₀q₀ ⇒* y.

CASE 1. If S ⇒ q₀ ⇒* y and y ∈ Dᵣ ∩ R, then y = Λ and Λ ∈ L.

Proof S → q₀ is in P if and only if δ(q₀, Λ, Z₀) = {(q₀, Λ)} and q₀ ∈ F, which implies Λ ∈ L. Consider the derivation q₀ ⇒* y. Assume that y ≠ Λ. Then the derivation begins as
q₀ ⇒ aāZ̄αq′ ⇒* y,
where a ∈ Σ, Z ∈ Γ, α ∈ Γ*, and q′ ∈ Q. Thus y = aāZ̄αy′, where y′ ∈ L(Gq′). But y is not in Dᵣ, because there is no way to cancel Z̄. This contradiction establishes that y = Λ.

CASE 2. Suppose that S ⇒ Z₀q₀ ⇒* y with y ∈ Dᵣ ∩ R. We wish to show that φ(y) is in L. This is accomplished with the following result.

Claim 2 For each q ∈ Q, α ∈ Γ*, z ∈ Σ₀*, if αq ⇒ⁿ αz with z ∈ L(Gq), and αz ∈ Dᵣ, then (q, φ(z), α) ⊢* (q_f, Λ, Λ) for some q_f ∈ F.

Proof of Claim 2 The argument is an induction on the length of a generation of αq ⇒ⁿ αz.

Basis. Suppose αq ⇒ αz with z ∈ L(Gq) and αz ∈ Dᵣ. By the definition of Gq, we must have that q → Λ is a rule of P and z = Λ. (This is the only string that Gq can generate in one step.) Then αz ∈ Dᵣ implies α ∈ Dᵣ. But α ∈ Dᵣ implies that α = Λ, since Γ* ∩ Dᵣ = {Λ}. Since q → Λ is in P, then q ∈ F. Thus (q, φ(z), α) = (q, Λ, Λ) ⊢* (q, Λ, Λ) with q ∈ F, and the basis is complete.

Induction step. Suppose that αq ⇒ⁿ αz, n > 1, z ∈ L(Gq), implies that if αz ∈ Dᵣ, then (q, φ(z), α) ⊢* (q_f, Λ, Λ) with q_f ∈ F.
Suppose that
αq ⇒ αaāX̄α′q′ ⇒ⁿ⁻¹ αz,
where a ∈ Σ; X ∈ Γ; α, α′ ∈ Γ*; z ∈ L(Gq); and αz ∈ Dᵣ. Clearly, z = aāX̄α′z′, where z′ ∈ L(Gq′). Thus

αq ⇒ αaāX̄α′q′ ⇒ⁿ⁻¹ αaāX̄α′z′,

where z′ ∈ L(Gq′) and αaāX̄α′z′ ∈ Dᵣ. But αaāX̄α′z′ ∈ Dᵣ implies that there exists β in Γ* such that α = βX and βα′z′ ∈ Dᵣ. (The form of the rules in G prevents us from having α = βXγ with γ ∈ Dᵣ.) Rewriting the primary derivation again,

βXq ⇒ βXaāX̄α′q′ ⇒ⁿ⁻¹ βXaāX̄α′z′,

where z′ ∈ L(Gq′) and βXaāX̄α′z′ ∈ Dᵣ. This implies that

βα′q′ ⇒ⁿ⁻¹ βα′z′,

where z′ ∈ L(Gq′) and βα′z′ ∈ Dᵣ (since βXaāX̄α′z′ ∈ Dᵣ). By the induction hypothesis,

(q′, φ(z′), βα′) ⊢* (q_f, Λ, Λ)

with q_f ∈ F. But since we have a rule q → aāX̄α′q′ in P, then (q′, α′) ∈ δ(q, a, X). Therefore,

(q, φ(z), α) = (q, φ(aāX̄α′z′), βX) = (q, aφ(z′), βX) ⊢ (q′, φ(z′), βα′) ⊢* (q_f, Λ, Λ)

with q_f ∈ F. This completes the induction proof of Claim 2.

To complete the entire proof, we need only show that φ(Dᵣ ∩ R) ⊆ L. Suppose that S ⇒ Z₀q₀ ⇒* y and y ∈ Dᵣ ∩ R. Then y = Z₀y′, where y′ ∈ L(Gq₀) and Z₀y′ ∈ Dᵣ. By Claim 2,

(q₀, φ(y′), Z₀) ⊢* (q_f, Λ, Λ),

which means that φ(y) ∈ L. Hence, φ(Dᵣ ∩ R) ⊆ L, and the proof is complete. □

The result can now be sharpened by coding Dᵣ into D₂.

Theorem 10.4.3 If L is context-free, then there is a regular set R and two homomorphisms φ₁ and φ₂ such that

L = φ₂(φ₁⁻¹(D₂) ∩ R).
Proof Let D₂ be the semi-Dyck set on {0, 1, 0̄, 1̄}, and let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be the pda of the previous theorem. Let Σ = {a₁, …, a_k} and Γ = {A₁, …, A_m}. Define φ₁, which maps Σ₀* into {0, 1, 0̄, 1̄}*, by

φ₁(aⱼ) = 10ʲ1,  φ₁(āⱼ) = 1̄0̄ʲ1̄  for each j, 1 ≤ j ≤ k;
φ₁(Aⱼ) = 10^{j+k}1,  φ₁(Āⱼ) = 1̄0̄^{j+k}1̄  for each j, 1 ≤ j ≤ m.

Note that μ(φ₁(aⱼāⱼ)) = μ(φ₁(AⱼĀⱼ)) = Λ for each j. Moreover, if r = |Σ ∪ Γ|, then

Dᵣ = φ₁⁻¹(D₂) ∩ Σ₀*.

By the proof of the previous theorem,

L = φ₂(Dᵣ ∩ R) = φ₂(φ₁⁻¹(D₂) ∩ R). □

PROBLEMS

1 Show that Dᵣ and D′ᵣ are not regular sets.

2 Define ≡′, reduction, and μ′ in the case of the Dyck set, i.e., for two-sided cancellation. Reprove all the results of this chapter for D′ᵣ, noting carefully where and how things change.

3 Note that μ (and μ′) determine relations ≡ and ≡′ on Σ* by x ≡ y if and only if μ(x) = μ(y). These relations are congruence relations. What sort of algebraic structure is Σ*/≡? Show that Σ*/≡′ is isomorphic to the free group on r generators. Note that D′ᵣ is the kernel of the natural map onto a free group.

4 Give an unambiguous grammar for Dᵣ and D′ᵣ. Does Gᵣ suffice for Dᵣ? Prove your answer.

5 Let G be a group which is finitely presented. The word problem for (the presentation of) G is to give an algorithm which determines whether or not a given word in the generators of G defines the identity. Show that the word problem for the free group with generators a₁, …, a_r is decidable.

6 Let G = (V, Σ, P, S) be a context-free grammar. Define a homomorphism φ: V* → N* by
φ(A) = A if A ∈ N,
φ(a) = Λ if a ∈ Σ.
Let D be any derivation of G, D: S = α₀ ⇒ α₁ ⇒ ⋯ ⇒ αₙ. Define the index of D as

index(D, G) = max{lg(φ(αᵢ)) | 0 ≤ i ≤ n}.
If w ∈ L(G), then define

index(w, G) = min{index(D, G) | D is a derivation of w}.

Also, the notion can be extended to grammars by defining

index(G) = lub{u | index(w, G) ≤ u for all w ∈ L(G)}.

G is of finite index if index(G) is finite. Show that the following grammar for D₁ has infinite index:

S → SS | Λ | a₁Sā₁.

7 Show that, for each positive integer k, there is some context-free grammar of index k.

8 Using Problem 6 as a guide, define the concept of "leftmost index." Produce a context-free grammar G which is of index 2 but has infinite leftmost index.

9 Define the index of a context-free language L by

index(L) = min{index(G) | L(G) = L}.

Show that D₁ has infinite index.

10 Show that, for each positive integer k, there is some context-free language of index k.

11 Prove that every context-free language is a homomorphic image of a deterministic context-free language.

12 Characterize phrase-structure languages in terms of Dyck sets, homomorphisms, and regular sets. (There are many ways to do this.)

13 A set L ⊆ Σ* is a standard regular set if there exist two binary relations ρ₁ and ρ₂ on Σ (not Σ*) such that x ∈ L if and only if
i) x ∈ aΣ* ∩ Σ*b for some (a, b) ∈ ρ₁, and
ii) x ∉ Σ*abΣ* for every (a, b) ∈ ρ₂.
Show that, for every regular set L ⊆ Δ*, there is some alphabet Σ, a homomorphism φ: Σ* → Δ*, and a standard regular set R such that L = φ(R).

14 Show that every context-free language L may be represented as L = φ(Dᵣ ∩ R) for some r, some homomorphism φ, and some standard regular set R.

*15 Give algebraic characterizations similar to Theorem 10.4.2 for the families of:
a) minimal linear languages,
b) linear languages,
c) metalinear languages.

16 Let Δ₁ = {a₁, ā₁}. Define a mapping ν(a₁) = 1 and ν(ā₁) = −1, and extend ν to be a homomorphism of Δ₁* into the integers. Define the set:

L = {x ∈ Δ₁* | νx = −1 and, for each proper prefix x′ of x, νx′ ≥ 0}.

L is called the Łukasiewicz language, and it is closely related to "Polish" or parenthesis-free notation. We can generalize to Δₙ = {a₀, …, aₙ} and define ν(aᵢ) = i − 1 for each i, 0 ≤ i ≤ n. Then
Lₙ = {x ∈ Δₙ* | νx = −1 and νx′ ≥ 0 for each proper prefix x′ of x}.

Show that Lₙ is an iterated counter language.

17 Show that a necessary and sufficient condition for L ⊆ Σ* to be a counter language is that there exists a regular transduction τ from {a₁, ā₁}* into Σ* such that L = τ(D₁). Cf. the problems in Section 6.5 for the definition of regular transductions.

18 Show that D₂ is not a counter language.

19 Show that the family of iterated counter languages is exactly the least family containing D₁ and closed under union, inverse homomorphism, intersection with regular sets, concatenation, +, and arbitrary homomorphisms.

20 A language L is stack bounded of order p if there is a pda A such that L = N(A) and each stack contents is an element of Z₀A_{i₁}* ⋯ A_{iₚ}*, where each A_{iⱼ} ∈ Γ − {Z₀}. L is stack bounded if L is stack bounded of order p for some p. Show that the set of stack-bounded languages is exactly the closure of the iterated counter languages under substitution.

*21 Show that any iterated counter language that is not nonterminal bounded must contain an infinite regular set.

22 Show that the semi-Dyck set can be accepted by a deterministic Turing machine with space function S(n) = log n.

*23 Extend the result in Problem 22 to the Dyck set (i.e., allow two-sided cancellation).

Let F be a field. A linear group (over F) is any group which is isomorphic to a group of k × k invertible matrices over F for some k ≥ 1.

*24 Show that the word problem for finitely generated linear groups over a field is solvable in log space (i.e., by a Turing machine which runs in space S(n) = log n). Relate to Problems 5 and 23.

There are a number of other interesting ways to utilize parentheses in language theory.

Definition A parenthesis grammar is a grammar G = (V, Σ, P, S) where
i) the symbols ( and ) are in Σ, and
ii) each rule is of the form A → (α), where α ∈ (V − {(, )})*.
A set L is a parenthesis language if L is generated by some parenthesis grammar.

25 Show that a parenthesis language may be recognized in deterministic log space.

*26 Show that there is an algorithm to decide whether L(G₁) ⊆ L(G₂), where G₁ and G₂ are parenthesis grammars. Hence the equality problem is also solvable.
*27 Show that there is an algorithm to decide whether a given context-free language is a parenthesis language.

A general two-type bracketed grammar is a 7-tuple G = (N, Σ, I, B₁, B₂, S, P), where N, Σ, I are finite nonempty sets called nonterminals, terminals, and intermediate symbols, respectively. B₁ contains only indexed square brackets, while B₂ contains only indexed round brackets. S ∈ N is the start symbol. P is a finite set of productions of the following forms:
a) A → η, where A ∈ N and η ∈ (Σ ∪ N ∪ I ∪ B₁)*;
b) π → [ᵢβ(ᵢ;
c) π → ]ᵢYβ)ᵢ;
where π ∈ I, Y ∈ (N ∪ Σ)*, β ∈ I ∪ {Λ}, and all square brackets are in B₁ and all round brackets are in B₂.

Example G = ({S, A}, {a, b, c}, {α, β}, {[₁, [₂, ]₁, ]₂}, {(₁, (₂, )₁, )₂}, S, P), where P is given by

S → [₁A]₁      α → ]₁bβ)₁
α → ]₂bα)₂     β → [₂(₂
β → [₂β(₂      A → a[₂α]₂c
A → a[₂A]₂c    β → [₁β(₁

The rules in such a grammar are used as context-free rules, except that the brackets and parentheses "commute with terminals" and "cancel each other" if properly nested. To make this precise, define a form as any string in (N ∪ Σ ∪ I ∪ B₁ ∪ B₂)*. A form η₂ is a normalization of a form η₁ if there is a γ ∈ B₁ ∪ B₂, an a ∈ Σ, and forms η₃, η₄ such that η₁ = η₃γaη₄ and η₂ = η₃aγη₄. A form will be called normalized if there is no other form which is a normalization of it. (E.g., if η₁ = a(₃)₃A[₁aB]₁c[₂D]₂ with a, c ∈ Σ and A, B, D ∈ N, a normalization of it would be η₂ = a(₃)₃Aa[₁B]₁c[₂D]₂. A normalization of η₂ is η₃ = a(₃)₃Aa[₁Bc]₁[₂D]₂; η₃ is normalized.)

A form η₂ is a reduction of a form η₁ if there are forms η₃, η₄ such that either η₁ = η₃[ᵢ]ᵢη₄ or η₁ = η₃(ᵢ)ᵢη₄, and η₂ = η₃η₄. Note that in any case the vanishing brackets have the same index. A form will be called reduced if there is no other form which is a reduction of it. (E.g., if η₂ = a(₃)₃Aa[₁B]₁c[₂D]₂ as in the previous example, then η₄ = aAa[₁B]₁c[₂D]₂ is a reduction of η₂; η₄ is also reduced.)

A form will be said to be in canonical form if it is both normalized and reduced. A form ξ is called the canonical form of a form θ if ξ is in canonical form and there is a sequence η₁, η₂, …, ηₘ of forms such that θ = η₁, ξ = ηₘ, and, for each i, 1 ≤ i < m, ηᵢ₊₁ is either a reduction or a normalization of ηᵢ.

28 Show that any substring of a reduced form is also a reduced form.

29 Show that the canonical form of a form η is unique. It will be denoted by cf{η}. Also show that cf{η₁ρη₂} = cf{η₁}ρcf{η₂} if ρ ∈ I ∪ N.
A form η₁ which is in canonical form is said to directly derive a form η₂, written η₁ ⇒ η₂, if there is a form η₃ such that η₃ is obtained from η₁ by replacing a symbol u (u ∈ N ∪ I) of η₁ by the righthand side of a production in P having the symbol u on the lefthand side, and η₂ = cf{η₃}.

30 Show that, for the example grammar G defined before Problem 28, we have L(G) = {aⁿbⁿcⁿ | n ≥ 1}.

31 Show that every recursively enumerable set L can be generated by a general two-type bracketed grammar.

10.5 CONTEXT-FREE LANGUAGES AND INVERSE HOMOMORPHISMS

We shall show that there is one context-free language L₀ from which any context-free language L may be obtained by an inverse homomorphism. That is, for each context-free L ⊆ Σ*, there exists a homomorphism φ so that L = φ⁻¹(L₀). This result has a number of implications that are not at all obvious. It will turn out, when we study parsing, that L₀ is a "hardest context-free language." Any context-free language can be parsed in whatever time or space it takes to recognize L₀.

The key to finding L₀ is to use the semi-Dyck set. Choosing L₀ = D₂ could not work, because D₂ is deterministic and hence unambiguous. For any homomorphism φ, φ⁻¹(D₂) is unambiguous, and we could not generate the inherently ambiguous languages in this way. We shall choose a nondeterministic version of D₂ as a "generator."

Definition Let Σ = {a₁, a₂, ā₁, ā₂, ¢, c}. Define

L₀ = {Λ} ∪ {x₁cy₁cz₁d ⋯ xₙcyₙczₙd | n ≥ 1, y₁ ⋯ yₙ ∈ ¢D₂, xᵢ, zᵢ ∈ Σ* for all i, 1 ≤ i ≤ n, yᵢ ∈ {a₁, a₂, ā₁, ā₂}* for all i ≥ 2}.

(Note that the xᵢ's and zᵢ's can contain c's and ¢'s.)

It is easy to see that L₀ is context-free. Imagine a pushdown automaton which "guesses" a substring y₁ in the first block and processes it in the natural way a pushdown automaton would work on the Dyck set (i.e., stack everything and pop when the top of the stack is aᵢ and the input is āᵢ). The computation is repeated for each block, as delimited by the d's.

We are now ready to begin the main argument.

Theorem 10.5.1 If L is a context-free language, then there is a homomorphism φ so that L = φ⁻¹(L₀) if Λ ∈ L, and L = φ⁻¹(L₀ − {Λ}) if Λ ∉ L.

Proof There is no loss of generality in assuming that Λ is not in L. (If Λ ∈ L, then the argument to be given works for L − Λ. Since φΛ = Λ ∈ L₀ and φx = Λ if
and only if x = Λ for our φ, the construction also works for L.) We may also assume, without loss of generality, that L = L(G), where G = (V, Σ, P, S) is in Greibach form. Thus every rule πᵢ of P is of the form A → aα, where A ∈ N, a ∈ Σ, and α ∈ (N − {S})*. Let us index N as {A₁, …, Aₙ}, where A₁ = S.

The idea of the construction is to encode productions. We shall do this by encoding only the variables in the production, not the terminals. Then, when φ is defined on a terminal a, it will encode all possible productions whose righthand sides start with a.

We begin by defining two mappings from P into {a₁, a₂, ā₁, ā₂, ¢}*. If πₖ is Aᵢ → a, then

τ̄πₖ = ā₁ā₂ⁱā₁.

If πₖ is Aᵢ → aA_{j₁} ⋯ A_{jₘ} for some m ≥ 1, then

τ̄πₖ = ā₁ā₂ⁱā₁ a₁a₂^{jₘ}a₁ ⋯ a₁a₂^{j₁}a₁.

To define τ, we recall that i is the index of the lefthand side, and

if i ≠ 1 then τπₖ = τ̄πₖ else τπₖ = ¢a₁a₂a₁τ̄πₖ.

Since τ and τ̄ encode only the nonterminals in the production and not the terminal, we let P_a = {π_{p₁}, …, π_{pₘ}} be the set of all productions whose righthand side begins with a ∈ Σ. We define the homomorphism φ by

φa = if P_a ≠ ∅ then cτ(π_{p₁}) ⋯ cτ(π_{pₘ})cd else cd.

It only remains for us to show that L = φ⁻¹(L₀ − {Λ}). The following claim is the key to the proof and gives the exact correspondence between a derivation in G and the structure of L₀.

Claim For each b₁, …, bₖ in Σ and A_{i₁}, …, A_{i_r} in N − {S}, we have that

S ⇒ᵏ b₁ ⋯ bₖA_{i₁} ⋯ A_{i_r}

under production sequence (π_{p₁}, …, π_{pₖ}) if and only if there exist xᵢ, yᵢ, zᵢ, 1 ≤ i ≤ k, such that
i) φ(b₁ ⋯ bₖ) = x₁cy₁cz₁d ⋯ xₖcyₖczₖd,
ii) y₁ ⋯ yₖ ∈ init(¢D₂) = {x | xw ∈ ¢D₂ for some w},
iii) μ(y₁ ⋯ yₖ) = ¢a₁a₂^{i_r}a₁ ⋯ a₁a₂^{i₁}a₁,
where, for each w, μ(w) is the unique reduced word that can be obtained from w by cancellation; cf. Section 10.4.

Before proving the claim, let us work an example which will illustrate the construction.
Example Consider a grammar with productions

π₁: A₁ → aA₂A₂
π₂: A₂ → a
π₃: A₂ → bA₂
π₄: A₂ → b

First we compute τ̄ and τ. Clearly,

τ(π₄) = τ̄(π₄) = ā₁ā₂²ā₁,
τ(π₃) = τ̄(π₃) = ā₁ā₂²ā₁ a₁a₂²a₁
        (the first factor codes the parent A₂; the second codes the A₂ on the right),
τ(π₂) = τ̄(π₂) = ā₁ā₂²ā₁,
τ(π₁) = ¢a₁a₂a₁ ā₁ā₂ā₁ a₁a₂²a₁ a₁a₂²a₁.

Now

φa = cτ(π₁)cτ(π₂)cd,
φb = cτ(π₃)cτ(π₄)cd.

A₁ ⇒³ aabA₂ under production sequence (π₁, π₂, π₃). Let us verify how the derivation corresponds to the conditions of the claim:

φ(aab) = c¢a₁a₂a₁ā₁ā₂ā₁a₁a₂²a₁a₁a₂²a₁cā₁ā₂²ā₁cd
         c¢a₁a₂a₁ā₁ā₂ā₁a₁a₂²a₁a₁a₂²a₁cā₁ā₂²ā₁cd
         cā₁ā₂²ā₁a₁a₂²a₁cā₁ā₂²ā₁cd

and (i) of the claim is clearly true. Within each block (or line), there are different ways to choose xᵢ, yᵢ, and zᵢ. In the last line, we could choose

x₃ = cā₁ā₂²ā₁a₁a₂²a₁,  y₃ = ā₁ā₂²ā₁,  z₃ = Λ,

or we could have chosen

x₃ = Λ,  y₃ = ā₁ā₂²ā₁a₁a₂²a₁,  z₃ = ā₁ā₂²ā₁c.
To be concrete, choose

y₁ = ¢a₁a₂a₁ā₁ā₂ā₁a₁a₂²a₁a₁a₂²a₁ = τ(π₁),
y₂ = τ(π₂), and y₃ = τ(π₃).

This will determine xᵢ and zᵢ for 1 ≤ i ≤ 3. To check whether y₁y₂y₃ ∈ init(¢D₂), we use part (f) of Proposition 10.4.2. So both parts (ii) and (iii) of the claim require computing μ(y₁y₂y₃). But that is fun, as the reader who does it will learn. The answer is

μ(y₁y₂y₃) = ¢a₁a₂²a₁.

Now we return to the proof and note that the claim suffices to prove the theorem. This is because (ii) and (iii), with r = 0, hold if and only if y₁ ⋯ yₖ ∈ ¢D₂; we have that w ∈ L if and only if there exist xᵢ, yᵢ, zᵢ, 1 ≤ i ≤ k, such that
i) φw = x₁cy₁cz₁d ⋯ xₖcyₖczₖd,
ii) y₁ ⋯ yₖ ∈ ¢D₂,
iii) yᵢ = τπ_{pᵢ} ∈ {a₁, a₂, ā₁, ā₂}* for i ≥ 2,
which holds if and only if φw ∈ L₀ − {Λ}.

Now we shall prove the claim by induction on k.

Basis. k = 1.

CASE 1. The derivation is A₁ ⇒ a, where this production is π_{pᵢ}. Then applying τ to this production gives

τπ_{pᵢ} = ¢a₁a₂a₁ā₁ā₂ā₁.

If P_a = {π_{p₁}, …, π_{pₘ}}, we have

φa = cτ(π_{p₁}) ⋯ cτ(π_{pₘ})cd = x₁cy₁cz₁d.

If we let y₁ = τ(π_{pᵢ}), then μ(y₁) = ¢, and so y₁ ∈ init(¢D₂), and all the properties of the claim are established. Conversely, if each part of the claim is satisfied, there must be a derivation A₁ ⇒ a, and Case 1 has been verified.

CASE 2. The derivation is A₁ ⇒ aA_{i₁}A_{i₂} ⋯ A_{iₘ}, where m ≥ 1, and the production is π_{pᵢ}. By definition,

τπ_{pᵢ} = ¢a₁a₂a₁ ā₁ā₂ā₁ a₁a₂^{iₘ}a₁ ⋯ a₁a₂^{i₁}a₁.

If P_a = {π_{p₁}, …, π_{pₘ}}, then

φa = cτ(π_{p₁}) ⋯ cτ(π_{pₘ})cd = x₁cy₁cz₁d,
satisfying property (i) of the claim. As in Case 1, we let y₁ = τ(π_{pᵢ}). Then

μ(y₁) = ¢a₁a₂^{iₘ}a₁ ⋯ a₁a₂^{i₁}a₁,

satisfying property (iii). Consequently, y₁ ∈ init(¢D₂), satisfying property (ii). Conversely, if each part of the claim is satisfied, there must be a derivation A₁ ⇒ aA_{i₁}A_{i₂} ⋯ A_{iₘ}, and the basis has been verified.

Induction step. Assume the result for k ≥ 1. Suppose we have

S ⇒ᵏ b₁ ⋯ bₖA_{i₁} ⋯ A_{i_r},

the production πₚ is A_{i₁} → b_{k+1}A_{j₁} ⋯ A_{j_t}, and φ(b₁ ⋯ bₖ) = x₁cy₁cz₁d ⋯ xₖcyₖczₖd, where μ(y₁ ⋯ yₖ) = ¢a₁a₂^{i_r}a₁ ⋯ a₁a₂^{i₁}a₁. Let

y_{k+1} = τ(πₚ) = ā₁ā₂^{i₁}ā₁ a₁a₂^{j_t}a₁ ⋯ a₁a₂^{j₁}a₁,

and φ(b_{k+1}) = x_{k+1}cy_{k+1}cz_{k+1}d. We compute

μ(y₁ ⋯ y_{k+1}) = μ(¢a₁a₂^{i_r}a₁ ⋯ a₁a₂^{i₁}a₁ y_{k+1}) = ¢a₁a₂^{i_r}a₁ ⋯ a₁a₂^{i₂}a₁ a₁a₂^{j_t}a₁ ⋯ a₁a₂^{j₁}a₁,

while

S ⇒ᵏ b₁ ⋯ bₖA_{i₁} ⋯ A_{i_r} ⇒ b₁ ⋯ bₖb_{k+1}A_{j₁} ⋯ A_{j_t}A_{i₂} ⋯ A_{i_r}.

This establishes one direction of the proof. On the other hand, suppose

φ(b₁ ⋯ b_{k+1}) = x₁cy₁cz₁d ⋯ x_{k+1}cy_{k+1}cz_{k+1}d

and μ(y₁ ⋯ y_{k+1}) ∈ ¢{a₁, a₂}*. By the construction of φ, the only way this can happen is if

μ(y₁ ⋯ yₖ) = ¢a₁a₂^{i_r}a₁ ⋯ a₁a₂^{i₁}a₁

and

μ(y_{k+1}) = τ(πₚ) = ā₁ā₂ˢā₁ a₁a₂^{j_t}a₁ ⋯ a₁a₂^{j₁}a₁,

where πₚ must be a production

Aₛ → b_{k+1}A_{j₁} ⋯ A_{j_t}.   (10.5.1)

But since μ(y₁ ⋯ y_{k+1}) ∈ ¢{a₁, a₂}*, there must be some cancellation between μ(y₁ ⋯ yₖ) and μ(y_{k+1}); that is, s = i₁. By the induction hypothesis,

S ⇒ᵏ b₁ ⋯ bₖA_{i₁} ⋯ A_{i_r},

and, by using equation (10.5.1), we have

S ⇒ᵏ⁺¹ b₁ ⋯ bₖb_{k+1}A_{j₁} ⋯ A_{j_t}A_{i₂} ⋯ A_{i_r}.

Therefore the induction has been extended. □

Corollary 1 L₀ is an inherently ambiguous language.

Proof If L₀ were unambiguous, then, for any homomorphism φ, φ⁻¹(L₀) would be unambiguous. By the theorem, every context-free language would be unambiguous, and that contradicts Theorem 7.2.2.

Corollary 2 L₀ is not a deterministic context-free language.

Proof If it were, it would be unambiguous and would contradict Corollary 1.
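Corollary 2 also explains why any recognizer for L₀ must cope with nondeterminism: the splitting of each block into xᵢcyᵢczᵢ must in effect be guessed. The exponential brute-force sketch below is only meant to make the definition concrete; it uses a hypothetical encoding with a, b for a₁, a₂, with A, B for ā₁, ā₂, and with 'P' playing the role of the cent sign, and it relies on property (f) of Proposition 10.4.1, μ(xy) = μ(μ(x)y), to reduce incrementally.

    import re

    BAR = {'A': 'a', 'B': 'b'}

    def mu(w):                              # the reduction map of Section 10.4
        out = []
        for ch in w:
            if ch in BAR and out and out[-1] == BAR[ch]:
                out.pop()
            else:
                out.append(ch)
        return ''.join(out)

    def in_L0(w):
        if w == '':
            return True
        if not w.endswith('d'):
            return False
        blocks = w[:-1].split('d')          # the blocks x_i c y_i c z_i
        def dfs(i, red):                    # red = mu(y_1 ... y_{i-1})
            if i == len(blocks):
                return red == 'P'           # i.e. y_1 ... y_n is in cent-D2
            pat = r'P[abAB]*' if i == 0 else r'[abAB]*'
            b = blocks[i]
            cs = [j for j, ch in enumerate(b) if ch == 'c']
            return any(re.fullmatch(pat, b[p+1:q]) and
                       dfs(i + 1, mu(red + b[p+1:q]))
                       for p in cs for q in cs if p < q)
        return dfs(0, '')

    # Smallest nonempty member: n = 1, x_1 = z_1 = Lambda, y_1 = cent.
    assert in_L0('') and in_L0('cPcd')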
PROBLEMS

1 Write a context-free grammar for L₀.

2 How would you recognize strings of L₀? Can you give a polynomial-time algorithm for recognition of strings in L₀? There is a great deal of interest in efficient recognition of strings in L₀.

**3 Find a set L̂₀ (not context-free) with the property that, for every suitably closed family 𝓛, 𝒬 ⊆ 𝓛 if and only if L̂₀ ∈ 𝓛, where 𝒬 is the family of sets accepted by nondeterministic Turing machines in quasi-realtime. [Hint: Take three copies of L₀, shuffle them (cf. Problem 8 at the end of Section 6.4), and find a suitable restriction on the result.]

10.6 HISTORICAL SURVEY

Theorem 10.2.1 is from Baker [1974]. Corollary 1 of that result is due to Ginsburg and Greibach [1966]. Corollaries 2 and 3 are due to Book [1972]. Corollary 5 is from Hibbard [1966]. Corollary 6 was proved by Matthews [1967]. Theorem 10.3.1 is from Ginsburg, Greibach, and Harrison [1967]. Problem 6 of Section 10.3 is due to Minsky [1961]. Theorem 10.4.2 is from Chomsky and Schützenberger [1963]. The concept of index, and relevant results about it, appear in Yntema [1967], Nivat [1967], Brainerd [1968], and Salomaa [1969]. Problems 13 through 15 of Section 10.4 derive from the paper by Chomsky and Schützenberger [1963]. The exercises on counters are from Boasson [1971, 1973, 1974] and Greibach [1969, 1975a]. Ritchie and Springsteel [1972] appear to have been the first to recognize the semi-Dyck set in log space. Problem 24 is from Lipton and Zalcstein [1976]. Problem 25 was suggested by the work of Mehlhorn [1975] and Lynch [1976]. The equivalence problems for parenthesis grammars are due to McNaughton [1967] and Knuth [1967]. Various kinds of bracketed grammars are given in Schkolnick [1968] and Harrison and Schkolnick [1971]. Exercise 31 is from Santos [1972]. Theorem 10.5.1 is from Greibach [1973].
eleven
Basic Theory of Deterministic Languages

11.1 INTRODUCTION

Deterministic context-free languages are one of the most important classes of languages, because it is possible to construct efficient parsers for them. In this chapter, we begin a systematic study of this family.

We start by studying the closure properties of the class. It takes a great deal of work to prove that, if L is deterministic and R is regular, then LR⁻¹ is also deterministic. That result implies that adding or removing a right endmarker does not affect the class of deterministic languages. Section 11.3 is devoted to simple closure and nonclosure results.

One gains a certain amount of advantage by studying the more tractable subfamily of strict deterministic languages. This study is begun in Section 11.4, and machine characterizations are obtained in Section 11.5. It turns out that a language is strict deterministic if and only if it is both deterministic and prefix-free. In Section 11.6, an infinite hierarchy of strict deterministic languages is established, based on degrees or on the number of states of dpda's accepting them.

Next, the length of time required to accept deterministic languages is considered. The real-time strict deterministic languages are defined and studied in Section 11.7; and it is shown that there are some (strict) deterministic languages that cannot be accepted in realtime, even on a multitape Turing machine.

In Section 11.8, the trees that occur in strict deterministic grammars are characterized. Using this characterization, it is possible to prove iteration theorems for both strict deterministic and deterministic languages.
In Section 11.10, we look at the family of simple languages, i.e., those accepted by one-state realtime dpda's which accept by empty store. It will be shown that this family has an unsolvable inclusion problem but a solvable equivalence problem.

11.2 CLOSURE OF DETERMINISTIC LANGUAGES UNDER RIGHT QUOTIENT BY REGULAR SETS

Our goal is to show that if L is a deterministic context-free language and R is a regular set, then LR⁻¹ is also a deterministic context-free language. The proof proceeds in an indirect manner, and first we define tables which will be used as pushdown symbols.

Definition Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda and let n = |Q|. For each α ∈ Γ*, define an n × n matrix T(α), called a transition matrix for α, such that

(T(α))_{pq} = 1 if (p, w, α) ⊢* (q, Λ, Λ) for some w ∈ Σ*,
            = 0 otherwise.

Let 𝒯 be the set of all such matrices for A. Note that, for a dpda A, there are at most 2^{n²} such matrices; and our first question concerns their effective construction from A. Note that the matrix T(Λ) is the n × n identity matrix.

Lemma 11.2.1 Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda and let 𝒯 be the set of all transition tables for A. If f is the function from 𝒯 × Γ* into 𝒯 defined by

f(T(α), β) = T(αβ)  for α, β ∈ Γ*,

then f may be effectively computed from A.

Proof First, we must show that f is well defined. More precisely, we have:

Claim 1 For any α₁, α₂, β ∈ Γ*, if T(α₁) = T(α₂), then f(T(α₁), β) = f(T(α₂), β).

Proof Assume T(α₁) = T(α₂), where α₁, α₂ ≠ Λ. The special cases where α₁ = Λ or α₂ = Λ may be handled separately later. For any p, q ∈ Q, if T(α₁β)_{pq} = 1, then there exists w ∈ Σ* so that (p, w, α₁β) ⊢* (q, Λ, Λ). Since α₁ ≠ Λ, there exist w₁, w₂ ∈ Σ*, q′ ∈ Q, such that w = w₁w₂. Moreover, we may choose w₁, w₂, q′ such that in

(p, w, α₁β) ⊢* (q′, w₂, α₁)   (11.2.1)

no part of α₁ is ever read. Thus we may conclude from (11.2.1) that

(p, w₁, β) ⊢* (q′, Λ, Λ).
But (q′, w₂, α₁) ⊢* (q, Λ, Λ) implies

T(α₁)_{q′q} = 1.

From T(α₁) = T(α₂), we have T(α₂)_{q′q} = 1. Thus there exists w₃ ∈ Σ* such that (q′, w₃, α₂) ⊢* (q, Λ, Λ). Hence

(p, w₁w₃, α₂β) ⊢* (q′, w₃, α₂) ⊢* (q, Λ, Λ).

Therefore T(α₂β)_{pq} = 1. Conversely, if T(α₂β)_{pq} = 1, then T(α₁β)_{pq} = 1, so that T(α₁β) = T(α₂β).

This argument also shows that, if we have T(α) and β, we may effectively compute f(T(α), β) = T(αβ). We need only show how to compute T(α) from A, where α ∈ Γ*. For any p, q ∈ Q,

(T(α))_{pq} = 1 if {w ∈ Σ* | (p, w, α) ⊢* (q, Λ, Λ)} ≠ ∅,
            = 0 otherwise.

We know from Section 5.6 that

L(A, p, α, q) = {w ∈ Σ* | (p, w, α) ⊢* (q, Λ, Λ)}

is a context-free language which is effectively constructible from A. Moreover, its emptiness is decidable, so that T(α) is computable from A. □

It is important to observe that the computation of f depends on T(α) and not on α.

Corollary Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda with p, q ∈ Q. Define

R_{pq} = {α ∈ Γ* | (p, w, α) ⊢* (q, Λ, Λ) for some w ∈ Σ*}.

R_{pq} is a regular set.

Proof Define a relation τ on Γ* by

(α₁, α₂) ∈ τ ⟺ T(α₁) = T(α₂).

τ is a right congruence relation on Γ*. τ has finite rank, since there are only a finite number of such matrices. Moreover,

R_{pq} = ⋃ {[α]_τ | T(α)_{pq} = 1},

where [α]_τ denotes the equivalence class of α under τ. Therefore, R_{pq} is regular. (Cf. Problem 10 of Section 2.2.) □
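The emptiness test above is effective but indirect. Equivalently, one can find the one-symbol "erasing" triples by a least fixpoint and then build any T(α) by boolean matrix composition, which is exactly the content of f(T(α), β) = T(αβ). The following is a minimal sketch under an assumed encoding (δ as a dict (p, a, Z) ↦ (q, α), with the stack top written rightmost, as in this chapter); it is an illustration, not the book's construction.

    def erase_triples(Q, delta):
        # E contains (p, Z, q) iff some input takes the dpda from state p with
        # Z on top of the stack to state q with Z (and all it spawned) erased.
        E = set()
        def erasable(q1, alpha, q2):        # can the pushed string alpha vanish?
            if not alpha:
                return q1 == q2
            return any((q1, alpha[-1], r) in E and erasable(r, alpha[:-1], q2)
                       for r in Q)
        changed = True
        while changed:
            changed = False
            for (p, a, Z), (q1, alpha) in delta.items():
                for q2 in Q:
                    if (p, Z, q2) not in E and erasable(q1, alpha, q2):
                        E.add((p, Z, q2))
                        changed = True
        return E

    def table(E, Q, alpha):
        # T(alpha), built one symbol at a time; each step realizes f(T(.), Z).
        M = {(p, q): p == q for p in Q for q in Q}      # T(Lambda) = identity
        for Z in alpha:                                 # push symbols bottom-up
            M = {(p, q): any((p, Z, r) in E and M[r, q] for r in Q)
                 for p in Q for q in Q}
        return M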
Our next result not only is useful by itself, but is an important intermediate result in showing closure under quotient.

Theorem 11.2.1 If L ⊆ Σ* is a deterministic context-free language, then so is

init(L) = L(Σ*)⁻¹.

Proof By Problem 8 of Section 5.5, if c ∉ Σ, then there is a dpda A₁ = (Q₁, Σ ∪ {c}, Γ₁, δ, q₀, Z₀, F₁) such that

Lc = L(A₁) = {w ∈ (Σ ∪ {c})* | (q₀, w, Z₀) ⊢* (q, Λ, Λ) for some q ∈ F₁}.

There is no loss of generality in assuming that δ(q, a, Z) = (q′, α) implies that lg(α) ≤ 2. Let 𝒯 be the set of transition tables for A₁. Let f: 𝒯 × Γ₁ → 𝒯 be the function defined by

f(T(α), β) = T(αβ)  for α ∈ Γ₁*, β ∈ Γ₁.

f is well defined and is computable, by Lemma 11.2.1. Construct A₂ = (Q₂, Σ, Γ₂, δ₂, r₀, Z₀′, F₂), where

Q₂ = {(q, i) | q ∈ Q₁, i ∈ {0, 1, 2}} ∪ {r₀},
Γ₂ = (Γ₁ × 𝒯) ∪ {Z₀′}, where Z₀′ is a new symbol,
F₂ = {(q, 2) | q ∈ Q₁} ∪ {r₀ | L ≠ ∅},

and δ₂ is defined by cases as shown below.

CASE 1. For any a ∈ Σ ∪ {Λ}, if δ(q₀, a, Z₀) = (q, α), then:
a) if α = Λ, let δ₂(r₀, a, Z₀′) = ((q, 0), Z₀′);
b) if α = A ∈ Γ₁, let δ₂(r₀, a, Z₀′) = ((q, 0), Z₀′(A, T(Λ)));
c) if α = A₂A₁, let δ₂(r₀, a, Z₀′) = ((q, 0), Z₀′(A₂, T(Λ))(A₁, T(A₂))).

In Cases 2 through 4 below, let δ(q, a, Z) = (p, α) for a ∈ Σ ∪ {Λ} and Z ∈ Γ₁.

CASE 2. (Simulation of erase rules in (q, 1) or (q, 2).) If α = Λ, then for all S ∈ 𝒯 let

δ₂((q, 1), a, (Z, S)) = δ₂((q, 2), a, (Z, S)) = ((p, 0), Λ).

CASE 3. (Preservation of tables.) If α = Z₁ ∈ Γ₁, then for all S ∈ 𝒯 let

δ₂((q, 1), a, (Z, S)) = δ₂((q, 2), a, (Z, S)) = ((p, 0), (Z₁, S)).

CASE 4. (Use transition tables to predict what will happen next.) If α = A₂A₁, where A₁, A₂ ∈ Γ₁, then for all S ∈ 𝒯 let

δ₂((q, 1), a, (Z, S)) = δ₂((q, 2), a, (Z, S)) = ((p, 0), (A₂, S)(A₁, f(S, A₂))).
CASE 5. (Deciding to accept.) For each A ∈ Γ₁, S ∈ 𝒯, and q ∈ Q₁, let

δ₂((q, 0), Λ, (A, S)) = ((q, 2), (A, S)) if S′ = f(S, A) and there is p ∈ F₁ such that (S′)_{qp} = 1,
                      = ((q, 1), (A, S)) if S′ = f(S, A) and, for all p ∈ F₁, (S′)_{qp} = 0.

States (q, 1) and (q, 2) are the states in which input is processed. Only Λ-moves are done in states (q, 0). When in state (q, 0), A₂ determines whether A₁ would accept under some future input. If such an input exists, A₂ continues in states of the form (q, 2); and if not, it uses states of the form (q, 1).

We shall prove that the construction works by a series of simple claims.

Claim 1 For each q ∈ Q₁, k ≥ 0, n ≥ 1, Aᵢ ∈ Γ₁, w ∈ Σ*,

(q₀, w, Z₀) ⊢ᵏ (q, Λ, Aₙ ⋯ A₁)

if and only if

(r₀, w, Z₀′) ⊢*_{A₂} ((q, 0), Λ, α),

where

α = Z₀′(Aₙ, T(Λ))(Aₙ₋₁, T(Aₙ)) ⋯ (A₂, T(Aₙ ⋯ A₃))(A₁, T(Aₙ ⋯ A₂)).

Proof The argument is a straightforward induction on k. The basis involves only part (1) of the construction, while the induction step uses parts (2), (3), (4), and (5). Details are left for Problem 2 at the end of the section.

Claim 2 r₀ ∈ F₂ if and only if L ≠ ∅ if and only if Λ ∈ init(L).

Claim 3 init(L) = L(Σ*)⁻¹ = [Lc((Σ ∪ {c})*)⁻¹] ∩ Σ*.

Claim 4 T(A₂) ⊆ init(L).

Proof Let w ∈ T(A₂). If w = Λ and r₀ ∈ F₂, then L ≠ ∅, by Claim 2, and so Λ ∈ init(L). The case r₀ ∉ F₂ is left for the problems. Suppose w ≠ Λ. Then there exist q ∈ Q₁, α₂ ∈ Γ₂*, A ∈ Γ₁, S ∈ 𝒯, such that

(r₀, w, Z₀′) ⊢*_{A₂} ((q, 0), Λ, α₂(A, S)) ⊢_{A₂} ((q, 2), Λ, α₂(A, S)).

Note that α₂ = Z₀′ if and only if S = T(Λ). By Claim 1, there exists α₁ ∈ Γ₁* such that

(q₀, w, Z₀) ⊢* (q, Λ, α₁A)

and

S = T(α₁) if α₁ ≠ Λ,  S = T(Λ) if α₁ = Λ.
By Claim 1, α₁ = Λ if and only if α₂ = Z₀′. Let S′ = f(S, A). If α₁ = Λ, then α₂ = Z₀′ and S = T(Λ). Thus

S′ = f(S, A) = f(T(Λ), A) = T(A) = T(α₁A).

If α₁ ≠ Λ, then S′ = f(T(α₁), A) = T(α₁A).

By part (5) of the construction of δ₂, if δ₂((q, 0), Λ, (A, S)) = ((q, 2), (A, S)), then there exists p ∈ F₁ so that (S′)_{qp} = 1. Thus there exists w′ ∈ Σ* such that

(q₀, ww′, Z₀) ⊢* (q, w′, α₁A) ⊢* (p, Λ, Λ),

where p ∈ F₁. Thus ww′ ∈ L(A₁), and so

w ∈ [L(A₁)((Σ ∪ {c})*)⁻¹] ∩ Σ* = init(L)

by Claim 3. Therefore T(A₂) ⊆ init(L).

Claim 5 init(L) ⊆ T(A₂).

If w ∈ init(L) and w = Λ, then L ≠ ∅ and r₀ ∈ F₂ by Claim 2. Thus w ∈ T(A₂). Now suppose w ≠ Λ is in init(L). There exists w′ ∈ Σ* so that ww′ ∈ L and ww′c ∈ L(A₁). Hence there exist q ∈ Q₁, p ∈ F₁, A ∈ Γ₁, and α₁ ∈ Γ₁* such that

(q₀, ww′c, Z₀) ⊢* (q, w′c, α₁A)   (11.2.2)
⊢* (p, Λ, Λ).

But (11.2.2) implies that

(q₀, w, Z₀) ⊢* (q, Λ, α₁A)

and allows us to use Claim 1 to conclude that

(r₀, w, Z₀′) ⊢*_{A₂} ((q, 0), Λ, α₂(A, S))

for some α₂ ∈ Γ₂*. Moreover,

S = T(Λ) if α₁ = Λ,  S = T(α₁) if α₁ ≠ Λ.
Again, note that α₁ = Λ if and only if α₂ = Z₀′. Let S′ = f(S, A), and so S′ = T(α₁A). We already know that

(q, w′c, α₁A) ⊢* (p, Λ, Λ),

where p ∈ F₁. Thus (S′)_{qp} = 1. By part (5) of the construction of δ₂,

δ₂((q, 0), Λ, (A, S)) = ((q, 2), (A, S)),

so that w ∈ T(A₂). Therefore, init(L) ⊆ T(A₂). This completes the entire proof. □

The next step in showing closure under right quotient is to show closure under removal of the right endmarker. The idea is similar to what is done in the proof of Theorem 11.2.1, but that construction will not work directly. We wish to consider the computation

(q, c, αA) ⊢* (p, Λ, Λ).   (11.2.3)

There are two different ways in which (11.2.3) can occur, namely:
i) the letter c can erase A, and α can be erased by the null string; or
ii) Λ can erase A, and c can erase α.

It is necessary to generalize transition tables to keep track of these possibilities.

Definition Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda. Let n = |Q|, c ∉ Σ, and α ∈ Γ*. A c-transition table for α is an n × n matrix T_c(α) defined by

(T_c(α))_{pq} = (i, j), where
i = 1 if (p, c, α) ⊢* (q, Λ, Λ), and 0 otherwise;
j = 1 if (p, Λ, α) ⊢* (q, Λ, Λ), and 0 otherwise.

Let 𝒯_c denote all c-transition tables for A. Given a dpda, one can effectively compute the set of c-transition tables.

Lemma 11.2.2 Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda and let 𝒯_c be the set of all c-transition tables for A. If f_c: 𝒯_c × Γ* → 𝒯_c is the function defined by

f_c(T_c(α), β) = T_c(αβ)  for α, β ∈ Γ*,

then f_c is computable from A.

Proof The argument is similar to that of Lemma 11.2.1 and is left as Exercise 4 at the end of the section. □

With the aid of c-transition tables, the right endmarkers may be removed.
Theorem 11.2.2 Let c be a letter not in Σ. If L ⊆ Σ*c is a deterministic context-free language, then so is Lc⁻¹.

Proof Assume that L = T(A₁) for some loop-free dpda A₁. Construct a dpda A₂, incorporating transition tables, such that L = T(A₁) = T(A₂). By using c-transition tables and f_c from Lemma 11.2.2, construct A₃ from A₂ such that

T(A₃) = Lc⁻¹.

A construction similar to that employed in the proof of Theorem 11.2.1 will work to produce A₃ from A₂. □

Finally, the main result can be proved.

Theorem 11.2.3 If L is a deterministic context-free language and R is a regular set, then LR⁻¹ is also a deterministic context-free language.

Proof Assume L, R ⊆ Σ* and let c ∉ Σ. Define φ: (Σ ∪ {c})* → Σ* to be the homomorphism defined by

φa = a if a ∈ Σ,
φa = Λ if a = c.

Now

L₁ = φ⁻¹(L) ∩ Σ*cR

is a deterministic context-free language by Problem 1 of Section 6.5 and Theorem 6.4.1. Note that

L₁ = {wcy | wy ∈ L and y ∈ R}.

Define

L₂ = (init(L₁)) ∩ Σ*c.

init(L₁) is a deterministic context-free language, and hence so is L₂. But

L₂ = {wc | wcy ∈ L₁ for some y ∈ R} = {wc | wy ∈ L for some y ∈ R} = (LR⁻¹)c.

By Theorem 11.2.2, LR⁻¹ is a deterministic context-free language, since L₂ = (LR⁻¹)c is. □

PROBLEMS

1 Do Lemma 11.2.1 and its Corollary hold for nondeterministic pda's as well as deterministic ones?

2 Prove Claim 1 in the proof of Theorem 11.2.1.
3 The proof of Claim 4 in Theorem 11.2.1 does not explicitly deal with the case where w = Λ and r₀ ∉ F₂. Analyze this case.

4 Prove Lemma 11.2.2.

5 Complete the details of the proof of Theorem 11.2.2.

11.3 CLOSURE AND NONCLOSURE PROPERTIES OF DETERMINISTIC LANGUAGES

It is necessary for us to know which operations do or do not preserve deterministic context-free languages. We recall from Theorem 6.4.1 that if L is deterministic and R is regular, then L ∩ R is a deterministic context-free language. We also know that, if L is a deterministic language and S is a gsm, then S⁻¹(L) is also a deterministic language. (Cf. Problem 1 at the end of the section.) It follows as a corollary of this result that the family of deterministic context-free languages is closed under inverse homomorphisms. The first new operation that we shall study here is product with regular sets.

Theorem 11.3.1 If L is a deterministic context-free language and R is a regular set, then LR is also a deterministic context-free language.

Proof Let A₁ = (Q₁, Σ, Γ, δ₁, q₁, Z₀, F₁) be a dpda such that T(A₁) = L. There is no loss of generality in assuming that A₁ is loop-free, by Lemma 5.6.5. Let A₂ = (Q₂, Σ, δ₂, q₂, F₂) be a finite automaton such that T(A₂) = R. We construct a dpda A₃ for LR as follows. Let A₃ = (Q₃, Σ, Γ, δ₃, q₃, Z₀, F₃), where

Q₃ = Q₁ × 2^{Q₂},
q₃ = (q₁, ∅),
F₃ = Q₁ × {Q′ ⊆ Q₂ | Q′ ∩ F₂ ≠ ∅},

and δ₃ is defined by cases as shown below.

CASE 1. For Z ∈ Γ, q, q′ ∈ Q₁, Q′ ⊆ Q₂, α ∈ Γ*: if δ₁(q, Λ, Z) = (q′, α), define

δ₃((q, Q′), Λ, Z) = ((q′, Q′ ∪ {q₂}), α) if q′ ∈ F₁,
                  = ((q′, Q′), α)        if q′ ∉ F₁.

As A₁ goes to a final state, place the start state of A₂ into the finite control of A₃.
CASE 2. For Z ∈ Γ, q, q′ ∈ Q₁, Q′ ⊆ Q₂, a ∈ Σ, α ∈ Γ*: if δ₁(q, a, Z) = (q′, α), then define

δ₃((q, Q′), a, Z) = ((q′, {q₂} ∪ δ₂(Q′, a)), α) if q′ ∈ F₁,
                  = ((q′, δ₂(Q′, a)), α)        if q′ ∉ F₁.

We simulate all possible computations of A₂ in the second component and continue to add the start state every time A₁ hits a final state. Note that A₃ is a deterministic pda, because A₁ is. To complete the proof, we must show that T(A₃) = LR. The argument is given in several simple claims.

Claim 1 Let q, q′ ∈ Q₁, p ∈ Q₂, w ∈ Σ*, α, β ∈ Γ*, and S, S′ ⊆ Q₂. If

((q, S), w, α) ⊢*_{A₃} ((q′, S′), Λ, β)

and if p ∈ S, then δ₂(p, w) ∈ S′.

Proof The argument follows easily by induction on lg(w).

The next claim states a correspondence between A₁ and A₃ computations.

Claim 2 For any w₁, w₂ ∈ Σ*, for any k, ℓ ≥ 0, any q′, q″ ∈ Q₁, any α, β ∈ Γ*, and any S, S′ ⊆ Q₂,

(q₁, w₁w₂, Z₀) ⊢ᵏ (q′, w₂, α) ⊢ℓ (q″, Λ, β)

if and only if

((q₁, ∅), w₁w₂, Z₀) ⊢ᵏ_{A₃} ((q′, S), w₂, α) ⊢ℓ_{A₃} ((q″, S′), Λ, β).

Proof The argument is an induction on k + ℓ and is omitted.

In our next claim, we show that LR ⊆ T(A₃).

Claim 3 LR ⊆ T(A₃).

Proof w ∈ LR if and only if there exist w₁, w₂ ∈ Σ* so that w = w₁w₂, w₁ ∈ L, and w₂ ∈ R. This holds if and only if

(q₁, w₁, Z₀) ⊢* (q′, Λ, α)   (11.3.1)

for some q′ ∈ F₁, and

δ₂(q₂, w₂) ∈ F₂.   (11.3.2)

Since A₁ is loop-free and because (11.3.1) holds, we have

(q₁, w₁w₂, Z₀) ⊢* (q′, w₂, α) ⊢* (q″, Λ, β)

for some q′ ∈ F₁, some β ∈ Γ*, and some q″ ∈ Q₁. By Claim 2,

((q₁, ∅), w₁w₂, Z₀) ⊢*_{A₃} ((q′, S), w₂, α) ⊢*_{A₃} ((q″, S′), Λ, β).   (11.3.3)
Since q′ ∈ F₁, we have q₂ ∈ S, by the construction. Moreover, by Claim 1, δ₂(q₂, w₂) ∈ S′. But we assumed that w₂ ∈ R, so δ₂(q₂, w₂) ∈ F₂. Thus

(q″, S′) ∈ F₃,

and (11.3.3) is an accepting computation of A₃ on w₁w₂. Therefore w₁w₂ ∈ T(A₃).

In order to prove the converse, an intermediate proposition is helpful.

Claim 4 For any k ≥ 0, w ∈ Σ*, β ∈ Γ*, q′ ∈ Q₁, S′ ⊆ Q₂: if there exists q″ ∈ S′ and

((q₁, ∅), w, Z₀) ⊢ᵏ_{A₃} ((q′, S′), Λ, β),

then there exist w₁, w₂ ∈ Σ*, q ∈ Q₁, S ⊆ Q₂, and α ∈ Γ* such that w = w₁w₂,

((q₁, ∅), w₁w₂, Z₀) ⊢*_{A₃} ((q, S), w₂, α) ⊢*_{A₃} ((q′, S′), Λ, β),

where q ∈ F₁, q₂ ∈ S, and δ₂(q₂, w₂) = q″.

Proof The argument is a straightforward induction on k and is omitted.

Claim 5 T(A₃) ⊆ LR.

Proof Assume that w ∈ T(A₃). Then,

((q₁, ∅), w, Z₀) ⊢*_{A₃} ((q′, S′), Λ, β)   (11.3.4)

for some β ∈ Γ* and some (q′, S′) ∈ F₃. This means that S′ ∩ F₂ ≠ ∅, so let q″ ∈ S′ ∩ F₂. By Claim 4, there exist w₁, w₂ ∈ Σ*, q ∈ Q₁, S ⊆ Q₂, α ∈ Γ*, such that

((q₁, ∅), w₁w₂, Z₀) ⊢*_{A₃} ((q, S), w₂, α) ⊢*_{A₃} ((q′, S′), Λ, β),

where q ∈ F₁, q₂ ∈ S, and δ₂(q₂, w₂) = q″. By Claim 2,

(q₁, w₁, Z₀) ⊢* (q, Λ, α)

with q ∈ F₁, and δ₂(q₂, w₂) = q″ ∈ F₂. Thus w = w₁w₂ with w₁ ∈ L and w₂ ∈ R, and therefore T(A₃) ⊆ LR. □

Next we study product on the left with a single string.
Lemma 11.3.1 Let Σ be an alphabet, let a ∈ Σ, and let c ∉ Σ. The following conditions are equivalent:
1 cL is a deterministic context-free language.
2 L is a deterministic context-free language.
3 aL is a deterministic context-free language.

Proof Suppose cL is a deterministic context-free language. Assume, without loss of generality, that cL = T(A), where A = (Q, Σ ∪ {c}, Γ, δ, q₀, Z₀, F) is a loop-free deterministic pda. Define B = (Q, Σ, Γ, δ_B, q₀′, Z₀′, F), where

δ_B(q, a, Z) = δ(q, a, Z)  for all (q, a, Z) ∈ Q × ((Σ ∪ {Λ}) − {c}) × Γ.

The initial state q₀′ and the initial symbol Z₀′ are determined by the condition

(q₀, c, Z₀) ⊢* (q₀′, Λ, αZ₀′)  for some α ∈ Γ*.

It is now a straightforward matter to verify that L = T(B), which is still a deterministic language.

To prove that (2) implies (3) is a trivial dpda construction, and it is omitted. To show that (3) implies (1), assume that aL is a deterministic context-free language. Define a homomorphism φ by φb = b if b ∈ Σ and φc = a. Then

φ⁻¹(aL) ∩ cΣ* = cL

is deterministic, by the closure of deterministic languages under inverse homomorphism and under intersection with regular sets. □

We can use this lemma repeatedly, as follows.

Theorem 11.3.2 Let L ⊆ Σ* and w ∈ Σ*. L is a deterministic context-free language if and only if wL is.

Proof Let w = a₁ ⋯ aₙ, where n ≥ 0 and aₖ ∈ Σ for 1 ≤ k ≤ n. The proof follows from using the lemma n times. □

Now we turn to showing that the deterministic context-free languages are not closed under certain operations. We recall that they are not closed under union.

Theorem 11.3.3
1 The family of deterministic context-free languages, Δ₀, is a proper subfamily of the family of context-free languages.
2 Δ₀ is not closed under homomorphisms. More strongly, Δ₀ is not closed under "projections"; that is, maps from Σ into Δ extended to be homomorphisms.
3 Δ₀ is not closed under concatenation. More strongly, Δ₀ is not closed under concatenation on the left by two-element sets.
4 Δ₀ is not closed under * (or + either).
5 Δ₀ is not closed under transposition.
Proof Since every deterministic context-free language is unambiguous, it follows from the existence of inherently ambiguous languages that Δ₀ is a proper subfamily of the context-free languages.

To see (2), recall that Dᵣ, the semi-Dyck set, is deterministic. For any regular set R, Dᵣ ∩ R is deterministic; and if Δ₀ were closed under homomorphism, then L = φ(Dᵣ ∩ R) would be deterministic. By Theorem 10.4.2, every context-free language would be deterministic, which would contradict (1).

It is convenient to define

L₀ = {aⁱbⁱaʲ | i, j ≥ 1},  L₁ = {aⁱbʲaʲ | i, j ≥ 1},  L₂ = L₀ ∪ L₁.

Recall that L₂ is inherently ambiguous and hence not a deterministic context-free language. (Cf. Section 7.2.)

To prove part (3) of the theorem, let

L = cL₀ ∪ L₁.

It is clear that L is deterministic. Let R = {c, c²}. Suppose that RL were deterministic. Then

RL ∩ c²a*b*a* = c²L₂

would be deterministic. Using Theorem 11.3.2 allows us to conclude that L₂ is deterministic, which is false.

To show (4), define

L = L₀ ∪ cL₁ ∪ {c}.

Suppose that L* were deterministic. Then

L* ∩ c²a⁺b*a* = c²L₂

would be, and that was already shown to be impossible.

To show (5), let

L = cL₁ ∪ c²L₀.

Clearly, L is deterministic. Suppose Lᵀ were deterministic. Then

(Lᵀ{c, c²}⁻¹) ∩ Σ* = L₀ ∪ L₁ = L₂

would be deterministic, by Theorem 11.2.3. But this is a contradiction. □

PROBLEMS

1 Show that if L is a deterministic context-free language and R is a regular set, then L ∪ R, L − R, and R − L are all deterministic context-free languages.
2 Let Σ be an alphabet and let ¢ ≠ $ with ¢, $ ∉ Σ. If L₁, L₂ ⊆ Σ*, then ¢L₁ ∪ $L₂ is called the marked union of L₁ and L₂. The marked concatenation of L₁ and L₂ is L₁$L₂, while the marked star is (L₁$)*. Show that the family of deterministic context-free languages is closed under marked union, marked product, and marked star.

3 Show that, if L is a deterministic context-free language and R is a regular set, then

div(L, R) = {u | uR ⊆ L}

is also a deterministic context-free language.

4 Find a context-free language L and a regular set R such that div(L, R) is not even context-free. R can even be chosen to have cardinality two if L is chosen correctly.

5 Let x, y ∈ Σ*. Recall that x < y if y = xy′ for some y′ ∈ Σ⁺. Define

min(L) = {y ∈ L | if x < y, then x ∉ L}.

That is, y is in min(L) if y ∈ L and no proper prefix of y is in L. Show that, if L is a deterministic context-free language, then so is min(L).

6 Show that, if L is a context-free language, then min(L) may not be context-free.

7 If L ⊆ Σ*, define

max(L) = {y ∈ L | if y < x, then x ∉ L}.

Thus y ∈ max(L) means that y ∈ L and no proper continuation of y is in L. Show that, if L is a deterministic context-free language, then so is max(L).

8 Give an example of a context-free language L so that max(L) is not context-free.

9 Extend the proof of Theorem 11.3.2 to cover projections.

10 Show that, if L is a deterministic context-free language, then L − min(L) is also a deterministic context-free language.

11 Are the deterministic context-free languages closed under left quotient with regular sets?

12 Show that it is unsolvable to determine, of two deterministic context-free languages L₁ and L₂, whether or not:
a) L₁ ∪ L₂ is deterministic;
b) L₁ ⊆ L₂;
c) L₁L₂ is deterministic;
d) L₁* is deterministic.

11.4 STRICT DETERMINISTIC GRAMMARS

In this section, a new family of grammars will be introduced and related to the deterministic languages. The intuition behind this family is technical, and its use will simplify many arguments.
Notation Let α ∈ V* and n ≥ 0. ⁽ⁿ⁾α denotes the prefix of α of length min{n, lg(α)}. The notation α⁽ⁿ⁾ is used for the suffix of α of length min{n, lg(α)}.

Definition 11.4.1 Let G = (V, Σ, P, S) be a context-free grammar and let π be a partition of the set V of terminal and nonterminal letters of G. Such a partition π is called strict if:
1 Σ ∈ π; and
2 for any A, A′ ∈ N and α, β, β′ ∈ V*, if A → αβ and A′ → αβ′ are in P and A ≡ A′ (mod π), then either:
i) both β, β′ ≠ Λ and ⁽¹⁾β ≡ ⁽¹⁾β′ (mod π), or
ii) β = β′ = Λ and A = A′.

In most cases, the partition π will be clear from the context, and we shall write simply A ≡ B instead of A ≡ B (mod π), and [A] instead of [A]_π = {A′ ∈ V | A′ ≡ A (mod π)}.

Definition Any grammar G = (V, Σ, P, S) is called strict deterministic if there exists a strict partition π of V. A language L is called a strict deterministic language if L = L(G) for some strict deterministic grammar G.

The motivation behind this definition is that we wish to make certain restrictions on the simultaneous occurrences of substrings in different productions. Intuitively, if A → αβ is a production in our grammar, then "partial information" about A, together with complete information about a prefix α of αβ, yields similar partial information about the next symbol of β when β ≠ Λ, or the complementary information about A when β = Λ. In the formal definition, the intuitive notion of "partial information" is precisely represented by means of the partition π.

Example 1 Let G₁ be a grammar with the productions

S → aA | aB
A → aAa | bC
B → aB | bD
C → bC | a
D → bDc | c

The blocks of a strict partition are: Σ, {S}, {A, B}, {C, D}. The language is

L(G₁) = {aⁿbᵏaⁿ, aᵏbⁿcⁿ | k, n ≥ 1}.
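Definition 11.4.1 can be checked mechanically for a small grammar; Algorithm 11.4.1 below does this while constructing the minimal strict partition, but given a candidate partition the check is a direct transcription of the definition. Here is a minimal sketch (an illustration only, with a hypothetical encoding: uppercase letters are nonterminals, and the partition is supplied by hand), applied to G₁:

    P1 = [('S', 'aA'), ('S', 'aB'), ('A', 'aAa'), ('A', 'bC'),
          ('B', 'aB'), ('B', 'bD'), ('C', 'bC'), ('C', 'a'),
          ('D', 'bDc'), ('D', 'c')]
    BLOCKS = [set('abc'), {'S'}, {'A', 'B'}, {'C', 'D'}]

    def is_strict(P, blocks):
        blk = lambda x: next(b for b in blocks if x in b)
        for (A, w) in P:
            for (B, v) in P:
                if (A, w) == (B, v) or blk(A) is not blk(B):
                    continue
                for n in range(min(len(w), len(v)) + 1):
                    if w[:n] != v[:n]:
                        break                    # alpha is no longer a common prefix
                    beta, betap = w[n:], v[n:]
                    if beta and betap:
                        if blk(beta[0]) is not blk(betap[0]):
                            return False         # violates condition 2(i)
                    elif beta or betap or A != B:
                        return False             # violates condition 2(ii)
        return True

    assert is_strict(P1, BLOCKS)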
Example 2 Our second example is a grammar for a set of simple arithmetic expressions (enclosed in parentheses). A natural grammar for them might have productions

E → E + T | T
T → T * F | F
F → (E) | a

Unfortunately, this grammar has no strict partition. However, we can find an equivalent strict deterministic grammar G₂:

S → (E
E → T₁E | T₂
T₁ → F₁T₁ | F₂
T₂ → F₁T₂ | F₃
F₁ → (E* | a*
F₂ → (E+ | a+
F₃ → (E) | a)

The blocks of a strict partition for G₂ are Σ, {S}, {E}, {T₁, T₂}, {F₁, F₂, F₃}.

Next, we turn to the problem of testing a grammar for the strict deterministic property. First we introduce a convenient concept.

Definition Let α, β ∈ V* and let A, B be two letters in V such that A ≠ B and we have α = γAα₁ and β = γBβ₁ for some γ, α₁, β₁ ∈ V*. Then the pair (A, B) is called the distinguishing pair of α and β. A distinguishing pair (A, B) is said to be terminal if A, B ∈ Σ.

From the definition, we can make the following observation.

Fact Any two strings α, β ∈ V* have a distinguishing pair if and only if neither is a prefix of the other. If they have a distinguishing pair, then it is unique.

We shall now give our algorithm.

Algorithm 11.4.1 The algorithm takes as input any context-free grammar G and determines whether G is strict deterministic or not. In the former case, the algorithm produces the minimal* strict partition of V.

* The partition is minimal with respect to the standard lattice ordering of partitions; that is, π₁ ≤ π₂ if and only if ≡₁ ⊆ ≡₂.
We shall now give our algorithm.

Algorithm 11.4.1 The algorithm takes as input any context-free grammar G and determines whether G is strict deterministic or not. In the former case the algorithm produces the minimal† strict partition of V.

† The partition is minimal with respect to the standard lattice ordering of partitions; that is, π₁ ≤ π₂ if ≡₁ ⊆ ≡₂.

Assume that the productions of G are consecutively numbered, that is, P = {Aᵢ → αᵢ | i = 1, ..., |P|}, and that all productions are distinct.

begin
  π := {{A} | A ∈ N} ∪ {Σ};
  L: for i := 1 to |P| − 1 do
    begin
      for j := i + 1 to |P| do
        begin
          comment Aᵢ → αᵢ and Aⱼ → αⱼ are two distinct productions in P;
          if Aᵢ ≡ Aⱼ then
            if αᵢ and αⱼ have no distinguishing pair then return (fail)
            else
              begin
                let (B, C) be the unique distinguishing pair of αᵢ and αⱼ;
                if B ≢ C then
                  if B ∉ Σ and C ∉ Σ then
                    begin
                      replace [B] and [C] in π by one new block [B] ∪ [C];
                      goto L
                    end
                  else
                    begin
                      comment B ∈ Σ or C ∈ Σ, but not both, since B ≢ C;
                      return (fail)
                    end
              end
        end
    end;
  return (succeed)
end.
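A sketch of Algorithm 11.4.1 in Python follows; the data layout and names are mine, not the text's. A grammar is a list of pairs (left-hand side, right-hand side string); blocks are kept as frozensets, with all terminals sharing the single block Σ. The goto-restart is replaced by iterating to a fixpoint, which reaches the same minimal partition.

```python
def minimal_strict_partition(productions, terminals):
    nonterminals = {A for A, _ in productions}
    sigma = frozenset(terminals)
    block = {a: sigma for a in terminals}          # letter -> its current block
    for A in nonterminals:
        block[A] = frozenset({A})                  # initially singletons
    changed = True
    while changed:                                 # plays the role of 'goto L'
        changed = False
        for i, (A, alpha) in enumerate(productions):
            for B_lhs, beta in productions[i + 1:]:
                if block[A] != block[B_lhs]:
                    continue                       # condition A_i == A_j fails
                pair = next(((x, y) for x, y in zip(alpha, beta) if x != y), None)
                if pair is None:                   # no distinguishing pair:
                    return None                    # Failure I or Failure II
                B, C = pair
                if block[B] == block[C]:
                    continue                       # already equivalent
                if B in sigma or C in sigma:
                    return None                    # Failure III (N vs terminal)
                merged = block[B] | block[C]       # merge the two blocks
                for X in merged:
                    block[X] = merged
                changed = True
    return {b for b in block.values()}

# Example 1 of the text:
G1 = [("S", "aA"), ("S", "aB"), ("A", "aAa"), ("A", "bC"),
      ("B", "aB"), ("B", "bD"), ("C", "bC"), ("C", "a"),
      ("D", "bDc"), ("D", "c")]
print(minimal_strict_partition(G1, "abc"))
# -> blocks {a,b,c}, {S}, {A,B}, {C,D}, as claimed in Example 1
```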
Proof of the correctness of the algorithm The algorithm is relatively simple and we give only an outline of a proof of its correctness. At a later time, we will need the properties that may prevent a grammar from being strict deterministic.

First let us check that the algorithm always halts. There are two for loops, each of which is bounded. The variables i and j are not changed inside these loops. Branching inside the for loops is either i) forward, ii) out of the loops, or iii) a restart of the loops. The first case must terminate since the loops are bounded. The second case trivially halts. In case (iii), the cardinality of π is decreased. Once we get to |π| = 2, then, if B ≢ C, we have B ∈ Σ or C ∈ Σ (but not both). Thus the loops cannot be restarted.

Now, let us investigate how a grammar may fail to be strict deterministic. Under the hypothesis and notation of Definition 11.4.1, we note that, if A = A′ (in the initial stage of the algorithm), or if A ≡ A′ (as a result of preceding stages, in any subsequent stage), and if β, β′ ≠ Λ and ⁽¹⁾β, ⁽¹⁾β′ ∈ N, then case (i) in condition 2 of the definition can always be satisfied by forcing ⁽¹⁾β ≡ ⁽¹⁾β′ (the statement preceding the goto does the job). We do not obtain an essential failure under these circumstances. But the following three kinds of failure are essential. Assume A ≡ A′ (forced by preceding stages).

Failure I
   A → αβ₁,   A′ → α,   where β₁ ≠ Λ.
Taking β = β₁ and β′ = Λ in Definition 11.4.1, neither (i) nor (ii) can occur. Algorithm 11.4.1 directly returns "fail", since αβ₁ and α have no distinguishing pair.

Failure II
   A → α,   A′ → α,   where A ≠ A′.
Taking β = β′ = Λ, we obtain case (ii) with the contradictory requirement A = A′. Algorithm 11.4.1 halts in the same way as in the case of Failure I.

Failure III
   A → αBβ₁,   A′ → αaβ₂,   where B ∈ N and a ∈ Σ.
Taking β = Bβ₁ and β′ = aβ₂ in Definition 11.4.1, we obtain case (i) with the requirement B ≡ a, which contradicts condition 1. Algorithm 11.4.1 returns "fail" from the innermost else clause.

These are all the possible ways for a grammar to fail to be strict deterministic. If no failure occurs, G is strict deterministic; in that case Algorithm 11.4.1 cannot return "fail" and, since it always terminates, it returns "succeed". The fact that the partition produced is strict and minimal follows from the property of the algorithm that two letters are made equivalent if and only if this is required by Definition 11.4.1. □

We need to establish some simple but important properties of strict deterministic grammars, all of which are almost direct consequences of Definition 11.4.1.

Theorem 11.4.1 Any strict deterministic grammar is equivalent to a reduced strict deterministic grammar.

Proof The standard construction (cf. Section 3.2) for reducing a grammar consists only of deleting some productions. The result thus follows directly from the following claim.

Claim Let π be a strict partition for a grammar G = (V, Σ, P, S), and let G′ = (V′, Σ, P′, S) be a reduced grammar for G constructed by the methods of Section 3.2. Then G′ is also a strict deterministic grammar.
Proof of the claim In G′, we have V′ ⊆ V and P′ ⊆ P. Let π′ be the partition obtained from π by deleting all symbols of V − V′. Then the only way that π′ could fail to be strict would be to have two productions p₁ and p₂ in P′ for which a failure occurs. But then p₁, p₂ ∈ P, so the failure would occur in π also and π would not be strict. □

The following lemma extends the property of strict partitions from productions to certain derivations in the grammar.

Lemma 11.4.1 Let G = (V, Σ, P, S) be a context-free grammar with a strict partition π. Then, for any A, A′ ∈ N, α, β, β′ ∈ V*, and n ≥ 1, if A ⇒ⁿ_L αβ, A′ ⇒ⁿ_L αβ′, and A ≡ A′, then either
   i) both β, β′ ≠ Λ and ⁽¹⁾β ≡ ⁽¹⁾β′, or
   ii) β = β′ = Λ and A = A′.

Proof The argument is an induction on n. The basis is immediate since, for n = 1, the assertion of the lemma is equivalent to condition 2 in Definition 11.4.1.

Inductive step. Assume the assertion of the lemma is true for a given n ≥ 1 and consider the case of n + 1. We can write
   A ⇒ⁿ_L wBβ₁ ⇒_L wβ₂β₁ = αβ   and   A′ ⇒ⁿ_L w′B′β₁′ ⇒_L w′β₂′β₁′ = αβ′
for some B, B′ ∈ N, w, w′ ∈ Σ*, and β₁, β₁′, β₂, β₂′ ∈ V*, and assume, without loss of generality, that lg(β₂) ≤ lg(β₂′).

CASE 1. 0 ≤ lg(α) < lg(w). Then both ⁽¹⁾β, ⁽¹⁾β′ ∈ Σ and thus ⁽¹⁾β ≡ ⁽¹⁾β′.

For the remaining cases we make the following claim.

Claim In the above derivations, if w is a prefix of w′, or w′ is a prefix of w, then w = w′ and B ≡ B′.

Proof of claim Assume, without loss of generality, that w is a prefix of w′; that is, that w′ = ww″ for some w″ ∈ Σ*. We have A ⇒ⁿ_L wBβ₁ and A′ ⇒ⁿ_L ww″B′β₁′. Then, by the inductive hypothesis, ⁽¹⁾(Bβ₁) ≡ ⁽¹⁾(w″B′β₁′), and since B ≢ a for any a ∈ Σ (by the property that Σ ∈ π), we conclude that w″ = Λ, w = w′, and B ≡ B′. This completes the proof of the claim.

CASE 2. lg(w) ≤ lg(α) < lg(wβ₂). In other words, we have α = wγ₁ and β₂ = γ₁γ₂ for some γ₁, γ₂ ∈ V*, where γ₂ ≠ Λ. Here w is a prefix of w′, or conversely (because both are initial subwords of α), so we have w = w′ and B ≡ B′ by the claim. Therefore we can also write β₂′ = γ₁γ₂′ for some γ₂′ ∈ V⁺ (we have used the assumption that lg(β₂) ≤ lg(β₂′)). Now we have B → γ₁γ₂, B′ → γ₁γ₂′, and γ₂, γ₂′ ≠ Λ. By the strictness of π (property 2(i) in Definition 11.4.1), we conclude that ⁽¹⁾β = ⁽¹⁾γ₂ ≡ ⁽¹⁾γ₂′ = ⁽¹⁾β′.
CASE 3. lg(wβ₂) ≤ lg(α) ≤ lg(wβ₂β₁). Again, from the claim, we have w = w′ and B ≡ B′. Moreover, β₂ is a prefix of β₂′, or conversely. By the strictness of π (property 2(ii) in Definition 11.4.1), this is possible only if β₂ = β₂′ and B = B′. Thus we may write α = wβ₂δ, β₁ = δβ, and β₁′ = δβ′ for some δ ∈ V*; since wBβ₁ = (wBδ)β and w′B′β₁′ = (wBδ)β′, the result follows from the inductive hypothesis. □

The following concept will start to appear regularly.

Definition A set L ⊆ Σ* is prefix-free if uv ∈ L and u ∈ L imply v = Λ. That is, a set is prefix-free if no proper prefix of a string in L is also in L.

The following simple theorem is important.

Theorem 11.4.2 Any strict deterministic language is prefix-free.

Proof Let G = (V, Σ, P, S) be a strict deterministic grammar, and assume S ⇒ⁿ_L w and S ⇒ⁿ′_L wu for some w, u ∈ Σ* and n, n′ ≥ 1. If n ≤ n′, we have
   S ⇒ⁿ_L w   and   S ⇒ⁿ_L α ⇒*_L wu   (11.4.1)
for some α ∈ V*. On the other hand, if n′ < n, we have
   S ⇒ⁿ′_L α ⇒*_L w   and   S ⇒ⁿ′_L wu.   (11.4.2)
In either case, for any k, 0 ≤ k < lg(α), ⁽ᵏ⁾α = ⁽ᵏ⁾w implies ⁽ᵏ⁺¹⁾α ≡ ⁽ᵏ⁺¹⁾w, by Lemma 11.4.1. Moreover, ⁽ᵏ⁺¹⁾α = ⁽ᵏ⁺¹⁾w, since any terminal prefix of α is (by the definition of ⇒*_L) a prefix of wu in (11.4.1) or of w in (11.4.2). But this shows that α = w, and so u = Λ. □

Corollary If L is a strict deterministic language, then either L = {Λ} or Λ ∉ L.
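Prefix-freeness of a finite sample can be checked directly. A small sketch (names mine); it uses the fact that, in lexicographic order, a proper prefix sorts immediately before one of its extensions:

```python
def is_prefix_free(sample):
    """True iff no string in the sample is a proper prefix of another."""
    s = sorted(set(sample))           # prefixes sort adjacent to extensions
    return not any(b.startswith(a) and a != b for a, b in zip(s, s[1:]))

print(is_prefix_free({"ab", "ba", "bb"}))   # True
print(is_prefix_free({"ab", "abb"}))        # False: ab < abb
```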
The following result will often be applied in the sequel.

Theorem 11.4.3 Let G = (V, Σ, P, S) be a reduced strict deterministic grammar. Then, for any A, B ∈ N and α ∈ V*, A ⇒⁺ Bα implies A ≢ B.

Proof Let G be as in the theorem and let A ⇒⁺ Bα for some A, B ∈ N and α ∈ V*. Then, for some α′ ∈ V* and n ≥ 1, we have A ⇒ⁿ_L Bα′. Assume, for the sake of contradiction, that A ≡ B. Then, by Lemma 11.4.1, for any A′ ∈ [A], A′ ⇒ⁿ_L β implies ⁽¹⁾β ∈ [A], and we obtain, for arbitrarily large k ≥ 1, a derivation
   A ⇒ᵏⁿ_L βₖ,   where βₖ ∈ NV*.   (11.4.3)
By the fact that G is reduced, also
   A ⇒ᵐ_L w,   where w ∈ Σ* and m ≥ 1.   (11.4.4)
Applying Lemma 11.4.1 to (11.4.4) (and using the definition of ⇒_L), we obtain that, for any m′ ≥ m and γ ∈ V*, A ⇒ᵐ′_L γ implies γ ∈ ΣV* or γ = Λ, which contradicts (11.4.3). □

Corollary No reduced strict deterministic grammar is left-recursive (i.e., for no A ∈ N and α ∈ V* does A ⇒⁺ Aα).

We shall now prove a useful grammatical normal-form result. Other results of this type are left for the exercises. The technique used in the proof is very instructive.

Theorem 11.4.4 Any strict deterministic grammar is equivalent to a strict deterministic grammar in Greibach normal form.

Proof Let G = (V, Σ, P, S) be a context-free grammar with strict partition π. If L(G) = {Λ}, we take the grammar with the only production S → Λ and the result is immediate. Assume, now, that L(G) ≠ {Λ} and that G is reduced and Λ-free. There is no loss of generality in this assumption, by Theorem 11.4.1 and Problem 3 at the end of the section. Moreover, we may assume that G is in canonical two form, by Problem 5. We shall fix our attention upon the leftmost derivations in G; the main idea of the proof is based on Lemma 11.4.1. First, we need the following auxiliary result.

Claim 1 For any A ∈ N, there is a unique integer n_A ≥ 1 such that, for any α ∈ V*,
   A ⇒ⁿ_L α with n < n_A implies α ∈ NV*,   (11.4.5)
and
   A ⇒^{n_A}_L α implies α ∈ ΣV*.   (11.4.6)
Moreover, for any A, A′ ∈ N,
   A ≡ A′ implies n_A = n_{A′}.   (11.4.7)

Proof of the claim Let A ∈ N. By the fact that G is reduced and Λ-free, we can find an integer n ≥ 1 such that A ⇒ⁿ_L α for some α ∈ ΣV*, and with the property that, for any β ∈ V* and n′ < n, A ⇒ⁿ′_L β implies β ∈ NV* (it is enough to take a shortest leftmost derivation from A producing a string in ΣV*). Denote n_A = n. We have already established (11.4.5). To prove (11.4.6), assume that A ⇒^{n_A}_L α′ for some arbitrary α′ ∈ V*. By Lemma 11.4.1, we have immediately ⁽¹⁾α ≡ ⁽¹⁾α′ (note that α′ ≠ Λ), and hence ⁽¹⁾α′ ∈ Σ. This shows (11.4.6). Now, let A′ ∈ N, A′ ≡ A, and assume that n_{A′} satisfies (11.4.5) and (11.4.6) for A′. Assume, without loss of generality, that n_{A′} ≥ n_A. Then we have A ⇒^{n_A}_L α and A′ ⇒^{n_A}_L α′ ⇒ᵏ_L α″, where k = n_{A′} − n_A and α, α′, α″ ∈ V*. By Lemma 11.4.1, then, ⁽¹⁾α ≡ ⁽¹⁾α′ and thus α′ ∈ ΣV*. Hence n_{A′} ≤ n_A, and we conclude that n_A = n_{A′}. We have (11.4.7). The uniqueness of n_A for a given A ∈ N is a consequence of (11.4.7), taking A′ = A. This completes the proof of Claim 1.
We define the following finite relation P̂ ⊆ N × V*:
   P̂ = {(A, α) | A ∈ N, α ∈ V*, and A ⇒^{n_A}_L α}.
By Claim 1, (A, α) ∈ P̂ implies α ∈ ΣV*. Let us now define a new grammar G′ as follows: G′ = (V, Σ, P̂, S). Since P̂ is finite, G′ is well defined. Also, by (11.4.6), A → α in G′ implies α ∈ ΣV*, and therefore G′ is in Greibach normal form.

Claim 2 L(G′) = L(G).

Proof of the claim By definition, A →_{G′} α implies A ⇒⁺_G α, and therefore any G′-derivation is a G-derivation. Thus, for any w ∈ Σ*, S ⇒*_{G′} w implies S ⇒*_G w. Hence L(G′) ⊆ L(G). For the converse, we note that any derivation of the form
   S ⇒*_{L,G} w,   (11.4.8)
where w ∈ Σ*, can be written as a composition of one or more derivations of the form
   uAβ ⇒^{n_A}_{L,G} uαβ,   (11.4.9)
where u ∈ Σ*, A ∈ N, α ∈ ΣV*, and β ∈ V*. Here (A, α) ∈ P̂, and thus (11.4.9) can be written as uAβ ⇒_{G′} uαβ. Thus (11.4.8) implies S ⇒*_{G′} w. Hence L(G) ⊆ L(G′), and the claim is proved.

Claim 3 G′ is strict deterministic.

Proof of the claim We shall show that the partition π which is strict for G is also strict for G′. But this is simple, since the statement of Lemma 11.4.1 can be converted to the second condition of strictness (in Definition 11.4.1) by taking n = n_A = n_{A′} and replacing ⇒ⁿ_L by →_{G′}. This proves Claim 3. □

PROBLEMS

1 Show that any strict deterministic language is unambiguous.

2 Show that there are strict deterministic grammars that do not have maximal strict partitions. Thus the family of strict partitions on a grammar forms a semilattice but not a lattice.

3 We know, from the Corollary to Theorem 11.4.2, that a strict deterministic language L has the property that either L = {Λ} or Λ ∉ L. Prove the following result.
   Theorem Let G be a strict deterministic grammar. Then either L(G) = {Λ} or there exists an equivalent Λ-free strict deterministic grammar G′.
You may find it more convenient not to use the standard construction.

4 Prove the following result. (Cf. Section 4.5.)
   Theorem Every strict deterministic grammar G is equivalent to an invertible strict deterministic grammar G′.
5 Prove the following result. (Cf. Section 4.4.)
   Theorem Any strict deterministic grammar is equivalent to a strict deterministic grammar in canonical two form.

6 Can you extend Problem 5 to use Chomsky normal form as opposed to canonical two form?

7 Show that there is an algorithm to decide whether or not a given deterministic context-free language is prefix-free.

11.5 DETERMINISTIC PUSHDOWN AUTOMATA AND THEIR RELATION TO STRICT DETERMINISTIC LANGUAGES

In this section, we reconsider dpda's and note some different ways in which they may accept languages. This will give us some characterizations of strict deterministic languages.

Definition Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda and, for a given K ⊆ Γ*, define the language T(A, K) ⊆ Σ* as follows:
   T(A, K) = {w ∈ Σ* | (q₀, w, Z₀) ⊢* (q, Λ, α) for some q ∈ F and α ∈ K}.
In particular, let T₀(A) = T(A, Γ*), T₁(A) = T(A, Γ), and T₂(A) = T(A, {Λ}).

We shall use these definitions to define three families of languages.

Definition We define the following three families of languages, for i = 0, 1, 2:
   Δᵢ = {Tᵢ(A) | A is a dpda}.

Δ₀ is our family of deterministic context-free languages. Note that Δ₂ does not even contain all regular events but only the prefix-free ones. The family Δ₁ contains all regular languages. It can be shown that Δ₁ is a subfamily of the closure of Δ₂ under the Kleene operations.

First some simple facts about these families will be established.

Theorem 11.5.1
1. Every language in Δ₂ is prefix-free.
2. Δ₂ ⊆ Δ₁ ⊆ Δ₀.
3. All inclusions in (2) are proper.

Proof Part (1) is an immediate consequence of the definition of T₂(A), since if
   (q₀, x, Z₀) ⊢* (q, Λ, Λ),
then
   (q₀, xy, Z₀) ⊢* (q, y, Λ) ⊢* (q′, Λ, Λ)
is possible only if y = Λ (and q′ = q).

To prove Δ₂ ⊆ Δ₁, let L ∈ Δ₂ and let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda such that L = T₂(A). We shall construct another dpda A′ with the same behavior as A, except that A′ will have an additional initial pushdown letter Z₀′, which is never overwritten. Formally, let
   A′ = (Q ∪ {q₀′}, Σ, Γ ∪ {Z₀′}, δ′, q₀′, Z₀′, F),
where q₀′ ∉ Q, Z₀′ ∉ Γ, and
   δ′(q, a, Z) = δ(q, a, Z)  if q ∈ Q, a ∈ Σ ∪ {Λ}, and Z ∈ Γ;
   δ′(q, a, Z) = (q₀, Z₀′Z₀)  if q = q₀′, a = Λ, and Z = Z₀′;
   δ′(q, a, Z) is undefined otherwise.
Now, for any x ∈ Σ*, first (q₀′, x, Z₀′) ⊢_{A′} (q₀, x, Z₀′Z₀); and afterwards, for any q ∈ F,
   (q₀, x, Z₀′Z₀) ⊢*_{A′} (q, Λ, Z₀′)  if and only if  (q₀, x, Z₀) ⊢*_A (q, Λ, Λ).
Since q ∈ F implies that q ≠ q₀′, we have δ′(q, Λ, Z₀′) = ∅, and therefore x ∈ T₂(A) if and only if x ∈ T₁(A′). Hence L = T₁(A′).

Next, we wish to show Δ₁ ⊆ Δ₀. It should be clear to the reader why a construction is necessary. Let L ∈ Δ₁ and let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda such that L = T₁(A). We shall construct another dpda A′ which possesses a special copy of each letter in Γ, to be able to distinguish the bottom of the store. A′ simulates A but records acceptance only when a single letter is on the store. Formally, let
   A′ = (Q′, Σ, Γ′, δ′, q₀, Z̄₀, F̄),
where
   F̄ = {q̄ | q ∈ F},   Q′ = Q ∪ F̄,   Γ̄ = {Z̄ | Z ∈ Γ},   Γ′ = Γ ∪ Γ̄.
The barred stack letters mark the bottom symbol of the store; the barred states are the final states of A′. δ′ is defined by cases.

CASE 1. For each q ∈ Q, Z ∈ Γ, and a ∈ Σ ∪ {Λ},
   δ′(q, a, Z) = δ(q, a, Z).

CASE 2. For each q ∈ F, Z ∈ Γ, and a ∈ Σ ∪ {Λ}: if δ(q, a, Z) = (q′, Yα) with Y ∈ Γ, then δ′(q̄, a, Z̄) = (q′, Ȳα); and if δ(q, a, Z) = (q′, Λ), then δ′(q̄, a, Z̄) = (q′, Λ).

CASE 3. For each q ∈ Q − F: if δ(q, a, Z) = (q′, Yα) with Y ∈ Γ, then
   δ′(q, a, Z̄) = (q′, Ȳα).
CASE 4. For each q ∈ Q − F: if δ(q, a, Z) = (q′, Λ), then
   δ′(q, a, Z̄) = (q′, Λ).

CASE 5. If q ∈ F, then
   δ′(q, Λ, Z̄) = (q̄, Z̄).

It is easy to see that, for every x ∈ Σ*,
   (q₀, x, Z̄₀) ⊢*_{A′} (q, Λ, Ȳα)  if and only if  (q₀, x, Z₀) ⊢*_A (q, Λ, Yα).
Thus x ∈ T₁(A) if and only if such a computation exists with α = Λ and q ∈ F. In turn, this holds if and only if (q, Λ, Ȳ) ⊢_{A′} (q̄, Λ, Ȳ), which holds if and only if x ∈ T₀(A′). Therefore L = T₀(A′).

To show that Δ₂ ≠ Δ₁, let L₁ = {a}*. L₁ is not prefix-free, and hence L₁ is not in Δ₂. It is clear that L₁ = T₁(A), where A = ({q₀}, {a}, {Z₀}, δ, q₀, Z₀, {q₀}) and δ(q₀, a, Z₀) = (q₀, Z₀).

To show Δ₁ ≠ Δ₀, let L₂ = {aⁿbⁿ | n ≥ 1} ∪ a*. First, L₂ = T₀(A), where A = (Q, Σ, Γ, δ, q₀, Z₀, F) is a dpda with Q = {q₀, q₁}, Σ = {a, b}, Γ = {Z₀, Z₁, Z₂}, F = {q₀}, and
   δ(q₀, a, Z) = (q₀, Z₁)   if Z = Z₀;
   δ(q₀, a, Z) = (q₀, ZZ₂)  if Z ≠ Z₀;
and, for any q ∈ Q,
   δ(q, b, Z) = (q₀, Λ)   if Z = Z₁;
   δ(q, b, Z) = (q₁, Λ)   if Z = Z₂;
   δ(q, b, Z) undefined   if Z = Z₀.

Second, assume that L₂ = T₁(A) for some dpda A. Since there are only a finite number of accepting configurations (q, Λ, Z), where q ∈ F and Z ∈ Γ, there is a sufficiently large n such that, for some k < n,
   (q₀, aⁿ, Z₀) ⊢* (q, Λ, Z)   and   (q₀, aᵏ, Z₀) ⊢* (q, Λ, Z),
where q ∈ F and Z ∈ Γ. But, since, for some q′ ∈ F and Z′ ∈ Γ,
   (q₀, aⁿbⁿ, Z₀) ⊢* (q, bⁿ, Z) ⊢* (q′, Λ, Z′),
we also have
   (q₀, aᵏbⁿ, Z₀) ⊢* (q, bⁿ, Z) ⊢* (q′, Λ, Z′),
and hence aᵏbⁿ ∈ L₂, which is a contradiction since k < n. Therefore L₂ ∉ Δ₁. □

Our next result is quite simple but informative.

Theorem 11.5.2 L ∈ Δ₂ if and only if L is prefix-free and L ∈ Δ₀. Thus Δ₂ coincides with the family of prefix-free deterministic languages.

Proof The only if direction is immediate from Theorem 11.5.1. For the if direction, let L ∈ Δ₀ and let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda such that L = T₀(A). We can construct another dpda A′ which simulates A and erases all of its store after A accepts. If L is prefix-free, then L = T₂(A′). (If Λ ∈ L, then prefix-freeness gives L = {Λ} and the result is trivial; so we may assume q₀ ∉ F.) Formally, let
   A′ = (Q ∪ {q_f}, Σ, Γ ∪ {Y₀}, δ′, q₀, Y₀, {q_f}),
where q_f ∉ Q and Y₀ ∉ Γ. δ′ is defined as follows:
1. For each q ∈ Q − F and Z ∈ Γ, δ′(q, a, Z) = δ(q, a, Z).
2. For each q ∈ F ∪ {q_f} and Z ∈ Γ ∪ {Y₀}, δ′(q, Λ, Z) = (q_f, Λ).
3. δ′(q₀, Λ, Y₀) = (q₀, Y₀Z₀).

Assume that L is prefix-free. If (q₀, xy, Z₀) ⊢* (q, y, α) ⊢* (q′, Λ, α′), then q′ ∈ F implies q ∉ F or y = Λ. Therefore, for any x ∈ Σ* and q ∈ F,
   (q₀, x, Z₀) ⊢*_A (q, Λ, α)  if and only if  (q₀, x, Y₀Z₀) ⊢*_{A′} (q, Λ, Y₀α) ⊢*_{A′} (q_f, Λ, Λ),
and thus x ∈ T₀(A) if and only if x ∈ T₂(A′). Hence L ∈ Δ₂. □

We now begin the long argument to show that every language in Δ₂ is strict deterministic. Our first lemma puts a dpda into a more convenient form.

Lemma 11.5.1 Let L ∈ Δ₂. Then L = T₂(Ā) for some dpda Ā with a single final state.

Proof Assume L = T₂(A) for some dpda A = (Q, Σ, Γ, δ, q₀, Z₀, F). Using a construction similar to the one in the proof that Δ₂ ⊆ Δ₁ in Theorem 11.5.1, we construct another dpda Ā which can distinguish the bottom of its store. Otherwise, it simulates A, except that when A erases its store and accepts, Ā does the same by entering the single final state. Formally, let
   Ā = (Q, Σ, Γ′, δ′, q₀, Z̄₀, {q_f}),
where Γ′ = Γ ∪ Γ̄, Γ̄ = {Z̄ | Z ∈ Γ}, and q_f ∈ F is chosen arbitrarily. Again the barred letters mark the bottom of the store. δ′ is defined by cases. For each q ∈ Q, a ∈ Σ ∪ {Λ}, and Z ∈ Γ:

CASE 1. δ′(q, a, Z) = δ(q, a, Z);

CASE 2. δ′(q, a, Z̄) = (q′, Z̄′γ)   if δ(q, a, Z) = (q′, Z′γ);
        δ′(q, a, Z̄) = (q_f, Λ)    if δ(q, a, Z) = (q′, Λ) and q′ ∈ F;
        δ′(q, a, Z̄) undefined     otherwise.

Now, for any w ∈ Σ*, w ∈ T₂(Ā) if and only if
   (q₀, w, Z̄₀) ⊢* (q, a, Z̄) ⊢ (q_f, Λ, Λ)
for some a ∈ Σ ∪ {Λ}, Z ∈ Γ, and q ∈ Q; if and only if
   (q₀, w, Z₀) ⊢* (q, a, Z) ⊢ (q′, Λ, Λ)
for some q′ ∈ F; if and only if w ∈ T₂(A). □

The next phase in our proof is to pass from A to a grammar which generates the same set. We shall employ the standard construction used in Section 5.4. The reader should review that construction. We shall refer to the canonical grammar of A, denoted by G_A, as defined there.

Lemma 11.5.2 For any dpda A with a single final state, the canonical grammar G_A is strict deterministic.

Proof Let G_A be the canonical grammar as given in Theorem 5.4.3 for a given dpda A (with a single final state). Define an equivalence relation ≡ on V such that A ≡ B if and only if A, B ∈ Σ, or
   A = (q, Z, q′),   B = (q, Z, q″)
for some q, q′, q″ ∈ Q and Z ∈ Γ. Let π = V/≡. Obviously, Σ ∈ π. Let
   A = (q, Z, q′) → aγ = αβ,
   A′ = (q, Z, q″) → a′γ′ = αβ′,
where a, a′ ∈ Σ ∪ {Λ} and γ, γ′ ∈ (V − Σ)*. Let us make the following three observations.

Observation 1. Either a = a′ = Λ or both a, a′ ∈ Σ.

Observation 2. If a = a′, then either γ = γ′ = Λ and q′ = q″, or we have
   γ  = (p, Z₁, q₁)(q₁, Z₂, q₂) ⋯ (q_{k−1}, Z_k, q′),
   γ′ = (p, Z₁, q₁′)(q₁′, Z₂, q₂′) ⋯ (q′_{k−1}, Z_k, q″)
for some p, qᵢ, qᵢ′ ∈ Q, Zᵢ ∈ Γ, and k ≥ 1. In either case, lg(γ) = lg(γ′). (This observation is a consequence of the construction and of the fact that δ is single-valued.)

Observation 3. Either β = β′ = Λ or β, β′ ≠ Λ. Assume that β = Λ but β′ ≠ Λ. Then aγ = αβ = α and a′γ′ = αβ′ = aγβ′. Thus a′ = a and γ′ = γβ′; but then, by Observation 2, lg(γ) = lg(γ′) = lg(γ) + lg(β′), which contradicts β′ ≠ Λ.

To prove the strictness of π, assume first that β ≠ Λ. Then, by the last observation, β′ ≠ Λ.

CASE 1. α = Λ and a ∈ Σ. Then a′ ∈ Σ by Observation 1, and both ⁽¹⁾β, ⁽¹⁾β′ ∈ Σ. Hence ⁽¹⁾β ≡ ⁽¹⁾β′.

CASE 2. α = a ∈ Σ ∪ {Λ}. Then a′ = a = α and, by Observation 2 (note that γ ≠ Λ), we have ⁽¹⁾β = (p, Z₁, q₁) ≡ (p, Z₁, q₁′) = ⁽¹⁾β′.

CASE 3. α ∈ (Σ ∪ {Λ})N⁺. Then again a = a′ and, by Observation 2, α⁽¹⁾ = (q_{i−1}, Zᵢ, qᵢ) = (q′_{i−1}, Zᵢ, qᵢ′) for some i, 1 ≤ i < k. Thus qᵢ = qᵢ′ and ⁽¹⁾β = (qᵢ, Z_{i+1}, q_{i+1}) ≡ (qᵢ, Z_{i+1}, q′_{i+1}) = ⁽¹⁾β′.

Now, assume that β = Λ and thus β′ = Λ. Then a = a′ and γ = γ′. By Observation 2, q′ = q″. Hence A = A′. □

We already know that L(G_A) = T₂(A) from the proof of Theorem 5.4.3. We now have half of our characterization theorem.

Theorem 11.5.3 Δ₂ is a subfamily of the family of all strict deterministic languages.

Proof Let L ∈ Δ₂. Then L = T₂(A) for some dpda A. By Lemma 11.5.1, L = T₂(Ā) for some dpda Ā with a single final state. Now the canonical grammar G_Ā is strict deterministic by Lemma 11.5.2, and L = L(G_Ā). Thus L is a strict deterministic language. □

Now we turn to a proof of the converse. Although the construction we are about to give could be simplified in some ways, it has the advantage that its state set is in a natural correspondence with the strict partition of the grammar. This correspondence will be exploited later. First we introduce certain notational conventions which will simplify our formalism.

Definition For any strict partition π on a given context-free grammar G = (V, Σ, P, S), define the size of π as
   ‖π‖ = max{|Vᵢ| : Vᵢ ∈ π − {Σ}}.
Thus ‖π‖ is the cardinality of the largest non-Σ block of π.
For our proof, let G = (V, Σ, P, S) be a grammar with the strict partition
   π = {Σ, V₀, V₁, ..., V_m},   (11.5.1)
where m ≥ 0 and V₀ = [S]. We use specially indexed symbols A_{ij} for the nonterminals of G, so that, for all i (0 ≤ i ≤ m), we have
   Vᵢ = {A_{i0}, A_{i1}, ..., A_{i nᵢ}},   (11.5.2)
where nᵢ = |Vᵢ| − 1. (Note that maxᵢ nᵢ = ‖π‖ − 1.) Moreover, let A₀₀ = S.

Definition Let G = (V, Σ, P, S) be a grammar with strict partition π, for which we use the notation of (11.5.1) and (11.5.2). We define the canonical dpda A_G for G as follows:
   A_G = (Q, Σ, Γ, δ, q₀, Z₀, {q₀}),   (11.5.3)
where
   Q = {q_j | 0 ≤ j ≤ ‖π‖ − 1},
   Γ = Γ₁ ∪ Γ₂,
   Γ₁ = {(Vᵢ, α) | A → αβ is in P for some A ∈ Vᵢ and α, β ∈ V*},
   Γ₂ = {(Vᵢ, α, Vⱼ) | A → αBβ is in P for some A ∈ Vᵢ, B ∈ Vⱼ, and α, β ∈ V*},
   Z₀ = (V₀, Λ) = ([S], Λ),
and δ is defined by means of four types of moves, as follows. For any Vᵢ, V_k ∈ π − {Σ}, α ∈ V*, a ∈ Σ, and q_j ∈ Q, we have:

TYPE 1. δ(q₀, Λ, (Vᵢ, α)) = (q₀, (Vᵢ, α, V_k)(V_k, Λ))
        if A → αBβ is in P for some A ∈ Vᵢ, B ∈ V_k, and β ∈ V*;

TYPE 2. δ(q₀, a, (Vᵢ, α)) = (q₀, (Vᵢ, αa))
        if A → αaβ is in P for some A ∈ Vᵢ and β ∈ V*;

TYPE 3. δ(q₀, Λ, (Vᵢ, α)) = (q_j, Λ)   if A_{ij} → α is in P;

TYPE 4. δ(q_j, Λ, (V_k, α, Vᵢ)) = (q₀, (V_k, αA_{ij})).

Otherwise δ is not defined. Moves of Types 1, 2, and 3 are called detection moves; a move of Type 4 is called a reduction move.
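To make the construction concrete, here is a small sketch (the data layout and helper names are mine, not the text's) that simulates A_G for the grammar G₁ of Example 1 in Section 11.4 and tests T₂-acceptance. It exploits the determinism of A_G, verified in Lemma 11.5.3 below: in each configuration at most one move type applies.

```python
PROD = [("S", "aA"), ("S", "aB"), ("A", "aAa"), ("A", "bC"),
        ("B", "aB"), ("B", "bD"), ("C", "bC"), ("C", "a"),
        ("D", "bDc"), ("D", "c")]
BLOCKS = [["S"], ["A", "B"], ["C", "D"]]           # V0, V1, V2 with V0 = [S]
TERMINALS = set("abc")
BLK = {X: i for i, b in enumerate(BLOCKS) for X in b}   # nonterminal -> block
IDX = {X: j for b in BLOCKS for j, X in enumerate(b)}   # nonterminal -> index j

def accepts(w):
    """True iff w is in T2(A_G): input consumed, state q0, empty store."""
    state, stack, pos = 0, [(0, "")], 0            # Z0 = (V0, Lambda)
    while stack:
        top = stack[-1]
        if len(top) == 3:                          # Gamma_2 on top: Type 4
            k, alpha, i = top
            stack[-1] = (k, alpha + BLOCKS[i][state])
            state = 0
            continue
        i, alpha = top                             # Gamma_1 on top
        cont = [r[len(alpha):] for l, r in PROD
                if BLK[l] == i and r.startswith(alpha)]
        if any(c == "" for c in cont):             # Type 3: some A_ij -> alpha
            (lhs,) = {l for l, r in PROD if BLK[l] == i and r == alpha}
            state = IDX[lhs]
            stack.pop()
        elif cont and cont[0][0] not in TERMINALS: # Type 1: predict block [B]
            stack[-1] = (i, alpha, BLK[cont[0][0]])
            stack.append((BLK[cont[0][0]], ""))
        elif pos < len(w) and any(c[0] == w[pos] for c in cont):
            stack[-1] = (i, alpha + w[pos])        # Type 2: shift a terminal
            pos += 1
        else:
            return False                           # A_G is stuck
    return state == 0 and pos == len(w)

for w in ["aba", "aabaa", "abc", "abcc", "ab"]:
    print(w, accepts(w))   # True, True, True, False, False
```

The final state q₀ doubles as the working state; this is consistent because Types 1 to 3 apply only to Γ₁ tops and Type 4 only to Γ₂ tops.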
First we have to verify that our definition is correct.

Lemma 11.5.3 For any strict deterministic grammar G, the object A_G is a well-defined dpda.

Proof We have to show that δ is single-valued and deterministic. Assume that we have the following two moves:
   δ(q, a, Z) = (p, α)   and   δ(q, a′, Z) = (p′, α′).   (11.5.4)
We shall show that if the two moves are distinct then a ≠ a′ (hence the single-valuedness) and, moreover, a, a′ ∈ Σ.

CASE 1. Both moves in (11.5.4) are of the same type, in the terminology of the definition of A_G. Then either a = a′ = Λ or a, a′ ∈ Σ. Suppose a = a′. Then we obtain (p, α) = (p′, α′) immediately for Types 2 and 4, or by using the strictness of π for Types 1 and 3. Hence a ≠ a′ and a, a′ ∈ Σ.

CASE 2. The two moves in (11.5.4) are of different types. Then neither of them is of Type 4, since that is the only type with Z ∈ Γ₂. Assume therefore that q = q₀ and Z = (Vᵢ, α). Now moves of Types 1 and 2 are incompatible, since B ≡ a is not possible for B ∈ N and a ∈ Σ; and neither Type 1 nor Type 2 is compatible with Type 3, since A ≡ A′ with A → αβ and A′ → α in P implies β = Λ by the strictness of π. Hence Case 2 cannot occur. □

The following result is intuitively obvious, but the formal proof is rather lengthy.

Lemma 11.5.4 Let A_G be the canonical dpda for a strict deterministic grammar G. Then T₂(A_G) = L(G).

Proof Let us partition the yield relation ⊢ in accordance with the four possible types of moves in the definition of A_G:
   (q, aw, γZ) ⊢ᵢ (q′, w, γα)   (11.5.5)
if and only if δ(q, a, Z) = (q′, α) is a move of Type i, i = 1, 2, 3, 4. Thus ⊢ = ⋃ᵢ₌₁⁴ ⊢ᵢ. Moreover, let us define
   ⊨ = (⊢₁ ∪ ⊢₂)* ⊢₃.   (11.5.6)
(Intuitively, if we interpret A_G as a parsing algorithm, ⊨ is the detection phase and ⊢₄ the reduction phase of the algorithm.) Let 𝒞 be the set of configurations of A_G, and define two sets 𝒞_d, 𝒞_r ⊆ 𝒞 as follows:
   𝒞_d = Q × Σ* × Γ₂*Γ₁,   𝒞_r = Q × Σ* × Γ₂*.
Using the definition of A_G, we can interpret the relations ⊢ᵢ as partial functions (⊢ is single-valued), as follows:
   ⊢₁, ⊢₂ : 𝒞_d → 𝒞_d,   ⊢₃ : 𝒞_d → 𝒞_r,   ⊢₄ : 𝒞_r → 𝒞_d.   (11.5.7)
Now (11.5.6) and (11.5.7) imply:
Fact If c₁ ∈ 𝒞_d and c₂ ∈ 𝒞_r, then
   c₁ ⊢* c₂ implies c₁ (⊨ ⊢₄)* ⊨ c₂.   (11.5.8)

Further, we will need the following claim about the contents of the store of A_G.

Claim 1 For any (q, w, γ), (q′, w′, γ′) ∈ 𝒞_r (that is, γ, γ′ ∈ Γ₂*), if (q, w, γ) ⊢⁺ (q′, w′, γ′), then γ is not a prefix of γ′.

Proof of the claim The argument is an induction on the length of γ. (For convenience, we use the notation γ ⊢⁺ γ′ as an abbreviation for "(q, w, γ) ⊢⁺ (q′, w′, γ′) for some q, q′ ∈ Q and w, w′ ∈ Σ*.")

Basis. The case γ = Λ is vacuously true, since ⊢ is, in this case, undefined.

Inductive step. Assume the claim proved for all γ₁ shorter than some γ ∈ Γ₂⁺. Now let γ = γ₁(V, α, V′) ⊢⁺ γ′ for some (V, α, V′) ∈ Γ₂. Then, by the definition of A_G,
   γ₁(V, α, V′) ⊢ γ₁(V, αA) ⊢* γ′
for some A ∈ V′. Now the letter (V, αA) cannot be changed back to (V, α, V′) without being erased, i.e., without an application of a move of Type 3. Hence, a necessary condition for γ being a prefix of γ′ is
   γ₁(V, α, V′) ⊢⁺ γ₁ ⊢⁺ γ′.
But γ₁ ⊢⁺ γ′ and, by the induction hypothesis, γ₁ is not a prefix of γ′; since γ₁ is a prefix of γ, γ cannot be a prefix of γ′ either. This completes the proof of Claim 1.

The following two claims are crucial in our proof.

Claim 2 Let w ∈ Σ* and i, j ≥ 0. If A_{ij} ⇒ⁿ_L w for some n ≥ 1, then, for any y ∈ Σ* and γ ∈ Γ₂*,
   (q₀, wy, γ(Vᵢ, Λ)) (⊨ ⊢₄)* ⊨ (q_j, y, γ).

Let c₁ = (q₀, wy, γ(Vᵢ, Λ)) and c₂ = (q_j, y, γ). Assume A_{ij} ⇒ⁿ_L w. The argument is an induction on n ≥ 1.

Basis. n = 1. Then A_{ij} → w is in P, and if w ≠ Λ then, for every factorization w = w₁aw₂ (w₁, w₂ ∈ Σ* and a ∈ Σ), we have
   (q₀, aw₂y, γ(Vᵢ, w₁)) ⊢₂ (q₀, w₂y, γ(Vᵢ, w₁a));
and in any event we subsequently have
   (q₀, y, γ(Vᵢ, w)) ⊢₃ (q_j, y, γ) = c₂.
Thus c₁ ⊢₂* ⊢₃ c₂ or, using (11.5.6), c₁ ⊨ c₂.
Inductive step. Let n > 1 and assume the result proved for all n′ < n. Let α ∈ V* be such that
   A_{ij} → α ⇒^{n−1}_L w.
For any prefix α₁ of α, we have one of the following three cases.

CASE 1. α₁ is followed by a ∈ Σ; that is, α = α₁aα₂. Then, for some suffix w₂ of w, we have α₂ ⇒*_L w₂ and
   (q₀, aw₂y, γ(Vᵢ, α₁)) ⊢₂ (q₀, w₂y, γ(Vᵢ, α₁a)).

CASE 2. α₁ is followed by B ∈ N; that is, α = α₁Bα₂. Then, for some suffix w_Bw₂ of w, we have B ⇒^{n′}_L w_B (n′ < n), α₂ ⇒*_L w₂, and
   (q₀, w_Bw₂y, γ(Vᵢ, α₁)) ⊢₁ (q₀, w_Bw₂y, γ(Vᵢ, α₁, [B])([B], Λ))
      (⊨ ⊢₄)* ⊨ (q_k, w₂y, γ(Vᵢ, α₁, [B]))   (by the inductive hypothesis applied to B ⇒^{n′}_L w_B, where B = A_{ℓk} and [B] = V_ℓ)
      ⊢₄ (q₀, w₂y, γ(Vᵢ, α₁B)).

CASE 3. α₁ = α. Then
   (q₀, y, γ(Vᵢ, α)) ⊢₃ (q_j, y, γ) = c₂.

Now, starting with configuration c₁, that is, with α₁ = Λ, we can combine Cases 1 and 2, depending on the form of α ∈ (Σ ∪ N)*, until we finish with Case 3, in which α₁ = α. Therefore,
   c₁ (⊢₂ ∪ ⊢₁(⊨ ⊢₄)* ⊨ ⊢₄)* ⊢₂* ⊢₃ c₂;
hence, using the identity (ρ ∪ σ)* = (ρ*σ)*ρ* (cf. Problem 2),
   c₁ (⊢₂* ⊢₁(⊨ ⊢₄)* ⊨ ⊢₄)* ⊢₂* ⊢₃ c₂;
hence, using (11.5.6) and the identity ρ*ρ = ρρ*,
   c₁ ((⊨ ⊢₄)⁺)* ⊨ c₂;
hence,
   c₁ (⊨ ⊢₄)* ⊨ c₂,
using the identity (ρρ*)* = ρ** = ρ*. This proves Claim 2.

Our last claim is essentially the converse of Claim 2.

Claim 3 Let w, y ∈ Σ*, γ ∈ Γ₂*, and i, j ≥ 0. If (q₀, wy, γ(Vᵢ, Λ)) ⊢⁺ (q_j, y, γ), then A_{ij} ⇒⁺_L w.

Again, let c₁ = (q₀, wy, γ(Vᵢ, Λ)), c₂ = (q_j, y, γ), and let c₁ ⊢⁺ c₂. By (11.5.8), c₁ (⊨ ⊢₄)ⁿ ⊨ c₂ for some n ≥ 0. Then, by the definition of A_G and (11.5.6),
   c₁ (⊨ ⊢₄)ⁿ (⊢₁ ∪ ⊢₂)* c₃ ⊢₃ c₂,   (11.5.9)
where c₃ = (q₀, y, γ(V_k, α)) for some V_k and α ∈ V*. First, we show that V_k = Vᵢ. Indeed, if V_k ≠ Vᵢ, we would have (using the notation from the proof of Claim 1)
   γ(Vᵢ, Λ) ⊢* γ(Vᵢ, β) ⊢* γ ⊢* γ(V_k, Λ) ⊢* γ(V_k, α) ⊢ γ
for some β ∈ V*, since this is the only way of replacing Vᵢ by V_k in agreement with the definition of A_G. But here γ ⊢⁺ γ is contrary to Claim 1. Therefore Vᵢ = V_k and, from c₃ ⊢₃ c₂, we have A_{ij} → α.

We proceed by induction on n in (11.5.9).

Basis. n = 0. Then c₁ (⊢₁ ∪ ⊢₂)* c₃. Moreover, no move of Type 1 can occur (since that would increase the length of the store). Thus c₁ ⊢₂* c₃ ⊢₃ c₂. This is possible only if all letters in α are terminal, or if c₁ = c₃, which occurs if α = Λ. Hence α ∈ Σ*, α = w, and A_{ij} → w.

Inductive step. Let n ≥ 1 and assume the result proved for all n′ < n. From (11.5.9), the definition of A_G, and A_{ij} → α, we conclude that, for every prefix α₁ of α, there is a unique configuration c = (q₀, w₂y, γ(Vᵢ, α₁)) such that c₁ ⊢* c ⊢* c₂ (the uniqueness follows from Claim 1). Here w₂ is a suffix of w. Let w = w₁w₂, where w₁ ∈ Σ*. First we show that α₁ ⇒*_L w₁, by induction on the length of α₁. This is immediate if α₁ = Λ ⇒*_L Λ = w₁. Assume that α₁ ⇒*_L w₁ has already been proved, and let α = α₁Bα₂ (B ∈ V, α₂ ∈ V*). Let us consider the two configurations
   c = (q₀, w₂y, γ(Vᵢ, α₁)),   c′ = (q₀, w₂′y, γ(Vᵢ, α₁B)).
There are two cases.

CASE 1. B = a ∈ Σ. Then, by the definition of A_G, c ⊢₂ c′, w₂ = aw₂′, and α₁a ⇒*_L w₁a is trivial.

CASE 2. B ∈ N. Then we have
   c ⊢₁ (q₀, w₂y, γ(Vᵢ, α₁, [B])([B], Λ)) ⊢* (q_k, w₂′y, γ(Vᵢ, α₁, [B])) ⊢₄ c′,
assuming that B = A_{ℓk} with [B] = V_ℓ. From (11.5.8), we see that the relation ⊢* here can be replaced by (⊨ ⊢₄)ⁿ′ ⊨ for some n′ < n. Moreover, w₂′ is a suffix of w₂, say w₂ = w_Bw₂′. Then, by the inductive hypothesis, B ⇒⁺_L w_B, and therefore α₁B ⇒⁺_L w₁w_B. This ends the inductive proof that α₁ ⇒*_L w₁.

Now, for α₁ = α, we have A_{ij} → α ⇒*_L w′, where w′ is a prefix of w. We shall show that w′ = w. For, assume w = w′w″. By Claim 2, then,
   c₁ = (q₀, w′w″y, γ(Vᵢ, Λ)) ⊢⁺ (q_j, w″y, γ).
But then, to avoid γ ⊢⁺ γ, which would contradict Claim 1, we must have (q_j, w″y, γ) = c₂, and hence w″ = Λ. We conclude A_{ij} ⇒⁺_L w, as asserted. This finishes the proof of Claim 3.
Now, to conclude the proof of the lemma, we combine Claims 2 and 3 and substitute γ = Λ, i = j = 0 (note that S = A₀₀ by convention). We obtain that, for any w ∈ Σ*,
   S ⇒⁺_L w   if and only if   (q₀, w, (V₀, Λ)) ⊢⁺ (q₀, Λ, Λ),
and therefore w ∈ L(G) if and only if w ∈ T₂(A_G). □

We can immediately apply the previous lemma.

Theorem 11.5.4 The family of all strict deterministic languages is a subfamily of Δ₂.

Proof The theorem is a direct consequence of the definition of the canonical dpda and of Lemmas 11.5.3 and 11.5.4. □

Now, combining Theorems 11.5.2, 11.5.3, and 11.5.4, we obtain one of our main results.

Theorem 11.5.5 The family of strict deterministic languages coincides with the family of prefix-free deterministic languages.

TABLE 11.1 Closure Properties of Δ₂, Δ₁, and Δ₀

                                            Δ₂    Δ₁    Δ₀
Operations with regular sets
  Product               LR                  No    No    Yes
  Intersection          L ∩ R               Yes   Yes   Yes
  Quotient              LR⁻¹                No    No    Yes
Boolean operations
  Union                 L₁ ∪ L₂             No    No    No
  Intersection          L₁ ∩ L₂             No    No    No
  Complement            L̄                   No    No    Yes
Kleene operations
  Star                  L*                  No    No    No
  Product               L₁L₂                Yes   No    No
Marked operations
  Union                 ¢₁L₁ ∪ ¢₂L₂         Yes   Yes   Yes
  Product               L₁¢L₂               Yes   Yes   Yes
Other operations
  Min                   min(L)              Yes   Yes   Yes
  Max                   max(L)              Yes   Yes   Yes
  Reversal              Lᴿ                  No    No    No
  Homomorphism          h(L)                No    No    No
  Inverse gsm                               No    Yes   Yes
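The prefix-free half of Theorem 11.5.5 can be observed experimentally. The following self-contained sketch (helper names are mine) enumerates all short strings of L(G₁), the strict deterministic grammar of Example 1, by breadth-first leftmost derivation and verifies that the sample is prefix-free:

```python
from collections import deque

PROD = [("S", "aA"), ("S", "aB"), ("A", "aAa"), ("A", "bC"),
        ("B", "aB"), ("B", "bD"), ("C", "bC"), ("C", "a"),
        ("D", "bDc"), ("D", "c")]
NONTERMS = set("SABCD")

def sample(max_len):
    """All strings of L(G1) of length <= max_len (every rule of G1 adds
    at least one terminal, so the search space below is finite)."""
    found, queue, seen = set(), deque(["S"]), {"S"}
    while queue:
        form = queue.popleft()
        i = next((j for j, s in enumerate(form) if s in NONTERMS), None)
        if i is None:
            found.add(form)
            continue
        for lhs, rhs in PROD:                       # leftmost expansion
            if lhs == form[i]:
                new = form[:i] + rhs + form[i + 1:]
                if sum(c not in NONTERMS for c in new) <= max_len \
                        and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return found

words = sorted(sample(6), key=lambda w: (len(w), w))
print(words[:6])        # ['aba', 'abc', 'aabc', 'abba', 'aaabc', 'aabaa']
s = sorted(words)
assert not any(b.startswith(a) for a, b in zip(s, s[1:]))
print("sample of L(G1) is prefix-free")   # as Theorem 11.4.2 guarantees
```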
PROBLEMS

1 Consider the following equivalence relation.
   Definition Let L ⊆ Σ* be a deterministic context-free language. We define the relative right congruence relation induced by L, R_L, as follows: for x, y ∈ L, (x, y) ∈ R_L if and only if, for all z ∈ Σ*, xz ∈ L if and only if yz ∈ L.
Prove that if L ∈ Δ₁ then R_L has finite rank. Can you prove the converse?

2 Prove the following identities for sets over Σ* or for binary relations over a set:
   a) ρ*ρ = ρρ*
   b) (ρ ∪ σ)* = (ρ*σ)*ρ*
   c) (ρρ*)* = ρ** = ρ*

*3–47 Table 11.1 summarizes the closure properties of Δ₀, Δ₁, and Δ₂. Prove all entries that have not yet been established in the book.

11.6* A HIERARCHY OF STRICT DETERMINISTIC LANGUAGES

In the last section, we defined ‖π‖, where π is a strict partition. We shall relate this to the number of states of a dpda accepting the language generated by such a grammar.

Theorem 11.6.1 Let L be any language and let n ≥ 1. Then L = L(G) for some strict deterministic grammar with partition π such that ‖π‖ = n if and only if L = T₂(A) for some dpda A with n states.

Proof The result is immediate in the only if direction from Theorem 11.5.4 and from the fact that the canonical automaton has, by definition, ‖π‖ states. For the if direction, we first note that the construction of the dpda Ā with |F| = 1 in the proof of Lemma 11.5.1 does not change the number of states. The rest then follows from the proof of Theorem 11.5.3; in particular, the partition π used for proving the strict determinism of the canonical grammar has the property that ‖π‖ = |Q|. □

Let us recall from Section 11.4 that every strict deterministic grammar G has a unique partition π₀ which is the minimal element in the semilattice of all strict partitions of G. Also, ‖π₀‖ ≤ ‖π‖ for any other strict partition π of G. This suggests the following definition.

Definition Let G be a strict deterministic grammar. We define the degree of G as the number deg(G) = ‖π₀‖, where π₀ is the minimal strict partition for G. For any language L ∈ Δ₂, define its degree as
   deg(L) = min{deg(G) | G is strict deterministic and L(G) = L}.

Our next objective will be to show that the strict deterministic languages form a nontrivial hierarchy with respect to their degree or, equivalently, with respect to the
minimal number of states of the corresponding dpda. This result can be intuitively explained by pointing out that the main reason for having more states in a dpda is that a greater amount of information from the top part of the store has to be preserved while letters beneath it are read. This consideration leads us to an example which will serve as a basis for the formal proof of the result.

Lemma 11.6.1 Let Σ = {a, b}. Define, for any n ≥ 1, the language
   Lₙ = {aᵐbᵏaᵐbᵏ | 1 ≤ m, 1 ≤ k ≤ n}.
Then Lₙ = T₂(Aₙ) for some dpda Aₙ with n states.

Proof Let Aₙ = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda where Q = {q₀, ..., q_{n−1}}, Γ = {Z₀, 0, 1, 2, 3}, F = {q₀}, and δ is defined as follows:
   δ(q₀, a, Z₀) = (q₀, 02),
   δ(q₀, a, 2) = (q₀, 12),
   δ(q₀, b, 2) = (q₀, 3),
   δ(qᵢ, b, 3) = (q_{i+1}, 3)   for 0 ≤ i < n − 1,
   δ(qᵢ, a, 3) = (qᵢ, Λ)       for 0 ≤ i ≤ n − 1,
   δ(qᵢ, a, 1) = (qᵢ, Λ)       for 0 ≤ i ≤ n − 1,
   δ(qᵢ, b, 0) = (q_{i−1}, 0)  for 0 < i ≤ n − 1,
   δ(q₀, b, 0) = (q₀, Λ).
To show that T₂(Aₙ) = Lₙ, it is enough to observe that a computation leads to acceptance of a string w ∈ Σ* if and only if it has the following form, for some m ≥ 1 and 1 ≤ k ≤ n:
   (q₀, w, Z₀) ⊢ (q₀, a⁻¹w, 02)
      ⊢^{m−1} (q₀, (aᵐ)⁻¹w, 01^{m−1}2)
      ⊢ (q₀, (aᵐb)⁻¹w, 01^{m−1}3)
      ⊢^{k−1} (q_{k−1}, (aᵐbᵏ)⁻¹w, 01^{m−1}3)
      ⊢ (q_{k−1}, (aᵐbᵏa)⁻¹w, 01^{m−1})
      ⊢^{m−1} (q_{k−1}, (aᵐbᵏaᵐ)⁻¹w, 0)
      ⊢ᵏ (q₀, (aᵐbᵏaᵐbᵏ)⁻¹w, Λ).
The proof is complete. □
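A sketch (the encoding is mine) simulating Aₙ and checking T₂(Aₙ) against the definition of Lₙ on short strings; note that Aₙ has no Λ-moves at all, so it is in fact realtime:

```python
def make_An(n):
    d = {(0, "a", "Z"): (0, "02"), (0, "a", "2"): (0, "12"),
         (0, "b", "2"): (0, "3"), (0, "b", "0"): (0, "")}
    for i in range(n):
        d[(i, "a", "3")] = (i, "")       # pop 3 on the first a of phase 3
        d[(i, "a", "1")] = (i, "")       # pop the 1's
    for i in range(n - 1):
        d[(i, "b", "3")] = (i + 1, "3")  # count b's up in the states
        d[(i + 1, "b", "0")] = (i, "0")  # count b's down on the last run
    return d

def accepts(d, w):
    """T2-acceptance: final state q0 and empty store after reading w."""
    q, stack = 0, ["Z"]                  # the top is the last element
    for c in w:
        if not stack or (q, c, stack[-1]) not in d:
            return False
        q, push = d[(q, c, stack[-1])]
        stack = stack[:-1] + list(push)
    return q == 0 and not stack

d3 = make_An(3)
tests = {"abab": True, "aabbaabb": True, "aabab": False,
         "abbbabbb": True, "abbbbabbbb": False}    # the last has k = 4 > n
for w, want in tests.items():
    assert accepts(d3, w) == want
print("all A_3 tests pass")
```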
Next we shall prove that n states are necessary for the acceptance of Lₙ (as defined in Lemma 11.6.1). The formal proof is by no means trivial, since we have to argue about all conceivable dpda's. First we need a lemma.

Lemma 11.6.2 Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a loop-free dpda and let a ∈ Σ. Then either:
a) there exists n₀ ≥ 1 such that, for any q ∈ Q, γ ∈ Γ*, and m ≥ 1, if (q₀, aᵐ, Z₀) ⊢* (q, Λ, γ), then lg(γ) ≤ n₀ (that is, the size of the store is bounded); or
b) there exist m₀, f ≥ 1, q ∈ Q, γ₀ ∈ Γ*, η ∈ Γ⁺, and Z ∈ Γ such that, for every h ≥ 0,
   (q₀, a^{m₀+hf}, Z₀) ⊢* (q, Λ, γ₀ηʰZ),   (11.6.1)
and, for any h ≥ 1, k ≥ 0, and γ ∈ Γ*,
   (q, aᵏ, γ₀ηʰZ) ⊢* (q′, Λ, γ) implies γ = γ₀η^{h−1}γ′   (11.6.2)
for some γ′ ∈ Γ⁺ (i.e., the store is essentially ultimately periodic in some word η).

Proof Let us assume that (a) is not satisfied. A useful claim is established first.

Claim 1 For each r ≥ 1 there exist nᵣ ≥ 1 with n_{r+1} > nᵣ, Zᵣ ∈ Γ, qᵣ ∈ Q, and γᵣ ∈ Γ* with lg(γᵣ) ≥ r, such that
   (q₀, a^{nᵣ}, Z₀) ⊢* (qᵣ, Λ, γᵣZᵣ)   (11.6.3)
and, for each s ≥ 0,
   (qᵣ, aˢ, Zᵣ) ⊢* (pₛ, Λ, θₛ)   (11.6.4)
for some pₛ ∈ Q and θₛ ∈ Γ*.

Proof Suppose that Claim 1 were false. Then there exists r ≥ 1 such that, for each s, there exists a computation sequence
   (q_{s0}, u_{s0}, γ_{s0}) ⊢ ⋯ ⊢ (q_{sk_s}, u_{sk_s}, γ_{sk_s}),   (11.6.5)
where q_{s0} = q₀, u_{s0} = a^{m_s}, γ_{s0} = Z₀, u_{sk_s} = Λ, and lg(γ_{sk_s}) ≤ r. Note that, for each s, there exists such an integer m_s, and so there are infinitely many ID's (q_{sk_s}, Λ, γ_{sk_s}). But Q × Γ^{≤r} is finite, so there exist s ≠ t such that
   (q_{sk_s}, γ_{sk_s}) = (q_{tk_t}, γ_{tk_t}).   (11.6.6)

CASE 1. Suppose m_s = m_t. There is no loss of generality in assuming that k_s < k_t. (If it were not so, we could switch notation.) But A is deterministic, so for each 0 ≤ i ≤ k_s,
   q_{si} = q_{ti},   u_{si} = u_{ti},   γ_{si} = γ_{ti}.   (11.6.7)
Thus
   (q_{sk_s}, Λ, γ_{sk_s}) ⊢⁺ (q_{tk_t}, Λ, γ_{tk_t}).   (11.6.8)
But from (11.6.6) and (11.6.7),
   (q_{sk_s}, γ_{sk_s}) = (q_{tk_t}, γ_{tk_t}) = (q_{tk_s}, γ_{tk_s}).   (11.6.9)
Thus (11.6.8) represents an infinite Λ-loop, contradicting the loop-freeness of A. This proves that m_s ≠ m_t.

CASE 2. m_s ≠ m_t. There is no loss of generality in assuming that m_s < m_t. Since A is deterministic, k_s ≤ k_t and, for each 0 ≤ i ≤ k_s,
   q_{si} = q_{ti},   u_{si}a^{m_t−m_s} = u_{ti},   γ_{si} = γ_{ti}.
But we have
   (q_{s0}, a^{m_s}, Z₀) ⊢* (q_{sk_s}, Λ, γ_{sk_s}),
   (q_{t0}, a^{m_t}, Z₀) ⊢* (q_{tk_s}, a^{m_t−m_s}, γ_{tk_s}) ⊢* (q_{tk_t}, Λ, γ_{tk_t}).
Thus, from (11.6.9), reading a^{m_t−m_s} returns A from the ID (q_{tk_s}, γ_{tk_s}) to the very same ID, and this cycle repeats for arbitrarily long inputs aᵐ. Now let n₀ = max{lg(γ_{ti}) | 1 ≤ i ≤ k_t}. Thus we have a uniform bound on the store, and (a) is satisfied. This contradiction proves Claim 1.

Now let nᵣ, qᵣ, and Zᵣ (r ≥ 1) be as in Claim 1. Since Q × Γ is finite, there exist i < j such that Zᵢ = Zⱼ and qᵢ = qⱼ. Since A is deterministic,
   (q₀, a^{nᵢ}, Z₀) ⊢* (qᵢ, Λ, γᵢZᵢ)   (11.6.10)
and
   (q₀, a^{nⱼ}, Z₀) ⊢* (qᵢ, Λ, γᵢηZᵢ)
for some η ∈ Γ*. Moreover, η ∈ Γ⁺, since otherwise we would be in case (a). Then
   (qᵢ, a^{nⱼ−nᵢ}, Zᵢ) ⊢* (qᵢ, Λ, ηZᵢ).   (11.6.11)
Now choose m₀ = nᵢ and f = nⱼ − nᵢ. For each h ≥ 0, we have
   (q₀, a^{m₀+hf}, Z₀) ⊢* (qᵢ, a^{hf}, γᵢZᵢ)   from (11.6.10)
      ⊢* (qᵢ, Λ, γᵢηʰZᵢ)   from (11.6.11).
Note that (11.6.11) also implies that, if h ≥ 1 and k ≥ 0, and (qᵢ, aᵏ, γᵢηʰZᵢ) ⊢* (q′, Λ, γ), then γ = γᵢη^{h−1}γ′ where γ′ ≠ Λ. Therefore (b) holds and the proof is complete. □

Now we shall start the long argument which shows that any dpda which accepts Lₙ (as T₂(A)) has at least n states.

Lemma 11.6.3 Let Lₙ be the language from Lemma 11.6.1 for some n ≥ 1, and let A be a dpda such that T₂(A) = Lₙ. Then A has at least n states.
Proof Let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda satisfying the assumptions of the lemma, and assume that |Q| < n. We assume that n ≥ 2, since for n = 1 the lemma is trivial. Our strategy will be to prove a sequence of claims about A which will eventually lead to a contradiction. We shall investigate an accepting computation of the form
   (q₀, aᵐbᵏaᵐbᵏ, Z₀) ⊢* (q_f, Λ, Λ),   (11.6.12)
where 1 ≤ m, 1 ≤ k ≤ n, and q_f ∈ F. Informally, we distinguish four phases of the computation in (11.6.12) in a natural way: in each phase, one block of the same letter (aᵐ or bᵏ) is read. A typical history of the store is illustrated in Fig. 11.1.

[Figure 11.1: store height plotted against the input aᵐbᵏaᵐbᵏ during the four phases of the accepting computation; a dip during the fourth phase is marked "can't occur, by Claim 3". Figure not reproduced.]

First we show that the content of the store after the first phase is periodic.

Claim 1 There exist m₀, f ≥ 1, q₁ ∈ Q, γ₀ ∈ Γ*, η ∈ Γ⁺, and Z₁ ∈ Γ such that, for every h ≥ 0,
   (q₀, a^{m₀+hf}, Z₀) ⊢* (q₁, Λ, γ₀ηʰZ₁).   (11.6.13)

Proof of the claim To be able to apply Lemma 11.6.2, we need the loop-free property. Consider therefore the dpda A_a = (Q, {a}, Γ, δ_a, q₀, Z₀, F) obtained from A by restriction of the input alphabet to {a}; that is, for all q ∈ Q and Z ∈ Γ,
   δ_a(q, a′, Z) = δ(q, a′, Z)  if a′ ∈ {a, Λ};   undefined  if a′ = b.
Clearly, for any m ≥ 0, q ∈ Q, and γ ∈ Γ*,
   (q₀, aᵐ, Z₀) ⊢*_{A_a} (q, Λ, γ)  if and only if  (q₀, aᵐ, Z₀) ⊢*_A (q, Λ, γ).   (11.6.14)
For any m ≥ 1, since aᵐ is a prefix of an acceptable string (for example, aᵐbaᵐb), there exist q ∈ Q and γ ∈ Γ* such that (q₀, aᵐ, Z₀) ⊢*_A (q, Λ, γ). Hence, by (11.6.14), A_a is loop-free. We now apply Lemma 11.6.2 to A_a. For distinct m the configurations (q, Λ, γ) in (11.6.14) must be distinct; otherwise we would have aᵐ′bᵏaᵐbᵏ ∈ T₂(A) for some m′ ≠ m. Also, since Q and Γ are finite but m may be arbitrary, there is no bound on the length of γ. Thus we cannot have case (a) of Lemma 11.6.2, and we obtain Claim 1 as a consequence of (11.6.1).
For the rest of the proof we fix the symbols m₀, f, γ₀, η, and Z₁ with the same meaning as in Claim 1. Our next claim is that during the second phase (during the reading of bᵏ) only a bounded number of η's is erased from the store.

Claim 2 Let 1 ≤ k ≤ n and h ≥ 0. Then
   (q₁, bᵏ, γ₀ηʰZ₁) ⊢* (q₂, Λ, γ₀η^{h−s}γ′)
for some q₂ ∈ Q, γ′ ∈ Γ*, and s ≤ n².

Proof of the claim Let q₂ ∈ Q and γ″ ∈ Γ* be such that
   c₁ = (q₁, bᵏ, γ₀ηʰZ₁) ⊢* (q₂, Λ, γ″).   (11.6.15)
Such q₂ and γ″ always exist, since otherwise A could not accept a string with the prefix a^{m₀+hf}bᵏ (cf. (11.6.13)). It is enough to show that γ₀η^{h−s} is a prefix of γ″ for some s ≤ n². Assume, for the sake of contradiction, that this is not the case; that is, that γ₀η^{h−n²} is not a prefix of γ″. However, it is a prefix of γ₀ηʰZ₁ in c₁, and therefore there exists a configuration c₁′ such that (11.6.15) can be written in the form
   c₁ ⊢* c₁′ ⊢* (q₂, Λ, γ″),
where the content of the store in c₁′ is exactly γ₀η^{h−n²} (cf. Fig. 11.2). We shall concentrate on those transitions in c₁ ⊢* c₁′ which are associated with erasing a letter from the original content of the store (i.e., a letter written before the configuration c₁ was entered). Let us call these transitions proper erasing. Since at most one pushdown letter can be erased at a time, there are at least lg(η^{n²}Z₁) = n²·lg(η) + 1 proper erasing transitions in c₁ ⊢* c₁′. At most k of the erasing transitions correspond to non-Λ-moves of A, since only k input letters are

[Figure 11.2: store height plotted against time during the second phase, showing the configurations c₁, c₂, c₂′, c₁′, and the prefix γ₀η^{h−n²} of the store. Figure not reproduced.]
read in (11.6.15). Consider the sequence of consecutive transitions corresponding to Λ-moves (we call them Λ-transitions) in c₁ ⊢* c₁′ which contains the maximal number N_max of proper erasing transitions. Since k ≤ n and lg(η) ≥ 1, and using the fact that c₁ can be followed only by a non-Λ-transition (a consequence of (11.6.13)), we obtain the inequality N_max ≥ n·lg(η). Now any suffix of γ₀ηʰZ₁ of length N_max contains at least n − 1 repetitions of η or, using the assumption that |Q| < n, it contains at least |Q| repetitions of η. Consequently, there exist two configurations c₂ and c₂′ such that c₁ ⊢* c₂ ⊢* c₂′ ⊢* c₁′, where c₂ ⊢* c₂′ consists entirely of Λ-transitions and exactly |Q| repetitions of η are erased from the store during c₂ ⊢* c₂′ (cf. Fig. 11.2). Formally, we have, for some p, p′ ∈ Q, 1 ≤ j ≤ k, and i ≥ |Q|,
   c₁ ⊢* c₂ = (p, bʲ, γ₀η^{h−i}η^{|Q|}) ⊢ᵗ (p′, bʲ, γ₀η^{h−i}) = c₂′,   (11.6.16)
where t ≥ |Q|·lg(η). Therefore at least two distinct configurations occur in c₂ ⊢* c₂′ such that both have the same state and the store of each ends in a full occurrence of η at the same phase of the period. In other words, for some p″ ∈ Q and 1 ≤ r ≤ |Q|,
   c₂ ⊢* c₃ = (p″, bʲ, γ₀η^{h−i}η^{r′}) ⊢⁺ (p″, bʲ, γ₀η^{h−i}η^{r′−r}) = c₃′ ⊢* c₂′.   (11.6.17)
But this introduces a loop, and (11.6.17) must repeat until almost all of the periodic part of the store is erased:
   c₃′ ⊢* (p″, bʲ, γ₀η^{r₀}) = c₄,   (11.6.18)
where r₀ < r. Now let h′ = h + r. Then
   (q₀, a^{m₀+h′f}bᵏ, Z₀) ⊢* (q₁, bᵏ, γ₀η^{h′}Z₁)   by Claim 1,
      ⊢* (p″, bʲ, γ₀η^{h′−i}η^{r′})   by (11.6.16) and (11.6.17),
      ⊢* (p″, bʲ, γ₀η^{r₀})   by (11.6.17) and (11.6.18), using h′ = h + r.
Let m = m₀ + hf and m′ = m₀ + h′f ≠ m. We have
   (q₀, aᵐbᵏaᵐbᵏ, Z₀) ⊢* (p″, bʲaᵐbᵏ, γ₀η^{r₀}) ⊢* (q_f, Λ, Λ)
on the one hand, and
   (q₀, a^{m′}bᵏaᵐbᵏ, Z₀) ⊢* (p″, bʲaᵐbᵏ, γ₀η^{r₀}) ⊢* (q_f, Λ, Λ)
on the other hand, contradicting that m′ ≠ m and a^{m′}bᵏaᵐbᵏ ∉ Lₙ = T₂(A). Thus γ₀η^{h−s} is a prefix of γ″ for some s ≤ n², which concludes the proof of Claim 2.

Thus, for any h ≥ n² and m = m₀ + hf, and for any k, 1 ≤ k ≤ n, we have a common prefix ξ = γ₀η^{h−n²} of any possible content of the pushdown store after the first two phases of the accepting computation:
   (q₀, aᵐbᵏ, Z₀) ⊢* (q₂, Λ, ξγ̄),   (11.6.19)
where q₂ ∈ Q and γ̄ ∈ Γ*. Here γ̄ is dependent only on k (cf. Fig. 11.2). We shall turn our attention to the last two phases of the computation (11.6.12).
Claim 3 Let 1 ≤ k ≤ n and m ≥ 1. There exist m_k ≤ m and q^{(k)} ∈ Q such that
   (q₂, a^{m_k}, ξγ̄) ⊢* (q^{(k)}, Λ, ξ),   (11.6.20)
where q₂, ξ, γ̄ are the same as in (11.6.19).

Proof of the claim We have (q₂, aᵐbᵏ, ξγ̄) ⊢* (q_f, Λ, Λ), since aᵐbᵏaᵐbᵏ ∈ T₂(A). Therefore, for some prefix w of aᵐbᵏ, we also have (q₂, w, ξγ̄) ⊢* (q^{(k)}, Λ, ξ) for some q^{(k)} ∈ Q. It is enough to show that lg(w) ≤ m. Suppose the converse; that is, w = aᵐbʲ for some 1 ≤ j ≤ k. Then we have
   (q^{(k)}, b^{k−j}, ξ) ⊢* (q_f, Λ, Λ).   (11.6.21)
Here ξ = γ₀η^{h−n²} may be an arbitrarily long string, since h is arbitrarily large. An argument similar to the proof of Claim 2 leads us to the conclusion that (11.6.21) contains a subsequence of Λ-transitions erasing more than |Q|·(k − j) occurrences of η in ξ, and therefore erasing almost all of ξ; but, since ξ was not changed during the second and third phases, we would have, as in Claim 2, that, for some m′ ≠ m,
   (q₀, a^{m′}bᵏaᵐbᵏ, Z₀) ⊢* (q^{(k)}, b^{k−j}, ξ′) ⊢* (q_f, Λ, Λ)
(where ξ′ differs from ξ only in the number of repetitions of η; we omit the details, which are analogous to the arguments in the proof of Claim 2). This contradicts T₂(A) = Lₙ, and therefore w = a^{m_k} for some m_k ≤ m. This completes the proof of Claim 3.

The proof of Lemma 11.6.3 can now be completed without difficulty. Let m ≥ 1, 1 ≤ k ≤ n, and 1 ≤ k′ ≤ n. Let q^{(k)}, q^{(k′)} ∈ Q and m_k, m_{k′} be as in Claim 3 (for k and k′, respectively). Using (11.6.19) and (11.6.20), we can write
   (q₀, aᵐbᵏaᵐbᵏ, Z₀) ⊢* (q^{(k)}, a^{m−m_k}bᵏ, ξ) = c ⊢* (q_f, Λ, Λ)
and, for m′ = m + m_{k′} − m_k,
   (q₀, aᵐbᵏ′a^{m′}bᵏ, Z₀) ⊢* (q^{(k′)}, a^{m−m_k}bᵏ, ξ) = c′.
By the assumption that |Q| < n, k and k′ can be chosen in such a way that k ≠ k′ but q^{(k)} = q^{(k′)}. But then we have c = c′ ⊢* (q_f, Λ, Λ), and thus aᵐbᵏ′a^{m′}bᵏ ∈ T₂(A), which is our final contradiction and establishes that A cannot have fewer than n states. □

Now we can state the main theorem of this section.

Theorem 11.6.2 For any n ≥ 1 there is a language L ∈ Δ₂ such that deg(L) = n.

Proof Let n ≥ 1 and consider the language Lₙ from Lemma 11.6.1. By Lemma 11.6.1 and Theorem 11.6.1, Lₙ = L(G) for a grammar G with strict partition π, ‖π‖ = n. Since ‖π₀‖ ≤ ‖π‖, we have deg(Lₙ) ≤ n. Assume deg(Lₙ) < n. By Theorem 11.6.1, there would exist a dpda for Lₙ with fewer than n states, contradicting Lemma 11.6.3. □

Theorem 11.6.2 establishes a hierarchy of strict deterministic languages under their degree.
PROBLEMS

1 Check which of the transformations of strict deterministic grammars preserve degree.

2 (Open problem) Is the degree of a strict deterministic language effectively calculable?

11.7 DETERMINISTIC LANGUAGES AND REALTIME COMPUTATION

Our goal in this section is to investigate timing questions for dpda's. We already know, from Problem 5.6.10, that any deterministic context-free language may be accepted in linear time by a dpda. This tells us that a Turing machine could also accomplish this in linear time. We shall show, in this section, that there is a (strict) deterministic language L that no realtime multitape Turing machine can accept.

Notation. Let
   L = {a^{i₁}b a^{i₂}b ⋯ a^{i_{r−1}}b a^{i_r} cˢ a^{i_{r−s+1}} | r ≥ 1, iⱼ ≥ 1 for 1 ≤ j ≤ r, 1 ≤ s ≤ r}.

It is clear that L ∈ Δ₂ ⊆ Δ₁ ⊆ Δ₀. We shall now show that L cannot be accepted in realtime. To do this, we consider a certain equivalence relation.

Definition Let L ⊆ Σ* and k ≥ 0 be given. For any x, y ∈ Σ*, define the relation E_k(L) by the condition (x, y) ∈ E_k(L) if and only if, for each z ∈ Σ* with lg(z) ≤ k, we have xz ∈ L if and only if yz ∈ L.

Clearly E_k(L) is an equivalence relation on Σ*. Next we turn to the theorem which is the basis for this type of proof.

Theorem 11.7.1 Let A = (Q, Σ, Γ, δ, q₀, F) be a realtime Turing machine with m work tapes, let L = T(A), and let k ≥ 0. Then
   rk(E_k(L)) ≤ |Q||Γ|^{(2k+1)m}.

Proof Let x, y ∈ Σ* and suppose that (x, y) ∉ E_k(L). If we start A on x and on y, then let α_x and α_y be the last ID's of A's computations. Clearly α_x ≠ α_y. [Otherwise, for any z with lg(z) ≤ k, xz ∈ L if and only if yz ∈ L, which would contradict (x, y) ∉ E_k(L).] As x and y range over Σ* without limit, the number of ID's α_x and α_y grows without bound. Since A is realtime and lg(z) ≤ k, we must calculate the number of ID's that A can distinguish in k steps. This bound is
   |Q||Γ|^{(2k+1)m},
because an ID can contain any state, and in k steps one can only move k positions on a work tape. Since this can involve k positions in either direction, we find that the number of relevant contents of a work tape is |Γ|^{2k+1}, and there are m work tapes. To complete the argument, we note that if rk(E_k(L)) = p were greater than the bound, then there would be p strings which were pairwise inequivalent mod E_k(L). These must lead to p distinct ID's, which is more than A can distinguish in k steps, and this leads to the contradiction that some two of them would have to be equivalent mod E_k(L). □
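A small numeric illustration of how this bound is used in the next theorem; the machine parameters below are hypothetical, chosen only to show the crossover. With r = k, the number of required classes kʳ eventually exceeds the number |Q||Γ|^{(2(k+r)+1)m} of ID's distinguishable in k + r steps:

```python
Q, G, m = 4, 2, 1                       # assumed |Q|, |Gamma|, work tapes
for k in (2, 8, 32):
    r = k                               # let both parameters grow together
    classes = k ** r                    # lower bound on rk(E_{k+r}(L))
    ids = Q * G ** ((2 * (k + r) + 1) * m)
    print(k, classes > ids)             # 2 False, 8 False, 32 True
```

Since the exponent of the bound is only linear in k + r while kʳ is not, the comparison must eventually come out True for any fixed machine, which is exactly the contradiction exploited below.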
Now we are ready to prove the desired result.

Theorem 11.7.2 L, as defined above, cannot be accepted by any realtime multitape Turing machine.

Proof Suppose, for the sake of contradiction, that L = T(A) for some realtime multitape Turing machine A with m tapes. Define
   L_k = {a^{i₁}b ⋯ a^{i_{r−1}}b a^{i_r} cˢ a^{i_{r−s+1}} | r ≥ 1, 1 ≤ iⱼ ≤ k for 1 ≤ j ≤ r, 1 ≤ s ≤ r}.
Thus we bound the number of consecutive a's by k. Note that L_k ⊆ L. Next we consider the equivalence relation E_{k+r}(L).

Claim rk(E_{k+r}(L)) ≥ kʳ.

Proof of the claim Consider two distinct prefixes of words in L_k,
   x₁ = a^{i₁}b ⋯ a^{i_{r−1}}b a^{i_r}   and   x₂ = a^{j₁}b ⋯ a^{j_{r−1}}b a^{j_r},
where x₁ ≠ x₂ and 1 ≤ i_p, j_q ≤ k for 1 ≤ p, q ≤ r. Clearly, there exists a string z = cˢaᵗ, where s + t ≤ k + r, such that x₁z ∈ L_k but x₂z ∉ L_k. Thus (x₁, x₂) ∉ E_{k+r}(L), so there are at least as many classes of E_{k+r}(L) as there are distinct prefixes of this form. Clearly, there are k values for each of i₁, ..., i_r, and each may be chosen independently. Therefore rk(E_{k+r}(L)) ≥ kʳ, and the claim has been proven.

Now, to complete the proof, we employ Theorem 11.7.1. We know that
   kʳ ≤ rk(E_{k+r}(L)) ≤ |Q||Γ|^{(2(k+r)+1)m}.   (11.7.1)
In this inequality, k and r may grow but everything else is fixed. Thus (11.7.1) reduces to
   kʳ ≤ c^{k+r}   (11.7.2)
for some constant c. As k and r grow, (11.7.2) is obviously violated for any constant c, and we have a contradiction. □

Corollary There exists a strict deterministic language which is not a realtime language.

Next we define some families of deterministic languages, recalling that quasi-realtime pda's have a bounded number of consecutive Λ-moves.

Definition A language L is called Δᵢ-(quasi-)realtime if L = Tᵢ(A) for some (quasi-)realtime dpda A, i = 0, 1, 2.

With this definition, we can state another corollary of Theorem 11.7.2.
Corollary The family of Δᵢ-realtime languages is properly contained in the family Δᵢ, for i = 0, 1, 2.

Proof Clearly, L as defined at the beginning of this section has the property that L ∈ Δ₂ ⊆ Δ₁ ⊆ Δ₀. But it has been shown that L cannot be accepted by any realtime, on-line multitape Turing machine. Thus L is not a Δᵢ-realtime language for i = 0, 1, 2. □

We begin to relate quasi-realtime and realtime languages. We shall do so for Δ₂ now.

Theorem 11.7.3 A language L is Δ₂-quasi-realtime if and only if it is Δ₂-realtime or L = {Λ}.

Proof The if direction is a direct consequence of the definition and of the fact that {Λ} = T₂(A) for the dpda A with the single move δ(q₀, Λ, Z₀) = (q_f, Λ), where q_f ∈ F. This dpda is quasi-realtime (t = 1).

For the only if direction, let A = (Q, Σ, Γ, δ, q₀, Z₀, F) be a dpda and assume A is quasi-realtime with time constant t. One can define an equivalent realtime dpda A′ in such a way that (i) the pushdown letters of A′ are codes for n-tuples (n = 2t + 1) of pushdown letters of A, and (ii) one move of A′ has the same effect as a sequence of at most n moves of A. A′ must simulate the writing and erasing moves of A in its finite-state control and must transfer information to and from its pushdown store in blocks of size n. We shall omit the formal construction and proof and leave the details for Problem 2 at the end of this section. □

It turns out to be valuable to have a grammatical characterization of the Δ₂-realtime languages.

Definition Let G = (V, Σ, P, S) be a strict deterministic grammar with minimal strict partition π. G is called a realtime strict deterministic (or simply realtime) grammar if it is Λ-free and the following condition is satisfied for all A, A′, B, B′ ∈ N and α, β ∈ V*:
   If A → αB and A′ → αB′β are in P, then A ≡ A′ (mod π) implies β = Λ.   (11.7.3)

Note that if G is a realtime strict deterministic grammar, then, for any A, A′ ∈ N, if A → αB is in P, A′ → αB′β is in P, and A ≡ A′, then β = Λ and either A = A′ and B = B′ (so that there is but one rule involved) or B ≡ B′. It is interesting to note that the requirement that π be minimal is necessary in this definition. Consider the following example:
   S → aAS | b
   A → aB
   B → aAB | aCA | b
   C → aD | c
   D → aAD | aCC
and let
   π₀ = {Σ, {S}, {A, C}, {B, D}}   and   π = {Σ, {S}, {A, B, C, D}}.
G is a realtime grammar. If the minimality condition were dropped from the definition, G would not be realtime with respect to π but would be with respect to π₀. It is clear from the definition that the reduced form of any realtime grammar is also realtime.
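Condition (11.7.3) can be checked mechanically. A sketch for the example grammar follows (the representation is mine, and the strict partitions are hard-coded rather than computed by Algorithm 11.4.1):

```python
PROD = [("S", "aAS"), ("S", "b"), ("A", "aB"), ("B", "aAB"),
        ("B", "aCA"), ("B", "b"), ("C", "aD"), ("C", "c"),
        ("D", "aAD"), ("D", "aCC")]
NONTERMS = set("SABCD")

def is_realtime(prods, blocks):
    """Check (11.7.3) for a Lambda-free grammar against a given partition."""
    blk = {x: i for i, b in enumerate(blocks) for x in b}
    for a_lhs, r1 in prods:                    # candidate rule A -> alpha B
        if not (r1 and r1[-1] in NONTERMS):
            continue
        alpha = r1[:-1]
        for b_lhs, r2 in prods:                # candidate A' -> alpha B' beta
            if (blk.get(a_lhs) == blk.get(b_lhs)
                    and r2.startswith(alpha)
                    and len(r2) > len(alpha)
                    and r2[len(alpha)] in NONTERMS
                    and len(r2) > len(alpha) + 1):
                return False                   # beta is nonempty: not realtime
    return True

print(is_realtime(PROD, [["S"], ["A", "C"], ["B", "D"]]))   # True  (pi_0)
print(is_realtime(PROD, [["S"], ["A", "B", "C", "D"]]))     # False (pi)
```

The two calls reproduce the point just made: the grammar satisfies (11.7.3) with respect to the minimal partition π₀ but not with respect to the coarser π.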
Theorem 11.7.4 A language is a Δ₂-realtime language if and only if it is generated by some reduced realtime grammar.

Proof (Only if) First note that, since acceptance is by final state and empty pushdown, one can restrict attention to a dpda with one final state, and there is no loss of time: the dpda Ā constructed in Lemma 11.5.1 is realtime if and only if A is realtime. By Lemma 11.5.2, G_Ā is strict deterministic under the partition π defined in that proof, and it is easily seen that π is minimal. Now let us recall that G_Ā has productions of the two types defined in the proof of Theorem 5.4.3. By the realtime property of Ā, in both cases we have a ≠ Λ. Thus G_Ā is Λ-free and, moreover, in Greibach normal form. To prove (11.7.3), let A → αB and A′ → αB′β be in P and assume A = (q, Z, q′) ≡ (q, Z, q″) = A′. G_Ā is in Greibach normal form; hence α ≠ Λ and ⁽¹⁾α ∈ Σ. By the determinism of Ā and the definition of G_Ā, there is a unique γ ∈ Γ* (as well as a unique next state) such that δ(q, ⁽¹⁾α, Z) = (q′′′, γ). Consequently, lg(αB) = lg(γ) + 1 = lg(αB′β). Hence β = Λ.

(If) Let L = L(G) for some reduced realtime grammar G. Our strategy will be to modify the canonical dpda A_G in such a way that the resulting equivalent dpda A′_G is quasi-realtime. This is sufficient for a proof that L is a Δ₂-realtime language, since the Λ-free property of G implies that L ≠ {Λ}, so Theorem 11.7.3 applies. To make it easier to follow the proof, we shall first examine the properties of A_G related to the occurrence of Λ-moves. We shall use the same notation as in the proof of Lemma 11.5.4. We have defined the δ-function of A_G by means of four types of moves: Types 1, 3, and 4 are Λ-moves, and Type 2 is always a non-Λ-move. Accordingly, we have distinguished four types of yield relations:
   (q, aw, γZ) ⊢ᵢ (q′, w, γα) if and only if δ(q, a, Z) = (q′, α) is a move of Type i,
where i = 1, 2, 3, 4. Let c, c′ be two configurations of A_G such that
   c (⊢₁ ∪ ⊢₃ ∪ ⊢₄)ᵐ c′   (11.7.4)
for some m ≥ 0, where c is the result of a Type 2 move. We are interested in finding a bound on the number m in (11.7.4). We shall see that such a bound exists after a suitable modification of A_G. First we notice that (11.7.4) can be rewritten in the following form:
   c [ ∏ᵢ₌₁ⁿ ⊢₁^{k₁ᵢ} ⊢₃ ⊢₄ ] ⊢₁^{k₂} c′,   (11.7.5)
where k₁ᵢ, k₂, n ≥ 0, k₂ + Σᵢ₌₁ⁿ (k₁ᵢ + 2) = m, and ∏ denotes composition of relations.

Claim 1 If G = (V, Σ, P, S) is a reduced realtime grammar and c, c′ are any two configurations of A_G satisfying a relation of the form (11.7.5), then k₁ᵢ = 0 for all i and k₂ ≤ |N|.

Proof of the claim By definition, a Type 1 move can be followed by a Type 3 move if and only if there is a Λ-production in the grammar. Since G is Λ-free, k₁ᵢ = 0 for all i. Also, k₂ consecutive moves of Type 1 can occur if and only if A ⇒^{k₂−1}_L Bα for some A, B ∈ N and α ∈ V*; but, since there are only |N| nonterminals and A ⇒⁺ Aα is not possible in a reduced strict deterministic grammar (the Corollary to Theorem 11.4.3), k₂ ≤ |N| (in fact, k₂ is bounded by the number of non-Σ blocks of π). The claim is proved.

As a consequence of Claim 1, only the number n in (11.7.5) can be unbounded. Therefore we restrict our attention to the case c (⊢₃ ⊢₄)ⁿ c′, or, more conveniently, c (⊢₄ ⊢₃)ⁿ c′, where c, c′ are two configurations of A_G and n ≥ 0. Directly from the definition of moves of Types 3 and 4, we have:

Claim 2 For any q_j, q_ℓ ∈ Q − {q₀}, V_k, Vᵢ ∈ π − {Σ}, and α ∈ V*,
   (q_j, Λ, (V_k, α, Vᵢ)) ⊢₄ ⊢₃ (q_ℓ, Λ, Λ)   (11.7.6)
if and only if A_{kℓ} → αA_{ij} is a production† in G.

† Recall the convention Vᵢ = {A_{i0}, ..., A_{i nᵢ}} for Vᵢ ∈ π.

Let us call a letter (V_k, α, Vᵢ) ∈ Γ₂ saturated if and only if A → αB is in P for some A ∈ V_k and B ∈ Vᵢ. Denote by Γ_s the set of all saturated letters (we have Γ_s ⊆ Γ₂).

Claim 3 If Z₁ = (V_k, α, Vᵢ) is saturated and Z₂ = (V_k, αθ, Vⱼ) ∈ Γ₂, then θ = Λ and Vⱼ = Vᵢ.

Proof of the claim This claim is a consequence of property (11.7.3) of realtime grammars. Since Z₁ is saturated, A → αB is in P for some A ∈ V_k and B ∈ Vᵢ; and since Z₂ ∈ Γ₂, we have A′ → αθCθ′ for some A′ ∈ V_k, C ∈ Vⱼ, and θ′ ∈ V*. Then, by the strict determinism of G, θCθ′ = B′θ″ for some B′ ∈ N with B′ ≡ B and θ″ ∈ V*. But θ″ = Λ by (11.7.3); hence θ = θ′ = Λ, C = B′, and Vⱼ = [B′] = [B] = Vᵢ. The claim is proved.
380 BASIC THEORY OF DETERMINISTIC LANGUAGES By Claims 2 and 3, any saturated letter on the top of the store is always erased (never rewritten) and this is realized by a transition of the form [j [j. Now, for any saturated letter Z = (Vk, a, Vt) G rs, we can uniquely specify a partial function fz'.QfQ, (11-7-7) such that fz{Qj) = <7b if and only if q}, q% ¥= q0 and one or the other side of equivalence (11.7.6) holds. We can extend (11.7.7) to: fn-Q-^Q (H-7.8) for tj G T, using (11.7.7) as a basis and defining inductively/^ = /,/z, that is, for any Q G Q, fnzi.0)— /tj(/z(<7)) provided fz(q) is defined. Note that there are only a finite number of distinct functions /„. In fact, |{/„| n G rs+}| <(|Q| + 1)IQI. Claim 4 For any q, q G Q; y, y' G r* and n > 1, (q,A,y)(tj\^)" (<?'.A,V) if and only if y = y'ri for some tj G T" and (7' =fri(q). This claim follows from Claims 2 and 3 by a straightforward induction on n. Thus the only effect of fl-7 h^)" is erasing a saturated string of length n from the store and changing the state of control. Claim 4 gives the basis for the elimination of an unbounded sequence of A-moves: instead of a saturated string tj of length > 1, on the store of the modified dpda A'G appears a single letter fn representing the function /,,. For the completeness we describe formally the construction ofA'G. For this, let AG = (Q, 2, r,8,q0,Z0, {q0}) be the canonical dpda defined prior to Lemma 11.5.4. Define A'G as follows: A'o = (e',Z,r',6',<7o,Z0,{<7o}), (11.7.9) where Q' = Q U {qz\ Z6T,} (the new states are added for technical reasons only), T' = T U {/„ I tj G r;} and 6' is defined as follows: If A ■+ aB is in P for some A G V, and B G Vk, then define: i) 5'(<7o, A, (V,,a)) = (qz,A) where Z = (V,, a, Kfc) G Ts; ii) 8\qz, A,/„) = (.q0Jvz(Vk, A)) for all /„ G r' - T, Z = (K,, a, Kfc); and iii) 8'faz,A, 10 = fao, 17z(Kfc,A)), where Z = (Vha,Vk), and for all rer. Moreover, for any <7;- G g — {<70} and/,, GT'-T, define iv) 5'(<?;-, A,/„) = (/,(<?), A). In all other cases define v) 8'(q,a,Z) = 8(q,a,Z). Moves (i), (ii) and (iii) are new special cases of Type 1 moves of AG; move (iv) is a special case of the Type 4 move. We observe that a saturated letter never appears in the store, and even if some A-moves were added, A'G is quasi-realtime. It is easy to see that T2(A'G) = T2(AG). Claim 4 suffices for that. □
11.8 STRUCTURE OF TREES 381 PROBLEMS 1 Show that the following set L1 cannot be accepted by any realtime multitape Turing machine: L' = {xldx2 ■ ■ ■ xr-ldxrcdkcx?-k+1\ x; G {a, b}*, 1 =*f<r, 1 =* k<r}. 2 Give a detailed proof of Theorem 11.7.2. 3 Prove the following proposition about regular sets. Proposition Let L £ 2* be a regular language. Then there exists a number n > 1 such that any string x&L, where lg(x) > n can be written in the form x = y \zy2, where yu yiGZ*, z eZ+,\g,(zy2)<:n,an<l, foi all k> 0, y iZky2 &L. 4 Prove the following useful proposition: Proposition Let G = (V, 2;JP, S) be a reduced realtime grammar. Assume S => uAu and A => vA0 for some A ejV,uG2*,a6F*, fl£Z+and 06 V*. Then for any n >0 and wGZ*, S => uv"w implies lg(w) > n. 5 Prove the following result: Theorem Let G be any reduced realtime grammar. Then L(G) is regular if and only if G is not self-embedding. Hint. Problems 3 and 4 can be helpful. 6 Show that the result in Problem 5 is best possible in the following sense. Find a reduced, A-free, self-embedding, strict deterministic grammar G such that L(G) is regular. Thus the result in Problem 5 cannot be extended to the full family of strict deterministic grammars. Let Si be the family of sets definable by realtime multitape Turing machines. 7 Show that SI is closed under union, complementation, intersection, and product on the right with regular sets. 8 Show that 9 is not closed under product, *, transpose, homomorphism, either left or right quotient by regular sets, or max. 11.8* STRUCTURE OF TREES IN STRICT DETERMINISTIC GRAMMARS In this section, the structural properties of derivation trees ih strict deterministic grammars will be developed. In addition to the property of unambiguity, the strict deterministic grammars have a unique "partial" tree for every prefix of a terminal string. We utilize more general objects than derivation trees, namely, the grammatical
382 BASIC THEORY OF DETERMINISTIC LANGUAGES trees (with roots labeled by any letter) in order to facilitate induction proofs. The uniqueness of the "partial" trees is the content of the Left-Part Theorem. This theorem will be also used to prove two deterministic iteration theorems. Recall that our conventions in Section 1.6 were to use the term grammatical tree for a tree whose frontier is in V*. A derivation tree has a frontier in 2*. For a given grammar G, .T'q denotes the set of derivation trees of G while -3~G{A) is the set of derivation trees whose root is A. (Cf. Section 1.6 for the rest of our notation involving trees.) We are now ready to define the left-part of a grammatical tree. Definition Let T be a derivation tree of some grammar G. For any n > 0, we define <n)r, the left n-part of T (or the left part where n is understood) as follows. Let (xi,... ,xm) be the sequence of all terminalt nodes in T (from the left to the right); that is, {xu ... ,xm} = \~l(I,)andxl\lx2\l- ■ -L!:xm.Then (n'T = {x in T\ x LTxn} ifn<nz,and (n)r = T ifn>m_ We consider (n)r to be a tree under the same relations T, L, and labeling X as T. Thus, for instance, the shaded area on Fig. 11.3 (including the path tox) is the left «-part of T if x is the nth terminal node in T. Note that, in general, ^rmay not be a derivation tree. As immediate consequences of the definition, we have Factl <»)7'=(»+i)7' ifandonlyif(")7,= r. Fact 2 If (n)r= (n)r', then U)T= U)T' for any/<n. Our principal result about the structure of derivation trees of a strict deterministic grammar can be informally stated as follows. A reduced grammar G has a w = uv FIG. 11.3 t Note that the A-nodes are not being indexed.
11.8 STRUCTURE OF TREES 383 strict partition it if and only if, given any KG w,. then each prefix u of any string w = uv G 2* generated by some symbol A&Vt uniquely determines the left partial subtree of tree T (cf. Fig. 11.3). T corresponds to a derivation A =>w up to the path to the terminal node (x) labeled by the first letter of v. We allow the labels of nodes on this particular path to be determined modulo the partition it (in particular, X(x)=2). In the case when v= A, the complete tree is specified uniquely (and then it cannot be a proper subtree of any other tree with an equivalent root label). For the formal statement of the theorem we first introduce the "left-part property" of a set of derivation trees. Definition Let S~^. JTG for some grammar G = (V, S,P,5) and let w be an arbitrary partition on V, not necessarily strict. We say that S~ satisfies the left-part property with respect to it if and only if, for any n > 0 and T, T' G ^"if rt(T) = rt(r') (mod if) and (n)fr(D = <n)fr(7"), then (.n+Dj* ^ (n+l)j-» (11.8.1) and, moreover, if x **x' is the structural isomorphism ("+1)7' o- (n-n)^ then for every x in ("+1)T, \(x) = X(x')ifxll^forsome7G<"+1)rorif("+1)r = <n)T (11.8.2) and \(x) = \(x') (mod ir) otherwise. (11.8.3) Note that the condition "x\ly for some ;,e<"+1>:r' in (11.8.2) is equivalent to "x is not on the rightmost path in <"+1>7"'. An example of this kind of mapping can be obtained by considering grammar G2 from Section 11.4. Let us consider the trees shown in Fig. 11.4. If we let n = 2, we see that rt(7,1) = S = S = it(T2) and wfi(Ti) = (a = (2)fr(T2). Thus (3)r! s <3)r2 and the path from 5 to + in 7^ is "equivalent modulo n" to the path from S to ) in T2. Tree Tx FIG. 11.4 /FJ\ Tree T2
384 BASIC THEORY OF DETERMINISTIC LANGUAGES Note that the left-part property is a global property of trees in distinction to the strictness of a grammar, which is a local property of derivation trees (being a property of their elementary subtrees). Theorem 11.8.1 (The Left-Part Theorem) Let G = (V, 2,P,S)be a reduced grammar and let w be a partition on V such that 2 G it. Then it is strict in G if and only if the set 9~G of all derivation trees of G satisfies the left-part property with respect to it. Proof (The if direction) Let G and w be as in the assumption of the theorem and assume J7~G satisfies the left-part property with respect to it. We shall show that n is strict. Let A, A' GjV, let A -*■ a/3 and A' -*■ a/3' be in P and assume A=A' (mod it). Since G is reduced, we have a =*wa, |3 =»wp and |3' =*wp' for some wa, wp, wp- G 2*. Let us consider two derivation trees T, T' corresponding to derivations A =>a(S =*wawp and ^4' =*ot(i' =*waw'p, respectively. Let n = lg(wa). Then rt(T)=A=A' = rt(7") and <">fr(r) = <">fr(r'). Thus (ntl)rs <"+1>r' by the left-part property of 5~G. Let x <*x' be the structural isomorphism from (n+1)rto <"+1>:r'. We distinguish two cases: CASE 1. wp =t= A. Let x be the (n + l)st terminal node of T (labeled by (1)W(3) and y the (lg(a) + l)st node among the immediate descendants of the root in T, counted from the left (that is, XQO = (1)|3). Clearly, y is in <"+1>;r and by the structural isomorphism also / is in <"+1>r', \(y') = (1)|3'. Then (1)0 = (1)|3' by (11.8.2) or by (11.8.3) (depending on whether y \lx or not). CASE 2. wp = A. Then <"+1>r= (n)r and, by (11.8.2), X(x) = \(jc') for all x in (n+1)r. Thus ("+1)r= ("+1>r. Then also in)T= <-n)T' by Fact 2 and using Fact 1 we conclude T = T'. Hence A = ^' and 0' = |3 = A. (77z£ on/y i/ direction) Let G be a reduced grammar with strict partition it. Let r, T' G .9^3 be two derivation trees such that rt(r) = rt(r') (mod ?r) (11.8.4) and <»)fr(D= (n)fr(7") (11.8.5) for some n>0. To show that in+1)T and <"+1)r' satisfy (11.8.1), (11.8.2), and (11.8.3), we proceed by induction on the height h of the larger one of the two trees. Assume, without loss of generality, that the taller tree is T. Basis, h = 0. Then T consists of a single node labeled by some a G 2. (Note that we cannot have a = A since, in such a case, assumption (11.8.4) would be meaningless). Then by (11.8.4) and since 2 G 7r also, T' has a single node labeled by some <z'G2. Thus T=T'. Clearly, a=a and, moreover, if n> 1, we have a = a from (11.8.5).
11.8 STRUCTURE OF TREES 385 tVl wk (r) (")tr(T) FIG. 11.5 Inductive step. Let h > 0 and assume the result; i.e., (11.8.1), (11.8.2), and (11.8.3), true for all trees with heights smaller than h. Let us denote this assumption (A). Let x0 and x'0 be the roots of Tand T1, respectively,and let x0Fxu ... ,x0rxm, x'0rx\,...,x'0Fx'm', XiL.x2\~-■-\~xm and xiLx^L • •-LxJ„' (cf. Fig. 11.5). Let us define, for 1 < i < m and 1 </ < m , 7j = {x in T\ xt T x], T) = {xmT'\x'jrx}. Clearly, all the T,- and T) are derivation trees of G and, because their heights are smaller than h, the inductive hypothesis (A) is applicable. Denote wt = fi^r,), w; = fr(r;). Claim If for some k < m, no tree among Ti,... ,Tk contains the (n + l)st terminal node of T (in particular, if no such node exists), then m'^-k and Tt = T't for l</<*. The argument is an induction on k. The basis (in which k = 0) is vacuously true. Inductive step. Assume the premise of the claim for some k > 0 and, as inductive hypothesis (B), the conclusion for k — 1; that is, m S* k — 1 and T{ = T\ (1 < / < k — 1). Then, in particular, \(x,) = X(xJ) for 1 < / < k — 1 and by the strictness of it, using (11.8.4), \(xk) = \(x'k). Since wi • • • wfe is a prefix of (n)fr(r) = <n)fr(r') and since Wi • • • Wk-i = w'i ''' w'k-i by inductive hypothesis (B), either fr(Tfe) is a prefix of fr(T'k), or conversely. But since rt(rfe) = \(xk) = \(x'k) = rt(T'k), and using inductive hypothesis (A), we conclude that wk = w'k and Tk = T'k. Therefore the claim follows. Now if n > lg(fi(T)) (that is, if (n)fr(r) = fr(T)\ then by the claim, Tt = T\ (1 < i < m < m'). But then fr(r) = w1 • • • wm = w[ • • ■ w'm and by the strictness of
386 BASIC THEORY OF DETERMINISTIC LANGUAGES tt we have m = m; and since X(x0) = X(xo), we have, again by the strictness of tt, \(x0) = X(xo). Hence <"+1>r= T= T' = <"+1>r'. For the rest of the proof, we assume n < lg(fr(r)) (cf. Fig. 11.3). Let y be the (w + 1 )st terminal node in T and let k and r be such that y is the (r + 1 )st terminal node in Tk+1. By the claim Tt = T'i(Ki<k) and thus, by the strictness of tt, m > k and rt(7,fe+1) = X(xfe+1) = X(xfe + 1) = rt(7,fe). Moreover, w, • • • wfe = w\ • ■ • w'k and since (r)wfe+1 isasuffix of (")fr(r) = (n)fr(r), we have (r)wfe + 1 = (r)(u4+i • • • w'm'). The only possibility which is in agreement with the induction hypothesis (A) is Now, <n+1)T consists of jc0, Tu..., Tk, and (r+1)rfe+1; <n+1>r consists of jco, T\,..., T'h,wA <r+1)rfe+1.Here (r+1)rfe+1 = <r+1)T'k+1 by the inductive hypothesis. Define a function /: <"+1)r_y ("+1)7" such that / restricted to Tt is the structural isomorphism T( »■ T\ (0<i<A:), / restricted to (r+1)rfe+1 is the structural isomorphism (r+1)7ft+1 ** (r+1)rjj+1 and f(x0)=x'0. Now / is a structural isomorphism <"+1)r^("+1)r'. Hence (11.8.1). Let x e(n+1)7\ If x\ly, then \(x) = \f(x), since either x£ (r+1)rfe+1 and we can apply the inductive hypothesis; or else x£ Tt, i*^k. Hence (11.8.2). Otherwise \{x) = ^f(x), since either x£ir+1)Tk+1 and we can again apply the inductive hypothesis, or x=x0 and/(x) = Xo. Hence (11.8.3) and _7~G satisfies the left-part property. □ Note that the assumption that G is reduced was not used in the only i/part of the proof. On the other hand, in the // part, even if ^TG satisfies the left-part property, the strictness of tt may be violated for some useless production which does not appear in any grammatical tree. By Theorem 11.4.1, the requirement of a reduced grammar causes no loss of generality. For reduced grammars we have the following result: Fact 3 Let G be a reduced grammar. The left-part property is satisfied by JTG if and only if it is satisfied by J~G (S),the set of derivation trees in G. The only if direction of this fact is trivial while the //direction follows from the observation that, in a reduced grammar, every grammatical tree is a subtree of some derivation tree. The following two corollaries are immediate consequences of the Left-Part Theorem. Corollary 1 Any strict deterministic grammar is unambiguous. Corollary 2 If w is a strict partition on G, then for every U&tt, the set {w G 2* I A =*■ w for some A G U} is prefix-free. Thus, in particular, L(G) is prefix-free, which we already know. Our next objective is to give a machine-independent proof of a deterministic iteration theorem. The reader may wish to reread Section 6.2 before continuing, since our development will closely follow the earlier argument.
11.8 STRUCTURE OF TREES 387 Theorem 11.8.2 {Iteration Theorem for A2) Let L G A2 and L 9 2*. There is an integer p(L) such that, for any string wGZ, and any set K of positions in w, if I ATI ^p(L), then there is a factorization y = (z>j,..., v5) of w such that: 1 v2*A; 2 for each w, m>0, wG2*, ^^'"z^MeZ, if and only if ^t^z^m GZ,; 3 if */* = {*!,..., tf5}, then i) KUK2,K3±Q or K3>K4,K5^(D; ii) i/s:2u/s:3u/s:4i<p(z). Corollary For each «>(), Proof of the corollary Take m = l and u = v4vs in (11.8.2). This gives z>iZ>21+1P3P4+1PsGZ, if and only if ViV2v3v4vs = wGZ,. Now take m = 0 and « = z>5 in (11.8.2). That gives: Vivlv^vlvs&L if and only if VxV3vs£L. Since w = V\ ■ ■ ■ vs is in L, the Corollary follows. □ Proof of Theorem 11.8.2 The argument mirrors the proof of Theorem 6.2.1. Let G = (V, 2,P, 5) be a strict deterministic grammar generating L. Choose p{L) = p(G) as in the proof of Theorem 6.2.1, and follow out that proof. That is, find </? = (Vi,..., vs) satisfying the requirements of the proof. Property (3) follows from the iteration theorem for context-free languages. To show property (1), suppose v2 = A. Then we would have: A ^Av4. But left recursion is impossible in a strict deterministic grammar. Thus (3) has been satisfied. To establish property (2), let m, n > 0. By the iteration theorem for context- free languages, 5 ^-v^AvTvs ^ v-,v¥*nv3vnSmVs = w^Vg (11.8.6) where Wj = v1v™+"v3v!i G 2*. Assume now 5 £■ v1v?+nv3viu = wlU (11.8.7) Let T, T' be two derivation trees corresponding to (11.8.6) and (11.8.7), respectively. Let k = lg(w!). By the left-part theorem, Let x be the node in T, which corresponds to the indicated occurrence of A in sentential form ViV™Av™v$. Clearly, x is left of the (k + l)st terminal node in T (labeled
388 BASIC THEORY OF DETERMINISTIC LANGUAGES with (1)z>4). By the left-part property (2), the corresponding node in T' is also labeled with A. Therefore the following derivation corresponds to T1: S =*• ViV™Au =*• \x>iit Since A=*v" v3v% for any ri > 0, we have ViV?+n'v3viuGL, anyri>0. In particular, taking ri = 0, we obtain the only if part of property (2); taking n = 0 in (11.8.7) and ri arbitrary, we obtain the //direction of (2). □ The main purpose in proving an iteration theorem for A2 was not to use it in showing sets to be non-strict deterministic. Our main purpose was to be able to prove a similar result for A0. This result will be useful in showing that a number of particular sets are not deterministic. Theorem 11.8.3 (Iteration Theorem for A0) Let L £ 2* be a deterministic context-free language. There is an integer p'(L) such that for every w GZ,, and every set K of positions in w, if \K\>p'(L), then there is a factorization <p = (vu ...,vs) of w such that: 1 v2*A; 2 For si\n>0,v1v"v3v1vs^L; 3 ifKfo={Klt...,K5),then i) KUK2,K3^<D or K3,K4,K5^<D, ii) \K2VK3UK4\<p'(L); 4 if vs =£ A, then for each m, n > 0, u G 2*, v1v™*nv3v1u&L if and only if v^v^u GZ,. Proof Much of the previous proof and all of the proof of Theorem 6.2.1 still apply here. Thus (3) follows. Also, (2) follows easily. To complete the proof, let Z,GA0. Then Z,$GA2 and is subject to Theorem 11.8.2. Define p'(L) = p(L$) and let w GZ, with a set of positions K where |ATI >p'(L). Consider the factorization ^' = (^,^2,^3,^4,^5) of w$ obtained from Theorem 11.8.2 applied to L$. Since $ doesn't occupy any position from K, there are only two possibilities where it can occur: either z>3G2+$ and u4 = f5 = A, or z>3G2+ and ^5=^5$ for some z>5G2*. In either event, (1) holds. Assume vs^A. We can restrict ourselves to the second possibility. Take ^ = (^1,^2,^3,^4,^5) as the factorization of w. The result is now simple, since, for any n, m > 0 and u G 2*, we have v1v"+mv3v1ueL if and only if vlv1*mv3v1u% GZ,$ if and only if ViV™v3u$ GZ,$ if and only if vxv™v3u€lL. This satisfies (4), and completes the proof. □
11.8 STRUCTURE OF TREES 389 To illustrate the importance of the last theorem, some applications will be given. Theorem 11.8.4 The set L = {a"b"\ n>l}u{anb2"\ w>l}is not a deterministic context-free language. Proof Let p = p(L) be an integer satisfying Theorem 11.8.3. Let w = apbp and assume that all p positions which are b's are distinguished. By the theorem, there exist vu ..., vs such that w = vt • • • vs. and Then But this Claim 1 v2 £ b*. Proof If Vi — b' for some / > 1, then: v1 = a"bh, v3 = b}, ;>0, v* = b\ v - foP-(h + i+j+k) V!V3v5 = a"bp-<i+kyGL. : is a contradiction since / > 0 implies p-(i+k)<p. Claim 2 v2£a*b*. Proof This is clear. Claim 3 vx = a' i>0, v2 = aJ ;>0, v3 = ap-ii+nbk k>0, v4 = b> Vs = bp-i**i) Proof This is clear except for the fact that lg(z>2) = \g(v4). Suppose that Vt=b' and vs = bp~ik+'\ and we shall prove that/ = /'. First, consider viv3vi = ap-iy?-i ejr, But, if p— f = p— /", then /=/". Then, suppose 2(p— f) = p— /', which implies that/' = 2j—p. An analysis of V\v\v3viv^L shows that either/ = /' or ]"> p, which would contradict that v2v3v4 can have at most p distinguished positions. Thus; =/' and now Claim 3 is proven. To complete the proof, we note that vs must have a distinguished position, by part 3(ii) of Theorem 11.8.3. Thus v5i=A. By condition (4) of Theorem 11.8.3, letw =vsbp. Note that: V]V2V3Va,U &L.
390 BASIC THEORY OF DETERMINISTIC LANGUAGES Thus ViV3vsbp GZ,. But , , . VMVsb" = a"~3b2p~' GL. The only way that this string is in L is if p -j = 2p -j or if 2(p -j) = 2p -j. Neither of tnese conditions is consistent withp > 0 and; > 0. □ Another example will be given which is an even stronger implication. Theorem 11.8.5 The language L = {wwT\ wG{a,b}*} is not a finite union of deterministic languages. Proof Assume, for the sake of contradiction, that N L = U Lt for some N > 1 and Lt G A0. Let !=1 P = max p'(Z,,), wherep'(Lk)are the constants from Theorem 11.8.3 for every i<N. Consider the set H = {(bapbf"\ n>l}9L. H is infinite and therefore, for some Lt, \HC\L{\>2 or, more specifically, there are two (even) numbers nu n2>2 such that bothw = (ba"b)"1 GZ,f and w' = w{bapb)n* G L;. Applying the Iteration Theorem 11.8.3 to w, let K = {/| (w1-l)(p + 2)</<w1(p + 2) = lg(w)} (that is, K corresponds to the rightmost substringap in w). Clearly |ATI =p>p'(Li). Let (yi,Vi, v3, z>4, vs) be the factorization of w from Theorem 11.8.3. By Property (3), and since strings in L{ are palindromes, we have V\ G ba+, v5Ga+b and v^ = v4 = aq for some q, 1 <q <p. Define u = v4vs(bapb)"'i. Now, taking m — n = 1 in 2 (note that v5 =£ A), ViV2V3u — w' G L; implies vlVlv3v4u = bap+qb(bapb)">-'ibap+9b(bapb)n>eLh which is not a palindrome since q, n2 =£ 0. We have a contradiction with Z,,- 9 L. □ Since L is an unambiguous language, we have the following consequence. Corollary There exists an unambiguous context-free language which is not a finite union of deterministic languages.
11.8 STRUCTURE OF TREES 391 PROBLEMS 1 The proof of Theorem 11.8.2 is not completely formal. Find the "hand- waving" part. Formulate and prove a lemma about cross sections and make the proof completely rigorous. 2 Extend Theorem 11.8.3 in that (2) is replaced by the following statement. (2') For each n > 0, viv"vzv1vsG. L if and only if h-GZ,. A sequence of definitions is necessary before the next problems can be stated. Definition A (finite) tree automaton is a system j^= (G,2,a,F) (1) where Q is a finite nonempty set, F^ Q, 2 is an alphabet, and a: Q* x 2 -*■ Q is a function with the property that a~l{q) is finite for all q G Q. Definition Let T be a tree with labels in 2 and let sf be a tree automaton of the form (1). A function /: T-*■ Q is called a computation for T if and only if: i) if x is a leaf in T, then f(x) = a(A, \(x)); and ii) if x is an internal node in T and xu ... ,xn are all the immediate descendants of x such that xxLxjL • ■ -L x„, then/(oc) = a(f(Xl)f(x2) ■ • •/(*„), \(x)). Define £(j*0 = {fr(T)| T is a tree with labels in 2 and there is a computation f.T^-Q such that f(x0) S F where x0 is the root of T}. It is not a coincidence that the following definition reminds us of the definition of a strict deterministic grammar. Definition We shall say that a tree automaton sf = (fi, 2, a, F) has the strict deterministic property if and only if \F\= 1 and there exists a partition w on g such that, for any 77, r)h i?2G g*, quq2G Q, and a, b G 2, the following two implications hold: i) Ifa(77^1771,a) = a(77q2772,6),thenfl1 Sq2; ii) If 01(7777!, a) = a(r), b), then r)i = A and (77 =£ A implies a = i»). 3 Prove the following result. Theorem L G A2 if and only if Z, = {A} or Z, = Z,(j*0 for some tree automaton with the strict deterministic property. Definition A grammar G = (V, 2,/>, S) is deterministic if and only if there is a partition it of V and a subset E £ V — 2 such that 2 G jt and for every A, ,4'G K-2anda,|3,|3'G V*, a) if ,4 -»■ a0 is in P and a G K*£, then 0 = A and A G £; b) if ^4 ->■ a/3 and /T ->-aj3' are in P and A =A' (mod jt), then at least one of the following four cases is true:
392 BASIC THEORY OF DETERMINISTIC LANGUAGES i) 0, 0' =£ A and (1)0 = (1)0' (mod w); ii) 0 = 0' = Aand^ =^'; iii) 0 = A and A G £; iv) 0' = Aand^'G£'. 4 Note that, in the preceding definition, condition 1 is equivalent to the requirement pc (VX(V-E)*)V(EX(V-E)*E), and that we obtain a strict deterministic grammar as a special case when E = 0. Prove the following result. Theorem L G A0 if and only if L — L(G) for some deterministic grammar G. 11.9 INTRODUCTION TO SIMPLE MACHINES, GRAMMARS, AND LANGUAGES In Section 11.6, we were able to construct a nontrivial hierarchy of strict deterministic languages. This hierarchy was based on the degree of a language, which was related to the number of states in a dpda. In this section, we shall restrict our attention to the "simplest" class of strict deterministic languages in the hierarchy, i.e., to the strict deterministic languages of degree one. First we show a convenient characterization of strict deterministic grammars of degree 1. This characterization makes it easier to recognize these grammars. Lemma 11.9.1 Let G be any context-free grammar. Then G is strict deterministic of degree 1 if and only if A -+ a \ 0 in G implies that either a = 0 or a, 0 have a terminal distinguishing pair.' Proof Let G be strict deterministic, deg(G) = 1 and let A ■+ a \ 0 be in G, a =£ 0. If a, 0 have no distinguishing pair, one of them is a (proper) prefix of the other, which contradicts the strict determinism of G. Let (C,D) be the distinguishing pair of a, 0. Then C = D. By definition of distinguishing pairs, C^D. If C,D GAT, we have a contradiction of deg(G) = 1. Hence, C, D G 2. Assume that G has the above property. Recall Algorithm 11.4.1, which tests a given context-free grammar for the property of being strict deterministic. Now let us apply that Algorithm to G. Starting in Step 1 with partition w consisting of 2 and otherwise only of singletons, the algorithm never reaches Step 6 (Step 5 is always followed by Step 3 since 2 G it), and thus n is not altered. The algorithm halts in Step 8 (Step 4 is never followed by Step 7 since A{=Aj implies At =Aj, which implies (Xj =£ otj by the assumption of the indices in the algorithm). Therefore, w is strict and llir|| = ||ir0||=l. □ t Recall the definition (given in Section 11.4) of distinguishing pair.
11.9 INTRODUCTION TO SIMPLE MACHINES 393 Now let us introduce "simple" grammars. Definition A context-free grammar G = (V, 2,P,5) in Greibach normal form is said to be a simple grammar (s-grammar for short) if for all A G N, a G 2, and a, B&V* A -+ aa and A -*■ a(i in P imply a = |3. A simple language, or s-language, for short, is a language generated by an s-grammar. Example 5 ->■ a&4 | b A -*■ a This is clearly an s-grammar that generates the set L(G) = {anban \ n > 0}. Consider the grammar 5 -»■ aAd\aBe A -+ <fc4fc|fc 5 -»■ a5c|c This grammar is not an s-grammar. The language generated is L = {a"b"d\ n>l}U{a"c"e\ n>\). It can be shown that L is not an s-language. L is, however, a realtime strict deterministic language. Next, we relate strict deterministic languages of degree one to dpda's. Lemma 11.9.2 L is a strict deterministic language and deg(Z,) = 1 if and only if either L = {A} or L = T2(A) for some realtime dpda A with one state. Proof We note that {A} is of degree 1. If L = T2(A) for some dpda with one state, then, by Theorem 11.6.1, L is strict deterministic of degree one. Conversely, let G be a strict deterministic grammar of degree 1 and let L(G) =£ {A}. Assume, without loss of generality, that G is A-free. We shall show that G satisfies the main condition, from the definition of a realtime strict deterministic grammar. But this is immediate from the assumption that deg(G) = 1; that is, llwoll = 1, since then A =A' (mod w0) implies A =A'. If we assume A =A', and A^-aB and A' -*■ aB '|3 are in P, then, since tt0 is strict, B = B', so we have B = B' by the degree 1 condition. Clearly, this implies (3 = A, by the strictness of n0. Thus G is realtime. Clearly, L is accepted by a realtime dpda with only one state, as can be seen by a careful check of the proof of Theorem 11.7.4. □ Next, we show that the realtime assumption is not necessary. Lemma 11.9.3 L = T2(A') for some one-state realtime dpda A' if and only if L = Ti(A) for some one-state dpda A and L =£ {A}.
394 BASIC THEORY OF DETERMINISTIC LANGUAGES Proof Only the // condition needs proof. Let {A}i=L = T2(A) for some one-state dpda A. Then L is strict deterministic of degree 1, by Theorem 11.6.1. By Lemma 11.9.2, L = T2 (A') for some one-state realtime dpda A'. □ Now the various results can be collected into one proposition. Theorem 11.9.1 LetZ, £ 2*. The following statements are equivalent: a) L is simple. b) L is strict deterministic of degree one. c) L is accepted as T2(A) for some realtime dpda with one state or L = {A}. d) L is accepted as T2(B) for some dpda with one state. Proof By our lemmas, (b), (c), and (d) are already known to be equivalent. Clearly, if L is simple, it is strict deterministic of degree one. Assume {A}^L = T2(A) for some one-state realtime dpda A. By the proof of Theorem 11.7.4, L = L(G), where G is realtime, in Greibach form, and since A has one state, L is strict deterministic and of degree one. By Lemma 11.9.1, if we have A -*■ aa and A -*■ a/3 in P for /I G AT, a G 2, and a, |3 in AT* (and all the rules are of this form), then a = |3. □ Now we turn to an important result about s-languages. We shall show that the inclusion question, that is, Li £ L2, is unsolvable for s-languages. There are two reasons why this result is noteworthy. First, the s-languages are very primitive and are contained in many of the other families of deterministic languages that have been studied. Thus it follows, as corollaries to this result, that the inclusion problem is unsolvable for all families that contain the s-languages. There is a second reason why this inclusion result is important. There is an obvious relationship between the inclusion problem and the equality problem. If the inclusion problem is solvable, then the equality problem is also, since Lx = L2 if and only if L\ £ L2 and Z,2- L\. In the examples we have seen, such as regular sets or context-free languages, both problems have been solvable or both unsolvable. It could be the case that there is some family of languages that has an unsolvable inclusion problem but a solvable equality problem. The s-languages will turn out to be such a family. In the argument which follows, some simple languages will be used and will be specified by dpda's. As we have only one state, it can be omitted from a description of a dpda, and we need only write: A = (2,r,5,Z0), where 5: 2xr->r* is a partial function, in order to specify an s-language. Theorem 11.9.2 The inclusion problem for s-languages is recursively unsolvable; i.e., there is no algorithm which can decide, of two s-languages Lx and L2, whether or not Li £ L2.
11.9 INTRODUCTION TO SIMPLE MACHINES 395 Proof The argument will proceed by taking an arbitrary Turing machine A and input w to A. Two s-languages Li and Z,2 will be constructed such that Li % Z,2 if and only if A halts and accepts w. (This actually shows more; namely, that the inclusion problem is not partially decidable.) Let A = (Q, 2, T, B, 8,q0, {q}) be any Turing machine and let w G 2*. There is no loss of generality in assuming that A is totally defined on Q — {q} and that q is a unique final "halt" state. The idea of the proof will use the reduction from A to the MPCP done in Section 8.3. The reader is expected to be aware of the details of those arguments. Assume that the lists x and y given in Table 8.3.1 have been constructed. For each integer i, 1 < i < n, let/; be a new symbol not in A = (S-{B})UfiU {#}. Let <t and $ be two new symbols not in A U {/,,... ,/„}. Define a homomorphism ip: A*->(AU {$})* defined by: <p(a) = a<t for each a G A. Now we define the first language of interest. Li = {fifik---fiMiy,yii---yiky2)%\ k>0,3</„...,ifc<«}. We construct a one-state dpda Ax such that T2(A1) = Ll. Let^4! = (AU {<t,$,/1;...,/„}, 5 ^IYZo), where r, = {Z0,ZU C} U {[z] I z G A+, lg(z) < max {IgO,)}} I U {[z<t] I z G A*, lg(z) < max (IgO,)}}. The transition function is defined by cases. CASE 1. Read the /}'s but stack the corresponding j>;'s. «i(/i,Z0) = Z,\y\\C, 5 lift, C) = {yj ] C for each /, 3 < / < n, «i(A,c)= [yj]. For each a G A, z G A*, lg(z) < max, (lgO,)}. CASE 2. 5 i(a, [za]) = [z<t]; Check whether symbols match. CASE 3. 5,($, [z<t])= [z] ifz^A. CASE 4. «i(^[«]) = A. CASE 5. 51($,Z,) = A. (3) and (4) check whether every other symbol is a $. (5) checks that the last symbol is a $. It is not too difficult to see that T2(A i)=L1. The tricky part of the proof is the construction of an appropriate L2. This set is
396 BASIC THEORY OF DETERMINISTIC LANGUAGES Li={fifik ■ ■ -hjYP(y\y-h ■ ■ -y^yiW k>0,3<h,.. .,ih <n, Xjx,-, • • -xikx2 ^y\yi,---yi^^J where J (an abbreviation for junk) is some set such that J Pi L\ = 0. Now we construct an appropriate machine A2 = (A U {$, $,/i,... ,/„}, r2,52,Z0), where T2 = {Z0,ZUA,C, D,E) U {[z] | z G A+, lg(z) < max (lg^)} U {[z<t] | z G A*, lg(z) < max {lg(x,-)}}. 52 is defined by cases below. CASE1. 52(/2>Z0) = Z1[x2r]C. CASE 2. 82(fi, Q = [xf] C for each i, 3 < i < n. CASE 3. 52(/1>0= [xj]. In steps (1) through (3), the /j are read and the corresponding xt are stacked. For each a G A, z G A*, lg(z) < max; {lg(x,-)}. CASE 4. 52(a, [za])= [z<t]. CASE 5. 52(<t, [<t]) = A. CASE 6. 52(<t, [z<t]) = [z] if z =£ A. In (4) through (6), inputs are matched against the stack and it is checked that every other symbol in the input is a <t. CASE 7. 52($, Zi) = E. reject if final symbol is $ and pushdown is only Zx. For each a, b G A, a =£ b, each z, z G A*, such that lg(z) < max {lg(x,)}, I lg(z)<max{lg(x.)}. CASE 8. 52(<z, [zb]) = A. If a mismatch occurs, pop the stack. CASE 9. 52(<t, [z]) = D. If z¥=\, use <t to encode D on the stack. CASE 10. 52(a,D) = A. For each c G A U {$} CASE 11. 82(c,Zi) = A. CASE 12. 82(c,A)=A. CASE 13. 52($,^) = A. In (11) through (13), if the bottom of the stack was encountered when a symbol other than A was read, the stack symbol A is used to encode that the input will be eventually accepted when $ is finally read.
11.9 INTRODUCTION TO SIMPLE MACHINES 397 Claim 1 T2(A2) = L2 = (ii ~ {hfik ■ ■■filfi*(?cixtl • ■■xikx2)S\k>0, 3 </,,... ,/fc < n})\jJ. where / consists of some irrelevant strings so that Li n/ = 0 Proof The proof of this claim is long and tedious and is left for Exercise 7 at the end of this section. Claim 2 L,9 L2 if and only if Li H {f2fik ■ ■ ■f,if1<fi(x1xii ■ ■■xikx2)$\ h>0, 3 </„ ... ,ik <n) = 0 if and only if there is no sequence iu...,ik,k>0, with 3 </,,...,ik <n such that *i*i, ■ ■ -^ife^2 = y\yix • ■ -yi^i- Proof From Claim 1, write L2 = (£, — AT) U /. Now L^L2 if and only if Li nL2 =0 = li 0(1, -Ar)u/ = i1 0(1, nAO("1.7=1, n(r,ui) = l1nj The second equivalence holds because of the structure of Lx and X. Claim 3 L, £ L2 if and only if the Turing machine halts on w £ 2*. Proof The result follows from Claim 2 and Problem 12 of Section 8.3, in which it is shown that *i*i, • ■ -xikx2 = y^ ■ ■ -yiky2 for some k > 0, 3 < /",,..., ik < n if and only if A halts on w. Now the proof is complete. □ PROBLEMS 1 Let R£ 'L* be an arbitrary regular set and let $ be a new symbol not in 2. Show that R$ is a simple language. 2 Show that the family of s-languages is not closed under union, intersection, or complementation. 3 Show that the family of s-languages is closed under product. 4 Show that the family of s-languages is not closed under reversal. 5 Let Dr be the semi-Dyck set over 2r U 2r and let $ be a new symbol not in 2r U 2r. Show that Dr% is a simple language. 6 Show that it is undecidable to determine whether a given context-free language is an ^-language or not.
398 BASIC THEORY OF DETERMINISTIC LANGUAGES 7 Prove that T2(A2) = L2 in the proof of Theorem 11.9.2. 8 Does Theorem 11.9.2 hold for single-turn simple machines? 9 Show that, for each .s-grammar G, there is an equivalent .s-grammar which is in 2-standard form. 11.10 THE EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES In this section, we will show that the equivalence problem for s-languages is solvable. We shall assume that the two s-languages Lx and L2 are presented by s-grammars Gx and G2. If either Lx = {A} or L2 = {A}, the problem is trivial, so we shall henceforth assume that Gx and G2 are A-free. The algorithm to be given will depend on the following definition of equivalence of strings. Definition Let Gi = (Vi,'L,Pi,Si), i= 1, 2, be two context-free grammars such that TV, n N2 = 0 and let a, 0 G (TV, U N2)*. We say that a is equivalent to 0 (written a = 0), if for each x G 2*, a =>x if and only if 0 =>x Note that since a and 0 may have variables from each grammar, the generations involved may employ productions from both grammars. Also note that L(G{) = L(G2) if and only if Sx = S2. Example Let Gx be the grammar Si -> ab and let G2 have productions S2 -> Ab A -> aB B -> b Note that St ffe S2. However, it is easy to see that b\B = (J2. Since our grammars are A-free and may be assumed to be reduced, we have that 0 = A if and only if 0 = A. Our first result is that = is a congruence relation. Lemma 11.10.1 Let G; = (T;, 2,^,5,) be two context-free grammars with Nx n N2 = 0 for / = 1,2. The relation = is a congruence relation on (Nt U 7V2)*; i.e., it is an equivalence relation, and furthermore, for any a, 0, 7, 5 in (NiUN2)*, if a = 0 and 7 = 5, then cry = 05. Proof The proof that = is an equivalence relation is immediate. Suppose a = 0 and 7 = 5 for arbitrary strings a, 0, 7, and 5 in (TVi U N2)*. For each x G 2*, * (*7 =»■ x
11.10 EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES 399 if and only if there exist y, z G 2* such that x = yz, a => y, and y =*■ z This holds if and only if x = yz, 0 => j> and 5 => z which, in turn, holds if and only if (35 => jc Thus 07 = 05. □ We shall always assume in what follows that the grammars to be considered are in Greibach normal form or, equivalently, in r-standard form for some r > 2. The lemmas we shall use in this section are being proved in a rather general form since they can be used in proving a number of different results. In the simplest case, any s-grammar may be placed in 2-standard form by Problem 11.9.9. Definition Let G = (V, 2,/*, S) be any grammar in Greibach form and let A GN and a ex. Define R(A,a) = {y\ A -*ayisinP}. Note that, for any grammar in standard 2-form, it is always the case that R(A, a) £ N\. In an s-grammar, R(A,a) = 0 or \R(A,a)\ = 1. We shall often write R(A,a) for this unique string when no confusion can result. We now know that the equivalence problem reduces to deciding if 5i=52. Let us introduce some terminology which helps to determine whether or not a = 0. Definition Let Gt = (yl,'L,Pi,Si) be two context-free grammars with jVi n 7V2 = 0 and let a, 0 be in (7V1 U A^)*. We define the set of witnesses for a and 0 as W(a,$) = {xGZ*| a ^ x if and only if 0 f- x). The set of shortest witnesses is W(a, 0) = {x G W(a, 0)| for eachy in W(a, 0), lg(x) < IgO)}. Thus a witness is a terminal string that distinguishes a from 0. Example Consider grammars Gx and G2 from the first example of this section and let a = 5, and 0 = S2. Then W(SUS2) = {ab,abb}, W(SuS2) = {ab}. Our algorithm for deciding whether £1 = £2 will work if we first consider S\ =S2. Several transformations will be defined which map the pair (5'1, S2 ) into a finite set of pairs to be tested, and the process will be repeated. Since there are a number of other algorithms of this type, we establish some preliminary lemmas which hold for general types of transformations.
400 BASIC THEORY OF DETERMINISTIC LANGUAGES To simplify technical definitions assume that two grammars Gt = (Vt, 2, Pt, St) are fixed with Ni D N2 = 0. Let N = Nj U N2. Definition A transformation Tisa partial function from N* x TV* into {fail} V{U\ U^N* xN*, U finite and nonempty}. If T(a, 0) = fail, we say that T failed on (a, 0). If T =£ fail and is defined, then T(a, 0) = {(a„ 0,), • ■ •, (<*m, ftn)l a,-, ft eN* for all /, 1 < / < m, where m > 1}. Two important properties of transformations are now introduced, which are relevant in determining which transformations are useful in testing for equivalence. Definition A transformation T is valid if, for each a, 0 at which 7Ms defined, 1 a = 0 implies 7,(a,0)= {(a^fa),..., (am,0m)} for some m>\ and at = 0; for each i, 1 < i < m, and 2 a#0 implies that either T(a,Q)=fail or there exists (a,-,ft)e r(a,0) anda,-#ft. The next property is a little more complicated. Definition A transformation T is monotone if, for each a, 0GA^* such that a ffe 0 and 7X«,0) = {(a1,01),...,(am,0m)}, and if y is a shortest witness of a and 0, then there exist i, 1 </<m, and some shortest witness x for a; and 0f such that lg(x) < lgO>). Now we introduce one of the transformations to be used in our final algorithm. Definition Let Gt = (Vh 2,/";, 5f) be two s-grammars with Wj O 7V2 = 0. For a,0G7V+,let a = Aa and 0 = 50' for some ,4, fi G N, a, 0G7V*. The A-transform- ation, TA, is defined by: (fail if 1^04, a)\ + \R(B, a)\ = 1 for some a G 2, 7a(<*,0) = (. {(i?04, a)a', #(#, a)0')| a G 2, fl(,4, a) * M(5, a) * 0} otherwise. Also, ^(a, A) and TA(A,0) are undefined. Note that TA(a,{f) =fail if one of the sets is empty and the other is not. Example Let G^ be the s-grammar Si -> aABC A -> b B -> c C -> aC| $
11.10 EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES 401 S,=S, ABC=D BC=E C=¥ A = A C = F C = F FIG. 11.6 and let G2 be the s-grammar S2 -> aD D -> bE E -> cF F -> aF\$ We begin by applying TA to (5i,52), which yields (ABC,D), and continue. It is often helpful to represent this process by a tree, as shown in Fig. 11.6. The process may continue indefinitely. Lemma 11.10.2 If the underlying grammars Gx and G2 are s-grammars, then TA is both valid and monotone. Proof The argument is subdivided into separate claims. Claim 1 If a = 0 and TA(a,0) is defined, then 7U(a>0)= {(<*i»0i)» • • •, (am, 0m)} for some m > 1 and a,- = ft for each i,\<i<m. Proof of Claim 1 Suppose a = 0. It must be the case that both a and 0 are in N* for, if either a or 0 were A, then a = 0 = A and TA(a, 0) is undefined. Thus there exist A, B G TV, a', 0' G N*, such that a = /4a' and 0 = 50'.
402 BASIC THEORY OF DETERMINISTIC LANGUAGES Next we will argue that T(a, 0) =£ fail for, if TA (a,0)= fail, then there would exist a G 2 for which \R(A,a)\ + \R(B,a)\ = 1. Without loss of generality, suppose R(A,a) = 0 and R(B,a) = {7}. Then no string that starts with a can be derived from A, and since the grammars are A-free, no such string can be derived from a = Ad. On the other hand, /3 = 5ft => 37/3' => axy where 7 =>x and 0' =>y, since the grammars are both reduced. Thus axy is a string that distinguishes a from ft, contradicting that a = ft Hence ^(a,|3) = {(<*!,ft),... , (a^ftn)} for some m>\, and we must now show that a; = ft for each /, 1 < / < m. Let a, = R(A,a)d and ft = R(B,a)0, from the definition of TA. Let a,- ^£2*. Then a = Ad =*■ ai?(/l, a)a' => ax But then |3 =>ax. But this latter derivation must start with 0 = 5/3' =*aR(B,a)tf ^ ax because the grammars are s-grammars, and so ft = R(B,a)0 ^x The same argument applied in the other direction yields that a,- = ft, and the proof of Claim 1 is complete. Claim 2 If a # U and TA(a, (3) = {(a,, ft),..., (am, /3m)} for some m > 1, then, for any y G £+, y = ay' with a £ 2 is in W(a, ft), if and only if there is some /, 1 < / < m such that y G ^(a,-, ft). Proof of claim 2 Let a = Ad and /3 = 5/3' for A, B £ N; a', ft £ W*. Then a =*•;> if and only if a = Ad =j*aR(A,a)d ^ay' if and only if R(A, a)d =>/ A similar argument holds for ft Thus y distinguishes a from /3 if and only if there is some i,Ki<m, such that y distinguishes R(A,d)d fromR(B,a)0. But this proves Claim 2, by the definition of TA. Claims 1 and 2 combine to show that TA is a valid transformation. In addition, Claim 2 implies that, if a#/3 and TA(a,&) = {((*!, ft),. .. ,(am,/3m)} for some m > 1 then y = a/ G W(a, 0) if and only if there is some /, 1 < i < m, such that / G ^(a,-, ft). It is easy to verify that j> is in W(a, fi) if and only if/ G W(a,-, ft). This follows because, if there were a shorter witness for (af,ft) than /, there would be a shorter witness than y for (a, ft). The argument is reversible. □
11.10 EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES 403 Next, we need a few more simple definitions. Definition Let G = (V, 2,,P,S) be any reduced context-free grammar. For each A £ N, define £04) = min{lg(x)| A =^x,xG2*}, £ = max{£(4)| A GW}. When we deal with two grammars Gt = (Vh S./'j,S,), / = 1,2, with TV, nN2 = 0, we define £ = max{£i, £2}, where £,- = max{£(,4)| A GN,-}. The following facts are useful. Lemma 11.10.3 Let Gi = (Vi,l,,Pt,St) be two reduced A-free grammars. 1. For all a £ N+ and x £ 2*, if lg(x) < lg(a), then a £ x. 2. If 0eN+, then there exists x€Z* such that 0=^* and lg(x)< £ lg(|3). 3. Let a, 0GW+. If lg(a) > £ lg(0), then a #0. 4. Given x€2* and 0eN*, define |3' to the shortest string such that 0 = 0'0" and 0' =>*7 for some 7 e W *. Then lg(j3') < lg(x). Proof (1) is a trivial consequence of the fact that the grammars are reduced and A-free. Part (2) is a straightforward consequence of the fact that the grammars are reduced and A-free and using the definition of £. To show (3), we know that there existsx62* such that 0 =*•*and lg(x)<£lg(0)<lg(a) by (2) and the hypothesis of (3). By (1), a f>x, and so x G W(a, 0). Therefore a ^ 0. To show (4), we use the definition of 0' to write P'=Bi---Bi, x = xi--xt, and B}eN, for each /, 1 </ < /, and xs G 2* such that Bj =*Xj for each /, 1 </ < /, and B-, fxfl. Since the grammars are A-free, we get that lg(xy) > 1 for 1 </ < /. On the other hand, t lg(xj) = lg(x), so that ]~l W) = /<lg(x). D Now the second type of transformation is introduced, the5-transformation, Definition Let Gi = (yi,'L,Pt,S{) be two reduced A-free grammars with N1nN2 = Q and N = NiUN2. For all a, 0GN+ with lg(a)>2 and lg(0)>£, let
404 BASIC THEORY OF DETERMINISTIC LANGUAGES a = Aa and let x be the shortest terminal string generated by A. If 0'eW"t' is the shortest string such that there exists 0", y&N+, with 0 = 0'0"and 0' =*xy then we say that rB(«,0) = {(pL',y$'),(Ay,&)}. If no such 0' exists, then T(a,0) = fail. Tb is undefined elsewhere. Note that, since lg(0) > £ and, by part (3) of Lemma 11.10.3, lg(0')<lg(x), it must be the case that 0" =£ A. Note that the ^-transformation may be applied to strings a G N\ and 0 £ N\~. The resulting string Ay is in N^N2 and this is the reason that the symbols from the two grammars can be mixed together. Example Let Gi be the s-grammar 5, -> aAC A -*aB\bAB B -> b C -> a and let G be as follows: S2 -> aDE D -> bDF\a E -> bG F -> b G -> a Let a = ABSC and 0 = DFSE. The shortest terminal string derivable from A is x = ab. We now attempt to derive x = ab from 0 0 = DF5^ =» aFsE=*abF4E This tells us that p" = .DF, 0" = F*E, and 7 = A and thus, TB(a,(S) = p5C,F4£% 04,/^)}. Next we show that the i?-transformation is valid and monotone for s-grammars. Lemma 11.10.4 Let Gx and G2 be reduced A-free grammars in Greibach normal form with Ni C\N2 = 0, N = Wi U W2, and let a, 0 GW* such that lg(a) > 2, lg(0)>£.lf
11.10 EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES 405 i) a=Aa for some A&N such that L(A) = {w G 2* | A =»w} is prefix- free, and if if) 0 =*z8 for some z6S+,5 G TV*, and 5 is uniquet, then the ^-transformation is both valid and monotone. Proof Again the argument is subdivided into separate claims. Claim 1 If a = 0 and TB(a,0) is defined, then TB(a,P) = {(a,,0,), (a2,02)} anda,- = 0,- for/= 1,2. Proof of Claim 1 Let a = 0, a = ,4a' for some A GA' and a' €N+. Let x be a shortest terminal string derivable from A and assume that TB(a, 0) is defined. Let j> be a shortest terminal string derived from a'. Then A ' A a = Aa => xy Since a = |3, we have 0 ^xy. Moreover, the derivation may be factored as 0 f> *0'" f xy since the grammars are in Greibach normal form. Moreover, there must exist 0', 0", and 7 £ TV* such that 0 = 0'0" and 0'" = 70" so that 0' 7**7. (That is, 0' is the shortest prefix of 0 which generates xy.) Therefore TB(a,ff) = {(<*', 70"), 047,0')}. Next, it is necessary to show that a' =70" and ,47 = 0'. The first of the conditions is easy to check using the hypotheses of the lemma. For each;/ € 2*, a =*y if and only if a = Aa =* xy if and only if 0 = 0'0" t *y' because a = 0. In turn, this holds if and only if 00 =* X70 f xy if and only if 70 f / For the second condition, assume, for the sake of contradiction, that Ay ^ 0' and let j'' be a shortest witness for Ay and 0'. CASE 1. Ay=£y'. Let z be a shortest terminal string derived from 0". Then t This means that 0 f-z81 and 0 => z52 imply that 5t = 52-
406 BASIC THEORY OF DETERMINISTIC LANGUAGES Since we already know that a' = 70", we have that a = Ad =* y'z and thus 0 = W t y'z since a = 0. There must exist j>1;j>2 e 2* such that JCiJ^ = y'z and 0' ^ yi and 0" ^> j>2 Since z is a shortest string derivable from 0", we get lg(z) < lg(j>2). If lgO^) = lg(z), then y2 = z and yx = y', and thus P'fy' which contradicts that y' is a shortest witness of Ay and 0'. If lg(j>2) > lg(z). then lgCvi)<lg(/)andso implies which contradicts that Ay' generates a prefix-free set, since yx is a prefix of/. CASE 2. 0' ^>/. Choose z as above, and obtain Wfy'z Then a = ,4a =]f jez Since a' = 70", we get Ayfi" fy'z The argument now parallels that of Case 1, and a similar contradiction may be easily obtained. Claim 2 Let a # 0 and TB(a, 0) = {(a', 70"), (d-y,0')}, where a', A, 0\_0", 7, and x are as in the definition of TB. Then, for every / G £*, we have xy' G W^a, 0) if and only if/ G R>(a', 70"). Proof of Claim 2 Since L(A) is prefix-free and ,4 =>x it follows that a = Ad =* xy' = y ifandonlyif a' => / If 0 = 0'0" =>*/, then, for some 5 G7V*, 0 = 0'0" f*5 f-x/ (11.10.1) Now 0 = 0'0"=>*70" (11.10.2)
11.10 EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES 407 But (11.10.1) and (11.10.2), together with hypothesis (ii) of the Lemma, imply S = 70" 4/ Conversely, suppose 7/3 => y Then, since 0' =>xy, we have 0 = 0'0" 4 *7/3" ^ xy' = y Thus we have shown that a => xy' = y if and only if a' => y and that 0 => xy' = j> if and only if 70" => / Since a ffe 0, xy' G W^a, 0) if and only if exactly one of a or 0 generate y . This holds if and only if exactly one of a' or 70" generates y', which holds if and only if y' € W(a, 70"). This completes the proof of Claim 2. Let us agree to say that two strings rj and 6 agree on all strings of length less than or equal to k if, for each z G 2*, lg(z)<k,r) =>z if and only if 6 =>z. We shall write tj = 6, in this case. Also let p = lg(x). Claim 3 For any k>\, if all elements of W(a, 0) of length<p + k are not of the form xy', then a' = 70". (That is, W(ol', 70") cannot contain any members of length less than or equal to k.) Proof Suppose that a'= 70" is false. Then there exists/ G £*,lg(/)<fc, such that/ G wy,70"). By Claim 2, xy' G W(a, 0) but ¥*/) = P + lg(/)<P + fc, which is a contradiction. The next claim gives additional information about witnesses. Claim 4 Let Gx and G2 be as before. Assume that affe0 and TB(a,ff) = {(<*', 7/3"), (Ay, 0')}. For any k > 1, suppose a' = 70". Then, for all y G 2*, lgOO < k + p, we have y G W(a, 0) if and only if there exist yx, y2 G 2* such that y = y\y2, j>, G W(Ay, 0'), and 0"^j>2 =£ A. /Voo/ Let y G 2* with IgO) < A: + p. Clearly, a 4j> if and only if A => y and y y = yy
408 BASIC THEORY OF DETERMINISTIC LANGUAGES for somey,j>" G Z*. Since W)>Wx) = p, it follows that lg(/') = Wy)-W)<k + P-P = *. Since a' = yft", we have that a =>j> if and only if A => y nil *. ft 7/3 =*y and r it y = yy In turn, this holds if and only if A^y' * II 7 ="^1 and ;e = y'y'iyi This is all true if and only if A *. I II Ay =* yyi = yt & =* yi and y = y\y-i On the other hand, ft=*y if and only if there exist yu y2€ 2* such that 0' ^i, 0" 4^2, 0 = 0'0", and j> =yiy2. Since lg(0)> £,lg(0')<lg(x) < £,so lg(0") >l, which implies y2 =£ A. Thus we have shown that i) a=*y if and only if Ay=*yu ft"' =*y2, and y=y\y2, with y2¥= A; and ii) 0=>j>if and only if 0'^i,0"=^^2^: A,andj> =yiy2. From the definition of W(a, ft), it follows that y G W^a, |3) if and only if y =y\y2, ft" =*y2, and yx € ^(,47,0'). This completes the proof of Claim 4. Now, the claims can be used to complete the proof of the Lemma. Let a, 0GMV+ such that T(a,ft) is defined. If a = 0, then Claim 1 proves the first part of the validity. If a ffe ft and T(ol, ft) ^fail, then two cases may arise. CASE 1. There exists a shortest witness of the form xy for a and ft. By Claim 2, y is a witness of a' and 70". Thus TB is a valid transformation. To complete the proof of monotonicity, note that lgCv') < lg(xy'), since the grammars are reduced and A-free. CASE 2. No shortest witness of the form xy' exists for a and ft. hety be any shortest witness for a and ft. It must be the case that lg(y) > p, so let lg(y) = p + k for some k>\. By Claim 3, a' = 70". By Claim 4, there exist yu y2ei,* so that
11.10 EQUIVALENCE PROBLEM FOR SIMPLE LANGUAGES 409 yx G W(Ay, |3') and |3" =>j>2 =£ A. This proves that TB is valid in this case, also. Mono- tonicity also follows, for, if y is a shortest witness of a and 0, we have that yx is a witness of Ay and 0. yt is a shortest witness by the if part of Claim 4. (If z were a shorter witness of Ay and 0', then zy2 would be a shorter witness of a, 0 than y.) Therefore, TB is guaranteed to be valid and monotone. □ Now that both TA and TB are available, we could start to use them on some sample grammars. However, situations can arise in which neither transformation is applicable. For example, suppose the pair ,4 = A occurs. We augment our repertoire with two additional transformations so that for every pair in TV* x TV* — {(A, A)}, there is always at least one applicable transformation. Definition Let Gt be our underlying grammars. If a, 0 G TV+ and lg(a) > £ lg(|3), then 7g(a,0) = rB(U,a) = fail. 7g is undefined elsewhere. Definition If a G TV+, then rA(a,A) = rA(A,a) = fail. TA is undefined elsewhere. Lemma 11.10.5 7a and 7g are valid and monotone transformations. Proof If a#/3 and TA(a,0) is defined, then TA(a,0) = fail. Part (1) of validity and monotonicity are vacuous. The proof of 7g uses part (2) of Lemma 11.10.3 and is trivial. □ Now the process of using transformations to form a tree is stated precisely. Definition Let Gt = (Vt, S,Pj,St), i= 1,2, be two reduced A-free grammars with TVi n TV2 = 0 and TV = TVi U TV2. Let J~ be a set of transformations such that TV* x TV*- {(A, A)} 9 U dom(T), where dom(7') denotes the domain of T. A ^-transformation tree is a tree with a potentially infinite number of nodes, each labeled by elements of TV* xTV* U {fail}, where each elementary subtree corresponds to an application of T G $~. If the root is labeled (a, 0), we say that it is a ^-transformation tree for a, 0. Note that the only nodes of the tree which can be labeled fail are leaves. Also note that there are many different such trees for fixed G\, G2, S~, a, and 0, since the order in which transformations may be applied is not prescribed. In the next lemma, the properties of such trees are explored.
410 BASIC THEORY OF DETERMINISTIC LANGUAGES Lemma 11.10.6 Let S~ be a set of valid and monotone transformations. Suppose a ^ ft and y is a shortest witness for a and ft For any .y-tree for a and ft there exists a finite path from the root to a leaf with successive labels («o,U(i),(«i,Ui),...,(«t,Ut),/flff where a0 = a, ft0 = ft and there exist strings j> =y0,... ,yt € 2* such that 0 *<igG0; ii) for each /, 0 < / < t, at $ ft- and yt is a shortest witness for a,- and ft; and iii) for all /, 1 < / < t, lgOO < lgCi-i)- iVoo/ The argument is an induction on lg(j>). Basis. y = A. By the definition of monotonicity, any TS ^ can fail only on a and ft Thus t = 0 < lgOO- The other conditions are trivially satisfied. Induction step. Assume the result true whenever lg(j>)<& and k>0. Let lgOO = k+ I, wherey is a shortest witness for a, ft By validity,either T(a,ft) = fail, in which case the argument proceeds as in the basis; or else T(a,ft) = {(a[, ft[),..., (a'm,ft'm)}. By monotonicity, there exists /, 1 < i < m, a- ffe ft' and (a), ft) has a shortest witness y' such that lg(^') < lgO>). Let (au ftj) = (a,:, ft) and yx =y'. The induction hypothesis applies to (a1( &{) with witness^! and yields the result. □ Corollary Let S~, a, ft and j> be as in Lemma 11.10.6. Then a ^"-tree for a, ft must have a path from the root to a fail leaf with no two nodes labelled by the same pair (a;, ft). Proof Two different nodes on the same path have shortest witnesses of different lengths. □ Next we show that valid transformations applied to equivalent pairs cannot fail. Lemma 11.10.7 Let Gi = (Vt,'L,Pl,S), /=1,2, be two grammars with 7V"1n7V2 = (8 and let N = NiUN2. Let a, /3G7V* and f be any set of valid transformations. If a = ft then any ^"-tree for a, ft does not have a fail-leaf. Proof The argument is an induction on the height n of a ^"-tree for a, ft Basis, n = 0. There is one node labeled (a, ft) 4^ fail. Induction step. Suppose the result is true for all a, ft a = ft and all (a,ft)- trees of height n < n0, where n0 > 0. Let n = n0 + 1. Consider a leaf sn in a tree of height n. This node is on a path (s0,... ,sn) from the root s0. Consider the subtree rooted at slt which also contains the leaf sn. This is the shaded subtree in Fig. 11.7. Since all ^-transformations are valid, the pair (a1( fti) that labels Si must satisfy a! = ftx. The induction hypothesis applies to the subtree, so sn is not labeled fail. □
FIG. 11.7  (The path (s_0, . . . , s_n) from the root to a leaf s_n at height n; the shaded subtree is the one rooted at s_1.)

Because the transformation trees we have been studying can be infinite, it is convenient to define a partial transformation tree as the finite tree which results from taking a transformation tree that has a cross section and keeping everything above and including that cross section. An example is shown in Fig. 11.8.

FIG. 11.8  (A partial transformation tree; not reproduced here.)

There are many different ways to form partial trees. Let us fix the convention that no path in the tree is continued once it reaches a node label that has appeared earlier; here "earlier" includes an appearance as one of the node's own ancestors.

We are now ready to indicate how to solve the equivalence problem for s-grammars. For this purpose, we fix the class of the relevant transformations.

Definition  Let 𝒯_s = {T_A, T_B, T_ε, T_Λ}. Note that N* × N* − {(Λ, Λ)} ⊆ ∪_{T ∈ 𝒯_s} dom(T).

Now a special kind of equivalence tree is defined.
Definition  Let G_i = (V_i, Σ, P_i, S_i), i = 1, 2, be two s-grammars which are reduced, Λ-free, and in standard r-form for some r ≥ 2. Assume N_1 ∩ N_2 = ∅ and N = N_1 ∪ N_2, and let α, β ∈ N*. An equivalence tree for α, β is a partial 𝒯_s-transformation tree with the following restrictions:

i) If (α_1, β_1) is the label of the root of an elementary subtree and T ∈ 𝒯_s is the corresponding transformation, then
   a) if lg(α_1) > ℓ·lg(β_1) (or lg(β_1) > ℓ·lg(α_1)), then T = T_ε;
   b) if lg(α_1) > (r − 1)ℓ + 2 and lg(β_1) > (r − 1)ℓ + 2, then T ≠ T_A;
   c) if T = T_B, then lg(α_1) ≤ lg(β_1). (Note that T_B may be applied when lg(β_1) < lg(α_1), but then the roles of α_1 and β_1 have to be reversed.)
ii) If (α_1, β_1) is a label of two different nodes, then at least one of them is a leaf. (Here the pairs (α_1, β_1) and (β_1, α_1) are not distinguished.)

It is important to note that the previous definition constrains how the transformations are to be applied, but does not do so uniquely. There are still a large number of choices as to which transformation to apply.

Example  G_1 and G_2 are given below.

  G_1:  S_1 → aAC        G_2:  S_2 → aDE
        A → bAB | aB           D → bDF | a
        B → b                  E → bG
        C → a                  F → b
                               G → a

A straightforward calculation shows that ℓ = 4. An equivalence tree for S_1, S_2 is shown in Fig. 11.9. Note that the leaf labeled BC ≡ E has been circled to indicate that it has already been encountered in the tree and fully developed. There are a number of interesting points that occur in developing this tree. The reader should note the situation with the circled nodes, particularly A ≡ DF, in which the repetition occurs on the same path. This point will be clarified as additional properties of equivalence trees are established.
FIG. 11.9  (An equivalence tree for S_1, S_2; leaves labeled by previously developed pairs, such as BC ≡ E and A ≡ DF, are circled. Not reproduced here.)
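The "straightforward calculation" of ℓ in the example can be sketched as a least-fixed-point computation. The following Python fragment is ours, not the book's (the rule encoding and the name shortest_lengths are our assumptions); it computes, for each nonterminal, the length of a shortest terminal string derivable from it, on the assumption that ℓ is the maximum of these lengths over the nonterminals of both grammars, as defined earlier in the section:

  def shortest_lengths(rules):
      # rules: dict nonterminal -> list of right-hand sides (tuples of symbols);
      # a symbol is a terminal iff it is not a key of rules
      INF = float("inf")
      short = {A: INF for A in rules}
      changed = True
      while changed:                      # iterate to the least fixed point
          changed = False
          for A, rhss in rules.items():
              for rhs in rhss:
                  n = sum(short[s] if s in short else 1 for s in rhs)
                  if n < short[A]:
                      short[A] = n
                      changed = True
      return short

  g1 = {'S1': [('a','A','C')], 'A': [('b','A','B'), ('a','B')],
        'B': [('b',)], 'C': [('a',)]}
  g2 = {'S2': [('a','D','E')], 'D': [('b','D','F'), ('a',)],
        'E': [('b','G')], 'F': [('b',)], 'G': [('a',)]}
  # The maximum value over both grammars is 4, giving ℓ = 4 as in the example.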
Lemma 11.10.8  Let G_i = (V_i, Σ, P_i, S_i) be two s-grammars in standard r-form for some r ≥ 2, with N_1 ∩ N_2 = ∅ and N = N_1 ∪ N_2. Let α, β be in N*. In any equivalence tree for α, β, where lg(α), lg(β) ≤ (r − 1)(ℓ + 1) + 2, any node labeled (α_1, β_1) has the property that either

  lg(α_1) ≤ (r − 1)(ℓ + 1) + 2  or  lg(β_1) ≤ (r − 1)(ℓ + 1) + 2.

Proof  The argument is an induction on the shape of the tree.

Basis. The root clearly satisfies the condition, by hypothesis.

Induction step. Suppose we are at some arbitrary node that satisfies the condition. A new node labeled (α_1, β_1) is generated by applying some transformation T ∈ 𝒯_s to the given node. If T = T_ε or T = T_Λ, the condition is vacuously satisfied.

If T = T_A, then the new node is labeled (R(A, a)α′_1, R(B, a)β′_1), where α_1 = Aα′_1 and β_1 = Bβ′_1. By using part i(b) of the definition of an equivalence tree, we get (without loss of generality, by the symmetry) that

  lg(α′_1) = lg(α_1) − 1 ≤ (r − 1)ℓ + 2 − 1.

Also, lg(R(A, a)) ≤ r, so that

  lg(R(A, a)α′_1) ≤ (r − 1)ℓ + 2 − 1 + r = (r − 1)(ℓ + 1) + 2,

and the condition holds.

If T = T_B, then the resulting node is labeled either (α′, γβ″) or (Aγ, β′), where β_1 = β′β″, α_1 = Aα′, β′ ⇒* xγ, and x is the shortest terminal string derived from A. Note that lg(α_1) ≤ (r − 1)(ℓ + 1) + 2, by part i(c) of the definition of an equivalence tree and the hypothesis of the lemma. Now β′ is the shortest prefix of β_1 for which the above is true, so that

  lg(γ) ≤ (r − 1)·lg(x) + 1 ≤ (r − 1)ℓ + 1.    (11.10.3)

Clearly, we have

  lg(α′) = lg(α_1) − 1 ≤ (r − 1)(ℓ + 1) + 2

and

  lg(Aγ) = lg(γ) + 1 ≤ (r − 1)(ℓ + 1) + 2

from (11.10.3). □

Corollary 1  Let G_i = (V_i, Σ, P_i, S_i), i = 1, 2, be two s-grammars in standard 2-form, with N_1 ∩ N_2 = ∅ and N = N_1 ∪ N_2. If α, β ∈ N⁺ with lg(α), lg(β) ≤ ℓ + 3, then any node in an equivalence tree labeled (α_1, β_1) has either lg(α_1) ≤ ℓ + 3 or lg(β_1) ≤ ℓ + 3.

Proof  This follows directly from Lemma 11.10.8 by setting r = 2. □

Corollary 2  The number of nodes in an equivalence tree which are roots of elementary subtrees for T_A or T_B is at most c_1·|N|^{(ℓ+1)(ℓ+3)}.

Proof  The labels of such nodes are all distinct, by (ii) of the definition of equivalence trees. If such a label is (α, β), then lg(α) ≤ ℓ + 3 and lg(β) ≤ ℓ(ℓ + 3), or vice versa. The inequality for β follows from part i(a) of the definition of an equivalence tree. Thus, lg(α) ≤ ℓ + 3 and lg(β) ≤ ℓ(ℓ + 3).
There are only

  ((|N|^{ℓ+4} − 1)/(|N| − 1)) · ((|N|^{ℓ(ℓ+3)+1} − 1)/(|N| − 1)) ≤ c_1·|N|^{(ℓ+1)(ℓ+3)}

such labels, for some constant c_1, if |N| > 1. If |N| = 1, there are fewer than c_2·ℓ⁴ such labels for some constant c_2. □

Corollary 3  An equivalence tree has at most d·|N|^{(ℓ+1)(ℓ+3)}·|Σ| nodes, where d is a constant.

Proof  For any node counted in Corollary 2, there are at most max{|Σ|, 2} immediate successors obtained by applying any transformation. Each of these nodes either is a leaf or may have one successor that is a leaf, so the number in Corollary 2 need be multiplied by at most max{2|Σ| + 1, 5} as a bound on the total number of nodes. □

Now we can state and prove the main result.

Theorem 11.10.1  Let G_i = (V_i, Σ, P_i, S_i), i = 1, 2, be two s-grammars with N_1 ∩ N_2 = ∅ and N = N_1 ∪ N_2, and let α, β ∈ N*. Then α ≡ β if and only if no equivalence tree for α and β has a leaf labeled fail.

Proof  Let R be any equivalence tree for α and β. For any leaf labeled (α_1, β_1), where (α_1, β_1) does not label any ancestor of that leaf, it must be the case that (α_1, β_1) labels an internal node elsewhere in the tree. Modify R to form a new tree R_1 by appending to the leaf in question a copy of the subtree rooted at that internal node. This is continued for all such leaves. The process cannot continue indefinitely, since it cannot generate paths of length more than c_1·|N|^{(ℓ+1)(ℓ+3)}, by Corollary 2 to Lemma 11.10.8. R_1 includes every path from root to leaf of some 𝒯_s-tree for α and β in which no label is repeated. Moreover, R_1 has a fail-leaf if and only if R has a fail-leaf. If α ≡ β, then, by Lemma 11.10.7, R_1 contains no fail-leaves, and so R cannot have any. Conversely, if α ≢ β, then R_1 must have a fail-leaf, by Lemma 11.10.6; note the condition there that no label be repeated along the path. Therefore, R must have a fail-leaf. □

Corollary  The equivalence problem for s-grammars is solvable.

Proof  Construct an equivalence tree for S_1, S_2 and check it for fail-leaves. Since the tree is finite, this is an algorithm. □

PROBLEMS

1  In the definition of T_B, it was required that lg(β) > ℓ. It was explained in the text that this implies β″ ≠ Λ. Show that the condition β″ ≠ Λ is necessary to ensure that T_B is monotone. Give an example of an application of T_B with β″ = Λ that would allow us to conclude that two strings are equivalent when in fact they are not.
2  Could we develop a decision procedure for equivalence of s-grammars by dropping T_B from our set of transformations? Prove your answer.

3  Let X, Y be nonempty sets of strings and suppose that X is prefix-free. Show that XY is prefix-free if and only if Y is prefix-free.

4  Let X, Y, Z be nonempty sets of strings and assume X is prefix-free. Show that if XY = XZ, then Y = Z.

5  Let X, Y, Z be nonempty sets of words and, furthermore, suppose that Y and Z are prefix-free. Show that if YX = ZX, then Y = Z.

6  Translate Problems 4 and 5 into results about cancellation of equivalent strings over N*. Use these identities to find more shortcuts in dealing with equivalence trees.

7*  Generalize the main result of this section by showing that it is decidable, of two deterministic languages, one of which is an s-language, whether or not they are equal. You should assume, for maximum generality, that the languages are presented by arbitrary dpda's.

11.11 HISTORICAL SURVEY

Theorem 11.2.3 was originally proved in Ginsburg and Greibach [1966], as were many of the closure results in Section 11.3. Our proof of Theorem 11.2.3 follows Book and Greibach [1973], which in turn follows Hopcroft and Ullman [1968a]. The material on strict deterministic grammars follows Harrison and Havel [1972, 1973, 1974]. Problem 1 of Section 11.5 is from Geller and Harrison [1977]. Lemma 11.6.2 is from Ginsburg and Greibach [1966]. Theorem 11.7.1 and the exercises on realtime Turing-machine computations are derived from the work of Rosenberg [1967, 1968]. The unsolvability of the inclusion problem for simple languages is due to Friedman [1976], while the solvability of the equivalence problem is due to Korenjak and Hopcroft [1966]. Our proof follows Harrison, Havel, and Yehudai [1978].
twelve
Recognition and Parsing of General Context-Free Languages

12.1 INTRODUCTION

One of the most important applications of language theory concerns the recognition and parsing of context-free languages. Such recognizers are at the heart of many programs that take their input in natural-language form. Although some of these systems use more sophisticated types of grammars, like transformational grammars, even these rest on a context-free base.

In the present chapter, we begin by discussing machine models for recognition which are closer to actual computers than Turing machines. It is shown that there is a "hardest" context-free language with respect to recognition time and space. Section 12.3 is devoted to techniques for doing elementary operations on matrices, because it will turn out that one can reduce the recognition problem to computing the transitive closure of a matrix. The Cocke-Kasami-Younger algorithm appears in Section 12.4 and is analyzed in some detail. In Section 12.5, recognition is reduced to boolean matrix multiplication, in an algorithm due to Valiant. A very useful practical algorithm, due to Graham, Harrison, and Ruzzo, is given in Section 12.6. Section 12.7 includes other related problems, such as bounds on the space required for general context-free recognition.

12.2 MATHEMATICAL MODELS OF COMPUTERS AND A HARDEST CONTEXT-FREE LANGUAGE

In order to avoid taking some tangents in our later development, we shall give some definitions and results here that will be used later in this chapter.
It will be important to determine the number of steps in a derivation of a string of length n. We will use this result in subsequent sections to analyze the number of steps required by certain parsing algorithms. Unless some restriction is placed on the grammar or the derivation, the length of a derivation may be unbounded. Our purposes will be served by restricting the grammar to be of the following type:

Definition  A context-free grammar G = (V, Σ, P, S) is cycle-free if, for each A ∈ N, A ⇒⁺ A is impossible.

Now we can state a result which says that, in a cycle-free grammar, the length of a derivation is linear in the length of the generated string.

Theorem 12.2.1  Let G = (V, Σ, P, S) be a cycle-free, context-free grammar and let ℓ = max{lg(α) | A → α is in P, A ∈ N}. There exist constants c_0, c_1, and c_2, which depend only on |N| and ℓ, such that c_1 > c_2, and for all A ∈ N, α ∈ V*, if A ⇒^i α, lg(α) = n, then

a) if n = 0, then i ≤ c_0;
b) if n > 0, then i ≤ c_1·n − c_2.

Proof  We must first consider the derivations of strings whose length is at most one.

Claim 1  Suppose A ⇒^i α, where lg(α) ≤ 1. Then:

i) If α = Λ, no path in the derivation tree is longer than |N|.
ii) If α = a ∈ Σ, then the path from the root labeled A to the leaf labeled a is no longer than |N|.
iii) If α = B ∈ N, then the path from the root to the leaf labeled B is no longer than |N| − 1.

Proof of Claim 1  Consider a path in the derivation tree of A ⇒^i α, as shown in Fig. 12.1. Suppose the path from A = A_1 to α is of length k, so that the path passes through

  A = A_1, A_2, . . . , A_k, α.

FIG. 12.1  (A path from the root A = A_1 through A_2, . . . , A_k to the leaf α.)

Since A_i ⇒⁺ A_i is not possible, we must have that all of A_1, . . . , A_k, and α are distinct. If α = B ∈ N, then k ≤ |N| − 1. If α = a ∈ Σ ∪ {Λ}, then k ≤ |N|. This completes the proof of Claim 1.
Now we are in a position to complete the analysis of Λ-derivations. Note that a derivation A ⇒^i Λ has a tree of height ≤ |N|, and the number of nodes at height j in the tree is at most ℓ^j. Thus,

  i ≤ Σ_{j=0}^{|N|−1} ℓ^j = c_0 = { 1, if ℓ = 0;  |N|, if ℓ = 1;  (ℓ^{|N|} − 1)/(ℓ − 1), if ℓ > 1. }    (12.2.1)

Let c_0 be this constant in Eq. (12.2.1), which depends only on ℓ and |N|. Thus we have proved part (a) of the theorem.

Now we must work out the case where a nonterminal derives another nonterminal.

Claim 2  Suppose A ⇒^i B, B ∈ V. If the length of the path from the root A to the leaf labeled B is k, then

  i ≤ k((ℓ − 1)c_0 + 1).

Proof of Claim 2  The argument is an induction on k.

Basis. k = 0 implies i = 0, and the basis is verified.

Induction step. Assume that k > 0 and that the result is true for the path to the node labeled B, of length k − 1. Then

  A ⇒ A_1 ⋯ A_m,

where A_d ⇒^{i_d} B for some d, 1 ≤ d ≤ m, and A_j ⇒^{i_j} Λ if j ≠ d. By the induction hypothesis,

  i_d ≤ (k − 1)((ℓ − 1)c_0 + 1).

Moreover, i_j ≤ c_0 if j ≠ d, by our proof of (a). Therefore,

  i = 1 + i_d + Σ_{j ≠ d} i_j ≤ 1 + (k − 1)((ℓ − 1)c_0 + 1) + (ℓ − 1)c_0 = k((ℓ − 1)c_0 + 1),

using that m ≤ ℓ. This completes the proof of Claim 2.
If we combine Claims 1 and 2, we have already shown the following result:

Claim 3  Suppose A ⇒^i B, B ∈ V. Then

  i ≤ |N|((ℓ − 1)c_0 + 1)        if B ∈ Σ,
  i ≤ (|N| − 1)((ℓ − 1)c_0 + 1)  if B ∈ N.

Now we are ready for the main claim, which will complete the argument.

Claim 4  If A ⇒^i α, lg(α) = n > 0, then i ≤ c_1·n − c_2, where

  c_1 = 2|N|((ℓ − 1)c_0 + 1) − c_0
and
  c_2 = |N|((ℓ − 1)c_0 + 1) − c_0.

Proof of Claim 4  The argument is an induction on n.

Basis. n = 1. We compute

  c_1·1 − c_2 = |N|((ℓ − 1)c_0 + 1).

So i ≤ c_1·1 − c_2, by Claim 3.

Induction step. Assume the result true for all strings of length at least 1 and less than n. Suppose that A ⇒^i α, lg(α) = n ≥ 2. The tree of this derivation must have a node with at least two immediate descendants producing nonnull strings, as shown in Fig. 12.2.

FIG. 12.2  (The highest node D of the tree having at least two immediate descendants that derive nonnull strings.)

Let D be the highest such descendant of A. Thus we have

  A ⇒^{i*} D ⇒ D_1 ⋯ D_m,    D_j ⇒^{i_j} α_j,  lg(α_j) = n_j,
for each j, 1 ≤ j ≤ m. Moreover,

  α_1 ⋯ α_m = α  and  Σ_{j=1}^{m} n_j = n,

and there exist j_1 ≠ j_2 such that n_{j_1} ≥ 1 and n_{j_2} ≥ 1, since lg(α) ≥ 2. To simplify the notation of the proof, assume that there is some p, 2 ≤ p ≤ m, such that

  α_1, . . . , α_p ≠ Λ  and  α_{p+1} = ⋯ = α_m = Λ.

(There is no loss of generality in this assumption since, if this condition were not satisfied, we could introduce new indexing to make it so.) Then

  i = i* + 1 + Σ_{j=1}^{m} i_j = i* + 1 + Σ_{j=1}^{p} i_j + Σ_{j=p+1}^{m} i_j.

By Claim 3 applied to i*, the induction hypothesis applied to each i_j with j ≤ p, and part (a) applied to each i_j with j > p,

  i ≤ (|N| − 1)((ℓ − 1)c_0 + 1) + 1 + Σ_{j=1}^{p} (c_1·n_j − c_2) + Σ_{j=p+1}^{m} c_0
    = (|N| − 1)((ℓ − 1)c_0 + 1) + 1 + c_1·n − p·c_2 + (m − p)c_0
    = c_1·n + (|N| − 1)((ℓ − 1)c_0 + 1) + 1 + m·c_0 − p[|N|((ℓ − 1)c_0 + 1) − c_0 + c_0]

by substituting the value of c_2. Continuing,

  i ≤ c_1·n + (|N| − 1)((ℓ − 1)c_0 + 1) + 1 + m·c_0 − p·|N|((ℓ − 1)c_0 + 1).

Since m ≤ ℓ and p ≥ 2,

  i ≤ c_1·n + (|N| − 1)((ℓ − 1)c_0 + 1) + 1 + ℓ·c_0 − 2|N|((ℓ − 1)c_0 + 1)
    = c_1·n − (|N| + 1)((ℓ − 1)c_0 + 1) + ℓ·c_0 + 1
    = c_1·n − |N|((ℓ − 1)c_0 + 1) − (ℓ − 1)c_0 − 1 + ℓ·c_0 + 1
    = c_1·n − [|N|((ℓ − 1)c_0 + 1) − c_0]
    = c_1·n − c_2.

Since the induction has been extended, the proof of Claim 4 and the proof of the theorem are complete. □
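The constants of the theorem can be computed directly from |N| and ℓ. The following Python fragment is a sketch of ours, not part of the original text (the function name is an assumption); it evaluates c_0 by Eq. (12.2.1) and c_1, c_2 as in Claim 4:

  def derivation_constants(size_N, ell):
      # c0 as in Eq. (12.2.1): the sum of ell**j for j = 0, ..., |N| - 1
      if ell == 0:
          c0 = 1
      elif ell == 1:
          c0 = size_N
      else:
          c0 = (ell**size_N - 1) // (ell - 1)
      x = (ell - 1) * c0 + 1          # the quantity (ℓ - 1)c0 + 1 of Claims 2-4
      c1 = 2 * size_N * x - c0
      c2 = size_N * x - c0
      return c0, c1, c2

For example, a grammar in Chomsky normal form (ℓ = 2) with |N| = 3 gives c_0 = 7, so any derivation of a string of length n > 0 takes at most c_1·n − c_2 steps.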
Because our fundamental goal in this chapter is to examine parsing and the resources that are required for its implementation, we must discuss how to implement it and how to compare algorithms. The usual technique is to implement algorithms on a standard model of a computer and then to estimate the time and storage requirements of each algorithm. For that approach, standardized models of computation must be used. There are two models in common usage which we shall employ, the Turing machine and the random-access machine (or RAM, for short). We already know about the use of Turing machines as language acceptors from Chapter 9. Although Turing machines have an important role in the theory of computation, they are not natural models for algorithmic analysis. Turing-machine computations spend a great deal of time moving back and forth along the tapes looking for information, and do not have the familiar random-access memory properties of computers. This suggests that another model is needed, and has led to the use of RAM's.

A RAM is a device of the type shown in Fig. 12.3. It has an input tape that can be read. Unlike a Turing-machine tape, every square may contain an integer, and there is no a priori bound on the size of the integers used. The output tape is of the same type, except that it can be written but not read.

FIG. 12.3  (A RAM: a read-only input tape; registers r_0 (the accumulator), r_1, r_2, . . . ; a finite program memory with a program counter; and a write-only output tape.)

The RAM has an unbounded number of registers, each of which is capable of storing an integer. The first register is thought of as the "accumulator." There is no bound on the size of the integers stored in a register, nor is there a bound on the number of registers that may be utilized. There is a program in a RAM, which resides in a program memory that cannot be altered. Each register can be directly accessed by the program. A program counter indicates which instruction is currently being executed. The output is printed on a write-only output tape, each square of which can hold an arbitrary integer.

There is an instruction set, which can be anything that is sufficient to compute any partial recursive function. For example, a simple instruction set might be:

  LOAD a
  STORE a
  SUBTRACT a
  READ a
  WRITE a
  BRANCH ZERO ℓ

where ℓ labels an instruction in the program.
On the other hand, a may be (i) a nonnegative integer representing a register, (ii) a literal, which means that constants are available, or (iii) a tagged nonnegative integer used for indirection. The interpretations of these terms are familiar to those who have experience in assembly-language programming.

We are being deliberately vague about the actual instruction set. Cf. Aho, Hopcroft, and Ullman [1974] for a number of different choices. The reason we are vague is that we wish to use computers at a high level, and so we shall write algorithms for a RAM in an ALGOL-like dialect. It would be a straightforward task to translate such a language into RAM code.

Example  Suppose locations A_0 to A_{n−1} in our RAM each contain an integer. We wish to write a program to sort the integers into ascending order. Let b take the values true or false and let swap exchange the values of its arguments.

  procedure swap(x, y);
    begin
      t := x; x := y; y := t;
      comment the procedure is called by reference;
    end;
  begin
    repeat
      b := false;
      for i := 0 to n − 2 do
        if a_i > a_{i+1} then
          begin swap(a_i, a_{i+1}); b := true end
    until not(b)
  end.

We leave it to the reader to analyze this algorithm, but note that, in the worst case, we might have n(n − 1) comparisons. Thus this is an algorithm that runs in quadratic time but linear space.

In our ALGOL dialect, program labels will have no computational significance, but will simply serve as reference points for the discussion. In addition to program variables containing integer or boolean values, we will allow variables to take on (unordered) sets as values.

Our algorithms will often deal with two-dimensional matrices. The matrices will use 0-origin addressing. That is, an m × n matrix will consist of rows 0 to m − 1 and columns 0 to n − 1. We will use the following matrix concepts. A matrix M = (m_{i,j}) is upper triangular (strictly upper triangular) if it contains only elements m_{i,j} with i ≤ j (i < j). The principal diagonal of an n × n matrix M consists of the elements m_{0,0}, m_{1,1}, . . . , m_{n−1,n−1}.

If we formalize what we have done in the example, the time complexity of a program is taken to be a function f(n), which is the maximum over all inputs of size n of the sum of the time taken by each instruction executed.
For example, we could write a machine-language RAM program to subtract two numbers by writing:

  READ 1
  READ 2
  LOAD 1
  SUBTRACT 2
  STORE 3
  WRITE 3
  HALT

There are 7 instructions executed, so the time complexity is 7. This definition of time complexity could be considered unrealistic, because the result is independent of the length of the numbers. In cases like this, it is said that the uniform-cost criterion is being used. We could instead charge log N to every instruction which uses integers of magnitude at most N. In that case, the logarithmic-cost criterion is being employed. In many of our applications, the uniform-cost criterion is appropriate and will be used.

There is another type of model that is sometimes useful, the bit-vector machine. A bit-vector machine works like a RAM except that its memory utilizes bit vectors (of unbounded length). It is also convenient to have index registers and indirect addressing. A typical set of instructions for such a machine would be as follows:

  Arithmetic operations on index registers;
  v_k := f(v_i, v_j), where f is any boolean function of two variables, extended to vectors by bitwise operation;
  SHIFT L v, SHIFT R v (shift a word left or right);
  Transfer instructions for vectors and index registers.

As defined here, bit-vector machines are very powerful, because many complex processes can be coded into very long binary strings, and a bit-vector machine can perform its operations on these long strings in one step. When we use this model, we must argue that its use is realistic in a computational sense. For instance, if we are parsing strings of length n and we are using bit vectors of length log n, that seems realistic because, in practice, most strings parsed are only a few hundred characters long at most. It would certainly be unrealistic to store strings of length n² in a single word. Because of the way a bit-vector machine works, it is usually assumed that there are no input or output operations. One can recognize context-free languages on a bit-vector machine in time (log n)³. Now, that bound is less than linear time, and linear time is required to read the input in a reasonable model. So the method of achieving (log n)³ must be unrealistic.

Most of the discussion in this section has been descriptive. There is a mathematical result that we wish to establish. We wish to exhibit a single context-free language which has the property that, if one can parse this language in time f(n) ≥ n
and/or space g(n) ≥ n, then any context-free language can be parsed within these bounds. The significance of this result is that it is unnecessary to derive general time and space bounds for each of the parsing algorithms that will be studied. We could show instead that these algorithms achieve these bounds on just one particular grammar. We will not choose this approach, because the general techniques of algorithmic analysis are often more informative, and they lead us to notice significant special cases; but it is possible in principle.

Let A denote our favorite model of a computer, for instance, a random-access machine, a bit-vector machine, or some form of Turing machine.

Theorem 12.2.2  Let A be any computational model of the type just described that accepts L ⊆ Δ* in time p(n) ≥ n, where p is some polynomial. Let φ: Σ* → Δ* be a homomorphism. The set φ^{−1}(L) = {w ∈ Σ* | φ(w) ∈ L} can be accepted in time proportional to p(n) by a device A′ of the same type as A. The same result holds for space.

Proof  We construct A′ to work as follows:

1. Scan w and translate it to φ(w). This takes k·lg(w) steps, for some constant k.
2. Simulate A on φ(w). This takes p(lg(φ(w))) steps.
3. Thus w may be processed in p(lg(φ(w))) + k·lg(w) ≤ k′·p(lg(w)) steps, for some constant k′.

Note that a similar result holds for space. □

Let us now recall a specific context-free language, namely L_0 of Section 10.5. Let Σ = {a_1, a_2, ā_1, ā_2, ¢, c}. Define

  L_0 = {Λ} ∪ {x_1 c y_1 c z_1 d ⋯ d x_n c y_n c z_n d | n ≥ 1, y_1 ⋯ y_n ∈ ¢D_2, x_i, z_i ∈ Σ* for all i, 1 ≤ i ≤ n, y_i ∈ {a_1, a_2, ā_1, ā_2}* for all i ≥ 2},

where D_2 is the semi-Dyck set on two generators.

Theorem 12.2.3  L_0 is a "hardest" context-free language. That is, the time and tape complexity for recognizing any context-free language is no worse than the time and tape bounds for L_0.

Proof  The result follows immediately from Theorem 12.2.2 and Theorem 10.5.1. □
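The construction in Theorem 12.2.2 is direct enough to sketch in code. The following Python fragment is ours, not the book's (the names phi and accepts are assumptions of the sketch); it builds a recognizer for φ^{−1}(L) from a recognizer for L:

  def inverse_hom_recognizer(phi, accepts):
      # phi: dict mapping each letter of Σ to its image string over Δ
      # accepts: a recognizer for L ⊆ Δ*
      def accepts_preimage(w):
          image = "".join(phi[c] for c in w)   # step 1: translate w to φ(w), linear time
          return accepts(image)                # step 2: simulate the recognizer on φ(w)
      return accepts_preimage

Since lg(φ(w)) is at most a constant multiple of lg(w), the simulation costs p(lg(φ(w))) ≤ k′·p(lg(w)), which is the whole content of the theorem.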
PROBLEMS

1  A weakened statement of Theorem 12.2.1 could be that, in a cycle-free grammar, there is a linear relationship between the length of a string generated from A and the length of the derivation. Can you give a quick proof of this result by contradiction? If you succeed, why is the proof given in the text superior?

2  Suppose A is a k-tape Turing machine which accepts a set L with time complexity T(n) ≥ n. Show that L can be accepted by a RAM program in time proportional to T(n) with the uniform-cost criterion, and T(n) log T(n) with the logarithmic-cost criterion.

3  Show that the converse of Problem 2 is false for the uniform-cost criterion. That is, there is a language L accepted by some RAM program P in time T(n) under the uniform-cost criterion such that no Turing machine A′ accepts L in time p(T(n)) for any polynomial p(x).

4  Let L be a language accepted by a RAM program in time T(n) under the logarithmic-cost criterion. Find the smallest possible polynomial p(x) so that L is accepted by a Turing machine in time p(T(n)). For example, does p(x) = x³ suffice?

5  If indirect addressing is removed as a feature of a RAM, what consequences follow?

6  Find a simple deterministic context-free language L_1 such that Δ_0 ⊆ DSPACE(log n) if and only if L_1 ∈ DSPACE(log n). Recall that Δ_0 is the family of deterministic context-free languages. "Simple" is used in the sense of Section 11.9.

7  Consider the set L_1 of Problem 6. Is it fair to regard L_1 as a hardest deterministic context-free language?

12.3 MATRIX MULTIPLICATION, BOOLEAN MATRIX MULTIPLICATION, AND TRANSITIVE CLOSURE ALGORITHMS

A close connection between matrix multiplication, boolean matrix multiplication, and transitive-closure algorithms will emerge in this chapter. Some classical results in this area are now recalled. Let us first consider multiplication of matrices over an arbitrary ring. The key idea is to study the 2 × 2 case first, and then to reduce the general case to it.

Lemma 12.3.1  The product of two 2 × 2 matrices whose elements are from any ring can be computed with seven multiplications and fifteen additions.

Proof  Let

  [a_11  a_12] [b_11  b_12]   [c_11  c_12]
  [a_21  a_22] [b_21  b_22] = [c_21  c_22].
The following computation computes the c_ij. Each s_t term involves one addition and each m_t involves a single multiplication.

  m_1 = a_11·b_11          m_2 = a_12·b_21
  s_1 = a_11 − a_21        s_2 = a_21 + a_22
  s_3 = a_22 − s_1         s_4 = a_12 − s_3
  s_5 = b_22 − b_12        s_6 = b_12 − b_11
  s_7 = s_5 + b_11         s_8 = b_21 − s_7
  m_3 = s_4·b_22           m_4 = s_1·s_5
  m_5 = s_2·s_6            m_6 = s_3·s_7
  m_7 = a_22·s_8           s_9 = m_1 + m_6
  c_11 = s_10 = m_1 + m_2
  s_11 = s_9 + m_5         c_12 = s_12 = s_11 + m_3
  s_13 = s_9 + m_4         c_21 = s_14 = s_13 + m_7
  c_22 = s_15 = s_13 + m_5

It is a straightforward task to verify that these identities correctly compute the c_ij. □
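Since verifying the identities by hand is tedious, here is a small Python check (ours, not the book's) that evaluates the scheme on 2 × 2 integer matrices and compares the result with the direct product:

  import random

  def fast_2x2(a, b):
      # the seven-multiplication, fifteen-addition scheme of Lemma 12.3.1
      (a11, a12), (a21, a22) = a
      (b11, b12), (b21, b22) = b
      m1 = a11 * b11; m2 = a12 * b21
      s1 = a11 - a21; s2 = a21 + a22
      s3 = a22 - s1;  s4 = a12 - s3
      s5 = b22 - b12; s6 = b12 - b11
      s7 = s5 + b11;  s8 = b21 - s7
      m3 = s4 * b22;  m4 = s1 * s5
      m5 = s2 * s6;   m6 = s3 * s7
      m7 = a22 * s8
      s9 = m1 + m6
      c11 = m1 + m2
      c12 = (s9 + m5) + m3
      c21 = (s9 + m4) + m7
      c22 = (s9 + m4) + m5
      return [[c11, c12], [c21, c22]]

  for _ in range(1000):
      a = [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
      b = [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
      direct = [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
      assert fast_2x2(a, b) == direct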
Now we shall treat general matrix multiplication in terms of the 2 × 2 case.

Theorem 12.3.2  Two n × n matrices whose elements come from an arbitrary ring can be multiplied in time proportional to n^{2.81}.

Proof  Let us consider the case n = 2^k. Let us write the equation C = A·B involving n × n matrices as

  [C_11  C_12]   [A_11  A_12] [B_11  B_12]
  [C_21  C_22] = [A_21  A_22] [B_21  B_22],    (12.3.1)

where each C_ij, A_ij, and B_ij is an (n/2) × (n/2) matrix. Thus we may regard A, B, and C as 2 × 2 matrices over a different ring. It is this observation that will allow us to use Lemma 12.3.1.

Let M(n) be the number of scalar multiplications necessary to multiply two n × n matrices, let A(n) be the number of scalar additions necessary to multiply two n × n matrices, and let T(n) = M(n) + A(n). Note that (n/2)² scalar additions are needed to add two (n/2) × (n/2) matrices. Using Eq. (12.3.1) and Lemma 12.3.1, we see that seven multiplications of (n/2) × (n/2) matrices and 15 additions of (n/2) × (n/2) matrices suffice; that is,

  T(n) = 7T(n/2) + 15(n/2)².

Thus

  M(n) = 7M(n/2)  and  A(n) = 7A(n/2) + 15(n/2)²,  n = 2^k.

Using the initial conditions M(2) = 7 and A(2) = 15, we conclude that

  M(n) = 7^k,
  A(n) = 5(7^k − 4^k),    (12.3.2)
  T(n) = 6·7^k − 5·4^k,

as the reader may easily verify by induction on k. Note that k = log_2 n, so that Eq. (12.3.2) gives

  M(n) = 7^k = 7^{log_2 n} = n^{log_2 7} ≈ n^{2.81}.

Also observe that

  T(n) = 6·n^{2.81} − 5·n² ≤ 6·n^{2.81}.    (12.3.3)

Note that, if n is not a power of 2, one can pad the matrices out to the next larger power of 2. This can (at worst) double n, which increases the constant in Eq. (12.3.3) by a bounded factor. □

We could instead assume, in Theorem 12.3.2, that n = 2^k·h, where h is odd. It is then possible to derive an expression of the form T(n) = c·n^{2.81} with c < 6. We leave this refinement for the reader.

Thus we have an O(n^{2.81}) time bound for multiplication of n × n matrices over a ring. However, the product operation of a ring is associative, while our applications will use nonassociative products. Consequently, we extend our previous result to multiplication of boolean matrices and then reduce our new product operation to boolean products. Boolean matrices are n × n matrices whose elements come from {0, 1}. The associated scalar operations + and ·, given by the following tables for boolean addition and multiplication, satisfy all the properties of a ring except that the element 1 has no additive inverse.

   +  | 0  1        ·  | 0  1
  ----+------      ----+------
   0  | 0  1        0  | 0  0
   1  | 1  1        1  | 0  1
As usual, we define matrix multiplication by the "sum of products" rule.

Example

  [1  1] [1  1]   [1  1]
  [0  1] [1  0] = [1  0].

Since we do not have a ring, we cannot apply our technique for fast multiplication to boolean matrices directly. Nevertheless, as the next theorem indicates, we can multiply boolean matrices within the same time bound.

Theorem 12.3.3  The product of two n × n boolean matrices can be computed in time proportional to n^{2.81}.

Proof  Given two n × n boolean matrices A and B, compute C = A·B by the algorithm given in Theorem 12.3.2, using Z_{n+1}, the integers modulo (n + 1), as the ground ring. Note that, for 1 ≤ i, j ≤ n, 0 ≤ c_{i,j} ≤ n. Let D be the boolean product of A and B.

Claim  d_{i,j} = 1 if and only if 1 ≤ c_{i,j} ≤ n.

Proof of claim  If d_{i,j} = 0, then there is no k such that a_{i,k}·b_{k,j} = 1, and hence c_{i,j} = 0. If d_{i,j} = 1, then, for some value of k, a_{i,k} = 1 and b_{k,j} = 1. Therefore 1 ≤ c_{i,j} ≤ n, and both the claim and the theorem follow. □
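The proof translates directly into code. A minimal sketch in Python (ours): compute the integer product modulo n + 1 and then threshold the entries. Any ring-based method, such as that of Theorem 12.3.2, could be substituted for the cubic product used here for clarity.

  def boolean_product(A, B):
      # A, B: n x n matrices with entries 0 or 1
      n = len(A)
      C = [[sum(A[i][k] * B[k][j] for k in range(n)) % (n + 1)
            for j in range(n)] for i in range(n)]
      # c_ij counts the witnesses k, and 0 <= count <= n, so reduction
      # modulo n + 1 never maps a nonzero count to 0
      return [[1 if C[i][j] else 0 for j in range(n)] for i in range(n)]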
PROBLEMS

Transitive closure is an important and familiar operation on relations. It generalizes to graphs and matrices by the familiar connections among relations, matrices, and graphs.

1  Let A be an n × n boolean matrix. Show that A* = I ∪ A ∪ A² ∪ ⋯ = (I ∪ A)^n.

2  Another procedure for computing A* or A⁺ involves Warshall's algorithm, shown below.

  begin
    comment A is an n × n boolean matrix whose closure is to be computed; the answer is stored in B;
    for 1 ≤ i, j ≤ n do c_{i,j}^{(0)} := A[i, j];
    for k := 1 to n do
      for 1 ≤ i, j ≤ n do
        c_{i,j}^{(k)} := c_{i,j}^{(k−1)} ∪ c_{i,k}^{(k−1)}·c_{k,j}^{(k−1)};
    for 1 ≤ i, j ≤ n do B[i, j] := c_{i,j}^{(n)}
  end.

Prove that Warshall's algorithm uses O(n³) ∪ and · operations on a RAM and that B = A⁺. Discuss appropriate cost functions for the computation.

3  Implement Warshall's algorithm on a bit-vector machine by storing rows (columns). What is the time bound achieved this way, and what is the appropriate cost function?

4  By a suitable encoding, show that the algorithm of Problem 3 can be implemented on a unit-cost RAM in time n² log n. Hint: Map (b_1, . . . , b_n) into

  0 ⋯ 0 b_1 0 ⋯ 0 b_2 ⋯ 0 ⋯ 0 b_n,

where each b_i is preceded by a block of log n 0's, so that the whole encoding has length n log n.

12.4 THE COCKE-KASAMI-YOUNGER ALGORITHM

The algorithm to be presented requires grammars in Chomsky normal form. As we saw in Theorem 4.4.1, there is no loss of generality in this assumption. The string Λ is in a language if and only if the Chomsky normal-form grammar for the language has a rule S → Λ, where S is the start symbol. Furthermore, the rule S → Λ cannot be used in the derivation of any non-null string. Consequently, without loss of generality, we can restrict our attention to non-null input strings and Λ-free grammars in Chomsky normal form.

The key to the Cocke-Kasami-Younger algorithm is to form a strictly upper-triangular matrix of size (n + 1) × (n + 1), where n is the length of the input string. The entries in the matrix are subsets of the variables. We can then determine membership in the language by inspecting one matrix entry. We can also use the matrix to generate a parse.

Algorithm 12.4.1  Let G = (V, Σ, P, S) be a grammar in Chomsky normal form (without the rule S → Λ) and let w = a_1 a_2 ⋯ a_n, n ≥ 1, be a string where, for 1 ≤ k ≤ n, a_k ∈ Σ. Form the strictly upper-triangular (n + 1) × (n + 1) recognition matrix T as follows, where each element t_{i,j} is a subset of V − Σ and is initially empty.†

  begin
  loop 1:  for i := 0 to n − 1 do
             t_{i,i+1} := {A | A → a_{i+1} is in P};
  loop 2:  for d := 2 to n do
             for i := 0 to n − d do
               begin
                 j := d + i;
                 t_{i,j} := {A | there exists k, i + 1 ≤ k ≤ j − 1, such that A → BC is in P for some B ∈ t_{i,k}, C ∈ t_{k,j}}
               end
  end.

† Recall the 0-origin addressing convention for matrices.

Example

  S → SS | AA | b        A → AS | AA | a
  w = aabb
The recognition matrix is

  T:      j = 1    j = 2     j = 3     j = 4
  i = 0:  {A}      {S, A}    {S, A}    {S, A}
  i = 1:           {A}       {A}       {A}
  i = 2:                     {S}       {S}
  i = 3:                               {S}

In computing T, the superdiagonal stripe is filled in by loop 1. The remaining computation is done along the diagonals j − i = d, working up toward the righthand corner as d = 2, . . . , n. Thus the rows are filled left to right, and the columns are filled bottom to top.

As shown in Fig. 12.4, the (i, j)th element of the matrix is computed by scanning across the ith row and down the jth column, always to the left of and below the entry in question. Thus we successively consider the pairs of entries

  (i, i + 1)  (i + 1, j)
  (i, i + 2)  (i + 2, j)
      ⋮           ⋮
  (i, j − 1)  (j − 1, j)

Each matrix entry is a set of nonterminals. For each pair of entries, we find every grammar rule with a right part composed of a symbol from the first entry followed by a symbol from the second entry. The left parts of these rules constitute the (i, j)th entry of the recognition matrix.
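Algorithm 12.4.1 is short enough to state executably. Here is a Python sketch of ours, not the book's; the rule encodings unit_rules and pair_rules are assumptions of the sketch:

  def cky_matrix(w, unit_rules, pair_rules):
      # unit_rules: pairs (A, a) for rules A -> a
      # pair_rules: triples (A, B, C) for rules A -> BC
      n = len(w)
      t = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
      for i in range(n):                         # loop 1: the superdiagonal
          t[i][i + 1] = {A for (A, a) in unit_rules if a == w[i]}
      for d in range(2, n + 1):                  # loop 2: diagonal by diagonal
          for i in range(n - d + 1):
              j = d + i
              for k in range(i + 1, j):
                  t[i][j] |= {A for (A, B, C) in pair_rules
                              if B in t[i][k] and C in t[k][j]}
      return t

  unit = [('S', 'b'), ('A', 'a')]
  pairs = [('S', 'S', 'S'), ('S', 'A', 'A'), ('A', 'A', 'S'), ('A', 'A', 'A')]
  t = cky_matrix("aabb", unit, pairs)
  # 'S' in t[0][4] holds, so aabb is in the language, as in the example above.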
FIG. 12.4  (Computation of the (i, j)th entry: the pairs scanned lie in row i, to the left of (i, j), and in column j, below (i, j).)

The next theorem and its corollary show that the algorithm is correct, in that it recognizes all and only strings generated by the associated grammar.

Theorem 12.4.1  Let G = (V, Σ, P, S) be a grammar in Chomsky normal form (without the rule S → Λ). Let w = a_1 a_2 ⋯ a_n, n ≥ 1, be a string, where for 1 ≤ k ≤ n, a_k ∈ Σ, and let T = (t_{i,j}) be the recognition matrix constructed by Algorithm 12.4.1. Then A ∈ t_{i,j} if and only if A ⇒⁺ a_{i+1} a_{i+2} ⋯ a_j.

Proof  Since the recognition matrix is strictly upper-triangular and (n + 1) × (n + 1), the algorithm must set entries t_{i,j} where 0 ≤ i ≤ n − 1, 1 ≤ j ≤ n, and i < j. Consequently, 1 ≤ j − i ≤ n. The first loop handles the case j − i = 1. In the second loop, since j = d + i, we know that d = j − i. For a given value of d, we must have j = d + i ≤ n; consequently, i ≤ n − d. It follows that Algorithm 12.4.1 sets every entry of the recognition matrix, and each element is assigned a value only once. We show by induction on d that the values of T correspond appropriately to derivations.

Basis. j − i = 1, so j − 1 = i. Clearly, after the first loop, A ∈ t_{i,i+1} if and only if A ⇒ a_{i+1}.

Induction step. For some d = j − i, 1 < d ≤ n, assume the result for all d′ < d. After the second loop, A ∈ t_{i,j} if and only if there exists k, i < k < j, such that A → BC is in P for some B ∈ t_{i,k}, C ∈ t_{k,j}.
Since k < j, we have k − i < j − i = d, and the induction hypothesis applies to t_{i,k}. Also, since i < k, j − k < j − i = d, and the induction hypothesis holds for t_{k,j} as well. Thus

  B ∈ t_{i,k} if and only if B ⇒⁺ a_{i+1} ⋯ a_k,
  C ∈ t_{k,j} if and only if C ⇒⁺ a_{k+1} ⋯ a_j.

Combining these results, we have A ∈ t_{i,j} if and only if A ⇒ BC ⇒⁺ a_{i+1} ⋯ a_j. This extends the induction. □

Corollary  w ∈ L(G) if and only if S ∈ t_{0,n}.

Algorithm 12.4.1 is an "off-line" algorithm in the sense that all of the input is read at the beginning, and the successive matrix entries computed correspond, by Theorem 12.4.1, to sets of successively longer substrings of the entire input string. An "on-line" algorithm would compute the matrix entries in such a way that, after reading a_j, the recognition matrix would indicate which nonterminals generate a_1 a_2 ⋯ a_j. In fact, Algorithm 12.4.1 can be rewritten as an on-line algorithm. We present this algorithm so that the reader can compare it with other on-line algorithms.

Algorithm 12.4.2  Let G = (V, Σ, P, S) be a grammar in Chomsky normal form (without the rule S → Λ) and let w = a_1 a_2 ⋯ a_n, n ≥ 1, be a string, where for 1 ≤ j ≤ n, a_j ∈ Σ. Form the strictly upper-triangular (n + 1) × (n + 1) recognition matrix T as follows, where each element t_{i,j} is a subset of V − Σ:

  begin
    for j := 1 to n do
      begin
        t_{j−1,j} := {A | A → a_j is in P};
        for i := j − 2 down to 0 do
          t_{i,j} := {A | there exists k, i + 1 ≤ k ≤ j − 1, such that A → BC is in P for some B ∈ t_{i,k}, C ∈ t_{k,j}}
      end
  end.

Note that this algorithm fills in the matrix column by column, and the column elements are computed "bottom to top." We leave it to the reader to verify that Algorithm 12.4.1 and Algorithm 12.4.2 yield the same recognition matrix for any grammar and string.
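For comparison, the on-line variant in the same Python style (ours, under the same encoding assumptions as before): after column j is filled, the matrix already answers which nonterminals generate a_1 ⋯ a_j.

  def cky_online(w, unit_rules, pair_rules):
      n = len(w)
      t = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
      for j in range(1, n + 1):                  # read a_j, then fill column j
          t[j - 1][j] = {A for (A, a) in unit_rules if a == w[j - 1]}
          for i in range(j - 2, -1, -1):         # bottom to top within the column
              for k in range(i + 1, j):
                  t[i][j] |= {A for (A, B, C) in pair_rules
                              if B in t[i][k] and C in t[k][j]}
      return t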
Next we estimate the number of steps required by the algorithm, as a function of the length of the string being analyzed. Observe that, for a given terminal symbol a_{i+1}, computation of {A | A → a_{i+1} is in P} takes a fixed amount of time. Similarly, for fixed i, j, and k, computation of {A | A → BC is in P, B ∈ t_{i,k}, C ∈ t_{k,j}} takes an amount of time bounded by a constant with respect to the length of the string being analyzed.

Theorem 12.4.2  Algorithm 12.4.1 requires O(n³) steps to compute the recognition matrix.

Proof  Loop 1 takes c_1·n steps to initialize the superdiagonal. The setting of t_{i,j} in the body of loop 2 takes c_2(d − 1) steps, since k can take on d − 1 values. Therefore the algorithm requires

  c_1·n + c_2 Σ_{d=2}^{n} (d − 1)(n − d + 1) = c_1·n + c_2·(n³ − n)/6 = O(n³) steps. □

The space required by the algorithm is determined by the number of elements in the strictly upper-triangular matrix. Since each element contains a set of bounded size, we get the next theorem immediately.

Theorem 12.4.3  Algorithm 12.4.1 requires n(n + 1)/2 cells of memory on a RAM.

Now that the recognition problem has been solved, a technique is needed for parsing. Algorithm 12.4.3, which follows, provides a means of obtaining a parse for a given string from its recognition matrix. The algorithm will also be used for the recognition matrices defined using other methods of recognition.

Algorithm 12.4.3  Let G = (V, Σ, P, S) be a grammar in Chomsky normal form (without the rule S → Λ), with the productions numbered 1, 2, . . . , p and the ith production designated π_i. Let w = a_1 a_2 ⋯ a_n, n ≥ 1, be a string, where for 1 ≤ k ≤ n, a_k ∈ Σ, and let T = (t_{i,j}) be the recognition matrix for w constructed by Algorithm 12.4.1. In order to produce a left parse for w, define the recursive procedure parse(i, j, A), which generates a left parse for A ⇒⁺ a_{i+1} a_{i+2} ⋯ a_j, by:

  procedure parse(i, j, A);
  begin
    if j − i = 1 and π_m = A → a_{i+1} is in P then output(m)
    else if k is the least integer, i < k < j, such that π_m = A → BC is in P, where B ∈ t_{i,k} and C ∈ t_{k,j}, then
      begin
        output(m);
        parse(i, k, B);
        parse(k, j, C)
      end
  end;
Output a left parse for w as follows:

  main program:  if S ∈ t_{0,n} then parse(0, n, S) else output("error").

Note that Algorithm 12.4.3 produces the leftmost parse "top down" with respect to the syntax trees. By contrast, Algorithm 12.4.1 builds the recognition matrix "bottom up." The reader can easily modify the algorithm to produce a rightmost parse.

We next verify that Algorithm 12.4.3 works, and derive its running time. The theorem is followed by an example.

Theorem 12.4.4  Let G = (V, Σ, P, S) be a grammar in Chomsky normal form (without the rule S → Λ), with productions designated π_1, π_2, . . . , π_p. If Algorithm 12.4.3 is executed with the string w = a_1 a_2 ⋯ a_n, n ≥ 1, where, for 1 ≤ j ≤ n, a_j ∈ Σ, then the parsing algorithm terminates. If w ∈ L(G), then the parsing algorithm produces a left parse; otherwise it announces error. The parsing algorithm takes time O(n²).

Proof  We establish the theorem by a sequence of partial results.

Claim 1  Whenever parse(i, j, A) is called, then j > i and A ∈ t_{i,j}.

Proof of Claim 1  The procedure parse is called once from the main program and twice from within the body of parse. In all cases the call is conditional on the satisfaction of this property of the arguments.

Claim 2  parse(i, j, A) terminates and produces a left parse of the derivation A ⇒⁺ a_{i+1} a_{i+2} ⋯ a_j.

Proof of Claim 2  We induct on j − i.

Basis. j − i = 1. By Claim 1, A ∈ t_{i,i+1}. It follows from Theorem 12.4.1 that A ⇒ a_{i+1}, and therefore that P contains a rule π_m = A → a_{i+1}. Clearly, the procedure parse terminates and produces the left parse m.

Induction step. Assume the result for j − i ≤ d. Consider parse(i, j, A), where j − i = d + 1. Since j − i ≠ 1, the else portion of the conditional statement in parse is executed. It follows from Claim 1 and Theorem 12.4.1 that, for some B, C ∈ V − Σ,

  A ⇒ BC ⇒⁺ a_{i+1} a_{i+2} ⋯ a_j

and, consequently, that P contains some rule π_m = A → BC, where B ∈ t_{i,k} and C ∈ t_{k,j}. Therefore, parse produces (m, parse(i, k, B), parse(k, j, C)). By the induction hypothesis, parse(i, k, B) and parse(k, j, C) terminate and produce the appropriate left parses. Consequently, the else portion of parse terminates and produces a left parse of A ⇒⁺ a_{i+1} a_{i+2} ⋯ a_j.

Claim 3  For each d, 1 ≤ d ≤ n, a call of parse(i, j, A) takes at most c·d² steps, where j − i = d and c is some constant.
Proof of Claim 3  We induct on d.

Basis. d = 1. parse(i, i + 1, A) takes one step.

Induction step. Assume the result for j − i ≤ d. Suppose parse(i, j, A) is called with j − i = d + 1. Clearly, the else portion of the procedure is executed. Except for calls of parse, the else portion takes at most c_1(j − i − 1) ≤ c_1(j − i) steps, for some constant c_1, since k takes on at most j − i − 1 values. Therefore, if we choose c ≥ c_1, then, including the recursive calls on parse and using the induction hypothesis, parse requires at most

  c(j − i) + c(k − i)² + c(j − k)²

steps. Let k = i + x and j = i + x + y. Then the number of steps is

  c((j − i) + (k − i)² + (j − k)²) = c(x + y + x² + y²).

Using the elementary relationship (x + y)/2 ≤ xy, valid for x, y ≥ 1, we get

  x + y + x² + y² ≤ x² + y² + 2xy = (x + y)² = (j − i)².

This completes the induction proof of Claim 3.

Putting the partial results together, it follows from Theorem 12.4.1 that if w ∉ L(G), then S ∉ t_{0,n}. Clearly, the algorithm terminates and announces error in this case. Otherwise the algorithm calls parse(0, n, S). It follows from Claim 2 that the algorithm terminates and produces a left parse for S ⇒⁺ a_1 a_2 ⋯ a_n = w. It follows from Claim 3 that the algorithm takes O(n²) steps. □

Example  Continuing the previous example, we number the productions

  1. S → SS    4. A → AS
  2. S → AA    5. A → AA
  3. S → b     6. A → a

The parse of w = aabb is generated in the following way. Since S ∈ t_{0,4}, parse(0, 4, S) is called. Since π_2 = S → AA is in P, A ∈ t_{0,1}, and A ∈ t_{1,4}, we get:

  (2, parse(0, 1, A), parse(1, 4, A)) = (2, 6, parse(1, 4, A)).

Since π_4 = A → AS is in P, A ∈ t_{1,2}, and S ∈ t_{2,4}, we get:

  (2, 6, 4, parse(1, 2, A), parse(2, 4, S)) = (2, 6, 4, 6, parse(2, 4, S)).

Since π_1 = S → SS is in P, S ∈ t_{2,3}, and S ∈ t_{3,4}, we get:

  (2, 6, 4, 6, 1, parse(2, 3, S), parse(3, 4, S)) = (2, 6, 4, 6, 1, 3, parse(3, 4, S)) = (2, 6, 4, 6, 1, 3, 3),

which corresponds to the tree

        S
       / \
      A   A
      |  / \
      a A   S
        |  / \
        a S   S
          |   |
          b   b
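Procedure parse can be sketched the same way (ours, not the book's; rules are numbered from 1, as in the example, and a right-hand side is encoded as a 1-tuple for A → a or a 2-tuple for A → BC):

  def parse(i, j, A, t, w, rules, out):
      # assumes A in t[i][j] (Claim 1); appends a left parse to out
      if j - i == 1:
          for m, (lhs, rhs) in enumerate(rules, start=1):
              if lhs == A and rhs == (w[i],):
                  out.append(m)
                  return
      else:
          for k in range(i + 1, j):              # least k first
              for m, (lhs, rhs) in enumerate(rules, start=1):
                  if (lhs == A and len(rhs) == 2
                          and rhs[0] in t[i][k] and rhs[1] in t[k][j]):
                      out.append(m)
                      parse(i, k, rhs[0], t, w, rules, out)
                      parse(k, j, rhs[1], t, w, rules, out)
                      return

  rules = [('S', ('S', 'S')), ('S', ('A', 'A')), ('S', ('b',)),
           ('A', ('A', 'S')), ('A', ('A', 'A')), ('A', ('a',))]
  out = []
  parse(0, 4, 'S', t, "aabb", rules, out)   # t from the earlier sketch
  # out == [2, 6, 4, 6, 1, 3, 3], matching the example.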
By combining our recognition and parsing results, we get a time bound for general context-free parsing.

Theorem 12.4.5  Every context-free language L can be parsed in time proportional to n³, where n is the length of the string to be parsed, and in space proportional to n².

Proof  By Theorem 4.4.1, every context-free language L ⊆ Σ* is generated by a grammar G = (V, Σ, P, S) in Chomsky normal form. For any string w = a_1 a_2 ⋯ a_n, if n = 0, then w = Λ is in L if and only if S → Λ is in G. If n > 0, then, by Theorem 12.4.1, we can determine whether w ∈ L by a computation which, by Theorem 12.4.2, requires at most O(n³) steps. If w ∈ L, then, by Theorem 12.4.4, we can generate a parse for w in another O(n²) steps. The space bound follows from Theorem 12.4.3. □

Although we have shown how Algorithm 12.4.1 recognizes strings in time O(n³) on a RAM, it does not follow that this can be done on a Turing machine within the same time bound. It might be the case that the Turing machine cannot organize its tapes sufficiently well to avoid many bookkeeping steps that would change the time bound. We now show that, in fact, recognition can be done on a Turing machine in time O(n³). Our model will consist of a Turing machine with a read-only input tape and two work tapes.

Theorem 12.4.6  Let G = (V, Σ, P, S) be a Λ-free context-free grammar in Chomsky normal form and let w = a_1 ⋯ a_n, a_i ∈ Σ, 1 ≤ i ≤ n. There is a Turing machine A which computes the recognition matrix T = (t_{i,j}) in time at most c·n³, for some constant c.

Proof  The initial configuration of A is shown below,

  Input:   ¢ a_1 ⋯ a_n $
  Tape 1:  ¢ B ⋯ B $
  Tape 2:  ¢ B ⋯ B $

where ¢ and $ are endmarkers on the tapes and B is a "blank" symbol.
The idea of the argument will be to have the upper-triangular entries t_{i,j} laid out on both tapes 1 and 2. On tape 1 the entries of T will be written by rows, while on tape 2 the entries will be written by columns. The computation proceeds by filling in the matrix by diagonals, filling entries in the same order as does Algorithm 12.4.1.

The first phase of the computation is to initialize A to the configuration

  Tape 1 (by rows):    ¢ t_{0,1} t_{0,2} ⋯ t_{0,n} | t_{1,2} ⋯ t_{1,n} | ⋯ | t_{n−2,n−1} t_{n−2,n} | t_{n−1,n} $
  Tape 2 (by columns): ¢ t_{0,1} | t_{0,2} t_{1,2} | ⋯ | t_{0,n} t_{1,n} ⋯ t_{n−1,n} $

in which only the superdiagonal entries t_{i,i+1} are filled in; all other squares are blank. (The vertical bars merely indicate the row and column boundaries, and do not appear on the tapes.)

Since tape 1 and tape 2 each have O(n²) entries, the initialization takes at least O(n²) steps. It is left to the reader to show that the initialization can be done in time O(n²).

We next show how to compute any element t_{i,j}, where i + 1 < j ≤ n. Just prior to the computation, the head of tape 1 is at square (i, i + 1), the beginning of row i, and the head of tape 2 is at square (i + 1, j), the beginning of the filled-in portion of column j.

The procedure to compute t_{i,j} is as follows, once we are in the position above. Move right on tapes 1 and 2, computing t_{i,j} in the finite-state control, until a blank is encountered on tape 1. This is square (i, j) on tape 1, so write the entry there. Move left on tape 2 to the first blank, which is (i, j), and make the entry there. This computation takes O(j − i) steps.

Next we must get into position for computing t_{i+1,j+1} (if we were not previously at the "end" of the diagonal). Move right on tape 1 to the first nonblank; since the rows are filled left to right, this is t_{i+1,i+2}. Move right on tape 2. Since the columns are filled bottom to top, the first blank we encounter is in square (0, j + 1), and the subsequent first nonblank is in position (i + 2, j + 1).
If we reach $ on a work tape, then we have concluded a diagonal; the previously computed entry was t_{i,n} for some i. The next entry to be computed will be (0, n + 1 − i). To position the work tapes:

1  Move to ¢ on both tapes 1 and 2. If no blanks were encountered along the way, then look at the last entry t_{i,j} recorded in the finite-state control. This was t_{0,n}. If S ∈ t_{0,n}, then halt and accept; else halt and reject.

2  If there were blanks along the way, then move the head on tape 1 to the first square to the right of ¢. This is t_{0,1}. On tape 2, scan right and find the first blank; then go one square to the right of the blank. The blank is in position (0, n + 1 − i), and the square to its right is t_{1,n+1−i}. The tapes are then positioned correctly.

To estimate the number of steps, note that the computation of any element t_{i,j}, where i + 1 < j ≤ n, takes at most c·n steps. (One can compute c by studying the tape motion.) Since there are O(n²) elements, at most c_1·n³ steps are required to compute all the elements. In addition, we must consider the tape repositioning. To compute the next element in a diagonal requires O(n) steps for repositioning. For each of the n diagonals, we must reposition the heads over the entire work tapes. Consequently, repositioning requires O(n³) steps. Thus, including the initialization, the entire computation takes c_1·n³ + c_2·n³ + c_3·n² = O(n³) steps. □

Example  Consider the grammar G used in the previous examples and shown below. Let w = aabb.

  S → SS | AA | b
  A → AS | AA | a

After initialization, the configuration of the Turing machine is

  Input:   ¢ a a b b $
  Tape 1:  ¢ A _ _ _ A _ _ S _ S $
  Tape 2:  ¢ A _ A _ _ S _ _ _ S $

(on tape 1 the squares hold, in order, t_{0,1} t_{0,2} t_{0,3} t_{0,4} t_{1,2} t_{1,3} t_{1,4} t_{2,3} t_{2,4} t_{3,4}; on tape 2 they hold t_{0,1} t_{0,2} t_{1,2} t_{0,3} t_{1,3} t_{2,3} t_{0,4} t_{1,4} t_{2,4} t_{3,4}; "_" denotes a blank).

We scan right on both tapes to compute t_{0,2}. This is entered in the (0, 2) square of tape 1, which is the current position of the tape head. On tape 2, we scan left to the first blank to fill in the entry. Then we go right on tape 1 to the first nonblank. On tape 2, we go right through one or more nonblanks and then through blanks, and stop at t_{2,3}. This computation is repeated. The following is the configuration just before t_{2,4} is computed.
  Input:   ¢ a a b b $
  Tape 1:  ¢ A S,A _ _ A A _ S _ S $
  Tape 2:  ¢ A S,A A _ A S _ _ _ S $

We compute t_{2,4} as before and make the entry. When we go right, we reach the endmarker and must move left. This yields:

  Input:   ¢ a a b b $
  Tape 1:  ¢ A S,A _ _ A A _ S S S $
  Tape 2:  ¢ A S,A A _ A S _ _ S S $

and so it goes.

As is true in the random-access case, the recognition matrix can be computed on-line by a Turing machine, provided the bookkeeping is done slightly differently. We leave this construction to the reader.

As an interesting special case, let us assume that the underlying grammar G is linear. Recall (from Problem 1 of Section 2.5) that there is no loss of generality in assuming that G = (V, Σ, P, S) is in linear normal form. That is, each rule is of one of the following types:

  S → Λ    if Λ ∈ L(G),
  A → aB   a ∈ Σ, A ∈ N, B ∈ N − {S},
  A → Ba   a ∈ Σ, A ∈ N, B ∈ N − {S},
  A → a    a ∈ Σ, A ∈ N.

Now we are ready to prove that every linear context-free language can be parsed in time O(n²) by a form of the Cocke-Kasami-Younger algorithm. Algorithm 12.4.4 provides a variant of the recognition-matrix construction of Algorithm 12.4.1 which handles grammars in linear normal form. The difference between the two algorithms is only in the way in which the t_{i,j}'s are computed in the second loop.
Algorithm 12.4.4  Let G = (V, Σ, P, S) be a grammar in linear normal form (without the rule S → Λ) and let w = a_1 a_2 ⋯ a_n, n ≥ 1, be a string, where, for 1 ≤ j ≤ n, a_j ∈ Σ. Form the strictly upper-triangular (n + 1) × (n + 1) recognition matrix T as follows, where each element t_{i,j} is a subset of V − Σ:

  begin
  loop 1:  for i := 0 to n − 1 do
             t_{i,i+1} := {A | A → a_{i+1} is in P};
  loop 2:  for d := 2 to n do
             for i := 0 to n − d do
               begin
                 j := d + i;
                 t_{i,j} := {A | A → a_{i+1}B is in P for some B ∈ t_{i+1,j}} ∪ {A | A → Ba_j is in P for some B ∈ t_{i,j−1}}
               end
  end.

Theorem 12.4.7  Let G = (V, Σ, P, S) be a grammar in linear normal form (without the rule S → Λ). Let w = a_1 a_2 ⋯ a_n, n ≥ 1, be a string, where for 1 ≤ k ≤ n, a_k ∈ Σ, and let T = (t_{i,j}) be the recognition matrix constructed by Algorithm 12.4.4. Then A ∈ t_{i,j} if and only if A ⇒⁺ a_{i+1} ⋯ a_j.

Proof  The argument parallels the proof of Theorem 12.4.1 except that, in analyzing a derivation, the first step is either A → a_{i+1}B or A → Ba_j. The details are omitted. □

In order to obtain our time bounds, we simply analyze Algorithm 12.4.4.

Theorem 12.4.8  Every linear context-free language L can be parsed in time proportional to n², where n is the length of the string to be parsed. Additionally, every such language can be recognized using only a linear amount of space.

Proof  By Problem 1 of Section 2.5, there is no loss of generality in assuming that L is generated by a grammar in linear normal form. Using Algorithm 12.4.4, an input string of length n is in the language if S ∈ t_{0,n} (or if n = 0 and S → Λ is in the grammar). Following the timing analysis of Theorem 12.4.2, loop 1 requires c_1·n steps. Since the setting of t_{i,j} in loop 2 requires inspection of only two matrix entries, it takes at most a constant number c_2 of steps. Therefore, the algorithm requires

  c_1·n + c_2 Σ_{d=2}^{n} (n − d + 1) = c_1·n + c_2·n(n − 1)/2 = O(n²)

steps to construct the recognition matrix. Using Algorithm 12.4.3, it follows from Theorem 12.4.4 that a parse can be produced in O(n²) steps.

Since, in loop 2, each t_{i,j} is computed using only entries from the previous diagonal, all that must be remembered for recognition are the previous diagonal (or the input) and the current diagonal. Since the diagonals have at most n elements, this is a linear space requirement. (It does not suffice, however, if a parse is to be generated.) □
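In the same Python style as before (ours, not the book's; the rule encodings are assumptions of the sketch), the linear-normal-form variant; note that each entry examines only two previously computed entries:

  def linear_cky(w, unit_rules, left_rules, right_rules):
      # unit_rules: (A, a) for A -> a; left_rules: (A, a, B) for A -> aB;
      # right_rules: (A, B, a) for A -> Ba
      n = len(w)
      t = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
      for i in range(n):
          t[i][i + 1] = {A for (A, a) in unit_rules if a == w[i]}
      for d in range(2, n + 1):
          for i in range(n - d + 1):
              j = d + i
              t[i][j] = ({A for (A, a, B) in left_rules
                          if a == w[i] and B in t[i + 1][j]}       # w[i] is a_{i+1}
                         | {A for (A, B, a) in right_rules
                            if a == w[j - 1] and B in t[i][j - 1]})  # w[j-1] is a_j
      return t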
PROBLEMS

1  Complete the proof of Theorem 12.4.7.

2  Modify Algorithm 12.4.2 to obtain an on-line algorithm for recognition of strings in a linear context-free grammar which takes quadratic time and linear space.

3  Design an algorithm which meets the requirements of Problem 2 but runs on a Turing machine.

4  Construct an algorithm which obtains a parse from the recognition matrix and which requires linear time for a grammar in linear normal form. What is the space bound, and which cost criterion is being used?

5  Devise a new algorithm to determine a parse of a string of length n from the recognition matrix which takes time proportional to n log n.

6  Show how to achieve the configuration at the end of Phase 1 of the Turing-machine computation of Theorem 12.4.6, using time proportional to n².

7  Suppose that G is an unambiguous context-free grammar in Chomsky normal form. Can you prove a better time bound for Algorithm 12.4.1 in this case?

8  Show how to implement the Cocke-Kasami-Younger algorithm on a single-tape Turing machine which runs in time O(n⁴). [Hint: Code the recognition matrix onto the tape with the columns stored from left to right and from the bottom to the top. Use symbols to mark the top of each column. In working up column j, suppose t_{i+1,j} has been computed and we are about to compute t_{i,j}. In computing t_{i,j}, the ith row of T is needed, so the earlier part of the computation stores t_{k,j} in the same tape square as t_{i,k}. Thus, t_{i,j} may be computed in a single left-to-right pass and stored in the next empty cell.]

12.5* VALIANT'S ALGORITHM

The Cocke-Kasami-Younger algorithm allowed us to recognize and parse in time proportional to n³. We now give a procedure, due to Valiant, which results in a time bound proportional to n^{2.81}. To achieve this, we resort to techniques that are asymptotically superior but carry such huge constants that the method is only of theoretical interest. It is important in that it relates matrix multiplication, boolean matrix multiplication, transitive closure, and parsing algorithms.

We first give an overview of the ideas of the algorithm. Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form, and suppose w is a string of length n to be parsed. Our algorithm produces the same recognition matrix as Algorithm 12.4.1, but the computation proceeds in quite a different way. We form the same starting matrix as in the Cocke-Kasami-Younger algorithm. We note that one can naturally define a (nonassociative) product operation on the nonterminals of the grammar. In a properly generalized sense, the recognition matrix can be found by taking the "transitive closure" of the starting matrix. It turns out that one can do this efficiently if one can take the "product" of two upper-triangular matrices efficiently. One can reduce computing this product to computing ordinary boolean matrix products. This, in turn, reduces to efficient methods for computing ordinary matrix products, which we have already studied.
Given any binary operation on a set S, we can extend that operation to matrices whose elements are subsets of S in a natural way. Suppose A = (a_{ij}), B = (b_{ij}), and C = (c_{ij}) are n × n matrices (recall our convention of 0-origin addressing for matrices) whose elements are subsets of S. Then we define C = A * B by

  c_{ij} = ∪_{k=0}^{n−1} a_{ik} * b_{kj}   for 0 ≤ i, j ≤ n − 1.

Also, by definition, A ⊆ B if, for all i and j, a_{ij} ⊆ b_{ij}. Of course, we define C = A ∪ B by c_{ij} = a_{ij} ∪ b_{ij} for 0 ≤ i, j ≤ n − 1.

Using any product operation for sets and ordinary set union, we define a transitive-closure operation on matrices whose elements are sets.

Definition Let D be an n × n matrix whose elements are subsets of a set S and let * be any binary operation on S. Let D^(1) = D and, for each i > 1, let

  D^(i) = ∪_{j=1}^{i−1} D^(j) * D^(i−j).

The transitive closure of D is defined to be D⁺ = D^(1) ∪ D^(2) ∪ ···. We say that a matrix D is transitively closed if D⁺ = D.

Since the underlying binary operation may be nonassociative, as is the operation to be used for constructing the recognition matrix, we have defined D^(i) so as to include all possible associations. Although we write D⁺ = D^(1) ∪ D^(2) ∪ D^(3) ∪ ··· as an infinite union, if S is finite, then the number of unions is always finite, as the number of possible entries is finite and so there are only finitely many such matrices. We leave it to the reader to find an upper bound for t in this case so that D⁺ = D^(1) ∪ ··· ∪ D^(t).

Example Let S = {0, 1} and define a binary product by the following table:

  *  0  1
  0  0  0
  1  0  1

Let D be any matrix over S. The transitive closure of D is the "standard transitive closure" treated in the literature. In the usual fashion, D describes a binary relation ρ and D⁺ corresponds in the same way to the transitive closure ρ⁺ of ρ.
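The generalized product and closure can be made concrete with a short sketch (the Python set representation is assumed; the operation star is assumed to send any pair involving ∅ to ∅, as axiom (N1) below will require, so that for the strictly upper-triangular matrices we use, D^(d) vanishes once d reaches the matrix size).

```python
# Sketch: generalized product and D+ for set matrices; 'star' is an
# arbitrary, possibly nonassociative operation on subsets of S.

def mat_mult(A, B, star):
    n = len(A)
    C = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] |= star(A[i][k], B[k][j])
    return C

def mat_union(A, B):
    return [[A[i][j] | B[i][j] for j in range(len(A))] for i in range(len(A))]

def closure(D, star):
    """D+ = D(1) u D(2) u ...; D(i) collects every order of association.
       For a strictly upper-triangular matrix, finitely many terms suffice."""
    n = len(D)
    powers = [D]                                # powers[d-1] holds D(d)
    for d in range(2, n + 1):
        Dd = [[set() for _ in range(n)] for _ in range(n)]
        for j in range(1, d):                   # D(d) = U_j D(j) * D(d-j)
            Dd = mat_union(Dd, mat_mult(powers[j - 1], powers[d - j - 1], star))
        powers.append(Dd)
    result = D
    for Dd in powers[1:]:
        result = mat_union(result, Dd)
    return result
```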
The heart of the algorithm for constructing the recognition matrix is the following binary operation on subsets of the nonterminals of a grammar.

Definition Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form. Let N₁, N₂ ⊆ V − Σ. Define:

  N₁ · N₂ = {C ∈ V − Σ | C → AB is in P for some A ∈ N₁, B ∈ N₂}.

Example Let G contain the following rules:

  C → AB   D → CE   F → BE   E → AF

There may be other rules as well, but these need not concern us. Take N₁ = {A}, N₂ = {B}, N₃ = {E}. Note that:

  N₁(N₂N₃) = {E} ≠ {D} = (N₁N₂)N₃.

This establishes that this product of subsets is not associative.

Using the product operation just defined, we claim that the following algorithm computes the Cocke–Kasami–Younger recognition matrix. Note that we have not specified in detail how to compute D⁺. We leave the details for subsequent sections.

Algorithm 12.5.1 Let G = (V, Σ, P, S) be a grammar in Chomsky normal form (without the rule S → Λ) and let w = a₁a₂···a_n, n ≥ 1, be a string, where, for 1 ≤ k ≤ n, a_k ∈ Σ. Form the strictly upper-triangular (n + 1) × (n + 1) recognition matrix T as follows. Let D be a strictly upper-triangular (n + 1) × (n + 1) matrix, where each element d_{i,j} is a subset of V − Σ and initially the value of each element is ∅.

begin
  comment D = (d_{i,j}) is an (n + 1) × (n + 1) square matrix;
  for 0 ≤ i ≤ j ≤ n do d_{i,j} := ∅;
  for i := 0 to n − 1 do d_{i,i+1} := {A | A → a_{i+1} is in P};
  T := D⁺
end.

To show that the algorithm works, we establish the following result.

Theorem 12.5.1 Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form (without the rule S → Λ) and let w = a₁a₂···a_n, n ≥ 1, be a string, where, for 1 ≤ k ≤ n, a_k ∈ Σ. Let D⁺ be the matrix constructed by Algorithm 12.5.1 and let T be the Cocke–Kasami–Younger recognition matrix. Then D⁺ = T.

Proof We first prove that the only nonempty elements of D^(d) are along the dth diagonal.
Claim In D^(d) we have

  d^(d)_{i,j} = t_{i,j} if j − i = d,   and   d^(d)_{i,j} = ∅ otherwise.

Proof of claim The argument is an induction on d = j − i. The basis of the induction, namely, d = j − i = 1, is immediate.

Induction step. Consider forming the product D^(d+1). We have

  d^(d+1)_{i,j} = ∪_{s=1}^{d} ∪_{k=0}^{n} d^(s)_{i,k} * d^(d+1−s)_{k,j}.

It follows, from the property that, for any subset of variables B, B * ∅ = ∅ * B = ∅, that the only way we can get a nonempty entry is if there exist some s, 1 ≤ s ≤ d, and some k, 0 ≤ k ≤ n, so that d^(s)_{i,k} ≠ ∅ and d^(d+1−s)_{k,j} ≠ ∅. It follows, from the induction hypothesis, that, in such a case, k − i = s and j − k = d + 1 − s. Adding the equations produces j − i = d + 1, which shows that d^(d+1)_{i,j} = ∅ except if j − i = d + 1. If j − i = d + 1, it follows, from the definition of the product operation and the induction hypothesis, that

  d^(d+1)_{i,j} = {A | A → BC is in P, where, for some k, 0 ≤ k ≤ n, B ∈ t_{i,k}, C ∈ t_{k,j}} = t_{i,j}.

This completes the proof of the claim.

Next, note that, as a corollary of the claim, D^(d) = 0 if d > n. Therefore,

  D⁺ = D^(1) ∪ D^(2) ∪ ··· ∪ D^(n) = T. □

Corollary 1 B ∈ d⁺_{i,j} if and only if B ⇒* a_{i+1} ··· a_j.

Proof d⁺_{i,j} = t_{i,j} and we invoke Theorem 12.4.1. □

Corollary 2 w ∈ L(G) if and only if S ∈ d⁺_{0,n}.

Example Consider the grammar

  S → AB   A → AA | a   B → b
Let w = aaab. Thus we must form a 5 × 5 matrix D:

  D =
  ∅  {A}  ∅   ∅   ∅
  ∅  ∅   {A}  ∅   ∅
  ∅  ∅   ∅   {A}  ∅
  ∅  ∅   ∅   ∅   {B}
  ∅  ∅   ∅   ∅   ∅

One can compute that

  D⁺ =
  ∅  {A}  {A}  {A}  {S}
  ∅  ∅   {A}  {A}  {S}
  ∅  ∅   ∅   {A}  {S}
  ∅  ∅   ∅   ∅   {B}
  ∅  ∅   ∅   ∅   ∅

Since we have established that Algorithm 12.5.1 yields the Cocke–Kasami–Younger recognition matrix, we can, of course, use the parsing Algorithm 12.4.3 on this matrix. Furthermore, using the proof of Theorem 12.5.1 as a guide, the reader can verify that a straightforward computation of D⁺ which takes advantage of the fact that only the product of nonempty sets yields a nonempty set causes Algorithm 12.5.1 to be "stepwise equivalent" to Algorithm 12.4.1. Consequently, it is easily shown that Algorithm 12.5.1 can be carried out in O(n³) steps. It remains to show that, when viewed as a transitive-closure problem, the time bound for this computation can be further improved.

We next give an algorithm which reduces matrix multiplication of the very general kind we have been considering to boolean matrix multiplication. Let S be any set and let * be any binary operation on subsets of S. We demand only that * satisfy the following distributive laws. For any X, Y, Z ⊆ S,

  X * (Y ∪ Z) = X * Y ∪ X * Z,   (D1)
  (X ∪ Y) * Z = X * Z ∪ Y * Z.   (D2)

Suppose S = {A₁, A₂, ..., A_s}. Let B = (b_{ij}) be a matrix whose elements are subsets of S. Define a set of boolean matrices B₁, B₂, ..., B_s, where, for 1 ≤ k ≤ s, the (i, j)th element of B_k is 1 if A_k ∈ b_{ij} and 0 otherwise.

Example Let S = {A₁, A₂} and n = 3, with

  B =
  ∅  {A₁}  {A₁, A₂}
  ∅  ∅    {A₂}
  ∅  ∅    ∅
Then

  B₁ =        B₂ =
  0 1 1       0 0 1
  0 0 0       0 0 1
  0 0 0       0 0 0

Next we give an algorithm for *-multiplication of two such matrices.

Algorithm 12.5.2 Given two n × n matrices B and C whose entries are subsets of S = {A₁, ..., A_s}, our goal is to compute B * C.

1 Compute B₁, ..., B_s and C₁, ..., C_s, where these are defined as above.
2 Compute the s² boolean products B_p C_q for 1 ≤ p, q ≤ s.
3 Let D = (d_{ij}), where d_{ij} = ∪ {A_p} * {A_q}, the union taken over all p, q such that (B_p C_q)_{ij} = 1.

Example Let S = {A₁, A₂}, n = 3, and define * by the table:

  *    A₁   A₂
  A₁   A₂   A₂
  A₂   A₁   A₂

Let

  B =                    C =
  ∅  {A₁}  {A₁, A₂}      ∅  {A₁, A₂}  ∅
  ∅  ∅    {A₂}          ∅  ∅        {A₁}
  ∅  ∅    ∅             ∅  ∅        ∅

Then, by using the definition of *-products,

  D = B * C =
  ∅  ∅  {A₂}
  ∅  ∅  ∅
  ∅  ∅  ∅

By using Algorithm 12.5.2, we write:

  B₁ =        B₂ =        C₁ =        C₂ =
  0 1 1       0 0 1       0 1 0       0 1 0
  0 0 0       0 0 1       0 0 1       0 0 0
  0 0 0       0 0 0       0 0 0       0 0 0
Then

  B₁C₁ =
  0 0 1
  0 0 0
  0 0 0

while B₁C₂, B₂C₁, and B₂C₂ are all the zero matrix. Since (B₁C₁)_{0,2} = 1, we have d_{0,2} = A₁ * A₁ = A₂, and both computations produce the same answer.

We leave it to the reader to verify that, for operations * satisfying distributive laws (D1) and (D2), Algorithm 12.5.2 computes D = B * C. We can conclude that, if M(n) is the time necessary to multiply two n × n matrices with elements that are subsets of some set S, and BM(n) is the time to compute the boolean product of two n × n boolean matrices, then a relationship between the two quantities is given by the following result.

Theorem 12.5.2 M(n) ≤ c·BM(n) for some constant c.

Having posed the recognition problem as a transitive-closure problem, and having reduced the multiplication aspect of that problem to fast boolean-matrix multiplication, it remains to reduce the transitive-closure problem to one with a time bound proportional to that for boolean-matrix multiplication. This will be done shortly.

Our next goal is to prove a lemma, due to Valiant, on which the construction rests. In order to prove the lemma, it is necessary to develop a notation for nonassociative products in terms of binary trees. In the discussion that follows, we will consider matrices whose elements are subsets of some set S. The multiplication operation * defined on these subsets is assumed to have only the two distributive properties (D1) and (D2) used to reduce matrix multiplication to boolean-matrix multiplication, and the additional axiom

  X * ∅ = ∅ * X = ∅ for every X ⊆ S.   (N1)

We will restrict our attention to strictly upper-triangular matrices. It will be convenient to use the following facts, which are immediate consequences of the axioms.

Lemma 12.5.1 If A and B are strictly upper-triangular matrices, then A * B is also strictly upper-triangular and

  ∪_{k=0}^{n−1} a_{ik} * b_{kj} = ∪_{k=i+1}^{j−1} a_{ik} * b_{kj}.

Proof Follows directly from Axiom N1. □
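A sketch of Algorithm 12.5.2 in Python follows (the representation of entries as Python sets and of the singleton products {A_p} * {A_q} as a table are assumptions of this sketch). The boolean products of Step 2 are computed naively here; any fast boolean-multiplication routine could be substituted.

```python
# Sketch of Algorithm 12.5.2: *-multiplication via boolean matrix products.

def decompose(B, symbols):
    """Step 1: one boolean matrix per symbol; B_k[i][j] = 1 iff A_k in b_ij."""
    n = len(B)
    return {A: [[int(A in B[i][j]) for j in range(n)] for i in range(n)]
            for A in symbols}

def star_mult(B, C, symbols, star):
    n = len(B)
    Bs, Cs = decompose(B, symbols), decompose(C, symbols)
    D = [[set() for _ in range(n)] for _ in range(n)]
    for p in symbols:                    # Step 2: the s^2 boolean products
        for q in symbols:
            for i in range(n):
                for j in range(n):
                    if any(Bs[p][i][k] and Cs[q][k][j] for k in range(n)):
                        D[i][j] |= star(p, q)        # Step 3
    return D

# The example above: A1*A1 = {A2}, A1*A2 = {A2}, A2*A1 = {A1}, A2*A2 = {A2}.
table = {('A1', 'A1'): {'A2'}, ('A1', 'A2'): {'A2'},
         ('A2', 'A1'): {'A1'}, ('A2', 'A2'): {'A2'}}
B = [[set(), {'A1'}, {'A1', 'A2'}], [set(), set(), {'A2'}],
     [set(), set(), set()]]
C = [[set(), {'A1', 'A2'}, set()], [set(), set(), {'A1'}],
     [set(), set(), set()]]
print(star_mult(B, C, ['A1', 'A2'], lambda p, q: table[(p, q)]))
# d_02 = {'A2'}, all other entries empty, as computed above.
```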
Lemma 12.5.2 (The distributive property) Let S be any set and let * be any binary operation on subsets of S satisfying axioms D1 and D2. Let s₁, s₂, ..., s_t, t ≥ 1, be subsets of S, and let π(s₁, s₂, ..., s_t) denote the ordered composition of s₁, s₂, ..., s_t under * and some order of association. If I is any family of subsets of S, then, for any 1 ≤ k ≤ t,

  π(s₁, s₂, ..., s_{k−1}, ∪_{i∈I} i, s_{k+1}, ..., s_t) = ∪_{i∈I} π(s₁, s₂, ..., s_{k−1}, i, s_{k+1}, ..., s_t).

Proof Follows directly from distributive axioms D1 and D2. □

In order to describe the nonassociative products that we will deal with in Valiant's lemma, we first introduce a set of such products called terms. We then present a notation for terms and a sequence of propositions establishing the properties of the notation. The verification of these propositions is straightforward and is left for the reader.

Definition Let B = (b_{i,j}) be an n × n matrix whose elements are subsets of some set S. Let b_{i₁,i₂}, b_{i₂,i₃}, ..., b_{i_t,i_{t+1}} be any sequence of matrix elements. We say the sequence is acceptable in the case that, for 1 ≤ j, k ≤ t + 1, if j < k, then i_j < i_k. The composition under * of the elements of an acceptable sequence under some order of association is called a term of B or, more specifically, an (i₁, i_{t+1}) term with t components. The notation

  τ*(b_{i₁,i₂}, ..., b_{i_t,i_{t+1}})

denotes any one of the terms of B with components b_{i₁,i₂}, ..., b_{i_t,i_{t+1}} under the operation *. For example, b₁₂ * (b₂₃ * b₃₄) and (b₂₃ * b₃₅) * b₅₇ are terms, but b₁₂ * b₂₃ * b₃₄ is not a term because no order of association is given, and b₁₂ * b₂₂ is not a term because the indices do not increase.
Definition Two (i, j) terms are formally distinct if either they are composed of distinct sequences of elements or they have different orders of association. Let 𝒯*_(i,j)^(d)(B) be the set of all formally distinct (i, j) terms having exactly d components from the matrix B. Let

  𝒯*(B) = ∪_{d≥1} ∪_{i,j=0}^{n−1} 𝒯*_(i,j)^(d)(B)

be the set of all formally distinct terms with components from the matrix B. (We omit the explicit mention of B when the matrix is understood.)

Proposition 12.5.1 Let B = (b_{i,j}) be an n × n matrix whose elements are subsets of S. Then, for all 0 ≤ i, j ≤ n − 1,

  b^(d)_{i,j} = ∪_{τ* ∈ 𝒯*_(i,j)^(d)(B)} τ*   and   b⁺_{i,j} = ∪_{d≥1} ∪_{τ* ∈ 𝒯*_(i,j)^(d)(B)} τ*.

The proof is straightforward and is left for the reader. □

A natural way to describe terms of the type just introduced is to use binary trees. For example, the (2, 7) term (b_{2,3} * b_{3,5}) * b_{5,7} can be represented by a binary tree whose leaves, read left to right, are b_{2,3}, b_{3,5}, b_{5,7}.

In order to precisely define the trees that describe terms, we introduce a functional notation. However, we shall continue to use the trees themselves as a descriptive device to explain the formalism. We shall introduce functions τ(x₁, x₂, ..., x_t) which denote binary trees having t leaves labeled x₁ to x_t from left to right. If B = (b_{i,j}) is an n × n matrix whose elements are subsets of some set S, then the subclass of trees τ(x₁, x₂, ..., x_t) we get by restricting x₁, x₂, ..., x_t to be an acceptable sequence of elements from B will be in one-to-one correspondence with the terms of B. For example, the terms

  τ*₁(x₁, x₂, x₃, x₄) = (x₁ * x₂) * (x₃ * x₄)   and   τ*₂(x₁, x₂, x₃, x₄) = x₁ * (x₂ * (x₃ * x₄))

will correspond to the functions τ₁(x₁, x₂, x₃, x₄) and τ₂(x₁, x₂, x₃, x₄) designating the binary trees T₁ and T₂ with those association structures.

The formal definition of these functions is now given.

Definition Let X = {x₁, x₂, ...} be a set. The class 𝒯 of binary trees τ(x_{i₁}, x_{i₂}, ..., x_{i_t}), t ≥ 1, over X is defined inductively as follows:
i) For any x ∈ X, τ(x) = x is in 𝒯.
ii) For any x_{i₁}, x_{i₂} in X, τ(x_{i₁}, x_{i₂}) is in 𝒯.
iii) If τ₁(x_{i₁}, ..., x_{i_p}), τ₂(x_{i_{p+1}}, ..., x_{i_t}), and τ₃(y₁, y₂) are in 𝒯 for some 1 ≤ p < t, then

  τ(x_{i₁}, ..., x_{i_t}) = τ₃(τ₁(x_{i₁}, ..., x_{i_p}), τ₂(x_{i_{p+1}}, ..., x_{i_t}))

is in 𝒯.
iv) All elements of 𝒯 are constructed from (i), (ii), and (iii).

By a suitable choice of elements, we get the class of trees corresponding to nonassociative products.

Definition Let B = (b_{i,j}) be an n × n matrix, where each b_{i,j} is a subset of some set S. The acceptable class 𝒯 of binary trees over B is defined inductively by:

i′) For any element b_{i,j} of B, τ(b_{i,j}) = b_{i,j} is in 𝒯.
ii′) For any acceptable sequence y₁, y₂, τ(y₁, y₂) is in 𝒯.
iii′) Let x₁, x₂, ..., x_p, x_{p+1}, ..., x_t be an acceptable sequence of elements of B. If τ₁(x₁, ..., x_p), τ₂(x_{p+1}, ..., x_t), and τ₃(y₁, y₂) are in 𝒯 for some 1 ≤ p < t, then

  τ(x₁, ..., x_t) = τ₃(τ₁(x₁, ..., x_p), τ₂(x_{p+1}, ..., x_t))

is in 𝒯.
iv′) All elements of 𝒯 are constructed from (i′), (ii′), and (iii′).

Clearly, an acceptable class of binary trees over a matrix is a special case of a class of binary trees over a set. These definitions are illustrated in Fig. 12.5. Figure 12.5(a) represents (i) or (ii). Figure 12.5(b) illustrates (iii) or (iii′).

[FIG. 12.5: (a) a single leaf or a two-leaf tree; (b) a tree T formed by joining subtrees T₁ and T₂ under the root of T₃.]
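A short sketch may help fix the tree notation (the Python representation is assumed): a term is either a leaf holding one component b_{i,j} or a node combining two subterms, and evaluating it applies the possibly nonassociative operation in exactly the order of association the tree records.

```python
# Sketch: terms as binary trees, evaluated in recorded association order.

from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    i: int
    j: int                  # the component b_{i,j}; acceptability needs i < j

@dataclass
class Node:
    left: 'Term'
    right: 'Term'

Term = Union[Leaf, Node]

def evaluate(t, B, star):
    if isinstance(t, Leaf):
        return B[t.i][t.j]
    return star(evaluate(t.left, B, star), evaluate(t.right, B, star))

# The (2,7) term (b_23 * b_35) * b_57 used as an example above.  A string
# "product" that records parentheses makes the association visible:
B = [[set() for _ in range(8)] for _ in range(8)]
B[2][3], B[3][5], B[5][7] = {'b23'}, {'b35'}, {'b57'}
paren = lambda X, Y: {'(' + x + y + ')' for x in X for y in Y}
t = Node(Node(Leaf(2, 3), Leaf(3, 5)), Leaf(5, 7))
print(evaluate(t, B, paren))   # {'((b23b35)b57)'}
```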
The τ functions satisfy the following properties.

Proposition 12.5.2 Let 𝒯 be an (acceptable) class of binary trees. For each τ(x₁, ..., x_t), t > 1, in 𝒯, there exist k, 1 ≤ k < t, and three functions, τ₁, τ₂, and τ₃, all in 𝒯, such that

  τ(x₁, ..., x_t) = τ₃(τ₁(x₁, ..., x_k), τ₂(x_{k+1}, ..., x_t)).

This proposition is a restatement of the definition.

Our next principle, which is illustrated by Fig. 12.6, is familiar to us from language theory. We may describe it as a composition principle.

[FIG. 12.6: a tree τ₂ substituted for the kth leaf of τ₁.]

Proposition 12.5.3 Let 𝒯 be an (acceptable) class of binary trees. Given τ₁(x₁, ..., x_s) and τ₂(y₁, ..., y_t) in 𝒯 and some k, where 1 ≤ k ≤ s, there exists a unique τ in 𝒯 such that

  τ(x₁, ..., x_{k−1}, y₁, ..., y_t, x_{k+1}, ..., x_s) = τ₁(x₁, ..., x_{k−1}, τ₂(y₁, ..., y_t), x_{k+1}, ..., x_s).

We next introduce the notion of a factorization for binary trees.

Definition Let 𝒯 be an (acceptable) class of binary trees. For any τ(x₁, ..., x_t) in 𝒯 and any p, q such that 1 ≤ p ≤ q ≤ t, if, for some τ₁, τ₂ in 𝒯,

  τ(x₁, ..., x_t) = τ₁(x₁, ..., x_{p−1}, τ₂(x_p, ..., x_q), x_{q+1}, ..., x_t),

then the quadruple (τ₁, τ₂, p, q) is a factorization of τ(x₁, ..., x_t).

Our next proposition is a unique-factorization result for trees.

Proposition 12.5.4 Let 𝒯 be an (acceptable) class of binary trees.

1 Given τ(x₁, ..., x_t) in 𝒯 and 1 ≤ p′ ≤ q′ ≤ t, then there exist τ₁, τ₂, p, and q such that:
a) p ≤ p′,
b) q ≥ q′, and
c) (τ₁, τ₂, p, q) is a factorization of τ(x₁, ..., x_t).
2 If there also exist τ′₁, τ′₂, r, and s such that:
d) r ≤ p′,
e) s ≥ q′, and
f) (τ′₁, τ′₂, r, s) is a factorization of τ(x₁, ..., x_t),
then either:
g) r ≤ p and s ≥ q, or
h) p ≤ r and q ≥ s.

3 There exists a unique factorization satisfying 1(a), 1(b), and 1(c) and such that (q − p) is minimal.

Although we have written Proposition 12.5.4 out in great detail, the intuitive picture is shown in Fig. 12.7. By Case 2, we see that either τ₂ is nested inside of τ′₂, or vice versa.

[FIG. 12.7: (a) τ₂ nested inside τ′₂ (r ≤ p and s ≥ q); (b) τ′₂ nested inside τ₂ (p ≤ r and q ≥ s).]

This situation suggests the following definitions.

Definition A factorization of Case 3 in Proposition 12.5.4 is called the smallest (p′, q′) factorization of τ.

The smallest (p′, q′) factorization is exemplified in Fig. 12.8. In that figure, let us choose p′ = 5 and q′ = 9. The unique choices of p and q are p = 4, q = 9.
[FIG. 12.8: (a) a tree over x₁, ..., x₁₂; (b) the same tree with the subtree τ₂(x₄, ..., x₉) displayed as the smallest (5, 9) factorization.]

The trees τ₁ and τ₂ are shown in Fig. 12.8. Note that the root of τ₂ is the least common ancestor of x₅ and x₉.

Now, we define a smallest tree.

Definition τ(x₁, ..., x_t) is said to be (u, v)-minimal if the smallest (u, v) factorization of τ is (ι, τ, 1, t), where ι is the identity function of one variable, that is, ι(x) = x.

Returning to Fig. 12.8(b), we note that the tree τ₂(x₄, ..., x₉) is (5, 9)-minimal. In a (u, v)-minimal tree, the least common ancestor of x_u and x_v is the root of the entire tree.

Now we present two lemmas concerning these minimal trees.

Lemma 12.5.3 Let

  τ(x₁, ..., x_t) = τ₁(x₁, ..., x_{r−1}, τ₂(x_r, ..., x_s), x_{s+1}, ..., x_t)

for some t ≥ 1, 1 ≤ r ≤ s ≤ t. Then, for any y₁, ..., y_{s−r+1}, τ₂(y₁, ..., y_{s−r+1}) is (u − r + 1, v − r + 1)-minimal if and only if (τ₁, τ₂, r, s) is the smallest (u, v) factorization of τ.
Proof The result follows directly from the definitions and Proposition 12.5.4. □

The next result is also an obvious property of these trees.

Lemma 12.5.4 Let 𝒯 be an (acceptable) class of binary trees. Let τ₁, τ₂, τ′₂ be in 𝒯 and let t ≥ 1, 1 ≤ p ≤ q ≤ t. Then

  τ₁(x₁, ..., x_{p−1}, τ₂(x_p, ..., x_q), x_{q+1}, ..., x_t) = τ₁(x₁, ..., x_{p−1}, τ′₂(x_p, ..., x_q), x_{q+1}, ..., x_t)

if and only if τ₂ = τ′₂.

Now we extend the composition principle and the factorization notion to terms, by establishing a correspondence between trees and terms.

Proposition 12.5.5 Let B = (b_{i,j}) be an n × n matrix whose elements are subsets of some set S. Let 𝒯 be an acceptable class of binary trees over B. Let * be any binary operator on subsets of S satisfying axioms D1, D2, and N1. Then there is a one-to-one correspondence between the set of terms 𝒯* and the set of trees 𝒯. The correspondence is defined inductively by:

i) For any element b_{i,j} of B, τ(b_{i,j}) ↔ τ*(b_{i,j}).
ii) For any τ₁(b_{i₁,i₂}, ..., b_{i_p,i_{p+1}}) and τ₂(b_{i_{p+1},i_{p+2}}, ..., b_{i_t,i_{t+1}}),

  τ(τ₁(b_{i₁,i₂}, ..., b_{i_p,i_{p+1}}), τ₂(b_{i_{p+1},i_{p+2}}, ..., b_{i_t,i_{t+1}})) ↔ τ*₁(b_{i₁,i₂}, ..., b_{i_p,i_{p+1}}) * τ*₂(b_{i_{p+1},i_{p+2}}, ..., b_{i_t,i_{t+1}}).

Using this correspondence, we extend the notion of minimality to terms in the obvious way.

Definition Let B be an n × n matrix whose elements are subsets of S. Let τ*(b_{i₁,i₂}, ..., b_{i_t,i_{t+1}}) be a term of B. Then, for any p, q, 1 ≤ p ≤ q ≤ t, τ* is (i_p, i_q)-minimal if the corresponding tree τ is (i_p, i_q)-minimal.

We next introduce two concepts related to matrices which are useful in conjunction with Valiant's Lemma and the transitive-closure algorithm. The first concept is the notion of a central submatrix. Intuitively, as shown in Fig. 12.9, a central submatrix is a matrix contained within a larger matrix such that the principal diagonal of the smaller matrix lies along the principal diagonal of the larger matrix.

[FIG. 12.9: a square submatrix A whose diagonal lies on the diagonal of B.]
Definition Let B be an n × n matrix. Then A is a central submatrix of B if there exist m ≥ 0 and r, 0 ≤ r ≤ n − 1, such that A = (a_{i,j}) is an m × m matrix and, for 0 ≤ i, j ≤ m − 1,

  a_{i,j} = b_{i+r−1, j+r−1}.

The most important property of central submatrices for our purposes is given by the next theorem and its corollary.

Theorem 12.5.3 (The Central Submatrix Theorem) Let B be a strictly upper-triangular matrix. If A is a central submatrix of B, then, for any d ≥ 1, the corresponding central submatrix of B^(d) (or B⁺) is A^(d) (or A⁺).

Proof Let B be an n × n matrix and A be m × m with a_{s,t} = b_{s+r−1, t+r−1} for each 0 ≤ s, t ≤ m − 1 as the indices range over A. These are the i, j positions of B, where r ≤ i, j < r + m. Thus, let i′ = i − r + 1 and j′ = j − r + 1. We will show, by induction on d, that a^(d)_{i′,j′} = b^(d)_{i,j}.

Basis, d = 1. This is immediate.

Induction step. Suppose d > 1 and the result holds for d − 1. It follows, from the definitions and Lemma 12.5.1, that

  b^(d)_{i,j} = ∪_{s=1}^{d−1} ∪_{k=i+1}^{j−1} b^(s)_{i,k} * b^(d−s)_{k,j}.

If we let k′ = k − r + 1 and employ the induction hypothesis, then we get

  b^(d)_{i,j} = ∪_{s=1}^{d−1} ∪_{k′} a^(s)_{i′,k′} * a^(d−s)_{k′,j′} = a^(d)_{i′,j′}.

This completes the proof for B^(d). For B⁺, note that B⁺ = ∪_{d≥1} B^(d). □

Corollary Let B be a strictly upper-triangular matrix which is transitively closed. If A is any central submatrix of B, then A is transitively closed.

The second useful concept is that of the reduction-closure operation on a matrix.

Definition Let C = (c_{i,j}) be a strictly upper-triangular n × n matrix and let n > r > n/2 ≥ 1. Define the 2(n − r) × 2(n − r) matrix D = (d_{i,j}) to be the r-reduction of C if

  d_{i,j} = c_{i,j}    if i, j ≤ n − r − 1,
  d_{i,j} = c_{i′,j}   if i > n − r − 1, j ≤ n − r − 1,
  d_{i,j} = c_{i,j′}   if i ≤ n − r − 1, j > n − r − 1,
  d_{i,j} = c_{i′,j′}  if i, j > n − r − 1,
where i′ = i + 2r − n and j′ = j + 2r − n. The r-reduction of C is illustrated in Fig. 12.10(a). Intuitively, we form D by deleting the (n − r + 1)st to rth rows and columns.

Using the notion of the r-reduction of a matrix, we can define the reduction closure. Intuitively (as illustrated in Fig. 12.10(b)), we take the transitive closure of the r-reduced matrix and then restore each element to its original place in C.

[FIG. 12.10: (a) C partitioned into nine blocks, numbered 1 through 9 with block-band widths n − r, 2r − n, n − r; the r-reduction D consists of the corner blocks 1, 3, 7, 9. (b) D⁺ has blocks E₁, E₃, E₇, E₉, and C^{+(r)} is C with blocks 1, 3, 7, 9 replaced by E₁, E₃, E₇, E₉ and the central band 2, 4, 5, 6, 8 unchanged.]
Definition Let C = (c_{i,j}) be a strictly upper-triangular n × n matrix and let n > r > n/2 ≥ 1. Let D = (d_{i,j}) be the r-reduction of C and let E = (e_{i,j}) be the transitive closure of D. Define the n × n matrix C^{+(r)} = (c^{+(r)}_{i,j}) to be the reduction closure of C, where, for 0 ≤ i, j ≤ n − 1,

  c^{+(r)}_{i,j} = e_{i,j}    if i, j ≤ n − r − 1,
  c^{+(r)}_{i,j} = c_{i,j}    if n − r − 1 < i ≤ r − 1 or n − r − 1 < j ≤ r − 1,
  c^{+(r)}_{i,j} = e_{i′,j}   if i > r − 1, j ≤ n − r − 1,
  c^{+(r)}_{i,j} = e_{i,j′}   if i ≤ n − r − 1, j > r − 1,
  c^{+(r)}_{i,j} = e_{i′,j′}  if i, j > r − 1,

where i′ = i − (2r − n) and j′ = j − (2r − n).

Now we may begin to combine the partial results into a proof of the key lemma.

Lemma 12.5.5 (Valiant's Lemma) Let B be a strictly upper-triangular n × n matrix, let n > r > n/2 ≥ 1, and suppose that B is partitioned into nine blocks

  B =
  1  2  3
  4  5  6
  7  8  9

with block-band widths n − r, 2r − n, n − r, where the r × r submatrices

  B₁ =        B₂ =
  1  2        5  6
  4  5        8  9

of B are transitively closed. Define E_r as the submatrix of B given by

  E_r =
  0  2  0
  0  0  6
  0  0  0

where blocks 2 and 6 are taken from B. Then B⁺ = (B ∪ E_r²)^{+(r)}.
Proof The proof of the lemma is long and tedious. A sketch is given here, with the detailed proofs of the claims relegated to the problems.

Since B₁ and B₂ are transitively closed, and since B is strictly upper-triangular, the only part of B⁺ that is unknown is region 3, in which 0 ≤ i ≤ n − r − 1 and r ≤ j ≤ n − 1. This follows from the Corollary to Theorem 12.5.3 on central submatrices.

Let 𝒯* be the set of terms of B and let 𝒯 be the corresponding set of acceptable trees.

Claim 1 Let τ* be any (i₁, i_{t+1}) term of B, where i₁ ≤ n − r − 1 and i_{t+1} > r − 1. Then, for some i_p ≤ n − r − 1 and i_{q+1} ≥ r, τ* contains a unique (n − r − 1, r − 1)-minimal (i_p, i_{q+1}) term τ*₂. Moreover, either p = q or, for some τ*₃, τ*₄ ∈ 𝒯* and for some i_s such that n − r − 1 ≤ i_s ≤ r − 1,

  τ*₂ = τ*₃(b_{i_p,i_{p+1}}, ..., b_{i_{s−1},i_s}) * τ*₄(b_{i_s,i_{s+1}}, ..., b_{i_q,i_{q+1}}).

The proof of Claim 1 is left to Problem 2(a).

For any x, y, 0 ≤ x ≤ n − r − 1 ≤ r − 1 ≤ y ≤ n − 1, let U_{x,y}(B) be the set of all formally distinct (n − r − 1, r − 1)-minimal (x, y) terms of B.

Claim 2 For any x ≤ n − r − 1 and y > r − 1,

  ∪_{τ* ∈ U_{x,y}(B)} τ* = b_{x,y} ∪ ∪_{s=n−r}^{r−1} b⁺_{x,s} * b⁺_{s,y}.   (12.5.1)

Proof of claim By Proposition 12.5.1, b⁺_{x,s} is the union of all formally distinct (x, s) terms of B and b⁺_{s,y} is the union of all formally distinct (s, y) terms of B. The claim therefore follows from Claim 1. Since B₁ and B₂ are transitively closed, b_{x,s} = b⁺_{x,s} and b_{s,y} = b⁺_{s,y} for all x ≤ n − r − 1, n − r ≤ s ≤ r − 1, y > r − 1.

Note that Eq. (12.5.1) can be computed by the following procedure.
1. Matrix-multiply partitions 2 and 6 of B.
2. Form the union of the resulting (n − r) × (n − r) matrix with partition 3 of B.

Let C be the (n − r) × (n − r) matrix which is in position 3 of B ∪ E_r². We proceed to analyze C. It is clear, from Claim 2 and the construction of C, that the following is true.

Claim 3 For any i ≤ n − r − 1, j ≥ r − 1,

  c_{i,j−r} = ∪_{τ* ∈ U_{i,j}(B)} τ*.   (12.5.2)

In words, c_{i,j−r} is the union of all (n − r − 1, r − 1)-minimal (i, j) terms of B.
Note that, if j = i + 2r − n, Eq. (12.5.2) reduces to the following:

  c_{i, i+r−n} = ∪_{τ* ∈ U_{i, i+2r−n}(B)} τ*.

Let D be the r-reduction of B ∪ E_r² obtained by deleting the central 2r − n rows and columns. D is a 2(n − r) × 2(n − r) matrix. Clearly, d⁺_{i,j} is a union of formally distinct terms of D, from Proposition 12.5.1. Moreover, each term contains exactly one element from C. This follows from the strict ordering of the subscripts on the terms and the method of construction of D.

To finish the proof of the lemma requires establishing a correspondence between the terms of D and of B.

Claim 4 Let i ≤ n − r − 1, j ≥ r − 1, and j′ = j − 2r + n. Every (i, j′) term of D is a union of some formally distinct (i, j) terms of B.

The proof of Claim 4 is left as Problem 2(b). In the reverse direction another claim is needed.

Claim 5 Let i ≤ n − r − 1, j ≥ r − 1, and j′ = j − 2r + n. Every (i, j) term of B occurs in the formation of a unique (i, j′) term of D.

The proof of Claim 5 is relegated to Problem 2(c).

To complete the main proof, it follows from Claim 4 that d⁺_{i,j′} is a union of distinct (i, j) terms of B. From Claim 5, every (i, j) term of B appears in d⁺_{i,j′}. Thus d⁺_{i,j′} is exactly the union of all formally distinct (i, j) terms of B. From Proposition 12.5.1, d⁺_{i,j′} = b⁺_{i,j} for all i < n − r, j ≥ r, and j′ = j − 2r + n. But this is exactly the region of B which had to be verified, and the proof is complete. □

Note that, since

  E_r² =
  0  0  2·6
  0  0  0
  0  0  0
we have just shown that

  B⁺ =
  ( 1  2  3 ∪ 2·6 ) +(r)
  ( 4  5  6       )
  ( 7  8  9       )

This will provide a simple recursive method for the computation of B⁺. Now we are ready to show how to compute D⁺ in less than cubic time. If D is a strictly upper-triangular matrix whose elements are subsets of some set S and * is an arbitrary binary operation on S which satisfies axioms D1, D2, and N1, we wish to compute D⁺ as efficiently as possible. Since * is not necessarily associative, techniques from the literature that assume associativity are useless here. It should be observed that our method will be as efficient as any that is known in the associative case.

In what follows, it is convenient to assume that n is a power of 2. We shall return to this point later and will relax this assumption.

We now introduce a family of procedures that will be useful.

Definition We define the operation P_k of taking the transitive closure of an n × n strictly upper-triangular matrix B under the assumption that the [n − (n/k)] × [n − (n/k)] submatrices B₁ and B₂ (shown in Fig. 12.11) are already transitively closed. (Recall that a matrix D is transitively closed if D⁺ = D.)

[FIG. 12.11: B with overlapping submatrices B₁ (upper left) and B₂ (lower right), each of size n − n/k.]
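Before turning to the procedures P_k, the r-reduction and reduction closure used in Valiant's Lemma can be made concrete with a small sketch (0-origin indexing and Python set entries are assumed; closure is assumed to be any routine computing D⁺ of a set matrix, such as the one sketched earlier in this section).

```python
# Sketch: r-reduction and reduction closure of a strictly upper-triangular
# set matrix C, following the definitions above.

def r_reduction(C, r):
    """Delete the central rows and columns n-r .. r-1 of C (0-origin),
       keeping the first n-r and the last n-r of each."""
    n = len(C)
    keep = list(range(n - r)) + list(range(r, n))
    return [[C[i][j] for j in keep] for i in keep]

def reduction_closure(C, r, closure):
    """C^{+(r)}: transitively close the r-reduction, then restore each
       entry to its original position; the central band of C is untouched."""
    n = len(C)
    E = closure(r_reduction(C, r))
    keep = list(range(n - r)) + list(range(r, n))
    out = [row[:] for row in C]
    for a, i in enumerate(keep):
        for b, j in enumerate(keep):
            out[i][j] = E[a][b]
    return out
```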
It will turn out that we need only construct P₂, P₃, and P₄. Let T_i(n) be the time required to compute P_i for an n × n matrix. Also, let TR(n) be the time required to compute the transitive closure of an n × n matrix.

Now we give an algorithm to compute D⁺ in terms of P₂.

Algorithm 12.5.3 Let A be an n × n strictly upper-triangular matrix. We define a procedure trans(A) which returns the transitive closure of A.

1 If n = 2, then trans(A) = A.
2 Consider

  A =
  A₁  A₃
  0   A₂

where A₁ and A₂ are of size n/2.
3 Compute trans(A₁) and trans(A₂).
4 Let

  B =
  trans(A₁)  A₃
  0          trans(A₂)

5 Compute P₂(B) = A⁺.

Now we will argue that the algorithm works, and analyze it.

Lemma 12.5.6 Algorithm 12.5.3 correctly computes A⁺ and

  TR(n) = 2TR(n/2) + T₂(n) + O(n²).

Proof To see that P₂(B) = A⁺, we induct on k, where n = 2^k. Note that the result is true for k = 1. Suppose k ≥ 1 and that trans works correctly for n = 2^k. Applying the algorithm to a matrix A of size 2^{k+1}, we correctly compute trans(A₁) and trans(A₂), where

  A =
  A₁  A₃
  0   A₂
by using the induction hypothesis. Let

  B =
  A₁⁺  A₃
  0    A₂⁺

Now B has the square submatrices A₁⁺ and A₂⁺ of size 2^k. Applying P₂ yields the closure of B, which is B⁺ = A⁺. Clearly, the time bound is

  TR(n) = 2TR(n/2) + T₂(n) + O(n²). □

Next we turn to the interesting task of computing P₂.

Algorithm 12.5.4 Let B be an n × n matrix as shown below, where each square is of size n/4:

  B =
   1   2   3   4
   5   6   7   8
   9  10  11  12
  13  14  15  16

Assume that

  B₁ =        B₂ =
  1  2        11  12
  5  6        15  16

are already transitively closed.

1 If n = 2, then B⁺ = B.
2 Apply P₂ to

  B₃ =
   6   7
  10  11

to yield

  B₃⁺ =
   6   7′
  10  11
3 Apply P₃ to

  B₄ =
  1   2   3
  5   6   7′
  9  10  11

to yield

  B₄⁺ =
  1   2   3′
  5   6   7′
  9  10  11

4 Apply P₃ to

  B₅ =
   6   7′   8
  10  11   12
  14  15   16

to yield

  B₅⁺ =
   6   7′   8′
  10  11   12
  14  15   16

5 Apply P₄ to

   1   2   3′   4
   5   6   7′   8′
   9  10  11   12
  13  14  15   16

to yield B⁺.

Lemma 12.5.7 The previous algorithm computes B⁺ and

  T₂(n) = T₂(n/2) + 2T₃(3n/4) + T₄(n).

Proof The equation is trivial to verify. □

To see that the algorithm works, note that we were allowed to assume that B₁ and B₂ were transitively closed. Note that blocks 6 and 11 are transitively closed because they are central submatrices of transitively closed matrices. (Cf. the Corollary to Theorem 12.5.3.) Thus we apply P₂ to B₃ to get B₃⁺. Next consider Step 3. To apply P₃ to B₄, we must check to see whether

  1  2
  5  6

is transitively closed, and it is, by assumption. Moreover, we require that

   6   7′
  10  11

be transitively closed, which it is, by Step 2. Therefore, P₃(B₄) = B₄⁺. Now we want
to apply P₃ to B₅. But

   6   7′        11  12
  10  11   and   15  16

are transitively closed, by Step 2 and our assumption. To invoke P₄ in Step 5 we need only check that B₄⁺ and B₅⁺ are transitively closed. But this is true, from Steps 3 and 4, respectively. Thus we obtain B⁺. □

With the aid of Valiant's Lemma, it is easy to give algorithms for P₃ and P₄.

Algorithm 12.5.5 (Computation of P₃) Given an n × n upper-triangular matrix B. Let us represent B by

  1  2  3
  4  5  6
  7  8  9

where each square is of size n/3. We may assume that

  B₁ =        B₂ =
  1  2        5  6
  4  5        8  9

are transitively closed.

1 Compute B ∪ E_r * E_r = C, where r = 2n/3. Recall that

  E_r =
  0  2  0
  0  0  6
  0  0  0

Thus

  C =
  1  2  3′
  4  5  6
  7  8  9

where 3′ = 3 ∪ 2·6.

2 Compute C^{+(r)} for r = 2n/3, using P₂, which means that we compute P₂ of

  1  3′
  7  9
This is permissible since blocks 1 and 9 are closed. The algorithm for P₄ is similar, with r = 3n/4.

Algorithm 12.5.6 (Computation of P₄) Let B be an n × n upper-triangular matrix, which is represented by

   1   2   3   4
   5   6   7   8
   9  10  11  12
  13  14  15  16

where each entry is a matrix of size n/4. Assume that

  1   2   3         6   7   8
  5   6   7   and  10  11  12
  9  10  11        14  15  16

are transitively closed.

1 Compute B ∪ E_r * E_r = C, where r = 3n/4, so

  E_r =
  0  2  3   0
  0  0  0   8
  0  0  0  12
  0  0  0   0

Then

  C =
   1   2   3   4′
   5   6   7   8
   9  10  11  12
  13  14  15  16

where 4′ = 4 ∪ (2·8 ∪ 3·12).
2 Compute C^{+(r)} for r = 3n/4, using P₂; i.e., apply P₂ to

   1   4′
  13  16

Now we begin to analyze these procedures.

Lemma 12.5.8 Algorithms 12.5.5 and 12.5.6 compute P₃ and P₄, respectively. Moreover,

  T₃(n) = M(n) + T₂(2n/3) + O(n²),
  T₄(n) = M(n) + T₂(n/2) + O(n²).

Proof Correctness follows from Valiant's Lemma, while the equations follow trivially. □

At this point, we have reduced the computation of the transitive closure to the computation of the product of two upper-triangular matrices. We must estimate the time required to do this. Let M(n) be the time necessary to multiply two such n × n matrices.

Lemma 12.5.9 If n = 2^k for some k ≥ 1, and there is some γ ≥ 2 such that, for all m,

  M(2^{m+1}) ≥ 2^γ M(2^m),

then

  T₂(n) ≤ c₁M(n) for some constant c₁ if γ > 2,
  T₂(n) ≤ c₂M(n) log n for some constant c₂ if γ = 2.

Proof If we substitute the result of Lemma 12.5.8 into the result of Lemma 12.5.7, we get

  T₂(n) = 4T₂(n/2) + 2M(3n/4) + M(n) + O(n²).   (12.5.3)

By the monotonicity of M, M(3n/4) ≤ M(n), so

  T₂(n) ≤ 4T₂(n/2) + 3M(n) + O(n²).   (12.5.4)

The solution of (12.5.4) is claimed to be:

  T₂(n) ≤ O(n² log n) + 3M(n) Σ_{m=0}^{log n} 2^{(2−γ)m}.   (12.5.5)
If we let n = 2^k, the claim becomes:

  T₂(2^k) ≤ c(k 2^{2k}) + 3M(2^k) Σ_{m=0}^{k} 2^{(2−γ)m}   (12.5.6)

for some constant c ≥ c₁/4, where c₁ is the constant in (12.5.4). We can verify this by induction on k. We shall just do the induction step here. Since M(2^k) ≤ 2^{−γ}M(2^{k+1}), it follows that

  T₂(2^{k+1}) ≤ c(k + 1)2^{2(k+1)} + 3M(2^{k+1})[1 + 2^{(2−γ)} Σ_{m=0}^{k} 2^{(2−γ)m}]

or

  T₂(2^{k+1}) ≤ c(k + 1)2^{2(k+1)} + 3M(2^{k+1}) Σ_{m=0}^{k+1} 2^{(2−γ)m}.

From inequality (12.5.4),

  T₂(2^{k+1}) ≤ 4T₂(2^k) + 3M(2^{k+1}) + c₁2^{2k}.

By the induction hypothesis,

  T₂(2^{k+1}) ≤ 3M(2^{k+1}) + c₁2^{2k} + 4c(k 2^{2k}) + 12M(2^k) Σ_{m=0}^{k} 2^{(2−γ)m}.   (12.5.7)

Since c ≥ c₁/4, we have 4ck + c₁ ≤ 4ck + 4c = 4c(k + 1). Multiplying both sides by 2^{2k} yields:

  c₁2^{2k} + 4ck2^{2k} = c₁2^{2k} + ck2^{2k+2} ≤ c(k + 1)2^{2k+2}.

Substituting this in (12.5.7) yields:

  T₂(2^{k+1}) ≤ 3M(2^{k+1}) + 12M(2^k) Σ_{m=0}^{k} 2^{(2−γ)m} + c(k + 1)2^{2k+2}.

This completes the induction proof of (12.5.6) and also establishes (12.5.5). Since the series Σ_{m=0} x^m converges if |x| < 1, the result follows. □

Next we use the present result to estimate TR(n).

Theorem 12.5.4 If there exists γ ≥ 2 such that, for all m,

  M(2^{m+1}) ≥ 2^γ M(2^m),

then

  TR(n) ≤ O(M(n)) if γ > 2,
  TR(n) ≤ O(M(n) log n) if γ = 2.
Proof For the moment, we continue to assume that n = 2^k. It follows from Eq. (12.5.3) that T₂(n) ≥ 4T₂(n/2). We claim that

  TR(n) ≤ O(n²) + T₂(n) Σ_{m=0}^{log n} 2^{−m}.   (12.5.8)

Since n = 2^k, we shall show by induction on k that

  TR(2^k) ≤ T₂(2^k) Σ_{m=0}^{k} 2^{−m} + O(2^{2k}).   (12.5.9)

We shall just do the induction step. Lemma 12.5.6 gives that

  TR(2^{k+1}) = 2TR(2^k) + T₂(2^{k+1}) + O(2^{2(k+1)}).

Applying the induction hypothesis yields

  TR(2^{k+1}) ≤ T₂(2^{k+1}) + 2[T₂(2^k) Σ_{m=0}^{k} 2^{−m}] + O(2^{2(k+1)}).

Since T₂(2^k) ≤ T₂(2^{k+1})/4, we have

  TR(2^{k+1}) ≤ T₂(2^{k+1}) Σ_{m=0}^{k+1} 2^{−m} + O(2^{2(k+1)}).

This completes the induction step. It follows from (12.5.9) that:

  TR(n) ≤ 2T₂(n) + O(n²).

Applying Lemma 12.5.9 gives the result if n = 2^k.

Now we deal with nonpowers of 2. It may have seemed that we would run into difficulty with assuming that n = 2^k and using P₃, which would seem to require that the size of the matrix be divisible by 3. Closer observation of P₂, which calls P₃, indicates that if n = 2^k, P₃ is applied to matrices of size 3n/4 = 3·2^{k−2}. When we take the "reduction by one third" operation, we deal with matrices whose blocks are of size 2^{k−2}, and there is no problem.

To complete the proof, we merely pad out an n × n matrix to the next higher power of 2 if n ≠ 2^k, which is 2^⌈log₂ n⌉. The method of padding is illustrated in Fig. 12.12.

[FIG. 12.12: the original n × n matrix in the upper left corner, zeros elsewhere, in a matrix of size 2^⌈log₂ n⌉.]
Note that the padded matrix is still strictly upper-triangular if the original one was. Note that there is a constant c so that, for all n,

  cM(n) ≥ M(2^⌈log n⌉),

and the final result now holds for all n. □

Now we can combine our various results to obtain the following important theorem.

Theorem 12.5.5 Let S be a finite set and * an arbitrary binary operation on S which satisfies axioms D1, D2, and N1. If D is a strictly upper-triangular matrix whose entries are subsets of S, then the time required to compute D⁺ is TR(n) and

  TR(n) ≤ cBM(n) ≤ cn^2.81 if BM(n) ≥ n^{2+ε} for some ε > 0,
  TR(n) ≤ cn² log n if BM(n) ≤ n^{2+ε} for any ε > 0.

Proof The result follows from Theorems 12.2.3, 12.5.2, and 12.5.4. □

We can now combine our individual results, to state the main result.

Theorem 12.5.6 Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form. The time needed to recognize a string of length n generated by G is at most a constant times

  BM(n) ≤ n^2.81 if BM(n) ≥ n^{2+ε} for some ε > 0,
  n² log n if BM(n) ≤ n^{2+ε} for any ε > 0.

Proof Algorithm 12.5.1 requires O(n²) steps to initialize the matrix and TR(n + 1) steps to compute the transitive closure. It is easily shown that the binary operation · defined in Section 4.1 satisfies axioms D1, D2, and N1. The time bound then follows, from Theorem 12.5.5. □

PROBLEMS

1 Prove Proposition 12.5.1.
2 Complete the following parts of the proof of Lemma 12.5.5:
a) Claim 1
b) Claim 4
c) Claim 5
3 In one abstract formulation in this section, we started from a binary operation * on S and extended * to work on subsets. Are the three axioms given exactly equivalent to this situation? That is, if we start out as above, do the axioms always hold? If the axioms hold, may * be redefined as above?

12.6 A GOOD PRACTICAL ALGORITHM

In this section, another recognition algorithm is presented that is quite suitable for practical use. It works on any context-free grammar, so no initial transformations of the grammar are needed.
It can be shown that the simplest variant of the algorithm runs in time cn³, where c is proportional to ‖G‖. It turns out that, in applications to the grammars which occur in natural-language processing, dependence on ‖G‖ is more critical than dependence on n. It also turns out that this method can be sped up by a factor of log n.

The algorithm to be presented will construct an (n + 1) × (n + 1) upper-triangular matrix. The entries in the matrix will be "dotted rules," which we now define.

Definition Let G = (V, Σ, P, S) be a context-free grammar and let · be a symbol not in V. If A → αβ is in P, then we say that A → α·β is a dotted rule of G.

Our intuition tells us that the "dot" separates α and β; the α part of the rule has been found to be consistent with the input, while nothing is yet known about β. If A → Λ is in P, we write the dotted rule A → · and let · concatenate with Λ.

The matrix will be filled in by using certain operations that will now be defined.

Definition Let G = (V, Σ, P, S) be a context-free grammar. Let Q be a set of dotted rules, and let R ⊆ V. Define:

  Q × R = {A → αBβ·γ | A → α·Bβγ is in Q, β ⇒* Λ, and B ∈ R};
  Q * R = {A → αBβ·γ | A → α·Bβγ is in Q, β ⇒* Λ, B ⇒* C for some C ∈ R}.

The idea of the × product is to extend the "product" of the Cocke–Kasami–Younger algorithm to the case in which Λ-rules exist. The *-product is a method of precomputing chain derivations. Next we extend these products to the case where both arguments are sets of dotted rules. Note that:

  Q × R ⊆ Q * R.

Definition Let G = (V, Σ, P, S) be a context-free grammar and let Q, R be sets of dotted rules. Define

  Q × R = {A → αBβ·γ | A → α·Bβγ ∈ Q, β ⇒* Λ, and B → η· is in R}

and

  Q * R = {A → αBβ·γ | A → α·Bβγ ∈ Q, β ⇒* Λ, B ⇒* C for some C ∈ N and C → η· is in R}.

Again, note that:

  Q × R ⊆ Q * R.

The algorithm that we shall give uses a predictor to guess which dotted rules may be needed next.

Definition Let G = (V, Σ, P, S) be a context-free grammar and let R ⊆ V. Define

  predict(R) = {C → γ·δ | C → γδ is in P, γ ⇒* Λ, B ⇒* Cη for some B ∈ R and some η ∈ V*}.

If R is a set of dotted rules, then

  predict(R) = predict({B | A → α·Bβ is in R}).
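The × product can be sketched in a few lines of Python (the tuple representation of dotted rules is an assumption of this sketch, not the text's notation): a dotted rule A → α·β becomes (A, alpha, beta), and the dot is moved over one matched symbol and then over any nullable symbols that follow.

```python
# Sketch of the x product on dotted rules.  NULLABLE, the set of symbols
# deriving Λ, is precomputable from the grammar; here it is given for the
# example grammar used below (S -> AB, A -> Λ, B -> CDC, C -> Λ, D -> a).

NULLABLE = {'A', 'C'}

def advance(rule, symbols):
    """All ways to move the dot over one symbol of 'symbols' and then over
       any nullable symbols that follow; the core of both x and *."""
    A, alpha, beta = rule
    out = set()
    if beta and beta[0] in symbols:
        k = 1
        out.add((A, alpha + beta[:k], beta[k:]))
        while k < len(beta) and beta[k] in NULLABLE:
            k += 1
            out.add((A, alpha + beta[:k], beta[k:]))
    return out

def x_product(Q, done):
    """Q x R, where 'done' is the set of symbols B having B -> eta . in R
       (for terminal scanning, 'done' is simply {a_j})."""
    out = set()
    for rule in Q:
        out |= advance(rule, done)
    return out

Q = {('S', (), ('A', 'B')), ('B', ('C',), ('D', 'C'))}
print(x_product(Q, {'D'}))
# {('B', ('C','D'), ('C',)), ('B', ('C','D','C'), ())}: B -> CD.C and B -> CDC.
```

The * product differs only in the test on the symbol after the dot: instead of B itself being completed, some C with B ⇒* C must be.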
It is important to note that predict depends only on the grammar and can be precomputed for each variable or dotted rule.

Example Consider the grammar

  S → AB   A → Λ   B → CDC   C → Λ   D → a

Then

  predict({S}) = {S → ·AB, S → A·B, A → ·, B → ·CDC, B → C·DC, C → ·, D → ·a}.

Let Q = predict({S}). Then

  Q × {B} = {S → AB·},
  Q * {D} = {S → AB·, B → CD·C, B → CDC·}.

Now we can present the main algorithm.

Algorithm 12.6.1 Let G = (V, Σ, P, S) be any context-free grammar. Let w = a₁ ··· a_n, where n ≥ 0 and a_k ∈ Σ for each k, 1 ≤ k ≤ n, be the string to be recognized. Form an (n + 1) × (n + 1) matrix T = (t_{i,j}), as follows:

begin
  t_{0,0} := predict({S});
  for j := 1 to n do
    begin comment build column j, given columns 0, ..., j − 1;
    scanner: for 0 ≤ i ≤ j − 1 do
      t_{i,j} := t_{i,j−1} × {a_j};
    completer: for k := j − 1 downto 0 do
      begin
        t_{k,j} := t_{k,j} ∪ t_{k,k} * t_{k,j};
        for i := k − 1 downto 0 do
          t_{i,j} := t_{i,j} ∪ t_{i,k} × t_{k,j}
      end;
    predictor: t_{j,j} := predict( ∪_{0≤i≤j−1} t_{i,j} )
    end
end.

The order of computation will now be enumerated. Suppose we have built columns 0, ..., j − 1. To construct column j, the scanner puts partial information into the off-diagonal part of the column. The completer works up the column from t_{j−1,j} to t_{0,j}.
The computation of t_{k,j}, where 0 ≤ k < j, is done by first computing t_{k,j} := t_{k,j} ∪ t_{k,k} * t_{k,j}. That is, each entry t_{k,j} is filled in by adding the *-product of the diagonal entry of that row to the original contents of t_{k,j}. This is illustrated in Fig. 12.13. Next, t_{k,j} is cross-multiplied against t_{i,k} and the result added to t_{i,j} for all rows i above k. Then we continue to the next entry, which is t_{k−1,j}. This is illustrated in Fig. 12.14. Only after t_{0,j} has been completed is the predictor used. The predictor fills in t_{j,j}.

[FIG. 12.13: the diagonal entry t_{k,k} is *-multiplied into t_{k,j}.]
[FIG. 12.14: t_{k,j} is ×-multiplied against each t_{i,k}, i < k, and added into t_{i,j}.]

Before studying the algorithm in detail, let us simply execute it on the first example of this section, when we try to recognize the input w = a. Since we have precomputed predict, the first executable statement yields:

  t_{0,0} = {S → ·AB, A → ·, S → A·B, B → ·CDC, C → ·, B → C·DC, D → ·a}.

The outer for loop is executed once, for j = 1. The scanner sets t_{0,1} to {D → a·}.
The inner for loop is executed once, with k = 0. Next, the completer adds to t_{0,1} some new dotted rules, and the current value of t_{0,1} is:

  t_{0,1} = {D → a·, S → AB·, B → CD·C, B → CDC·}.

The next for loop is not executed. The computation is concluded in the predictor, where t_{1,1} is set to {C → ·}. The final T matrix is shown in Fig. 12.15.

FIG. 12.15
  t_{0,0} = {S → ·AB, A → ·, S → A·B, B → ·CDC, C → ·, B → C·DC, D → ·a}
  t_{0,1} = {D → a·, S → AB·, B → CD·C, B → CDC·}
  t_{1,1} = {C → ·}

As we shall see shortly, w ∈ L(G), because S → AB· is in t_{0,1}.

Example Let G be the grammar

  E → E*E | E+E | a

and let w = a + a * a. Note that the string w has two parses. We present the recognition matrix in Fig. 12.16, and ask the reader to verify the computation.

FIG. 12.16 (entries not listed are empty)
  t_{0,0} = {E → ·E*E, E → ·E+E, E → ·a}
  t_{0,1} = {E → a·, E → E·*E, E → E·+E}
  t_{0,2} = {E → E+·E}
  t_{0,3} = {E → E+E·, E → E·*E, E → E·+E}
  t_{0,4} = {E → E*·E}
  t_{0,5} = {E → E*E·, E → E+E·, E → E·*E, E → E·+E}
  t_{2,2} = {E → ·E*E, E → ·E+E, E → ·a}
  t_{2,3} = {E → a·, E → E·*E, E → E·+E}
  t_{2,4} = {E → E*·E}
  t_{2,5} = {E → E*E·, E → E·*E, E → E·+E}
  t_{4,4} = {E → ·E*E, E → ·E+E, E → ·a}
  t_{4,5} = {E → a·, E → E·*E, E → E·+E}
Since E → E+E· is in t_{0,5} (and E → E*E· is in t_{0,5}), we conclude that w ∈ L(G).

Some new terminology is needed for the proof that the algorithm works.

Definition Let G = (V, Σ, P, S) be a context-free grammar, let w = a₁ ··· a_n, a_k ∈ Σ for each 1 ≤ k ≤ n, let A → αβ be in P, and let 0 ≤ i ≤ j ≤ n. A dotted rule A → α·β is called (i, j)-consistent if there exists γ ∈ V* such that

  S ⇒* a₁ ··· a_i Aγ   and   α ⇒* a_{i+1} ··· a_j.

A set of dotted rules is (i, j)-consistent if each rule in the set is. A set of (i, j)-consistent rules is (i, j)-complete if it contains all (i, j)-consistent rules.

There are only a finite number of (i, j)-consistent rules. Figure 12.17 shows the interpretation of an (i, j)-consistent rule.

[FIG. 12.17: a derivation tree with root S; A sits above the span following a₁ ··· a_i, and α, the portion of A's rule left of the dot, derives a_{i+1} ··· a_j.]

It is convenient to prove a lemma about the preservation of consistency, which will substantially reduce the level of detail involved in carrying out the proof that the algorithm works.

Lemma 12.6.1 Let G = (V, Σ, P, S) be a context-free grammar and let w = a₁ ··· a_n, n ≥ 0, where each a_k ∈ Σ for 1 ≤ k ≤ n. Let 0 ≤ i ≤ j ≤ n.

a) If t_{i,j−1} is (i, j − 1)-consistent, then t_{i,j−1} × {a_j} is (i, j)-consistent.
b) If t_{i,i} and t_{i,j} are (i, i)- and (i, j)-consistent, respectively, then t_{i,i} * t_{i,j} is (i, j)-consistent.
c) For any k, 0 ≤ k ≤ j, if t_{i,k} and t_{k,j} are (i, k)- and (k, j)-consistent, respectively, then t_{i,k} × t_{k,j} is (i, j)-consistent.
d) predict( ∪_{0≤i<j} t_{i,j} ) is (j, j)-consistent if each t_{i,j} is (i, j)-consistent, where i < j.
Proof A → αa_jβ·γ is in t_{i,j−1} × {a_j} if and only if

  A → α·a_jβγ is in t_{i,j−1}   (12.6.1)

and

  β ⇒* Λ.   (12.6.2)

Since t_{i,j−1} is (i, j − 1)-consistent,

  S ⇒* a₁ ··· a_i Aθ for some θ ∈ V*   (12.6.3)

and

  α ⇒* a_{i+1} ··· a_{j−1}.   (12.6.4)

Now,

  αa_jβ ⇒* a_{i+1} ··· a_{j−1} a_j β   by (12.6.4),
        ⇒* a_{i+1} ··· a_j           by (12.6.2).

This, together with (12.6.3), allows us to conclude that t_{i,j−1} × {a_j} is (i, j)-consistent, and completes the proof of part (a).

Next, we turn to part (b). A → αBβ·γ is in t_{i,i} * t_{i,j} if and only if

  A → α·Bβγ is in t_{i,i},   β ⇒* Λ,   (12.6.5)
  B ⇒* C,   (12.6.6)

and C → η· is in t_{i,j}. Since t_{i,i} is (i, i)-consistent,

  α ⇒* Λ   (12.6.7)

and

  S ⇒* a₁ ··· a_i Aθ   (12.6.8)

for some θ. Since C → η· is in t_{i,j}, we know that

  η ⇒* a_{i+1} ··· a_j.   (12.6.9)

Combining (12.6.8) with (12.6.5), (12.6.6), (12.6.7), and (12.6.9) leads us to conclude that

  αBβ ⇒* Bβ ⇒* C ⇒ η ⇒* a_{i+1} ··· a_j

and so A → αBβ·γ is (i, j)-consistent.

Now, we focus on part (c). A → αBβ·γ is in t_{i,k} × t_{k,j} if and only if

  A → α·Bβγ is in t_{i,k},   (12.6.10)
  β ⇒* Λ,   (12.6.11)

and

  B → η· is in t_{k,j}.   (12.6.12)

By (12.6.10) and the consistency of t_{i,k},

  S ⇒* a₁ ··· a_i Aθ for some θ ∈ V*,   (12.6.13)
  α ⇒* a_{i+1} ··· a_k.   (12.6.14)
By (12.6.12) and the consistency of t_{k,j},

  η ⇒* a_{k+1} ··· a_j.   (12.6.15)

We must derive some strings from αBβ. That is,

  αBβ ⇒* a_{i+1} ··· a_k Bβ                  by (12.6.14)
      ⇒  a_{i+1} ··· a_k ηβ                  by (12.6.12)
      ⇒* a_{i+1} ··· a_k a_{k+1} ··· a_j β   by (12.6.15)
      ⇒* a_{i+1} ··· a_j                     by (12.6.11).

This derivation, together with (12.6.13), proves that t_{i,k} × t_{k,j} is (i, j)-consistent, and completes the proof of part (c).

Turning to part (d) of the lemma, C → θ·ξ is in predict( ∪_{0≤i<j} t_{i,j} ) if and only if there exist i < j and B ∈ N such that

  θ ⇒* Λ,   (12.6.16)
  B ⇒* Cη for some η ∈ V*,   (12.6.17)
  A → α·Bβ is in t_{i,j}.   (12.6.18)

Since t_{i,j} is assumed to be (i, j)-consistent,

  S ⇒* a₁ ··· a_i Aγ for some γ ∈ V*,   (12.6.19)
  α ⇒* a_{i+1} ··· a_j.   (12.6.20)

Note that

  S ⇒* a₁ ··· a_i Aγ              by (12.6.19)
    ⇒  a₁ ··· a_i αBβγ            by (12.6.18)
    ⇒* a₁ ··· a_i a_{i+1} ··· a_j Bβγ   by (12.6.20)
    ⇒* a₁ ··· a_j Cηβγ            by (12.6.17).

Using the last derivation with (12.6.16) allows us to conclude that C → θ·ξ is (j, j)-consistent, and completes the proof of the lemma. □

We are now ready to prove that the algorithm works. Understanding this proof is the key to modifying the algorithm to improve its performance.

Theorem 12.6.1 Let G = (V, Σ, P, S) be a context-free grammar and let w = a₁ ··· a_n, each a_k ∈ Σ for 1 ≤ k ≤ n. Let T be the recognition matrix produced by Algorithm 12.6.1. Then, for any 0 ≤ i ≤ j ≤ n, t_{i,j} is both (i, j)-consistent and complete. Equivalently, A → α·β is in t_{i,j} if and only if there exists γ ∈ V* such that

  S ⇒* a₁ ··· a_i Aγ   and   α ⇒* a_{i+1} ··· a_j.
Proof The argument will be an induction on the order in which the computation is carried out.

Basis. We must show that predict({S}) = t_{0,0} is (0, 0)-consistent and complete. It follows, from the definition of predict, that t_{0,0} is (0, 0)-consistent because, if C → θ·ξ is in predict({S}), then θ ⇒* Λ and S ⇒* Cη, and hence, C → θ·ξ is (0, 0)-consistent. Conversely, if C → θ·ξ is (0, 0)-consistent, then, by the definition of (0, 0)-consistency, θ ⇒* Λ and there is some η ∈ V* such that S ⇒* Cη, which implies that C → θ·ξ is in predict({S}).

Induction step. Assume that we are about to make our final assignment to t_{i,j} as specified below. As an induction hypothesis, assume that, after final assignment to all earlier entries t_{i′,j′}, the t_{i′,j′} are (i′, j′)-consistent and complete.† We must show that, for t_{i,j}, the following propositions are true:

a) All dotted rules added to t_{i,j} are (i, j)-consistent.
b) If i < j, then t_{i,j} will be (i, j)-complete after execution of the program statement t_{k,j} := t_{k,j} ∪ t_{k,k} * t_{k,j}, when k and j have the values i and j, respectively.
c) The t_{j,j} will be (j, j)-complete after execution of the statement labeled predictor.

The proof of part (a) follows immediately, because the induction hypothesis asserts that t_{i,j−1} is (i, j − 1)-consistent and Lemma 12.6.1(a) shows that t_{i,j−1} × {a_j} is (i, j)-consistent. That assignment is done in the scanner. Similarly, the other parts of Lemma 12.6.1 assert that all additions to t_{i,j} are (i, j)-consistent.

To show (b), let i < j and let A → α·β be an (i, j)-consistent dotted rule. By (i, j)-consistency, it is known that

  α ⇒* a_{i+1} ··· a_j.   (12.6.21)

Since i < j, a_{i+1} ··· a_j ≠ Λ, and so there is some X ∈ V which is the ancestor of a_j in (12.6.21):

  α = α₁Xα₂ for some α₁, α₂ ∈ V*.

Also there exists k, i ≤ k < j, such that

† Our formulation of the induction step is complicated by the fact that some entries are partially completed before the final values are assigned. It is possible to state the exact characterization of the full matrix at any time during the execution of the algorithm, but it is cumbersome and involves extra notation. By stating the induction hypothesis this way, we can eliminate the extra notation.
  α₁ ⇒* a_{i+1} ··· a_k,   (12.6.22)
  X ⇒* a_{k+1} ··· a_j,   (12.6.23)
  α₂ ⇒* Λ.   (12.6.24)

It is easily checked that A → α₁·Xα₂β is (i, k)-consistent. Since k < j, the induction hypothesis applies, and A → α₁·Xα₂β is in t_{i,k}. There are now three cases to be considered.

CASE 1. X ∈ Σ. Then X = a_j and k = j − 1, from (12.6.23). This case is illustrated in Fig. 12.18. Then A → α₁·a_jα₂β is in t_{i,j−1}, so A → α₁a_jα₂·β is in t_{i,j−1} × {a_j}. Hence it is added to t_{i,j} by the scanner.

[FIG. 12.18: the leaf a_j is the rightmost symbol derived from α.]

CASE 2. X ∈ N and i < k. This case is illustrated in Fig. 12.19. Rewriting (12.6.23), we have:

  X ⇒ δ ⇒* a_{k+1} ··· a_j

and X → δ is in P. It follows easily that X → δ· is (k, j)-consistent and must be in t_{k,j}, by the induction hypothesis, since i < k. But then A → α·β will be in t_{i,k} × t_{k,j}, so it will be added to t_{i,j} by the completer loop immediately after the computation of t_{k,j} is finished.

CASE 3. X ∈ N and i = k. Now we find that (12.6.22) and (12.6.23) become:

  α₁ ⇒* Λ,
  X ⇒* a_{i+1} ··· a_j.   (12.6.25)
[FIG. 12.19: X spans a_{k+1} ··· a_j with i < k.]
[FIG. 12.20: X spans the whole of a_{i+1} ··· a_j.]

The situation is illustrated in Fig. 12.20. Expanding (12.6.25), which is possible since i < j, we have:

  X ⇒ μ ⇒* a_{i+1} ··· a_j   (12.6.26)

for some μ ∈ V*. Hence, X → μ· is (i, j)-consistent. Now, however, t_{i,j} may not be complete when

  t_{i,j} := t_{i,j} ∪ t_{i,i} * t_{i,j}   (12.6.27)

is executed, so the induction hypothesis does not apply directly as it did in Cases 1 and 2.
Let us examine a generation tree of (12.6.26). See Fig. 12.20. There will be an internal node in the tree, which is labeled Y and is the lowest common ancestor of a_{i+1} and a_j. That is,

  X ⇒* θ₁Yθ₂,   θ₁ ⇒* Λ,   θ₂ ⇒* Λ,   Y → δ is in P.

This situation is illustrated in Fig. 12.21.

[FIG. 12.21: Y, the lowest common ancestor of a_{i+1} and a_j, with nullable θ₁ and θ₂ on either side.]

There is some Z ∈ V such that δ = δ₁Zδ₂,

  Z ⇒* a_{ℓ+1} ··· a_j,   δ₂ ⇒* Λ,   δ₁ ⇒* a_{i+1} ··· a_ℓ,   for some i ≤ ℓ < j.

Note that Z ∈ V, so that Z = a_j is included in this case. If i + 1 = j, then ℓ = i and Z ⇒* a_j. But then Z is a common ancestor of a_{i+1} (= a_j) and a_j, which is a contradiction unless Z = a_j. On the other hand, if i + 1 < j, then i < ℓ < j, because ℓ = i would imply that Z is a lower common ancestor of a_{i+1} and a_j than Y.

In either case, Y → δ· is (i, j)-consistent. Moreover, Y → δ· falls under Case 1 or Case 2 above, respectively [with Y → δ₁Zδ₂· in place of A → α₁Xα₂·β, Z in place of X, and ℓ in place of k]. By the arguments used earlier, Y → δ· was added to t_{i,j} before the statement

  t_{i,j} := t_{i,j} ∪ t_{i,i} * t_{i,j}

was executed.
Since i < j and A → α₁·Xα₂β is in t_{i,i} by the induction hypothesis, and X ⇒* Y, we have that A → α₁Xα₂·β is in t_{i,i} * t_{i,j}, so that A → α·β will be in t_{i,j} after execution of (12.6.27). This completes the proof of part (b).

To prove part (c), let C → η·θ be a dotted rule that is (j, j)-consistent. By definition,

  S ⇒* a₁ ··· a_j Cγ   (12.6.28)

for some γ ∈ V* and

  η ⇒* Λ.   (12.6.29)

Let the least common ancestor of a_j and C be a node labeled A, and let the production A → αBβ be used there. See Fig. 12.22. α contains an ancestor of a_j, while B ∈ V is an ancestor of C.

[FIG. 12.22: the node A above a_j and C; α covers a_j's ancestor, B covers C.]

Now A → α·Bβ must be (i, j)-consistent for some i < j. That is, there is some i so that

  α ⇒* a_{i+1} ··· a_j

because α contains an ancestor of a_j. By the induction hypothesis, A → α·Bβ is in

  t_{i,j} ⊆ ∪_{p=0}^{j−1} t_{p,j}.

Since B is an ancestor of C (but not of a_j), we have

  B ⇒* Cξ
for some ξ ∈ V*. Thus, C → η·θ must be in predict( ∪_{p<j} t_{p,j} ), which completes the proof of (c) and of the theorem. □

Corollary Let G, w, T be as in Theorem 12.6.1. Then w ∈ L(G) if and only if S → α· is in t_{0,n} for some α ∈ V*.

Proof By the theorem, S → α· is in t_{0,n} if and only if S ⇒ α ⇒* a₁ ··· a_n = w. □

Corollary Algorithm 12.6.1 is an on-line recognition algorithm.

Note that the algorithm would still work if * were used instead of ×. This is true because * preserves consistency, no extra dotted rules can occur, and none will be omitted.

In order to analyze the algorithm we have given, we shall modify it somewhat. The analysis will be simplified by using the * operation in place of the × operation throughout the algorithm. Basically, this causes chain derivations to be handled as soon as possible, rather than as late as possible. As we remarked earlier, replacement of × by * preserves the correctness of the algorithm.

We use t_j to denote the vector that is the jth column of T; that is, the ith component of t_j is t_{i,j}. Union of two vectors, or * of a vector by a scalar, is done componentwise. Algorithm 12.6.1 has been recast into vector notation below.

Algorithm 12.6.2 (Vector version of Algorithm 12.6.1)

begin
  t₀ := (predict({S}), ∅, ..., ∅)ᵀ;
  for j := 1 to n do
    begin
    scanner: t_j := t_{j−1} * {a_j};
    completer: for k := j − 1 downto 0 do
      t_j := t_j ∪ t_k * t_{k,j};
    predictor: t_{j,j} := predict( ∪_{0≤i<j} t_{i,j} )
    end
end.
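A compact Python sketch of the algorithm may be useful; it follows the subscripted form of Algorithm 12.6.1. The representation and helper names are assumptions of this sketch, not the text's notation: a dotted rule A → α·β is (A, alpha, beta), and the relations γ ⇒* Λ, B ⇒* Cη (for predict), and B ⇒* C (for *) are precomputed from the grammar.

```python
# Sketch of Algorithm 12.6.1 for an arbitrary context-free grammar.

def recognize(P, N, S, w):
    V = N | {X for _, rhs in P for X in rhs}
    nullable = set()
    while True:
        new = {A for A, rhs in P if all(X in nullable for X in rhs)} - nullable
        if not new:
            break
        nullable |= new
    def close(pairs):                     # reflexive-transitive closure
        reach = {X: {X} for X in V}
        while True:
            grown = False
            for a, b in pairs:
                for X in V:
                    if a in reach[X] and b not in reach[X]:
                        reach[X].add(b); grown = True
            if not grown:
                return reach
    cpairs, upairs = set(), set()         # left-corner and chain pairs
    for A, rhs in P:
        for p in range(len(rhs)):
            if all(Y in nullable for Y in rhs[:p]):
                cpairs.add((A, rhs[p]))                    # A =>* rhs[p] ...
                if all(Y in nullable for Y in rhs[p + 1:]):
                    upairs.add((A, rhs[p]))                # A =>* rhs[p]
    corner, unit = close(cpairs), close(upairs)

    def advance(Q, ok):                   # dot over one symbol, then nullables
        out = set()
        for A, alpha, beta in Q:
            if beta and ok(beta[0]):
                k = 1
                out.add((A, alpha + beta[:k], beta[k:]))
                while k < len(beta) and beta[k] in nullable:
                    k += 1
                    out.add((A, alpha + beta[:k], beta[k:]))
        return out

    done = lambda Q: {A for A, _, beta in Q if not beta}
    xprod = lambda Q, syms: advance(Q, lambda B: B in syms)
    sprod = lambda Q, syms: advance(Q, lambda B: unit[B] & syms)

    def predict(R):
        starts = set().union(*(corner[B] for B in R)) if R else set()
        return {(A, rhs[:k], rhs[k:]) for A, rhs in P if A in starts
                for k in range(len(rhs) + 1)
                if all(Y in nullable for Y in rhs[:k])}

    n = len(w)
    t = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    t[0][0] = predict({S})
    for j in range(1, n + 1):
        for i in range(j):                          # scanner
            t[i][j] = xprod(t[i][j - 1], {w[j - 1]})
        for k in range(j - 1, -1, -1):              # completer
            t[k][j] |= sprod(t[k][k], done(t[k][j]))
            for i in range(k - 1, -1, -1):
                t[i][j] |= xprod(t[i][k], done(t[k][j]))
        t[j][j] = predict({b[0] for i in range(j)   # predictor
                           for _, _, b in t[i][j] if b})
    return any(A == S and not b for A, _, b in t[0][n])

# The first example of this section: S -> AB, A -> Λ, B -> CDC, C -> Λ, D -> a.
P = [('S', ('A', 'B')), ('A', ()), ('B', ('C', 'D', 'C')),
     ('C', ()), ('D', ('a',))]
print(recognize(P, {'S', 'A', 'B', 'C', 'D'}, 'S', 'a'))   # True
```

Tracing this sketch on w = a reproduces the matrix of Fig. 12.15.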
First, we claim that this algorithm is also correct.

Theorem 12.6.2 Algorithm 12.6.2 is correct in that A → α·β is placed in t_{i,j} if and only if S ⇒* a₁ ··· a_i Aγ for some γ ∈ V* and α ⇒* a_{i+1} ··· a_j.

Proof The present algorithm is carrying out the same computation as Algorithm 12.6.1. Therefore, the result follows from Theorem 12.6.1. □

Now let us consider how to implement the "vector" algorithm on a bit-vector machine. For each dotted rule A → α·β and each column j, let us define the bit vector whose ith component is

  1 if A → α·β is in t_{i,j}, and 0 otherwise.

Corresponding to each column t_j is a sequence of such vectors, one for each dotted rule.

Example We use our first example again. First, we index all the dotted rules:

  0. S → ·AB    1. S → A·B    2. S → AB·    3. A → ·
  4. B → ·CDC   5. B → C·DC   6. B → CD·C   7. B → CDC·
  8. C → ·      9. D → ·a    10. D → a·

Then t₁ (the matrix appears after Algorithm 12.6.1) is represented by

  t₁ ↔
  0 0 1 0 0 0 1 1 0 0 1
  0 0 0 0 0 0 0 0 1 0 0

The "one" in the bottom row means that C → · is in t_{1,1}. Note that the array has two rows and eleven columns. Although the number of columns is large, it depends, in general, on the grammar and not on the size of the input. The number of rows, on the other hand, is at most n + 1, where n is the length of the input. Consequently, each column is stored in a bit-vector "word" and we shall use as many words as we have dotted rules.

Next, we must be able to perform operations on this representation of columns.
Lemma 12.6.2 Let G = (V, Σ, P, S) be a context-free grammar, let w = a₁ ··· a_n, and let T = (t_{i,j}) be the matrix computed by Algorithm 12.6.2. If the columns are represented as above, the following operations may be performed in constant time which is independent of n.

a) Compute the union of t_j and t_k;
b) Compute a representation of t_{i,j} from t_j;
c) Test whether A → α·Bβ is in t_j;
d) For any sequence of vectors Q and any set R, compute Q * R. (This is to be interpreted for both the case R ⊆ V and R a set of dotted rules.)

Proof Carrying out these computations is a straightforward programming exercise and will be left as Problem 2 at the end of the section. □

We are now ready to analyze the time bound for Algorithm 12.6.2.

Theorem 12.6.3 Algorithm 12.6.2 requires time O(n²) when implemented on a bit-vector machine which uses words whose length is n. The algorithm may be implemented on a RAM and takes time O(n³).

Proof By Lemma 12.6.2, all the operations used in Algorithm 12.6.2 require only constant time. If it takes time c₀ to compute t₀, time c₁ to execute the scanner, and time c₂ to execute the completer, then the entire program requires

  c₀ + Σ_{j=1}^{n} (c₁ + Σ_{k=0}^{j−1} c₂) = c₀ + c₁n + c₂ Σ_{j=1}^{n} j = O(n²).

Since only bit vectors of length n are used, one can simulate the whole process on a RAM in time O(n³). □

Although the present algorithm seems quite different from the on-line version of the Cocke–Kasami–Younger algorithm, it turns out that the two algorithms are very closely related. The exact correspondence is brought out in the problems at the end of this section.

We now indicate how to use our new type of recognition matrix to parse as well as recognize. The algorithm that follows provides a means of obtaining, from the recognition matrix, a rightmost parse for a given string. The algorithm is similar to that used for the Cocke–Kasami–Younger recognition matrix.

Algorithm 12.6.3 Let G = (V, Σ, P, S) be a cycle-free, context-free grammar with the productions designated π₁, π₂, ..., π_p. Let w = a₁a₂ ··· a_n, n ≥ 0, be a string, where, for 1 ≤ i ≤ n, a_i ∈ Σ; and let T = (t_{i,j}) be the recognition matrix for w constructed by either Algorithm 12.6.1 or 12.6.2.
Define the recursive procedure parse(i, j, π_m), which generates a right parse for A ⇒ α ⇒* a_{i+1} ··· a_j, where π_m = A → α, as follows:

procedure parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m});
begin
  output(m); j_0 := j;
  for ℓ := p_m downto 1 do
    if A_ℓ ∈ Σ then j_0 := j_0 − 1
    else if k is the greatest integer, i ≤ k < j_0, such that for some α ∈ V*,
        A_ℓ → α· is in t_{k,j_0} and A → A_1 A_2 ··· A_{ℓ−1} · A_ℓ ··· A_{p_m} is in t_{i,k}
    then begin
        parse(k, j_0, A_ℓ → α);
        j_0 := k;
    end;
end;

Output a right parse for w as follows:

mainprogram:
  if for some α ∈ V*, S → α· is in t_{0,n}
  then parse(0, n, S → α)
  else output("error").

Example. Consider the second example of this section with grammar

1. E → E * E
2. E → E + E
3. E → a

and string w = a + a * a. Since E → E + E· is in t_{0,5}, parse(0, 5, π_2 = E → E + E) is called, yielding (2, parse(2, 5, π_1 = E → E * E), parse(0, 1, π_3 = E → a)). Then parse(2, 5, π_1 = E → E * E) generates (1, parse(4, 5, π_3 = E → a), parse(2, 3, π_3 = E → a)), which yields (1, 3, 3). parse(0, 1, π_3 = E → a) yields 3. Thus the output of the algorithm is (2, 1, 3, 3, 3).

If the grammar contains cycles, then Algorithm 12.6.3 may not terminate. However, if we modify the algorithm so that, in the case that some t_{k,j_0} contains two or more dotted rules A_ℓ → α· and A_ℓ → γ·, we choose the dotted rule that was entered first by Algorithm 12.6.1, it can be shown that Algorithm 12.6.3 always terminates. We leave these details to the reader.

Theorem 12.6.4 Let G = (V, Σ, P, S) be a cycle-free context-free grammar with productions designated π_1, π_2, ..., π_p. If Algorithm 12.6.3 is executed with the string w = a_1 a_2 ··· a_n, n ≥ 0, where a_i ∈ Σ for 1 ≤ i ≤ n, then the algorithm terminates. If w ∈ L(G), then the algorithm produces a right parse; otherwise it announces error. The algorithm takes time O(n²).

Proof. The proof follows that of Theorem 12.4.4. We outline the proof, giving only the details that differ significantly.

Claim 1. Whenever parse(i, j, π_m = A → α) is called, then j ≥ i and A → α· is in t_{i,j}.
Claim 2. In the sequence of calls on parse during the execution of Algorithm 12.6.3, no call parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m}) generates another call parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m}).

Proof. Given a call parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m}), each call parse(k, j_0, A_ℓ → α) directly generated has the property that i ≤ k < j_0 and j_0 ≤ j. It follows inductively that this property (namely, that successive values of j are nonincreasing and, for fixed values of j, successive values of i are nondecreasing) is true of the sequence of calls generated by parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m}).

Suppose the call parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m}) directly generates the call parse(i, j, π_r = B → B_1 B_2 ··· B_{p_r}). It follows that, for some ℓ, 1 ≤ ℓ ≤ p_m, A_ℓ = B, and that A → A_1 A_2 ··· A_{ℓ−1} · A_ℓ ··· A_{p_m} is in t_{i,i}. It follows from Claim 1 and Theorem 12.6.1 that

A ⇒ A_1 A_2 ··· A_{ℓ−1} A_ℓ ··· A_{p_m} ⇒* A_1 ··· A_{ℓ−1} a_{i+1} ··· a_j A_{ℓ+1} ··· A_{p_m} ⇒* a_{i+1} ··· a_j

and that A_1 A_2 ··· A_{ℓ−1} ⇒* Λ and A_{ℓ+1} ··· A_{p_m} ⇒* Λ. Hence A ⇒+ B. Inductively, if parse(i, j, π_m = A → A_1 A_2 ··· A_{p_m}) generates a call with the same arguments, then A ⇒+ A, contradicting the fact that G is cycle-free.

Claim 3. parse(i, j, π_m = A → α) terminates and produces a right parse of A ⇒ α ⇒* a_{i+1} ··· a_j.

Proof. Consider the sequence of calls on parse during the execution of Algorithm 12.6.3. It follows from the argument in Claim 2 that the sequence of values for j is nonincreasing and that, for fixed j, the sequence of values for i is nondecreasing. It follows from Claim 1 that 0 ≤ i ≤ j. Consequently, we conclude from Claim 2 that the number of calls is finite. Clearly, if parse does not make a call, then it terminates. Therefore the algorithm terminates. The production of a right parse then follows by a straightforward induction on the number of calls on parse.

Claim 4. A call of parse(i, j, π_m = A → α) takes at most c(j − i)² steps for some constant c if j > i, and c steps if j = i.

Proof. We first analyze the number of steps required by the call on parse, excluding the recursive calls. The for loop is executed at most q times, where q is the length of the longest righthand side of a production. For each execution of the for loop, if A_ℓ ∈ Σ, the number of steps is c_3 for some constant c_3. If A_ℓ ∈ N, the number of steps, excluding recursive calls, is at most c_4(j_0 − k + 1) for some constant c_4. Since i ≤ k and j_0 ≤ j for all values of k and j_0 during the call, the total number of steps, excluding recursive calls, is at most q c_5 (j − i + 1), where c_5 = max(c_3, c_4). Since each recursive call has arguments k, j_0 such that i ≤ k and j_0 ≤ j, each recursive call requires at most q c_5 (j − i + 1) steps, excluding its recursive calls. By Claim 3, parse(i, j, π_m = A → α) produces a right parse of A ⇒ α ⇒* a_{i+1} ··· a_j. Let p be the length of that derivation.
By Theorem 12.2.1, if j = i then p ≤ c_0, and if j > i then p ≤ c_1(j − i) − c_2, where c_0, c_1, and c_2 are constants. Since a production index is printed when and only when parse is called, this call on parse generates at most p recursive calls. Combining results, the total number of steps for the call on parse is at most c_0 q c_5 if j = i, and

q c_5 (j − i + 1)(c_1(j − i) − c_2)  if j > i.  (12.6.30)

Simplifying Eq. (12.6.30) yields

q c_5 (j − i + 1)(c_1(j − i) − c_2) = q c_5 (c_1(j − i)² + (c_1 − c_2)(j − i) − c_2)
  ≤ q c_5 (c_1(j − i)² + c_1(j − i))
  ≤ 2 q c_5 c_1 (j − i)².

Letting c = max{c_0 q c_5, 2 q c_5 c_1} completes the claim. □

PROBLEMS

1 Give efficient algorithms for computing ×, ∗, and predict. Analyze the complexity of these algorithms.

2 Write the programs needed for the proof of Lemma 12.6.2.

3 Prove the following result.
Theorem Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form. Let w = a_1 ··· a_n, n ≥ 0, with a_k ∈ Σ for 1 ≤ k ≤ n. Let T = (t_{i,j}) be the matrix for G computed by Algorithm 12.6.1, and let T' = (t'_{i,j}) be the recognition matrix computed by the Cocke–Kasami–Younger algorithm. Then

{A | A → BC· or A → a· is in t_{i,j}} ⊆ t'_{i,j}

for each 0 ≤ i < j ≤ n. In general, the containment is proper.

4 Let G = (V, Σ, P, S) be any context-free grammar. We modify Algorithm 12.6.1 by "weakening" the predictor and changing that step of the program to be t_{j,j} := predict(N). Show that the new algorithm still works and is a correct recognizer. [Hint. Prove that A → α·β is in t_{i,j} if and only if α ⇒* a_{i+1} ··· a_j.]

5* Consider the algorithm given in the previous problem. Show that the diagonal may be eliminated from the matrix by a suitable redefinition of ∗ and ×. What is the relationship between the resulting algorithm and the Cocke–Kasami–Younger algorithm?

6 Modify Algorithm 12.6.2 by noting that, as n grows, it is efficient to precompute the product of certain columns and q-tuples of sets. Show that such a modification can be made and that the resulting algorithm works.
7 Continuing with Problem 6, choose q so that the algorithm can run in time

O(n²/log n) on a bit-vector machine,
O(n³/log n) on a RAM.

8 Let us consider how to recognize a set defined by a 2-pda A = (Q, Σ, Γ, δ, q_0, Z_0, F). There is no loss of generality in assuming that A satisfies the conditions of Problem 4 of Section 5.4. Form a square matrix T such that (q, Z, q') is in t_{i,j} if and only if

(q, ¢w$, i, Z) ⊢* (q', ¢w$, j, Λ).

Note that this matrix is not upper-triangular, in general. Define a suitable product operation, and show that the problem of recognizing whether or not a string is in a 2-pda language L may be reduced to taking the transitive closure of T.

9 Show that T⁺ of Problem 8 may be computed in time proportional to n³ and space proportional to n² on a RAM.

10 How fast can the operations of Problem 9 be carried out on a Turing machine?

11 What is the computational complexity of recognizing a set accepted by a 2-dpda?

12.7 GENERAL BOUNDS ON TIME AND SPACE FOR LANGUAGE RECOGNITION

We have considered a number of recognition algorithms, and, for each of these, we have computed the time and space requirements. In this section, we shall summarize the least upper bounds and greatest lower bounds now known for the time and memory used in context-free language recognition. These algorithms are sometimes extreme, in that time may be traded for space.

If we let R(n) be the time required to recognize (or parse) a string of length n, then we have the following result:

Theorem 12.7.1 n ≤ R(n) ≤ c·BM(n).

Proof. The upper bound follows from Valiant's algorithm. The lower bound is obvious, since we must read the entire string before giving an answer. □

Let us define R'(n) to be the time required to recognize a string of length n by an on-line Turing machine. In this case, we must formulate the problem as in Section 9.4, to prevent the machine from copying the input to a work tape and then proceeding as in an off-line computation.

Theorem 12.7.2 For on-line computation,

n²/log n ≤ R'(n) ≤ n³/log n ≤ n³.
Proof. The upper bound of n³ follows from Algorithm 12.6.1 or Algorithm 12.4.2. The n³/log n result can be obtained from Problem 7 at the end of Section 12.6. The lower bound is too complicated to be included here, but the idea can be mentioned briefly. Let Σ = {0, 1}; c, s ∉ Σ; and

L = {xyzcsu_1s ··· su_ts | t ≥ 1; x, y, z ∈ Σ*; lg(y) > 0; u_j ∈ Σ* for 1 ≤ j ≤ t; and u_i = y^T for some i, 1 ≤ i ≤ t}.

It is interesting to note that L is a linear context-free language. The idea of the proof is to find another set L' (which is not context-free) that has essentially the same recognition time as L. It is possible to show that L' needs at least n²/log n time for on-line recognition. □

Corollary There is a single context-free language which requires at least time n²/log n for on-line recognition and can be recognized on-line in time O(n²).

Let us now concentrate on the space requirements for recognition. Let S(n) be the space required to recognize a string of length n.

Theorem 12.7.3 log n ≤ S(n).

Proof. There is no loss of generality in dealing with single-tape Turing machines, because of Theorem 9.4.1. Suppose we have an L(n)-tape-bounded, deterministic Turing machine with a read-only input and a single work tape. The instantaneous descriptions (IDs, for short) of such a machine are of the form (q, i, α), where q is a state, α is the portion of the work tape scanned so far, and i is the cell of the work tape being scanned (so 1 ≤ i ≤ lg(α)). The maximum number of IDs is

r(n) = s L(n) t^{L(n)},  (12.7.1)

where s is the number of states and t > 1 is the cardinality of the work alphabet. Let C_1, ..., C_{r(n)} be an enumeration of all IDs. For each n ≥ 0 and each v ∈ Σ* with lg(v) ≤ n, define a transition matrix T_n(v) to be an r(n) × r(n) matrix whose entries are from the set {00, 01, 10, 11}. The entry t_{i,j} is defined as follows:

1. The first digit of t_{i,j} is 1 if and only if the Turing machine, started in ID C_i while scanning the first cell of v, can reach ID C_j without moving the input head left out of v.
2. The second digit of t_{i,j} is 1 if and only if, when started in ID C_i while scanning the first cell of v, the device can move the head left out of v and can be in ID C_j the first time it does so.

Thus T_n(v) describes the behavior of the device on v when v is a suffix of the input. Then we have the following useful lemma:

Lemma 12.7.1 If lg(uv) ≤ n for some uv ∈ L, where L is the language recognized by a single-tape Turing machine M, and if T_n(v) = T_n(v') for some v' ∈ Σ*, then uv' ∈ L.
Proof. The argument is a straightforward "crossing sequence" proof and is omitted. □

Now consider the language L = {w c w^T | w ∈ {0, 1}*}. We shall shortly show that any on-line Turing machine requires at least log n tape to recognize L. Let n be a fixed odd integer, and consider all strings in S = {0, 1}^{(n−1)/2}.

Claim 1. No two strings in S have the same transition matrix.

Proof. Suppose w, w' ∈ S with w ≠ w' and T_n(w) = T_n(w'). Then w c w^T ∈ L but, by Lemma 12.7.1, we would also have w c (w')^T ∈ L, which is a contradiction.

Since all such strings have distinct transition matrices, and there are 2^{(n−1)/2} such strings but only 4^{(r(n))²} transition matrices, we must have

4^{(r(n))²} ≥ 2^{(n−1)/2}.  (12.7.2)

Taking logs of Eq. (12.7.2) yields

2(r(n))² ≥ (n − 1)/2,

or, using Eq. (12.7.1),

4(r(n))² = 4(s L(n) t^{L(n)})² ≥ n − 1.  (12.7.3)

If n > 1, then n − 1 ≥ n/2. Also, for all L(n), t^{L(n)} ≥ L(n), since t > 1; so Eq. (12.7.3) becomes

8 s² t^{4L(n)} ≥ n.  (12.7.4)

Taking logs of Eq. (12.7.4) yields

log 8s² + 4 L(n) log t ≥ log n.

Thus

L(n) ≥ (log n − log 8s²) / (4 log t).  (12.7.5)

If n ≥ (8s²)², then log n ≥ 2 log 8s², so Eq. (12.7.5) becomes

L(n) ≥ log n / (8 log t)  for n ≥ (8s²)².

Thus, for infinitely many n (all odd n ≥ (8s²)²), we have L(n) ≥ c log n, where c = 1/(8 log t). Therefore, S(n) ≥ log n.

We note in passing that L can be recognized in log n space.
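The remark above can be made concrete; Claim 2 on the next page states it formally. The following is a minimal sketch in Python (ours, not the text's; all names in it are our own) of a recognizer for L = {w c w^T | w ∈ {0, 1}*} that treats its argument as a read-only input tape and writes nothing but two position counters, i.e., O(log n) bits of work storage. It illustrates the idea only; it is not the Turing-machine construction itself.

```python
def accepts(tape: str) -> bool:
    """Recognize L = { w c w^T : w in {0,1}* } with two counters.

    The string plays the role of a read-only input tape; the only
    writable storage is the pair of indices i and j, each of which
    fits in O(log n) bits.
    """
    # Locate the marker c by scanning; reject if absent or repeated.
    if tape.count("c") != 1:
        return False
    m = tape.index("c")                    # position of the marker
    if any(ch not in "01" for k, ch in enumerate(tape) if k != m):
        return False
    i, j = 0, len(tape) - 1                # the two O(log n)-bit counters
    while i < m:                           # compare w against the reversed suffix
        if j <= m or tape[i] != tape[j]:
            return False
        i, j = i + 1, j - 1
    return j == m                          # both halves exhausted together

assert accepts("01c10") and accepts("c") and not accepts("01c01")
```

The scans for the marker cost no writable storage beyond the counters; on a Turing machine they correspond to head sweeps over the read-only input, which is why the log n space bound is attainable.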
Claim 2. L = {w c w^T | w ∈ {0, 1}*} can be recognized in space log n.

This completes the proof of Theorem 12.7.3. □

Now we turn to an upper bound for S(n). The next lemma shows a suitable way to "factor" a derivation.

Lemma 12.7.2 Let G = (V, Σ, P, S) be a context-free grammar in Chomsky normal form. Let A ∈ N and α ∈ V³V*. If A ⇒* α, then there exist β, γ, δ in V* and B ∈ N such that

A ⇒* βBδ ⇒* βγδ = α,

where ⌊n/3⌋ ≤ lg(γ) ≤ 2⌊n/3⌋ and n = lg(α).

Proof. Suppose we have a derivation tree of A ⇒* α. Each node of the tree is to be given a weight by the following procedure:

1 Every leaf has weight 1.
2 Every other node has a weight that is the sum of the weights of its immediate descendants.

Figure 12.23 illustrates a sample derivation and the corresponding labeling of the tree.

Example. G is the grammar with productions

S → AS | b
A → SA | a

Note that the weight of the root is always n. Now find a heaviest path s in the tree by starting at the root and taking, at each node, the branch to the heaviest subtree until a leaf is reached. Such a path is marked in Fig. 12.23. A path like s must start at the root with a weight of n ≥ 3 (recall α ∈ V³V*) and end at a leaf with weight 1.

[Fig. 12.23 shows a weighted derivation tree for this grammar with the heaviest path s marked; the circled node is the one promised by the lemma.]
Now trace up path s until the first node is encountered with weight > ⌊n/3⌋. This node cannot be a leaf, since a leaf has weight 1 ≤ ⌊n/3⌋. Such a node must exist, since the root has weight n > ⌊n/3⌋. As we progress up the path s, the weight at successive nodes will (at most) double, since we have a Chomsky normal-form grammar and s passes through the heaviest subtrees. Hence, this first node of weight > ⌊n/3⌋ follows a node of weight ≤ ⌊n/3⌋, and so it itself has weight ≤ 2⌊n/3⌋. Let B be the label of this node. Its weight is lg(γ), where B ⇒* γ, and we have just shown that

⌊n/3⌋ ≤ lg(γ) ≤ 2⌊n/3⌋. □

Now we are in a position to prove a result that gives an upper bound on the space complexity of all context-free languages.

Theorem 12.7.4 Every context-free language can be recognized by a deterministic Turing machine that uses space S(n) = (log n)².

Proof. Let L = L(G), where G = (V, Σ, P, S) is some context-free grammar in Chomsky normal form. Let N = {A_1, ..., A_m}, with S = A_1. The construction of the Turing machine is quite complex, so an intuitive version is presented first.

It is necessary to determine whether A ⇒* α, where A ∈ N and α ∈ V*. If lg(α) ≤ 3 (for example), we can determine whether A ⇒* α by precomputing all such information and storing it in the finite-state control of the Turing machine. If lg(α) = k > 3 and A ⇒* α, then there exist B ∈ N and β, γ, δ ∈ V* such that

A ⇒* βBδ ⇒* βγδ = α,

where ⌊k/3⌋ ≤ lg(γ) ≤ 2⌊k/3⌋.

The determination of A ⇒* α can be reduced to determining whether B ⇒* γ and A ⇒* βBδ. Thus, to determine whether A ⇒* α, the Turing machine will, for each B ∈ N and each possible factorization of α = βγδ with ⌊k/3⌋ ≤ lg(γ) ≤ 2⌊k/3⌋, attempt the subproblem B ⇒* γ. If the subproblem is solved successfully, then the algorithm will attempt the complementary subproblem A ⇒* βBδ. Since lg(γ) > ⌊k/3⌋, it follows that

lg(γ) ≥ (k + 1)/3,
and hence

lg(βBδ) = lg(βγδ) − lg(γ) + 1 ≤ k − (k + 1)/3 + 1 = (2k + 2)/3.

Since we also know that

lg(γ) ≤ 2⌊k/3⌋ ≤ 2k/3 ≤ (2k + 2)/3,

it follows that, if the original goal is of size k = lg(α), then all possible subgoals are of size ≤ (2k + 2)/3. For k ≥ 3,

2 ≤ (2/3)k,

so

(2k + 2)/3 ≤ (2k + (2/3)k)/3 = 8k/9.

Thus we have proved:

Claim 1. For k ≥ 3, any goal of size k has all of its subgoals of size at most 8k/9.

Next, we compute the number of levels of subgoals needed to break the problem down to one of size ≤ 3.

Claim 2. The number p of levels of subgoals needed to bring the problem down to size ≤ 3 satisfies p ≤ c log₂ n, where c = 1/(log₂(9/8)).

Proof of Claim 2. After p levels, the goal of size k has been broken into subgoals of size ≤ (8/9)^p k. If we set this ≤ 3 and take logs, we get

p ≤ log₂ k / log₂(9/8) = c log₂ k.

To solve the problem at hand for an input of length n will require O(log n) space to record each current subgoal, and there will be O(log n) such subgoals, leading to a total bound of O((log n)²).

The work tape is divided into blocks of ⌊log n⌋ + 1 cells. Each such block is long enough to contain a binary number from 0 to n. Imagine the tape split horizontally into channels or tracks. The top track of block i contains an integer denoted by begin(i). The second track contains an integer denoted by end(i). The third track contains an integer size(i). The fourth track contains a variable
called variable(i) (right justified) and a boolean variable (or bit) sat(i) (for "satisfied"). There are some other tracks to handle arithmetic and comparison operations as needed.

[Fig. 12.24 illustrates a stack of goals: Goal 1, unsatisfied; Goal 2, unsatisfied; Goal 3, satisfied; Goal 4, still open.]

The meaning of the i-th block can now be explained. Its purpose is to record information as to whether

variable(i) ⇒* α_i ⇒* w_{begin(i)} ··· w_{end(i)},

where size(i) = lg(α_i) and w_k ∈ Σ for begin(i) ≤ k ≤ end(i). Also, sat(i) is true if variable(i) ⇒* α_i is known to be true, and false otherwise.

The approach we shall use is to break up a possible derivation of S ⇒* w_1 ··· w_n into subderivations of nearly equal parts. If sat(i) = false, then all succeeding blocks are partitions of the second half of the derivation, i.e., the inner subtree. If sat(i) = true, then subsequent blocks deal with the outer subtree. This is exemplified in Fig. 12.24.

The algorithm to be given is quite complicated. We wish to determine whether S ⇒* w, and we generalize that to the consideration of whether A ⇒* w'. If lg(w') ≤ 3, then table lookup is used; otherwise, we use recursion.

The algorithm starts with the goal of determining whether S ⇒* α_1 = w. An elaborate backtracking is begun, and w is factored into β_1 γ_1 δ_1, where

⌊n/3⌋ ≤ lg(γ_1) ≤ 2⌊n/3⌋;

and a variable B_1 is stored into block 1 to indicate that it is possible that B_1 ⇒* γ_1. Then γ_1 is stored by writing pointers to the first and last symbols of the input. (Since these input positions are between 1 and n, log n space is sufficient.) The determination of B_1 ⇒* γ_1 will be done recursively.

Now the process is continued on the subgoal in a similar fashion. There are several points which are different, however. Subsequent α_i's may have several variables. For each α_i, we record the first and last indices of the subword of the input derived from α_i.
We have also been vague about how the recursion will take place, since that requires space also. The space for that is all the unused blocks to the right of the present one. A simple calculation shows that exactly the correct number of them exists at any time. When it has been determined that a particular goal has failed, the next case is systematically generated by our backtracking mechanism.

We are now ready to give a detailed construction of the device. The device is described by an ALGOL-like program, but all the operations can be carried out on a Turing machine within the given space bounds. There will be two variables s and s' which appear in the program. Assume that these are separate procedures defined by

s = size(i) − Σ_{k=i+1}^{j} (size(k) − 1)

and

s' = size(i) − Σ_{k=i+1}^{j−1} (size(k) − 1).

Every time that s and s' appear, imagine that they are recomputed. In understanding the program, i is the last block for which sat(i) = false and j is the last block; j may decrease, but it is always the case that 1 ≤ i ≤ j.

begin
step_0:
  i := 1; j := 1;
  begin(1) := 1; end(1) := n;
  comment n may be computed on a first pass through the input;
  size(1) := n;
  variable(1) := A_1; comment A_1 = S;
  sat(1) := false;
  if n ≤ 3 and variable(1) ⇒* a_1 ··· a_n then accept;
  comment The only goal here is to determine whether S ⇒* α_1 = w;
step_1:
  comment We are working on goal i and have finished subgoals i+1, ..., j. Each successfully completed subgoal k, i+1 ≤ k ≤ j, has established that size(k) symbols in α_i can be replaced by variable(k). Also, s is the length of α_i after these replacements have been made;
  if s ≤ 3 then goto step_6;
  comment We can test whether goal i can be satisfied directly. Otherwise, a new subgoal must be built;
  j := j + 1;
  begin(j) := begin(i);
  end(j) := begin(i) − 1;
  size(j) := 0;
  variable(j) := A_m;
step_2:
  comment advance the end pointer of subgoal j;
  repeat
    if end(j) = end(i) then goto step_3;
    end(j) := end(j) + 1;
    if (end(j) = begin(k)) and sat(k) for some† k, j − 1 ≥ k > 1, then end(j) := end(k);
    size(j) := size(j) + 1;
  until size(j) ≥ ⌊s'/3⌋;
  if size(j) > 2⌊s'/3⌋ then goto step_3;
  sat(j) := false;
  goto step_1;
step_3:
  comment advance the begin pointer of subgoal j;
  if begin(j) = begin(k) and sat(k) for some† k, j − 1 ≥ k > 1, then begin(j) := end(k);
  if begin(j) ≠ end(i) then
  begin
    end(j) := begin(j);
    begin(j) := end(j) + 1;
    size(j) := 0;
    goto step_2
  end;
step_4:
  comment advance the variable stored in block j and continue backtracking;
  if variable(j) = A_k and (k > 1) then
  begin
    variable(j) := A_{k−1};
    begin(j) := begin(i);
    end(j) := begin(i) − 1;
    size(j) := 0;
    goto step_2
  end;
step_5:
  comment back up on block j;
  if (j = 2) and (i = 1) then quit;
  comment the above condition means that w is not in L;
  repeat j := j − 1 until not sat(j);
  comment the current goal has failed and we backtrack to a previous goal. Since not sat(1), this will always work;
  goto step_2;

† Count downwards so the largest such k is used.
step_6:
  comment The subgoal is now so small that it can be done with a direct test;
  r := 0;
  q := begin(i);
  while q ≤ end(i) do
  begin
    if q = begin(k) for some‡ k, j ≥ k > 1, and sat(k) then
    begin
      a_{r+1} := variable(k);
      q := end(k)
    end
    else a_{r+1} := w_q;
    r := r + 1;
    q := q + 1;
  end;
  if not (variable(i) ⇒* a_1 ··· a_r) then
  begin
    i := i − 1;
    goto step_2
  end;
  comment If the test fails, then we must continue backtracking. Whenever we get here, r = s ≤ 3 and the predicate may be checked in the finite-state control;
  if i = 1 then accept
  else
  begin
    j := i;
    repeat i := i − 1 until not sat(i);
    sat(j) := true;
    goto step_1;
  end
end.

The tasks of showing that the program works and that only log² n storage is used on a Turing machine are left to the reader. □

Combining previous results leads to the following proposition.

Theorem 12.7.5 log n ≤ S(n) ≤ log²(n).

The situation for on-line tape bounds is quite simple.

Theorem 12.7.6 In the on-line case, S'(n) = n.

Proof. Note that we may copy the input onto a work tape and then use the upper bound of Theorem 12.7.4 for recognition. Thus, S'(n) ≤ n.

‡ Count downwards so the largest such k is used.
To handle the lower bound, again let L = {w c w^T | w ∈ {0, 1}*}. There are 2^n words w of length n such that w c w^T ∈ L. The IDs of the Turing machine when scanning the c must be different; otherwise we would accept both w c w^T and w c (w')^T with w ≠ w'. But then the number of different IDs satisfies

r(n) = s L(n) t^{L(n)} ≥ 2^n.

By an analysis similar to that in the proof of Theorem 12.7.3, we find that L(n) ≥ n. Thus S'(n) ≥ n, and therefore S'(n) = n. □

PROBLEMS

1 Give the best time bound you can for the recognition of any 2-pda language on a deterministic Turing machine with a single work tape and a read-only input tape.

2 Show that there exist nonregular context-free languages that can be accepted by a tape-bounded Turing machine which uses space S(n) = log log(n).

3 Show that, if a language L is recognizable using space S(n), and if

inf_{n→∞} S(n)/log log(n) = 0,

then L is a regular set.

4 Show that it is recursively undecidable how much tape is required to recognize a nonregular context-free language. [Hint. Find a nonregular context-free language L_0 such that L_0 is recognizable in space log n if L_1 ∩ L_2 ≠ ∅, and in space log log(n) if L_1 ∩ L_2 = ∅, where L_1 and L_2 are variants of L(x) and L(y) of Section 8.3.]

5* Show that, for every nonregular deterministic context-free language L, recognizing L or L̄ = Σ* − L requires space log(n) on a nondeterministic Turing machine.

6 Show that the language L defined in the proof of Theorem 12.7.6 can be recognized in linear time on a RAM.

7 Let R(n) be the time required to test whether a string w ∈ Σ^n is in L, where L is a context-free language. Let CBM(n) be the time required to check whether AB = C, where A, B, and C are n × n boolean matrices. Show that

R(n) ≥ √(CBM(n)).

[Hint. Design a context-free language L̄(A, B, C) whose complement consists of strings which represent that A, B, and C are n × n matrices and AB = C. Note that the time to recognize L̄(A, B, C) and its complement is the same. Moreover, a reasonable encoding of L(A, B, C) is of order n², so that CBM(n) ≤ R(n²) ≤ (R(n))², and hence R(n) ≥ √(CBM(n)). Describe L̄(A, B, C), and show that it is a context-free language.]
8 Based on Problem 7, devise a new context-free language L(A, B) which depends only on two n × n boolean matrices A and B. Show how an on-line recognizer for L(A, B) may be used to compute AB = C.

9 Using Problem 8, it follows that the time required by an on-line RAM to recognize a context-free language is

R'(n) ≥ √(BM'(n)),

where BM'(n) is the time required to compute the boolean product of two matrices "on-line" (i.e., the j-th column of the product is computed before the (j + 1)st column of B is read).

10 Write a recursive program to implement the algorithm given in the proof of Theorem 12.7.4.

12.8 HISTORICAL SURVEY

Theorem 12.2.1 is well known in one form or another. Our particularly sharp proof is due to A. Yehudai. Greibach [1973] found a hardest context-free language first. Problem 6 of Section 12.2 is from the work of Sudborough [1976]. The particular algorithm for multiplying 2 × 2 matrices used in Lemma 12.3.1 is credited to Winograd by Fischer and Probert [1974]. Strassen [1969] was the first to find an n^{2.81} algorithm for the multiplication of n × n matrices. Problem 6 of Section 12.3 is from Warshall [1962].

The Cocke–Kasami–Younger algorithm may be found in many places, such as Kasami [1965] and Younger [1967]. Our version (and much of this chapter) derives from Graham and Harrison [1976]. Problem 8 of Section 12.4 is due to Taniguchi and Kasami [1969]. The hint given follows an unpublished construction of W.L. Ruzzo. Section 12.5 derives from Valiant [1975]. The detailed proof of the key lemma is due to W.L. Ruzzo.

Section 12.6 presents a new algorithm reported in Graham, Harrison, and Ruzzo [1976]. This algorithm in its simplest form is an adaptation of Earley's algorithm [1970]. The connection between this algorithm and the Cocke–Kasami–Younger algorithm is deeper than is hinted at in the problems of Section 12.6. Problem 8 is due to W.L. Ruzzo again.

The lower bound of Theorem 12.7.2 is due to Gallaire [1965]. Theorem 12.7.4 is due to Lewis, Stearns, and Hartmanis [1965]. Our proof is an adaptation of a "program" written by Ronald Hunsinger. Problem 5 of Section 12.7 is due to Alt and Mehlhorn [1975]. Problem 9 is due to Ruzzo.
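In connection with Problem 10 of Section 12.7, the recursion underlying Theorem 12.7.4 can be sketched directly. The following Python program is our own illustration (the language, the grammar encoding, and all names are ours, not the text's): it tests A ⇒* α for a Chomsky-normal-form grammar by trying every factorization α = βγδ with ⌊k/3⌋ ≤ lg(γ) ≤ 2⌊k/3⌋ and every B ∈ N, recursing on B ⇒* γ and on the complementary goal A ⇒* βBδ. The recursion depth is O(log n), mirroring the (log n)² bound, though a memoized Python program does not literally achieve that space bound.

```python
from functools import lru_cache

# A Chomsky-normal-form grammar: every production is A -> BC or A -> a.
# Encoding (ours): each nonterminal maps to its right-hand sides, which are
# strings of length 2 (two nonterminals) or length 1 (a terminal).
productions = {
    "S": ["AS", "b"],    # the example grammar used with Fig. 12.23
    "A": ["SA", "a"],
}
nonterminals = set(productions)

@lru_cache(maxsize=None)
def derives(A: str, alpha: str) -> bool:
    """Decide whether A =>* alpha, following the factoring of Lemma 12.7.2."""
    k = len(alpha)
    if k <= 3:
        # Small goals: in CNF sentential forms never shrink, so a breadth-first
        # closure over forms of length <= 3 terminates and settles the question.
        seen, frontier = {A}, {A}
        while frontier:
            nxt = set()
            for form in frontier:
                for pos, symbol in enumerate(form):
                    for rhs in productions.get(symbol, ()):
                        new = form[:pos] + rhs + form[pos + 1:]
                        if len(new) <= 3 and new not in seen:
                            seen.add(new)
                            nxt.add(new)
            frontier = nxt
        return alpha in seen
    # Large goals: try every middle piece gamma and every B in N.
    lo, hi = k // 3, 2 * (k // 3)
    for glen in range(lo, hi + 1):
        for start in range(k - glen + 1):
            beta = alpha[:start]
            gamma = alpha[start:start + glen]
            delta = alpha[start + glen:]
            for B in nonterminals:
                if B == gamma:
                    continue     # skip the trivial zero-step subgoal
                if derives(B, gamma) and derives(A, beta + B + delta):
                    return True
    return False

print(derives("S", "aaab"))   # True: S => AS => AAS => AAAS =>* aaab
```

The guard against B = γ is needed because replacing a single nonterminal by itself would regenerate the same goal; with it, every chain of recursive calls either shortens the goal or reduces its number of terminal symbols, so the recursion terminates.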
thirteen
LR(k) Grammars and Languages

13.1 INTRODUCTION

In Chapter 12, we learned a variety of ways to parse general context-free languages. Such methods are not usually used in compiler construction because they require at best superlinear time. In practice, one wants a linear-time method and, moreover, one cannot tolerate large constants. An important class of grammars is known, called LR(k) grammars, which have the property that they can be parsed in linear time. Also, the grammar class is large enough to allow us to accommodate "natural grammars" for programming languages. Any deterministic language has an LR(k) grammar, so the family of languages is nontrivial and has desirable practical properties such as unambiguity.

In the next section, the basic definition of an LR(k) grammar is presented. It is shown that every strict deterministic language is LR(0) and that every deterministic language is LR(1). In Section 13.3, an important characterization theorem for LR(0) languages is proved. Section 13.4 begins the treatment of parsing by defining table-driven LR-style parsers. Section 13.5 is a highly technical treatment of the generation of sets of LR(k) items. In Section 13.6, consistency of a set of items is introduced. It is then possible to construct LR(k) parsers and to prove exactly how they operate. Section 13.8 concludes by showing that Δ_0 = LR.

13.2 BASIC DEFINITIONS AND PROPERTIES OF LR(k) GRAMMARS

We are about to introduce LR(k) grammars. In order to study this family, it is convenient to work with "handles" of canonical (or rightmost) sentential forms.
Definition. Let G = (V, Σ, P, S) be a context-free grammar and let γ ∈ V*. (Here and below, ⇒_rm denotes a step of a rightmost, i.e., canonical, derivation.) A handle of γ is an ordered pair (p, i), where p ∈ P and i ≥ 0, such that there exist A ∈ N; α, β ∈ V*; and w ∈ Σ* such that:

i) S ⇒*_rm αAw ⇒_rm αβw = γ;
ii) p is A → β;
iii) i = lg(αβ).

Examples. Let G_1 be the grammar

S → aB | ab
B → b

The string γ = ab has handle (S → ab, 2) by the derivation S ⇒_rm ab. On the other hand, γ has handle (B → b, 2) by the derivation S ⇒_rm aB ⇒_rm ab.

Consider grammar G_2 shown below:

S → Aa | aA
A → a

The canonical sentential form γ = aa has handles (A → a, 1) and (A → a, 2), as the reader may verify.

The idea behind an LR(k) grammar is that a parsing device is reading some canonical sentential form. We wish the device to work by finding the handle (A → β, i) and replacing β by A. This gives a new canonical sentential form, and the action is repeated. In order for this to work, our grammar must have the property that the handle is uniquely determined by what we have read up to now and by the next k characters ahead. The formal definition that follows says just this.

Definition 13.2.1. Let k ≥ 0 and let G = (V, Σ, P, S) be a reduced context-free grammar such that S ⇒+_rm S is impossible in G. G is LR(k) if, for each w, w', x ∈ Σ*; γ, α, α', β, β' ∈ V*; A, A' ∈ N: if

i) S ⇒*_rm αAw ⇒_rm αβw = γw (that is, γw has handle (A → β, lg(αβ))),
ii) S ⇒*_rm α'A'x ⇒_rm α'β'x = γw' (that is, γw' has handle (A' → β', lg(α'β'))), and
iii)† (k)w = (k)w',

then

iv) (A → β, lg(αβ)) = (A' → β', lg(α'β')).

Example. The grammar G_3 shown below is LR(0).

S → aAc
A → Abb | b

† Recall that (k)w denotes the first k letters of w if lg(w) ≥ k, and all of w otherwise.
At this stage, we have not yet learned efficient tests for the LR(0) property. We must enumerate all canonical sentential forms and verify that they have unique handles. The canonical sentential forms are listed below along with the corresponding handles.

Canonical sentential forms    Handles
aAc                           (S → aAc, 3)
aAb^{2i}c                     (A → Abb, 4)
ab^{2i+1}c                    (A → b, 2)

Note that L(G_3) = {ab^{2i+1}c | i ≥ 0}.

Now let us consider a new grammar G_4, which generates the same language. G_4 consists of:

S → aAc
A → bAb | b

Consider the canonical sentential forms abc and abbbc. The first form has handle (A → b, 2), while the second one has handle (A → b, 3); yet, when we look back, we see γ = ab and the same lookahead string, namely Λ. Therefore, G_4 is not LR(0).

The conclusion in Definition 13.2.1, that is, (iv), has several implications.

1 By the definition of equality of ordered pairs, we have A = A', β = β', and lg(αβ) = lg(α'β').
2 From (i), αβ = γ. Since lg(α'β') = lg(αβ) = lg(γ) and α'β'x = γw', the string α'β' is the prefix of γw' of length lg(γ); hence α'β' = γ. Thus γ = αβ = α'β'.
3 Since β = β', from (2) we have α = α'.
4 α'β'x = γw' implies α'β'x = α'β'w', which implies x = w'.

Note that, if G is LR(k), then G is LR(k') for all k' ≥ k.

One of the properties that we wish a grammatical class to possess, in order that it constitute a useful class for parsing, is unambiguity. We show that the LR(k) grammars are unambiguous. The first lemma gives an equivalent formulation for equality of handles. This lemma does not require that the grammar be LR(k).

Lemma 13.2.1 Let G = (V, Σ, P, S) be a reduced context-free grammar. Assume that, for α, α', β, β' ∈ V*; w, w' ∈ Σ*; A, A' ∈ N,

S ⇒*_rm αAw ⇒_rm αβw  and  S ⇒*_rm α'A'w' ⇒_rm α'β'w' = αβw.

Then αAw = α'A'w' if and only if

(A → β, lg(αβ)) = (A' → β', lg(α'β')).

Proof. Assume that (A → β, lg(αβ)) = (A' → β', lg(α'β')).
Since α'β'w' = αβw and lg(αβ) = lg(α'β'), we have αβ = α'β' and w = w'. Since β = β', we have α = α'. Also, A = A'. Thus αAw = α'A'w'.

Conversely, assume that αAw = α'A'w'. By our quantification, since A and A' are the rightmost nonterminals in αAw and α'A'w', respectively, we have A = A', α = α', and w = w'. Since αβw = α'β'w', we have β = β'. Thus (A → β, lg(αβ)) = (A' → β', lg(α'β')). □

The second lemma characterizes unambiguous grammars in terms of handles.

Lemma 13.2.2 Let G = (V, Σ, P, S) be a reduced context-free grammar. Then G is unambiguous if and only if every canonical sentential form has exactly one handle, except S, which has none.

Proof. Assume that G is unambiguous.

Claim 1. S has no handle.

Proof. Assume, for the sake of contradiction, that S has some handle. Then S ⇒+_rm S. Since G is reduced, S ⇒*_rm w for some w ∈ Σ*, and so we have the two distinct derivations

S ⇒*_rm w  and  S ⇒+_rm S ⇒*_rm w.

Thus G is clearly ambiguous and we have a contradiction. Therefore S has no handle.

Claim 2. Every canonical sentential form except S has at least one handle.

Proof. This follows directly from the definition of a canonical sentential form.

Claim 3. Every canonical sentential form except S has at most one handle.

Proof. Assume, for the sake of contradiction, that this is not true. Then some canonical sentential form φ has two distinct handles; that is, for α, α', β, β' ∈ V*; w, w', x ∈ Σ*; π, π' ∈ P; ρ, ρ', ρ'' ∈ P*; A, A' ∈ N,

S ⇒*_rm αAw ⇒_rm αβw = φ  and  S ⇒*_rm α'A'w' ⇒_rm α'β'w' = φ,

where (A → β, lg(αβ)) ≠ (A' → β', lg(α'β')), and φ ⇒*_rm x for some x ∈ Σ* since G is reduced. Let the two canonical derivations of x so obtained be given by the production sequences ρ'πρ and ρ''π'ρ, where π is A → β and π' is A' → β'. Since G is unambiguous, ρ'πρ = ρ''π'ρ. It follows that ρ' = ρ'' and π = π'; thus αAw = α'A'w'. Therefore, by Lemma 13.2.1, (A → β, lg(αβ)) = (A' → β', lg(α'β')), giving us a contradiction. Thus every canonical sentential form except S has at most one handle.

Conversely, assume that every canonical sentential form has exactly one handle except S, which has none. Assume, for the sake of contradiction, that G is ambiguous. Then there exist m ≥ n ≥ 1; α_i, α'_j ∈ V* for 0 ≤ i ≤ n and 0 ≤ j ≤ m; w ∈ Σ*; and distinct canonical derivations of w in G:

S = α_0 ⇒_rm α_1 ⇒_rm ··· ⇒_rm α_n = w,
S = α'_0 ⇒_rm α'_1 ⇒_rm ··· ⇒_rm α'_m = w.

We show how these derivations differ.
Claim. There exists some i, 1 ≤ i ≤ n, such that α_{n−i} ≠ α'_{m−i}.

Proof. Assume, for the sake of contradiction, that α_{n−i} = α'_{m−i} for 1 ≤ i ≤ n. If m = n, the two derivations are not distinct; therefore m > n. It follows that α_0 = α'_{m−n}, where m − n ≠ 0. We know that α'_{m−n−1} ⇒_rm α'_{m−n}, so α'_{m−n} = S has a handle, which is a contradiction.

We now let i, 1 ≤ i ≤ n, be the smallest integer such that α_{n−i} ≠ α'_{m−i}. It follows that α_{n−i+1} = α'_{m−i+1}. Since every canonical sentential form has at most one handle, Lemma 13.2.1 gives us that α_{n−i} = α'_{m−i}, which is a contradiction. □

Now we shall apply these results to verify that every LR(k) grammar is unambiguous.

Theorem 13.2.1 Let G = (V, Σ, P, S) be an LR(k) grammar, k ≥ 0. Then G is unambiguous.

Proof. Assume, for the sake of contradiction, that G = (V, Σ, P, S), an LR(k) grammar, is ambiguous. By Lemma 13.2.2, we have either:

1 S has a handle;
2 some canonical sentential form besides S has no handle; or
3 some canonical sentential form has more than one handle.

CASE 1. Suppose S has a handle. Then, for some A ∈ N, α ∈ V*, and x ∈ Σ*, we have S ⇒*_rm αAx ⇒_rm S; thus S ⇒+_rm S in G. But this contradicts the fact that G is LR(k).

CASE 2. Assume that some canonical sentential form besides S has no handle. This is impossible, by the definition of a canonical sentential form.

CASE 3. Assume that some canonical sentential form has more than one handle. We have, for γ, α, β, α', β' ∈ V*; w, w' ∈ Σ*; A, A' ∈ N,

S ⇒*_rm αAw ⇒_rm αβw = γw,
S ⇒*_rm α'A'w' ⇒_rm α'β'w' = γw',

where γw = γw' is the same sentential form and (A → β, lg(αβ)) ≠ (A' → β', lg(α'β')). Now we know that (k)w = (k)w'. Thus, since G is LR(k), (A → β, lg(αβ)) = (A' → β', lg(α'β')). But this is a contradiction. □

The following lemma tells us when a grammar is not LR(k). Consequently, the lemma is often useful in proofs by contradiction.

Lemma 13.2.3 Let k ≥ 0 and let G = (V, Σ, P, S) be a reduced context-free grammar such that S ⇒+_rm S is impossible in G. G is not LR(k) if and only if there exist w, w', x ∈ Σ*; A, A' ∈ N; γ', γ, α, α', β, β' ∈ V* such that

i) S ⇒*_rm αAw ⇒_rm αβw = γw;
ii) S ⇒*_rm α'A'x ⇒_rm α'β'x = γ'x = γw';
iii) (k)w = (k)w';
[Fig. 13.1 illustrates the relations between the strings: γ = αβ followed by w, and γ' = α'β' followed by x, spelling the same string γw'.]

and

iv) (A → β, lg(αβ)) ≠ (A' → β', lg(α'β')), with
v) lg(α'β') ≥ lg(αβ).

Proof. If (i) through (v) hold, G is clearly not LR(k). Conversely, assume that G is not LR(k). Then, clearly, (i) through (iv) hold. Suppose

v') lg(γ') = lg(α'β') < lg(αβ) = lg(γ).

We shall show how, by reversing (i) and (ii), we can satisfy (v). Figure 13.1 illustrates the relationships between the strings in (i) and (ii). Now let y be the string such that γ = γ'y; since γ'x = γw' and lg(γ') < lg(γ), we have x = yw'. Clearly, y ∈ Σ*. We now have, for x, y, w ∈ Σ*; γ', α', α, β', β ∈ V*; A', A ∈ N:

i') S ⇒*_rm α'A'x ⇒_rm α'β'x = γ'x.

From (i) we have

ii') S ⇒*_rm αAw ⇒_rm αβw = γw = γ'yw.

From (iii), since x = yw' and (k)w = (k)w', we see that

iii') (k)x = (k)(yw).

From (iv), we get

iv') (A' → β', lg(α'β')) ≠ (A → β, lg(αβ)),

and by (v') we have lg(αβ) > lg(α'β'). Thus, with the roles of the two derivations interchanged, (i') through (v') satisfy our lemma. □

The following theorem will be extremely useful in studying the class of LR(0) languages. It is an inductive version of the definition of an LR(k) grammar.

Theorem 13.2.2 (Extended LR(k) theorem) Suppose G = (V, Σ, P, S) is an LR(k) grammar and there exist α ∈ V* and x_1, x_2, w ∈ Σ* such that

i) S ⇒*_rm αx_1 ⇒*_rm wx_1,
ii) S ⇒*_rm wx_2,
iii) (k)x_1 = (k)x_2,
iv) x_2 ≠ Λ (for any k ≥ 0); or k = 0 and there exists no x ∈ Σ* such that Sx is a sentential form of G with a handle whose second component is 1;†

then

v) S ⇒*_rm αx_2 ⇒*_rm wx_2.

† In most cases, we will use the fact that x_2 ≠ Λ. The other possibilities will be useful later on and in the problems.
Proof. We assume, for the sake of contradiction, that (i), (ii), (iii), and (iv) hold, but not (v). Suppose αx_1 ⇒*_rm wx_1 is a derivation of n steps, n ≥ 1, given by the (unique) derivation

αx_1 = α_n x_1 ⇒_rm α_{n−1} x_1 ⇒_rm ··· ⇒_rm α_1 x_1 = wx_1,

with α_i ∈ V* for 1 ≤ i ≤ n. Let m be the number of steps in the derivation S ⇒*_rm wx_2, and let r = min(m, n). Now suppose that the last r steps of the derivation S ⇒*_rm wx_2 are

α'_r ⇒_rm α'_{r−1} ⇒_rm ··· ⇒_rm α'_1 = wx_2,

for some α'_i ∈ V*, 1 ≤ i ≤ r.

Claim. There exists some ℓ ≤ r such that α'_ℓ ≠ α_ℓ x_2.

Proof by contradiction. Suppose α'_ℓ = α_ℓ x_2 for all ℓ ≤ r.

CASE 1. r < n. Then α'_r = α_r x_2 = S. Thus α_r = S and x_2 = Λ. Since (k)x_1 = (k)x_2, we must have k = 0 or x_1 = Λ. If x_1 = Λ, then S ⇒+_rm S, which contradicts the fact that G is LR(k). Therefore k = 0. However, we know that α_{r+1} x_1 ⇒_rm α_r x_1 = Sx_1. The handle of Sx_1 has second component 1, contradicting (iv).

CASE 2. r = n. Again α'_r = α_r x_2. We have S ⇒*_rm α'_r = α_r x_2 = α_n x_2 = αx_2 ⇒*_rm wx_2. But this is (v), which is assumed to be false. Thus we have a contradiction, and the claim is established.

Now let ℓ be the smallest positive integer satisfying our claim. Clearly, ℓ > 1, since α'_1 = wx_2 = α_1 x_2. Now we know that there exist μ, μ', β, β' ∈ V*; A, A' ∈ N; and y, z ∈ Σ* such that

i) S ⇒*_rm α_ℓ x_1 = μAyx_1 ⇒_rm μβyx_1 = α_{ℓ−1} x_1,
ii) S ⇒*_rm α'_ℓ = μ'A'z ⇒_rm μ'β'z = α'_{ℓ−1} = α_{ℓ−1} x_2 = μβyx_2,

using the fact that α'_{ℓ−1} = α_{ℓ−1} x_2 from our minimality assumption about ℓ. Now let γ = μβ. We get:

i) S ⇒*_rm μAyx_1 ⇒_rm μβyx_1 = γyx_1,
ii) S ⇒*_rm μ'A'z ⇒_rm μ'β'z = γyx_2.

Now (k)x_1 = (k)x_2 implies (k)(yx_1) = (k)(yx_2). Since G is LR(k), we have (A → β, lg(μβ)) = (A' → β', lg(μ'β')). From (ii), α'_ℓ = μ'A'z and, using the equality of handles, we have β = β', A = A', and thus μ' = μ and z = yx_2. Thus α'_ℓ = μAyx_2 = α_ℓ x_2. But this contradicts our assumption that α'_ℓ ≠ α_ℓ x_2. □

Now we must turn to relating Δ_2 to the family of LR(0) languages. This will involve a detailed study of derivation trees and canonical cross sections. The reader may wish to reread the definitions given in the problems of Section 1.6.
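As an aside, the brute-force test mentioned after Definition 13.2.1, namely enumerating canonical sentential forms together with their handles and checking uniqueness, is easy to mechanize for small grammars if the enumeration is cut off at a fixed depth. The following Python sketch is ours (the language and every name in it are our own, not the text's), and a depth cutoff can only exhibit a violation of the LR(0) property, never certify it. Run on the grammar G_4 of this section, it finds the abc/abbbc conflict discussed above.

```python
def rightmost_steps(form, productions):
    """Yield (new_form, handle) for each rightmost step applicable to `form`."""
    for i in range(len(form) - 1, -1, -1):        # find the rightmost nonterminal
        X = form[i]
        if X in productions:
            for rhs in productions[X]:
                # handle = (lhs, rhs, lg(alpha beta)) in the notation of the text
                yield form[:i] + rhs + form[i + 1:], (X, rhs, i + len(rhs))
            return

def lr0_conflicts(productions, start, depth):
    """Enumerate rightmost derivations to `depth` and report LR(0) conflicts."""
    records, frontier = [], [start]                # (canonical form, its handle)
    for _ in range(depth):
        nxt = []
        for form in frontier:
            for new, handle in rightmost_steps(form, productions):
                records.append((new, handle))
                nxt.append(new)
        frontier = nxt
    conflicts = []
    for f1, h1 in records:
        gamma = f1[:h1[2]]                         # prefix up through the handle
        for f2, h2 in records:
            tail = f2[len(gamma):]
            # f2 = gamma w' with w' terminal forces the same handle for LR(0)
            if f2.startswith(gamma) and all(c not in productions for c in tail):
                if h2 != h1:
                    conflicts.append((f1, h1, f2, h2))
    return conflicts

G4 = {"S": ["aAc"], "A": ["bAb", "b"]}             # the non-LR(0) grammar G4
print(lr0_conflicts(G4, "S", 4)[:1])               # exhibits the abc/abbbc conflict
```

Since k = 0 means the lookahead condition (iii) of Definition 13.2.1 is vacuous, the check reduces to: whenever one recorded form is γw and another is γw' with w' terminal, their handles must coincide, which is exactly what the inner loop verifies.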
Our next lemma relates the canonical cross sections to the left part of a tree.

Lemma 13.2.4 Let T be a grammatical tree of some grammar, (n)T the left part of T for some n ≥ 0, and ξ = (x_1, ..., x_k, y_1, ..., y_m) a canonical cross section in T with exactly the nodes x_1, ..., x_k in (n)T. Then η = (x_1, ..., x_k) is a canonical cross section in (n)T.

Proof. The argument is an induction on the level h of the cross section ξ.

Basis. The case h = 0 is immediate, since ξ = η = (x_0), where x_0 is the common root of T and (n)T.

Inductive step. Assume the lemma true for canonical cross sections of level h ≥ 0. Let ξ have level h + 1 and let ξ' be the (unique) canonical cross section of level h in T. Using the inductive hypothesis, let η' be the canonical cross section in (n)T which consists of all the nodes of ξ' which are in (n)T. Now, to obtain ξ from ξ', we need the rightmost internal (with respect to T) node, say x, in ξ'. There are two cases. If x is not in (n)T, then none of its descendants is in (n)T and η = η'. Hence η is a canonical cross section in (n)T. On the other hand, if x is in (n)T, then it is the rightmost internal node not only in ξ' (with respect to T) but also in η' (with respect to (n)T) and, by the definition of canonical cross section, we conclude that η is a canonical cross section in (n)T. □

The next lemma is, in a certain sense, a converse of the previous lemma.

Lemma 13.2.5 Let T be a grammatical tree of some grammar and let (n)T be the left part of T for some n ≥ 0. Let η = (x_1, ..., x_k) be a canonical cross section in (n)T and let y_1, ..., y_m be all the leaves in T which are to the right of x_k; that is, x_k ⊏ y_1 ⊏ ··· ⊏ y_m. Then the sequence ξ = (x_1, ..., x_k, y_1, ..., y_m) is a canonical cross section in T.

Proof. Again the argument is an induction, this time on the level h of the canonical cross section η.

Basis. h = 0. Let x_0 be the root of (n)T (and of T). Then η = (x_0), and the only possibility for ξ is ξ = η = (x_0). Hence ξ is a canonical cross section (of level 0) in T.

Inductive step. Assume the lemma true for canonical cross sections of level h ≥ 0. Under the same notation as in Lemma 13.2.4, let η = (x_1, ..., x_k) be of level h + 1 and let η' = (x'_1, ..., x'_{k'}) be the (unique) canonical cross section of level h in (n)T. By the induction hypothesis, the sequence ξ' = (x'_1, ..., x'_{k'}, y_j, ..., y_m), for some j ≥ 1, is a canonical cross section in T. Since none of the nodes y_i (j ≤ i ≤ m) is an internal node in T, we can conclude* from the definition that ξ = (x_1, ..., x_k, y_1, ..., y_m) is a canonical cross section in T. □

* We need only use the following observation: If (x_1, ..., x_p, ..., x_m) is a canonical cross section, where x_{p+1}, ..., x_m are leaves, and if x'_1, ..., x'_q are all the leaves in the subtree below x_p, then (x_1, ..., x_{p−1}, x'_1, ..., x'_q, x_{p+1}, ..., x_m) is a canonical cross section.
We are now prepared to relate Δ_2 and LR(0).

Theorem 13.2.3 Any reduced strict deterministic grammar is also an LR(0) grammar.

Proof. Let G = (V, Σ, P, S) be a reduced strict deterministic grammar. First observe that S ⇒+_rm S cannot occur in G, since that would contradict the Corollary to Theorem 11.4.3. Now, following Definition 13.2.1, assume that, for some α ∈ V*; w, w' ∈ Σ*; p, p' ∈ P; and n' ≥ 0, (p, lg(α)) and (p', n') are handles of αw and αw', respectively. We want to prove that these two handles are identical.

Informally summarized, we shall prove this equivalence by first showing that each of the two handles appears in a certain position in a left part of a given derivation tree, and second, using the left-part theorem, that this left part is unique.

Since G is reduced, there is some u ∈ Σ* such that α ⇒* u. Let T, T' be two derivation trees of G corresponding to the derivations S ⇒*_rm αw ⇒*_rm uw and S ⇒*_rm αw' ⇒*_rm uw'. Since G is unambiguous (cf. Corollary 1 after Theorem 11.8.1), each of the two trees is unique. Moreover, since αw and αw' are canonical sentential forms (by Definition 13.2.1, this is a necessary condition for a sentential form to have a handle), there are canonical cross sections ξ_1 in T and ξ'_1 in T' such that λ(ξ_1) = αw and λ(ξ'_1) = αw'; each canonical cross section is also unique in its respective tree (cf. Problem 7 of Section 11.6). We shall study these cross sections in detail (Fig. 13.2).

Assume ξ_1 is of level h (in T) and ξ'_1 of level h' (in T'). The existence of handles implies the existence of a canonical cross section ξ_0 of level h − 1 in T and another canonical cross section ξ'_0 of level h' − 1 in T'. Corresponding to the definition of canonical cross section, let us write, for 1 ≤ k ≤ m,

ξ_0 = (x_1, ..., x_k, ..., x_m),  (13.2.1)

[Fig. 13.2 shows the trees T and T', their left parts (n+1)T and (n+1)T', the cross sections under study, and the common frontier prefix u.]
and, for r ≥ 1,

ξ_1 = (x_1, ..., x_{k−1}, y_1, ..., y_r, x_{k+1}, ..., x_m),  (13.2.2)

where

λ(x_1) ··· λ(x_{k−1}) λ(y_1) ··· λ(y_r) = α,  λ(x_{k+1}) ··· λ(x_m) = w.  (13.2.3)

Similarly, for ξ'_0 and 1 ≤ k' ≤ m',

ξ'_0 = (x'_1, ..., x'_{k'}, ..., x'_{m'}),  (13.2.4)

and, for ξ'_1 and r' ≥ 1,

ξ'_1 = (x'_1, ..., x'_{k'−1}, y'_1, ..., y'_{r'}, x'_{k'+1}, ..., x'_{m'}).  (13.2.5)

In (13.2.1) and (13.2.4), x_k and x'_{k'} are the rightmost internal nodes in ξ_0 and ξ'_0, respectively.

Next we shall apply the Left-Part Theorem to T and T'. Assume first that w ≠ Λ. Let n = lg(u) and let z be the (n + 1)st terminal node in T; that is, λ(z) = (1)w. From (13.2.3), we can see that z appears both in ξ_0 and ξ_1 or, more specifically, z = x_ℓ for some ℓ > k. We have (n)fr(T) = u = (n)fr(T'), and the left-part property of the derivation trees of G yields the existence of a structural isomorphism f: (n+1)T → (n+1)T'; moreover, for any x in T,

x ⊏ z implies λ(x) = λ(f(x)),  (13.2.6)

and, for z itself,

λ(z) = λ(f(z)) = (1)w.  (13.2.7)

Define

η_0 = (x_1, ..., x_k, ..., z),
η_1 = (x_1, ..., x_{k−1}, y_1, ..., y_r, x_{k+1}, ..., z).

(Note that x_k ≠ z, but x_{k+1} = z is possible.) By Lemma 13.2.4, η_0 and η_1 are canonical cross sections in (n+1)T. Because canonical cross sections are invariant under structural isomorphism, the following are canonical cross sections in (n+1)T' (cf. Fig. 13.2):

f(η_0) = (f(x_1), ..., f(x_k), ..., f(z)),
f(η_1) = (f(x_1), ..., f(x_{k−1}), f(y_1), ..., f(y_r), ..., f(z)).

Extending f(η_0) and f(η_1) by all the leaves in T' to the right of f(z) and using Lemma 13.2.5, we obtain two canonical cross sections ξ''_0 and ξ''_1 of T'. Now, by (13.2.6) and (13.2.7), λ(ξ''_1) = αw' = λ(ξ'_1) and, since G is unambiguous, we have that ξ''_1 = ξ'_1 by Problem 8 of Section 11.6. Now y_1, ..., y_r in (13.2.2) are the immediate descendants of x_k in T (and thus in (n+1)T); therefore, using the structural isomorphism (n+1)T ≅ (n+1)T', f(y_1), ..., f(y_r) are the immediate descendants of f(x_k) in (n+1)T' (and, since f(x_k) ⊏ f(z), this is also true in T'). Moreover, f(x_k) is the rightmost internal node in f(η_0) and hence in ξ''_0. Consequently, ξ''_0 has level h' − 1, which is possible only if ξ''_0 = ξ'_0 (cf. Problem 6(b) of Section 1.6). Therefore, in (13.2.4) and (13.2.5), we have k' = k, r' = r, x'_{k'} = f(x_k), and y'_i = f(y_i) for i = 1, ..., r. Using (13.2.6), we conclude that p' = p and lg(α) = k − 1 + r = k' − 1 + r' = n'.

It remains to verify the case w = Λ. In that case T = T' by the Left-Part Theorem, and the result is an immediate consequence of Problem 8 of Section 1.6. □
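As a quick illustration, the depth-limited checker sketched after Theorem 13.2.2 (our own Python, with our own names) finds no handle conflict for the grammar G_3 of this section, which we verified to be LR(0) by direct enumeration, while it does exhibit one for G_4. As before, a bounded enumeration can refute the LR(0) property but never certify it.

```python
G3 = {"S": ["aAc"], "A": ["Abb", "b"]}
print(lr0_conflicts(G3, "S", 6))   # [] : no conflict found at this depth
```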
We are now in a position to prove a useful theorem. We shall show that every Δ_0 language has an LR(1) grammar. When we learn how to accept every LR(k) language by a dpda, and some other techniques, we shall have proved that Δ_0 is coextensive with the LR(k) languages. One additional lemma is necessary.

Lemma 13.2.6 Let G be an LR(0) grammar of the form G = (V ∪ {$}, Σ ∪ {$}, P, S), where P ⊆ N × (V* ∪ V*$), L(G) ⊆ Σ*$, and there is no derivation S ⇒+_rm S$ in G. Define another grammar G' = (V, Σ, P', S), where P' = P_1 ∪ P_2,

P_1 = {A → β | A → β in P and β ∈ V*},
P_2 = {A → β | A → β$ in P}.

Then G' is an LR(1) grammar and L(G')$ = L(G).

Proof. Assume, without loss of generality, that G is reduced. Clearly, there is no derivation S ⇒+_rm S in G', since S ⇒+_rm S$ cannot occur in G. We shall examine the handles in G'. Let h = (A → β, i) be a handle of γ in G', where A ∈ N; β, γ ∈ V*; and i ≥ 0. There are two possibilities. If A → β is in P_1, then h is also a handle of γ or γ$ in G. On the other hand, if A → β is in P_2, then (A → β$, i + 1) is a handle of γ$ in G.

Now, following Definition 13.2.1, let γ ∈ V*; w, w' ∈ Σ*; p, p' ∈ P'; i' ≥ 0; and let h = (p, lg(γ)) and h' = (p', i') be handles of γw and γw', respectively. Moreover, let (1)w = (1)w'. We shall show that h = h'. Assume that p and p' have the form A → β and A' → β', respectively, where A, A' ∈ N and β, β' ∈ V*. We can distinguish four cases.

CASE 1. p, p' ∈ P_1. Then h is also a handle of γw or γw$ in G, and h' is also a handle of γw' or γw'$ in G. In all cases we have h = h' by the LR(0) property of G.

CASE 2. p, p' ∈ P_2. Then h_1 = (A → β$, lg(γ) + 1) is a handle of γw$ in G and h'_1 = (A' → β'$, i' + 1) is a handle of γw'$ in G. Here (1)w = (1)w' implies (1)(w$) = (1)(w'$), and thus h_1 = h'_1 by the LR(0) property of G. Therefore, also, h = h'.

We shall see that the remaining two cases cannot occur. First, let us make the following observation, which is a direct consequence of Definition 13.2.1: if h = (p, lg(γ)) is a handle of γw and if γw' is a canonical sentential form, where w, w' ∈ Σ*, then h is also a handle of γw'. We proceed in our case analysis.

CASE 3. p ∈ P_1, p' ∈ P_2. Then h is a handle of γw or γw$ in G, and h'_1 = (A' → β'$, i' + 1) is a handle of γw'$ in G. Here γw'$ is a canonical sentential form in G (otherwise it could not have a handle), and thus h is also a handle of γw'$ in G. Then, by the LR(0) property, h = h'_1. But this is not possible, since $ does not occur in β.

CASE 4. p' ∈ P_1, p ∈ P_2. Then h' is a handle of γw' or of γw'$, and h_1 = (A → β$, lg(γ) + 1) is a handle of γw$. The form of h_1 implies that w = Λ. Since (1)w = (1)w', also w' = Λ. Now h' is a handle of γ$ (if it is a handle of γ, then it is also
a handle of γ$, since γ$ is a canonical sentential form). But also h_1 is a handle of γ$, and thus h' = h_1, which is again not possible.

We have proved that h = h' and therefore G' satisfies the LR(1) property. The fact that L(G')$ = L(G) is immediate from the definition of G'. □

We can now prove an important consequence of the previous result.

Theorem 13.2.4 Any deterministic language has an LR(1) grammar.

Proof. Let L ⊆ Σ* and L ∈ Δ_0. Then L$ ∈ Δ_2 and, by Theorem 11.5.3, there exists a strict deterministic grammar G = (V ∪ {$}, Σ ∪ {$}, P, S) such that L(G) = L$. Using Theorem 11.4.1 and Problem 3 of Section 11.4, we may assume, without loss of generality, that G is reduced and Λ-free (note that L(G) ≠ {Λ}). By Theorem 13.2.3, G is LR(0).

We shall check that G satisfies the assumptions of Lemma 13.2.6. First, P ⊆ N × (V* ∪ V*$); in other words, each production in P has the form A → β or A → β$, where A ∈ N and β ∈ V*. For, if A → β$δ were in P for some δ ≠ Λ, the fact that G is reduced and Λ-free would imply that u$v ∈ L$ for some u, v ∈ Σ* with v ≠ Λ, which is not possible. Second, S ⇒+_rm S$ is impossible in G, since that would contradict the Corollary to Theorem 11.4.3 and the strict determinism of G. Now we can apply Lemma 13.2.6 and obtain an LR(1) grammar G' such that L(G') = L. □

PROBLEMS

1 Is the following grammar LR(k) and, if so, what is the least k for which it is true?

E → E + T | T
T → T * F | F
F → (E) | a

2 Let k ≥ 0. Find a family of grammars G_k such that G_k is not LR(k) but is LR(k + 1).

3 Show that the restriction against derivations of the form S ⇒+_rm S in Definition 13.2.1 is necessary to preserve the unambiguity of LR(k) grammars.

4 Another definition that has been proposed for LR(k) grammars in Aho and Ullman [1972] is now given.

Definition. Let k ≥ 0 and let G = (V, Σ, P, S) be a reduced context-free grammar. Define the augmented grammar G' = (V', Σ, P', S'), where V' = V ∪ {S'} and P' = P ∪ {S' → S}, with S', a symbol not in V, our new start symbol. G is said to be ALR(k) (augmented LR(k)) if and only if, in G', for each w, w', x ∈ Σ*; γ, α, α', β, β' ∈ V'*; A, A' ∈ N', if

i) S' ⇒*_rm αAw ⇒_rm αβw = γw (that is, γw has handle (A → β, lg(αβ))), and
ii) S' ⇒*_rm α'A'x ⇒_rm α'β'x = γw' (that is, γw' has handle (A' → β', lg(α'β'))),
and

iii) (k)w = (k)w',

then

iv) (A → β, lg(αβ)) = (A' → β', lg(α'β')).

Show the following lemma.

Lemma Let G = (V, Σ, P, S) be a reduced context-free grammar. For each k ≥ 0, if G is ALR(k), then G is LR(k).

5 Prove the following partial converse of Problem 4.

Lemma Let G = (V, Σ, P, S) be a reduced context-free grammar. If G is LR(k) for some k ≥ 1, then G is ALR(k).

At this point, we know that the family of LR(k) languages and grammars is the same for k ≥ 1. Let us examine the case k = 0.

6 Prove the following result.

Lemma If G = (V, Σ, P, S) is LR(0) and S ⇒+_rm Sw is impossible in G for any w ∈ Σ+, then G is ALR(0).

7 Prove the following lemma.

Lemma Let G = (V, Σ, P, S) be an ALR(0) grammar. For any w ∈ Σ*, S ⇒+_rm Sw is impossible in G.

8 Summarize the relationship between the families of ALR(k) and LR(k) grammars and languages.

Now let us consider a different variation on the LR(k) definition.

Definition. Let k ≥ 0 and let G = (V, Σ, P, S) be a reduced context-free grammar. Define the $-augmented grammar G' = (V', Σ', P', S'), where V' = V ∪ {S', $}, Σ' = Σ ∪ {$}, and P' = P ∪ {S' → S$}, with S' and $ new symbols not in V. G is said to be $LR(k) if and only if, in G', for each w, w', x ∈ Σ'*; γ, α, α', β, β' ∈ V'*; A, A' ∈ N', if

i) S' ⇒*_rm αAw ⇒_rm αβw = γw (that is, γw has handle (A → β, lg(αβ))),
ii) S' ⇒*_rm α'A'x ⇒_rm α'β'x = γw' (that is, γw' has handle (A' → β', lg(α'β'))), and
iii) (k)w = (k)w',

then

iv) (A → β, lg(αβ)) = (A' → β', lg(α'β')).

9 Let G = (V, Σ, P, S) be a $LR(k) grammar, k ≥ 0. Show that S ⇒+_rm S is impossible in G.

10 Let G = (V, Σ, P, S) be a reduced context-free grammar. Show that, for each k ≥ 0, if G is $LR(k), then G is LR(k).

11 Let G = (V, Σ, P, S) be a reduced context-free grammar. Show that, for each k ≥ 1, if G is LR(k), then G is $LR(k).

In order to work out the properties of the $LR(0) grammars, we need another definition.
Definition. Let G = (V, Σ, P, S) be a reduced context-free grammar. A production p ∈ P is pathological if:

i) there exist T ∈ N and w ∈ Σ* such that p = (T → S) and

S ⇒*_rm Tw ⇒_rm Sw;

or

ii) there exist A ∈ N and w ∈ Σ* such that p = (A → Λ) and

S ⇒*_rm SAw ⇒_rm Sw.

12 Prove the following result.

Lemma Let G = (V, Σ, P, S) be a reduced context-free grammar. Then G has no pathological productions if and only if there exists no w ∈ Σ* such that Sw is a sentential form of G with a handle whose second component is 1.

13 Let G = (V, Σ, P, S) be a reduced context-free grammar. Show that G is $LR(0) if and only if it is LR(0) and has no pathological productions.

14 Prove a theorem which relates the families of LR(k) and $LR(k) grammars for all k ≥ 0.

15 Prove a theorem which relates the families of $LR(k) and ALR(k) grammars for all k ≥ 0.

Finally, another variation of the LR condition is given.

Definition. Let G = (V, Σ, P, S) be a reduced context-free grammar. Then G is LLR(k) if and only if:

a) G is unambiguous, and
b) for all w_1, w_2, w_3, w'_3 ∈ Σ* and A ∈ N, if

i) S ⇒*_rm w_1 A w_3 ⇒_rm w_1 w_2 w_3,
ii) S ⇒*_rm w_1 w_2 w'_3, and
iii) (k)w_3 = (k)w'_3,

then

iv) S ⇒*_rm w_1 A w'_3.

16 Let G = (V, Σ, P, S) be a $LR(k) grammar for some k ≥ 0. Show that G is LLR(k).

17 Show that every LR(0) grammar is LLR(1).

18 Let G = (V, Σ, P, S) be an LR(k) grammar, k ≥ 1. Show that G is an LLR(k) grammar.

19 Show that, for any k ≥ 0, there exists a grammar which is LLR(0) but not LR(k).

20 Prove the following useful result.

Theorem Let G = (V, Σ, P, S) be a reduced context-free Λ-free grammar. Then G is $LR(k) if and only if G is LLR(k).
21 Prove the following lemma about canonical cross sections.

Lemma Let T be any tree and ξ a canonical cross section in T of level h ≥ 1. Let x, y be two nodes in ξ such that x ⊏ y and y is internal in T. Then there exists a node z in T such that x ⊏ z and z dominates y.

Now we introduce an interesting family of grammars which is closely related to the LR grammars. In this family, the handle is uniquely determined by looking ahead k characters and looking behind ℓ characters.

Definition. Let G = (V, Σ, P, S) be a grammar with no derivations of the form S ⇒+_rm S, and let ℓ, k ≥ 0. Then G is called a BRC(ℓ, k) grammar (bounded right context) if the following condition holds for all α, α', β ∈ V*; w, w' ∈ Σ*; A ∈ N; p' ∈ P; and i' ≥ lg(α'β): if (A → β, lg(αβ)) and (p', i') are handles of αβw and α'βw', respectively, and if α and α' agree in their last ℓ symbols and (k)w = (k)w', then p' has the form A → β and i' = lg(α'β).

22 Prove the following useful result.

Theorem Every strict deterministic grammar G is equivalent to some strict deterministic BRC(ℓ, 0) grammar G' for some ℓ ≥ 0.

[Hint. Let G = (V, Σ, P, S) be a strict deterministic grammar with partition π. Let π − {Σ} = {V_1, ..., V_m}, and let X_i be a new symbol associated with the i-th block. Define G' = (V ∪ X, Σ, P', S), where X = {X_1, ..., X_m} and

P' = {X_i → Λ | 1 ≤ i ≤ m} ∪ {A → X_i α | A → α is in P, A ∈ V_i}.

Show that L(G') = L(G), that G' is strict deterministic, and (with the help of the previous problem) that G' is BRC(ℓ, 0), where ℓ = max{lg(α) | A → α is in P}.]

13.3 CHARACTERIZATION OF LR(0) LANGUAGES

Our main purpose in this section is to prove an important and very useful characterization theorem for LR(0) languages.

Theorem 13.3.1 (LR(0) language characterization theorem) Let L ⊆ Σ*. The following four statements are equivalent.

a) L is an LR(0) language.
b) L ⊆ Σ* is a deterministic context-free language and, for all x ∈ Σ+ and w, y ∈ Σ*: if w ∈ L, wx ∈ L, and y ∈ L, then yx ∈ L.
c) There exists a dpda A = (Q, Σ, Γ, δ, q_0, Z_0, F), where F = {q_f}, and there exists Z_f ∈ Γ such that

L = T(A, Z_f) = T(A, Γ) = {w ∈ Σ* | (q_0, w, Z_0) ⊢* (q_f, Λ, Z_f)}.

d) There exist strict deterministic languages L_0 and L_1 such that L = L_0 L_1*.

Proof. We first prove that (a) implies (b). Assume that G = (V, Σ, P, S) is an LR(0) grammar and that L = L(G). Furthermore, suppose that, for w ∈ Σ* and x ∈ Σ+, we have w ∈ L and wx ∈ L. Thus we have, in G, the derivations
516 LR(k) GRAMMARS AND LANGUAGES and ii) S^wx iii) (0)A = (0)x = A and iv) x =£ A. By the extended LR(0) theorem (13.2.3), we get v) S=>Sxfwx Now we assume that, for some y G £*, we have y &L. Thus, we have S^y. Derivation (v) gives us „ + Sj>Sxtyx Thusyx GL, which completes the proof that (a) implies (b). We now prove that (b) implies (c). This is fhe.most involved part of the proof of this theorem, since several machine constructions are involved. We begin by considering the degeneratet languages that obey (b), namely 0 and {A}. We then consider the prefix-free languages that satisfy (b). We next consider the languages that satisfy (b) that are not prefix-free. Suppose L = 0. Let A - ({q0,qfl 2, {Zt\ 0, q0,Zf, {<?,}), where q0 i=qt. Clearly, T(A, Zf) = 0. Suppose L = {A}. Let A=({qt}, Z, {Zt}, 0, qt, Zu {qf}). Clearly, T(A,Zt) = {A}. Now, we shall operate under the assumption that L is not degenerate. We know that there exists a dpda A =(Q, 2,r,8,q0,Z0,F) such that L = T(A) since L is deterministic. We wish to modify A so that we obtain a new dpda such that L is accepted by a new machine with only Zt on the pushdown. Our first step is to add a new "bottom of stack" marker to our machine. We let A' = (Q',Z,r',8',qi,Zb,F'), where Q = Q u {<7o}, where q'0 is a new state not in Q, r' = T U {Zb}, where Zb is a new symbol not in T, F' = F. Define 5' as follows: i) ForaU^GC,aGS,ZGr,let S'(q,a,Z) = 5(q,a,Z). ii) Also let 8'(qo,A,Zb) = (q0,ZbZ0). Clearly, T(A')=T(A). We now choose any xG£* such that xGmin(Z<). Since Z-^0, clearly, min(L) ¥= 0. We shall now consider two cases. The first case will correspond to the strict deterministic languages. Observe carefully, however, that this does not follow directly from the statement of this case. Our construction will be much simpler in this case, than in Case 2, where our language is not prefix-free. ' A language L is said to be degenerate if L = 0 or L = {A}.
13.3 CHARACTERIZATION OF Lff(O) LANGUAGES 517 CASE 1. Choose any x G 2 * such that x G min(L). For all y G 2\ xy $ L. Claim L is strict deterministic. Proof Assume, for the sake of contradiction, that L is not strict deterministic. Then there exist w6E*,y6E* such that w GZ, and wy GZ, since L is not prefix- free. Since x GL,xy GZ.,by (b). But this contradicts the supposition for Case 1,giving us a contradiction. Thus L is strict deterministic. We shall construct a machine that simulates the machine A' until a final state of A' is reached. It then erases the stack until a new bottom of stack marker is reached, and then goes to a special final state and puts the special accept symbol on the pushdown. We let A" = (G",S,r",5",^,Z6)F"), where Q" = Q', V" = r'u{Zf}, where Zf is a new symbol not in T', our special acceptance stack symbol, n i qo = qo, F" = {qt\ and 5 " is obtained from 5' as follows: i) For4G£'-F',aG2A,ZGr,let5"(4,a,Z) = b\q,a,Z). ii) ForallflGF',ZGr'-Z6,welet S"(q,A,Z) = (q,A). iii) For q G F\ we let 8"(q,A,Zb) = (qt,Zt). These cases have the following significance. i) For nonfinal states, behave the same as A' behaves. ii) When A" reaches a final configuration for A', erase the stack until: iii) The bottom marker is reached. Then go to the final configuration by replacing the bottom marker with the special accept symbol on the pushdown. Since L is strict deterministic, we define no move for this configuration, making it a "dead configuration." Clearly, T(A",T") = T(A",Zt) = T(A') = T(A) = L. We now consider Case 2. CASE 2. Let x G S* be any string such that x G min(L). For some y G £+, xy GL. In this case, our machine cannot go to a dead configuration after reaching an accepting configuration. We shall construct a new machine ,4", which, after accepting
518 LR(k) GRAMMARS AND LANGUAGES any string by T(A ", Zf), will pretend that it is, in fact, x that it has just accepted. Our first claim shows us how our machine will pretend that it has just accepted x. Our claim concerns the behavior of A' under the assumption of Case 2. Claim There exists aq GQ',aG(r')*,Zer', such that, for our chosenx, (c/o,x,Zb)\^(q,A,ZbaZ), where for some a G S, 5 '(q, a, Z) is defined. Proof This follows directly from the hypothesis of Case 2. We now define A " in terms of A' = (Q', 2, T', 5', q'0, Zb, F'). We let A" = (G",S,r",5",^',Z6,F"), where Q" = Q' U{i7f}, where qt is a new symbol not in Q'. r" = r' U {Zf}, whereZf is anew symbol not in r'(Zf will be our special acceptance stack symbol), Qo = Qo, F" = {qt}. We now define 5" as follows: i) Forall^GF',ZGr'-Z6,welet 8"(q,A,Z) = for, A), ii) Forall^GF' a) 8"(q,A,Zb) = (quZt), b) 8"(q,A,Zt) = (qf,Zj). iii) 5 "(qt, A, Zf) = (qt,ZtaZ) where aZ is defined in the previous Claim, iv) ForaUaGS,ZGr'-Zf,^G(2',aGr*,if 8\q,a,Z) = (q,a) then 8"(qf,a,Z) = fo,a) with q as defined in the previous Claim (note that 5 '(q, A, Z) = 0, since ,4' is deterministic). v) Forflree'-.F\aezA,zer\<7'ee\ae(r')*,if 8\q,a,Z) = fo',a) then let S"(q,a,Z) = fo',a). The sets of added rules have the following significance: 1 When an ^'-accepting configuration is reached, the stack is erased until the bottom marker is reached. 2 When the bottom marker is reached, we go into an A "-accepting configuration.
13.3 CHARACTERIZATION OF LfHO) LANGUAGES 519 3 The machine pretends that x has just been accepted and so adjusts its stack. 4 The machine proceeds under the assumption that it is in fact x that has just been accepted. 5 Otherwise,,4" proceeds as A'. We must now show that A " is a dpda such that L = T(A ", Zf) = T(A ", r"). We leave to the reader the task of showing that A " is deterministic. Next we must show that L = T{A) = T(A') = T{A",Y") = T(A",Zt) = L". First, we show that L" £ L. We assume that for some w G 2*, we have w&L". Since w G L", we know that in A " we have {qo,w,Zb)^r{qt,KZt). By the determinism of A", there exist w2,..., w„ G 2+, u>i G 2 *, « > 1, where w = Wi w2 • • • wn such that (qo, w, ••• w„,Zb)HL(<7f,w2 •••w„,Zf) H1- •• j^fa,, A,Zf), where our machine passes through (7f in only designated ID's. Since, in A ", we have (qo,wuZb) \^(qt,A,Zt). Thus W! G L(,4"). Also in A', for some a G (T')* ,q&F', we have 07o,Wi,Z„) f^fa, A,Z„a). Thusw! Gr(,4') = /,. Also, in A ", for 2 < / < n, we have ^ ,„ (flf,w,-,Zf) ]r(qt,A,Zt). Thus in ,4 we have _ + (flf>Wi,Zf) h(flf,w,-,ZfaZ) pfa,, A,Zf). Thus, in ,4', for some q G F', 0 G (r')*, we have foWj.ZjOZ) ^(qr.A.ZftjS), since we have assumed that A " only passes through qt at the designated ID's, since A" could get to the (qt, A, Zf) configuration only if A', the machine A" is emulating, passed through a final state after reading w,-. We know that, in A', (q'o,xwuZb) \^(q,wi,ZbuZ). Thus xWieT(A')=L. We now show, by induction on i, that w — Wj • • • wt G r(,4') = L, for / < n, where Wj G 2 * and vv(- G 2+ for 2 < / < «. 5axj'x. n = 1. We have shown that wt GL.
520 LR(k) GRAMMARS AND LANGUAGES Induction step. Suppose Wi •••w(GL. Now x&L and xwi+1 GI; thus, since we are assuming that characterization (b) for LR(0) languages holds, we have Wi ••• WjWf+i G L. It is thus clear that w = Wi • • • w„ G Z.. We next show that L = T(A') £ L". Assume for some wG£* that w&L. Now, we know there exist w2,... , wn G £+; wi &l,*;n> l,w = Wi ■ ■ ■ wn, such that Wi • • • Wj G L for all i such that 1 < / < n, and no other prefixes of w are in Z.. Now consider any i such that 2 < i < n. Claim xWj-GZ,. Proof Since W\ ■••wMGL,Wi • • • u>; GZ., and* GL, we knowxw,- GZ.. Therefore, in ,4' for some q G F', a G (r ')*, (qi,xwh Z„) f- (?, w,, Z„aZ) F1 fa, A,Z„a). In ,4 we get + (qt, wt, Zf) p (qt, A, Zf). Also, since W\ G Z,, we know that in A ", fao\w,,Zb) f^fo.A.Z,). Thus in ,4 we have (^o, vvj •• -Wn.Z,,) FL(<7f,w2 •••w„,Zf) F1-- • \-(qt,A,Zf). Thus w = w, • • • w„ G TC4", Zf) = /,". By the construction of A ", we see that T(A ", Zf) = T(A ", V"). This completes the proof that (b) implies (c). To help prove (c) implies (d), we first show that (c) implies (b). Assume there exists a dpda A = (Q, X,F, 8,q0, Z, F), where F = {q^ and there exists Zf G r such that L = T(A, Zf) = T(A, T). Clearly, L G A i. By Theorem 11.5.1, L G A0. Suppose that forx £S*;wj GS*, we have w£L,wx GZ.,and_y GZ-.Then we have (qo,wx,Z0)\^(qt,x,Zt)\^(qt,A,Zt) and t m r (q0,y,Z0)f-(qt,A,Zt). Therefore + t (q0,yx,Z0) f-(qf,x,Zf)YL(qf,A,Zf). Thusyx GZ.. This completes the proof that (c) implies (b). We now prove that (c) implies (d). We assume that there exists a dpda A = (Q, Z,r,8,q0,Z0,F), where F = {qf}and L = T(A, V) = T(A,Zf) for some Zf G V. Let L0 = min(Z.). By Problem 11.3.5, L0 is deterministic. By the definition of min,Z,0 is prefix-free. Thus, by Theorem 11.5.5, Z-o is strict deterministic. We now consider two cases, when L = L0 and when L =£L0. CASE 1. L =L0. Since L0 is strict deterministic, L = Z.o(0)*. CASE 2. Z-^Z-o- Since L=tL0, there exist x GS*, z G£+ such fhatxGZ. and xz&L. Let V = {y G 2* \xy GZ.}. We shall show that Z.' is deterministic, and that L = L0L'.
13.3 CHARACTERIZATION OF LR(0) LANGUAGES 521 We now construct a dpda to accept L'. Let A' = (Q, S, I\ 8, <7f, Zf, {<7f}), where ,4 = (2, S, I\ 8,170, Z0, F) was previously defined. Clearly, A' is deterministic. Claim L = L0L'. Proof We first show that Z, E L0L'. Suppose that, for some w G S*, u> €Z-. Then, for some w0, vvx G S *, w = w0Wi, where w0 G L0 = min(Z,). Ifwi = A, clearly, Wi GZ,';thus wGZ.0Z,'. Suppose Wi =£ A. Since w0 GL, woWi GZ,, and x GZ,, we knowxwi GZ, by characterization (b) of LR(0) languages. Recall that we are assuming (c), and (c) implies (b). Thus Wi GZ,'. Conversely, we show that L0L' <~ L. Suppose that, for some w G £ *, w6Z0£'.Then, for some w0,Wi G £ *, we have w = w0Wi, where w0 EL0 and Wi GZ.'. Since u>i GZ,', weknowxwi GZ,. Since xGL, xwi GZ-, and w0 &L, we know w = u'ovv1 GZ-, by characterization (b) of LR(0) languages. Thus Z-0Z-i 9 Z-, and therefore we see that L = LqL'. Now, let Z-! = min(Z-' — {A}). Clearly, Z-i is strict deterministic, since L' is deterministic. Claim L' =L\. We first show that L' £ L*. For some wG £*, assume that u>GZ-'. Suppose w = A. Then, clearly, w€Z{. Assume w G £+. Then there exist n > 1, w,- G £+, for 1 < i < n, where w = Wi • • • vv„ such that in machine A', (,qf,Wi ■ ■■w„,Zt) \*-(qt,w2 ■ ■■w„,Zt) ^(qt,w3 ■ ■ ■ w„,Zt) \^- ■ ■ F^faf, A,Zf), where these are the only instances in which the machine A' goes through state qt. Thus,for Ki<n, + (qf,W;,Zf) p(qrf)A,Zf). Thus wt G Li and hence w G L*. We now show that L* £ /,'. For some we 2*, assume w€i{. If w = A, clearly, wGL', by definition of L'. Assume w =£A. Since wGZ-f, there exist n~>\, Wj-GS* such that wt GZ- for 1 <*' <«, where w = u>i • • -w„. We now have, in A', (qt,wl • ■ ■ wn, Zt) f^faf, w2 • • • w„, Zf) h*~ • • • \^{qt, A. Zf). This gives us that w = Wx • • • wn GZ,'. Therefore, L' = L\. Thus, we have L = L0L* with L0, Z-i G A2. This completes our proof that (c) implies (d). Finally, we show that (d) implies (a). We assume that L = LqZ-i where Z-0,Z-i GA2.
522 LR(k) GRAMMARS AND LANGUAGES We first consider the degenerate case. Suppose L0 = 0. Then L = 0 and clearly is an LR(0) language. Suppose L t = 0 or {A}. Then L =L0. Since L0 is a strict deterministic language, L must be an LR(0) language by Theorem 13.2.4. Now, we handle the nondegenerate cases. We assume that Lo,£i*0, £i*{A}- Since Lt is a strict deterministic language, there exist strict deterministic, thus LR(0), grammars Gt = (Vt, S,-, Pt, 5,) such that Lt =L(Gt), for i = 0, 1, with N0 n JV, = 0. Let G = (V, S, />, 5), where F=F0UF1U{5}, Sn {K0 U K,}=0, P = P0VPi U (5 -+55!, 5 -+ 50}, S = S0 U Si. Qearly, Z,(G) = L, since G lays down one word of L0 followed by a series of strings of any length of words of L i. We need only show that G is LR(Q). Since Lx =t {A}, and Lx G A2, we know A^ Z-i, by the Corollary to Theorem 11.4.2. We assume now, for the sake of contradiction, that G is not an LR(0) grammar. Then, by Lemma 13.2.3, i) There exist w, w'.x GS*;7, y',a, a', 0,0' G V*;A,A' GN, such that: ii) 5jf a/lu>if a0w = 7W iii) St>ol'A 'x t> a'0'x = y'x = yw' iv) <°>u, = »V = A where v) (A ■*0, lg(a0)) #(/!'-► 0', lg(a'0')) and vi) lg(a'0')>lg(T). We now expand derivations (ii) and (iii). For some yi, ■ ■ ■ ,y„,y[,... ,y„ £ Z+;y0,y'o GS*,n,w>0, (ii) gives us S%SS1=>SyngSS1yn=>Syn_lyn=>Sy1 ■■■yn^S0yi •■•ynfyoyi •••J',, where a/tw j? a0w are two consecutive canonical sentential forms in this derivation. ti (iii) gives us S^SS1^Sy'm^SS1y^l§Sy^ly^^Sy[ ■■■y'm^S0y[ ■--y'mfy'*y[ ---y'm where a'A 'x j* a'0'x are two consecutive canonical sentential forms in this derivation. The productions in G come from either P0, Plt or from our two new productions. We thus now consider two cases, namely, (1)7 = 5 and (1)7 =£ 5. In the first case we will be working on a reduction in Po or on a new production; in the second case, we will be working on a reduction in Pi, or on a new production. CASE 1. (1)7 = 5. We now have two subcases, which will turn out to represent a production in P0 or a new production. CASE 1(a). Suppose (2)7 = 55i. Then we know that, for some SGV*, we have 5i ^Si5. But, since Ly is strict deterministic, we must have S\ j*Si by the Corollary to Theorem 11.4.3, since strict deterministic grammars cannot be left- recursive. Then 7 = SSl and C4+0,lg(a0)) = (A'+p'Ma'p')) = (5 + 55,, 2).
13.3 CHARACTERIZATION OF LR(O) LANGUAGES 523 But this is a contradiction of (v). CASE 1(b). (2)7^55l We have, for i such that -l<*'<«-2, a,, 0, eV*,Wl GZ*,A,Al G7V, Sjf SSrfn-i ■ ■ ■ yng S<MiW,.y„_.- ■ ■ ■ yn = aAwgaftw = yw = Sa^lWlyn.i---yn Also, for/such that- \<j<m-2,u[,&[ 6F*,x, &V*,A',A[ G7V, ]?SSiym-j • • • ym ]t foiAiXrfm-j ■■■ym = aAxgafrx = yw = Sa[$[xiy'm-r--y'm Now, for 71 £F*, let 7! = 7<te(?>~1>) that is, y without the leading S. Then, for some w[ G S *, since lg(a'0') > lg(7), 5i if a^.Wi ^aiBiWi = 7iWi Si jta[A[xi jta[f}[xi — y\w[ Qearly, (0)u>i = (0)w[. Thus, since d is LR(0), we have (A, ^01,lg(a,01)) = (ylj -*Ui,lg(a;Ui)). Now A=AU A'=A[, 0 = 0i, 0' = 0i, and a, =5a, a,'=5a'. Thus, (.4-► 0, lg(a0)) = (-4' -+0', lg(a'0')). contradicting (v). CASE 2. (1)7 =£ 5. Again, we have two subcases. Case 2(a). Suppose (1)7 = 50. We know that, for some 5 G V£, we have So |f S05. But, since L0 is strict deterministic, as before, then we must have S0 it So- Thus, we have y = S0, and (,4+0,lg(a0)) = (A'+p'Ma'p')) = (5^50,1), which contradicts (v). CASE 2(b). Suppose wy=tS0. Then, for some a,, 0iGF*, w, GS*, ,41 G TV, we have 5^50yi •••j'njfcMiW,.y1 •••yn5>ai0ivv1y1 •••>>„ Also, for some a(, 0| G V*,xx &!,* ,A[ GW, we have S^-S0yi ■■■y„galAlxlyl ■ ■ ■ymga^lxlyl ■ ■ ■ yn We have, for some w[ G £ *, since lg(aj 01) > lg(a! 0!). Sai?<XiAtWt z>atBtWt = ywi o^<Mi*i ^ai0i*i = ywi Qearly, (0)w, = (0)w{. Since G0 is Z,tf(0), we have 04, ^01,lg(a,01)) = (>!,' -*Ui,Ig(«i'Ui')), giving us a contradiction of (v). □
524 LR(k) GRAMMARS AND LANGUAGES Condition (b) of the LR(0) characterization theorem will prove most useful in checking whether or not a language is LR(0). The following corollary to characterization (b) will be particularly useful. Corollary Suppose L£ £* is an LR(Q) language. For u>G£*, xG£+, if w G Z. and wx EL, then wxx GI. As an example of the utility of this corollary, consider the language L = {A, a). We could prove directly that L is not an LR(0) language, but the use of the corollary makes it even easier. Choose w = A and x = a. If L were LR(0), then aa would have to be in L, which is a contradiction. The next theorem shows us that the factorization of an LR(Q) language, of the form given in the LR(0) language-characterization theorem, is unique. Recall that a language L £ S* is degenerate if L = 0 or L = {A}. Theorem 13.3.2 {Unique Factorization of LR(0) Languages) Let L = L0L* be a nonempty LR(0) language, where Z-o, Z-i are strict deterministic languages. If there are two strict deterministic languages L'0, L[ such that L = L'0(L[)*, then L0 = L'0 and either 0 L, =L[ or ii) Li,L[ are degenerate. Proof For the sake of a contradiction, we assume that there exist strict deterministic languages L0, Z-o, Lx, L[ such that L0L* = L'oL[*, where L0 ^Lq orLx ¥=L[ andZ-i andZ-J are not degenerate. CASE 1 L0¥=Lq. We assume, without loss of generality, that there exists some x E S* such thatx£Z,J but x £ L0.This is possible since L ¥= 0 implies L0 =£0. Since x EL'o, we have x EL'0(L[)*. Thus,* GZ. and hence x EL0L*. Then, for some x0 G£*, xi G£+, we have x - x0Xi, where x0 EL0 and*i GZ,*. Sincex0 GZ,0, we know x0EL0L*; therefore, x0EL. Now, we know that x0 EL'0(L[)*. Clearly x0 £ Lq, since x0 is a proper prefix of x, x EL'o, and L'0 is prefix-free. Thus, there exist some x2 £S*,Jt3 G£+, such thatx2 EL0,x3 GZ.J*,wherex0 -x2x3. We also know that x =x0xl = x2(x3Xi)EL'0. Since x3 ¥= A, L0 is not prefix-free. But this is a contradiction. CASE 2 Z-o =Z.;. We have L=L0L* = L0L[*. We assume for the sake of contradiction, that L\ ¥=L{ and we do not have L\ and L[ degenerate. Without loss of generality, there exists a y G £+ such that y EL[, but y^Lx. Since Z.0 =£ 0, there exists some x€E* such that xEL0. Clearly, xyEL0L[*, giving us xyGL0L*. Therefore, xy =x'y', where x'EL0 and y' EL*. Clearly, x must be a prefix of x' or x' a prefix of x. Since Z.0 is prefix-free, we must have x =x', and thus y - y'. Therefore, we see that y EL*. Now, since y<fcLl,we know there exist yoET,* ,yx G £+ such thaty =yoy\, where y0 GZ-!. It follows that xy0 EL and, hence, xy0 EL0L'i*. Thus jcy0 =Jc'yo where x'EL0 and j>o G/.1*. Again, since L0 is prefix-free, we
13.4 LRSTYLE PARSERS 525 have x =x' and_y0 = yi>, giving us y0 GL[*. But we know_y0 ^ ^1, since, if it were, we would have y0GL[ and yoyi GZ.J, where yx =£ A, contradicting the fact that L\ is prefix-free. Now, since y0GL[*, there exist _y2G£*, y3 GX+ such that yo =72^3, wherey2 GL[. Now we have_y =y0yi = J^C^i)GZ.|. But.y3yi =£A0, and j>2 G Z, J. But this contradicts the fact that L J is prefix-free. □ We conclude this section with an example of the use of the LR(0) characterization theorem to show us immediately that a given language which is not strict deterministic is LR(0). We shall show that the semi-Dyck language is contained in the class of LR(0) languages. Theorem 13.3.3 Let Dr£ S* be the semi-Dyck language for some r> 1. Then Dr is an LR(0) language, but not a strict deterministic language. Proof Dr = D is deterministic, by Theorem 10.4.1. Suppose that, for x G £+, w, y G £*, we have wGD, wx GD, and y GD. By Proposition 10.4.2(e), we have x GD. Since y GD and x GD, we have yx GD, by Proposition 10.4.2(a). By (b) of the LR(Q) language characterization theorem, D is an LR(0) language. Since a.\a\ GD andtfitfitfitfi GD,D\%xvo\ strict deterministic, since it is not prefix-free. □ PROBLEMS All of these problems refer to concepts introduced in the problems of Section 13.2. 1 Show that L is an ALR(O) language if and only if L is strict deterministic. 2 Show that L is a %LR(0) language if and only if L is an LR(0) language. 3 Show that an LR(0) grammar can have at most one pathological production. 4 Work out the closure properties for the LR(0) languages, as was done for the families Af) where 0 < i < 2, in Problems 3 through 47 of Section 11.5. 5 Prove the following interesting result. Theorem There is an algorithm to decide whether a deterministic language is LR(0) if and only if there is an algorithm to decide whether two deterministic context- free languages are equal. Hint for the only if direction: Let L\, L2 £ S* be two deterministic context-free languages, and let C\, c2, C3, $ be four new symbols not in S. Consider the following L = c,(Z,,$)*C3(Z.2$)* Uc2(Z,2$)*c3(Z,,$)*. 6 Prove that there is an algorithm to decide whether a A! language isLR(O) if and only if there is an algorithm to decide whether two deterministic context-free languages are equal. 13.4 LRSTYLE PARSERS A parser that has many desirable practical features will now be described. The operation will be precisely specified and will depend on some special stack symbols,
526 LR(k) GRAMMARS AND LANGUAGES which are traditionally called "tables." In subsequent subsections, the existence of such tables will be derived for LR(k) grammars. An "LR-style parser" has the following structure: There will be an input tape, an output tape, a pushdown stack, and some mechanism for "looking ahead" k characters on the input string. The pushdown symbols are alternately symbols of the grammar being parsed and certain tables. The action of the parser will be determined by the table at the top of the pushdown stack and the lookahead symbols. The table on top of the pushdown stack represents all the productions upon which the device may be working at a given stage of the parse. These tables determine whether to shift or reduce with the given lookahead, and if we are to reduce, what to "pop" from the top of the pushdown, and a choice of nonterminals to which we reduce. After we have popped the given number of symbols from the pushdown, the table appearing on top of the pushdown will uniquely tell us what reduction to make. The suitable nonterminal will then be placed on top of the pushdown, followed by the new appropriate table; and the number of the production being reduced by will be printed on the output tape (see Fig. 13.3). The parser will accept a string by halting in a special accept state with the bottom-of-stack table on the pushdown, and with empty input tape. The parser will reject strings by halting in a special error state. Such a parser operates "properly" on a given grammar if, for every string over the alphabet over which the grammar is defined, the parser halts. Moreover, it must halt in the error state given a string not generated by the grammar. For a string generated by the grammar, the parser must halt in the accept state with the reversed canonical derivation of the string on the output tape. The device shown in Fig. 13.3 is a type of deterministic pushdown "transducer" whose moves depend on the tables that are supplied. The simplest way to formalize the parser is to begin with an instantaneous description (or ID). An ID of the parser is written as (7, z, p), where Input tape Pushdown stack Tables (are shaded) Symbols in V -»indicates head direction FIG. 13.3 An LR style parser.
13.4 Lfl-STYLE PARSERS 527 i) z £ 2* is the contents of the (as yet unread) input. ii) 7 G To(VT)* is the contents of the pushdown store. The "top" is assumed to be at the right. T0 is a special "initial table." iii) p &P* is the contents of the output tape, the partial parse. The initial ID is defined to be (T0, z, A). In addition, there are two special ID's error and accept. We are now ready to give a more formal description of the functions that control the parser. Definition 13.4.1 Let G = (V, 2, P, S) be a reduced context-free grammar andfc^O.Let J7~ be a set of tables.t Define two functions /andg as follows: 1 /, the parsing action function, is a mapt from S~y> 2^ into {shift, error} U {reduces|ttGP}; 2 g, the goto function, is a map from J^x V into y U {error}. Example Assume k = 1. Let G be the grammar whose numbered productions are shown below: 1. 2. 3. 4. 5. S-*aAd S-*bAB A -*cA A -*c B-*d For this grammar, y= {T0,. Table 13.4.1. TABLE 13.4.1 The / and g functions* T9} and the functions / and g are given in 7\) Ti T2 T3 T* Ts To T, T& T9 * A / a b c shift shift shift shift shift l blank denotes error. d A reduce 4 shift shift reduce 3 reduce 1 reduce 2 reduce 5 a Ti b T2 g c d T3 T3 T3 T6 T9 A T4 Ts Ta B Tn S t The construction of tables that may be used by such a parser will be given later. * Recall that XkA = {x <E 2* |lg(x) < k}.
528 LR(k) GRAMMARS AND LANGUAGES It is often convenient to express the goto function by a directed graph in which each node represents a table or "state". In our running example, we have the following diagram: The goto graph for our parser Now we shall indicate how the parsing device functions, by defining a move relation, \~, on ID's. The device is a pushdown-like automaton whose move will depend deterministically on its "state" (here, there is only one state), the first ^letters of the input, and the topmost stack symbol. Definition Suppose the parser is in ID (yT, z,p), where TG Sr, z G 2*, and Pep*. 1 If f(T, (fe)z) = shift then a) if z = A then (7 T, z, p) \-error b) if z ¥= A then we write z = iz',662,z'G2*. Moreover, (bi)lfg(T, b) = error then (yT, bz', p) herror. (b2) If g(T, b) * error then (yT, bz',p) \-(yTbg(T, b),z', p). That is, in step (b2), we shift the next character of the input onto the stack and also stack the table determined by the topmost symbol and the input. 2 If f(T, (fe)z) = reduce it where it is A ->0 then we wish to "pop" 2 lg(j3) symbols. When we do so, let T' be the table we "uncover." More precisely, define yT= y'T'y", where lgfr") = 2 lg(/3). a) If T = T0, S =A,and z = A,then (yT, z, p) \-(T0, A, pir) h accept. b) If g(T', A) = error then (yT, z, p) herror. c) If neither case (a) nor case (b) holds, then (y'T'y",z,p)\-(y'T'Ag(T',A),pir). In case 2(c), we remove the coded form of 0, and replace it by its immediate ancestor A. The appropriate table is computed and stacked above A; the production used is added to the output.
13.4 L/?-STYLE PARSERS 529 3 If f(T, (fe)z) = error, then (yT, z, p) herror. Finally, let f— (f— ) be the (reflexive) transitive closure of f—. The action of our sample parser on the following examples may be helpful. Let us use the parser on the string bccd, which is in L(G). (T0, bccd, A) h (T0bT2, ccd, A) \-(T0bT2cT3, cd, A) \-(T0bT2cT3cT3, d, A) \-(T0bT2cT3ATs,d,4) \-(T0bT2AT5,d, 43) \-(T0bT2AT5dT9, A, 43) \-(J0bT2ATsBTn, A, 435) h(r0,A,4352) \— accept Since we have reached accept, a rightmost derivation of bccd is given by 2534, as can be seen by the following tree in Fig. 13.4. Now, let us try to use the parser on a nonsentence, say acb. We have the computation (T0,acb, A)\-(T0aTucb, A) \-(T0aTlCT3,b,A) \— error because f(T3, b) = error. This kind of parser has a number of attractive features, but there is a great deal that we do not know. Where did these tables come from? When and how can they be constructed? We shall learn how to construct such tables shortly. To do so, certain functions will be helpful. Definition Let G = (V, 2, P, S) be a reduced context-free grammar. For a, pev*, define
530 LR{k) GRAMMARS AND LANGUAGES FIRSTfe(a) = {xG?,\\a^y for some y G 2* andx = my\ and ■ * FOLLOWfe(0) = {jc G T,hK \S => dfa for some 0 G V* and x G FIRSTS)}. In our running example, it is easy to compute FIRST! and FOLLOWi although, in general, it takes some effort. Finding efficient algorithms for these tasks is left for the problems at the end of this section. Example FIRSTi (A) = {c} and FIRSTS) = {d} FOLLOW! 04) = {d} in our running example. PROBLEMS 1 Let G = (V, 2, P, S) be a reduced context-free grammar such that L(G) # 0. Show that: a) foraG V*, FIRSTfe(a) = {x£S\la^ for some^ G 2* andx = (feV); b)for^GAT, FOLLOWfe(^) = {x &?,\\S => aAy where y G 2 * and x = my}. 2 Derive an efficient algorithm for the computation of FIRSTfe(a) and FOLLOWfe(a). Is it easier to determine whether some string belongs to these sets than to generate the sets? 3 The following function appears in the literature on Li? parsing. For aG V*, define EFFfe(a) = {w|w G FIRSTfe(a) and there is a derivation aJ?0i?wx where (!J=Awx for alM GAf}. Interpret the definition of EFF in terms of generation trees and then give an efficient algorithm for computing EFF. 13. 5 CONSTRUCTING SETS OF LR(k) ITEMS We now begin to construct certain sets which will be necessary in forming the tables that drive our parser. Definition Let G — (V, 2, P, S) be a reduced context-free grammar, and let k>0. An LR(k) item for G is a pair (A ->0! • 02) u) where A GN, 0i, 02 G V*, u G FOLLOWS), and A -> 0!02 is in P. The intuition behind this definition is clear. It means that, in forming our tree, we have matched up to the dot and our lookahead is u. The situation is illustrated in Fig. 13.5. An algorithm is now given for constructing sets of items. This construction will build sets that are reminiscent of the entries in the recognition matrices of Section 12.6.
13.5 CONSTRUCTING SETS OF Lft(k) ITEMS 531 Algorithm 13.5.1 Input G = (V, 2, P, 5), a reduced context-free grammar, k> 0, and 7 G F*. Output We define the output to be a set Vk(y)J Method 1 To construct Vk(A), a) If, for some a G V*, S -> a is in P, add (S -> • a, A) to Ffe(A). b)If, for some .4, BGN, a, j3G K*. «G2*, we have (A -*-Ba, u) in Ffe(A), andfi -> 0 is in P, then, for all jc such thatt x G FIRSTfe(ow), add (5->-0,x)toFfe(A). c) Repeat (b) until nothing new can be added to Vk(A). 2 To construct Vk (Xi • • • X,) for some i > 1, Xj G V for 1</< i. a) If, for some a, 0 G F*,,4 GJV,wG2*, (yl-► a-JT^, ii) is in Kfc(JT! •••X,-1), then add (A -*• aX, • (3, ii) to V&Xi • • • Xt). b)If, for some A, BGN, a, 0, 6 G F\ h G2*, U ->a-50, h) has been placed in Ffe(X! • • • X,) and fi -> 6 is in P, then, for all x G FIRSTfe(0H), addCtf-fr'S.^toKfcCA'i ■ ■ ■ Xt). c) Repeat (b) until nothing new can be added to Vk (A^ • • • X;). The algorithm clearly halts, since, for given k and G, only a finite number of LR(k) items exist. We now work out an example, starting with the running example of Section 13.4. First, we compute / \(S-*-aAd,A) Vi(A): ((S -> • bAB, A) * Note that we write V^Qy) when the grammar in question is clear, as an abbreviation for V0ik(j). t FIRST is defined in Section 13.4.
532 LR(k) GRAMMARS AND LANGUAGES We obtain Vi{a) by applying part 2(a) of Algorithm 13.5.1 to ^(A). This adds (S-*a-Ad, A) to Vi(a), and allows us to use part 2(b), to add (A -> • cA, d) and (A->-c,d) This completes the construction of Vi(a). The reader should check that: (A -> c • A, d) (A-*c- ,d) Vi (ac) is (A -*• • cA, d) (A-*-c, d) We now study the set of items generated by Algorithm 13.5.1. Our first lemma shows how having an item in a set of items corresponds to "moving across the righthand side of a production." Lemma 13.5.1 Let G = (V, 2, P, S) be a reduced context-free grammar and k > 0. Then, for A GAT, a, fa, fa, y G V*, u G FOLLOWfe(,4), (,4 -> • fa fa, u) G Ffe (a), where aft = 7 if and only if (y4-► Pi •&,«)€ Kfc(7). Proof Assume that (A-*'fafa,u)eVk(a). We use induction on lg(|3i). fiasw. Let lg(0!) = 0. Clearly, (A->fa-fa,u) = (A->-fa,u)eVk(a) = ^(afr). Induction step. Assume that the lemma is true in this direction for lg03O = £ > O.We show that it is true for lg(0i) = £ + 1. Let fa = $[X, where fa' G V*, 16 V. By our induction hypothesis, (y4-► Pi •*&,«)€ Kfc(o0l). Therefore,^->/3;X-/32,«)GFfe(a0;Ar)= Ffe(a01) by 2(a) of Algorithm 13.5.1. Now assume, conversely, that {A -*■ fa • fa, u)GVk(y). Again induct on lg(0i). Basis. lg(/3i) = 0. Clearly, (A->'fafa,u) = (A-*-fa,u)eVk(y) = Vh(a), where a0! = 7.
13.5 CONSTRUCTING SETS OF LRik) ITEMS 533 Induction step. Assume that the lemma is true in this direction for lg(0O = £>O. We show that it is true for lg(/31) = K+l. Let 0! = &\X, where P[GV*,XGV. Since 0|X¥= A, clearly the item (A ->0! • 02, «) had to be added to Vk(y) in Step 2(a) of Algorithm 13.5.1. Thus, for some y' G V* such that y = y'X, we haVC (A-*t$[-X(i2,u)eVk(y'). Now, by our induction hypothesis, there exists some a£K* such that (A -*■ • 0i02 >") G Vk{a), where a0| = 7'. Therefore, a$[X = a^ = y'X = 7. □ The following theorem shows that if, for one item in a set of items, 0i precedes the dot, and for another item in this set of items, ft precedes the dot, then either 0i is a suffix of/3( or 0j is a suffix of 0!. Theorem 13.5.1 Let G = (V, 2, P, S) be a reduced context-free grammar and k>0. Then, if ,4, ,4'GJV, 7, ft, ft' G F+, Ig0?l)>lg0?i), ft, 02GF*, wG FOLLOWfeU), w' G FOLLOW,^'), and (,4->ft-ft,w) O4'->0i-02,w') are in Ffe(7), then there exists an a G V* such that ft = aft. iVoo/ Since (,4 ->ft • ft, w)G Ffe(7), by Lemma 13.5.1, there exists some 7' G V* such that 7 = 7'ft and U->-ftft,w)isinFfe(7'). Since (A' -> ft" -j32, w') is in Vk(y), by Lemma 13.5.1, there exists some 7" G V* such that 7 = 7 "01 and G4'->-0i02,w')isinFfe(7"). Since 7'0! =7"0i andlg(0i)>lg(ft), we let a = (teOS.Vigtf,)^ as shown in Fig. 13.6. Clearly, 0l = a0l4 □ Since there are only a finite number of possible LR(k) items for any fixed grammar, there can be only a finite number of sets of items. Therefore, to compute the collection of distinct sets of items, we could systematically compute Vk(A), Vk(y) for all 7 such that lg(7) = 1, Vk{y) for all 7 such that lg(7) =2 Vk{y) for all 7 such that lg(7) = £,... This algorithm will eventually terminate, since, for some £, we will eventually get no new sets of items. The algorithm we have just described is somewhat wasteful, since it will necessitate computing many empty sets of items. One can avoid computing many such sets of items by building up only the collection of nonempty sets of items. , 1 y 1 a y" Pi Pi FIG. 13.6
534 LR{k) GRAMMARS AND LANGUAGES Algorithm 13.5.2 Input Reduced context-free grammar G = (V, 2,P, S) and k > 0. Output We define the output to be^g = {Vk(7) # 0I7 G F* }. Method Initially,^ is empty. 1 If Fft(A) ^=0, place Vk(A) mYk. The set Vk(A) is initially unmarked. 2 If a set of LR(k) items Ffe(7) inJ^ is unmarked, a) Compute Vk(yX) for each XG K. If Vk(yX) #0, add Ffe(7X) to ^ as an unmarked set of items if it is not already there. b)Mark Vk(y). 3 Repeat Step (2) until all sets of items in^ are marked. Again, the algorithm is guaranteed to halt since we will eventually get no new sets of items. We now use Algorithm 13.5.2 to compute all of the sets of items for our running example. Example There are ten nonempty sets of items which are as follows: \(S-*-bAB,A)j ((S-*a-Ad,A)} = \(A-*-cA,d) {(A-*-c,d) J ((S-*b-AB,A)} = I (A -> • cA, d) I {(A-*'C,d) J Vi(ac) = Vi(aA) = VdbA) = ViiaAd) = ViibAB) = V^acA) = Vt(bAd) = (A-*cA,d) (A->c-,d) (A-*-cA,d) (A-*-c,d) {(S->aA-d,A)} (S-*bA-B,A)\ (B-*-d,A) ] {(S->aAd-,A)} {(S -> bAB •, A)} {(A->cA-,d)} {(B->d-,A)}
13.5 CONSTRUCTING SETS OF LR(k) ITEMS 535 (S — a-Ad,A) (A--cA,d) (A--c,d) (S-a-4-rf,A) (S-a4d-,A) (S — -aAd,K) (S — -bAB,A) (A~-c-A,d) (A - c; d) (A^-cA,d) (A - .c, d) (A~*cA-,d) (S — b'AB,A) (A~--cA,d) (A — -c, d) FIG. 13.7 It is an aid to our intuition to regard each set of items as a "state," and to draw a state graph. A state Vk(y) leads to state Vk(yX) under input XG V. In our running example, we have the graph shown in Fig. 13.7. The reader should compare Fig. 13.7 with the goto graph in Section 13.4. We now must show that Algorithm 13.5.2 generates the desired output. Theorem 13.5.2 Let G = (V, 2, P, 5) be a reduced, context-free grammar and k > 0. Let J?k be the collection of sets of items for G computed by Algorithm 13.5.2. Then ^k = {Vk(.y)*t>\yev*}. Proof We first show that ^k £ {Vk (7) # 017 G V* }. Suppose / G ^h. Then, for some 7GF*, /= Vk(y). By Algorithm 13.5.2, clearly, Kfc(7)#0, and thus ie{Vk(y)¥=$\yeV*l Therefore, ^kc {Vk(y)¥=0\yGV*}. Conversely, we must show that {Vk(y)¥=0\yev*}Q Sfh. Assume that / = Vk(y) G {Vk(y') *017' G V*}. We use induction on lgfr). Basis. lg(7) = 0. Then 7= A. Since I&{Vk(y)^0\yeV*}, Ffe(A)=£0. By (1) of Algorithm 13.5.2, /G ^fe. Induction step. Assume that the theorem in this direction is true for lg(7)</. We shall show that it is true for lg(7) = i+ 1. Let 7 = y'X, where X&V.
536 LR{k) GRAMMARS AND LANGUAGES Now, we know that Vk(y) # 0. From Algorithm 13.5.1, we know that Vk(y') # 0, for otherwise we would have Vk(y) = 0. By our induction hypothesis, Vk(y')G Sfh. Therefore, (Ffe(7)^0l7GF*}c ^fe. D In order to fully understand the use of these items, we must study certain important prefixes that occur naturally. Definition Let G = (V, 2, P, S) be a context-free grammar, y G V* is said to be a viable prefix of G is there exist a, 0 G F*, A G TV, w G 2 *, such that and 7 is a prefix of a/3. We will often also use the following concept. Definition Let G = {V, 2, P, 5) be a context-free grammar. 7G F* is a valid prefix of G if there exist a, 0!, |32 G F*, ^ G TV, w G 2 *, such that and 7 = 0:0!. Thus, a valid prefix is a viable prefix that "includes all of the left context." A valid prefix corresponds to input that has already been read and reduced by the parser. Viable and valid prefixes turn out to be, in fact, equivalent concepts. The following lemma will be useful in showing the equivalence of these two concepts. The idea of the proof is that, if we look to some earlier steps in the derivation, viable prefixes turn out to be valid prefixes. Lemma 13.5.2 Let G = (F, 2, P, S) be a reduced context-free grammar. Assume that, for some A G N, a, 0,7,6 G F*, we have S^cxAw^aBw = 76 where 7 is a proper prefix of a. Then there exist A' G.N, a', ji' G V*, w' G 2*, such that * , , , , + St* a'A 'w' 7? a'B'w' j? aBw It It It where, for some 0| G V*, fa G F+, we have that /3' = 0l02 and 7 = a'/3j. Furthermore, if 5 G 2F*, then (1)02 G 2, and if 5 G 2*, we have (fe)5 G FIRSTfe(02w') for any k>0. Proof Since S £ a(3w and 7 is a proper prefix of a, we must have a # A. Therefore + S^aAw It Therefore, there exist n > 2, a,- G F* for 0 < i < n, such that S = ao^a! £"•• 'jtan-i = otAw^a&w = an Intuitively, let i be the smallest integer such that a,- and all succeeding canonical sentential forms in the above derivation have 7 as a prefix. Formally, define
13.5 CONSTRUCTING SETS OF LR{k) ITEMS 537 y a Pi PjAjWj P'l P' w' FIG. 13.8 i to be the least integer such that for all /', n — 1 >/' > i, there exist Aj G N, 0,- € V*, vv,-G2*, and a,- = yfyAjWj. Qearly, such an integer exists since n — \>\ and an_j = aAw, where lg(7) <lg(a). Since a,- = S is impossible, we see that i > 1. Therefore, for some A' GN, a', 0' G V*, w' G 2*, we have 5j* <*,-_! = aU'w'^a'0'w' = 7/3^4,-w,- = a,- j* a4w j* a0w By our minimality assumption, lg(7) > lg(a'). Now, since At is the rightmost variable in a'0'vv', we know that lg(yPiAi) < lg(a'0'). Therefore, lg(7)<lg(a'0'). Thus, there exist 0[ e F*, 02 G K+, where 13' = 0l02 such that y = a'/3l. (See Fig. 13.8.) Now assume that S G 2 F*. We shall show that (1)02 cannot be written in this derivation. It will follow that (1)02 G 2. Since we have the rightmost derivation 702w' = 7M,-w,-^70i+i^,+iW,+1 ^ •••^70n-i^n-iW„_1 ^70n-i0vvn_1 = 76 and 0;- G F+ for all / such that n — 1 > / > /, we must have u>pj = <Dft = (»A+1 = = (1)ft, (D6 Since (1)6 G 2, we have (1)02 G 2. If 5 G 2*, since we have 0jw' f- S, we must have (fe)5G FIRSTfe(02w'). D The following example illustrates the Lemma. Example Consider the grammar whose productions are S-*abA Since Sj>abA£abc and a is a prefix ofabc, we have that a is a viable prefix. However, since S^S^abA and a is a prefix of abA, we have that a is a valid prefix. Furthermore, JGS and mbc G FIRSTfe (bA) for any k > 0. The following theorem shows that valid and viable prefixes are equivalent. It will later prove extremely useful to go back and forth between these notions. Theorem 13.5.3 Let G = (F, 2, P, S) be a reduced context-free grammar. Then 7 G F* is a valid prefix of G if and only if 7 is a viable prefix of G.
538 LR{k) GRAMMARS AND LANGUAGES Proof Assume that 7 G V* is a valid prefix of G. From the definitions it follows that 7 is a viable prefix of G. Conversely, assume that 7 G V* is a viable prefix of G. Then there exist A GN,a,BG F*,w6S*, such that S^aAw^-aBw H it and 7 is a prefix of aB. CASE1. lg(7)>lg(a). Then, clearly, 7 is a valid prefix of G, by the definition. CASE 2. lg(7) < lg(a). If 7 = A, then, for some 6 G V*, Thus A is a valid prefix of G. Assume that 7 # A. By Lemma 13.5.2, we have, for some A' GN, a',B'G V*, w'G2*, Sz>aA w^apw where, for some B[ G V*, 02 G F+, we have 0' = B[B2 and 7 = a'B[. Thus, 7 is a valid prefix of G. □ The concept of a valid prefix leads to the corresponding notion of a valid LR{k) item. A valid LR(k) item for some 7GF* specifies a production, and the place in that production, at which one might be working, after reading input that has been reduced to valid prefix 7. Definition Let G — (V, 2, P, S) be a reduced context-free grammar and k>0. For a, Blt B2 &V*, w, wG2*, A GN, G4->0i -B2,u) is a valid LR(k) item for aBi if there exists a derivation in G such that SgaAwjfoBtBiW with w = (fe)w. Example Again consider the sample grammar S-*abA A -*c Since we have StStabA (S -* a • bA, A) is a valid LR(k) item for a for any k > 0. Every valid LR(k) item for a string is an LR(k) item. Moreover, every valid prefix has at least one corresponding valid LR(k) item, as the following theorem shows. Theorem 13.5.4 Let G = (V, 2, P, S) be a reduced context-free grammar. 7 G F* is a valid (viable) prefix of G if and only if there is at least one valid LR(k) item for 7.
13.5 CONSTRUCTING SETS OF LR(k) ITEMS 539 Proof The theorem follows directly from the definitions involved and Theorem 13.5.3. □ We next show how the existence of a given production in a grammar guarantees the existence of certain valid LR(k) items corresponding to that production. Lemma 13.5.3 Let G = (V, 2, P, S) be a reduced context-free grammar. Assume that, for some 0i, 02 G V*, A G.N, we have A -> 0!02 in P and h G FOLLOWfeU). Then, for some a G V*, (A -> 0! ■ 02,") is a valid £/?(£) item for a0!. Proof By Problem 13.4.1, there exist a G F*, w G 2*, such that SgaAwgaPifcw, where h G FIRSTfe (w). Therefore, (,4 -> 0! • 02, h) is a valid Ltf(fc) item for a0!. □ Our ultimate goal in this section is to show that Algorithm 13.5.1 computes the set of valid LR(k) items for any y G V*. However, we first must prove a result which shows that, if some y G V* has a valid L.R(fc) item with A preceding the dot, then there is another specified LR(k) item for y under most conditions. Lemma 13.5.4 Let G — (V, 2, P, S) be a reduced context-free grammar and fc>0. For a,0G V*,uGX*,A GN, (A-+-0, u) is a valid LR(k)item for a, where either A =£S, u¥= A, or a ^ A if and only if there exists A', B0 E.N, 0j, 02 G F*, h' G FOLLOWfe (,4') such that 1 (A' ->0| -.eo02,H') is a valid Lfl(fc) item for a. If a ¥= A, then 0l G F+ while, if a = A, thenw' = A and S =,4'; 2 there exists an n>0, B-, GN, 6,G V* for KKn, with Bn -A such that Bo -*Bl8i B\ -*B282 are inPand«GFIRSTfe(6„ •••S102h'). Proof We assume that there exist a, 0 G V*, A G TV, such that (.4 -> • 0, w) is a valid LR(k) item for a, where A # S, « # A, or a =#= A. By definition, there exists a wG 2* such that „ S^-c^4w^-a0w where u = (fe)w. Since either A # S,« # A, or a =#= A, we must have S;f a^vvjf o0w
540 LR{k) GRAMMARS AND LANGUAGES Therefore, there exist n > 2, a,- G V* for 0 < i < n — 1 such that S = aog-a! ^-••^•a„_1 = aAw^-a&w Now let /> 1 be the smallest integer such that there exist Aj&N, ft- G V*, Wj G 2 * for all / such that n — l>j>i,'we have a,- = afijAjWj. Clearly, such an integer exists since n — 1 > 1 and a„_t = a4w. Therefore, for some A' &N, a', 0' G F*, w' G 2*, we have S^-a,.! = aU'vv'j*a'0'w' = a^^jW,- = a; j*a4w j*a0w By the minimality of i, lg(a) > lg(a'), provided a # A. If a = A, then a' = A and i= 1. Now, since At is the rightmost variable in a'0'vv', we know that Ig(a0,v4,) < lg(a'0'). Therefore, lg(a)<lg(a'0'). Thus, there exist |3| G V*, fc'eT, where 0' = 0l02 and a = a'0j. (The situation is similart to Fig. 13.8.) Let u - (fe)w' and note that u' G FOLLOWfe (A'). Thus (A' ->0l • 02',h') is a valid Ltf (fc) item for a. Moreover, i)0!GF+ if a # A, and ii) 4' = Sandw = w' = A ifa = A. We now have „ a,- = a'0'w' = O0,>l,W,£-a0„_1,4„_1W„_1 = OtAw j> O0W We see that BjAjGNV* for i </' < n — 1, for otherwise we would have (1)0;- G 2 for i < / < n — 1. Therefore, (1)0„-i = (1)^4 G 2, which is a contradiction. It is necessary to examine the sequence of derivations in which the variable immediately following a is changed. We now choose i < i0 < ii < • • • < i6 = n — 1 such that 0,-. = A for 0 </' < £. We see, from our derivation, that At — (1)02. Therefore, for some B0 GN, 02 G V*, we have 02=5O02. We now consider the productions used in the steps ai.^ai.+l forO</<£. There exist 5,'. G V* for 0 </ < £ such that is in P. Since we can write a,-.+1 = a0H.1.41-.+1w1-.+1, with ^ij+lAi.+l GNV*, we must have w8',.eN. Therefore, there exist SJgK*, BjGN such'that b^Bfij for 0 </ < £. We also know that Bj — A^+l from our derivation. This gives us a sequence of productions Bo ~*^iSi Bi ~*B282 5n->0 * Replace 02 in Fig. 13.8 by 02' and also 7 by a.
13.5 CONSTRUCTING SETS OF LR{k) ITEMS 541 From our derivation, we see that u = (fe)wGFIRSTfe(5„ •••510»=FIRSTfe(5„ •••81p'2u'). Conversely, assume that there exist A', B0 EN, a, B[, & G V*, and u G FOLLOWfe (A') such that (A' -> j3[ • B0Bi, «') is a valid LR(k) item for a, and there exist n > 0, Bt G N, S; G F* for 1 < i < n, with fi„ = ,4, such that B\ -*i?2§2 Bn~>B areinPandwG FIRSTfe(5„ • • -b^W)- By a definition of a valid LR(k) item, there exists a w'G2*, a'GF*, such that a — a 'B [, and * where u — (fe)w'. Since u G FIRSTfe(6n • • • 61/32«') and G is reduced, there exists some w G 2* such that §n "" •Si&m'^vv where m = (fe)w Now we have ■S^a'fllflo&w' = oS0/3>' since a = a'B[. Because u — *V, we may write w = u w for some w" G 2*. Then the above derivation becomes S£ aBoBiw' = oBqBiu'w" =>otB8n • • -SiB^u'w" Since u G FIRSTfc(6„ • • • 61BW), it is easy to see that S=>aB0B'iw"ffaBuw" = a$w S=>a'B\B0BW%<*'B\Abn ■■■8lw'=>a'B'lAw^a'B'lBw Therefore, (A -> • B,") is a valid LR(k) item for a. □ We are now ready to prove that Algorithm 13.5.1 computes sets of valid LR(k) items. Theorem 13.5.5 Let G = (F, 2, P, 5) be a reduced context-free grammar, 7 G V*, and k > 0. An item is in Ffc(7) after application of Algorithm 13.5.1 if and only if the item is a valid LR(k) item for y. Proof By induction on lg(7).
542 LR{k) GRAMMARS AND LANGUAGES Basis. lg(7) = 0. Then 7 = A. Assume that there exist j3 G V*, A GN, u G 2a such that (A -> • ft «) is in Ffc(A) after application of Algorithm 13.5.1. CASE 1. Assume (A -> • ft h) is added to Vk(A) in Step 1(a). Then A = 5 and h = A. Clearly, (5 -> • ft A) is a valid LR(fc) item for A. CASE 2. Assume (A ->-ft h) is added to Fft(A) in Step 1(b). Then there exist n > 0, £,- GN, 6,- G F* for 1< /' < «, with p2GV*,Bn=A such that are in P and h GFIRSTfc(6n • • -S^). It follows that (S-*-B0p2, A) is added to Ffe(A) in Step 1(a). By Lemma 13.5.4, (A -> • j3,h) is a valid Li?(fc) item for A. Conversely, assume that (A -> • ft «) is a valid LR(fc) item for A. The case structure given below is from Lemma 13.5.4. The symbol a comes from Lemma 13.5.4, and a = A in the basis of the proof. CASE 1. Assume A = S, u = A, and a = A. Then, clearly, (S-> • ft u) must be added to Vk(A) in Step 1(a) of Algorithm 13.5.1. CASE 2. Assume A¥=S, w=£A, or a=£A. By Lemma 13.5.4, there exist B0 G N, fti G V* such that (S -> • B0j52, A) is a valid LR(k) item for a and there exist ann>0,fi,G#,6;GF*for 1 </<n,withfi„ =A, such that fii -*B2S2 Bn-*V are in Pand h G FIRSTfc(6„ • • • Sjftz). Therefore (^ -> • ft w) will be placed in Ffc(A)in Step 1(b) of Algorithm 13.5.1. Induction step Assume that the theorem is true for lg(7) = /. We show that it is true for lg(7) = / + 1. Consider some LR(k) item (A -* 0i • 02 »«) where A G TV, 0i, 02 £f',ue FOLLOWfc(,4). Two cases must be considered. CASE 1. ft. #= A. Let j3x = P{X, where $[ G V*, XG V. We know, from the definition of valid LR(k) items, that (A-*fii'P2, ") is a valid LR(k) item for 7 if and only if (A -> 0i • Zfti, w) is a valid LR(k) item for 7', where 7 = 7'^. By our induction hypothesis, (A -> j5\ 'Xp2, ") is a valid LR(k) item for 7' if and only if
13.5 CONSTRUCTING SETS OF LR{k) ITEMS 543 (A -*0j •XP2,u)eVk(y'). From Algorithm 13.5.1, it is clear that (A -»0j -X02,h)G Ffc(7') if and only if (A -> 0! • 02, u) G Ffc(7). Thus, 04 -* 0, • 02, h) is a valid L/?(fc) item for 7 if and only if (A ->0i •02,«)G Vk(y). CASE 2. 0i = A. Assume that (A ->• 02, w) is a valid LR(k) item for 7. Therefore, there exist A',B0 GJV,ai,02 G F*,0i G F+,H'GFOLLOWfc04') such that O4'->0| -fio02, «') is a valid LR{k) item for 7 and an «>0, BtGN, SfG F* for 1 < 1 < n, with Bn= A, such that i?i -*fi252 are inPandw G FIRSTfc(6„ • • • 6„02h')- By Case 1, O4'->0;-fio02,")eFfc(7). Therefore, {A -*■ • 02, u) is placed in Ffc(7), by Algorithm 13.5.1 in Step 2(b). Conversely, assume that (A -> • 02, u) G Ffc(7). Then there exist A', B0 GN, 02 G V*, 0i G F+, u GFOLLOWfe04') such that 04' ->0i -B0&, u')GVk(y) and there exist an n>0,B{GN, 6j;G F* for 1 <i<n,withfi„ = ,4, such that B\ -*'fi252 5n-i ~*Bn8n Bn->P are in i» and u GFIRSTfc(6„ • • • Si02h'). By Case 1, (A' -»0| -fio02, "') is a valid LR(fc) item for 7. Therefore, by Lemma 13.5.4, (A -*■ • 02, w) is a valid LR(fc) item for 7. □ The following definition is useful for dealing with the collection of sets of items valid for the valid prefixes of a grammar. Definition Let G = (F, 2, P, S) be a reduced context-free grammar and let k > 0. Then the canonical collection of sets ofLR(k) items for G is defined as {{i\i is a valid LR(k) item for 7)17 G V* is a valid prefix for G}. We now use Algorithm 13.5.2 to get Vfc. We will now show that Algorithm 13.5.2 in fact produces the canonical collection of sets of LR(k) items for G. Theorem 13.5.6 Let G = (F, 2, P, S) be a reduced context-free grammar, k > 0. Then .Vfc = the canonical collection of sets of LR(k) items for G.
544 LR(k) GRAMMARS AND LANGUAGES Proof By Theorem 13.5.2, S?k = {Vk(y) ¥=0\ye V*}. By Theorem 13.5.5, Vk(y) =£0 if and only if y has at least one valid LR(k) item, and, by Theorem 13.5.4, this is true if and only if 7 is a valid prefix. Therefore, yk = the canonical collection of sets of LR(k) items for G. PROBLEMS 1 Construct the canonical collection of sets of LR(k) items for the following grammars: a) k = OandS-*Sa\a b)k = 1 and S-*aAc A -*bAb\b c) k = 0 and S -* aAc A-*Abb\b d)k= 1 and E-*E +T | T T-> T* F\F F-*(E)\a 2 Which of the grammars in Problem 1 are LR{k)1 3 There can be a large number of elements in 5^ •# even for a small LR(k) grammar. Find an LR(0) grammar Gn of size which is polynomial in n and yet where I J?o| is exponential in n. 13.6 CONSISTENCY AND PARSER CONSTRUCTION The algorithms of the previous section may be applied to any context-free grammar, but the results may not be useful in constructing parsers. In this section, we shall introduce the important notion of consistency, which allows us to define parsers from our collection of sets of items. Then it will be shown which grammars yield parsers. It should not be a surprise to find that it is exactly the LR(k) grammars which work. In the parsers which were described in Section 13.4, all moves were made deterministically. We hope to be able to use the sets Vk(y) as the tables on the stack. In order to do so, these tables must satisfy certain conditions. For any move of our parser: 1 There must never be a conflict between a reduce and a shift move. 2 When we have a reduce move, the reduction to be performed must be uniquely determined. This leads to the following definition. Definition Let G = (V, 2, P, S) be a reduced, context-free grammar and let k > 0. A collection £? of sets of LR(k) items for G is consistent if, for all / G S*, all
13.6 CONSISTENCY AND PARSER CONSTRUCTION 545 A, A' EN, allj3>j31,i32eF*> where (1)02 gW,and all a, 0 e 2A withh£FIRST*Qk?), we have that (A->P -,u)EI and (A' -»0, 'fa,v) EI imply that (A'->^-^,v) = (A+P;u). There are several things to note in this definition. The condition (1)02 &N means that 02 = A or (1)02 E 2. The conclusion means that: i) u = v ii) A=A' iii) 0, = 0 and 02 = A Example Returning to our running example from Section 13.5, it is easy to see that J?k is consistent. In fact, only the set / / = Viiac) = • (A (A (A (A ■cA,d) ■c,d) ■ • cA, d) • • c,d) \ might cause concern. But the first item can cause no problem, since A EN. The third and fourth cannot be inconsistent with the second because both the third and fourth items have FIRSTi(02tf) = {c}and no conflict can exist. Our next task is to characterize the class of grammars which produce consistent grammars. Theorem 13.6.1 Let G = (V, 2,P,S) be a reduced context-free grammar and let k > 0. G is LR(k) if and only if the canonical collection of sets of LR(k) items for G is consistent. Proof Assume G is LR(k). Assume, for the sake of contradiction, that the canonical collection of sets of LR(k) items for G, denoted by J^, is not consistent. Then, for some IE S^, we have some A, B EN, 0, 0x,02 E V*,u,vEl,*, (1)02 £N, u E FIRSTfc(02zO such that (A-*P',u)EI and (B-*^ • p2,v)EI where (A -*■ 0*, u) ¥= (B -> 0i • 02, v). By definition of /, there exist a, a", y E V*, w, x E 2 * such that u = (fc)w, v = (fc)x and 5;? clAw j* a0w = yw 5^a"fix^a"0,02x 7(02*) We are not yet ready to employ the LR(k)-ness of the grammar to force a contradiction, since we must have 02 E 2* in order to use the LR(k) definition. We therefore divide our proof into two cases, according to whether 02 E 2* or 02 ^ 2*.
546 LR{k) GRAMMARS AND LANGUAGES CASE 1. 02 G 2 *. Now we have (fc)(02x) = (fc)(02z>) = u Since G is /,/?(£) we therefore have (A -> 0, lg(«0)) = (B -> 0,02, lg(a"0!02)). It follows that lg(a"0102) = lg(a0) = lg(7) = lg(a"01). Therefore 02 = A. Since 0 = 0i02, we have 0 = 0i. Also, v= (fc)x = (fc)(02x) = u. In addition, we have A = B. Therefore G4->0-,h) = (fl->0, .fo.p), which contradicts our hypothesis. CASE 2. 02 £ 2 *. We now rewrite 02 in a rightmost fashion until we have a string of terminals. We know that there exists an x' G 2*, p GP+ such that 02 £x' and u = ^x'x. Therefore we have S=>yP2x§-y(x'x) We consider the last step in this derivation, namely, there exist A' &N, a'GF*,0',x"G2*suchthat S=>yp2xf>ya'A'x"gya'p'x" = y(x'x) Since (1)02 $N and 02#=A, clearly (1V = (1)02 G 2, and lg(7a'0') Mgfr). Since G is LR(k), we now have G4->0,lg(7)) = G4'->0',lg(7a'0')) This is clearly a contradiction; thus the canonical collection of sets of LR(k) items for G must be consistent. Conversely, assume that Sf, the canonical collection of sets of LR(k) items for G, is consistent. Assume, for the sake of contradiction, that G is not LR(k). Then, by Lemma 13.2.3, there exist u, w, w',*G2*, 7, a, a', 0, 0' G V*, ,4, ,4'GAT such that: i) S;? aAw ;*• a0w = yw tt ft ii) 5g-aU'x^a'0'x = 7'x = 7w' iii) «w = Ww' = „ iv) (A ->0, lg(«0)) *(A'->0', lg(a'0')), with v) lg(a'/J')>lg(a0). At this point, we know that if, in fact, lg(7) > lg(a'), then derivations (i) and (ii) will give us items that will lead us to a contradiction. However, in the case where lg(7)<lg(a'), we will have to examine an earlier step in the derivation of a'ji'x to get our contradiction. CASE 1. lg(a')<lg(7). Then, for some 0|,02GF*,0'=0|02 and7 = a'0i. Let v = (fc)x Then (A -> 0 •, u) and (A' -> 0l • 02, v) are valid LR(k) items for 7 with u = (fcV = (fc)(02x) = (fc)(02z>). Since we have consistency, G4->0-,M) = {A'+e\-&,v).
13.6 CONSISTENCY AND PARSER CONSTRUCTION 547 a P a P' FIG. 13.9. Therefore, A = A', &=&[,u=v and 02 = A. Since a/3 = a'0l = a'0, we have a = a'. Therefore, (A -> 0, lg(a0)) = (A' -> 0', lg(a'0')), which contradicts (v). CASE 2. lg(7) <lg(a'). We have the situation shown in Fig. 13.9. By Lemma 13.5.2 there exist a",0" G V*, w" &H*,A" GN, such that =f a ,4 w ^ a p w z>aB x = yw ti K ti where, for some $'l GV*, 02 G F+, we have 0"=0i'02\ h GFIRSTfc(0>") and y = a"0{'. Since w' G 2*, (1)02 G 2+, from this production, we see that G4"->02'02» is a valid LR(k) item for 7 where »=(*V'and«6 FIRSTfc (02 z>). Since (A -> 0 •, u) is also a valid LR(k) item for 7, we have 04 -><?•,«) = o4"->0;-02» since Sf is consistent. Therefore, 02 = A. This, however, contradicts the quantification on 02. Therefore, G must be LR(k). □ Corollary There is an algorithm which, if given a context-free grammar G and a nonnegative integer k, decides whether or not G is LR(k). Our next step is to build an LR parser from our sets S?k. The set of tables that are necessary will consist of some new symbols to stand for the elements of S?k. To be formal, define jr= jrk = {T(J)\ 1 GVk), where T(I) is a new symbol denoting/, ^is called the canonical collection ofLR(k) tables for G. Next, we must define the /and g functions of the parser. Definition Let G = (V, 2,P, S) be an LR(k) grammar and let k>0. Let S~= JTh be the canonical collection of LR(k) tables for G and assume that S~ is consistent. Define the parsing action function /and the goto function g as follows: 1 For each TG S~M\d «6S\, define a) f(T, u) = shift if there is some0! G F*,02 G l,V* ,vG X\ and A GN such that (A -> 0j • 02, v) G T and u G FIRSTfe (j32 v); b) f(T, u) = reduce n if n is A -*j3 and (A ->0',«)isin T; c) f{T, u) = error otherwise. 2 For each T= Vk(y) G S~m.d each ,4 G V, define a)g(Vk(y),A) = Vk(yA)ifVk(yA)*k b) g(Vk(y), A) = error otherwise.
548 LR{k) GRAMMARS AND LANGUAGES It is a straightforward matter to prove the following lemma and the proof is left for the exercises. Lemma 13.6.1 The relations / and g defined in the previous definition are well defined functions if S? is consistent. Examples It is easy to see that the / and g functions in our running example of Section 13.4 were constructed by employing this definition. As another example, let k = 1 and G be given by 1. 2. S-*aS S-*a The sets of items and the g function may be "read" from the graph in Fig. 13.10. (S-*-aS,A) (S—aS,A) (S-a,A) (S--aS,A) (S-'fl,A) (S-aS,A) FIG. 13.10. The /and g functions are given by the following table: / a shift shift A reduce 2 reduce 1 a Tx S T2 Now our string of definitions and results are complete so that, given any LR(k) grammar, we can use our algorithms to produce a consistent set of LR(k) tables and an LR parser. However there is one more important task to accomplish. It must be shown that this parser works correctly. This means that any string in the language is accepted and, moreover, that any nonsentence is rejected in a finite time (i.e., the parser halts). A detailed analysis of the actions of the parser on nonsentences is useful for error recovery work.
We now show that, given a grammar G = (V, Σ, P, S), the canonical LR(k) parser for G does the following:

1. Given a string x ∈ L(G), the parser halts in the accept state and outputs the parse for x in G.
2. Given a string x ∉ L(G), the parser halts in the error state.

The next theorem proves that our parser accepts strings in the language and outputs the correct parse.

Theorem 13.6.2 Let G = (V, Σ, P, S) be an LR(k) grammar, k ≥ 0. Then, in the canonical LR(k) parser for G with tables 𝒯ₖ, for all x ∈ L(G),

    (T₀, x, Λ) ⊢* (T₀, Λ, ρ) ⊢ accept,

where S ⇒ρ x.

Proof Let x ∈ L(G). Then there exist n ≥ 0; αᵢ ∈ V*, wᵢ ∈ Σ* for 0 ≤ i ≤ n+1; πᵢ ∈ P, Aᵢ ∈ N for 0 ≤ i ≤ n; and α_{n+1} = A_{n+1} = Λ, such that there is a unique derivation of x in G:

    S = α₀A₀w₀ ⇒π₀ α₀β₀w₀ = α₁A₁w₁ ⇒π₁ α₁β₁w₁ = α₂A₂w₂ ⇒π₂ ⋯ ⇒πₙ αₙβₙwₙ = α_{n+1}A_{n+1}w_{n+1} = x.

Claim 1 For 0 ≤ i ≤ n, lg(α_{i+1}A_{i+1}) ≤ lg(αᵢβᵢ).

Proof This is clear from the fact that A_{i+1} is the rightmost variable in αᵢβᵢwᵢ.

Claim 1 allows us to do the following. For the αᵢβᵢ, 0 ≤ i ≤ n, we order the prefixes of the αᵢβᵢ as follows:

    Row n+1   α_{n+1}A_{n+1}, α_{n+1}A_{n+1}(1)w_{n+1}, …, (lg(αₙβₙ))(α_{n+1}A_{n+1}w_{n+1})
    Row n     αₙAₙ, αₙAₙ(1)wₙ, …, (lg(α_{n−1}β_{n−1}))(αₙAₙwₙ)
     ⋮
    Row 1     α₁A₁, α₁A₁(1)w₁, …, α₁A₁w₁
    Row 0     α₀A₀ = S

We index each row from zero, and we associate the pair (ℓ, m) with the m-th element in row ℓ. Next, we impose a linear ordering on these elements, defined by the order in which they have been listed, indexed from zero.

We now prepare for an induction proof by defining SP, IP, and OP, which stand for the stack part, input part, and output part, as follows. For 0 ≤ i ≤ t, where the elements ordered above are indexed 0 through t, let SP(i) be the i-th element in the ordering. If the pair (ℓ, m) is associated with this element, then

    SP(i) = α_ℓA_ℓ(m)w_ℓ.
Also let

    IP(i) = w_ℓ^(lg(w_ℓ)−m),   OP(i) = πₙ⋯π_ℓ.

We now make a very strong induction hypothesis, which gives us the condition of every component of our parser at any given stage of the parse.

Claim 2 After i steps, 0 ≤ i ≤ t, the configuration of the parser is

    (T₀X₁T₁X₂T₂⋯XₛTₛ, y, ρ),

where x = x′y for some x′, y ∈ Σ*; Tⱼ ∈ 𝒯ₖ for 0 ≤ j ≤ s; Xⱼ ∈ V for 1 ≤ j ≤ s; ρ ∈ P*; and

1 X₁⋯Xₛ = SP(i),
2 y = IP(i),
3 ρ = OP(i),
4 Tⱼ = T(Vₖ(X₁⋯Xⱼ)) for 1 ≤ j ≤ s.

Proof Basis, i = 0. The initial ID of our parser is (T₀, x, Λ). We see that

1 Λ = SP(0),
2 y = x = IP(0),
3 Λ = OP(0), and
4 condition (4) is vacuously satisfied.

Induction step. Assume that, for i = r, (1), (2), (3), and (4) hold. Now consider i = r + 1. After r moves we have, by our induction hypothesis,

    (T₀, x, Λ) ⊢r (T₀X₁T₁X₂T₂⋯XₛTₛ, y, ρ),

where x = x′y with x′, y ∈ Σ*; Tⱼ ∈ 𝒯ₖ for 0 ≤ j ≤ s; Xⱼ ∈ V for 1 ≤ j ≤ s; ρ ∈ P*; and

1 X₁⋯Xₛ = SP(r),
2 y = IP(r),
3 ρ = OP(r),
4 Tⱼ = T(Vₖ(X₁⋯Xⱼ)) for 1 ≤ j ≤ s.

Assume that (ℓ, m) is associated with element r. By definition, we have

    SP(r) = α_ℓA_ℓ(m)w_ℓ,   IP(r) = w_ℓ^(lg(w_ℓ)−m),   OP(r) = πₙ⋯π_ℓ.

We now have two cases, one of which will amount to a shift move and the other to a reduce move.

CASE 1. lg(SP(r)) < lg(α_{ℓ−1}β_{ℓ−1}). We know that

    S ⇒* α_ℓA_ℓw_ℓ = α_{ℓ−1}β_{ℓ−1}w_{ℓ−1} = (SP(r))y,
where SP(r) is a proper prefix of α_{ℓ−1}β_{ℓ−1}. We claim that there exist A′ ∈ N, α′, β′ ∈ V*, w′ ∈ Σ* such that

    S ⇒* α′A′w′ ⇒ α′β′w′ ⇒* α_{ℓ−1}β_{ℓ−1}w_{ℓ−1},

where, for some β₁′ ∈ V*, β₂′ ∈ V⁺, we have β′ = β₁′β₂′ and SP(r) = α′β₁′. If SP(r) is not a proper prefix of α_{ℓ−1}, then this is immediate. Otherwise, it follows directly from Lemma 13.5.2. Therefore, (A′ → β₁′·β₂′, (k)w′) is a valid LR(k) item for SP(r). Furthermore, by Lemma 13.5.2, (1)β₂′ ∈ Σ and (k)y ∈ FIRSTₖ(β₂′w′). It follows that f_{Tₛ}((k)y) = shift.

Clearly, α_ℓA_ℓ(m+1)w_ℓ is a valid prefix of G. It follows from Lemma 13.6.1 that the next move of our parser is well defined. Therefore

    (T₀X₁T₁X₂T₂⋯XₛTₛ, y, ρ) ⊢ (T₀X₁T₁X₂T₂⋯XₛTₛ((1)y)T(Vₖ(X₁⋯Xₛ(1)y)), y^(lg(y)−1), ρ).

Clearly,

1 X₁⋯Xₛ((1)y) = SP(r+1),
2 y^(lg(y)−1) = IP(r+1),
3 ρ = OP(r+1),
4 Tⱼ = T(Vₖ(X₁⋯Xⱼ)) for 1 ≤ j ≤ s+1, where X_{s+1} = (1)y.

CASE 2. lg(SP(r)) = lg(α_{ℓ−1}β_{ℓ−1}). We have

    S ⇒* α_{ℓ−1}A_{ℓ−1}w_{ℓ−1} ⇒ α_{ℓ−1}β_{ℓ−1}w_{ℓ−1} = α_ℓA_ℓw_ℓ ⇒* x.

Since Tₛ = T(Vₖ(X₁⋯Xₛ)), we have

    (A_{ℓ−1} → β_{ℓ−1}·, (k)w_{ℓ−1}) ∈ Vₖ(X₁⋯Xₛ).

Therefore f_{Tₛ}((k)y) = reduce π, where π = (A_{ℓ−1} → β_{ℓ−1}). Clearly, α_{ℓ−1}β_{ℓ−1} is a valid prefix of G. Thus, by Lemma 13.6.1, we have the well-defined move

    (T₀X₁T₁X₂T₂⋯XₛTₛ, y, ρ) ⊢ (T₀X₁T₁X₂T₂⋯X_{s−lg(β_{ℓ−1})}T_{s−lg(β_{ℓ−1})}A_{ℓ−1}T′, y, ρπ),

where T′ = T(Vₖ(X₁⋯X_{s−lg(β_{ℓ−1})}A_{ℓ−1})). Therefore we have

1 X₁⋯X_{s−lg(β_{ℓ−1})}A_{ℓ−1} = SP(r+1),
2 y = IP(r+1),
3 ρπ = OP(r+1),
4 Tⱼ = T(Vₖ(X₁⋯Xⱼ)) for 1 ≤ j ≤ s − lg(β_{ℓ−1}) + 1.

This completes the induction proof.

Now let i = t − 1. We have parser configuration

    (T₀X₁T₁X₂T₂⋯XₛTₛ, Λ, ρ′),

where

1 X₁⋯Xₛ = SP(t−1) = α₁A₁w₁,
2 Λ = IP(t−1),
3 ρ′ = OP(t−1) = πₙ⋯π₁,
4 Tⱼ = T(Vₖ(X₁⋯Xⱼ)) for 1 ≤ j ≤ s.

We know, by our induction proof, that

    SP(t) = α₀A₀w₀ = S,   IP(t) = Λ,   OP(t) = πₙ⋯π₀.

Since π₀ = (S → X₁⋯Xₛ) ∈ P, we have (S → X₁⋯Xₛ·, Λ) ∈ Tₛ. Therefore f_{Tₛ}(Λ) = reduce π, where π = π₀. Thus, by Lemma 13.6.1, the parser makes the well-defined move

    (T₀X₁T₁X₂T₂⋯XₛTₛ, Λ, ρ′) ⊢ (T₀, Λ, ρ′π) ⊢ accept,

where S ⇒ρ′π x. □

We now consider the behavior of our canonical LR(k) parser on an input string not in the language. We wish to show that, when the parser is given a string that is not in the language, it will halt in a special error state; moreover, the parser will halt at the earliest step possible. That is, as soon as the input read thus far, together with the given lookahead string, could not be continued to a string in the language, the parser must halt and declare error. When k = 0, we must, in addition, consider the special case in which the input string is a prefix of some string in the language.

Theorem 13.6.3 Let G = (V, Σ, P, S) be an LR(k) grammar, k ≥ 0. Then, in the canonical LR(k) parser for G with tables 𝒯ₖ, for all x ∉ L(G),

    (T₀, x, Λ) ⊢* error.

Furthermore, the parser halts at the following point in the computation.

CASE 1. Let x = x′x″, where x′ is the minimal-length prefix of x such that there exists no y ∈ Σ* with (k)y = (k)x″ and x′y ∈ L(G). Then

    (T₀, x, Λ) ⊢* (γ, x″, ρ) ⊢ error,

where the next-to-last move of the parser is not a reduce move.

CASE 2. If x ∉ L(G) and Case 1 does not hold, then, for some γ ∈ T₀(V𝒯ₖ)*, ρ ∈ P*,

    (T₀, x, Λ) ⊢* (γ, Λ, ρ) ⊢ error.

Proof We must consider two cases. In the first case, at some point the prefix of the string read thus far, together with the given lookahead, cannot be extended to any string in the language. In the second case, the input string is a prefix of some string in the language.

CASE 1. In this case, it is assumed that there exist x′, x″ ∈ Σ* such that x = x′x″, where x′ is the minimal-length prefix of x such that there exists no y ∈ Σ* with (k)y = (k)x″ and x′y ∈ L(G). We shall show that, immediately after reading the last letter of x′, the parser declares an error. Since x′ is the minimal-length prefix of
x with this property, by Theorem 13.6.2 the configuration of the parser after reading x′ (or the initial configuration, if x′ = Λ) is

    (T₀X₁T₁X₂T₂⋯XₛTₛ, x″, ρ),

where Tⱼ ∈ 𝒯ₖ for 0 ≤ j ≤ s, Xⱼ ∈ V for 1 ≤ j ≤ s, ρ ∈ P*, and, for the appropriate index i,

1 X₁⋯Xₛ = SP(i),
2 x″ = IP(i).

Assume, for the sake of contradiction, that

    (T₀X₁T₁X₂T₂⋯XₛTₛ, x″, ρ) ⊢ error

is false. Then there exist A ∈ N; β₁, β₂ ∈ V*; u ∈ Σ*, with (1)β₂ ∉ N, such that (A → β₁·β₂, u) is a valid LR(k) item for X₁⋯Xₛ with (k)x″ ∈ FIRSTₖ(β₂u). By definition, there exist α ∈ V*, w ∈ Σ* such that S ⇒* αAw ⇒ αβ₁β₂w and u = (k)w. By (1), αβ₁ ⇒* x′. Thus, for some w′ ∈ Σ* with (k)w′ = (k)x″, we have αβ₁β₂w ⇒* x′w′, so that x′w′ ∈ L(G). But this contradicts the hypothesis of Case 1.

CASE 2. We therefore assume that, for some y ∈ Σ⁺, xy ∈ L(G) and (k)y = Λ. Since y ∈ Σ⁺ and (k)y = Λ, we must have k = 0. Since xy ∈ L(G), by Theorem 13.6.2, for some γ, γ′ ∈ T₀(V𝒯ₖ)* and ρ, ρ′ ∈ P*,

    (T₀, xy, Λ) ⊢* (γ, y, ρ) ⊢* (γ′, Λ, ρ′).

Therefore, given the input x, we will have

    (T₀, x, Λ) ⊢* (γ, Λ, ρ),

and after a finite number of steps, the parser will attempt to shift. By the definition of our parser, since the input stream will be empty, the parser will halt and declare error. □

PROBLEMS

1 Prove Lemma 13.6.1.

2 Prove the following proposition.

Lemma Let G = (V, Σ, P, S) be a reduced context-free grammar and let k ≥ 0. Let I ∈ 𝒮, where 𝒮 is the canonical collection of sets of LR(k) items for G. Assume that, for some B ∈ N, β₁ ∈ V*, β₂ ∈ NV*, v ∈ FOLLOWₖ(B), we have (B → β₁·β₂, v) ∈ I, with† u ∈ EFFₖ(β₂v). Then there exist C ∈ N, δ ∈ ΣV*, u′ ∈ FOLLOWₖ(C) such that (C → ·δ, u′) ∈ I and u ∈ FIRSTₖ(δu′).

† EFF is defined in Problem 3 of Section 13.4 on page 530.
554 LR(k) GRAMMARS AND LANGUAGES 3 In the literature, one finds a different definition of consistency than the one used here. Definition Let G = {V, E,P,S) be a reduced context-free grammar and let /fc>0. A collection S? of sets of LR(k) items for G is sf- %-consistent if, for all /G Sf, A,A' &N,(S,(Sx,$i GK*,andall«,vG2^ with u G EFFfc (j32 v), we have that U->/3-,«)G7 and (A' ->02 • j33, ») G7 imply that U->/3-,«) = W'-»-02 •&,»). Prove the following result. Theorem Let G = (V,"E,P,S) be a reduced context-free grammar where k > 0. ^, the canonical collection of sets of LR(k) items for G, is s/- ^-consistent if and only if it is consistent. 4 The way in which our parser is defined in Section 13.4 is complicated. In order for our parser to halt, three conditions must be satisfied, namely: a) The input stream must be empty. b) A reduction to S must be performed immediately before the parser halts. c) After popping the stack for the final reduction, only T0 may remain on the stack. Show that each of these three conditions is necessary. 5 Fix k > 0. Analyze the time required to test a context-free grammar for the LR(k) property if the consistency is checked. *6 Find a method for testing a context-free grammar for the LR(k) property which takes time proportional to nk+2 when k is fixed and where n = \G\. 13.7 THE RELATIONSHIP BETWEEN /.^-LANGUAGES AND A0 We have seen, in Theorems 13.2.4, that every deterministic language has an LR(l) grammar. Our purpose here is to prove a converse and to characterize the deterministic context-free languages by LR grammars. To do this is not very difficult. We must merely convert the LR parser from Sections 13.4 and 13.6 into a dpda that accepts the language in question. Theorem 13.7.1 Every LR(k) language is a deterministic context-free language. Proof The idea is simply to construct a dpda from the LR parser of the previous sections. The states of the dpda will consist of pairs, the first element of which is the table on the top of the stack while the second is the lookahead string. The final states will be those where the lookahead string, concatenated with a valid prefix corresponding to the set of items, produces a word in the language. More formally, let G = (V, 2, P, S) be an LR(k) grammar. Define a dpda A = (Q, 2, T, 6, q0, T0, F) where r = kux
and 𝒯 = 𝒯ₖ is the set of LR(k) tables produced by Algorithm 13.5.1. Let

    Q = (𝒯 × (Σ ∪ {Λ})ᵏ) ∪ ({0, …, k−1} × (Σ ∪ {Λ})ᵏ)
        ∪ ((Σ ∪ {Λ})ᵏ × {tⱼ | 0 ≤ j ≤ 2p, p = lg(α), A → α is in P})
        ∪ (𝒯 × {t_{2p} | p = lg(α), A → α is in P}),

and

    q₀ = (T₀, Λ) if k = 0;   q₀ = (0, Λ) otherwise.

The final states are defined by cases. If k = 0,

    F = {(T₀, t_{2p}) | p = lg(α) and S → α is in P}.

If k ≥ 1,

    F = {(i, x) | x ∈ L(G) and 0 ≤ i ≤ k−1}
        ∪ {(T, u) | u ∈ Σ*ₖ, T = T(I), and there is some (A → α·aβ, v) ∈ I, for some α, β ∈ V*, a ∈ Σ, v ∈ (Σ ∪ {Λ})ᵏ, with u ∈ FIRST_{k+1}(aβv)}.

The δ function is defined by cases.

CASE 1. For each a ∈ Σ,

    δ((0, Λ), a, T₀) = ((T₀, a), T₀)  if k = 1,
    δ((0, Λ), a, T₀) = ((1, a), T₀)   if k > 1.

We start to fill a buffer with the lookahead string of length k ≥ 1.

CASE 2. If k > 1, then for each a ∈ Σ and each (i, x) ∈ Q with 1 ≤ i < k−1,

    δ((i, x), a, T₀) = ((i+1, xa), T₀).

CASE 3. If k > 1, then for each x ∈ Σ^(k−1) and a ∈ Σ,

    δ((k−1, x), a, T₀) = ((T₀, xa), T₀).

Now the buffer is loaded.

CASE 4. a) If† f(T, u) = shift, k ≥ 1, u = aU₂⋯Uₖ with a ∈ Σ and each Uᵢ ∈ Σ, and g(T, a) ≠ error, then

    δ((T, aU₂⋯Uₖ), b, T) = ((g(T, a), U₂⋯Uₖb), T a g(T, a))

for all b ∈ Σ.

† f and g are the parsing action and goto functions associated with G; they are defined in Section 13.6, and their effect on the parser is explained in Section 13.4.
556 LR(k) GRAMMARS AND LANGUAGES b) If k = 0 and f(T, u) = shift, then S((T, A),a,T) = ((g(T,a), A), Tag(T,a)) for any a G 2, provided g(T, a) i= error. CASE 5. If f(T, u) = reduce n, where 7r is A -> 0, then let p = lg(0) andt for any w G (2 U {A})\ a) 8([T,u),A,T) = ({u,t0),T) b) S((h, ?2l), A> r) = ((". ?2(+i). A) for each TG ^and each /, 0 < / <p. c) 5((j/,r2i+i), A, y) = ((«,r2i+2), A) for each i, 0 <i<p, and each KG K. d) If k > 1 and £(r', a) ^ error, then s((u,t2p),A,T') = (a(r',^),«),rU^r'M)). e) If fc = 0, then we have: 0 5((A,r2p),A,r') = ((r',r2p),r'), ii) «((r', r2p), A, r') = MT',A), A), T'Ag(T',A)) provided «(7"M) ^ error. The extra move here is to be able to accept when k = 0. This completes the construction. Note that the accepting condition has the property that, if the lookahead string u is concatenated to what has been read already, we have a string in the language. It can be shown that T(A) = L(G), but we shall not do so here because A is so close to the parser of Section 13.4 whose operation was proved to be correct in Section 13.6. □ The previous result can be combined with an earlier theorem to give an important characterization. Theorem 13.7.2 L is a deterministic context-free language if and only if L is an LR(k) language for some k > 0. Proof The result follows from Theorems 13.7.1 and 13.2.4. □ Now we prove a very useful theorem which shows that the LR(k) languages do not form a hierarchy with respect to k. Theorem 13.7.3 Every LR(k) language has an LR(l) grammar and hence is an LR(l) language. Proof If k = 0 or k = 1, the result is trivial. Suppose L is an LR(k) language with k> 2. Then LGA0, by Theorem 13.7.1. By Theorem 13.2.4, L is LR(l). □ ' It is intended that different states t( be used for different productions. This dependence is not shown in the notation, in order to simplify matters.
We have found an algorithm for testing whether a context-free grammar is LR(k) when k is known. If k is not known, the problem is unsolvable.

Theorem 13.7.4 There is no algorithm which can decide, of a context-free grammar G, whether or not G is LR(k) for some k.

Proof See Knuth [1965] for the original proof, or Hunt and Szymanski [1976] for a metatheorem which covers a number of other important cases. □

PROBLEMS

1 Give a formal proof that T(A) = L(G) in Theorem 13.7.1.

*2 Prove Theorem 13.7.2 by giving a direct grammatical transformation.

3 Show the following interesting result.

Theorem L is a deterministic context-free language if and only if L is a bounded-right-context language.

Hint. Show that any LR(k) grammar is equivalent to a BRC(1, k) grammar. Use Problem 21 of Section 13.2.

4 Prove the following result.

Theorem Every LR(k) language with k ≥ 1 may be given an LR(1) grammar (and hence an LR(k) grammar) in Greibach normal form.

5 Find an LR(0) language L for which there is no LR(0) grammar in Greibach form.

6 Prove the following result. Pertinent definitions and results may be found in the problems of Section 13.2.

Lemma Let G = (V, Σ, P, S) be an LLR(k) grammar, k ≥ 0. Then there exists an LR(1) grammar G′ such that L(G) = L(G′).

7 Prove the following result.

Theorem L ⊆ Σ* is a deterministic language if and only if L is an LLR language.

8 Prove the following result.

Theorem L ⊆ Σ* is an LR(0) language if and only if L is an LLR(0) language.

9 Let L ⊆ Σ* be in Δ₀. A string x ∈ Σ* is a correct prefix of L if there exists some y ∈ Σ* such that xy ∈ L. Define

    I(L) = {x | x = x′a for some a ∈ Σ, where x′ is a correct prefix of L but x is not}.

Let A be a dpda acting as a parser for L. A has the correct-prefix property if, by changing the set of final states of A, we can produce a recognizer for I(L). Show that the LR parser we have constructed in this chapter is a correct-prefix parser.
558 LR(k) GRAMMARS AND LANGUAGES 13.8 HISTORICAL SURVEY LR(k) parsing was invented by Knuth [1965]. Our treatment of the fundamental concepts in Section 13.2 is from Geller and Harrison [1977a] while Theorem 13.2.3 is due to Harrison and Havel [1974]. The characterization of LR(0) languages is also from Geller and Harrison [1977a]. The concept of LLR(k) grammars is from Lewis and Stearns [1968]. The family of BRC(2,k) grammars was introduced by Floyd [1964]. Problem 22 of Section 13.2 is from Harrison and Havel [1974]. The treatment of LR parsing is essentially the canonical subcase of the theory of characteristic parsing which is from Geller and Harrison [1977b and c]. Problem 6 of Section 13.6 is due to Hunt, Szymanski, and Ullman [1975a]. The proof of Theorem 13.7.1 is due to M.M. Geller. Problem 3 of Section 13.7 appears in Knuth [1965]; also see Graham [1971]. Problem 4 appears in Lomet [1973] and in Geller, Harrison, and Havel [1977] along with Problem 6.
References

AANDERAA, S.O., "On k-tape versus (k−1)-tape real time computation," SIAM–AMS Colloquium on Applied Mathematics, Vol. 7, Complexity of Computation (R. Karp, ed.), pp. 75–96, 1974.

AGUILAR, R., ed., Formal Languages and Programming, Amsterdam: North-Holland Publishing Co., 1976.

AHO, A.V., "Indexed grammars: an extension of the context-free grammars," Journal of the Association for Computing Machinery, 15, pp. 647–671, 1968.

AHO, A.V., "Nested-stack automata," Journal of the Association for Computing Machinery, 16, pp. 383–406, 1969.

AHO, A.V., DENNING, P.J., AND ULLMAN, J.D., "Weak and mixed-strategy precedence parsing," Journal of the Association for Computing Machinery, 19, pp. 225–243, 1972.

AHO, A.V., HOPCROFT, J.E., AND ULLMAN, J.D., "Time and tape complexity of pushdown automaton languages," Information and Control, 13, pp. 186–206, 1968.

AHO, A.V., HOPCROFT, J.E., AND ULLMAN, J.D., The Design and Analysis of Computer Algorithms, Reading, Mass.: Addison-Wesley Publishing Co., 1974.

AHO, A.V., JOHNSON, S.C., AND ULLMAN, J.D., "Deterministic parsing of ambiguous grammars," Communications of the Association for Computing Machinery, 18, pp. 441–452, 1975.

AHO, A.V., AND ULLMAN, J.D., The Theory of Parsing, Translating, and Compiling, Vols. I and II, Englewood Cliffs, N.J.: Prentice-Hall, 1972 and 1973.

AHO, A.V., AND ULLMAN, J.D., "Optimization of LR(k) parsers," Journal of Computer and System Sciences, 6, pp. 573–602, 1972.
AHO, A.V., AND ULLMAN, J.D., Principles of Compiler Design, Reading, Mass.: Addison-Wesley Publishing Co., 1977.

AHO, A.V., ULLMAN, J.D., AND HOPCROFT, J.E., "On the computational power of pushdown automata," Journal of Computer and System Sciences, 4, pp. 129–136, 1970.

ALLEN, D., "On a characterization of the nonregular set of primes," Journal of Computer and System Sciences, 1, pp. 464–467, 1968.

ALT, H., AND MEHLHORN, K., "Untere Schranken für den Platzbedarf bei der kontext-freien Analyse," Fachbereich 10, Angewandte Mathematik und Informatik der Universität des Saarlandes, September 1975.

ALTMAN, E.B., AND BANERJI, R.B., "Some problems of finite representability," Information and Control, 8, pp. 251–263, 1965.

AMAR, V., AND PUTZOLU, G., "On a family of linear grammars," Information and Control, 7, pp. 283–291, 1964.

AMAR, V., AND PUTZOLU, G., "Generalizations of regular events," Information and Control, 8, pp. 56–63, 1965.

ANGLUIN, D., "The four Russians' algorithm for boolean-matrix multiplication is optimal in its class," SIGACT NEWS, 8, pp. 29–33, 1976.

ARBIB, M.A., Theories of Abstract Automata, Englewood Cliffs, N.J.: Prentice-Hall, 1969.

ARLAZAROV, V.L., DINIC, E.A., KRONROD, M.A., AND FARADZEV, I.A., "On economical construction of the transitive closure of an oriented graph," Doklady Akad. Nauk SSSR, 194, pp. 487–488, 1970. Also see English translation in Soviet Mathematics, Vol. 11, pp. 1209–1210, 1970.

BACKUS, J.W., "The syntax and semantics of the proposed international algebraic language of the Zurich ACM–GAMM conference," Proceedings of the International Conference on Information Processing, UNESCO, Paris, June 1959.

BAKER, B.S., AND BOOK, R.V., "Reversal-bounded multipushdown machines," Information and Control, 24, pp. 231–246, 1974.

BAKER, B.S., AND BOOK, R.V., "Reversal-bounded multipushdown machines," Journal of Computer and System Sciences, 8, pp. 315–332, 1974.

BANERJI, R.B., "Phrase-structure languages, finite machines, and channel capacity," Information and Control, 6, pp. 153–162, 1963.

BAR-HILLEL, Y., Language and Information, Reading, Mass.: Addison-Wesley Publishing Co., 1964.

BAR-HILLEL, Y., PERLES, M., AND SHAMIR, E., "On formal properties of simple phrase-structure grammars," Zeitschrift für Phonetik, Sprachwissenschaft, und Kommunikationsforschung, 14, pp. 143–177, 1961.

BAR-HILLEL, Y., AND SHAMIR, E., "Finite-state languages: formal representations and adequacy problems," The Bulletin of the Research Council of Israel, 8F, pp. 155–166, 1960.

BARNES, B., "A two-way automaton with fewer states than any equivalent one-way automaton," IEEE Transactions on Computers, C-20, pp. 474–475, 1971.

BEERI, C., "An improvement on Valiant's decision procedure for equivalence of deterministic finite-turn pushdown automata," Journal of Computer and System Sciences, 10, pp. 317–339, 1975.
BEERI, C., "Two-way nested-stack automata are equivalent to two-way stack automata," Journal of Computer and System Sciences, 10, pp. 317–339, 1975.

BENNETT, J.H., "On Spectra," Ph.D. Thesis, Dept. of Mathematics, University of Michigan, 1962.

BERSTEL, J., "Contribution à l'Étude des Propriétés Arithmétiques des Langages Formels," Sc.D. Thesis, University of Paris VII, January 1972.

BERSTEL, J., "Sur la densité asymptotique des langages formels," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 345–358, Amsterdam: North-Holland Publishing Co., 1973.

BIRD, M., "The equivalence problem for deterministic two-tape automata," Journal of Computer and System Sciences, 7, pp. 218–236, 1973.

BLATTNER, M., "The unsolvability of the equality problem for sentential forms of context-free grammars," Journal of Computer and System Sciences, 7, pp. 463–468, 1973.

BLATTNER, M., "Transductions of context-free languages into sets of sentential forms," in Automata, Languages, and Programming (J. Loeckx, ed.), Vol. 14, "Lecture Notes in Computer Science," pp. 511–522, New York: Springer-Verlag, 1974.

BLATTNER, M., "Structural similarity in context-free grammars," Information and Control, 30, pp. 267–294, 1976.

BLUM, E.K., "A note on free semigroups with two generators," Bulletin of the American Mathematical Society, 71, pp. 678–679, 1965.

BLUM, M., "On the size of machines," Information and Control, 11, pp. 257–265, 1967.

BLUM, M., "A machine-independent theory of the complexity of recursive functions," Journal of the Association for Computing Machinery, 14, pp. 322–336, 1967.

BOASSON, L., "Cônes Rationnels et Familles Agréables de Langages. Application au Langage à Compteur," Thesis, University of Paris VII, 1971.

BOASSON, L., "Two iteration theorems for some families of languages," Journal of Computer and System Sciences, 7, pp. 583–596, 1973.

BOASSON, L., "Un critère de rationalité des langages algébriques," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 359–365, Amsterdam: North-Holland Publishing Co., 1973.

BOASSON, L., "Paires Itérantes et Langages Algébriques," Thesis, University of Paris VII, 1974.

BOASSON, L., "Langages algébriques, paires itérantes, et transductions rationnelles," Theoretical Computer Science, 2, pp. 209–224, 1976.

BOOK, R.V., "Time-bounded grammars and their languages," Journal of Computer and System Sciences, 5, pp. 397–429, 1971.

BOOK, R.V., "Terminal context in context-sensitive grammars," SIAM Journal on Computing, 1, pp. 20–30, 1972a.

BOOK, R.V., "On languages accepted in polynomial time," SIAM Journal on Computing, 1, pp. 281–287, 1972b.

BOOK, R.V., "On the structure of context-sensitive grammars," International Journal of Computer and Information Sciences, 2, pp. 129–139, 1973.
BOOK, R.V., AND GREIBACH, S.A., "Quasi-real-time languages," Mathematical Systems Theory, 4, pp. 97–111, 1970.

BOOK, R.V., AND GREIBACH, S.A., "Formal languages," unpublished manuscript, 1973.

BOOK, R.V., AND HARRISON, M.A., "Mutually divisible semigroups," Discrete Mathematics, 9, pp. 329–332, 1974.

BORODIN, A., "Complexity classes of recursive functions and the existence of complexity gaps," Proceedings of the First ACM Symposium on Theory of Computing, pp. 67–78, 1969.

BORODIN, A., "Computational complexity and the existence of complexity gaps," Journal of the Association for Computing Machinery, 19, pp. 158–174, 1972.

BORODIN, A., AND MUNRO, I., The Computational Complexity of Algebraic and Numeric Problems, New York: American Elsevier Publishing Co., 1975.

BOUCKAERT, M., PIROTTE, A., AND SNELLING, M., "Efficient parsing algorithms for general context-free parsers," Information Sciences, 8, pp. 1–26, 1975.

BRAFFORT, P., AND HIRSCHBERG, D., Computer Programming and Formal Systems, Amsterdam: North-Holland Publishing Co., 1963.

BRAINERD, W.S., "An analog of a theorem about context-free languages," Information and Control, 11, pp. 561–567, 1968.

BRAINERD, W.S., "Tree-generating regular systems," Information and Control, 14, pp. 217–231, 1969.

BROSGOL, B.M., "Deterministic Translation Grammars," Ph.D. Thesis, Harvard University, 1974.

CANNON, R.L., "Phrase-structure grammars generating context-free languages," Information and Control, 29, pp. 252–267, 1975.

CANTOR, D.G., "On the ambiguity problem of Backus systems," Journal of the Association for Computing Machinery, 9, pp. 477–479, 1962.

CHANDRA, A.K., AND STOCKMEYER, L.J., "Alternation," Proceedings of the 17th Annual Symposium on Foundations of Computer Science, pp. 98–108, 1976.

CHEN, K.T., FOX, R.H., AND LYNDON, R.C., "Free differential calculus: IV. The quotient groups of the lower central series," Annals of Mathematics, 68, pp. 81–95, 1958.

CHOMSKY, N., "Three models for the description of language," IRE Transactions on Information Theory, IT-2, pp. 113–124, 1956.

CHOMSKY, N., "On certain formal properties of grammars," Information and Control, 2, pp. 137–167, 1959a.

CHOMSKY, N., "A note on phrase-structure grammars," Information and Control, 2, pp. 393–395, 1959b.

CHOMSKY, N., "Context-free grammars and pushdown storage," MIT Research Lab of Electronics Quarterly Progress Report, 65, 1962.

CHOMSKY, N., "Formal properties of grammars," Handbook of Mathematical Psychology, Vol. 2 (Bush, Galanter, and Luce, eds.), New York: John Wiley, 1963.

CHOMSKY, N., Syntactic Structures, The Hague: Mouton & Co., 1964.
CHOMSKY, N., Aspects of the Theory of Syntax, Cambridge, Mass.: M.I.T. Press, 1965.

CHOMSKY, N., AND MILLER, G.A., "Finite-state languages," Information and Control, 1, pp. 91–112, 1958.

CHOMSKY, N., AND SCHUTZENBERGER, M.P., "The algebraic theory of context-free languages," in Computer Programming and Formal Systems (Braffort, P., and Hirschberg, D., eds.), Amsterdam: North-Holland, pp. 118–161, 1963.

CLIFFORD, A.H., AND PRESTON, G.B., The Algebraic Theory of Semigroups, Vols. I and II, Providence, R.I.: American Mathematical Society, 1967.

COBHAM, A., "On the base dependence of sets of numbers recognizable by finite automata," Mathematical Systems Theory, 3, pp. 186–192, 1969.

COBHAM, A., "The intrinsic computational difficulty of functions," Proceedings of the 1964 Congress for Logic, Mathematics, and Philosophy of Science, Amsterdam: North-Holland Publishing Co., 1964; pp. 24–30.

COBHAM, A., "Uniform tag sequences," Mathematical Systems Theory, 6, pp. 164–192, 1972.

COLE, S.N., "Real-time computation by iterative arrays of finite-state machines," Ph.D. Thesis, Harvard University, 1962.

COLE, S.N., "Deterministic pushdown store machines and real-time computation," Journal of the Association for Computing Machinery, 14, pp. 453–460, 1971.

COLMERAUER, A., "Total precedence relations," Journal of the Association for Computing Machinery, 17, pp. 14–30, 1970.

CONWAY, J.H., Regular Algebra and Finite Machines, London: Chapman and Hall, 1971.

COOK, S.A., "Characterizations of pushdown machines in terms of time-bounded computers," Journal of the Association for Computing Machinery, 18, pp. 4–18, 1971.

COOK, S.A., "The complexity of theorem-proving procedures," Proceedings of the Third Annual ACM Symposium on Theory of Computing, pp. 151–158, 1971.

COOK, S.A., "An observation on time-storage tradeoff," Journal of Computer and System Sciences, 9, pp. 308–316, 1974.

COURCELLE, B., "Une forme canonique pour les grammaires simples déterministes," Revue Française d'Automatique, Informatique, et Recherche Opérationnelle, Série rouge, 8, pp. 19–36, 1974.

COURCELLE, B., "Recursive schemes, algebraic trees, and deterministic languages," Proceedings of the 15th Annual Symposium on Switching and Automata Theory, pp. 52–62, 1974.

COURCELLE, B., "Sur les ensembles algébriques d'arbres et les langages déterministes," Ph.D. Thesis and IRIA Report, 1975.

CRESTIN, J.P., "Sur un langage quasi-rationnel d'ambiguïté inhérente non bornée," Thèse de 3e cycle, Faculty of Science, University of Paris, 1969.

CRESTIN, J.P., "Un langage non ambigu dont le carré est d'ambiguïté non bornée," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 377–390, Amsterdam: North-Holland Publishing Co., 1973.
CRESTIN, J.P., "Structure des grammaires d'ambiguïté bornée," Journal of Computer and System Sciences, 8, pp. 36–40, 1974.

CUDIA, D.F., "General problems of formal grammars," Journal of the Association for Computing Machinery, 17, pp. 31–43, 1970.

CULIK, K., AND COHEN, R., "LR-regular grammars: an extension of LR(k) grammars," Journal of Computer and System Sciences, 7, pp. 66–96, 1973.

DeREMER, F., "Practical Translators for LR(k) Languages," Ph.D. Thesis and Project MAC Technical Report TR-65, M.I.T., 1969.

DeREMER, F., "Simple LR(k) grammars," Communications of the Association for Computing Machinery, 14, pp. 453–460, 1971.

DIKOVSKII, A. JA., "On the relationship between the class of all context-free languages and the class of deterministic context-free languages" (in Russian), Algebra i Logika, 8, pp. 44–64, 1969.

EARLEY, J., "An efficient context-free parsing algorithm," Communications of the Association for Computing Machinery, 13, pp. 94–102, 1970. Also Ph.D. Thesis, Carnegie-Mellon University, 1968.

EICKEL, J., PAUL, M., BAUER, F.L., AND SAMELSON, K., "A syntax-controlled generator of formal language processors," Communications of the Association for Computing Machinery, 6, pp. 451–455, 1963.

EILENBERG, S., Automata, Languages, and Machines, Vol. A, New York: Academic Press Inc., 1974.

EILENBERG, S., AND SCHUTZENBERGER, M.P., "Rational sets in commutative monoids," Journal of Algebra, 13, pp. 173–191, 1969.

ELGOT, C.C., "Decision problems of finite-automata design and related arithmetics," Transactions of the American Mathematical Society, 98, pp. 21–51, 1961.

ELGOT, C.C., AND MEZEI, J.E., "On relations defined by generalized finite automata," IBM Journal of Research and Development, 9, pp. 47–68, 1965.

ENGELFRIET, J., "Translation of simple program schemes," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 215–223, Amsterdam: North-Holland Publishing Co., 1973.

ERDŐS, P., AND SZEKERES, G., "A combinatorial problem in geometry," Compositio Mathematica, 2, pp. 463–470, 1935.

ESTENFELD, K., "Strukturelle Untersuchungen zur schwersten kontextfreien Sprache," Technical Report A76/06, Saarbrücken: Universität des Saarlandes, 1976.

EVEY, R.J., "The Theory and Application of Pushdown Store Machines," Ph.D. Thesis and Research Report, Mathematical Linguistics and Automatic Translation Project, Harvard University, NSF-10, May 1963.

FINE, N.J., AND WILF, H.S., "Uniqueness theorems for periodic functions," Proceedings of the American Mathematical Society, 16, pp. 109–114, 1965.

FISCHER, M.J., "Grammars with Macrolike Productions," Ph.D. Dissertation, Harvard University, 1968.

FISCHER, M.J., "Two characterizations of context-sensitive languages," IEEE Conference Record of 10th Annual Symposium on Switching and Automata Theory, pp. 149–156, 1969.
FISCHER, M.J., "Some properties of precedence languages," ACM Symposium on Theory of Computing, pp. 181–190, 1969.

FISCHER, M.J., AND MEYER, A.R., "Boolean-matrix multiplication and transitive closure," IEEE Conference Record of 12th Annual Symposium on Switching and Automata Theory, pp. 129–131, 1971.

FISCHER, M.J., AND ROSENBERG, A.L., "Real-time solutions of the origin crossing problem," Mathematical Systems Theory, 2, pp. 257–263, 1968.

FISCHER, P.C., "Turing machines with restricted memory access," Information and Control, 9, pp. 364–379, 1966.

FISCHER, P.C., AND PROBERT, R.L., "Efficient procedures for using matrix algorithms," in Automata, Languages, and Programming (J. Loeckx, ed.), Vol. 14, "Lecture Notes in Computer Science," Berlin: Springer-Verlag, 1974; pp. 413–428.

FISCHER, P.C., AND ROSENBERG, A., "Multitape one-way nonwriting automata," Journal of Computer and System Sciences, 2, pp. 88–101, 1968.

FISCHER, P.C., MEYER, A.R., AND ROSENBERG, A.L., "Counter machines and counter languages," Mathematical Systems Theory, 2, pp. 265–283, 1968.

FLIESS, M., "Transductions de séries formelles," Discrete Mathematics, 10, pp. 57–74, 1974.

FLIESS, M., "Matrices de Hankel," Journal of Pure and Applied Math, 53, pp. 197–222, 1974.

FLOYD, R.W., "Syntactic analysis and operator precedence," Journal of the Association for Computing Machinery, 10, pp. 313–333, 1963.

FLOYD, R.W., "Bounded-context syntactic analysis," Communications of the Association for Computing Machinery, 7, pp. 62–66, 1964a.

FLOYD, R.W., "The syntax of programming languages," IEEE Transactions on Electronic Computers, EC-13, pp. 346–353, 1964b.

FLOYD, R.W., "A machine-oriented recognition algorithm for context-free languages," unpublished manuscript, April 1969.

FRIEDMAN, E.P., "Deterministic Languages and Monadic Recursion Schemes," Ph.D. Thesis, Harvard University, Cambridge, 1974.

FRIEDMAN, E.P., "The inclusion problem for simple languages," Theoretical Computer Science, 1, pp. 297–316, 1976.

GALIL, Z., "Some open problems in the theory of computation as questions about two-way deterministic pushdown automaton languages," Mathematical Systems Theory, 10, pp. 211–218, 1977.

GALIL, Z., "Real-time algorithms for string matching and palindrome recognition," unpublished manuscript, Yorktown Heights, N.Y.: IBM Research Center, 1975.

GALIL, Z., "Two fast simulations which imply some fast string-matching and palindrome-recognition algorithms," Information Processing Letters, 4, pp. 85–100, 1976.

GALIL, Z., AND SEIFERAS, J., "Recognizing certain repetitions and reversals within strings," Proceedings of the 17th Annual Symposium on Foundations of Computer Science, pp. 236–252, 1976.

GALLAIRE, H., "Recognition time of context-free languages by on-line Turing machines," Information and Control, 15, pp. 288–295, 1969.
GELLER, M.M., GRAHAM, S.L., AND HARRISON, M.A., "Production prefix parsing" (extended abstract), Automata, Languages, and Programming, 2nd Colloquium (J. Loeckx, ed.), University of Saarbrücken, July 29–August 2, 1974; pp. 232–241.

GELLER, M.M., AND HARRISON, M.A., "Strict deterministic versus LR(0) parsing," Conference Record of the ACM Symposium on Principles of Programming Languages, pp. 22–32, 1973.

GELLER, M.M., AND HARRISON, M.A., "Characterizations of LR(0) languages," Conference Record of the 14th Annual Symposium on Switching and Automata Theory, pp. 103–108, 1973.

GELLER, M.M., AND HARRISON, M.A., "Characteristic parsing: a framework for producing compact deterministic parsers, I," Journal of Computer and System Sciences, 14, pp. 265–317, 1977.

GELLER, M.M., AND HARRISON, M.A., "Characteristic parsing: a framework for producing compact deterministic parsers, II," Journal of Computer and System Sciences, 14, pp. 318–343, 1977.

GELLER, M.M., AND HARRISON, M.A., "On LR(k) grammars and languages," Theoretical Computer Science, 4, pp. 245–276, 1977.

GELLER, M.M., HARRISON, M.A., AND HAVEL, I.M., "Normal forms of deterministic grammars," Discrete Mathematics, 16, pp. 313–321, 1976.

GELLER, M.M., HUNT, H.B., SZYMANSKI, T.G., AND ULLMAN, J.D., "Economy of description by parsers, dpda's, and pda's," Theoretical Computer Science, 4, pp. 143–159, 1977.

GINSBURG, S., The Mathematical Theory of Context-Free Languages, New York: McGraw-Hill Book Co., 1966.

GINSBURG, S., Algebraic and Automata-Theoretic Properties of Formal Languages, Amsterdam: North-Holland Publishing Co., 1975.

GINSBURG, S., GOLDSTINE, J., AND GREIBACH, S., "Uniformly erasable AFL," Journal of Computer and System Sciences, 10, pp. 165–182, 1975.

GINSBURG, S., AND GREIBACH, S.A., "Deterministic context-free languages," Information and Control, 9, pp. 602–648, 1966a.

GINSBURG, S., AND GREIBACH, S.A., "Mappings which preserve context-sensitive languages," Information and Control, 9, pp. 620–649, 1966b.

GINSBURG, S., AND GREIBACH, S.A., "Abstract families of languages," Memoirs of the American Mathematical Society, 87, 1969.

GINSBURG, S., GREIBACH, S.A., AND HARRISON, M.A., "Stack automata and compiling," Journal of the Association for Computing Machinery, 14, pp. 172–201, 1967.

GINSBURG, S., GREIBACH, S.A., AND HARRISON, M.A., "One-way stack automata," Journal of the Association for Computing Machinery, 14, pp. 389–418, 1967.

GINSBURG, S., AND HARRISON, M.A., "Bracketed context-free languages," Journal of Computer and System Sciences, 1, pp. 1–23, 1967.

GINSBURG, S., AND HARRISON, M.A., "On the elimination of endmarkers," Information and Control, 12, pp. 103–115, 1968.
GINSBURG, S., AND HARRISON, M.A., "One-way nondeterministic real-time list-storage languages," Journal of the Association for Computing Machinery, 15, pp. 428–446, 1968.

GINSBURG, S., AND HARRISON, M.A., "On the closure of AFL under reversal," Information and Control, 17, pp. 395–409, 1970.

GINSBURG, S., AND RICE, H.G., "Two families of languages related to ALGOL," Journal of the Association for Computing Machinery, 9, pp. 350–371, 1962.

GINSBURG, S., AND ROSE, G.F., "Some recursively unsolvable problems in ALGOL-like languages," Journal of the Association for Computing Machinery, 10, pp. 29–47, 1963.

GINSBURG, S., AND ROSE, G.F., "Operations which preserve definability in languages," Journal of the Association for Computing Machinery, 10, pp. 175–195, 1963.

GINSBURG, S., AND ROSE, G.F., "The equivalence of stack-counter acceptors and quasi-real-time acceptors," Journal of Computer and System Sciences, 8, pp. 243–269, 1974.

GINSBURG, S., AND SPANIER, E.H., "Quotients of context-free languages," Journal of the Association for Computing Machinery, 10, pp. 487–492, 1963.

GINSBURG, S., AND SPANIER, E.H., "Bounded ALGOL-like languages," Transactions of the American Mathematical Society, 113, pp. 333–368, 1964.

GINSBURG, S., AND SPANIER, E.H., "Finite-turn pushdown automata," SIAM Journal on Control, 4, pp. 429–453, 1966.

GINSBURG, S., AND SPANIER, E.H., "Semigroups, Presburger formulas, and languages," Pacific Journal of Mathematics, 16, pp. 285–296, 1966.

GINSBURG, S., AND SPANIER, E.H., "Bounded regular sets," Proceedings of the American Mathematical Society, 17, pp. 1043–1049, 1966.

GINSBURG, S., AND SPANIER, E.H., "Derivation-bounded languages," Journal of Computer and System Sciences, 2, pp. 228–250, 1968.

GINSBURG, S., AND ULLIAN, J.S., "Preservation of unambiguity and inherent ambiguity in context-free languages," Journal of the Association for Computing Machinery, 13, pp. 364–368, 1966.

GINSBURG, S., AND ULLIAN, J., "Ambiguity in context-free languages," Journal of the Association for Computing Machinery, 13, pp. 62–89, 1966.

GLADKII, A., "Grammars with linear memory," Algebri i Logika Sem., 2, pp. 43–55, 1963.

GLADKII, A., "On the complexity of derivations in phrase-structure grammars," Algebri i Logika Sem., 3, pp. 29–44, 1964.

GOLDSTINE, J., "Some independent families of one-letter languages," Journal of Computer and System Sciences, 10, pp. 351–369, 1975.

GOLDSTINE, J., "A simplified proof of Parikh's theorem," Discrete Mathematics, 19, pp. 235–240, 1977.

GRAHAM, S.L., "Extended precedence: bounded right-context languages and deterministic languages," Proceedings of the Eleventh Symposium on Switching and Automata Theory, pp. 175–180, 1970.
GRAHAM, S.L., "Precedence Languages and Bounded Right-Context Languages," Ph.D. Thesis and Technical Report CS-71-223, Department of Computer Science, Stanford University, 1971.

GRAHAM, S.L., "On bounded right-context languages and grammars," SIAM Journal on Computing, 3, pp. 224–254, 1974.

GRAHAM, S.L., AND HARRISON, M.A., "Parsing of general context-free languages," in Advances in Computers, Vol. 14 (M. Yovits and M. Rubinoff, eds.), pp. 77–185, New York: Academic Press, 1976.

GRAHAM, S.L., HARRISON, M.A., AND RUZZO, W.L., "On-line context-free recognition in less than cubic time," Proceedings of the Eighth Annual ACM Symposium on Theory of Computing, pp. 112–120, 1976.

GRAY, J.N., AND HARRISON, M.A., "The theory of sequential relations," Information and Control, 9, pp. 435–468, 1966.

GRAY, J.N., AND HARRISON, M.A., "Single-pass precedence analysis," Conference Record of the 10th Annual Symposium on Switching and Automata Theory, pp. 106–117, 1969.

GRAY, J.N., AND HARRISON, M.A., "On the covering and reduction problems for context-free grammars," Journal of the Association for Computing Machinery, 19, pp. 675–698, 1972.

GRAY, J.N., AND HARRISON, M.A., "Canonical precedence schemes," Journal of the Association for Computing Machinery, 20, pp. 214–234, 1973.

GRAY, J.N., HARRISON, M.A., AND IBARRA, O.H., "Two-way pushdown automata," Information and Control, 11, pp. 30–70, 1967.

GREIBACH, S.A., "Undecidability of the ambiguity problem for minimal linear grammars," Information and Control, 6, pp. 119–125, 1963.

GREIBACH, S.A., "A new normal-form theorem for context-free, phrase-structure grammars," Journal of the Association for Computing Machinery, 12, pp. 42–52, 1965.

GREIBACH, S.A., "The unsolvability of the recognition of linear context-free languages," Journal of the Association for Computing Machinery, 13, pp. 582–587, 1966.

GREIBACH, S.A., "A note on pushdown-store automata and regular systems," Proceedings of the American Mathematical Society, 18, pp. 263–268, 1967.

GREIBACH, S.A., "A note on undecidable properties of formal languages," Mathematical Systems Theory, 2, pp. 1–6, 1968.

GREIBACH, S.A., "An infinite hierarchy of context-free languages," Journal of the Association for Computing Machinery, 16, pp. 91–106, 1969.

GREIBACH, S.A., "Characteristic and ultrarealtime languages," Information and Control, 18, pp. 65–98, 1971.

GREIBACH, S.A., "A generalization of Parikh's semilinear theorem," Discrete Mathematics, 2, pp. 347–355, 1972.

GREIBACH, S.A., "The hardest context-free language," SIAM Journal on Computing, 2, pp. 304–310, 1973.

GREIBACH, S.A., "Jump pda's and hierarchies of deterministic context-free languages," SIAM Journal on Computing, 3, pp. 111–127, 1974.
GREIBACH, S.A., "One-counter languages and the IRS condition," Journal of Computer and System Sciences, 10, pp. 237–247, 1975a.

GREIBACH, S.A., "Erasable context-free languages," Information and Control, 29, pp. 301–326, 1975b.

GRIES, D., Compiler Construction for Digital Computers, New York: John Wiley and Sons, 1971.

GRIFFITHS, T.V., "Some remarks on derivations in general rewriting systems," Information and Control, 12, pp. 27–54, 1968.

GRIFFITHS, T.V., "The unsolvability of the equivalence problem for Λ-free nondeterministic generalized machines," Journal of the Association for Computing Machinery, 15, pp. 409–413, 1968.

GROSS, M., "Inherent ambiguity of minimal linear grammars," Information and Control, 7, pp. 366–368, 1964.

GROSS, M., "Applications géométriques des langages formels," ICC Bulletin, 5-3, 1966.

GROSS, M., AND LENTIN, A., Notions sur les Grammaires Formelles, Paris: Gauthier-Villars, 1967.

GRUSKA, J., "Some classifications of context-free languages," Information and Control, 14, pp. 152–179, 1969.

GRUSKA, J., "Complexity and unambiguity of context-free grammars and languages," Information and Control, 18, pp. 502–519, 1971.

GRUSKA, J., "A characterization of context-free languages," Journal of Computer and System Sciences, 5, pp. 353–364, 1971.

GRUSKA, J., "On star-height hierarchies of context-free languages," Kybernetika, 9, pp. 231–236, 1973.

GRUSKA, J., "Grammatical levels and subgrammars of context-free grammars," Arch. Math. 1, Scripta Fac. Sci. Nat. UJEP Brunensis, IX, pp. 19–21, 1973.

GRUSKA, J., "A note on ε-rules in context-free grammars," Kybernetika, 11, pp. 26–31, 1975.

GRZEGORCZYK, A., "Some classes of recursive functions," Rozprawy Matematyczne, IV, pp. 1–46, 1954.

HAINES, L.H., "Note on the complement of a (minimal) linear language," Information and Control, 7, pp. 307–314, 1964.

HAINES, L.H., "Generation and Recognition of Formal Languages," Ph.D. Thesis, M.I.T., 1965.

HAINES, L.H., "On free monoids partially ordered by embedding," Journal of Combinatorial Theory, 6, pp. 94–98, 1969.

HAINES, L.H., "Representation theorems for context-sensitive languages," unpublished manuscript, University of California, Berkeley, 1970.

HARRISON, M.A., Introduction to Switching and Automata Theory, New York: McGraw-Hill, 1965.

HARRISON, M.A., "On the relation between grammars and automata," in Advances in Information Systems Science, 4 (J.T. Tou, ed.), pp. 39–92, New York: Plenum Press, 1972.
HARRISON, M.A., AND HAVEL, I.M., "Strict deterministic grammars," Journal of Computer and System Sciences, 7, pp. 237–277, 1973.

HARRISON, M.A., AND HAVEL, I.M., "On the parsing of deterministic languages," Journal of the Association for Computing Machinery, 21, pp. 525–548, 1974.

HARRISON, M.A., AND HAVEL, I.M., "Real-time strict deterministic languages," SIAM Journal on Computing, 1, pp. 333–349, 1972.

HARRISON, M.A., AND HAVEL, I.M., "On a family of deterministic grammars," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 413–442, Amsterdam: North-Holland Publishing Co., 1973.

HARRISON, M.A., AND IBARRA, O.H., "Multitape and multihead pushdown automata," Information and Control, 13, pp. 433–470, 1968.

HARRISON, M.A., AND SCHKOLNICK, M., "A grammatical characterization of one-way nondeterministic stack languages," Journal of the Association for Computing Machinery, 18, pp. 148–172, 1971.

HARTMANIS, J., "Context-free languages and Turing-machine computations," Proceedings of a Symposium on Applied Mathematics, 19, Mathematical Aspects of Computer Science, pp. 42–51, Providence: American Mathematical Society, 1967a.

HARTMANIS, J., "On memory requirements for context-free language recognition," Journal of the Association for Computing Machinery, 14, pp. 663–665, 1967b.

HARTMANIS, J., "Computational complexity of one-tape Turing-machine computations," Journal of the Association for Computing Machinery, 15, pp. 325–339, 1968.

HARTMANIS, J., "On the complexity of undecidable problems in automata theory," Journal of the Association for Computing Machinery, 16, pp. 160–167, 1969.

HARTMANIS, J., "Computational complexity of random-access stored-program machines," Mathematical Systems Theory, 5, pp. 232–245, 1971.

HARTMANIS, J., "On nondeterminancy in simple computing devices," Acta Informatica, 1, pp. 336–344, 1972.

HARTMANIS, J., AND HOPCROFT, J.E., "What makes some language-theory problems undecidable?" Journal of Computer and System Sciences, 3, pp. 196–217, 1969.

HARTMANIS, J., AND HOPCROFT, J.E., "An overview of computational complexity," Journal of the Association for Computing Machinery, 18, pp. 444–475, 1971.

HARTMANIS, J., AND HUNT, H.B., III, "The lba problem and its importance in the theory of computing," in Complexity of Computation (R. Karp, ed.), Vol. VII, SIAM–AMS Proceedings, 1973; pp. 27–42.

HARTMANIS, J., AND SHANK, H., "On the recognition of primes by automata," Journal of the Association for Computing Machinery, 15, pp. 382–389, 1968.

HARTMANIS, J., AND SHANK, H., "Two memory bounds for the recognition of primes by automata," Mathematical Systems Theory, 3, pp. 125–129, 1969.

HARTMANIS, J., AND STEARNS, R.E., "Regularity-preserving modifications of regular expressions," Information and Control, 6, pp. 55–69, 1963.
HARTMANIS, J., AND STEARNS, R.E., "On the computational complexity of algorithms," Transactions of the American Mathematical Society, 117, pp. 285–306, 1965.

HARTMANIS, J., AND STEARNS, R.E., "Sets of numbers defined by finite automata," American Mathematical Monthly, 74, pp. 539–542, 1967.

HARTMANIS, J., AND STEARNS, R.E., "Automata-based computational complexity," Information Sciences, 1, pp. 173–184, 1969.

HENNIE, F.C., "One-tape, off-line Turing-machine computations," Information and Control, 8, pp. 553–578, 1965.

HENNIE, F.C., "On-line Turing-machine computations," IEEE Transactions on Electronic Computers, EC-15, pp. 35–44, 1966.

HENNIE, F.C., AND STEARNS, R.E., "Two-tape simulation of multitape Turing machines," Journal of the Association for Computing Machinery, 13, pp. 533–546, 1966.

HIBBARD, T.N., "Scan-limited automata and context-limited grammars," Ph.D. Thesis, Dept. of Mathematics, University of California at Los Angeles, 1966.

HIBBARD, T.N., "Context-limited grammars," Journal of the Association for Computing Machinery, 21, pp. 446–453, 1974.

HIBBARD, T.N., AND ULLIAN, J., "The independence of inherent ambiguity from complementedness among context-free languages," Journal of the Association for Computing Machinery, 13, pp. 588–593, 1966.

HOPCROFT, J.E., "On the equivalence and containment problems for context-free languages," Mathematical Systems Theory, 3, pp. 119–124, 1969.

HOPCROFT, J.E., PAUL, W., AND VALIANT, L., "On time versus space and related problems," Proceedings of the 16th Annual Symposium on Foundations of Computer Science, pp. 57–64, 1975.

HOPCROFT, J.E., AND ULLMAN, J.D., Formal Languages and Their Relation to Automata, Reading, Mass.: Addison-Wesley Publishing Co., 1969.

HOPCROFT, J.E., AND ULLMAN, J.D., "An approach to a unified theory of automata," Bell System Technical Journal, 46, pp. 1793–1829, 1967.

HOPCROFT, J.E., AND ULLMAN, J.D., "Deterministic stack automata and the quotient operator," Journal of Computer and System Sciences, 2, pp. 1–12, 1968a.

HOPCROFT, J.E., AND ULLMAN, J.D., "Relations between time and tape complexities," Journal of the Association for Computing Machinery, 15, pp. 414–427, 1968b.

HOTZ, G., "Sequentielle Analyse kontextfreier Sprachen," Acta Informatica, 4, pp. 55–75, 1974.

HOTZ, G., "Normal-form transformations of context-free grammars," unpublished manuscript, 1976.

HOTZ, G., "Der Satz von Chomsky-Schützenberger und die schwerste kontextfreie Sprache von S. Greibach," unpublished manuscript, 1976.

HOTZ, G., AND CLAUS, V., Automatentheorie und Formale Sprachen, Mannheim: Bibliographisches Institut, 1972.
HOTZ, G., AND MESSERSCHMIDT, J., "Dyck-Sprachen sind in Bandkomplexität log n analysierbar," Fachbereich 10, Saarbrücken: Universität des Saarlandes, 1975.

HUNT, H.B., III, "A complexity theory of grammar problems," Conference Record of the Third ACM Symposium on Principles of Programming Languages, pp. 12–18, 1976.

HUNT, H.B., III, "On the complexity of finite, pushdown, and stack automata," Mathematical Systems Theory, 10, pp. 33–52, 1976.

HUNT, H.B., III, "On the time and tape complexity of languages," Proceedings of the 5th Annual Symposium on Theory of Computing, pp. 10–19, 1973.

HUNT, H.B., III, AND RANGEL, J.L., "Decidability of equivalence, containment, intersection, and separability of context-free languages," Proceedings of the 16th Annual Symposium on Foundations of Computer Science, pp. 144–150, 1975.

HUNT, H.B., III, AND ROSENKRANTZ, D.J., "Computational parallels between the regular and context-free languages," SIAM Journal on Computing, 7, 1978.

HUNT, H.B., III, ROSENKRANTZ, D.J., AND SZYMANSKI, T.G., "The covering problem for linear context-free grammars," Theoretical Computer Science, 2, pp. 361–382, 1976.

HUNT, H.B., III, AND ROSENKRANTZ, D.J., "On equivalence and containment problems for formal languages," Journal of the Association for Computing Machinery, 24, pp. 387–396, 1977.

HUNT, H.B., III, AND SZYMANSKI, T.G., "Complexity metatheorems for context-free grammar problems," Journal of Computer and System Sciences, 13, pp. 318–334, 1976.

HUNT, H.B., III, AND SZYMANSKI, T.G., "Dichotomization, reachability, and the forbidden-subgraph problem," Proceedings of the Eighth Annual ACM Symposium on Theory of Computing, pp. 126–135, 1976.

HUNT, H.B., III, SZYMANSKI, T.G., AND ULLMAN, J.D., "On the complexity of LR(k) testing," Communications of the Association for Computing Machinery, 18, pp. 707–716, 1975a.

HUNT, H.B., III, SZYMANSKI, T.G., AND ULLMAN, J.D., "Operations on sparse relations, with applications to grammar problems," Proceedings of the Fifteenth Annual Symposium on Switching and Automata Theory, pp. 127–132, 1974.

HUNT, H.B., III, SZYMANSKI, T.G., AND ULLMAN, J.D., "Operations on sparse relations," Communications of the Association for Computing Machinery, 20, pp. 171–176, 1977.

IBARRA, O.H., "On two-way multihead automata," Journal of Computer and System Sciences, 7, pp. 28–36, 1973.

IBARRA, O.H., AND KIM, C.E., "On 3-head versus 2-head finite automata," Acta Informatica, 4, pp. 193–200, 1975.

IBARRA, O.H., AND KIM, C.E., "A useful device for showing the solvability of some decision problems," Journal of Computer and System Sciences, 13, pp. 153–160, 1976.
ICHBIAH, J.D., AND MORSE, S.P., "A technique for generating almost optimal Floyd–Evans productions for precedence grammars," Communications of the Association for Computing Machinery, 13, pp. 501–508, 1970.

IGARASHI, Y., AND HONDA, N., "On the extension of Gladkij's theorem and the hierarchy of languages," Journal of Computer and System Sciences, 7, pp. 199–217, 1973.

JACOB, G., "Sur un théorème de Shamir," Information and Control, 27, pp. 218–261, 1975a.

JACOB, G., "Représentations et Substitutions Matricielles dans la Théorie Algébrique des Transductions," Thesis, University of Paris VII, 1975b.

JONES, N.D., "Classes of automata and transitive closure," Information and Control, 13, pp. 207–229, 1968.

JONES, N.D., "Context-free languages and rudimentary attributes," Mathematical Systems Theory, 3, pp. 102–109, 1969.

JONES, N.D., "Space-bounded reducibility among combinatorial problems," Journal of Computer and System Sciences, 11, pp. 68–85, 1975.

JONES, N.D., AND LAASER, W.T., "Complete problems for deterministic polynomial time," Theoretical Computer Science, 3, pp. 105–118, 1976.

JONES, N.D., LAASER, W.T., AND LIEN, Y., "New problems complete for nondeterministic log space computability," Mathematical Systems Theory, 10, pp. 19–32, 1976.

JONES, N.D., AND SELMAN, A.L., "Turing machines and the spectra of first-order formulas," Journal of Symbolic Logic, 39, pp. 139–150, 1974.

JOSHI, A.K., LEVY, L.S., AND TAKAHASHI, M., "Tree-adjunct grammars," Journal of Computer and System Sciences, 10, pp. 136–163, 1975.

KARP, R.M., "Reducibilities among combinatorial problems," in Complexity of Computer Computations (R. Miller and J. Thatcher, eds.), New York: Plenum Press, 1972; pp. 85–104.

KARP, R.M., "Automaton-based complexity theory," class notes for CS 276, Computer Science Division, University of California, Berkeley, 1975.

KASAMI, T., "An efficient recognition and syntax-analysis algorithm for context-free languages," Science Report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, Mass., 1965.

KASAMI, T., AND TORII, K., "A syntax-analysis procedure for unambiguous context-free grammars," Journal of the Association for Computing Machinery, 16, pp. 423–431, 1969.

KASAMI, T., "A note on computing time for recognition of languages generated by linear grammars," Information and Control, 10, pp. 209–214, 1968.

KEMP, R., "An estimation of the set of states of the minimal LR(0) acceptor," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 563–574, Amsterdam: North-Holland Publishing Co., 1973.

KEMP, R., "Mehrdeutigkeiten kontextfreier Grammatiken," in Automata, Languages, and Programming (J. Loeckx, ed.), Vol. 14, "Lecture Notes in Computer Science," pp. 534–546, Berlin: Springer-Verlag, 1974.
KEMP, R., "LR(k) Analysatoren," Technical Report A73/02, Mathematisches Institut und Institut für Angewandte Mathematik, Saarbrücken: Universität des Saarlandes, 1973.

KLEENE, S.C., "Representation of events in nerve nets," in Automata Studies (C.E. Shannon and J. McCarthy, eds.), pp. 3–40, Princeton: Princeton University Press, 1956.

KNUTH, D.E., "On the translation of languages from left to right," Information and Control, 8, pp. 607–639, 1965.

KNUTH, D.E., "A characterization of parenthesis languages," Information and Control, 11, pp. 269–289, 1967.

KNUTH, D.E., "Fundamental Algorithms," Vol. I of The Art of Computer Programming, Reading, Mass.: Addison-Wesley Publishing Co., 1973.

KNUTH, D.E., "Top-down syntax analysis," Acta Informatica, 1, pp. 79–110, 1971.

KNUTH, D.E., AND BIGELOW, R.H., "Programming languages for automata," Journal of the Association for Computing Machinery, 14, pp. 615–635, 1967.

KNUTH, D.E., MORRIS, J.H., AND PRATT, V.R., "Fast pattern-matching in strings," SIAM Journal on Computing, 6, pp. 323–350, 1977.

KNUTH, D.E., "Big omicron and big omega and big theta," SIGACT NEWS, 8, pp. 18–24, 1976.

KORENJAK, A.J., "A practical method for constructing LR(k) processors," Communications of the Association for Computing Machinery, 12, pp. 613–623, 1969.

KORENJAK, A.J., AND HOPCROFT, J.E., "Simple deterministic languages," Conference Record of the Seventh Annual Symposium on Switching and Automata Theory, Berkeley, pp. 36–46, 1966.

KOSARAJU, S.R., "One-way stack automaton with jumps," Journal of Computer and System Sciences, 9, pp. 164–176, 1974.

KOSARAJU, S.R., "Speed of recognition of context-free languages by array automata," SIAM Journal on Computing, 4, pp. 333–340, 1975.

KOSARAJU, S.R., "Context-free preserving functions," Mathematical Systems Theory, 9, pp. 193–197, 1975.

KOSARAJU, S.R., "Recognition of context-free languages by random-access machines," unpublished manuscript, Fall 1975.

KOZEN, D., "On parallelism in Turing machines," Proceedings of the 17th Annual Symposium on Foundations of Computer Science, pp. 89–97, 1976.

KRAL, J., AND DEMNER, J., "A note on the number of states of the DeRemer's recognizer," Information Processing Letters, 2, pp. 22–23, 1973.

KREIDER, D., AND RITCHIE, R.E., "A basis theorem for a class of two-way automata," Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 12, pp. 243–255, 1966.

KURODA, S.Y., "Classes of languages and linear-bounded automata," Information and Control, 7, pp. 207–223, 1964.

LALONDE, W.R., "On directly constructing LR(k) parsers without chain reductions," Conference Record of the Third ACM Symposium on Principles of Programming Languages, pp. 127–133, 1976.
LANDWEBER, P.S., "Three theorems on phrase-structure grammars of type 1," Information and Control, 6, pp. 131-136, 1963.
LANDWEBER, P.S., "Decision problems of phrase-structure grammars," IEEE Transactions on Electronic Computers, EC-13, pp. 354-362, 1964.
LEARNER, A., AND LIM, A.L., "A note on transforming grammars to Wirth-Weber precedence form," Computer Journal, 12, pp. 142-144, 1970.
LEHMANN, D., "LR(k) grammars and deterministic languages," Israel Journal of Mathematics, 10, pp. 526-530, 1971.
LENTIN, A., Équations dans les Monoïdes Libres. Paris: Gauthier-Villars, 1972.
LENTIN, A., AND SCHUTZENBERGER, M.P., "A combinatorial problem in the theory of free monoids," in Combinatorial Mathematics and Its Applications (Proceedings of the conference held at Chapel Hill, N.C., April 10-14, 1967; R.C. Bose and T.A. Dowling, eds.), pp. 128-144. Chapel Hill: University of North Carolina Press, 1969.
LEVI, F.W., "On semigroups," Bulletin of the Calcutta Mathematical Society, 36, pp. 141-146, 1944.
LEWIS, P.M., II, AND ROSENKRANTZ, D.J., "An ALGOL compiler designed using automata theory," Proceedings of a Symposium on Computers and Automata, Vol. 21, pp. 75-88. New York: Polytechnic Institute of Brooklyn, 1971.
LEWIS, P.M., II, AND STEARNS, R.E., "Syntax-directed transductions," Journal of the Association for Computing Machinery, 15, pp. 465-488, 1968.
LEWIS, P.M., II, STEARNS, R.E., AND HARTMANIS, J., "Memory bounds for recognition of context-free and context-sensitive languages," IEEE Conference Record on Switching Circuit Theory and Logical Design, Ann Arbor, Michigan, pp. 191-202, 1965.
LIPTON, R.J., AND ZALCSTEIN, Y., "Word problems solvable in log space," Journal of the Association for Computing Machinery, 24, pp. 522-526, 1977.
LIU, L.Y., AND WEINER, P., "A characterization of semilinear sets," Journal of Computer and System Sciences, 4, pp. 299-307, 1970.
LOMET, D.B., "A formalization of transition diagram systems," Journal of the Association for Computing Machinery, 20, pp. 235-257, 1973.
LYNCH, N., "Log space recognition and translation of parenthesis languages," Journal of the Association for Computing Machinery, 24, pp. 583-590, 1977.
LYNDON, R.C., AND SCHUTZENBERGER, M.P., "The equation a^M = b^N c^P in a free group," Michigan Mathematical Journal, 9, pp. 289-298, 1962.
MCNAUGHTON, R., "Parenthesis grammars," Journal of the Association for Computing Machinery, 14, pp. 490-500, 1967.
MCNAUGHTON, R., AND YAMADA, H., "Regular expressions and state graphs for automata," IEEE Transactions on Electronic Computers, EC-9, pp. 39-47, 1960.
MCWHIRTER, I.P., "Substitution expressions," Journal of Computer and System Sciences, 5, pp. 629-637, 1971.
MAIBAUM, T.S.E., "A generalized approach to formal languages," Journal of Computer and System Sciences, 8, pp. 409-439, 1974.
MARCUS, S., Introduction Mathématique à la Linguistique Structurale. Paris: Dunod, 1967.
MATTHEWS, G., "Discontinuity and asymmetry in phrase-structure grammars," Information and Control, 6, pp. 137-146, 1963.
MATTHEWS, G., "A note on asymmetry in phrase-structure grammars," Information and Control, 7, pp. 360-365, 1964.
MATTHEWS, G., "Two-way languages," Information and Control, 10, pp. 111-119, 1967.
MAURER, H.A., "A direct proof of the inherent ambiguity of a simple context-free language," Journal of the Association for Computing Machinery, 16, pp. 256-260, 1969.
MEHLHORN, K., "Bracket languages are recognizable in logarithmic space," Technical Report A 75/12, Saarbrücken: Universität des Saarlandes, 1975.
MEHTA, V., "Onward and upward with the arts: John is easy to please," New Yorker, 47, pp. 44-87, May 8, 1971.
MEYER, A.R., AND FISCHER, M.J., "Economy of description by automata, grammars, and formal systems," Conference Record of 1971 Annual Symposium on Switching and Automata Theory, pp. 188-191, 1971.
MICKUNAS, M.D., "On the complete covering problem for LR(k) grammars," Journal of the Association for Computing Machinery, 23, pp. 17-30, 1976.
MICKUNAS, M.D., LANCASTER, R.L., AND SCHNEIDER, V.B., "Transforming LR(k) grammars to LR(1), SLR(1), and (1, 1) bounded right-context grammars," Journal of the Association for Computing Machinery, 23, pp. 511-533, 1976.
MINSKY, M.L., "Recursive unsolvability of Post's problem of 'Tag' and other topics in the theory of Turing machines," Annals of Mathematics, 74, pp. 437-455, 1961.
MINSKY, M.L., AND PAPERT, S., "Unrecognizable sets of numbers," Journal of the Association for Computing Machinery, 13, pp. 281-286, 1966.
MORSE, M., AND HEDLUND, G.A., "Unending chess, symbolic dynamics, and a problem in semigroups," Duke Mathematical Journal, 11, pp. 1-7, 1944.
MUNRO, I., "Efficient determination of the transitive closure of a directed graph," Information Processing Letters, 1, pp. 56-58, 1971.
MYHILL, J., "Finite automata and the representation of events," WADD Technical Report 57-624, Wright-Patterson Air Force Base, November 1957.
MYHILL, J., "Linear bounded automata," WADD Technical Note 60-165, Wright-Patterson Air Force Base, 1960.
NELSON, C., "One-way automata on bounded languages," Technical Report TR-14-76, Center for Research in Computing Technology, Harvard University, July 1976.
NERODE, A., "Linear automaton transformations," Proceedings of the American Mathematical Society, 9, pp. 541-544, 1958.
NIVAT, M., "Transductions des Langages de Chomsky," Ph.D. Thesis, University of Paris, 1967.
NIVAT, M., "Sur les automates à mémoire pile," Institut de Programmation, 1970.
NIVAT, M., "On some families of languages related to the Dyck language," Institut de Programmation, 1970.
NIVAT, M., "Congruences de Thue et t-langages," Studia Scientiarum Mathematicarum Hungarica, 6, pp. 243-249, 1971.
NIVAT, M., "On the interpretation of recursive program schemes," Rapport de Recherche No. 84, Institut de Recherche d'Informatique et d'Automatique, 1974.
NIVAT, M., "Extensions et restrictions des grammaires algébriques," in Formal Languages and Programming (R. Aguilar, ed.), pp. 83-96, 1976.
OGDEN, W.F., "Intercalation Theorems for Pushdown Store and Stack Languages," Ph.D. Thesis, Stanford University, 1968.
OGDEN, W.F., "A helpful result for proving inherent ambiguity," Mathematical Systems Theory, 2, pp. 191-194, 1968.
PAGER, D., "A fast left-to-right parser for context-free grammars," Technical Report PE240, Information Sciences Program, University of Hawaii, January 1972.
PAIR, C., "Trees, pushdown stores, and compilation," Revue Française de Traitement de l'Information (Chiffres), 7, No. 3, pp. 199-216, 1964.
PAIR, C., AND QUERE, A., "Définition et étude des bilangages réguliers," Information and Control, 13, pp. 565-593, 1968.
PARIKH, R.J., "Language-generating devices," Quarterly Progress Report No. 60, Research Laboratory of Electronics, M.I.T., pp. 199-212, 1961.
PARIKH, R.J., "On context-free languages," Journal of the Association for Computing Machinery, 13, pp. 570-581, 1966.
PATERSON, M.S., "Decision problems in computational models," in Proceedings of the ACM Conference on Proving Assertions about Programs, Las Cruces, New Mexico, 1972.
PAULL, M.C., AND UNGER, S.H., "Structural equivalence of context-free grammars," Journal of Computer and System Sciences, 2, pp. 427-463, 1968.
PERROT, J.F., "Monoïdes syntactiques des langages algébriques," Acta Informatica, 7, pp. 393-413, 1977.
PERROT, J.F., AND SAKAROVITCH, J., "Langages algébriques déterministes et groupes abéliens," Second GI Fachtagung, Automatentheorie und Formale Sprachen, Kaiserslautern, pp. 20-30, May 1975.
PETERS, P.S., AND RITCHIE, R.W., "Context-sensitive immediate constituent analysis: context-free languages revisited," Conference Record of the ACM Symposium on Theory of Computing, pp. 1-8, 1969.
PILLING, D.L., "Commutative regular equations and Parikh's theorem," Journal of the London Mathematical Society (II), 6, pp. 663-666, 1973.
PIRICKA-KELEMENOVA, A., "Greibach normal-form complexity," in Mathematical Foundations of Computer Science 1975 (J. Becvar, ed.), Vol. 32, Lecture Notes in Computer Science, pp. 344-350. Berlin: Springer-Verlag, 1975.
PLEASANTS, P.A.B., "Nonrepetitive sequences," Proceedings of the Cambridge Philosophical Society, 68, pp. 267-274, 1970.
POST, E.L., "A variant of a recursively unsolvable problem," Bulletin of the American Mathematical Society, 52, pp. 264-268, 1946.
PRATT, V.R., RABIN, M.O., AND STOCKMEYER, L.J., "A characterization of the power of vector machines," Conference Record of the Sixth ACM Symposium on Theory of Computing, pp. 122-134, 1974.
PRATT, V., "LINGOL - a progress report," Fourth International Joint Conference on Artificial Intelligence, Tbilisi, U.S.S.R., Vol. 1, pp. 422-428, 1975.
PRESBURGER, M., "Über die Vollständigkeit eines gewissen Systems der Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation hervortritt," Sprawozdanie z I Kongresu Matematyków Krajów Słowiańskich, Warsaw, pp. 92-101, 1930.
PUTZBACH, P., "Une famille de congruences de Thue pour lesquelles le problème de l'équivalence est décidable. Application à l'équivalence des grammaires séparées," in Automata, Languages, and Programming (M. Nivat, ed.), pp. 3-12. Amsterdam: North-Holland Publishing Co., 1973.
RABIN, M.O., "Degree of difficulty of computing a function and a partial ordering of recursive sets," Applied Logic Branch, Technical Report No. 2, Hebrew University, Jerusalem, 1960.
RABIN, M.O., "Real-time computation," Israel Journal of Mathematics, 1, pp. 203-211, 1963.
RABIN, M.O., AND SCOTT, D., "Finite automata and their decision problems," IBM Journal of Research and Development, 3, pp. 114-125, 1959.
RAMSEY, F.P., "On a problem of formal logic," Proceedings of the London Mathematical Society, 2nd Series, 30, pp. 264-286, 1930.
RANEY, G.N., "Functional composition patterns and power-series reversion," Transactions of the American Mathematical Society, 94, pp. 441-451, 1960.
REEDY, A., AND SAVITCH, W.J., "The Turing degree of the inherent ambiguity problem for context-free languages," Theoretical Computer Science, 1, pp. 77-79, 1975.
RIORDAN, J., Combinatorial Identities. New York: John Wiley and Sons, Inc., 1968.
RITCHIE, R.W., "Classes of predictably computable functions," Transactions of the American Mathematical Society, 106, pp. 139-173, 1963a.
RITCHIE, R.W., "Finite automata and the set of squares," Journal of the Association for Computing Machinery, 10, pp. 528-531, 1963b.
RITCHIE, R.W., AND SPRINGSTEEL, F.N., "Language recognition by marking automata," Information and Control, 20, pp. 313-330, 1972.
ROGERS, H., The Theory of Recursive Functions and Effective Computability. New York: McGraw-Hill Book Co., 1967.
ROSENBERG, A.L., "On multihead finite automata," IBM Journal of Research and Development, 10, pp. 388-394, 1966.
ROSENBERG, A.L., "A machine realization of the linear context-free languages," Information and Control, 10, pp. 175-188, 1967.
ROSENBERG, A.L., "Real-time definable languages," Journal of the Association for Computing Machinery, 14, pp. 645-662, 1967.
ROSENBERG, A.L., "On the independence of real-time definability and certain structural properties of context-free languages," Journal of the Association for Computing Machinery, 15, pp. 672-679, 1968.
ROSENBERG, A.L., "A note on ambiguity of context-free languages and presentations of semilinear sets," Journal of the Association for Computing Machinery, 17, pp. 44-50, 1970.
ROSENKRANTZ, D.J., "Matrix equations and normal forms for context-free grammars," Journal of the Association for Computing Machinery, 14, pp. 501-507, 1967.
ROSENKRANTZ, D.J., "Programmed grammars and classes of formal languages," Journal of the Association for Computing Machinery, 16, pp. 107-131, 1969.
ROSENKRANTZ, D.J., AND LEWIS, P.M., II, "Deterministic left-corner parsing," Symposium on Switching and Automata Theory, pp. 139-152, 1970.
ROSENKRANTZ, D.J., AND STEARNS, R.E., "Properties of deterministic top-down grammars," Information and Control, 17, pp. 226-255, 1970.
ROUNDS, W.C., "Mappings and grammars on trees," Mathematical Systems Theory, 4, pp. 257-287, 1970.
ROZENBERG, G., "Direct proofs of the undecidability of the equivalence problem for sentential forms of linear context-free grammars and the equivalence problem for 0L systems," Information Processing Letters, 1, pp. 233-235, 1972.
RUOHONEN, K., "Some combinatorial mappings of words," Annales Academiae Scientiarum Fennicae, Series A, Helsinki, 1976.
RYSER, H.J., Combinatorial Mathematics, Vol. 14, Carus Monograph Series. The Mathematical Association of America, 1963.
SALOMAA, A., Theory of Automata. New York: Pergamon Press, 1969a.
SALOMAA, A., "On grammars with restricted use of productions," Annales Academiae Scientiarum Fennicae, Series A.I., 454, Helsinki, 1969b.
SALOMAA, A., "On the index of a context-free grammar and language," Information and Control, 14, pp. 474-477, 1969c.
SALOMAA, A., "On some families of formal languages obtained by regulated derivations," Annales Academiae Scientiarum Fennicae, Series A.I., 479, Helsinki, 1970.
SALOMAA, A., Formal Languages. New York: Academic Press, 1973a.
SALOMAA, A., "On sequential forms of context-free grammars," Acta Informatica, 2, pp. 40-49, 1973b.
SANTOS, E.S., "A note on bracketed grammars," Journal of the Association for Computing Machinery, 19, pp. 222-224, 1972.
SAVITCH, W.J., "Relationships between nondeterministic and deterministic tape complexities," Journal of Computer and System Sciences, 4, pp. 177-192, 1970.
SAVITCH, W.J., "How to make arbitrary grammars look like context-free grammars," SIAM Journal on Computing, 2, pp. 174-182, 1973.
SAVITCH, W.J., "Maze-recognizing automata and nondeterministic tape complexity," Journal of Computer and System Sciences, 7, pp. 389-403, 1973.
SCHEINBERG, S., "Note on the boolean properties of context-free languages," Information and Control, 3, pp. 372-375, 1960.
SCHKOLNICK, M., "Two-tape bracketed grammars," Proceedings of the 9th Annual Symposium on Switching and Automata Theory, pp. 315-326, 1968.
SCHUTZENBERGER, M.P., "A remark on finite transducers," Information and Control, 4, pp. 185-196, 1961a.
SCHUTZENBERGER, M.P., "On the definition of a family of automata," Information and Control, 4, pp. 245-270, 1961b.
SCHUTZENBERGER, M.P., "Certain elementary families of automata," in Proceedings of the Symposium on Mathematical Theory of Automata, Polytechnic Institute of Brooklyn, pp. 139-153, 1962a.
SCHUTZENBERGER, M.P., "On a theorem of R. Jungen," Proceedings of the American Mathematical Society, 13, pp. 885-890, 1962b.
SCHUTZENBERGER, M.P., "Finite counting automata," Information and Control, 5, pp. 91-107, 1962c.
SCHUTZENBERGER, M.P., "On context-free languages and pushdown automata," Information and Control, 6, pp. 246-264, 1963.
SCHUTZENBERGER, M.P., Quelques Problèmes Combinatoires de la Théorie des Automates; course notes. Paris: Institut de Programmation, 1967.
SCHUTZENBERGER, M.P., "A remark on acceptable sets of numbers," Journal of the Association for Computing Machinery, 15, pp. 300-303, 1968.
SCHUTZENBERGER, M.P., "Le théorème de Lagrange selon Raney," in Logiques et Automates. Rocquencourt: Séminaires IRIA, 1971.
SCHUTZENBERGER, M.P., "Une caractérisation des parties reconnaissables," in Formal Languages and Programming (R. Aguilar, ed.), pp. 77-82, 1976.
SCHUTZENBERGER, M.P., "Sur une caractérisation des fonctions séquentielles," Research Report 176, IRIA, 1976.
SCOTT, D., "Some definitional suggestions for automata theory," Journal of Computer and System Sciences, 1, pp. 187-212, 1967.
SEIFERAS, J.I., "A note on prefixes of regular languages," ACM SIGACT News, pp. 25-29, January 1974.
SEIFERAS, J.I., "Techniques for separating space complexity classes" and "Relating refined space complexity classes," Journal of Computer and System Sciences, 14, pp. 73-99 and pp. 100-129, 1977.
SEIFERAS, J.I., AND MCNAUGHTON, R., "Regularity-preserving relations," Theoretical Computer Science, 2, pp. 147-154, 1976.
SEKIMOTO, S., MUKAI, K., AND SUDO, M., "A method of minimizing LR(k) parsers," Systems, Computers, and Control, 4, pp. 73-80, 1973.
SEMENOV, A.L., "Algorithmic problems for power series and context-free grammars," Doklady Akademii Nauk SSSR, 212, pp. 50-52, 1973; translated into English in Soviet Mathematics, 14, pp. 1319-1322, 1973.
SHAMIR, E., "A representation theorem for algebraic and context-free power series in noncommuting variables," Information and Control, 11, pp. 239-254, 1967.
SHAMIR, E., "Some inherently ambiguous context-free languages," Information and Control, 18, pp. 355-363, 1971.
SHEPHERDSON, J.C., "The reduction of two-way automata to one-way automata," IBM Journal of Research and Development, 3, pp. 198-200, 1959.
SLISENKO, A.O., "Recognition of palindromes by multihead Turing machines," Proceedings of the Steklov Mathematical Institute, Academy of Sciences of the U.S.S.R., 129, pp. 30-202, 1973.
SMITH, A.R., "Real-time language recognition by one-dimensional cellular automata," Journal of Computer and System Sciences, 6, pp. 233-253, 1972.
SMULLYAN, R.M., Theory of Formal Systems. Princeton: Princeton University Press, 1961.
STANAT, D.F., "A homomorphism theorem for weighted context-free grammars," Journal of Computer and System Sciences, 6, pp. 217-232, 1972.
STANTON, R.G., Numerical Methods for Science and Engineering. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1961.
STEARNS, R.E., "A regularity test for pushdown machines," Information and Control, 11, pp. 323-340, 1967.
STEARNS, R.E., HARTMANIS, J., AND LEWIS, P.M., II, "Hierarchies of memory-limited computations," Conference Record on Switching Circuit Theory and Logical Design, pp. 191-202, 1965.
STRASSEN, V., "Gaussian elimination is not optimal," Numerische Mathematik, 13, pp. 354-356, 1969.
SUDBOROUGH, I.H., "A note on tape-bounded complexity classes and linear context-free languages," Journal of the Association for Computing Machinery, 22, pp. 499-500, 1975.
SUDBOROUGH, I.H., "On tape-bounded complexity classes and multihead finite automata," Journal of Computer and System Sciences, 10, pp. 62-76, 1975.
SUDBOROUGH, I.H., "One-way multihead writing finite automata," Information and Control, 30, pp. 1-20, 1976.
SUDBOROUGH, I.H., "On deterministic context-free languages, multihead automata, and the power of an auxiliary pushdown store," Proceedings of the Eighth Annual ACM Symposium on Theory of Computing, pp. 141-148, 1976.
SZYMANSKI, T.G., AND ULLMAN, J.D., "Evaluating relational expressions with dense and sparse arguments," SIAM Journal on Computing, 6, pp. 109-122, 1977.
SZYMANSKI, T.G., AND WILLIAMS, J.H., "Noncanonical extensions of bottom-up parsing techniques," SIAM Journal on Computing, 5, pp. 231-250, 1976.
TANIGUCHI, K., AND KASAMI, T., "A note on computing time for the recognition of context-free languages by a single-tape Turing machine," Information and Control, 14, pp. 278-284, 1969.
THATCHER, J.W., "Characterizing derivation trees of a context-free grammar through a generalization of finite-automata theory," Journal of Computer and System Sciences, 1, pp. 317-322, 1967.
THATCHER, J.W., "Generalized 2-sequential transducers," Journal of Computer and System Sciences, 4, pp. 339-367, 1970.
TOWNLEY, J.G., "The Measurement of Complex Algorithms," Ph.D. Thesis, Harvard University, Cambridge, Mass., October 1972; also available as Report TR 14-73, Center for Research in Computing Technology, Harvard University.
TURING, A.M., "On computable numbers, with an application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Series 2, 42, pp. 230-265, 1936.
ULLIAN, J., "Failure of a conjecture about context-free languages," Information and Control, 9, pp. 61-65, 1966.
ULLIAN, J.S., "Partial-algorithm problems for context-free languages," Information and Control, 11, pp. 80-101, 1967.
ULLMAN, J.D., "Halting stack automata," Journal of the Association for Computing Machinery, 16, pp. 550-563, 1969.
ULLMAN, J.D., "Applications of language theory to compiler design," in Theoretical Computer Science (A.V. Aho, ed.). Englewood Cliffs, N.J.: Prentice-Hall, 1972.
ULLMAN, J.D., Fundamental Concepts of Programming Systems. Reading, Mass.: Addison-Wesley Publishing Co., 1976.
URPONEN, T., "On axiom systems for regular expressions and on equations involving languages," Annales Universitatis Turkuensis, Series A, Turku, 1971.
URPONEN, T., "Equations with a Dyck-language solution," Information and Control, 30, pp. 21-37, 1976.
VALIANT, L.G., "Decision Procedures for Families of Deterministic Pushdown Automata," Ph.D. Thesis, Department of Computer Science, University of Warwick, 1973.
VALIANT, L., "General context-free recognition in less than cubic time," Journal of Computer and System Sciences, 10, pp. 308-315, 1975.
VALIANT, L.G., "Regularity and related problems for deterministic pushdown automata," Journal of the Association for Computing Machinery, 22, pp. 1-10, 1975.
VALIANT, L.G., "A note on the succinctness of descriptions of deterministic languages," Information and Control, 32, pp. 139-145, 1976.
VALIANT, L.G., AND PATERSON, M.S., "Deterministic one-counter automata," Journal of Computer and System Sciences, 10, pp. 340-350, 1975.
VAN LEEUWEN, J., "A generalization of Parikh's theorem in formal language theory," in Automata, Languages, and Programming (J. Loeckx, ed.), Vol. 14, Lecture Notes in Computer Science, pp. 17-26. Berlin: Springer-Verlag, 1974.
VAN LEEUWEN, J., AND SMITH, C.H., "An improved bound for detecting looping configurations in deterministic pda's," Information Processing Letters, 3, pp. 22-24, 1974.
VAN LEEUWEN, J., "Variations of a new machine model," Proceedings of the 17th Annual Symposium on Foundations of Computer Science, pp. 228-235, 1976.
WARSHALL, S., "A theorem on boolean matrices," Journal of the Association for Computing Machinery, 9, pp. 11-12, 1962.
WEGBREIT, B., "A generator of context-sensitive languages," Journal of Computer and System Sciences, 3, pp. 456-461, 1969.
WEICKER, R., "General context-free language recognition by a RAM with uniform-cost criterion in time n² log n," Technical Report No. 182, The Pennsylvania State University, 1976.
WILLIAMS, J.H., "Bounded-context parsable grammars," Information and Control, 28, pp. 314-334, 1975.
WIRTH, N., "PL 360 - a programming language for the 360 computer," Journal of the Association for Computing Machinery, 15, pp. 37-74, 1968.
WIRTH, N., AND WEBER, H., "EULER, a generalization of ALGOL, and its formal definition," Communications of the Association for Computing Machinery, 9, pp. 13-25 and pp. 88-89, 1966.
WISE, D.S., "Generalized overlap-resolvable grammars and their parsers," Journal of Computer and System Sciences, 6, pp. 538-572, 1972.
WISE, D.S., "A strong pumping lemma for context-free languages," Theoretical Computer Science, 3, pp. 359-369, 1976.
WOOD, D., "The normal-form theorem - another proof," The Computer Journal, 12, pp. 139-147, 1969a.
WOOD, D., "A note on top-down deterministic languages," BIT, 9, pp. 387-399, 1969b.
WOOD, D., "The theory of left-factored languages," The Computer Journal, 12, pp. 349-356, 1969c, and 13, pp. 55-62, 1970.
WOOD, D., "A generalised normal-form theorem for context-free grammars," The Computer Journal, 13, pp. 272-277, 1970.
WOOD, D., "A further note on top-down deterministic languages," The Computer Journal, 14, pp. 396-403, 1971.
WOOD, D., "Some remarks on the KH algorithm for s-grammars," BIT, 13, pp. 476-489, 1973.
WOODS, W.A., "Context-sensitive parsing," Communications of the Association for Computing Machinery, 13, pp. 437-445, 1970.
WRATHALL, C., "Subrecursive predicates and automata," Ph.D. Thesis, Harvard University, 1975; also Research Report 56, Department of Computer Science, Yale University, 1975.
YAMADA, H., "Real-time computation and recursive functions not real-time computable," IRE Professional Group on Electronic Computers, 11, pp. 753-760, 1962.
YAO, A.C., AND RIVEST, R.L., "K + 1 heads are better than K," Proceedings of the 17th Annual Symposium on Foundations of Computer Science, pp. 67-70, 1976.
YNTEMA, M.K., "Inclusion relations among families of context-free languages," Information and Control, 10, pp. 572-597, 1967.
YOUNGER, D.H., "Recognition and parsing of context-free languages in time n³," Information and Control, 10, pp. 189-208, 1967.
YU, Y., "Rudimentary Relations and Formal Languages," Ph.D. Dissertation, University of California, Berkeley, 1970.
ZIMMER, R., "Soft precedence," Information Processing Letters, 1, pp. 108-110, 1972.
INDEX

Aanderaa, Stål O., 297
Accept state of an LR style parser, 526, 528, 529, 547, 552
Acceptable class of binary trees, 451, 452, 455
Acceptable sequence of matrix elements, 449, 450
Acceptance by
  a finite automaton, 46
  pda with empty store, 139, 143-159
  pda with empty store and final state, 139, 143-147, 158
  pda with final state, 139, 143-147, 158, 160
  Turing machine, 282, 285
a-Transducer, 206
  1-bounded, 212
Accessible pair in a dpda (See Deterministic pushdown automaton)
Addison-Wesley Publishing Co., ix
Aho, Alfred V., 44, 133, 423, 512
sf- 2/ Consistency, 554
Albert, Douglas J., ix
ALGOL, 100, 219, 220
ALGOL 60, 219, 220
ALGOL 68, 221
Algorithm, 247
  Cocke-Kasami-Younger, 417, 430-442
    on-line version, 433
  to compute transitive closure, 462, 500
  to decide if
    α generates β in a context-free grammar, 79
    α generates β in a context-sensitive grammar, 295
    a context-free grammar is strict deterministic, 348-350
    a context-free language is empty, 79, 248
    a context-free language is finite, 248
    a context-free language is infinite, 248
    a deterministic context-free language is equal to a regular set, 249
    a given string is in a context-free language, 248
    a given string is in a context-sensitive language, 276
    the null string is in a context-free language, 99
    a variable is reachable, 77
  to eliminate
    null rules, 98, 99, 103
    chain rules, 101, 102
    variables which do not generate terminal strings, 74
  to obtain
    a left parse from a recognition matrix, 434
    the minimal strict partition of a strict deterministic grammar, 348-350
  to place a context-free grammar into
    Chomsky normal form, 104, 105
    Greibach normal form, 111-113, 127-129
    invertible form, 107, 108
    2-standard form, 116, 117
    operator form, 122
  to reduce a context-free grammar, 78, 96, 97
  to test membership in C1 or C2, 167
  Valiant's, 442-470
Alphabet, 2
  terminal, 13
Alt, Helmut, 500
Ambiguous context-free grammar, 27, 28, 43, 233-247, 260
Ambiguous context-free grammar of degree k, 240
Ancestor, 31
Associativity, 2
Augmented grammar, 512
ALR(k) grammar or language, 512, 513, 514, 525
Backtracking, 495
Backus, John W., 18
Backus-Naur form, 18
Backus normal form, 18
BNF, 18, 219
Baker, Brenda S., 331
Baker's theorem, 300-307
Bar-Hillel, Yehoshua, 44, 71, 91, 98, 133, 231, 232, 270
Beatty, John C., ix
Bikle, Dorian, ix
Binary relations, 4
Binary standard form, 106
Binary trees, 450-455
Birdie problems, 261, 270
Bit vector machine, 424, 429, 484, 485, 489
Blattner, Meera, 270
Block, 37, 264
Blum, Manuel, 297
Boasson, Luc, 331
Book, Ronald V., 44, 297, 331, 416
Boolean algebra, 293
Boolean matrix, 428, 429
Boolean matrix multiplication, 426-430, 442, 446-448, 499, 500
Borodin, Alan, 297
Bottom-up parsing, 107
Bounded existential quantification, 295
Bounded regular set, 240
Bounded right context grammar or language, 515, 557, 558
Bounded set, 240, 262, 269, 270
Brainerd, Walter S., 331
B-node, 187-190
Cafarella, Mary A., ix
Cancellation
  left, 3
  right, 3
Canonical collection of sets of LR(k) items, 543, 545, 547, 553
Canonical cross section, 31, 508, 510, 515
  level of, 31
Canonical dpda, 361-366, 378, 380
Canonical form of a form, 325, 326
Canonical generation or derivation (See Generation)
Canonical grammar, 151, 152, 359, 360
Canonical sentential form, 27, 501-505
Canonical two form, 104, 106, 115, 122, 133, 355
Cardinality of a set, 32
Cartesian product of two semilinear sets, 231
Cauchy product, 132
Ceiling, 8
Central submatrix, 455, 456, 464
Central submatrix theorem, 456
Chain in a partially ordered set, 215
Chain rules, 98, 101-103
Chain-free context-free grammar, 107, 110
Checking problem for boolean matrix multiplication, 499
Chomsky, Noam, 17-21, 71, 91, 133, 184, 296, 331
Chomsky hierarchy, 17-21
Chomsky normal form, 103-106, 113, 115, 132, 248, 355, 430, 433, 434, 444, 470, 488, 492, 493
Chomsky-Schutzenberger Theorem, 317
Clique, 35
Closure
  of A0, A1, and A2, 366
  of deterministic languages, 341-346
  Kleene, 3, 56
  operations (See Operations)
  reflexive-transitive, 4, 23, 24, 54
  semigroup, 3
  transitive, 4, 82, 417, 426-430, 442, 445, 446, 448, 456, 457, 461, 462, 470, 489
COBOL, 221
Cocke, John, 500
Cocke-Kasami-Younger Algorithm, 417, 430-432, 444, 445, 471, 481, 488
Code, 213, 214
Coding problem, 213
Coffman, Ralph L., ix
Commutative image, 226
Commutative set, 269, 270
Complement of a context-free language, 198, 206, 246
Complement of a semilinear set, 231
Complementation, 296, 381, 397
Complete graph, 35
Complete set of dotted rules (See Dotted rules)
Completer, 472, 474, 483
Composition
  of functions, 196
  of relations, 4
Composition principle, 452, 455
Computation of a tree automaton, 391
Concatenation, 2, 3, 11, 24, 344
Conjugate, 12
Consistent collection of sets of LR(k) items, 544-554
Consistent set of dotted rules (See Dotted rules)
Context-free grammar or language, 18, 20, 23-31, 60-64, 73-91, 93-133, 148-160, 162, 165, 181, 185-270, 294, 295, 304-307, 312-331, 417-558
Context-free preserving function, 196
Context-sensitive grammar or language, 18, 20, 21, 219, 271-297
Context-sensitive grammar with erasing, 17, 20-22, 272, 274, 275, 305
Converse, 30
Cook, Stephen A., 297
Correct prefix property, 557
Coset, 231
Counter automaton (See Pushdown automaton)
Counter language, 147, 179, 324
Cross section of a tree, 24, 30
Cycle-free context-free grammar, 418, 425, 486
Decision problems, 247-270, 275-277
Degenerate language, 516, 524
Degree of ambiguity, 103, 125, 240
Degree of direct ambiguity (See Direct ambiguity)
Degree of a strict deterministic grammar or language, 367, 374, 375, 392-394
Delay k in a pda (See Pushdown automaton)
Derivation, 14
Derivation tree, 25, 27, 31, 233
Descendant, 189
  immediate, 187, 188
Detection move, 361
Determinism, 21, 110, 280
D-acceptance, 167-169, 178
D-computations, 167, 175-178
Deterministic context-free language, 20, 21, 135, 139, 143, 158, 163-179, 184, 193, 197, 198, 209, 211, 212, 247, 249, 257-259, 269, 289, 299, 307, 309-312, 316, 323, 330, 333-416, 426, 499, 501, 512, 525, 554, 556, 557
Deterministic context-sensitive language, 294
Deterministic counter (See Pushdown automaton)
Deterministic grammar, 391
Deterministic linear bounded automaton (See Linear bounded automaton)
Deterministic iterated counter (See Pushdown automaton)
Deterministic iterated counter language, 312
Deterministic pushdown automaton, 20, 139-144, 146, 147, 161-178, 249, 255, 333, 355-374, 378-380, 393, 394, 515, 557
  accessible pair, 178
  loop-free, 168-171, 178, 369
  mode, 143
  realtime, 147, 162
DSPACE, 287, 289, 290, 294, 296, 426
DTIME, 287
Deterministic Turing machine (See Turing machine)
Diagonal, principal, 423
Diagonal relation (See Relation)
Diagonalization, 276
Difference of a context-free language and a regular set, 197, 198, 243
Direct ambiguity of a context-free grammar or language, 242, 246, 269
  degree, 242, 243
  infinite, 242, 243
Dirichlet's principle, 32
Disjoint subwords, 234
D-node, 187
Distinguished position (See Position)
Distinguishing pair, 348
  terminal, 348, 392
Distributive laws, 446, 448, 455
Distributive property, 449
$-augmented grammar or language, 513
$LR(k) grammar or language, 513, 514, 525
Dombroski, Caryn, ix
Dotted rule, 471-479, 482, 483-485
  complete set of, 475, 477, 478
  consistent set of, 475-482
Double Greibach form, 116
Dpda (See Deterministic pushdown automaton)
Dyck set, 299, 312, 322, 323
Earley, Jay, 246, 500
Effective enumeration, 277, 280, 288
Eilenberg, Samuel, 71, 232
Elementary subtree, 24
Empty set, 79
Encoded messages, 213
Endmarkers, 70, 294, 333, 339
Engelfriet, Joost, 270
EFFk function, 530
Erdős, Pál, 44
Equality problem, 394, 416
Equality relation (See Relation)
Equations, 125-128
Equivalence
  of deterministic pushdown automata, 139
  of grammars, 14
  of strings, 398
  weak, 14
Equivalence relation (See Relation)
Equivalence tree, 411-415
Error state of an LR style parser, 526, 528, 529, 547, 552, 553, 556
Evey, R. J., 184
Explicit transformation, 295
Extended LR(k) theorem, 506, 507, 516, 524
Factorization, 452, 455
  smallest, 453, 454
  string, 186, 187, 190, 219, 233, 234, 236, 241
Fermat, Pierre de, 51
Fin, 211
Final state, 45, 59
Fine, Nathan J., 44
Finite automata, 20, 45-47, 50, 52, 55, 57-59, 110, 163, 249, 258, 262
Finite-turn pushdown automaton (See Pushdown automaton)
Finitely inherently ambiguous context-free language, 240, 243, 245, 247, 268, 269
FIRSTk function, 530
Fischer, Michael J., 133, 297
Fischer, Patrick C., 500
Floyd, Robert W.
Foderaro, John, ix
FOLLOWk function, 530
Form, 325, 326
Formal power series, 125-133, 240
FORTRAN, 220, 221
Friedman, Emily P., 416
Frontier of a tree (See Tree)
Gallaire, Hervé, 500
Gap theorem, 289
Geller, Matthew M., ix, 416, 558
Generalized sequential machine, 206-209, 232, 234, 235, 244, 246, 280, 341
Generation, 14, 26
  canonical, 26, 27
  leftmost, 26, 27
  rightmost, 26, 27
Generation tree, 30
Generators (See Semigroup)
Grammar (See Type)
Ginsburg, Seymour, 49, 184, 232, 246, 270, 297, 331, 416
Goldstine, Jonathan, 232
goto function, 527, 528, 547, 548, 555
Graham, Susan L., ix, 417, 500, 558 (See Susan G. Harrison)
Grammatical tree, 25, 30, 31, 508
Graphs, 35
Gray, James N., 133
Greatest lower bound, 240
Greibach, Sheila A., 44, 133, 184, 232, 297, 327, 331, 416, 500
Greibach normal form, 111-116, 122, 125-133, 159, 353, 354, 394, 399, 557
Gries, David, 44
Group
  finitely presented, 322
  free, 322
  linear, 324
Gruener, William B., ix
Gruska, Jozef, 133, 270
Gsm (See Generalized sequential machine)
Haines, Leonard H., 184, 232
Half-block, 37-39
Halting problem, 247, 250, 254
Handle, 501-505
Hardest context-free language, 300, 326, 417, 425
Hardest context-sensitive language, 296
Hardest deterministic context-free language, 426
Harrison, Anthony M., ix
Harrison, Craig A., v
Harrison, Michael A., iv, 133, 232, 331, 416, 417, 500, 558
Harrison, Susan G., v (See Susan L. Graham)
Hartmanis, Juris, 71, 231, 270, 297, 500
Havel, Ivan M., ix, 416, 558
Height, 24
Hennie, Frederick C., III, 297
Henry, Robert, ix
Hibbard, Thomas N., 246, 331
Hierarchy of strict deterministic languages, 366
Homomorphism, 36, 57, 82, 85, 132, 162, 193, 199, 207, 208, 213, 244, 245, 280, 299, 300, 307, 310, 311, 317, 321, 323, 326, 344, 381
  inverse, 57, 208, 244, 246, 326-331, 341, 344
  k-limited, 277, 278
  lambda-free, 279
  length preserving, 295
Hopcroft, John E., ix, 44, 133, 270, 416, 423
Hull, Richard B., ix
Hunsinger, Ronald A., 500
Hunt, Harry B., III, 133, 270, 297, 557
ID (See Instantaneous description)
Identity, 2, 3, 322
Immediate descendant, 23
Immortal ID (See Instantaneous description)
Inclusion problem, 394, 416
Incomparable elements, 214, 215, 217, 218
Index
  of a derivation, 322, 323
  of a context-free language, 323
    finite, 323
    leftmost, 323
Induced right congruence relation (See Relation)
Inf, 240
Infinitely ambiguous context-free grammar, 240
Infinitely inherently ambiguous context-free language, 240, 242
Inherently ambiguous context-free grammar of degree k, 240, 242
Inherently ambiguous context-free language, 28, 233-247, 260, 330
Inherently ambiguous context-free language of degree k, 240, 241, 269
Init, 211, 246, 316, 327, 336-339
Initial instantaneous description of a pda, 137
Initial state, 45
Input part, 549-552
Instantaneous description
  of an LR style parser, 526, 528, 529
  immortal, 164-171, 179
  of a pushdown automaton, 137
  of a two way finite automaton, 65
  of a nondeterministic Turing machine, 281, 285
    accepting, 282
    initial, 282
  of a two way pushdown automaton, 158, 159
Instruction set, 422, 424
Intersection, 366, 381, 397
  of a context-free language and a regular set, 197
  of two context-free languages, 198
  of two semilinear sets, 231
Inverse, 30
Inverse gsm mappings, 208, 211, 244, 245, 280, 341
Inverse homomorphism (See Homomorphism)
Inverse sequential transducer mappings, 209, 212
Invertible context-free grammar, 107-110, 133, 354
IP (See Input part)
Isomorphism, structural, 24, 383
Iterated counter automaton (See Pushdown automaton)
Iterated counter language, 148, 312, 324
Iteration theorem, 185-194, 231, 236, 238, 242, 246, 333, 387
  for A0, 388
  for A2, 387
  for regular sets, 47
  for sequential forms, 191
Jones, Neil D., 297
Joy, William N., ix
Karp, Richard M., 297
Kasami, Tadao, 500
Kernel of a free group, 322
King, Kimberly N., ix
Kleene, Stephen C., 57, 71
Kleene operations, 355, 366
Kleene's theorem, 57-60, 85
k-limited homomorphism (See Homomorphism)
Knuth, Donald E., 331, 557, 558
König, Dénes, 217
König infinity lemma, 217
Korenjak, Allen J., 416
Kosaraju, S. Rao, 232
Kuroda, S. Y., 296
Label, 24
  root, 24
Lagrange, Joseph L., 40-44
Lagrange inversion formula, 40-44
Lambda-free context-free grammar, 103, 110, 111, 122-124
Lambda-free pushdown automaton (See Pushdown automaton)
Lambda-free substitution (See Substitution)
Lambda moves in a pushdown automaton, 136, 147, 159-162
Lambda-rules, 18, 98-100, 103, 471
Lambda-transitions, 373
Landweber, Peter S., 296, 297
Language, 14
Leaf, 24
Least upper bound, 240
Left(G), 305, 306
Left linear context-free grammar or language, 19, 20, 64, 81
Leftmost generation or derivation (See Generation)
Left parse, 434, 435
Left part, 382, 383, 508
Left part property, 383-386
Left part Theorem, 382, 384-387, 510
Left recursion, 93, 111, 148
Left recursive
  grammar, 111, 113, 353
  variable, 111, 112
LR parsers, 525-530
LR parsing, 501-558
LR(k) grammar or language, 20, 21, 501-558
LR(k) items, 530-544
LR(0) grammar or language, 509-511, 513-525, 557
LR(0) language characterization theorem, 515-524, 558
Left-to-right order (See Tree)
Length of a word, 2
Lentin, André, 44
Letter-equivalent, 227, 228
Levi, F. W., 5, 44
Lewis, Philip M., II, 500, 558
Linear, 296
Linear bounded automaton, 20, 21, 290-294, 296
  deterministic, 293-295
Linear context-free grammar or language, 19, 20, 28, 29, 64, 82, 182, 183, 193, 206, 212, 221-225, 240, 242, 270, 296, 323, 441
Linear normal form, 64, 440-442
Linear set, 226, 231
Linear-time pushdown automaton (See Pushdown automaton)
L ∩ R theorem, 231
Lipton, Richard J., 331
LLR(k) grammar or language, 514, 557, 558
Log space, 324, 331
Logarithmic cost function, 424, 426
Lomet, David B.
Loop-free deterministic pushdown automaton (See Deterministic pushdown automaton)
Lower bound, 240
Łukasiewicz, Jan, 323
Łukasiewicz language, 323, 324
Lynch, Nancy, 331
McNaughton, Robert, 71, 133, 331
Marked operations, 346, 366
Matrices, 132, 417
Matrix
  multiplication of, 426-430, 446-448
  representation, 132
  strictly upper triangular, 423, 448, 456, 461, 462
  transitively closed, 443, 456, 458, 459, 461
  upper triangular, 423
Matthews, G. H., 331
Max, 346, 366, 381
Measurable function, 284
Mehlhorn, Kurt, 331, 500
Message set, 213
Metalinear grammar or language, 64, 206, 225, 270, 323
  of width k, 225
Miller, George A., 71
Min, 346, 366, 516, 517
Minimal elements in a partially ordered set, 217, 218
Minimal linear context-free grammar or language, 29, 240, 242, 323
Minsky, Marvin, 331
Mode of a deterministic pushdown automaton (See Deterministic pushdown automaton)
Moderate pushdown automaton (See Pushdown automaton)
Modified Post correspondence problem, 251-254, 395
Monoid, 3, 5
  free, 5, 313
Monotone transformations, 400, 401, 405, 409, 410, 415
Monotonic grammar or language, 20-23, 272-275, 290
Moura, Arnaldo, ix
Move
  decreasing, 180, 181
  increasing, 180-182
  nondecreasing, 180-182
  nonincreasing, 180-182
Move relation
  of a pushdown automaton, 137
  of a two way finite automaton, 65
MPCP (See Modified Post correspondence problem)
Mu function for word reduction, 313-317, 322, 327, 329, 330
Myhill, John, 53, 71, 296, 297
National Science Foundation, ix
Naur, Peter, 18
Nerode, Anil, 71
Nilpotent, 40
Nivat, Maurice, 232, 331
Node, 23, 24
  internal, 24
  lambda, 25, 382
  nonterminal, 25
  terminal, 25, 382
Nonassociativity, 428, 442, 444
Nondeterminism, 21, 280
NP, 280, 287-289
NP-complete, 288
NP-hard, 288
NSPACE, 290, 294, 296
NTIME, 287
Nondeterministic Turing machine (See Turing machine)
Nonrepetitive words, 36-39
Nonterminal, 13
Nonterminal bounded context-free grammar or language, 183, 184, 324
Nonterminal rewriting grammar or language, 272, 275, 300, 301, 305, 307
Normal forms for context-free grammars
  binary standard form, 106
  canonical two form, 104, 106, 115, 122, 133, 355
  Chomsky normal form, 103-106
  Greibach normal form, 111-116
  operator normal form, 121-124, 133
Normalization of a form, 325, 326
Norms, 93-98
Notational conventions, 14
Null rules, elimination of, 98
Null word (See Word)
Off-line algorithm, 433
Off-line computation, 283
Ogden, William F., 231, 246
One-bounded (See a-Transducer)
One letter alphabet, 194-196
One turn pushdown automaton, 20
On-line algorithm, 433, 483, 500
On-line computation, 283, 289, 440, 489, 498
OP (See Output part)
Operations, closure, 55-57, 60, 80, 85, 164-178, 277-280
Operator transpose, 4, 277
Operator grammar, 121-124, 133
Order
  lexicographic, 10-12
  partial, 10, 214-219, 240, 305
  total, 10-12, 306
Order of magnitude, 96
Origin-crossing problem, 289
Output part, 549-552
Pairing lemma, 226, 229
Palindrome, 5
  with a center marker, 139
  even length, 5, 19, 141
    with a center marker, 139
Parenthesis grammar or language, 133, 324, 325, 331
Parenthesis-free notation, 323
Parikh, Rohit J., 232, 246
Parikh mapping, 226
Parikh's theorem, 185, 225-232
Parser, 333
Parsing, 417-558
Parsing action function, 527-529, 547, 548, 555
Partial order (See Order)
Partial solution to the MPCP, 253
Partial transformation tree, 411
Partition, 347, 391
  strict, 347, 348, 360, 367, 378, 383, 384, 386
    maximal, 354
    minimal, 378
PASCAL, 221
Path, 23, 24, 30
Pathological production, 514, 525
PCP (See Post correspondence problem)
Pda (See Pushdown automaton)
Perles, Micha A., 91, 98, 133, 231, 232, 270
Permutation, 109
Phrase, 107, 234, 239
Phrase structure grammar or language, 13-23, 271-297, 305, 323
Piricka-Kelemenova, Alica, 133
Pleasants, P. A. B., 44
Polish notation, 323
P, 280, 287-290, 296, 297, 331
Polynomially reducible, 288
P-SPACE, 287, 289
Position, 186, 188, 190
Post, Emil L., 270
Post correspondence problem, 247, 249-252, 254, 256, 258, 260, 270
Power series, 40, 41
Precedence analysis, 121
Predicate, 265, 266, 268
  nontrivial, 265
Predictor, 471-473, 475, 477, 478, 483, 488
Prefix, 6, 7, 257, 347, 352, 383
Prefix-free set, 262, 333, 352, 355, 358, 386, 416
Presburger formulae, 231
Presburger set, 231
Prime, 50-52, 193, 194
Probert, Robert L., 500
Product of sets, 3, 56, 366, 381, 397
Productions, 13
Programming languages, 219-221, 501
  PL/I, 220, 221
Projection, 245, 344, 346
Proper erasing transition, 372
Pumping lemma, 190, 192, 193, 226, 231, 248
Pushdown automaton, 20, 21, 135-184, 225, 249, 311, 340
  counter, 147
  deterministic counter, 179
  deterministic iterated counter, 179
  finite-turn, 135, 179-184, 212, 262
  iterated counter, 148, 312
  lambda-free, 142, 160
  linear time, 142, 143, 151, 158
  moderate, 161, 162
  one state, realtime, deterministic, 334, 393, 394
  operating with delay k, 142, 143, 158, 160
  quasi-realtime, 142, 143
  realtime, 142, 143, 393
  size of, 160-162, 166, 167
  two way deterministic, 158, 159, 489
  two way nondeterministic, 158, 159, 489, 499
  unambiguous, 142, 143, 151, 243, 244
Pushdown store, 21, 97, 135-138, 179-183, 225, 311, 312, 526
Quasi-inverse, 132
Quasi-realtime deterministic language, 376, 377
Quasi-realtime pushdown automaton (See Pushdown automaton)
Quotient of sets, 50, 209-211, 232, 245, 333-340, 366, 381, 416
Rabin, Michael O., 71, 297
RAM (See Random access machine)
Ramsey, Frank P., 31-36, 44
Ramsey's Theorem, 31-36
Random access machine, 94, 96, 422, 424, 426, 429, 430, 437, 485, 489, 499
Rank
  of an equivalence relation, 51, 367
  of a string in a nonterminal bounded grammar, 183, 184
Rationally closed set of formal power series, 132
Reachable symbol, 76-78
Realtime pushdown automaton (See Pushdown automaton)
Realtime strict deterministic grammar or language, 333, 377-381, 393, 394
Realtime Turing machine, 288, 289
Recognition matrix, 430, 432-435, 437, 441, 442, 444, 477, 485
Recognition problem, 248, 280, 417-500
Recursive descent parsers, 111
Recursive function
  total, 196, 258, 267, 288, 289
Recursive set, 21, 79, 257, 276, 277, 279, 288, 296, 311
Recursive variable, 98, 193, 249
Recursively enumerable set, 20, 21, 257, 282, 294, 299, 307, 310, 326
Reduce move, 544, 547, 552, 556
Reduced
  context-free grammar, 73-79, 96, 97, 350, 386
  form, 325, 326
  word, 313
Reduction
  closure of a matrix, 458
  move, 36
  matrix, 456-458, 460
Reference, dangling, 558
Refinement of an equivalence relation, 51
Reflexive-transitive closure (See Closure)
Register, 422
Regular expression, 60
Regular relation, 213
Regular set, 20, 46-53, 57-60, 62-64, 70, 71, 81, 85-90, 132, 143, 163, 185, 193-210, 213, 214, 222-225, 228, 243, 249, 259, 268, 269, 295, 299, 317, 321-324, 334, 335, 340, 341, 344, 345, 355, 366, 381, 397
  standard, 323
Relation
  binary, 4, 367
  congruence, 313, 322, 398
  converse of, 30
  diagonal, 4
  equality, 4
  equivalence, 12, 51, 375, 398
  induced right congruence, 51, 70
  relative right congruence, 367
  right congruence, 51, 355
Remainder, 253
Representation theorems, 299-331
Response, 46
Revelle, John D., ix
Reverse Greibach form, 116
Rice, H. Gordon, 232
Right congruence relation (See Relation)
Right linear context-free grammar or language, 19, 20, 29, 60-64, 81, 193, 223
Right parse, 485
Right recursive
  grammar, 111
  variable, 111
Rightmost generation or derivation (See Generation)
Ritchie, Robert W., 331
Root, 12
  label, 24
  primitive, 12
Rose, Gene F., 232
Rosenberg, Arnold L., 71, 297, 416
Rosenkrantz, Daniel J., 133, 270
Rozenberg, Grzegorz, 270
Rudimentary sets, 295
Rule (See Production), 13
Ruzzo, Walter L., ix, 417, 500
Ryser, Herbert J., 44
Salomaa, Arto, 44, 71, 270, 331
Sannella, Donald, ix
Santos, Eugene S., 331
Saturated letter, 379, 380
Sautter, Joan M., ix
Savitch, Walter J., 297
Savitch's theorem, 271, 286, 287, 289
Scanner, 472, 479, 483
Schkolnick, Mario, 331
Schutzenberger, Marcel P., 44, 133, 184, 331
Scott, Dana, 71
Self-embedding context-free grammar or language, 85-90, 381
Semi-Dyck set, 299, 312-322, 324, 326, 327, 331, 345, 397, 525
Semigroup, 3, 5, 40
  free, 5, 9, 213, 214
  generation of, 5, 213
  nilpotent, 40
Semilinear image, 221, 231
Semilinear set, 225-231
Sentence, Presburger, 231
Sentential form, 14, 190, 191, 261
Sequential machine, 246
Sequential transducer, 197-213
Sequential transducer theorem, 200-205, 244
Shamir, Eliahu, 71, 91, 98, 133, 231, 232, 242, 246, 270
Shank, Herbert S., 71, 231
Shepherdson, John C., 71
Shift move, 544, 547, 556
Shoe box principle, 32
Shuffle, 206, 331
Simons, Barbara B., ix
Simple grammar or language, 334, 392-416, 426
Simple machines, 392-398
Simple phrase, 234
Size
  of a context-free grammar, 94, 98, 106, 129, 160
  of a context-free language, 103
  of a finite automaton, 110
  of a pushdown automaton, 160-162, 166, 167
  of a strict partition, 360, 367, 374, 392
Solvability, 247-249
SP (See Stack part)
Space bounded Turing machine (See Turing machine)
Space complexity, 284, 285, 425
Spanier, Edwin H., 184, 270
Speed-up theorem, 288, 297
Springsteel, Fred N., 331
Stack (See Pushdown store)
Stack bounded context-free language, 324
Stack part, 549-552
k-Standard form, 112, 116, 117
2-Standard form, 116-121, 122, 124, 132, 133, 162
Stanton, Ralph G., 44
Start symbol, 13
State, 45, 535
State graph, 46, 535
Stearns, Richard E., 297, 500, 558
Stencil, 108
Strassen, Volker, 500
Strict deterministic grammar or language, 333, 346-394, 416, 509, 512, 515, 525
Strict deterministic property of a tree automaton, 391
Strict partition (See Partition)
Strictly upper triangular matrix (See Matrix)
String, 2 (See Word)
Strongly repetitive word, 40
Subsequence, 214, 217
Substitution
  mapping, 82, 85, 280
  lambda-free, 280
  theorem, 82-85, 90
Subtree, elementary, 24
Subwords, 211, 246
Sudborough, Ivan H., 297, 500
Suffix, 6, 7, 347
Sup, 240
Supersequence, 217
Support, 125, 132
Suzuki, Ruth, ix
Sweep, 180-182
  decreasing, 180, 181
  increasing, 180, 181
Szekeres, George, 44
Szymanski, Thomas G., 133, 551
Tables for LR style parsers, 526-528
Taniguchi, Kenichi, 500
Tape complexity (See Space complexity)
Term, 449, 455, 460
  minimal, 455, 459
Terminal alphabet, 13
Terminal bounded grammar or language, 300-307
Terms, formally distinct, 449, 460
Time bounded Turing machine (See Turing machine)
Time complexity, 283, 284, 423, 425
Transduction, 213, 324
Transformation tree, 409-411
Transformations, 400, 411, 414
Transition function, 45, 46
Transition matrix, 334-338, 490, 491
  o, 339, 340
Transition systems, 52-57, 60
Transitive closure (See Closure)
Transpose
  string, 4
  set, 4, 55, 80, 81, 267, 344, 381, 397
Tree, 23-31, 381-386, 391
  derivation, 25, 386, 387
  frontier of, 24, 383-386
  generation, 30
  grammatical, 25, 30, 382, 386
  height of, 23
  internal node of, 24
  leaf, 24
  left-to-right order in, 24
  minimal, 454
  root of, 23, 384, 386
Tree automaton, 391
Tree structures, 30, 31
Trees
  structurally isomorphic, 24
  structural isomorphism of, 24
Triangles, 44
Turing, Alan M., 270
Turing machine, 20, 21, 247, 249, 250, 252, 257, 271, 311, 312, 324, 395, 422, 426, 437-440, 490-499
  deterministic, 20, 147, 281, 282, 284-288, 292, 294
  nondeterministic, 20, 280-282, 286, 287, 294, 331, 499
  realtime, 288, 289, 333, 375, 376, 381, 416
  tape bounded, 280-290, 297, 493-499
  time bounded, 280-290, 297
Turn, 180-183
2-Standard form, 118-121
Two way finite automaton, 65-71
Two-way(G), 305, 306
Two way pushdown automaton (See Pushdown automaton)
Type 0 grammar or language, 17, 20, 212-215
uciv language, 224, 225
Ullian, Joseph, 246, 270
Ullman, Jeffrey D., 44, 133, 270, 416, 423, 512, 558
Ultimately periodic sequence, 49-50
Ultimately periodic set, 47-50, 195, 196, 369
Ultralinear context-free grammar or language, 181-184, 212, 270
Unambiguous context-free grammar or language, 27, 62, 81, 82, 103, 110, 115, 151, 158, 197, 233, 234, 240, 243-246, 261, 269, 322, 326, 330, 354, 386, 387, 503-505, 514
Unambiguous pushdown automaton (See Pushdown automaton)
Unbounded set, 262, 263, 268-270
Uniform cost criterion, 424, 426
Union, 366, 381, 397
Unique factorization theorem for LR(0) languages, 524, 525
Unsolvable problems, 250, 252, 256-261, 269, 295
Upper bound, 240
Valiant, Leslie G., 417, 442, 500
Valiant's algorithm, 442-470, 489
Valiant's lemma, 449, 455, 458-460, 465, 467
Valid LR(k) item for a string, 538-543
Valid prefix, 536-538
Valid transformation, 400, 401, 405, 409, 410
Variables, 13
Viable prefix, 536-538
Vocabulary, 13
Wagoner, Neil, ix
Warshall, Stephen, 500
Warshall's algorithm, 429
Weight
  of a phrase structure grammar, 22, 273, 274
  of a phrase structure production, 273, 274
Width
  of a context-free rule, 117-120
  of a metalinear grammar, 225
Wilf, Herbert S., 44
Winkler, Timothy C., ix
Winograd, Shmuel, 500
Wise, David S., 231
Witness, 399, 407
  shortest, 399, 408-410
Wood, Derick, 133
Word, 2
  empty, 2
  nonrepetitive, 36-39
  null, 2
  strongly repetitive, 40
Word problem for groups, 322, 324
Words
  conjugate, 12
Yamada, Hisao, 71
Yehudai, Amiram, ix, 133, 416, 500
Yntema, Mary K., 331
Younger, Daniel H., 500
Yu, Y. Y., 297
Zalcstein, Yechezkel, 331
Zero, 40
Zero-origin addressing, 423
<sentence> → <subject><predicate>
<subject> → <determiner>
<determiner> → they
<predicate> → <verb phrase>
<verb phrase> → <verb><noun phrase>
<verb> → are
<noun phrase> → <adjective><noun>
<adjective> → pumping
<noun> → lemmings
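The cover grammar is an ordinary context-free grammar with exactly one production per nonterminal, so it generates exactly one sentence, and the derivation can be checked mechanically. The following is a minimal Python sketch (the dictionary encoding and the function name are illustrative, not from the book) that expands <sentence> by always rewriting the leftmost nonterminal:

```python
# A minimal sketch: the cover grammar as a dictionary mapping each
# nonterminal to the right-hand side of its (unique) production.
# The encoding and names here are illustrative, not from the book.
GRAMMAR = {
    "<sentence>": ["<subject>", "<predicate>"],
    "<subject>": ["<determiner>"],
    "<determiner>": ["they"],
    "<predicate>": ["<verb phrase>"],
    "<verb phrase>": ["<verb>", "<noun phrase>"],
    "<verb>": ["are"],
    "<noun phrase>": ["<adjective>", "<noun>"],
    "<adjective>": ["pumping"],
    "<noun>": ["lemmings"],
}

def leftmost_derivation(start="<sentence>"):
    """Rewrite the leftmost nonterminal until only terminals remain,
    printing each sentential form along the way."""
    form = [start]
    while any(sym in GRAMMAR for sym in form):
        i = next(j for j, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]]  # apply the unique rule for form[i]
        print(" ".join(form))
    return " ".join(form)

leftmost_derivation()  # last sentential form: they are pumping lemmings
```

Since every nonterminal has a single rule, the choice of leftmost rewriting is immaterial here; any derivation order yields the same terminal string.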