/
Автор: Maienborn Claudia Heusinger Klaus von Portner Paul
Теги: semantics linguistics
ISBN: 978-3-11-018470-9
Год: 2011
Текст
. tics
An International Handbook of
Natural Language Meaning
Volume i
Edited by
Claudia aienborn
Klaus von Heusinger
Paul Portner
Handbooks of Linguistics and Communication Science *2 *2 "1
Handbiicher zur Sprach- und Kommunikationswissenschaft J J • I
DE GRUYTER
MOUTON
ISBN 978-3-11-018470-9
e-ISBN 978-3-11-022661-4
ISSN 1861-5090
Library of Congress Cataloging-in-Publication Data
Semantics : an international handbook of natural language
Maienborn, Klaus von Heusinger, Paul Portner.
p. cm. - (Handbooks of linguistics and communication
Includes bibliographical references and index.
ISBN 978-3-11-018470-9 (alk. paper)
1. Semantics—Handbooks, manuals, etc. I. Maienborn,
von. III. Portner, Paul.
P325.S3799 2011
401'.43-dc22
meaning / edited by Claudia
science
Claudia.
33.1)
II. Heusinger, Klaus
2011013836
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie:
detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
© 2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Cover design: Martin Zech, Bremen
Typesetting: RefineCatch Ltd, Bungay, Suffolk
Printing: Hubert & Co. GmbH & Co. KG, Gottingcn
© Printed on acid-free paper
Printed in Germany
www.degruytcr.com
Contents
Volume i
I. Foundations of semantics
1. Meaning in linguistics • Claudia Maienborn, Klaus von Heusinger
and Paul Portner 1
2. Meaning, intentionality and communication • Pierre Jacob 11
3. (Frege on) Sense and reference • Mark Textor 25
4. Reference: Foundational issues • Barbara Abbott 49
5. Meaning in language use • Georgia M. Green 74
6. Compositionality • Peter Pagin and Dag Wcstcrstahl 96
7. Lexical decomposition: Foundational issues • Stefan Engelberg 124
II. History of semantics
8. Meaning in pre-19th century thought • Stephan Meier-Oeser 145
9. The emergence of linguistic semantics in the 19th and
early 20th century • Brigittc Ncrlich 172
10. The influence of logic on semantics • Albert Ncwcn and
Bcrnhard Schroder 191
11. Formal semantics and rcprcscntationalism • Ruth Kempson 216
III. Methods in semantic research
12. Varieties of semantic evidence • Manfred Krifka 242
13. Methods in cross-linguistic semantics • Lisa Matthcwson 268
14. Formal methods in semantics • Alice G.B. tcr Meulen 285
15. The application of experimental methods in semantics •
Oliver Bott,Sam Featherston, Janina Rado and Britta Stolterfoht 305
IV. Lexical semantics
16. Semantic features and primes • Manfred Bierwisch 322
17. Frameworks of lexical decomposition of verbs • Stefan Engelberg 358
18. Thematic roles • Anthony R. Davis 399
19. Lexical Conceptual Structure • Beth Levin and
Malka Rappaport Hovav 420
20. Idioms and collocations • Christiane Fellbaum 441
21. Sense relations • Ronnie Cann 456
22. Dual oppositions in lexical meaning • Sebastian Lobncr 479
V.
Ambiguity and vagueness
23. Ambiguity and vagueness: An overview • Christopher Kennedy 507
24. Semantic underspecification • Markus Egg 535
25. Mismatches and coercion • Henriette de Swart 574
26. Metaphors and metonymies • Andrea Tyler and
Hiroshi Takahashi 597
VI. Cognitively oriented approaches to semantics
27. Cognitive Semantics: An overview • Leonard Talmy 622
28. Prototype theory • John R. Taylor 643
29. Frame Semantics • Jean Mark Gawron 664
30. Conceptual Semantics • Ray Jackcndoff 688
31. Two-level Semantics: Semantic Form and Conceptual
Structure • Ewald Lang and Claudia Maienborn 709
32. Word meaning and world knowledge • Jerry R. Hobbs 740
VII. Theories of sentence semantics
33. Model-theoretic semantics • Thomas Edc Zimmcrmann 762
34. Event semantics • Claudia Maienborn 802
35. Situation Semantics and the ontology of natural language •
Jonathan Ginzburg 830
VIII. Theories of discourse semantics
36. Situation Semantics: From indexicality to metacommunicative
interaction • Jonathan Ginzburg 852
37. Discourse Representation Theory • Hans Kamp and Uwe Reyle 872
38. Dynamic semantics • Paul Dekker 923
39. Rhetorical relations • Hcnk Zccvat 946
Volume 2
IX. Noun phrase semantics
40. Pronouns • Daniel Biiring
41. Dcfinitcncss and indcfinitcncss • Irene Hcim
42. Specificity • Klaus von Heusinger
43. Quantifiers • Edward Keenan
44. Bare noun phrases- Veneeta Dayal
45. Posscssivcs and relational nouns • Chris Barker
46. Mass nouns and plurals • Peter Lascrsohn
47. Gcncricity • Gregory Carlson
Contents
xiii
X. Verb phrase semantics
48. Aspectual class and Aktionsart • Hana Filip
49. Perfect and progressive • Paul Portner
50. Verbal mood • Paul Portner
51. Deverbal nominalization • Jane Grimshaw
XI. Semantics of adjectives and adverb(ial)s
52. Adjectives • Violeta Demonte
53. Comparison constructions • Sigrid Beck
54. Adverbs and adverbials • Claudia Maienborn and Martin Schafer
55. Adverbial clauses • Kjcll Johan Saeb0
56. Secondary predicates • Susan Rothstcin
XII. Semantics of intensional contexts
57. Tense -Toshiyuki Ogihara
58. Modality • Valentine Hacquard
59. Conditionals • Kai von Fintel
60. Propositional attitudes • Eric Swanson
61. Indexicality and de se reports • Philippe Schlenkcr
XIII. Scope, negation, and conjunction
62. Scope and binding • Anna Szabolcsi
63. Negation • Elena Hcrburgcr
64. Negative and positive polarity items • Anastasia Giannakidou
65. Coordination • Roberto Zamparclli
XIV. Sentence types
66. Questions • Manfred Krifka
67. Imperatives • Chung-hye Han
68. Copular clauses • Line Mikkelsen
69. Existential sentences • Louise McNally
70. Ellipsis • Ingo Reich
XV Information structure
71. Information structure and truth-conditional semantics • Stefan Hinterwimmer
72. Topics • Craigc Roberts
73. Discourse effects of word order variation • Gregory Ward and Betty J. Birner
74. Cohesion and coherence • Andrew Kehler
75. Discourse anaphora, accessibility, and modal subordination • Bart GeurLs
76. Discourse particles • Make Zimmermann
Volume 3
XVI. The interface of semantics with phonology and morphology
77. Semantics of intonation • Hubert Truckcnbrodt
78. Semantics of inflection • Paul Kiparsky and Judith Tonhauscr
79. Semantics of derivational morphology • Rochcllc Licbcr
80. Semantics of compounds • Susan Olsen
81. Semantics in Distributed Morphology • Heidi Harley
XVII. The syntax-semantics interface
82. Syntax and semantics: An overview • Arnim von Stcchow
83. Argument structure • David Pcsctsky
84. Operations on argument structure • Dieter Wundcrlich
85. Type shifting • Helen do Hoop
86. Constructional meaning and compositionality • Paul Kay and
Laura A. Michaelis
87. The grammatical view of scalar implicatures and the relationship between
semantics and pragmatics • Gcnnaro Chicrchia, Danny Fox and
Benjamin Spcctor
XVI11.The semantics-pragmatics interface
88. Semantics/pragmatics boundary disputes • Katarzyna M. Jaszczolt
89. Context dependency • Thomas Edc Zimmcrmann
90. Dcixis and demonstratives • Holgcr Dicsscl
91. Presupposition • David Beaver and Bart Geurts
92. Implicature • Mandy Simons
93. Game theory in semantics and pragmatics • Gerhard Jager
94. Conventional implicature and expressive content • Christopher Potts
XIX. Typology and crosslinguistic semantics
95. Semantic types across languages • Emmon Bach and Wynn Chao
96. Count/mass distinctions across languages • Jenny Doctjcs
97. Tense and aspect:Time across languages • Carlota S. Smith
98. The expression of space across languages • Eric Pederson
XX. Diachronic semantics
99. Theories of meaning change: An overview • Gcrd Fritz
100. Cognitive approaches to diachronic semantics • Dirk Gccracrts
101. Grammaticalization and semantic reanalysis • Regine Eckardt
XXI. Processing and learning meaning
102. Meaning in psycholinguistics • Lyn Frazier
103. Meaning in first language acquisition • Stephen Crain
Contents
xv
104. Meaning in second language acquisition • Roumyana Slabakova
105. Meaning in ncurolinguistics • Roumyana Panchcva
106. Conceptual knowledge, categorization and meaning • Stephanie Kcltcr and
Barbara Kaup
107. Space in semantics and cognition • Barbara Landau
XXII. Semantics and computer science
108. Semantic research in computational linguistics • Manfred Pinkal and
Alexander Kollcr
109. Semantics in corpus linguistics • Graham Katz
110. Semantics in computational lexicons • Anette Frank and Sebastian Pado
111. Web Semantics • Paul Buitelaar
112. Semantic issues in machine translation • Kurt Eberle
I. Foundations of semantics
1. Meaning in linguistics
1. Introduction
2. Truth
3. Compositionality
4. Context and discourse
5. Meaning in contemporary semantics
6. References
Abstract
The article provides an introduction to the study of meaning in modern semantics.
Major tenets, tools, and goals of semantic theorizing are illustrated by discussing typical
approaches to three central characteristics of natural language meaning: truth conditions,
compositionality, and context and discourse.
1. Introduction
Meaning is a key concept of cognition, communication and culture, and there is a
diversity of ways to understand it, reflecting the many uses to which the concept can be put. In
the following we take the perspective on meaning developed within linguistics, in
particular modern semantics, and we aim to explain the ways in which semanticists approach,
describe, test and analyze meaning. The fact that semantics is a component of linguistic
theory is what distinguishes it from approaches to meaning in other fields like
philosophy, psychology, semiotics or cultural studies. As part of linguistic theory, semantics is
characterized by at least the following features:
1. Empirical coverage: It strives to account for meaning in all of the world's languages.
2. Linguistic interfaces: It operates as a subtheory of the broader linguistic system,
interacting with other subthcorics such as syntax, pragmatics, phonology and
morphology.
3. Formal cxplicitcness: It is laid out in an explicit and precise way, allowing the
community of semanticists to jointly test it, improve it, and apply it to new theoretical
problems and practical goals.
4. Scientific paradigm: It is judged on the same criteria as other scientific theories, viz.
coherence, conceptual simplicity, its ability to unify our understanding of diverse
phenomena (within or across languages), to raise new questions and open up new
horizons for research.
In the following we exemplify these four features on three central issues in modern
semantic theory that define our understanding of meaning: truth conditions,
compositionality, and context and discourse.
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK33.1), deGruyter, 1-10
2
I. Foundations of semantics
2. Truth
If one is to develop an explicit and precise scientific theory of meaning, the first thing
one needs to do is to identify some of the data which the theory will respond to, and
there is one type of data which virtually all work in semantics takes as fundamental:
truth conditions. At an intuitive level, truth conditions are merely the most obvious way
of understanding the meaning of a declarative sentence. If I say It is raining outside, I
have described the world in a certain way. I may have described it correctly, in which case
what I said is true, or I may have described it incorrectly, in which case it is false. Any
competent speaker knows to a high degree of precision what the weather must be like
for my sentence to count as true (a correct description) or false (an incorrect
description). In other words, such a speaker knows the truth conditions of my sentence. This
knowledge of truth conditions is extremely robust - far and wide, English speakers can
make agreeing judgments about what would make my sentence true or false - and as a
result, we can see the truth conditions themselves as a reliable fact about language which
can serve as part of the basis for semantic theory.
While truth conditions constitute some of the most basic data for semantics, different
approaches to semantics reckon with them in different ways. Some theories treat truth
conditions not merely as the data which semantics is to deal with, but more than this
as the very model of sentential meaning. This perspective can be summarized with the
slogan "meaning is truth conditions", and within this tradition, we find statements like
the following:
(1) [[It is raining outside ]],s = TRUE iff it is raining outside of the building where the
speaker s is located at time t, and = FALSE otherwise.
The double brackets [[ X ]] around an expression X names the semantic value of X in
the terms of the theory in question. Thus, (1) indicates a theory which takes the semantic
value of a sentence to be its truth value,TRUE or FALSE.The meaning of the sentence,
according to the truth conditional theory, is then captured by the entire statement (1).
Although (1) represents a truth conditional theory according to which semantic value
and meaning (i.e., the truth conditions) are distinct (the semantic value is a crucial
component in giving the meaning), other truth conditional theories use techniques which
allow meaning to be reified, and thus identified with semantic value, in a certain sense.
The most well-known and important such approach is based on possible worlds:
(2) a. [[ It is raining outside ]]w,s = TRUE iff it is raining outside of the building where
the speaker s is located at time t in world w, and = FALSE otherwise,
b. [[ It is raining outside ]]'•* = the set of worlds {w : it is raining outside of the
building where the speaker s is located at time t in world w]
A possible world is a complete way the world could be. (Other theories use constructs
similar to possible worlds, such as situations.) The statement in (2a) says virtually the
same thing as (1), making explicit only that the meaning of It is raining outside depends
not merely on the actual weather outside, but whatever the weather may turn out to be.
Crucially, by allowing the possible world to be treated as an arbitrary point of evaluation,
as in (2a), we are able to identify the truth conditions with the set of all such points, as
1. Meaning in linguistics 3
in (2b). In (2), we have two different kinds of semantic value: the one in (2a), relativized
to world, time, and speaker, corresponds to (1), and is often called the extension or
reference. That in (2b), where the world point of evaluation has been transferred into the
semantic value itself, is then called the intension or sense. The sense of a full sentence,
for example given as a set of possible worlds as in (2b), is called a proposition. Specific
theories differ in the precise nature of the extension and intension: The intension may
involve more or different parameters than w, t, s, and several of these may be gathered
into a set (along with the world) to form the intension. For example, in tense semantics,
we often see intensions treated as sets of pairs of a world and a time.
The majority of work in semantics follows the truth conditional approach to the
extent of making statements like those in (l)-(2) the fundamental fabric of the theory.
Scholars often produce explicit fragments, i.e. mini-theories which cover a subset of a
language, which are actually functions from expressions of a language to semantic values,
with the semantic values of sentences being truth conditional in the vein of (l)-(2). But
not all semantic research is truth conditional in this explicit way. Descriptive linguistics,
functional linguistics, typological linguistics and cognitive linguistics frequently make
important claims about meaning (in a particular language, or crosslinguistically). For
example, Wolfart (1973: 25), a descriptive study of Plains Cree states: "Semantically,
direction serves to specify actor and goal. In sentence (3), for instances, the direct theme
sign /a7 indicates the noun atim as goal, whereas the inverse theme sign /ekw/ in (4) marks
the same noun as actor."
(3) nisekihanan atim
scare(lp-3) dog(3)
'We scare the dog.'
(4) nisekihikonan atim
scare(3-lp) dog(3)
'The dog scares us.'
Despite not being framed as such, this passage is implicitly truth conditional. Wolfart is
stating a difference in truth conditions which depends on the grammatical category of
direction using the descriptions "actor" and "goal", and using the translations of cited
examples. This example serves to illustrate the centrality of truth conditions to any
attempt to think about the nature of linguistic meaning.
As a corollary to the focus on truth conditions, semantic theories typically take
relations like entailment, synonymy, and contradiction to provide crucial data as well.
Thus, the example sentence It is raining outside entails (5), and this fact is known to any
competent speaker.
(5) It is raining outside or the kids are playing with the water hose.
Obviously, this entailment can be understood in terms of truth conditions (the truth of
the one sentence guarantees the truth of the other), a fact which supports the idea that
the analysis of truth conditions should be a central goal of semantics. It is less satisfying
to describe synonymy in terms of truth conditions, as identity of truth conditions doesn't
in most cases make for absolute sameness of meaning, in an intuitive sense - consider
4
I. Foundations of semantics
Mary hit John and John was hit by Mary; nevertheless, a truth conditional definition of
synonymy allows for at least a useful concept of synonymy, since people can indeed judge
whether two sentences would accurately describe the same circumstances, whereas it's
not obvious that complete intuitive synonymy is even a useful concept, insofar as it may
never occur in natural language.
The truth conditional perspective on meaning is intuitive and powerful where it
applies, but in and of itself, it is only a foundation. It doesn't, at first glance, say anything
about the meanings of subsentential constituents, the meanings or functions of non-
declarative sentences, or non-literal meaning, for example. Semantic theory is
responsible for the proper analysis of each of these features of language as well, and we will see
in many of the articles in this handbook how it has been able to rise to these challenges,
and many others.
3. Compositionality
A crucial aspect of natural language meaning is that speakers are able to determine
the truth conditions for infinitely many distinct sentences, including sentences they
have never encountered before. This shows that the truth conditions for sentences (or
whatever turns out to be their psychological correlate) cannot be memorized. Speakers
do not associate truth conditions such as the ones given in (1) or (2) holistically with
their respective sentences. Rather, there must be some principled way to compute the
meaning of a sentence from smaller units. In other words, natural language meaning is
essentially combinatorial. The meaning of a complex expression is construed by
combining the meaning of its parts in a certain way. Obviously, syntax plays a significant role
in this process. The two sentences in (6), for instance, arc made up of the same lexical
material. It is only the different word order that is responsible for the different sentence
meanings of (6a) and (6b).
(6) a. Caroline kissed a boy.
b. A boy kissed Caroline.
In a similar vein, the ambiguity of a sentence like (7) is rooted in syntax. The two
readings paraphrased in (7a) and (7b) correspond to different syntactic structures, with the
PP being adjoined either to the verbal phrase or to the direct object NP.
(7) Caroline observed the boy with the telescope.
a. Caroline observed the boy with the help of the telescope.
b. Caroline observed the boy who had a telescope.
Examples such as (6) and (7) illustrate that the semantic combinatorial machinery takes
the syntactic structure into account in a fairly direct way. This basic insight lead to the
formulation of the so-called "principle of compositionality", attributed to Gottlob Frege
(1892), which is usually formulated along the following lines:
(8) Principle of compositionality:
The meaning of a complex expression is a function of the meanings of its parts and
the way they are syntactically combined.
1. Meaning in linguistics 5
According to (8), the meaning of, e.g., Caroline sleeps is a function of the meanings of
Caroline and sleeps and the fact that the former is the syntactic subject of the latter.
There arc stronger and weaker versions of the principle of compositionality, depending
on what counts as "parts" and how exactly the semantic combinatorics is determined by
the syntax. For instance, adherents of a stronger version of the principle of
compositionality typically assume that the parts that constitute the meaning of a complex expression
are only its immediate constituents. According to this view, only the NP [Caroline] and
the VP [kissed a boy] would count as parts when computing the sentence meaning for
(6a), but not (directly) [kissed] or [a boy].
Modern semantics explores many different ways of implementing the notion of
compositionality formally. One particularly useful framework is based on the mathematical
concept of a function. It takes the meaning of any complex expression as being the result
of applying the meaning of one of its immediate parts (= the functor) to the meaning
of its other immediate part (= the argument). With functional application as the basic
semantic operation that is applied stepwise, mirroring the binary branching of syntax,
the function-argument approach allows for a straightforward syntax-semantics mapping.
Although there is wide agreement among semanticists that, given the
combinatorial nature of linguistic meaning, some version of the principle of compositionality must
certainly hold, it is also clear that, when taking into account the whole complexity and
richness of natural language meaning, compositional semantics is faced with a series of
challenges. As a response to these challenges, semanticists have come up with several
solutions and amendments.These relate basically to (A) the syntax-semantics interface, (B) the
relationship between semantics and ontology, and (C) the semantics-pragmatics interface.
A Syntax-Semantics Interface
One way to cope with challenges to compositionality is to adjust the syntax properly.
This could be done, e.g., by introducing possibly mute, i.e. phonetically empty, functional
heads into the syntactic tree that nevertheless carry semantic content, or by relating the
semantic composition to a more abstract level of syntactic derivation - Logical Form -
that may differ from surface structure due to invisible movement. That is, the syntactic
structure on which the semantic composition is based may be more or less directly linked
to surface syntax, such that it fits the demands of compositional semantics. Of course, any
such move should be independently motivated.
B Semantics - Ontology
Another direction that might be explored in order to reconcile syntax and semantics is
to reconsider the inventory of primitive semantic objects the semantic fabric is assumed
to be composed of. A famous case in point is Davidson's (1967) plea for an ontological
category of events. A crucial motivation for this move was that the standard treatment of
adverbial modifiers at that time was insufficient insofar as it failed to account properly
for the combinatorial behavior and entailments of adverbial expressions. By positing
an additional event argument introduced by the verb, Davidson laid the grounds for a
theory of adverbial modification that would overcome these shortcomings. Under this
assumption Davidson's famous sentence (9a) takes a semantic representation along the
lines of (9b):
6
I. Foundations of semantics
(9) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. 3e [ butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) & at
(e, midnight) ]
According to (9b), there was an event c of Jones buttering the toast, and this event was
located in the bathroom. In addition, it was performed by using a knife as an instrument,
and it took place at midnight.That is, Davidson's move enabled standard adverbial
modifiers to be treated as simple first-order predicates that add information about the verb's
hidden event argument. The major merits of such a Davidsonian analysis are, first, that it
accounts for the typical entailment patterns of adverbial modifiers directly on the basis
of their semantic representation.That is, the entailments in (10) follow from (9b) simply
by virtue of the logical rule of simplification.
(10) a. Jones buttered the toast in the bathroom at midnight.
b. Jones buttered the toast in the bathroom.
c. Jones buttered the toast at midnight.
d. Jones buttered the toast.
And, secondly, Davidson paved the way for treating adverbial modifiers on a par with
adnominal modifiers. In the meantime, researchers working within the Davidsonian
paradigm have discovered more and more fundamental analogies between the verbal and
the nominal domain, attesting to the fruitfulness of Davidson's move.
In short, by enriching the semantic universe with a new onto logical category of events,
Davidson solved the compositionality puzzle of adverbials and arrived at a semantic
theory superior to its competitors in both conceptual simplicity and empirical coverage.
Of course once again, such a solution does not come without costs. With Quine's (1958)
dictum "No entity without identity!" in mind, any ontological category a semantic theory
makes use of requires a proper ontological characterization and legitimization. In the
case of events, this is still the subject of ongoing debates among scmanticists.
C Semantics-Pragmatics Interface
Finally, challenges to compositionality might also be taken as an invitation to reconsider
the relationship between semantics and pragmatics by asking how far the composition of
sentential meaning goes, and what the principles of pragmatic enrichment and pragmatic
licensing are. One notorious case in point is the adequate delineation of linguistic
knowledge and world knowledge. To give an example, when considering the sentences in (11),
we know that each of them refers to a very different kind of opening event. Obviously,
the actions underlying, for instance, the opening of a can differ substantially from those
of opening one's eyes or opening a file on a computer.
(11) a. She opened the can.
b. She opened her eyes.
c. She opened the electronic file.
To a certain extent, this knowledge is of linguistic significance, as can be seen when taking
into account the combinatorial behavior of certain modifiers:
1. Meaning in linguistics 7
(11) a. She opened the can {with a knife, *abruptly, *with a double click}.
b. She opened her eyes {*with a knife, abruptly, *with a double click}.
c. She opened the electronic file {*with a knife, *abruptly,with a double click}.
A comprehensive theory of natural language meaning should therefore strive to
account for these observations. Nevertheless, incorporating this kind of world
knowledge into compositional semantics would be neither feasible nor desireable. A possible
solution for this dilemma lies in the notion of semantic underspecification. Several
proposals have been developed which take the lexical meaning that is fed into semantic
composition to be of an abstract, context neutral nature. In the case of to open in (11),
for instance, this common meaning skeleton would roughly say that some action of an
agent x on an object y causes a change of state such that y is accessible afterwards.This
would be the verb's constant meaning contribution that can be found in all sentences
in (lla-c) and which is also present,e.g.,in (lid), where we don't have such clear
intuitions about how x acted upon y, and which is therefore more liberal as to adverbial
modification.
(11) d. She opened the gift {with a knife, abruptly, with a double click}.
That is, underspecification accounts would typically take neither the type of action
performed by x nor the exact sense of accessibility of y as part of the verb's lexical meaning.
To account for this part of the meaning, compositional semantics is complemented by
a procedure of pragmatic enrichment, by which the compositionally derived meaning
skeleton is pragmatically specified according to the contextually available world
knowledge.
Semantic underspecification/pragmatic enrichment accounts provide a means for further
specifying a compositionally well-formed, underspecified meaning representation. A
different stance towards the semantics-pragmatics interface is taken by so-called "coercion"
approaches.These deal typically with the interpretation of sentences that are strictly speaking
ungrammatical but might be "rescued" in a certain way. An example is given in (12).
(12) The alarm clock stood intentionally on the table.
The sentence in (12) does not offer a regular integration for the subject-oriented adverbial
intentionally, i.e, the subject NP the alarm clock does not fulfill the adverbial's request for
an intentional subject. Hence, a compositional clash results and the sentence is
ungrammatical. Nevertheless, although deviant, there seems to be a way to rescue the sentence so
that it becomes acceptable and interpretable anyway. In the case of (12), a possible repair
strategy would be to introduce an actor, who is responsible for the fact that the alarm clock
stands on the table.This move would provide a suitable anchor for the adverbial's semantic
contribution. Thus, we understand (12) as saying that someone put the alarm clock on
purpose on the table. That is, in case of a combinatorial clash, there seems to be a certain
leeway for non-compositional adjustments of the compositionally derived meaning. The
defective part is "coerced" into the right format. The exact mechanism of coercion and its
grammatical and pragmatic licensing conditions are still poorly understood.
In current semantic research many quite different directions are being explored
with respect to the issues A-C. What version of the principle of compositionality
8
I. Foundations of semantics
ultimately turns out to be the right one and how compositional semantics interacts
with syntax, ontology, and pragmatics is, in the end, an empirical question. Yet, the
results and insights obtained so far in this endeavor are already demonstrating the
fruitfulness of reckoning with compositionality as a driving force in the constitution of
natural language meaning.
4. Context and discourse
Speakers do not use sentences in isolation, but in the context of an utterance situation
and as part of a longer discourse. The meaning of a sentence depends on the particular
circumstances of its utterances, but also on the discourse context in which it is uttered. At
the same time the meaning of linguistic expression changes the context, e.g., the
information available to speaker and hearer. The analysis of the interaction of context,
discourse and meaning provides new and challenging issues to the research agenda in the
semantics-pragmatics interface as described in the last section. In the following we focus
on two aspects of these issues to illustrate how the concept of meaning described above
can further be developed by theorizing on the interaction between sentence meaning,
contextual parameters and discourse structure.
So far wc have characterized the meaning of a sentence by its truth conditions and, as
a result, we have "considered semantics to be the study of propositions" (Stalnaker 1970:
273). It is justified by the very clear concept that meaning describes "how the world
is". However, linguistic expressions often need additional information to form
propositions as sentences contain indexical elements, such as /, you, she, here, there, now and
the tenses of verbs. Indexical expressions cannot be interpreted according to possible
worlds, i.e. how the conditions might be, but they are interpreted according to the actual
utterance situation. Intensive research into this kind of context dependency led to the
conclusion that the proposition itself depends on contextual parameters like speaker,
addressee, location, time etc.This dependency is most prominently expressed in Kaplan's
(1977) notion character for the meaning of linguistic expressions. The character of an
expression is a function from the context of utterance c, which includes the values for the
speaker, the hearer, the time, the location etc. to the proposition. Other expressions such
as local, different, a certain, enemy, neighbor may contain "hidden" indexical parameters.
They express their content dependent on one or more reference points given by the
context. Thus meaning is understood as an abstract concept or function from contexts
to propositions, and propositions themselves are described as functions from possible
worlds into truth conditions.
The meaning of a linguistic expression is influenced not only by such relatively
concrete aspects of the situation of use as speaker and addressee, but also by intentional
factors like the assumptions of the speaker and hearer about the world, their beliefs and
their goals. This type of context is continuously updated by the information provided
by each sentence in a discourse. We see that linguistic expressions are not only
"context-consumers", but also "context-shifters". This can be illustrated by examples from
anaphora, presuppositions and various discourse relations.
(13) a. A man walks in the park. He smokes,
b. #He smokes. A man walks in the park.
1. Meaning in linguistics 9
(14) a. Rebecca married Thomas. She regrets that she married him.
b Rebecca regrets that she married Thomas. 'She married him.
(15) a. John left. Ann started to cry.
b. Ann started to cry. John left.
In (13) the anaphoric pronoun needs an antecedent, in other words it is a context-
consumer as it takes the information provided in the context for fixing its meaning. The
indefinite noun a man however is a context-shifter. It changes the context by introducing
a discourse referent into the discourse or discourse structure such that the pronoun can
be linked to it. In (13a) the indefinite introduces the referent and the anaphoric pronoun
can be linked to it, in (13b) the pronoun in the first sentence has no antecedent and if the
indefinite noun phrase in the second clause should refer to the same discourse referent
it must not be indefinite. In (14) we see the contribution of presupposition to the
context. (14b) is odd, since one can regret only something that is known to have happened.
To assert this again makes the contribution of the second sentence superfluous and the
small discourse incoherent. (15) provides evidence that wc always assume some relation
between sentences above a simple conjunction of two propositions. The relation could
be a sequence of events or a causal relation between the two event, and this induces
different meanings on the two small discourses as a whole.These and many more examples
have led to the development of dynamic semantics, i.e. the view that meaning is shifting
a given information status to a new one.
There are different ways to model the context dependency of linguistic expressions
and the choice among them is still an unresolved issue and a topic of considerable
contemporary interest. We illustrate this by presenting one example from the literature.
Stalnaker proposes to represent the context as a set of possible worlds that are shared by
speaker and hearer, his "common ground". A new sentence is interpreted with respect
to the common ground, i.e. to a set of possible worlds.The interpretation of the sentence
changes the common ground (given that the hearer does not reject the content of the
sentence) and the updated common ground is the new context for the next sentence.
Kamp (1988) challenges this view as problematic, as possible worlds do not provide
enough linguistically relevant information, as the following example illustrates (due to
Barbara Partee, first discussed in Heim 1982:21).
(16) Exactly one of the ten balls is not in the bag. It is under the sofa.
(17) Exactly nine of the ten balls are in the bag. #It is under the sofa.
Both sentences in (16) and (17) have the same truth conditions, i.e. in exactly all possible
circumstances in which (16) is true (17) is true, too;still the continuation with the second
sentence is only felicitous in (16), but not in (17). (16) explicitly introduces an antecedent
in the first sentence, and the pronoun in the second sentence can be anaphorically linked
to it. In (17), however, no explicit antecedent is introduced and therefore wc cannot
resolve the anaphoric reference of the pronoun. Extensive research on these issues has
proven very fruitful for the continuous developing of our methodological tools and for
our understanding of natural language meaning in context and its function for discourse
structure.
10
I. Foundations of semantics
5. Meaning in contemporary semantics
Meaning is a notion investigated by a number of disciplines, including linguistics,
philosophy, psychology, artificial intelligence, semiotics as well as many others.The definitions of
meaning are as manifold and plentiful as the different theories and perspectives that arise
from these disciplines. We have argued here that in order to use meaning as a well-defined
object of investigation, we must perceive facts to be explained and have tests to expose the
underlying phenomena, and we must have a well-defined scientific apparatus which allows
us to describe, analyze and model these phenomena.This scientific apparatus is
contemporary semantics: It possesses a clearly defined terminology, it provides abstract
representations and it allows for formal modeling that adheres to scientific standards and renders
predictions that can be verified or falsified. We have illustrated the tenets, tools and goals of
contemporary semantics by discussing typical approaches to three central characteristics
of meaning: truth conditionality, compositionality, and context and discourse.
Recent times have witnessed an increased interest of scmanticists in developing their
theories on a broader basis of empirical evidence, taking into account crosslinguistic
data, diachronic data, psycho- and neurolinguistic studies as well as corpus linguistic and
computational linguistic resources. As a result of these efforts, contemporary semantics is
characterized by a continuous explanatory progress, an increased awareness of and
proficiency in methodological issues, and the emergence of new opportunities for
interdisciplinary cooperation. Along these lines, the articles of this handbook develop an integral,
many-faceted and yet well-rounded picture of this joint endeavour in the linguistic study
of natural language meaning.
6. References
Davidson, Donald 1967.The logical form of action sentences. In: N. Reslier (ed.). The Logic of
Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81-95. Reprinted in: D. Davidson
(ed.). Essays on Actions and Events. Oxford: Clarendon Press, 1980,105-148.
Frege, Gottlob 1892/1980. Ubcr Sinn und Bedeutung. Zeitschrift fiir Philosophic und philosophische
Kritik 100,25-50. English translation in: P. Geacli & M. Black (eds.). Translations from the
Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980,56-78.
Hciin, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation.
University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Kaplan. David 1977/1989. Demonstratives. An Essay on the Semantics, Logic, Metaphysics, and
Epistemology of Demonstratives and Other Indexicals. Ms. Los Angeles, CA, University of
California. Printed in: J. Almog & J. Perry & H. Wcttstcin (eds.). Themes from Kaplan. Oxford:
Oxford University Press, 1989,481-563.
Kamp, Hans 1988. Belief attribution and context & Comments on Stalnaker. In: R.H. Grimm & D.D.
Merrill (eds). Contents of Thought. Tucson, AZ:The University of Arizona Press, 156-181,203-206.
Quine. Willard van Orinan 1958. Speaking of Objects. Proceedings and Addresses of the American
Philosophical Association 31,5-22.
Stalnaker, Robert 1970. Pragmatics. Synthese 22,272-289.
Wolfart, H. Christoph 1973. Plains Cree:A Grammatical Study. Transactions of the American
Philosophical Society, New Series, vol. 63, pt. 5. Philadelphia, PA:Tlie American Philosophical Society.
Claudia Maienborn, Tubingen (Germany)
Klaus von Heusinger, Stuttgart (Germany)
Paul Portner, Washington, DC (USA)
2. Meaning, intentionality and communication U_
2. Meaning, intentionality and communication
1. Introduction
2. Intentionality: Brcntano's legacy
3. Early pragmatics: ordinary language philosophy and speech act theory
4. Gricc on speaker's meaning and implicaturcs
5. The emergence of truth-conditional pragmatics
6. Concluding remarks: pragmatics and cognitive science
7. References
Abstract
This article probes the connections between the metaphysics of meaning and the
investigation of human communication. It first argues that contemporary philosophy of mind has
inherited most of its metaphysical questions from Brentano's puzzling definition of
intentionality. Then it examines how intentionality came to occupy the forefront of pragmatics
in three steps. (1) By investigating speech acts, Austin and ordinary language philosophers
pioneered the study of intentional actions performed by uttering sentences of natural
languages. (2) Based on his novel concept of speaker's meaning and his inferential view of
human communication as a cooperative and rational activity, Grice developed a three-
tiered model of the meaning of utterances: (i) the linguistic meaning of the uttered sentence;
(ii) the explicit truth-conditional content of the utterance; (Hi) the implicit content conveyed
by the utterance. (3) Finally, the new emerging truth-conditional trend in pragmatics urges
that not only the implicit content conveyed by an utterance but its explicit content as well
depends on the speaker's communicative intention.
1. Introduction
This article lies at the interface between the scientific investigation of human verbal
communication and metaphysical questions about the nature of meaning. Words and
sentences of natural languages have meaning (or semantic properties) and they are used
by humans in tasks of verbal communication. Much of twentieth-century philosophy of
mind has been concerned with metaphysical questions raised by the perplexing nature
of meaning. For example, what is it about the meaning of the English word "dog" that
enables a particular token used in the USA in 2008 to latch onto hairy barking creatures
that lived in Egypt four thousand years earlier (cf. Horwich 2005)?
Meanwhile, the study of human communication in the twentieth century can be seen
as a competition between two models, which Sperber & Wilson (1986) call the "code
model" and the "inferential model." A decoding process maps a signal onto a message
associated to the signal by an underlying code (i.e., a system of rules or conventions).
An inferential process maps premises onto a conclusion, which is warranted by the
premises. When an addressee understands a speaker's utterance, how much of the content
of the utterance has been coded into, and can be decoded from, the linguistic meaning
of the utterance? How much content does the addressee retrieve by his ability to infer
the speaker's communicative intention? These are the basic scientific questions in the
investigation of human verbal communication.
Maienborn, von Heusingerand Portner (eds.) 2011, Semantics (HSK33.1), de Gruyter, 11-25
12
I. Foundations of semantics
Much philosophy of mind in the twentieth century devoted to the metaphysics of
meaning sprang from Brentano's puzzling definition of the medieval word "intention-
ality" (section 2). Austin, one of the leading ordinary language philosophers, emphasized
the fact that by uttering sentences of some natural language, a speaker may perform
an action, i.e., a speech act (section 3). But he espoused a social conventionalist view of
speech acts, which later pragmatics rejected in favor of an inferential approach. Grice
instead developed an inferential model of verbal communication based on his concept of
speaker's meaning and his view that communication is a cooperative and rational activity
(section 4). However, many of Grice's insights have been further developed into a non-
Gricean truth-conditional pragmatics (section 5). Finally, the "relevance-theoretic"
approach pioneered by Spcrbcr & Wilson (1986) fills part of the gap between the study of
meaning and the cognitive sciences (section 6).
2. Intentionality: Brentano's legacy
Brentano (1874) made a twofold contribution to the philosophy of mind: he provided a
puzzling definition of intentionality and he put forward the thesis that intentionality is
"the mark of the mental." Intentionality is the power of minds to be about things,
properties, events and states of affairs. As the meaning of its Latin root (tendere) indicates,
"intentionality" denotes the mental tension whereby the human mind aims at so-called
"intentional objects."
The concept of intentionality should not be confused with the concept of intention.
Intentions are special psychological states involved in the planning and execution
of actions. But on Brentano's view, intentionality is a property of all psychological
phenomena. Nor should "intentional" and "intentionality" be confused with the
predicates "intensional" and "intensionality," which mean "non-extensional" and "non-
extensionality": they refer to logical features of sentences and utterances, some of which
may describe (or report) an individual's psychological states. "Creature with a heart"
and "creature with a kidney" have the same extension: all creatures with a heart have
a kidney and conversely (cf. Quinc 1948). But they have different intensions because
having a heart and having a kidney are different properties. This distinction mirrors
Frege's (1892) distinction between sense and reference (cf. article 3 (Textor) Sense
and reference and article 4 (Abbott) Reference). In general, a linguistic context is non-
extensional (or intensional) if it fails to license both the substitution of coreferential
terms salva veritate and the application of the rule of existential generalization.
As Brentano defined it, intentionality is what enables a psychological state or act to
represent a state of affairs, or be directed upon what he called an "intentional object."
Intentional objects exemplify the property which Brentano called "intentional inexis-
tence" or "immanent objectivity," by which he meant that the mind may aim at targets
that do not exist in space and time or represent states of affairs that fail to obtain or even
be possible. For example, unicorns do not exist in space and time and round squares are
not possible geometrical objects. Nonetheless thinking about either a unicorn or a round
square is not thinking about nothing. To admire Sherlock Holmes or to love Anna Kar-
enina is to admire or love something, i.e., some intentional object.Thus, Brentano's
characterization of intentionality gave rise to a gap in twentieth-century philosophical logic
between intentional-objects theorists (Meinong 1904; Parsons 1980; Zalta 1988), who
claimed that there must be things that do not exist, and their opponents (Russell 1905;
2. Meaning, intentionality and communication 13
Quine 1948), who denied it and rejected the distinction between being and existence.
(For further discussion, cf. Jacob 2003.)
Brentano (1874) also held the thesis that intentionality is constitutive of the mental: all
and only psychological phenomena exhibit intentionality. Brentano's second thesis that
only psychological (or mental) phenomena possess intentionality led him to embrace
a version of the Cartesian ontological dualism between mental and physical things.
Chisholm (1957) offered a linguistic version of Brentano's second thesis, according to
which the intensionality of a linguistic report is a criterion of the intentionality of the
reported psychological state (cf. Jacob 2003). He further argued that the contents of
sentences describing an agent's psychological states cannot be successfully paraphrased
into the bchaviorist idiom of sentences describing the agent's bodily movements and
behavior.
Quine (1960) accepted Chisholm's (1957) linguistic version of Brentano's second
thesis which he used as a premise for an influential dilemma: if the intentional idiom is
not reducible to the behaviorist idiom, then the intentional idiom cannot be part of the
vocabulary of the natural sciences and intentionality cannot be "naturalized." Quine's
dilemma was that one must choose between a physicalist ontology and intentional
realism, i.e., the view that intentionality is a real phenomenon. Unlike Brentano, Quine
endorsed physicalism and rejected intentional realism.
Some of the physicalists who accept Quine's dilemma (e.g., Churchland 1989) have
embraced climinativc materialism and denied the reality of beliefs and desires.The short
answer to this proposal is that it is difficult to make sense of the belief that there are
no beliefs. Others (such as Dennett 1987) have taken the "instrumentalist" view that,
although the intentional idiom is a useful stance for predicting a complex physical
system's behavior, it lacks an explanatory value. But the question arises how the intentional
idiom could make useful predictions if it fails to describe and explain anything (cf. Jacob
1997,2003 and Rcy 1997).
As a result of the difficulties inherent to both eliminative materialism and
interpretive instrumentalism,several physicalists have chosen to deny Brentano's thesis that only
non-physical things exhibit intentionality, and to challenge Quine's dilemma according
to which intentional realism is not compatible with physicalism.Their project is to
"naturalize" intentionality and account for the puzzling features of intentionality (e.g., the fact
that the mind may aim at non-existing objects and represent non-actual states of affairs),
using only concepts recognizable by natural scientists (cf. section 3 on Gricc's notion of
non-natural meaning).
In recent philosophy of mind, the most influential proposals for naturalizing
intentionality have been versions of the so-called "teleosemantic" approach championed by
Millikan (1984), Dretske (1995) and others, which is based on the notion of biological
function (or purpose).Teleosemantic theories are so-called because they posit an
underlying connection between teleology (design or function) and content (or intentionality):
a representational device is endowed with a function (or purpose). Something whose
function is to indicate the presence of some property may fail to fulfill its function. If and
when it does, then it may generate a false representation or represent something that
fails to exist.
Brentano's thesis that only mental phenomena exhibit intentionality seems also open
to the challenge that expressions of natural languages, which are not mental things, have
intentionality in virtue of which they too can represent things, properties, events and
14
I. Foundations of semantics
states of affairs. In response, many philosophers of mind, such as Grice (1957, 1968),
Fodor (1987), Haugeland (1981) and Searle (1983,1992), have endorsed the distinction
between the underived intentionality of a speaker's psychological states and the derived
intentionality (i.e., the conventional meaning) of the sentences by the utterance of which
she expresses her mental states. On their view, sentences of natural languages would
lack meaning unless humans used them for some purpose. (But for dissent, see Dennett
1987.)
Some philosophers go one step further and posit the existence of an internal
"language of thought:" thinking, having a thought or a propositional attitude is to entertain
a token of a mental formula realized in one's brain. On this view, like sentences of
natural languages, mental sentences possess syntactic and semantic properties. But, unlike
sentences of natural languages, they lack phonological properties. Thus, the semantic
properties of a complex mental sentence systematically depend upon the meanings
of its constituents and their syntactic combination. The strongest arguments for the
existence of a language of thought are based on the productivity and systematicity
of thoughts, i.e., the facts that there is no upper limit on the complexity of thoughts
and that a creature able to form certain thoughts must be able to form other related
thoughts. On this view, the intentionality of an individual's thoughts and propositional
attitudes derives from the meanings of symbols in the language of thought (cf. Fodor
1975,1987).
3. Early pragmatics: ordinary language philosophy
and speech act theory
Unlike sentences of natural languages, utterances are created by speakers, at particular
places and times, for various purposes, including verbal communication. Not all
communication, however, need be verbal. Nor do people use language solely for the
purpose of communication; one can use language for clarifying one's thoughts, reasoning
and making calculations. Utterances, not sentences, can be shouted in a hoarse voice
and tape-recorded. Similarly, the full meaning of an utterance goes beyond the linguistic
meaning of the uttered sentence, in two distinct aspects: both its representational
content and its so-called "illocutionary force" (i.e., whether the utterance is meant as a
prediction, a threat or an assertion) are underdetermined by the linguistic meaning of the
uttered sentence.
Prior to the cognitive revolution of the 1950's,the philosophy of language was divided
into two opposing approaches: so-called "ideal language" philosophy (in the tradition
of Frege, Russell, Carnap and Tarski) and so-called "ordinary language" philosophy
(in the tradition of Wittgenstein, Austin, Strawson and later Searle). The word
"pragmatics," which derives from the Greek word praxis (which means action), was first
introduced by ideal language philosophers as part of a threefold distinction between syntax,
semantics and pragmatics (cf. Morris 1938 and Carnap 1942). Syntax was defined as the
study of internal relations among symbols of a language. Semantics was defined as the
study of the relations between symbols and their denotations (or designata). Pragmatics
was defined as the study of the relations between symbols and their users (cf. article 88
(Jaszczolt) Semantics and pragmatics).
Ideal language philosophers were interested in the semantic structures of sentences
of formal languages designed for capturing mathematical truths.The syntactic structure
2. Meaning, intentionality and communication 1_5
of any "well-formed formula" (i.e., sentence) of a formal language is defined by arbitrary
rules of formation and derivation. Semantic values are assigned to simple symbols of
the language by stipulation and the truth-conditions of a sentence can be mechanically
determined from the semantic values of its constituents by the syntactic rules of
composition. From the perspective of ideal language philosophers, such features of natural
languages as their context-dependency appeared as a defect. For example, unlike formal
languages, natural languages contain indexical expressions (e.g., "now", "here" or "I")
whose references can change with the context of utterance.
By contrast, ordinary language philosophers were concerned with the distinctive
features of the meanings of expressions of natural languages and the variety of their
uses. In sharp opposition to ideal language philosophers,ordinary language philosophers
stressed two main points, which paved the way for later work in pragmatics. First, they
emphasized the context-dependency of the descriptive content expressed by utterances
of sentences of natural languages (see section 4). Austin (1962a: 110-111) denied that a
sentence as such could ever be ascribed truth-conditions and a truth-value:"the question
of truth and falsehood does not turn only on what a sentence is, nor yet on what it means,
but on, speaking very broadly, the circumstances in which it is uttered." Secondly, they
criticized what Austin (1962b) called the "descriptive fallacy," according to which the
sole point of using language is to state facts or describe the world (cf. article 5 (Green)
Meaning in language use).
As indicated by the title of Austin's (1962b) book, How to Do Things with Words,
they argued that by uttering sentences of some natural language, a speaker performs
an action, i.e., a speech act: she performs an "illocutionary act" with a particular illocu-
tionary force. A speaker may give an order, ask a question, make a threat, a promise, an
entreaty, an apology, an assertion and so on. Austin (1962b) sketched a new framework
for the description and classification of speech acts. As Green (2007) notes, speech acts
arc not to be confused with acts of speech: "one can perform an act of speech, say by
uttering words in order to test a microphone, without performing a speech act."
Conversely, one can issue a warning without saying anything, by producing a gesture or a
"minatory facial expression."
Austin (1962b) identified three distinct levels of action in the performance of a
speech act: the "locutionary act," the "illocutionary act," and the "perlocutionary act,"
which stand to one another in the following hierarchical structure. By uttering a
sentence, a speaker performs the locutionary act of saying something by virtue of which
she performs an illocutionary act with a given illocutionary force (e.g., giving an order).
Finally, by performing an illocutionary act endowed with a specific illocutionary force,
the speaker performs a perlocutionary act, whereby she achieves some psychological or
behavioral effect upon her audience, such as frightening him or convincing him.
Before he considered this threefold distinction within the structure of speech acts,
Austin had made a distinction between so-called "constative" and "performative"
utterances. The former is supposed to describe some state of affairs and is true or false
according to whether the described state of affairs obtains or not. Instead of being a
(true or false) description of some independent state of affairs, the latter is supposed to
constitute (or create) a state of affairs of its own. Clearly, the utterance of a sentence in
either the imperative mood ("Leave this room immediately!") or the interrogative mood
("What time is it right now?") is performative in this sense: far from purporting to
register any pre-existing state of affairs, the speaker either gives an order or asks a question.
16
I. Foundations of semantics
By drawing the distinction between constative and performative utterances, Austin was
able to criticize the descriptive fallacy and emphasize the fact that many utterances of
declarative sentences are performative (not constative) utterances.
In particular, Austin was interested in explicit performative utterances ("I promise I'll
come," "I order you to leave" or "I apologize"), which include a main verb that denotes
the very speech act that the utterance performs. Austin's attention was drawn towards
explicit performatives, whose performance is governed, not merely by linguistic rules,
but also by social conventions and by what Searle (1969: 51) called "institutional facts"
(as in "I thereby pronounce you husband and wife"), i.e., facts that (unlike "brute facts")
presuppose the existence of human institutions. Specific bodily movements count as a
move in a game, as an act of e.g., betting, or as part of a marriage ceremony only if they
conform to some conventions that are part of some social institutions. For a performative
speech act to count as an act of baptism, of marriage, or an oath, the utterance must meet
some social constraints, which Austin calls "felicity" conditions. Purported speech acts of
baptism, oath or marriage can fail some of their felicity conditions and thereby "misfire"
if either the speaker lacks the proper authority or the addressee fails to respond with an
appropriate uptake - in response to e.g., an attempted bet sincerely made by the speaker.
If a speaker makes an insincere promise, then he is guilty of an "abuse."
Austin came to abandon his former distinction between constative and performative
utterances when he came to realize that some explicit performatives can be used to make
true or false assertions or predictions. One can make an explicit promise or an explicit
request by uttering a sentence prefixed by either "I promise" or "I request." One can
also make an assertion or a prediction by uttering a sentence prefixed by either "I assert"
or "I predict." Furthermore, two of his assumptions led Austin to embrace a social
conventionalist view of illocutionary acts. First, Austin took explicit performatives as a
general model for illocutionary acts. Secondly, he took explicit performatives, whose felicity
conditions include the satisfaction of social conventions, as a paradigm of all explicit
performatives. Thus Austin (1962b: 103) was led to embrace a social conventionalist
view of illocutionary acts according to which the illocutionary force of a speech act is
"conventional in the sense that it could be made explicit by the performative formula."
Austin's social conventionalist view of illocutionary force was challenged by Strawson
(1964:153-154) who pointed out that the assumption that no illocutionary act could be
performed unless it conformed to some social convention would be "like supposing that
there could not be love affairs which did not proceed on lines laid down in the Roman de
la Rose." Instead, Strawson argued, what confers to a speech act its illocutionary force is
that the speaker intends it to be so taken by her audience. By uttering "You will leave,"
the speaker may make a prediction, a bet or order the addressee to leave. Only the
context, not some socially established convention, may help the audience determine the
particular illocutionary force of the utterance.
Also, as noted by Searle (1975) and by Bach & Harnish (1979), speech acts may be
performed indirectly. For example, by uttering "I would like you to leave," a speaker
directly expresses her desire that her addressee leave. But in so doing, she may indirectly
ask or request her addressee to do so. By uttering "Can you pass the salt?" - which is a
direct question about her addressee's ability -, the speaker may indirectly request him to
pass the salt. As Recanati (1987:92-93) argues, when a speaker utters an explicit
performative such as "I order you to leave," her utterance has the direct illocutionary force of a
statement. But it may also have the indirect force of an order.There need be no socially
2. Meaning, intentionality and communication 17
established convention whereby a speaker orders her audience to leave by means of an
utterance with a verb that denotes the act performed by the speaker.
4. Grice on speaker's meaning and implicatures
In his 1957 seminal paper, Grice did three things: he drew a contrast between "natural"
and "non-natural" meaning; he offered a definition of the novel concept of speaker's
meaning; and he sketched a framework within which human communication is seen
as a cooperative and rational activity (the addressee's task being to infer the speaker's
meaning on the basis of her utterance, in accordance with a few principles of rational
cooperation). In so doing, Grice took a major step towards an "inferential model" of
human communication, and away from the "code model" (cf. article 88 (Jaszczolt)
Semantics and pragmatics).
As Grice (1957) emphasized, smoke is a natural sign of fire: the former naturally
means the latter in the sense that not unless there was a fire would there be any smoke.
By contrast, the English word "fire" (or the French word "feu") non-naturally means
fire: if a person erroneously believes that there is a fire (or wants to intentionally mislead
another into wrongly thinking that there is a fire) when there is none, then she can
produce a token of the word "fire" in the absence of a fire. (Thus, the notion of non-natural
meaning is Grice's counterpart of Brcntano's intentionality.)
Grice (1957,1968,1969) further introduced the concept of speaker's meaning, i.e., of
someone meaning something by exhibiting some piece of behavior that can, but need
not, be verbal. For a speaker 5 to mean something by producing some utterance x is for
5 to intend the utterance of x to produce some effect (or response r) in an audience A by
means of A's recognition of this very intention. Hence, the speaker's meaning is a
communicative intention, with the peculiar feature of being reflexive in the sense that part of
its content is that an audience recognize it.
Strawson (1964) turned to Grice's concept of speaker's meaning as an intention-
alist alternative to Austin's social conventional account of illocutionary acts (section 2).
Strawson (1964) also pointed out that Grice's complex analysis of speaker's meaning or
communicative intention requires the distinction between three complementary levels
of intention. For 5 to mean something by an utterance x is for 5 to intend:
(i) 5's utterance of x to produce a response r in audience A;
(ii) A to recognize 5's intention (i);
(iii) y4's recognition of 5's intention (i) to function at least as part of A's reason for A's
response r.
This analysis raises two opposite problems: it is both overly restrictive and insufficiently
so. First, as Strawson's reformulation shows, Grice's condition (i) corresponds to 5's
intention to perform what Austin (1962b) called a perlocutionary act. But for 5 to
successfully communicate with A, it is not necessary that 5's intention to perform her
perlocutionary act be fulfilled (cf. Searle 1969:46-48). Suppose that 5 utters: "It is raining,"
intending (i) to produce in A the belief that it is raining./! may recognize 5's intention
(i); but, for some reason, A may mistrust 5 and fail to acquire the belief that it is raining.
5 would have failed to convince A (that it is raining), but 5 would nonetheless have
successfully communicated what she meant to /t.Thus, fulfillment of 5's intention (i) is not
18
I. Foundations of semantics
necessary for successful communication. Nor is the fulfillment of 5's intention (iii), which
presupposes the fulfillment of 5's intention (i). All that is required for 5 to communicate
what she meant to A is /t's recognition of 5's intention (ii) that 5 has the higher-order
intention to inform A of her first-order intention to inform A of something.
Secondly, Strawson (1964) pointed out that his reformulation of Grice's definition of
speaker's meaning is insufficiently restrictive. Following Sperber & Wilson (1986: 30),
suppose that 5 intends A to believe that she needs his help to fix her hair-dryer, but she
is reluctant to ask him openly to do so. 5 ostensively offers A evidence that she is trying
to fix her hair-dryer, thereby intending A to believe that she needs his help. 5 intends A
to recognize her intention to inform him that she needs his help. However, 5 does not
want A to know that she knows that he is watching her. Since 5 is not openly asking A
to help her, she is not communicating with A. Although 5 has the second-order intention
that A recognizes her first-order intention to inform him that she needs his help, she does
not want A to recognize her second-order intention. To deal with such a case, Strawson
(1964) suggested that the analysis of Grice's speaker's meaning include 5's third-order
intention to have her second-order intention recognized by her audience. But as Schiffer
(1972) pointed out, this opens the way to an infinity of higher-order intentions. Instead,
Schiffer (1972) argued that for 5 to have a communicative intention, 5's intention to
inform A must be mutually known to 5 and A. But as pointed out by Sperber & Wilson
(1986:18-19), people who share mutual knowledge know that they do. So the question
arises: how do speaker and hearer know that they do? (We shall come back to this issue
in the concluding remarks.)
Grice (1968) thought of his concept of speaker's meaning as a basis for a reductive
analysis of semantic notions such as sentence- or word-meaning. But most linguists
and philosophers have expressed skepticism about this aspect of Grice's program (cf.
Chomsky 1975,1980). By contrast, many assume that some amended version of Grice's
concept of speaker's meaning can serve as a basis for an inferential model of human
communication. In his 1967 William James Lectures, Grice argued that what enables the
hearer to infer the speaker's meaning on the basis of her utterance is that he rationally
expects all utterances to meet the "Cooperative Principle" and a set of nine maxims
or norms organized into four main categories which, by reference to Kant, he labeled
maxims of Quantity (informativeness), Quality (truthfulness), Relation (relevance) and
Manner (clarity).
As ordinary language philosophers emphasized, in addition to what is being said by
an assertion - what makes the assertion true or false - , the very performance of an
illocutionary act with the force of an assertion has pragmatic implications. For example,
consider Moore's paradox: by uttering "It is raining but I do not believe it," the speaker
is not expressing a logical contradiction, as there is no logical contradiction between the
fact that it is raining and the fact that the speaker fails to believe it. Nonetheless, the
utterance is pragmatically paradoxical because by asserting that it is raining, the speaker
thereby expresses (or displays) her belief that it is raining, but her utterance explicitly
denies that she believes it.
Grice's (1967/1975) third main contribution to an inferential model of communication
was his concept of conversational implicature, which he introduced as "a term of art" (cf.
Grice 1989:24). Suppose that Bill asks Jill whether she is going out and Jill replies: "It's
raining." For Jill's utterance about the weather to constitute a response to Bill's question,
2. Meaning, intentionality and communication 1^
additional assumptions are required, such as, for example, that Jill does not like rain (i.e.,
that if it is raining, then Jill is not going out) which, together with Jill's response, may
entail that she is not going out.
Grice's approach to communication, based on the Cooperative Principle and the
maxims, offers a framework for explaining how, from Jill's utterance, Bill can retrieve
an implicit answer to his question by supplying some additional assumptions. Bill
must be aware that Jill's utterance is not a direct answer to his question. Assuming
that Jill does not violate (or "flout") the maxim of relevance, she must have intended
Bill to supply the assumption that e.g., she does not enjoy rain, and to infer that she
is not going out from her explicit utterance. Grice (1967/1975) called the additional
assumption and the conclusion "conversational" implicaturcs. In other words, Grice's
conversational implicaturcs enable a hearer to reconcile a speaker's utterance with
his assumption that the speaker conforms to the Principle of Cooperation. Grice
(1989:31) insisted that "the presence of a conversational implicature must be capable
of being worked out; for even if it can in fact be intuitively grasped, unless the
intuition is replaceable by an argument, the implicature (if present at all) will not count
as a conversational implicature." (Instead, it would count as a so-called
"conventional" implicature, i.e., a conventional aspect of meaning that makes no contribution
to the truth-conditions of the utterance.) Grice further distinguished "generalized"
conversational implicaturcs, which are generated so to speak "by default," from
"particularized" conversational implicaturcs, whose generation depends on special
features of the context of utterance.
Grice's application of his cooperative framework to human communication and his
elaboration of the concept of (generalized) conversational implicature were motivated
by his concern to block certain moves made by ordinary language philosophers. One
such move was exemplified by Strawson's (1952) claim that, unlike the truth-functional
conjunction of propositional calculus, the English word "and" makes different
contributions to the full meanings of the utterances of pairs of conjoined sentences. For example,
by uttering "John took off his boots and got into bed" the speaker may mean that the
event first described took place first.
In response,Grice (1967/1975,1981) argued that,in accordance with the truth-table of
the logical conjunction of propositional calculus, the utterance of any pair of sentences
conjoined by "and" is true if and only if both conjuncts are true and false otherwise. He
took the view that the temporal ordering of the sequence of events described by such an
utterance need not be part of the semantic content (or truth-conditions) of the utterance.
Instead, it arises as a conversational implicature retrieved by the hearer through an
inferential process guided by his expectation that the speaker is following the Cooperative
Principle and the maxims, e.g., the sub-maxim of orderliness (one of the sub-maxims of
the maxim of Manner), according to which there is some reason why the speaker chose
to utter the first conjunct first.
Also, under the influence of Wittgenstein, some ordinary language philosophers
claimed that unless there are reasons to doubt whether some thing is really red,
it is illegitimate to say "It looks red to me" (as opposed to "It is red"). In response,
Grice (1967/1975) argued that whether an utterance is true or false is one thing;
whether it is odd or misleading is another (cf. Carston 2002a: 103; but see Travis 1991
for dissent).
20
I. Foundations of semantics
5. The emergence of truth-conditional pragmatics
Grice's seminal work made it clear that verbal communication involves three layers
of meaning: (i) the linguistic (conventional) meaning of the sentence uttered, (ii) the
explicit content expressed (i.e.,"what is said") by the utterance, and (iii) the implicit
content of the utterance (its conversational implicatures). Work in speech act theory further
suggests that each layer of meaning also exhibits a descriptive dimension (e.g., the truth
conditions of an utterance) and a pragmatic dimension (e.g., the fact that a speech act is
an assertion). Restricting itself to the descriptive dimension of meaning, the rest of this
section discusses the emergence of a new truth-conditional pragmatic approach, whose
core thesis is that what is said (not just the conversational implicatures of an utterance)
depends on the speaker's meaning. By further extending the inferentialist model of
communication, this pragmatic approach to what is said contravenes two deeply entrenched
principles in the philosophy of language: literalism and minimalism.
Ideal language philosophers thought of indcxicality and other context-sensitive
phenomena as defective features of natural languages. Quine (1960: 193) introduced
the concept of an eternal sentence as one devoid of any context-sensitive or
ambiguous constituent so that its "truth-value stays fixed through time and from speaker to
speaker." An instance of an eternal sentence might be: "Three plus two equals five."
Following Quine, many philosophers (see e.g., Katz 1981) subsequently accepted
literalism, i.e., the view that for any statement made in some natural language, using a
context-sensitive sentence in a given context, there is some eternal sentence in the
same language that can be used to make the same statement in any context. Few
linguists and philosophers nowadays subscribe to literalism because they recognize that
indexicality is an ineliminable feature of natural languages. However, many subscribe
to minimalism.
Gricc urged an inferential model of the pragmatic process whereby a hearer infers
the conversational implicatures of an utterance from what is said. But he embraced the
minimalist view that what is said departs from the linguistic meaning of the uttered
sentence only as is necessary for the utterance to be truth-evaluable (cf. Grice 1989:
25). If a sentence contains an ambiguous phrase (e.g., "He is in the grip of a vice"),
then it must be disambiguated. If it contains an indexical, then it cannot be assigned
its proper semantic value except by relying on contextual information. But according
to minimalism, appeal to contextual information is always mandated by some
linguistic constituent (e.g., an indexical) within the sentence. In order to determine what
is said by the utterance of a sentence containing e.g., the indexical pronoun "I," the
hearer relics on the rule according to which any token of "I" refers to the speaker who
used that token. As Stanley (2000: 391) puts it, "all truth-conditional effects of extra-
linguistic context can be traced to logical form" (i.e., the semantic information that is
grammatically encoded).
Unlike the reference of a pure indexical like "I," however, the reference of a
demonstratively used pronoun (e.g., "he") can only be determined by representing the
speaker's meaning, not by a semantic rule. So does understanding the semantic value of "here"
or "now." A person may use a token of "here" to refer to a room, a street, a city, a country,
the Earth, and so forth. Similarly, a person may use a token of "now" to refer to a
millisecond, an hour, a day, a year, a century, and so forth. One cannot determine the semantic
value of a token of either "here" or "now" without representing the speaker's meaning.
2. Meaning, intentionality and communication 2\_
According to truth-conditional pragmatics, what is said by an utterance is determined
by pragmatic processes, which are not necessarily triggered by some syntactic
constituent of the uttered sentence (e.g., an indexical). By contrast, minimalists reject truth-
conditional pragmatics and postulate, in the logical form of the sentence uttered, the
existence of hidden variables whose semantic values must be contcxtually determined
for the utterance to be truth-evaluable (see the controversy between Stanley 2000 and
Recanati 2004 over whether the logical form of an utterance of "It's raining" contains a
free variable for locations).
The rise of truth-conditional pragmatics may be interpreted (cf. Travis 1991) as
vindicating the view that an utterance's truth-conditions depend on what Searle (1978,
1983) calls "the Background," i.e., a network of practices and unarticulatcd assumptions
(but see Stalnaker 1999 and cf. article 38 (Dekker) Dynamic semantics for a semantic
approach). Although the verb "to cut" is unambiguous, what counts as cutting grass
differs from what counts as cutting a cake. Only against alternative background
assumptions will one be able to discriminate the truth-conditions of "John cut the grass" and of
"John cut the cake." However, advocates of minimalism argue that if, instead of using
a lawn mower, John took out his pocket-knife and cut each blade lengthwise, then by
uttering "John cut the grass" the speaker would speak the truth (cf. Cappelen & Lepore
2005).
Three pragmatic processes involved in determining what is said by an utterance
have been particularly investigated by advocates of truth-conditional pragmatics: free
enrichment, loosening and transfer.
5.1. Free enrichment
Grice (1967/1975,1981) offered a pragmatic account according to which the temporal or
causal ordering between the events described by the utterance of a conjunction is
conveyed as a conversational implicaturc. But consider Carston's (1988) example:"Bob gave
Mary his key and she opened the door." Carston (1988) argues that part of what is said is
that "she" refers to whoever "Mary" refers to and that Mary opened the door with the key
Bob gave her.If so, then the fact that Bob gave Mary his key before M ary opened the door is
also part of what is said. Following Sperber & Wilson (1986:189), suppose a speaker utters
"I have had breakfast," as an indirect way of declining an offer of food. By minimalist
standards, what the speaker said was that she has had breakfast at least once in her life
prior to her utterance. According to Grice, the hearer must be able to infer a
conversational implicaturc from what the speaker said. However, the hearer could not conclude
that the speaker does not wish any food from the truism that she has had breakfast at
least once in her life before her utterance. Instead, for the hearer to infer that the speaker
does not wish to have food in response to his question, what the speaker must have said
is that she has had breakfast just prior to the time of utterance.
5.2. Loosening
Cases of free enrichment are instances of strengthening the concept linguistically encoded
by the meaning of the sentence - for example, strengthening of the concept encoded
by "the key" into the concept expressible by "the key Bob gave to Mary". However,
22
I. Foundations of semantics
not all pragmatic processes underlying the generation of what is said from the linguistic
meaning of the sentence are processes of conceptual strengthening or narrowing. Some
are processes of conceptual loosening or broadening. For example, imagine a speaker's
utterance in a restaurant of "My steak is raw" whereby what she says is not that her steak
is literally uncooked but rather that it is undercooked.
5.3. Transfer
Strengthening and loosening are cases of modification of a concept linguistically encoded
by the meaning of a word. Transfer is a process whereby a concept encoded by the
meaning of a word is mapped onto a related but different concept.Transfer is illustrated
by examples from Nunberg (1979, 1995): "The ham sandwich left without paying" and
"I am parked out back." In the first example, the property expressed by the predicate
"left without paying" is being ascribed to the person who ordered the sandwich, not
to the sandwich itself. In the second example, the property of being parked out back is
ascribed not to the speaker, but to the car that stands in the ownership relation to her.
The gist of truth-conditional pragmatics is that speaker's meaning is involved in
determining both the conversational implicatures of an utterance and what is said. As the
following example shows, however, it is not always easy to decide whether a particular
assumption is a conversational implicature of an utterance or part of what is said.
Consider "The picnic was awful.The beer was warm." For the second sentence to offer a
justification (or explanation) of the truth expressed by the first, the assumption must be made
that the beer was part of the picnic. According to Carston (2002b), the assumption that
the beer was part of the picnic is a conversational implicature (an implicated premise)
of the utterance. According to Recanati (2004), the concept linguistically encoded by
"the beer" is strengthened into the concept expressible by "the beer that was part of the
picnic" and part of what is said.
6. Concluding remarks: pragmatics and cognitive science
Sperber & Wilson's (1986) relevance-theoretic approach squarely belongs to truth-
conditional pragmatics: it makes three contributions towards bridging the gap between
pragmatics and the cognitive sciences. First, it offers a novel account of speaker's
meaning. As Schiffer (1972) pointed out, not unless S's intention to inform A is
mutually known to S and A could S's intention count as a genuine communicative intention
(cf. section 3). But how could S and A know that they mutually know S's intention to
inform A of something? Sperber & Wilson (1986) argue that they cannot and urge that
the mutual knowledge requirement be replaced by the idea of mutual manifestness. An
assumption is manifest to S at t if and only if S is capable of representing and accepting it
as true at t. A speaker's informative intention is an intention to make (more) manifest to
an audience a set of assumptions {I}. A speaker's communicative intention is her
intention to make it mutually manifest that she has the above informative intention. Hence, a
communicative intention is a second-order informative intention.
Secondly, relevance theory is so-called because Sperber & Wilson (1986) accept a
Cognitive principle of relevance according to which human cognition is geared towards
the maximization of relevance. Relevance is a property of an input for an individual at
t: it depends on both the set of contextual effects and the cost of processing, where the
2. Meaning, intentionality and communication 23
contextual effect of an input might be the set of assumptions derivable from processing
the input in a given context. Other things being equal, the greater the set of
contextual effects achieved by processing an input, the more relevant the input. The greater
the effort required by the processing, the lower the relevance of the input. They further
accept a Communicative principle of relevance according to which every ostensively
produced stimulus conveys a presumption of its own relevance: an ostensive stimulus
is optimally relevant if and only if it is relevant enough to be worth the audience's
processing effort and it is the most relevant stimulus compatible with the communicator's
abilities and preferences.
Finally, the relevance-theoretic approach squarely anchors pragmatics into what
cognitive psychologists call "third-person mindrcading," i.e., the ability to represent others'
psychological states (cf. Leslie 2000). In particular, it emphasizes the specificity of the
task of representing an agent's communicative intention underlying her
(communicative) ostensive behavior. The observer of some non-ostensive intentional behavior (e.g.,
hunting) can plausibly ascribe an intention to the agent on the basis of the desirable
outcome of the latter's behavior, which can be identified (e.g., hit his target), whether
or not the behavior is successful. However, the desirable outcome of a piece of
communicative behavior (i.e., the addressee's recognition of the agent's communicative
intention) cannot be identified unless the communicative behavior succeeds (cf. Sperber 2000;
Origgi & Sperber 2000 and Wilson & Sperber 2002).Thus, the development of pragmatics
takes us from the metaphysical issues about meaning and intentionality inherited from
Brentano to the cognitive scientific investigation of the human mindrcading capacity to
metarepresent others' mental representations.
Thanks to Neftali Villanueva Fernandez, Paul Horwich and the editors for comments on
this article.
7. References
Austin. John L. 1962a. Sense and Sensibilia. Oxford: Oxford University Press.
Austin, John L. 1962b. How to Do Things with Words. Oxford: Clarendon Press.
Bach, Kent & Robert M. Harnish 1979. Linguistic Communication and Speech Acts. Cambridge,
MA:The MIT Press.
Brentano, Frantz 1874/1911/1973. Psychology from an Empirical Standpoint. London: Routledge
& Kegan Paul.
Cappelen, Herman & Ernie Lepore 2005. Insensitive Semantics. Oxford: Blackwell.
Carnap. Rudolf 1942. Introduction to Semantics. Chicago, IL:Tlic University of Chicago Press.
Carston. Robyn 1988. Iinplicaturc.cxplicaturc and truth-theoretic semantics. In: R. Kcmpson (cd.).
Mental Representations: The Interface between Language and Reality. Cambridge: Cambridge
University Press, 155-181.
Carston, Robyn 2002a. Thoughts and Utterances. The Pragmatics of Explicit Communication.
Oxford: Blackwell.
Carston, Robyn 2002b. Linguistic meaning, communicated meaning and cognitive pragmatics. Mind
& Language 17,127-148.
Chisholm, Robert M. 1957. Perceiving: A Philosophical Study. Ithaca. NY: Cornell University Press.
Chomsky, Noam 1975. Reflections on Language. New York: Pantheon Books.
Chomsky, Noam 1980. Rules and Representations. New York: Columbia University Press.
Cliurchland, Paul M. 1989. A Neiirocomputational Perspective: The Nature of Mind and the
Structure of Science. Cambridge, MA: The MIT Press.
24
I. Foundations of semantics
Dennett. Daniel C. 1987. The Intentional Stance. Cambridge. MA:Tlie MIT Press.
Dretske. Fred 1995. Naturalizing the Mind. Cambridge, MA:Tlie MIT Press.
Fodor. Jerry A. 1975. The Language of Thought. New York: Crowell.
Fodor, Jerry A. 1987. Psychosemantics: The Problem of Meaning in the Philosophy of Mind.
Cambridge, MA:The MIT Press.
Frege, Gottlob 1892/1980. Uber Sinn und Bedeutung. Zeitschrift fur Philosophic undphilosophische
Kritik 100,25-50. English translation in: P. Geach & M. Black (eds.). Translations from the
Philosophical Writings of Gottlob Frege. Oxford: Blackwell. 1980,56-78.
Green, Mitchell 2007. Speech acts. Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/
entries/speech-acts, July 20,2008.
Grice. H. Paul 1957. Meaning. The Philosophical Review 64. 377-388. Reprinted in: H. P. Grice.
Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989,212-223.
Grice, H. Paul 1967/1975. Logic and conversation. In: P. Cole & J. Morgan (eds.). Syntax and
Semantics 3: Speech Acts. New York: Academic Press, 41-58. Reprinted in: H.P Grice. Studies in the Way
of Words. Cambridge. MA: Harvard University Press, 1989,22^0.
Grice, H. Paul 1968. Utterer's meaning, sentence meaning and word meaning. Foundations of
Language 4, 225-242. Reprinted in: H.P. Grice. Studies in the Way of Words. Cambridge, MA:
Harvard University Press, 1989.117-137.
Grice, H. Paul 1969. Utterer's meaning and intentions. Philosophical Review 78,147-177. Reprinted
in: H.P. Grice. Studies in the Way of Words. Cambridge, MA: Harvard University Press, 1989,
86-116.
Grice, H. Paul 1978. Further notes on logic and conversation. In: P. Cole (ed.). Syntax and Semantics
9: Pragmatics. New York: Academic Press, 113-128. Reprinted in: H. P. Grice. Studies in the Way
of Words. Cambridge, MA: Harvard University Press, 1989,41-57.
Grice, H. Paul 1981. Presuppositions and conversational implicatures. In: P. Cole (ed.). Radical
Pragmatics. New York: Academic Press, 183-198. Reprinted in: H. P. Grice. Studies in the Way of
Words. Cambridge, MA: Harvard University Press, 1989.269-282.
Grice, H. Paul 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Haugeland, John 1981. Semantic engines: An introduction to mind design. In: J. Haugeland (ed.).
Mind Design, Philosophy, Psychology, Artificial Intelligence. Cambridge, MA: The MIT Press,
1-34.
Horwich, Paul 2005. Reflections on Meaning. Oxford: Oxford University Press.
Jacob, Pierre 1997. What Minds Can Do. Cambridge: Cambridge University Press.
Jacob, Pierre 2003. Intentionality. Stanford Encyclopedia in Philosophy, http://plato.Stanford.edu/
intensionality/, June 15,2006.
Katz, Jerry J. 1981 Language and Other Abstract Objects. Oxford: Blackwell.
Leslie, Alan 2000. Theory of Mind' as a mechanism of selective attention. In: M. Gazzaniga (ed.).
The New Cognitive Neuroscience. Cambridge, MA:The MIT Press, 1235-1247.
Meinong, Alexis 1904.Uber Gegenstandstheorie. In: A.Meinong (cd.). Untersuchungen zur Gegen-
standstheorie und Psychologie. Leipzig: Barth, 1-50. English translation in: R. M. Chisholm (ed.).
Realism and the Background of Phenomenology. Glencoe:The Free Press, 1960,76-117.
Millikan, Ruth G. 1984. Language, Thought and Other Biological Objects. Cambridge, MA: The
MIT Press.
Morris, Charles 1938. Foundations of the Theory of Signs. Chicago. IL:Tlie University of Chicago
Press.
Nunbcrg. Geoffrey 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics &
Philosophy 3.143-184.
Nunberg. Geoffrey 1995. Transfers of meaning. Journal of Semantics 12.109-132.
Origgi. Gloria & Dan Sperber 2000. Evolution, communication and the proper function of
language. In: P. Carruthers & A. Chamberlain (eds.). Evolution and the Human Mind: Language,
Modularity and Social Cognition. Cambridge: Cambridge University Press, 140-169.
Parsons,Terence 1980. Nonexistent Objects. New Haven, CT: Yale University Press.
3. (Frege on) Sense and reference 25
Quine, Willard van Orman 1948. On what there is. Reprinted in: W.V.O. Quine. From a Logical
Point of View. Cambridge, MA: Harvard University Press, 1953,1-19.
Quine, Willard van Orinan 1960. Word and Object. Cambridge, MA:The MIT Press.
Recanati. Francois 1987. Meaning and Force. Cambridge: Cambridge University Press.
Recanati. Francois 2004. Literal Meaning. Cambridge: Cambridge University Press.
Rey, Georges 1997. Contemporary Philosophy of Mind: A Contentiously Classical Approach.
Oxford: Blackwell.
Russell. Bertrand 1905. On denoting. Mind 14,479^93. Reprinted in: R. C. Marsh (ed.). Bertrand
Russell, Logic and Knowledge, Essays I90I-I950. New York: Capricorn Books, 1956.41-56.
Schiffer. Stephen 1972. Meaning. Oxford: Oxford University Press.
Searle, John R. 1969. Speech Acts. Cambridge: Cambridge University Press.
Searle, John R. 1975. Indirect speech acts. In: P. Cole & J. Morgan (eds.). Syntax and Semantics 3:
Speech Acts. New York: Academic Press, 59-82.
Searle, John R. 1978. Literal meaning. Erkenntnis 13.207-224.
Searle, John R. 1983. Intentionality. Cambridge: Cambridge University Press.
Searle, John R. 1992. The Rediscovery of the Mind. Cambridge, M A:Thc MIT Press.
Sperber, Dan 2000. Metarepresentations in an evolutionary perspective. In: D. Sperber (ed.). Meta-
represen tat ions: A Multidisciplinary Perspective. Oxford: Oxford University Press, 117-137.
Sperber. Dan & Deirdre Wilson 1986. Relevance, Communication and Cognition. Cambridge. MA:
Harvard University Press.
Stalnaker, Robert 1999. Context and Content. Oxford: Oxford University Press.
Stanley, Jason 2000. Context and logical form. Linguistics & Philosophy 23,391-434.
Strawson, Peter F. 1952. Introduction to Logical Theory. London: Methuen.
Strawson, Peter F. 1964. Intention and convention in speech acts. The Philosophical Review 73.
439-460. Reprinted in: P.F. Strawson. Logico-linguistic Papers. London: Methuen. 1971.149-169.
Travis, Charles 1991. Annals of analysis: Studies in the Way of Words, by H.P. Grice. Mind 100,
237-264.
Wilson. Deirdre & Dan Sperber 2002.Truthfulness and relevance. Mind 111,583-632.
Zalta, Edward N. 1988. Intensional Logic and the Metaphysics of Intentionality. Cambridge, MA:
The MIT Press.
Pierre Jacob, Paris (France)
3. (Frege on) Sense and reference
1. Sense and reference: a short overview
2. Introducing sense and reference
3. Extending and exploring sense and reference
4. Criticising sense and reference
5. Summary
6. References
Abstract
Gottlob Frege argues in "Ober Sinn und Bedeutung" that every intuitively meaningful
expression has a sense. He characterises the main ingredient of sense as a 'mode of
presentation' of at most one thing, namely the referent. Theoretical development of sense
Maienborn, von Heusingerand Portner (eds.) 2011, Semantics (HSK33.1), de Gruyter, 25-49
26
I. Foundations of semantics
and reference has been a fundamental task for the philosophy of language. In this article
I will reconstruct Frege's motivation for the distinction between sense and reference
(section 2). I will then go on to discuss how the distinction can be applied to predicates,
sentences and context-dependent expressions (section 3). The final section 4 shows how
discussions of Frege's theory lead to important proposals in semantics.
1. Sense and reference: a short overview
Gottlob Frege argues in Uber Sinn und Bedeutung that every intuitively meaningful
expression has a sense ("Sinn"). The main ingredient of sense is suggestively
characterised as a 'mode of presentation' of at most one thing, namely the referent ("Bedeutung").
Frege's theory of sense and reference has been at the centre of the philosophy of
language in the 20th century. The discussion about it is driven by the question whether an
adequate semantic theory needs to or even can coherently ascribe sense and reference
to natural language expressions. On the side of Frege's critics are, among others, Russell
and Kripke.
Russell (1905) argues that the introduction of sense creates more problems than
it solves. If there arc senses, one should be able to speak about them. But according
to Russell, every attempt to do so leads to an 'inextricable tangle'. Russell goes on to
develop his theory of definite descriptions to avoid the problems Frege's theory seems
to create. How and whether Russell's argument works is still a matter of dispute. (For
recent discussions see Levine 2004 and Makin 2000.)
More recently, Kripke (1972) has marshalled modal and epistemic arguments against
what he called the "The Frege-Russell picture"."Neo-Millians"or"Neo-Russellians"are
currently busy closing the holes in Kripke's arguments against Frege.
Frege's friends have tried to develop and defend the distinction between sense and
reference. In Meaning and Necessity Carnap (1956) takes Frege to task for multiplying
senses beyond necessity and therefore proposes to replace sense and reference with
intension and extension. I will return to Carnap's view in section 4.1.
Quine (1951) has criticised notions like sense and intension as not being definable
in scientifically acceptable terms. Davidson has tried to preserve Frege's insights in
view of Quine's criticism. According to Davidson, "a theory of truth patterned after a
Tarski-type truth definition tells us all we need to know about sense. Counting truth
in the domain of reference, as Frege did, the study of sense thus comes down to the
study of reference" (Davidson 1984:109). And reference is,evenby Quinean standards, a
scientifically acceptable concept. Davidson's proposal to use theories of truth as
theories of sense has been pursued by in recent years and lead to fruitful research into the
semantics of proper names. (See McDowell 1977 and Sainsbury 2005. See also Wiggins
1997.)
Dummett has criticised Davidson's proposal (see Dummett 1993, essay 1 and 2). A
theory of sense should be graspable by someone who does not yet master a language and
a theory of truth docs not satisfy this constraint. A theory of meaning (sense) should be
a theory of understanding and a theory of understanding should take understanding to
consist in the possession of an ability, for example, the ability to recognise the bearer of
a name.
The jury is still out on the question whether the distinction between sense and
reference should be preserved or abandoned. The debate has shed light on further
3. (Frege on) Sense and reference 27
notions like compositionality, co-reference and knowledge of meaning. In this
article I want to introduce the basic notions, sense and reference, and the problems
surrounding them.
A note of caution. Frege's theory is not clearly a descriptive theory about natural
language, but applies primarily to an ideal language like the Begriffsschrift (see Frege,
Philosophical and Mathematical Correspondence (PMC): 101). If a philosopher of
language seeks inspiration in the work of Frege, he needs to consider the question whether
the view under consideration is about natural language or a language for inferential
thought.
2. Introducing sense and reference
2.1. Conceptual content and cognitive value
Frcgc starts his logico-philosophical investigations with judgement and inference. (Sec
Burge 2005:14f). An inference is a judgement made "because we are cognizant of other
truths as providing a justification for it." (Posthumous Writings (PW): 3).The premises of
an inference are not sentences, but acknowledged truths.
In line with this methodology, Frege introduces in BS the conceptual content of a
judgement as an abstraction from the role the judgement plays in inference:
Let mc observe that there arc two ways in which the contents of two judgements can differ:
it may. or it may not, be the case that all inferences that can be drawn from the first
judgement when combined with certain other ones can always be drawn from the second when
combined with the same judgements.!...] Now I call the part of the content that is the same
in both the conceptual content.
(BS,§3:2-3)
Frege's 'official' criterion for sameness of conceptual content is:
s has the same conceptual content as s* iff
given truths t, to tn as premises, the same consequences can be inferred from s and s*
together with t, to tn.
(See,BS,§3)
Frege identifies the conceptual content of a sentence with a complex of the things for
which the sentence constituents stand (BS §8). This raises problems for the conceptual
content of identity sentences. In identity sentences the sign of identity is flanked by what
Frege calls 'proper names'. What is a proper name?
For Frege, every expression that can be substituted for a genuine proper name
salva grammaticale and salva veritate qualifies as a proper name. Fregean proper
names include genuine proper names, complex demonstratives ('That man') when
completed by contextual features and definite descriptions ('the negation of the
thought that 2 = 3'). Following Russell (1905), philosophers and linguists have argued
that definite descriptions do not belong on this list. Definite descriptions do not
stand for objects, but for properties of properties. For example,'The first man on the
moon is American' asserts that (the property) being a first man on the moon is uniquely
instantiated and that whatever uniquely has this property has also the property of
28
I. Foundations of semantics
being American. The discussion about the semantics of definite descriptions is
ongoing, but we can sidestep this issue by focusing on genuine proper names. (For an
overview of the recent discussion about definite descriptions see Bezuidenhout &
Reimer 2004.)
Equations figure as premises in inferences that extend our knowledge. How can this
be so? In BS Frege struggles to give a convincing answer. If one holds that "=" stands for
the relation of identity, the sentences
(51) The evening star = the evening star
and
(52) The evening star = the morning star,
have the same conceptual content CC:
CC(S1): < Venus, Identity. Venus> = CC(S2): <Venus, Identity. Venus>.
But although (SI) and (S2) stand for the same complex, they differ in inferential
potential. For example, if we combine the truth that (SI) expresses with the truth that the
evening star is a planet we can derive nothing new, but if we combine the latter truth we
can derive the truth that the morning star is a planet.
Prima facie, (i) Frege's sameness criteria for conceptual content are in conflict with
(ii) his claim that expressions are everywhere merely placeholders for objects. In BS
Frege responds by restricting (ii): in a sentence of the form "a = b" the signs "a" and "b"
stand for themselves; the sentence says that "a" and "b" have the same conceptual
content.The idea that designations refer to themselves in some contexts helps to explain the
difference in inferential potential. It can be news to learn that "Dr. Jekyll" stands for the
same person as "Mr. Hyde". Although promising, we will see in a moment that this move
alone does not distinguish (Si) and (S2) in the way required.
Treating identity statements as the exception to the rule that every expression just
stands for an object allows Frcgc to keep his identification of the conceptual content of
a sentence with a complex of objects and properties. The price is that the reference of a
non-ambiguous and non-indexical term will shift from one sentence to another. Consider
a quantified statement like:
(Vx)(Vy)((x = y&Fx)->Fy)
If signs stand for themselves in identity-statements, one can neither coherently suppose
that the variables in the formula range over signs, nor over particulars. (See Furth 1964:
xix,and Mendelsohn 1982:297f.)
These considerations prime us for the argument that motivates Frcgc's introduction
of the distinction between sense and reference. Frege's new argument uses the notion
of cognitive value ("Erkenntniswert"). For the purposes of Frege's argument it is not
important to answer the question "What is cognitive value?": more important is the
3. (Frege on) Sense and reference 29
question "When do two sentences s and s* differ in cognitive value?" Answer: If the
justified acceptance of s puts one in a position to come to know something that one neither
knew before nor could infer from what one knew before, while the justified acceptance
of s* does not do so (or the other way around). Frege hints at this notion of cognitive
value in a piece of self-criticism in The Foundations of Arithmetic (FA). If a true equation
connected two terms that refer to the same thing in the same way
[a]ll equations would then come down to this, that whatever is given to us in the same way is
to be recognized as the same. But this is so self-evident and so unfruitful that it is not worth
stating. Indeed, no conclusion could ever be drawn here that was different from any of the
premises.
(FA. §67:79. My emphasis)
If a rational and minimally logically competent person accepts what "The evening star
is a planet" says, and she comes also to accept the content of "The evening star = the
morning star", she is in a position to acquire the knowledge that the morning star is a
planet. This new knowledge was not inferable from what she already knew. Things are
different if she has only reason to accept the content of "The evening star = the evening
star". Exercising logical knowledge won't take her from this premise to a conclusion that
she was not already in a position to know on the basis of the other premise or her general
logical knowledge.
On the basis of this understanding of cognitive value, Frege's argument can now be
rendered in the following form:
(PI) (SI) "The evening star = the evening star" differs in cognitive value from (S2) "The
evening star = the morning star".
(P2) The difference in cognitive value between (SI) and (S2) cannot be constituted by
a difference in reference of the signs composing the sentence.
(P3) "If the sign 'a' is distinguished from the sign 'b' only as object (here by means of
its shape), not as sign (i.e., not by the manner in which it designates something)
the cognitive value of a = a becomes essentially equal to the cognitive value of
a = b."
(P4) "A difference can only arise if the difference between the signs corresponds to a
difference between the mode of presentation of that which is designated."
(CI) The difference in cognitive value between (SI) and (S2) is constituted by the fact
that different signs compose (SI) and (S2) AND that the different signs designate
something in different ways.
We have now explained (PI) and Frege has already given convincing reasons for holding
(P2) in BS. Let us look more closely at the other premises.
2.2. Sense, sign and logical form
In On Sense and Reference (S&R) Frege gives several reasons for the rejection of the
Begriffsschrift view of identity statements.The most compelling one is (P3). Frege held
in BS that (SI) has a different cognitive value from (S2) because (i) in (S2) different
30
I. Foundations of semantics
singular terms flank the identity sign that (ii) stand for themselves. If this difference
is to ground the difference in cognitive value, Frege must provide an answer to the
question "When do different signs and when does the same sign (tokens of the same
sign) flank the identity sign?" that is adequate for this purpose. (P3) says that the
individuation of signs in terms of their form alone is not adequate. Frege calls signs
that are individuated by their form 'figures'. The distinctive properties of a figure are
geometrical and physical properties. There are identity-statements that contain two
tokens of the same figure ("Paderewski = Paderewski"), which have the same cognitive
value as identity statements that contain two tokens of different figures ("Paderewski =
the prime minister of Poland between 1917 and 1919"). There are also identity-
statements that contain tokens of different figures ("Germany's oldest bachelor is
Germany's oldest unmarried eligible male") that have the same cognitive value as
identity statements that contain two tokens of the same figure ("Germany's oldest
bachelor is Germany's oldest bachelor").
If sameness (difference) of figure does not track sameness (difference) of cognitive
value, then the Begriffsschrift view needs to be supplemented by a method of sign
individuation that is not merely form-based. Frege's constructive suggestion is (P4). Let us
expand on (P4). I will assume that I have frequently seen what I take to be the brightest
star in the evening sky. I have given the star the name "the evening star". I will apply "the
evening star" to an object if and only if it is the brightest star in the evening sky. I am also
fascinated by what I take to be the brightest star in the morning sky. I have given this star
the name "the morning star" and I will apply "the morning star" to an object if and only
if it is the brightest star in the morning sky.
Now if the difference in form between "the evening star" and "the morning star"
indicates a difference in mode of designation, one can explain the difference in
cognitive value between (SI) and (S2). (Si) and (S2) contain different figures AND the
different figures refer to something in a different way (they arc connected in my idiolect to
different conditions of correct reference).
Frege assumes also that a difference in cognitive value can only arise if different
terms designate in different ways. But (SI) and (S2) would be translated differently into
the language of first-order logic with identity: (SI) as "a = a", (S2) as "a = b". Why not
hold with Putnam (1954) that the difference in logical form grounds the difference in
cognitive value?
This challenge misses the point of (P3). How can wc determine the logical form of
a statement independently of a method of individuating signs? As we have already
seen, if "b" is a mere stylistic alternative for "a" (the difference in form indicates
no difference in mode of presentation), the logical form of "a = b" is a = a. If
difference of cognitive value is to depend on differences of logical form, logical form cannot
merely be determined by the form of the signs contained in the sentence. Logical
form depends on sense. The appeal to logical form cannot make the notion of sense
superfluous.
If Frege's argument is convincing, it establishes the conclusion that the sense of an
expression is distinct from its reference. The argument tells us what sense is not, it does
not tell us what it is. The premises of the argument arc compatible with every
assumption about the nature of sense that allows the sense of an expression to differ from its
reference.
3. (Frege on) Sense and reference 31_
3. Extending and exploring sense and reference
3.1. The sense and reference of predicates and sentences
Not only proper names have sense and reference. Frege extends the distinction to
concept-words (predicates) and sentences. A letter to Husserl contains a diagram that
outlines the complete view (PMC: 63):
Proper Name Sentence Concept-word
Sense Thought Sense concept-word
Referent Truth-Value Concept -* Object that falls under it.
Fig. 3.1: Frege's diagram
The sense of a complete assertoric sentence is a thought. The composition of the
sentence out of sentence-parts mirrors the composition of the thought it expresses. If (i) a
sentence S is built up from a finite list of simple parts e, to en according to a finite number
of modes of combination, (ii) e, to en express senses, and (iii) the thought expressed by
S contains the senses of c, to cn arranged in a way that corresponds to the arrangement
of e, to en in S, we can express new thoughts by re-combining sentence-parts we already
understand. Since, we can express new thoughts, we should accept (i) to (iii). (See Frege,
Collected Papers (CP):390 and PMC: 79.)
Compositionahty is the property of a language L that the meaning of its complex
expressions is determined by the meaning of their parts and their mode of combination.
(Sec article 6 (Pagin & Wcstcrstahl) Compositionality.) According to Frege, natural
language and his formal language have a property that is like compositionality, but not the
same.
First, the claim that the composition of a sentence mirrors the composition of the
thought expressed implies that the sense of the parts and their mode of combination
determine the thought expressed, but the determination thesis does not imply the
mirroring thesis.
Second, Frcgcan thoughts arc not meanings. The meaning of an indcxical sentence
("I am hungry") does not vary with the context of utterance, the thought expressed
does.
Third, the assertoric sentence that expresses a thought often contains constituents
that are not words, while compositionality is usually defined in terms of expression parts
that are themselves expressions. For example, Frege takes pointings, glances and the time
of utterance to be part of the sign that expresses a thought. (See T, CP: 358 (64). For
discussion see Kiinne 1992 andTextor 2007.)
If the thought expressed by a sentences s consists of the senses expressed by the parts
of s in an order, a sentence s expresses a different thought from a sentence s* if s and
s* arc not composed from parts with the same sense in the same order. (Sec Dummett
1981a: 378-379.) Hence, the Mirror Thesis implies that only isomorphic sentences can
express the same thought.This conclusion contradicts Frege's view that the same thought
32
I. Foundations of semantics
can be decomposed differently into different parts, and that no decomposition can be
selected on independent grounds as identifying the thought. Frege argues, for example,
that 'There is at least one square root of 4' and 'The number 4 has the property that there
is something of which it is the square' express the same thought, but the wording of the
sentence suggests different decompositions of the thought expressed. (See CP: 188 (199—
200).) Progress in logic often consists in discovering that different sentences express
the same thought. (See PW: 153-154.) Hence, there is not the structure of a thought. A
fortiori, the sentence structure cannot be isomorphic with the structure of the thought
expressed.
The Mirror and Multiple Decomposition Thesis are in conflict. Dummett (1989) and
Rumfitt (1994) want to preserve the Mirror View that has it that thoughts arc complex
entities,Geach (1975),Bell (1987) and Kemmerling (1990) argue against it.Textor (2009)
tries to integrate both.
Does an assertoric sentence have a referent, and, if it has one, what is it? Frege
introduces the concept of a function into semantic analysis. Functional signs are
incomplete. In '1 + ^' the Greek letter simply marks an argument place. If we
complete the functional expression '1 + ^' with '1', we 'generate' a complex designator
for the value of the function for the argument 1, the number 2. The analysis of the
designator'1 + 1' into functional expressions and completing expressions mirrors the
determination of its referent. Frege extends the analysis to equations like '1 + 1' to '22
= 4'. In order to do so, he assumes (i) that the equation can be decomposed into
functional (for example %2 = 4') and non-functional expressions (for example '2'). The
functional expression stands for a function, the non-functional one for an argument.
Like '1 + 1', the equation '22 = 4' shall stand for the value the concept x2 = 4 assumes
for the argument 2. Frege identifies the values of these functions as truth-values: the
True and the False (CP: 144 (13)).
To generalise: An assertoric sentence s expresses a thought and refers to a truth-value.
It can be decomposed into functional expressions (concept-words) that stand for
functions from arguments to truth-values (concepts). The truth-value of s is the value which
the concept referred to by the concept-word distinguishable in s returns for the argument
referred to by the proper name(s) in s. In this way the semantics of sentences and their
constituents interlock. The plausibility of the given description depends crucially on the
assumption that talk of truth-values is more than a stylistic variant of saying that what
a sentence says is true (false). Frege's arguments for this assumption arc still under
scrutiny. (See Burge 2005, essay 3 and Ricketts 2003.)
3.2. Sense determines reference
The following argument is valid: Hesperus is a planet; Hesperus shines brightly.
Therefore, something is a planet and shines brightly, is a valid. By contast, this argument isn't:
Hesperus is a planet; Phosphorus shines brightly. Therefore, something is a planet and
shines brightly. Why?
A formally valid argument is an argument whose logical form guarantees its
validity. All arguments of the form "Fa, Ga, Therefore: (3x) (Fx & Gx)" are valid, i.e. if the
premises are true, the conclusion must also be true. If the first argument is formally
valid, the "Hesperus" tokens must have the same sense and reference. If the sense were
different, the argument would be a fallacy of equivocation. If the reference could be
3. (Frege on) Sense and reference 33
different, although the sense was the same, the argument could take us from true
premises to a false conclusion, because the same sense could determine different objects
in the premises and the conclusion. This consideration makes a thesis plausible that is
frequently attributed to Frege, although never explicitly endorsed by him:
Sense-Determines-Reference: Necessarily, if a and P have the same sense, a and P have the
same referent.
Sense-Determines-Reference is central to Frege's theory. But isn't it refuted by
Putnam's (1975) twin earth case? My twin and I may connect the same mode of
presentation with the syntactically individuated word "water", but the watery stuff in my
environment is H20, in his XYZ. While I refer to H20 with "water", he refers to XYZ
with "water".
It is far from clear that this is a convincing counterexample to Sense-Determines-
Reference. If we consider context-dependent expressions, it will be necessary to
complicate to Sense-determines-Reference to Sense-determines-Reference in context
of utterance. But this complication does not touch upon the principle. Among other
things, Putnam may be taken to have shown that "water" is a context-dependent
expression.
3.3. Transparency and homogeneity.
In the previous section wc were concerned with the truth-preserving character of
formally valid arguments. In this section the knowledge-transmitting character of such
arguments will become important: a formally valid argument enables us to come to know
the conclusion on the basis of our knowledge of the premises and the logical laws (See
Campbell 1994, chap. 3.1 and 3.2). Formally valid arguments can only have thisepistemic
virtue if sameness of sense is transparent in the following way:
Transparency-of-Sense-Sameness: Necessarily, if a and P have the same sense, everyone who
grasps the sense of a and P. thereby knows that a and P have the same sense,provided that
there is no difficulty in grasping the senses involved.
Assume for reduction that sameness of sense is not transparent in the first argument in
section 3.2.Then you might understand and accept "Hesperus is a planet" (first premise)
and understand and accept "Hesperus shines brightly" (second premise), but fail to
recognise that the two "Hesperus" tokens have the same sense. Hence, your understanding of
the premises does not entitle you to give them the form "Fa" and "Ga" and hence, you
cannot discern the logical form that ensures the formal validity of the argument.
Consequently, you are not in a position to come to know the conclusion on the basis of your
knowledge of the premises alone. The argument would be formally valid, but one could
not come to know its conclusion without adding the premise
Hesperus mentioned in premise 1 is the same thing as Hesperus mentioned in premise 2.
Since we take some arguments like the one above to be complete and knowledge
transmitting, sameness of sense must be transparent.
34
I. Foundations of semantics
If we assume in addition to Transparency-of-Sense-Sameness:
Competent speakers know that synonymous expressions co-refer, if they refer at all,
we arrive at:
Sense-Determines-Rcference: Necessarily, if a and P have the same sense, and a refers to
a and P refers to b. everyone who grasps the sense of a and P knows that a = b.
Frege makes use of Sense-Reference in Thoughts to argue for the view that an indexical
and a proper name, although co-referential, differ in sense.
Transparency-of-Sense-Sameness has further theoretical ramifications. Russell wrote
to Frege:
I believe that in spite of all its snowfields Mont Blanc itself is a component part of what is
actually asserted in the proposition 'Mont Blanc is more than 4000 meters high'.
(PMC: 169)
If Mont Blanc is part of the thought expressed by uttering "Mont Blanc is a mountain",
every piece of rock of Mont Blanc is part of this thought.To Frege, this conclusion seems
absurd (PMC: 79). Why can a part of Mont Blanc that is unknown to me not be part
of a sense I apprehend? Because if there were unknown constituents of the sense of
"Mont Blanc" I could not know whether the thought expressed by one utterance of
"Mont Blanc is a mountain" is the same as that expressed by another merely by grasping
the thought. Hence, Frege does not allow Mont Blanc to be a constituent of the thought
that Mont Blanc is a mountain. He endorses the Homogeneity Principle that a sense can
only have other senses as constituents (PMC: 127).
How plausible is Transparency-of-Sense-Sameness? Dummett (1975:131) takes it to be
an'undeniable feature of the notion of meaning'.Frege's proviso to Transparency-of-Sense-
Sameness shows that he is aware of difficulties. Many factors (the actual wording,
pragmatic implicatures, background information) can bring about that one assents to s while
doubting s*, although these sentences express the same thought. Frege has not worked
out a reply to these difficulties. However, he suggests that two sentences express the same
thought if one can explain away appearances to the contrary by appeal to pragmatics etc.
Recently, some authors have argued that it is not immediate knowledge of co-
reference, but the entitlement to 'trade on identity' which is the basic notion in a theory
of sense (see Campbell 1994: 88). For example, if I use the demonstrative pronoun
repeatedly to talk about an object that I visually track ("That bird is a hawk. Look,
now that bird pursues the other bird"), I am usually not in a position to know
immediately that I am referring to the same thing. But the fact that I am tracking the bird
gives me a defeasible right to presuppose that the same bird is in question. There is no
longer any suggestion that the grasp of sense gives one a privileged access to sameness of
reference.
3.4. Sense without reference
According to Frege, the conditions for having a sense and the conditions for having a
referent are different. Mere grammatically is sufficient for an expression that can stand in
3. (Frege on) Sense and reference 35
for (genuine) proper names ("the King of France") to have a sense, but it is not sufficient
for it to have a referent (S&R: 159 (28)).
What about genuine proper names? The name 'Nausikaa' in Homer's Odyssee is
probably empty. "But it behaves as if it names a girl, and it is thus assured of a sense."
(PW: 122). Frege's message is:
If a is a well-formed expression that can take the place of a proper name or a is a proper
name that purports to refer to something, a has a sense.
An expression purports to refer if it has some of the properties of an expression that does
refer and the possession of these properties entitles the uninformed speaker to treat it
like a referring expression. For instance, in understanding "Nausikaa" I will bring
information that I have collected under this name to bear upon the utterance.
It is very plausible to treat (complex) expressions and genuine proper names
differently when it comes to significance. We can ask a question like "Does a exist?" or "Is a
so-and-so"?, even if "a" does not behave as if it to refers to something. "The number
which exceeds itself" docs not behave as if it names something. Yet, it seems implausible
to say that "The number which exceeds itself cannot exist" does not express a thought.
What secures a sense for complex expressions is that they are composed out of
significant expressions in a grammatically correct way. Complex expressions need not behave
as if they name something, simple ones must.
Since satisfaction of the sufficient conditions for sense-possession does not require
satisfaction of the sufficient conditions for referentiality, there can be sense without
reference:
Sense-without-Reference: If the expression a is of a type whose tokens can stand for an
object, the sense of a determines at most one referent.
If Sense without Reference is true, the mode of presentation metaphor is misleading. For
there can be no mode of presentation without something that is presented. Some Neo-
Fregeans take the mode of presentation metaphor to be so central that they reject Sense
without Reference (see Evans 1982: 26). But one feels that what has to give here is the
mode of presentation metaphor, not the idea of empty but significant terms. Are there
more convincing reasons to deny that empty singular terms are significant?
The arguments pro and contra Sense without Reference are based on existence
conditions for thoughts. According to Evans, a thought must either be either true or false, it
cannot lack a truth-value (Evans 1982:25). Frege is less demanding:
The being of a thought may also be taken to lie in the possibility of different thinkers
grasping the same thought as one and the same thought.
(CP: 376 (146))
A thought is the sense of a complete propositional question. One grasps the thought
expressed by a propositional question iff one knows when the question deserves the
answer "Yes" and when it deserves the answer "No". Frege aims above for the following
existence condition:
(E!)The thought that p exists if the propositional question "p?"can be raised and addressed
by different thinkers in a rational way.
36
I. Foundations of semantics
There is a mere illusion of a thought if different thinkers cannot rationally engage with
the same question. On the basis of (E!), Frege rejects the view that false thoughts aren't
thoughts. It was rational to ask whether the circle can be squared and different thinkers
could rationally engage with this question. Hence, there is a thought and not merely a
thought illusion, although the thought is necessarily false.
The opposition to Sense-without-Reference is also fuelled by the idea that (i) a thought
expressed by a sentence s is what one grasps when one understands (an utterance of)
s and (ii) that one cannot understand (an utterance of) a sentence with an empty singular
term. A representative example of an argument against Sense without Reference that goes
through understanding is Evans argument from diversity (Evans 1982: 336). He argues
that the satisfaction of the conditions for understanding a sentence s with a proper name
n requires the existence of the referent of n. Someone who holds Sense without Reference
can provide a communication-allowing relation in the empty case: speaker and audience
grasp the same mode of presentation. But Evans rightly insists that this proposal is ad
hoc. For usually, we don't require exact match of mode of presentation. The Fregean is
challenged to provide a common communication-allowing relation that is present in the
case where the singular term is empty and the case where it is referential.
Sainsbury (2005, chapter 3.6) proposes that the common communication-allowing
relation is causal: exact match of mode of presentation is not required, only that there
is a potentially knowledge-transmitting relation between the speaker's use and the
audience episode of understanding the term.
3.5. Indirect sense and reference
If we assume that the reference of a sentence is determined by the reference of its parts,
the substitution of co-referential terms of the same grammatical category should leave
the reference of the sentence (e.g. truth-value) unchanged. However, Frege points out
that there are exceptions to this rule:
Gottlob said (believes) that the evening star shines brightly in the evening sky.
The evening star = the morning star.
Gottlob said (believes) that the morning star shines brightly in the evening sky.
Here the exchange of co-referential terms changes the truth-value of our original
statement. (In order to keep the discussion simple I will ignore propositional attitude
ascriptions in which the singular terms are not within the scope of 'S believes that ...', see
Burge (2005:198). For a more general discussion of these issues see article 60 (Swanson)
Propositional attitudes.)
Frege takes these cases to be exceptions of the rule that the reference of a sentence is
determined by the referents of its constituents. What the exceptions have in common is
that they involve a shift of reference:
In reported speech one talks about the sense, e.g. of another person's remarks. It is quite
clear that in this way of speaking words do not have their customary reference but
designate what is usually their sense. In order to have a short expression, we will say: In reported
speech, words are used indirectly or have their indirect reference. We distinguish accordingly
the customary sense from its indirect sense.
(S&R: 159 (28))
3. (Frege on) Sense and reference 37
Indirect speech is a defect of natural language. In a language that can be used for
scientific purposes the same sign should everywhere have the same referent. But in indirect
discourse the same words differ in sense and reference from normal discourse. In a letter
to Russell Frege proposes a small linguistic reform to bring natural language closer to his
ideal: "we ought really to have special signs in indirect speech, though their connection
with the corresponding signs in direct speech should be easy to recognize" (PMC: 153).
Let us implement Frege's reform of indirect speech by introducing indices that
indicate sense types. The index 0 stands for the customary sense expressed by an unem-
bedded occurrence of a sign, 1 for the sense the sign expresses when embedded under
one indirect speech operator like "S said that ...". By iterating indirect speech operators
wc can generate an infinite hierarchy of senses:
Direct speech: The-evening-star0 shines0 brightly0 in0 the-evening-sky0.
Indirect speech: Gottlobn believesn that the-evening-star, shines, brightly, in,
the-evening-sky,.
Doubly indirect speech: Bertrand0 believes0 that Gottlob, believes, that the-evening-star2
shines2 brightly2 in2 the-evening-sky2.
Triply indirect speech: Ludwig,, believes,, that Bertrand, believes, that Gottlob2 believes2
that the-evening-star3 shines, brightly3 in3 the-evening-skyv
"The evening star0" refers to the planet Venus and expresses a mode of presentation
of it; "the evening star," refers to the customary sense of "the evening star", a mode of
presentation of the planet Venus. "The evening star2" refers to a mode of presentation
of the mode of presentation of Venus and expresses a mode of presentation of a mode of
presentation of a mode of presentation of Venus.
Let us first attend to an often made point that requires a conservative modification of
Frege's theory. It seems natural to say that the name of a thought ("that the evening star
shines brightly in the evening sky") is composed of the nominaliscr "that" and the names
of the thought constituents in an order ("that" + "the evening star" + "shines" ...). If "the
evening star" names a sense in "Gottlob said that the evening star shines brightly in the
evening sky", then "Gottlob said that the evening star shines brightly in the evening sky
and it does shine brightly in the evening sky" should be false (the sense of "the evening
star" does not shine brightly). But the sentence is true!
Does this refute Frege's reference shift thesis? No, for can't one designator refer to
two things? The designator "the evening star," refers to the sense of "the vening star0"
AND to the evening star. Since such anaphoric back reference seems always possible,
we have no reference shift, but a systematic increase of referents. Fine (1989:267f) gives
independent examples of terms with multiple referents.
Frege himself should be sympathetic to this suggestion. In S&R he gives the following
example:
John is under the false impression that London is the biggest city in the world.
According to Frege, the italicised sentence 'counts double' ("ist doppelt zu nehmen"
S&R: 175, (48)). Whatever the double count exactly is, the sentence above will be true iff
38
I. Foundations of semantics
John is under the impression that London is the biggest city in the world
AND
It is not the case that London is the biggest city in the world.
Instead of doubly counting the same 'that'-designator, we can let it stand for a truth-
value and the thought that determines the truth-value. Reference increase seems more
adequate than reference shift.
Philosophers have been skeptical of Frege's sense hierarchy. What is so bad about an
infinite hierarchy of senses and references? If there is such a hierarchy, the language of
propositional attitude ascriptions is unlearnable, argues Davidson. One cannot provide
a finitely axiomatised theory of truth for propositional attitude ascriptions if a "that"-
clause is infinitely ambiguous (Davidson 1984: 208). The potentially infinitely many
meanings of "that p" would have to be learned one by one. Hence, we could not
understand propositional attitude ascriptions we have not heard before, although they are
composed of words we already master.
This argument loses its force if the sense of a "that" designator is determined by the
sense of the parts of the sense named and their mode of combination (see Burge 2005:
172). One must learn that "that" is a name forming operator that takes a sentence s
and returns a designator of the thought expressed by s. Since one can iterate the name
forming operator one can understand infinitely many designations for thoughts on the
basis of finite knowledge.
But is there really an infinite hierarchy of senses and references? We get an infinite
hierarchy of sense and references on the following assumptions:
(PI) Indirect reference of w in n embeddings = sense of w in n-\ einbeddings
(P2) Indirect sense of w in n einbeddings * sense of w in n embeddings
(P3) Indirect sense of w in n einbeddings * indirect sense of w in m embeddings if n * m.
Dummett has argued against (P2) that "the replacements of an expression in double oratio
obliqua which will leave the truth-value of the whole sentence unaltered are —just as in
single oratio obliqua — those which have the same sense" (Dummett 1981a: 269). Assume
for illustration that "bachelor" is synonymous with "unmarried eligible male". Dummett
is right that we can replace "bachelor" salva veritate in (i) "John believes that all
bachelors are dirty" and (ii) "John believes that Peter believes that all bachelors are dirty" with
"unmarried eligible male". But this does not show that "bachelor" retains its customary
sense in multiple embeddings. For Frcgc argues that the sense of "unmarried eligible male"
also shifts under the embedding. No wonder they can be replaced salva veritate.
However, if singly and doubly embedded words differ in sense and reference, one
should expect that words with the same sense cannot be substituted salva veritate in
different embeddings.The problem is that it is difficult to check in a convincing way for such
failures. If we stick to the natural language sentences, substitution of one token of the
same word will not change anything; if we switch to the revised language with indices, we
can no longer draw on intuitions about truth-value difference.
In view of a lack of direct arguments one should remain agnostic about the Frcgc an
hierarchy. I side here with Parsons that the choice between a theory of sense and
reference with or without the hierachy "must be a matter of taste and elegance" (Parsons 1997:
3. (Frege on) Sense and reference
39
408). The reader should consult the Appendix of Burge's (2005) essay 4 for attempts to
give direct arguments for the hierarchy, Peacocke (1999:245) for arguments against and
Parsons (1997: 402ff) for a diagnosis of the failure of a possible Fregean argument for
the hierarchy.
4. Criticising sense and reference
4.1. Carnap's alternative: extension and intension
Carnap's Meaning and Necessity plays an important role in the reception of Frege's ideas.
However, as Carnap himself emphasizes he docs not have the same explanatory aims
as Frege. Carnap himself aims to explicate and extend the distinction between
denotation and connotation, while he ascribes to Frege the goal of explicating the distinction
between the object named by a name and its meaning (Carnap 1956:127). This
description of Frege does not bear closer investigation. Frege argues that predicates ("£ > 1")
don't name their referents, yet they have a sense that determines a referent. Predicates
refer, although they are not names.
Carnap charges Frege for multiplying senses without good reason (Carnap 1956:
157 and 137). If every name has exactly one referent and one sense, and the referent
of a name must shift in propositional attitude contexts, Frege must have names for
senses which in turn must have new senses that determine senses and so on. Whether
this criticism is convincing is, as we have seen in the previous section, still an open
issue.
Carnap lays the blame for the problem at the door of the notion of naming. Carnap's
method of intension and extension avoids using this notion. Carnap takes equivalence
relations between sentences and subsentential expressions as given. In a second step
he assigns to all equivalent expressions an object. For example, if s is true in all state
descriptions in which s* is true, they have the same intension. If s and s* have the same
truth-value, they have the same extension. Expressions have intension and extension,
but they don't name either. It seems however unclear how Carnap can avoid appeal to
naming when it comes to equivalence relations between names. 'Hesperus' and
'Phosphorus' have the same extension because they name the same object (see Davidson
1962:326f).
An intension is a function from a possible world w (a state-description) to an object.
The intension of a sentence s is, for example, a function that maps a possible world w
onto the truth value of s in w; the intension of a singular term is a function that maps a
possible world w onto an object in w. Carnap states that the intension of every expression
is the same as its (direct) sense (Carnap 1956: 126). Now, intensions are more coarsely
individuated than Fregean senses.The thought that 2 + 2 = 4 is different from the thought
that 123 + 23 = 146, but the intension expressed by "2 + 2 = 4" and "123 + 23 = 146" is the
same. Every true mathematical statement has the same intension: the function that maps
every possible world to the True.
Although Carnap's criticism does not hit its target, his method of extension and
intension has been widely used. Lewis (1970), Kaplan (1977/1989) and Montague (1973)
have refined and developed it to assign intensions and extensions to a variety of natural
language expressions.
40
I. Foundations of semantics
4.2. Rigidity and sense
What, then, is the sense of a proper name? Frege suggests, but does not clearly endorse,
the following answer:The sense of a proper name (in the idiolect of a speaker) is given by
a definite description that the speaker would use to distinguish the proper name bearer
from all other things. For example, my sense of "Aristotle" might be given by the definite
description "the inventor of formal logic". I will now look at two of Kripkc's objections
against this idea that have figured prominently in recent discussion.
First, the modal objection (for a development of this argument see Soames 1998):
1. Proper names are rigid designators.
2. The descriptions commonly associated with proper names by speakers are non-rigid.
3. A rigid designator and a non-rigid one do not have the same sense.
Therefore:
4. No proper name has the same sense as a definite description speakers associate with it.
A singular term a is a rigid designator if, and only if, a refers to x in every possible
world in which x exists (Kripke 1980: 48-49). One can make sense of rigid
designation without invoking a plurality of possible worlds. If the sentence which results from
uniform substitution of singular terms for the dots in the schema
... might not have been ...
expresses a proposition which we intuitively take to be false, then the substituted term is
a rigid designator, if not, not.
Why accept (3)? If "Aristotle" had the same sense as "the inventor of formal logic",
we should be able to substitute "the inventor of formal logic" in all sentences without
altering the truth-value (salva veritate). But
Aristotle might not have been the inventor of formal logic. (Yes!)
and
Aristotle might not have been Aristotle. (No!)
But in making this substitution we go from a true sentence to a false one. How can this be?
Kripke's answer is: Even if everyone would agree that Aristotle is the inventor of formal
logic, the description "the inventor of formal logic" is not synonymous with "Aristotle".
The description is only used Xo fix the reference of the proper name. If someone answers
the question "Who is Aristotle?" by using the description, he gives no information about
the sense of "Aristotle". He just gives you advice how to pick out the right object.
The modal argument is controversial for several reasons. Can the difference between
sentences with (non-rigid) descriptions and proper names not be explained without
positing a difference in sense? Dummett gives a positive answer that explains the difference
in terms of a difference in scope conventions between proper names and definite
descriptions (see Dummett 1981a: 127ff; Kripke 1980, Introduction: 1 Iff replies; Dummett 1981b
Appendix 3 replies to the reply).
3. (Frege on) Sense and reference 41_
Other authors accept the difference, but argue that there are rigid definite
descriptions that are associated with proper names. (See, for example, Jackson 2005.) The debate
about this question is ongoing.
Second, the epistemic objection: Most people associate with proper names only definite
descriptions that are not satisfied by only one thing ("Aristotle is the great Greek
philosopher") or by the wrong object ("Peano is the mathematician who first proposed the axioms
for arithmetic").The 'Peano' axioms were first proposed by Dedekind (see Kripke 1972:84).
These objections do not refute Frege's conclusion that co-referential proper names can
differ in something that is more finely individuated than their reference. What the
objections show to be implausible is the idea that the difference between co-referential proper
names consists in a difference of associated definite descriptions. But arguments that
show that a proper name does not have the same sense as a particular definite
description do not show the more general thesis that a proper name has a sense that is distinct
from its referent to be false. In order to refute the general thesis an additional
argument is needed. The constraint that the sense of an expression must be distinct from its
reference leaves Frege ample room for manoeuvre. Consider the following two options:
The sense of "Aristotle" might be conceptually primitive: it is not possible to say what
this sense is without using the same proper name again.This is not implausible. We
usually don't define a proper name.This move of course raises the question how the sense of
proper names should be explained if one does not want to make it ineffable. More soon.
The sense of "Aristotle" in my mouth is identical with a definite description, but not
with a famous deeds description like "the inventor of bi-focals". Neo-Descriptivists offer
a variety of meta-linguistic definite descriptions ("the bearer of 'N'") 'to specify the
sense of the name (see, for example, Forbes 1990:536ff and Jackson 2005).
4.3. Too far from the ideal?
Frege has provided a good reason to distinguish sense from reference. The expressions
composing an argument must be distinguished by form and mode of presentation, if the
form of an argument is to reveal whether it is formally valid or not. If figures that differ in
form also differ in sense, the following relations obtain between sign, sense and reference:
The regular connection between a sign, its sense, and its reference is of such a kind that to
the sign there corresponds a definite sense and to that in turn a definite referent, while to a
given referent (an object) there docs not belong only a single sign.
(S&R: 159 (27))
The "no more than one" requirement is according to Frege the most important rule that
logic imposes on language. (For discussion see May 2006.) A language that can be used to
conduct proofs must be unambiguous. A proof is a scries of judgements that terminates
in the logically justified acknowledgement that a thought stands for the True. If "2 + 2
= 4" expressed different thoughts, the designation that 2+2 = 4 would not identify the
thought one is entitled to judge on the basis of the proof (CP: 316, Fn. 3).
Frege is under no illusion: there is, in general, no such regular connection between
sign, sense and reference in natural language:
To every expression belonging to a complete totality of signs [Gauzes von Bezeichnungen],
there should certainly correspond a definite sense: but natural languages often do not satisfy
42
I. Foundations of semantics
this condition, and one must be content if the same word lias the same sense in the same
context [Zusammenhange].
(S&R: 159 (27-28))
In natural language, the same shape-individuated sign may have more than one sense
("Bank"), and different occurrences of the same sign with the same reference may vary
in sense ("Aristotle" in my mouth and your mouth). Similar things hold for concept-
words. In natural language some concept-words are incompletely defined, others are
vague. The concept-word "is a natural number" is only defined for numbers, the sense
of the word does not determine its application to flowers, "is bald" is vague, it is neither
true nor false of me. Hence, some sentences containing such concept-words violate the
law of excluded middle: they are neither true nor false. To prevent truth-value gaps
Frege bans vague and incompletely defined concept-words from the language of
inference. Even the language of mathematics contains complex signs with more than one
referent ("V2").
Natural language doesn't comply with the rule that every expression has in all
contexts exactly one sense. Different speakers associate different senses with the figure
"Aristotle" referring to the same person at the same time and/or the same speaker
associates different senses with the same proper name at different times. Frege points
out that different speakers can correctly use the same proper name for the same
bearer on the basis of different definite descriptions (S&R: 27, fn. 2). More importantly,
such a principle seems superfluous, since differences in proper name sense don't
prevent speakers of natural language from understanding each other's utterances. Hence,
we seem to be driven to the consequence that differences in sense between proper
names and other expressions don't matter, what matters is that one talks about the
same thing:
So long as the reference remains the same, such variations of sense may be tolerated,
although they arc to be avoided in the theoretical structure of a demonstrative science
("bcwciscndc Wisscnschaft") and ought not to occur in a perfect language.
(S&R: 158. &fn. 4 (27))
Why is variation of sense tolerable outside demonstrative sciences? Frege answers:
The task of vernacular languages is essentially fulfilled if people engaged in
communication with one another connect the same thought, or approximately the same thought,
with the same sentence For this it is not at all necessary that the individual words should
have a sense and reference of their own, provided that only the whole sentence has a
sense.
(PMC: 115. In part iny translation)
Frege makes several interesting points here, but let us focus on the main one: if you
and I connect approximately the same thought with "Aristotle was born in Stagira", we
have communicated successfully. What does'approximately the same' amount to? What
is shared when you understand my question "Is Aristotle a student of Plato?" is not the
thought I express with "Aristotle is a student of Plato". If one wants to wax
metaphysically, what is shared is a complex consisting of Aristotle (the philosopher) and Plato (the
Philosopher) standing in the relation of being a student connected in such a way that the
3. (Frege on) Sense and reference 43
complex is true iff Aristotle is a student of Plato.There are no conventional, community-
wide senses for ordinary proper names, there is only a conventional community wide
reference. Russell sums this up nicely when he writes to Frege:
In the case of a simple proper name like "Socrates" I cannot distinguish between sense and
reference: I only see the idea which is something psychological and the object. To put it
better: I don't acknowledge the sense, only the idea and the reference.
(Letter to Frege 12.12.1904. PMC: 169)
If we go along with Frege's use of "sign" for a symbol individuated in terms of the mode
of presentation expressed, we must say that often we don't speak the same Fregean
language, but that it does not matter for communication. If we reject it, we will speak
the same language, but proper names turn out to be ambiguous. (Variations of this
argument can be found in Russell 1910/11: 206-207; Kripke 1979:108; Evans 1982: 399f and
Sainsbury 2005:12ff)
This line of argument makes the Hybrid View plausible (see Heck 1995: 79). The
Hybrid View takes Frege to be right about the content of beliefs expressed by
sentences containing proper names, but wrong about what atomic sentences literally say.
"Hesperus is a planet" and "Phosphorus is a planet" have the same content, because the
proper name senses are too idiosyncratic to contribute to what one literally says with an
utterance containing the corresponding name. Grasping what an assertoric utterance of
an atomic sentence literally says is, in the basic case, latching on to the right particulars
and properties combined in the right way. The mode in which they arc presented docs
not matter. By contrast, "S believes that Phosphorus is a planet" and "S believes that
Hesperus is a planet" attribute different beliefs to S.
The argument for the Hybrid View requires the Fregean to answer the question "Why
is it profitable to think of natural language in terms of the Fregean ideal in which every
expression has one determinate sense?" (see Dummett 1981a: 585).
Dummett himself answers that we will gradually approximate the Fregean ideal
because we can only rationally decide controversies involving proper names ("Did
Napoleon really exist?") when we agree about the sense of these names (Dummett
1981a: lOOf). Evans' reply is spot on:"[I]t is the actual practice of using the name 'a', not
some ideal substitute, that interests us [...]" (Evans 1982: 40). The semanticist studies
English, not a future language that will be closer to the Fregean ideal.
Heck has given another answer that is immune to the objection that the Fregean
theory is not a theory for a language anyone (already) speaks. Proponents of the Hybrid
View assume (i) that one has to know to which thing "Hesperus" refers in order to
understand it and (ii) that there is no constraint on the ways or methods in which one
can come to know this. But if understanding your utterance of "George Orwell wrote
1984" consists at least in part in coming to know of George Orwell that the utterance is
about him, one cannot employ any mode of presentation of George Orwell.The speaker
will assume that his audience can come to know what he is talking about on the basis of
his utterance and features of the context. If the audience's method of finding out who
is talked about does not draw on these reasons, they still might get it right. But it might
easily have been the case that the belief they did actually acquire was false. In this
situation they would not know who I am talking about with "George Orwell". Hence, the
idea that in understanding an assertoric utterance one acquires knowledge limits the
44
I. Foundations of semantics
ways in which one may think of something in order to understand an utterance about
it (Heck 1995:102).
If this argument for the application of the sense/reference distinction to natural
language is along the right lines, the bearers of sense and reference are no longer form-
individuated signs. The argument shows at best that the constraints on understanding
an utterance are more demanding than getting the references of the uttered words
and their mode of combination right. This allows different utterances of the same form-
individuated sentence to have different senses. There is no requirement that the
audience and the speaker grasp the same sense in understanding (making) the utterance.The
important requirement is that they all know what they are referring to.
There is another line of argument for the application of the scnsc/rcfcrcncc
distinction to natural language. Frege often seems to argue that one only needs sense AND
reference in languages 'designed' for inferential thinking. But of course we also make
inferences in natural languages. Communication in natural language is often joint
reasoning. Take the following argument:
You: Hegel was drunk.
Me: And Hepel is married.
We both: Hegel was drunk and is married.
Neither you nor I know both premises independently of each other; each person knows
one premise and transmits knowledge of this premise to the other person via testimony.
Together we can come to know the conclusion by deduction from the premises. But the
argument above can only be valid and knowledge-transferring if you and I are entitled
to take for granted that "Hegel" in the first premise names the same person as "Hegel"
in the second premise without further justification. Otherwise the argument would be
incomplete; its rational force would rest on implicit background premises. According to
Frege, whenever coincidence in reference is obvious, we have sameness of sense (PMC:
234). Every theory that wants to account for inference must acknowledge that sometimes
we are entitled to take co-reference for granted. Hence, we have a further reason to
assume that utterances of natural language sentences have senses.
4.4. Sense and reference for context-dependent expressions
Natural language contains unambiguous signs, tokens of which stand for different
things in different utterances because the sign means what it does. Among such
context-dependent expressions are:
- personal pronouns:T,'you','my','he','she','it'
- demonstrative pronouns: 'that', 'this', 'these' and 'those'
- adverbs: 'here', 'now', 'today', 'yesterday', 'tomorrow'
- adjectives: 'actual', 'present' (a rather controversial entry)
An expression has a use as a context-dependent expression if it is used in a way that its
reference in that use can vary with the context of utterance while its linguistic meaning
stays the same.
3. (Frege on) Sense and reference 45
Context-dependent expressions are supposed to pose a major problem for Frege.
Perry (1977) has started a fruitful discussion about Frege's view on context-dependent
expressions. Let us use the indexical 'now' to make clear what the problem is supposed
to be:
1. If tokens of the English word 'now' are produced at different times, the tokens refer
to different times.
2. If two signs differ in reference, they differ in sense.
Hence,
3. Tokens of the English word 'now' that are produced at different times differ in sense.
4. It is not possible that two tokens of 'now' co-refer if they are produced at different
times.
Hence,
5. It is not possible that two tokens of'now' that are produced at different times have the
same sense.
Every token of 'now' that is produced at time t differs in sense from all other tokens of
'now' not produced at t. Now one will ask what is the particular sense of a token of 'now'
at r? Perry calls this 'the completion problem'. Is the sense of a particular token of 'now'
the same as the sense of a definite description of the time t at which 'now' is uttered? No,
take any definite description d of the time t that does not itself contain 'now' or a synonym
of'now'. No statement of the form 'd is now' is trivial. Take as a representative example,
'The start of my graduation ceremony is now'. Surely, it is not self-evident that the start
of my graduation ceremony is now. Hence, Perry takes Frege to be settled with the
unattractive conclusion that for each time t there is a primitive and particular way in which t is
presented to us at f, which gives rise to thoughts accessible only at f, and expressible then
with 'now' (Perry 1977:491). Perry (1977) and Kaplan (1977/1989) have argued that one
should, for this and further reasons, replace Fregean thoughts with two successor notions:
character and content. The character of an indexical is, roughly, a rule that fixes the
referent of the context-dependent expression in a context; the content is, roughly, the referent
that has been so fixed.This revision of Frcgc has now become the orthodox theory.
4.5. The mode of presentation problem
Frege's theory of sense and reference raises the question "What are modes of
presentation?" If modes of presentation are not the meanings of definite descriptions,
what are they? This question can be understood as a request to reduce modes of
presentation to scientifically more respectable things. I have no argument that one
cannot reduce sense to something more fundamental and scientifically acceptable,
but there is inductive evidence that there is no such reduction: many people have
tried very hard for a long time, none of them has succeeded (see Schiffer 1990 and
2003). Modes of presentation may simply be modes of presentation and not other
things (see Peacocke 1992:121).
46
I. Foundations of semantics
Dummett has tried to work around this problem by making a theory of sense a theory
of understanding:
[Tjhere is no obstacle to our saying what it is that someone can do when lie grasps that
sense; and that is all that we need the notion of sense for.
(Dummett 1981a: 227)
If this is true, we can capture the interesting part of the notion of sense by explaining
what knowledge of sense consists in. Does knowledge of sense reduce to something
scientifically respectable? Dummett proposes the following conditions for knowing the sense
of a proper name:
S knows the sense of a proper name N iff
S has a criterion for recognising for any given object whether it is the bearer of N.
(Dummett 1981a: 229)
Docs your understanding "Gottlob Frcgc" consist in a criterion for recognising him when
he is presented to you? (See Evans 1982, sec. 4.2.) No, he can no longer be presented to
you in the way he could be presented to his contemporaries. Did the sense of "Gottlob
Frege" change when he died?
The real trouble for modes of presentation seems not to be their irreducibility, but
the problem to say, in general, what the difference between two modes of presentation
consists in. Consider Fine's (2007: 36) example. You live in a symmetrical universe. At
a certain point you are introduced to two identical twins. Perversely you give them
simultaneously the same name 'Bruce'. Even in this situation it is rational to assert the
sentence "Bruce is not the same person as Bruce".
The Fregean description of this case is that the figure "Bruce" expresses in one
idiolect two modes of presentation. But what is the difference between these modes
of presentation? The difference cannot be specified in purely qualitative terms. The
twins have all their qualitative properties in common. Can the difference be specified
in indexical terms? After all, you originally saw one Bruce over there and the other
over here. This solution seems ad hoc. For, in general, one can forget the features that
distinguished the bearer of a name when one introduced the name and yet continue
to use the name. Can the difference in mode of presentation be specified in terms of
the actual position of the two Bruce's? Maybe, but this difference cannot ground the
continued use of name and it raises the question of what makes the current distinction
sustain the use of the name originally introduced on the basis of other features. As long
as these and related questions are not answered, alternatives to sense and reference
merit a fair hearing.
5. Summary
Frege's work on sense and reference has set the agenda for over a century of research.
The main challenge for Frege's friends is to find a plausible way to apply his theoretical
apparatus to natural languages. Whether one believes that the challenge can be met
or not, Frege has pointed us to a pre-philosophical datum, the fact that true identity
3. (Frege on) Sense and reference 47
statements about the same object can differ in cognitive value, that is a crucial
touchstone for philosophical semantics.
/ want to thank Sarah-Jane Conrad, Silvan Imhof, Laura Mercolli, Christian Nimtz, the
Editors and an anonymous referee for helpful suggestions.
6. References
Works by Frege:
Begriffsschrift (BS, 1879). Reprinted: Darmstadt: Wissenschaftliche Buchgesellschaft, 1977.
The Foundations of Arithmetic (FA, 1884). Oxford: Blackwell, 1974. (C.Thiel (ed.). Die Grundlagen
der Arithmetik. Hamburg: Meiner, 1988.)
Nachgelassene Schriften (NS), edited by Hans Hermes, Friedrich Kambartel & Friedrich Kaulbach.
2nd rev. edn. Hamburg: McincrVerlag 1983.1st edn. translated by Peter Long & Roger White as
Posthumous Writings (PW). Oxford: Basil Blackwell. 1979.
Wissenschaftlicher Briefwechsel (BW),edited by Gottfried Gabriel, Hans Hermes, Friederich
Kambartel, Christian Thiel & Albert Veraart. Hamburg: Meiner Verlag, 1976. Abridged by Brian
McGuinness and translated by Hans Kaal as Philosophical and Mathematical Correspondence
(PMC). Oxford: Basil Blackwell, 1980.
McGuinness, Brian (ed.) 1984: Gottlob Frege: Collected Papers on Mathematics, Logic and
Philosophy (CP). Oxford: Basil Blackwell.
'Uber Sinn und Bedeutung'. Zeitschrift fiir Philosophic undphilosophische Kritik
100,25-50.Translated as'On Sense and Meaning' (S&R), in: McGuinness 1984,157-177.
'Der Gedanke'. Beitrdge zur Philosophic des deutschen Idealismus 1,58-77. Translated as'Thoughts'
(T), in: McGuinness 1984,351-373.
Other Works:
Bell. David 1987. Thoughts. Notre Dame Journal of Philosophical Logic 28,36-51.
Bezuidenhout, Anne & Marga Reiiner (eds.) 2004. Descriptions and Beyond. Oxford: Oxford
University Press 2004.
Burge,Tyler 2005. Truth, Thought, Reason: Essays on Frege. Oxford: Clarendon Press.
Campbell, John 1994. Past, Space and Self. Cambridge, M A:The MIT Press.
Carnap, Rudolf 1956. Meaning and Necessity. 2nd edn. Chicago, IL:Thc University of Chicago Press.
Church, Alonzo 1951. A formulation of the logic of dense and denotation. In: P. Hcnlc, H.M.Kallcn
& S. K. Langer (eds.). Structure, Method and Meaning. Essays in Honour of Henry M. Sheffer.
New York: Liberal Arts Press, 3-24.
Church, Alonzo 1973. Outline of a revised formulation of the logic of sense and denotation (Part
I). Noils 7,24-33.
Church, Alonzo 1974. Outline of a revised formulation of the logic of sense and denotation (Part
II). Nous 8,135-156.
Davidson, Donald 1962. The method of extension and intension. In: P. A. Schilpp (ed.). The
Philosophy of Rudolf Carnap. La Salle, IL: Open Court, 311-351.
Davidson, Donald 1984. Inquiries into Truth and Interpretation. Oxford: Oxford University Press.
Dumniett, Michael 1975. Frege's distinction between sense and reference [Spanish translation
under the title 'Frege']. Teorema V. 149-188. Reprinted in: M. Dununctt. Truth and Other
Enigmas. London: Duckworth, 1978,116-145.
Duinmett, Michael 1981a. Frege: Philosophy of Language. 2nd edn. London: Duckworth.
Dummett, Michael 1981b. The Interpretation of Frege's Philosophy. London: Duckworth.
Duminett, Michael 1989. More about thoughts. Notre Dame Journal of Philosophical Logic 30,1-19.
Dummett, Michael 1993. The Seas of Language. Oxford: Oxford University Press.
Evans, Gareth 1982. Varieties of Reference. Oxford: Oxford University Press.
48
I. Foundations of semantics
Fine, Kit 1989.The problem of de re modality. In: J. Almog, J. Perry & H. K. Wettstein (eds.). Themes
from Kaplan. Oxford: Oxford University Press, 197-272.
Fine, Kit 2007. Semantic Relationism. Oxford: Blackwell.
Forbes, Graeme 1990.The indispensability of Sinn. The Philosophical Review 99,535-563.
Furth, Montgomery 1964. Introduction. In: M. Furth (ed.). The Basic Laws of Arithmetic:
Exposition of the System. Berkeley, CA: University of California Press, v-lix.
Geach, Peter T. 1975. Names and identity. In: S. Guttenplan (ed.). Mind and Language. Oxford:
Clarendon Press, 139-158.
Heck, Richard 1995.The sense of communication. Mind 104.79-106.
Jackson, Frank 2005. What are proper names for? In: J. C. Marek & M. E. Reicher (eds.). Experience
and Analysis. Proceedings of the 27th International Wittgenstein Symposium.Vienna: hpt-obv.
257-269.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. K. Wettstein (eds.). Themes from
Kaplan. Oxford: Oxford University Press, 481-563.
Kripke, Saul 1980. Naming and Necessity. Cambridge, MA: Harvard University Press. (Separate
reissue of: S. Kripke. Naming and necessity. In: D. Davidson & G. Harm an (eds.). Semantics of
Natural Language. Dordrecht: Reidel, 1972,253-355 and 763-769.)
Kripke. Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel,
239-283. Reprinted in: N. Salomon & S. Soames (eds.). Propositions and Attitudes. Oxford:
Oxford University Press, 1989,102-148.
Kemmerling. Andreas 1990. Gedanken und ihreTeile. Grazer Philosophische Studien 37,1-30.
Kunne, Wolfgang 1992. Hybrid proper names. Mind 101,721-731.
Levine, James 2004. On the 'Gray's elegy1 argument and its bearing on Frege's theory of sense.
Philosophy and Phenomenological Research 64,251-295.
Lewis, David 1970. General semantics. Synthese 22,18-67.
Makin, Gideon 2000. The Metaphysicians of Meaning. London: Routledge.
May, Robert 2006. The invariancc of sense. The Journal of Philosophy 103,111-144.
McDowell, John 1977. On the sense and reference of a proper name. Mind 86,159-185.
Mendelsohn, Richard L. 1982. Frege's BegriffsschriftT\\eory of identity. The Journal for the History
of Philosophy 20,279-299.
Montague. Richard 1973.The proper treatment of quantification in ordinary English. In: J. Hintikka,
J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel, 221-242.
Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard Montague.
New Haven, CT: Yale University Press, 1974.247-270.
Parsons, Terence 1997. Fregean theories of truth and meaning. In: M. Schirn (ed.). Frege:
Importance and Legacy. Berlin: de Gruyter, 371-409.
Peacocke, Christopher 1992. A Study of Concepts. Cambridge, MA: The MIT Press.
Peacocke, Christopher 1999. Being Known, Oxford: Oxford University Press.
Perry, John 1977. Frege on demonstratives. The Philosophical Review 86,474-497.
Putnam,Hilary 1954.Synonymy and the analysisof belief sentences. Analysis 14,114-122.
Putnam. Hilary 1975. The meaning of 'meaning'. In: K. Gunderson (ed.). Language, Mind and
Knowledge. Minneapolis, MN: University of Minnesota Press. Reprinted in: H. Putnam. Mind.
Language and Reality: Philosophical Papers, Volume 2. Cambridge: Cambridge University Press,
1975,215-271.
Quine, Willard van Orman 1951. Two dogmas of empiricism. The Philosophical Review 60, 20-43.
Reprinted in: W.V.O. Quinc. From a Logical Point of View. Harvard: Harvard University Press,
1953.20-47.
Ricketts, Thomas 2003. Quantification, sentences, and truth-values. Manuscrito: Revista
International de Filosofia 26.389^24.
Rumfitt, Ian 1994. Frege's theory of predication: An elaboration and defense, with some new
applications. The Philosophical Review 103,599-637.
Russell, Bertrand 1905. On denoting. Mind 14,479^93.
4. Reference: Foundational issues
49
Russell, Bertrand 1910/11. Knowledge by acquaintance and knowledge by description. Proceedings
of the Aristotelian Society New Series 11,108-128. Reprinted in: B. Russell. Mysticism and Logic.
London: Unwin, 1986,201-221.
Sainsbury, Mark R. 2005. Reference without Referents. Oxford: Oxford University Press.
Schiffer, Stephen 1990. The mode of presentation problem. In: A. C. Anderson & J. Owens (eds.).
Propositional Attitudes. Stanford, CA: CSLI Publications. 249-269.
Schiffer, Stephen 2003. The Things We Mean. Oxford: Oxford University Press.
Soames, Scott 1998.The modal argument: Wide scope and rigidified descriptions. Nous 32,1-22.
Textor, Mark 2007. Frege's theory of hybrid proper names developed and defended. Mind 116,
947-982.
Textor, Mark 2009. A repair of Frege's theory of thoughts. Synthese 167,105-123.
Wiggins, David 1997. Meaning and truth conditions: From Frege's grand design to Davidson's. In: B.
Hale & C.Wright (eds.). A Companion to the Philosophy of Language. Oxford: Blackwell, 3-29.
Mark Textor, London (United Kingdom)
4. Reference: Foundational issues
1. Introduction
2. Direct reference
3. Frege's theory of sense and reference
4. Reference vs. quantification and Russell's theory of descriptions
5. Strawson's objections to Russell
6. Donnellan's attributive-referential distinction
7. Kaplan's theory of indexicality
8. Proper names and Kripke's return to Millian nondescriptionality
9. Propositional attitude contexts
10. Indefinite descriptions
11. Summary
12. References
Abstract
This chapter reviews issues surrounding theories of reference. The simplest theory is the
"Fido"-Fido theory - that reference is all that an NPhas to contribute to the meaning of
phrases and sentences in which it occurs. Two big problems for this theory are
preferential NPs that do not behave as though they were semantically equivalent and meaningful
NPs without a referent. These problems are especially acute in sentences about beliefs and
desires - propositional attitudes. Although Frege's theory of sense, and Russell's quantifica-
tional analysis, seem to solve these problems for definite descriptions, they do not work well
for proper names, as Kripke shows. And Donnellan and Strawson have other objections to
Russell's theory. Indexical expressions like "I" and "here" create their own issues; we look
at Kaplan's theory of indexicality, and several solutions to the problem indexicals create in
propositional attitude contexts. The final section looks at indefinite descriptions, and some
more recent theories that make them appear more similar to definite descriptions than was
previously thought.
Maienborn, von Heusingerand Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 49-74
50
I. Foundations of semantics
1. Introduction
Reference, it seems, is what allows us to use language to talk about things and thus vital
to the functioning of human language. That being said there remain several parameters
to be fixed in order to determine a coherent field of study. One important one is whether
reference is best viewed as a semantic phenomenon - a relation between linguistic
expressions and the objects referred to, or whether it is best viewed as pragmatic - a
three-place relation among language users, linguistic expressions, and things. The
ordinary everyday meaning of words like "refer" and "reference" would incline us toward
the pragmatic view (we say, e.g., Who were you referring to?, not Who was your phrase
referring to?). However there is a strong tradition, stemming from work in logic and early
modern philosophy of language, of viewing reference as a semantic relation, so that will
be the main focus of our attention at the outset, although we will turn before long to
pragmatic views.
1.1. Reference vs. predication
Another parameter to be determined is what range of expressions (can be used to)
refer.The traditional model sentence, e.g. Socrates rims, consists of a simple noun phrase
(NP), a proper name in this case, and a simple verb phrase (VP).The semantic function
of the NP is to pick out (i.e. refer to) some entity, and the function of the VP is to
predicate a property of that entity. The sentence is true if the entity actually has that
property, and false otherwise. Seen in this light, reference and predication are quite different
operations. Of course many natural language sentences do not fit this simple model.
In a semantics which is designed to treat the full panoply of expression types and to
provide truth conditions for sentences containing them, the distinction between
reference and predication may not be so clear or important. In classical Montague Grammar,
for example, expressions of all categories (except determiners and conjunctions) are
assigned an extension, where extension is a formal counterpart of reference (Montague
1973;see also article 11 (Kempson) Formal semantics and representationalism). (In
Montague Grammar expressions are also assigned an intension, which corresponds to a Fre-
gean sense - see below, and cf. article 3 (Textor) Sense and reference.) An expression's
extension may be an ordinary type of entity, or, more typically, it may be a complex
function of some kind. Thus the traditional bifurcation between reference and
predication is not straightforwardly preserved in this approach. However for our purposes we
will assume this traditional bifurcation, or at least that there is a difference between NPs
and other types of expressions, and wc will consider reference only for NPs.
Furthermore our primary focus will be on definite NPs, which are the most likely candidates
for referring expressions, though they may have other, non-referring uses too (see the
articles in Chapter IX, Noun phrase semantics). The category of definite NPs includes
proper names (e.g. Amelia Earhart), definite descriptions (e.g. the book Sally is reading),
demonstrative descriptions (e.g. this house), and pronouns (e.g. you, that). We will have
only a limited amount to say about pronouns; for the full story, sec article 40 (Biiring)
Pronouns. (Another important category comprises generic NPs; for these, see article 47
(Carlson) Genericity.) Whether other types of NP, such as indefinite descriptions (e.g.
a letter from my mother), can be properly said to be referring expressions is an issue of
some dispute -see below,section 10.
4. Reference: Foundational issues
51
1.2. The metaphysical problem of reference
Philosophers have explored the question of what it is, in virtue of which an expression
has a reference - what links an expression to a reference and how did it come to do that?
Frege (1892) argued that expressions express a sense, which is best thought of as a
collection of properties. The reference of an expression is that entity which possesses exactly
the properties contained in the sense. Definite descriptions arc the clearest examples for
this theory; the inventor of bifocals specifies a complex property of having been the first to
think up and create a special type of spectacles, and that NP refers to Benjamin Franklin
because he is the one who had that property. One problem with this answer to the
question of what determines reference is the mysterious nature of senses. Frege insisted they
were not to be thought of as mental entities, but he did not say much positive about what
they are, and that makes them philosophically suspect. (See article 3 (Textor) Sense and
reference.) Another answer, following Kripke (1972), is what is usually referred to as a
"causal (or historical) chain".The model in this case is proper names, and the idea is that
there is some initial kind of naming event whereupon an entity is bestowed with a name,
and then that name is passed down through the speech community as a name of that
entity. In this article we will not be so much concerned with the metaphysical problem of
what determines reference, but instead the linguistic problem of reference - determining
what it is that referring expressions contribute scmantically to the phrases and sentences
in which they occur.
1.3. The linguistic problem of reference
The two answers to the metaphysical problem of reference correlate with two answers
to the linguistic question of what it is that NPs contribute to the semantic content of
phrases and sentences in which they appear. Frege's answer to the linguistic question is
that expressions contribute their reference to the reference of the phrases in which they
occur, and they contribute their sense to the sense of those phrases. But there are
complications to this simple answer that we will review below in section 3. The other answer
is that expressions contribute only their reference to the semantic content of the phrases
and sentences in which they occur. Since this is a simpler answer, we will begin by looking
at it in some more detail, in section 2, in order to able to understand why Frege put
forward his more complex theory.
2. Direct reference
The theory according to which an NP contributes only its reference to the phrases and
sentences in which it occurs is currently called the "direct reference" theory (the term
was coined by Kaplan 1989). It is also sometimes called the "F/do-Fido" theory, the idea
being that you have the name Fido and its reference is the dog, Fido, and that's all there
is to reference and all there is to the meaning of such phrases. One big advantage of this
simple theory is that it docs not result in the postulation of any suspect entities. However
there are two serious problems for this simple theory: one is the failure of preferential
NPs to be fully equivalent scmantically, and the other is presented by seemingly
meaningful NPs which do not have a reference - so called "empty NPs". We will look more
closely at each of these problems in turn.
52
I. Foundations of semantics
2.1. Failure of substitutivity
According to the direct reference theory, coreferential NPs - NPs which refer to the
same thing - should be able to be substituted for each other in any sentence without
a change in the semantics of the sentence or its truth value. (This generalization
about intersubstitutivity is sometimes referred to as "Leibniz' Law".) If all that an
NP has to contribute scmantically is its reference, then it should not matter how that
reference is contributed. However almost any two coreferential NPs will not seem to
be intersubstitutable - they will seem to be scmantically different. Frege (1892)
worried in particular about two different kinds of sentence that showed this failure of
substitutivity.
2.1.1. Identity sentences
The first kind are identity sentences - sentences of the form a = b, or (a little more
colloquially) a is (the same entity as) b. If such a sentence is true, then the NPs a and b
are coreferential so, according to the direct reference theory, it shouldn't matter which
NP you use, including in identity sentences themselves! That is, the two sentences in (1)
should be scmantically equivalent (these are Frege's examples).
(1) a. The morning star is the morning star,
b. The morning star is the evening star.
However, as Frege noted, the two sentences are very different in their cognitive impact,
(la) is a trivial sentence, whose truth is known to anyone who understands English (it is
analytic), (lb) on the other hand gives the results of a major astronomical finding.Thus
even though the only difference between (la) and (lb) is that we have substituted
coreferential NPs (the evening star for the morning star), there is still a semantic difference
between them.
2.1.2. Propositional attitude sentences
The other kind of sentence that Frege worried about was sentences about propositional
attitudes - the attitudes of sentient beings about situations or states of affairs. Such
sentences will have a propositional attitude verb like believe, hope, know, doubt, want, etc.
as their main verb, plus a sentential complement saying what the subject of the verb
believes, hopes, wants, etc. Just as with identity statements, coreferential NPs fail to be
intersubstitutable in the complements of such sentences. However the failure is more
serious in this case. Intersubstitution of coreferential NPs in identity sentences always
preserves truth value, but in propositional attitude sentences the truth value may change.
Thus (2a) could be true while (2b) was false.
(2) a. Mary knows that the morning star is a planet,
b. Mary knows that the evening star is a planet.
We can easily imagine that Mary has learned the truth about the morning star, but not
about the evening star.
4. Reference: Foundational issues
53
2.2. Empty NPs
The other major problem for the direct reference theory is presented by NPs that do not
refer to anything - NPs like the golden mountain or the round square. The direct
reference theory seems to make the prediction that sentences containing such NPs should be
semantically defective, since they contain a part which has no reference. Yet sentences
like those in (3) do not seem defective at all.
(3) a. Lee is looking for the golden mountain.
b. The philosopher's stone turns base metals into gold.
Sentences about existence, especially those that deny existence, pose special problems
here. Consider (4):
(4) The round square does not exist.
Not only is (4) not semantically defective, it is even true! So this is the other big problem
for the Fido-Fido direct reference theory.
3. Frege's theory of sense and reference
As noted above, Frege proposed that referring expressions have semantic values
on two levels, reference and sense. Frege was anticipated in this by Mill (1843), who
had proposed a similar distinction between denotation (reference) and connotation
(sense). (Mill's use of the word "connotation" must be kept distinct from its current
use to mean hints or associations connected with a word or phrase. Mill's connotations
functioned like Frege's senses - to determine reference (denotation).) One
important aspect of Frege's work was his elegant arguments. He assumed two fundamental
principles of compositionality. At the level of senses he assumed that the sense of
a complex expression is determined by the senses of its component parts plus their
syntactic mode of combination. Similarly at the level of reference, the reference of
a complex expression is determined by the references of its parts plus their mode of
combination. (Following Frege, it is commonly assumed today that meanings,
whatever they are, are compositional.That is thought to be the only possible explanation of
our ability to understand novel utterances. See article 6 (Pagin & Westerstahl)
Compositionality.) Using these two principles Frege argued further that the reference of
a complete sentence is its truth value, while the sense of a sentence is the proposition
it expresses.
3.1. Solution to the problem of substitutivity
Armed with senses and the principles of compositionality, plus an additional assumption
that we will get to shortly, Frege was able to solve most, though not all, of the problems
pointed out above.
54
I. Foundations of semantics
3.1.1. Identity sentences
First, for the problem of failure of substitutivity of coreferential NPs in identity
statements Frege presents a simple solution. Recall example (1) repeated here.
(1) a. The morning star is the morning star,
b. The morning star is the evening star.
Although the morning star and the evening star have the same reference (making (lb) a
true sentence), the two NPs differ in sense.Thus (lb) has a different sense from (la) and
so we can account for the difference in cognitive impact.
3.1.2. Propositional attitude sentences
Turning to the problem of substitutivity in propositional attitude contexts, Frege again
offers us a solution, although the story is a bit more complicated in this case. Here are
the examples from (2) above.
(2) a. Mary knows that the morning star is a planet,
b. Mary knows that the evening star is a planet.
Simply observing that the evening star has a different sense from the morning star will
account for why (2b) has a different sense from (2a), but by itself does not yet account
for the possible change in truth value. This is where the extra piece of machinery
mentioned above comes in. Frege pointed out that expressions can sometimes shift their
reference in particular contexts. When wc quote expressions, for example, those expressions
no longer have their customary reference, but instead refer to themselves. Consider the
example in (5)
(5) "The evening star" is a definite description.
The phrase the evening star as it occurs in (5) docs not refer to the planet Venus any more,
but instead refers to itself. Frege argued that a similar phenomenon occurs in
propositional attitude contexts. In such contexts, Frege argued, expressions also shift their
reference, but here they refer to their customary sense. This means that the reference of (2a)
(its truth value) involves the customary sense of the phrase the morning star rather than its
customary reference, while the reference/truth value of (2b) involves instead the
customary sense of the phrase the evening star. Since wc have two different components, it
is not unexpected that we could have two different references - truth values - for the two
sentences.
3.2. Empty NPs
NPs that have no reference were the other main problem area for the direct reference
theory. One problem was the apparent meaningfulness of sentences containing such
NPs. It is easy to sec how Frege's theory solved this problem. As long as such NPs have
a sense, the sentences containing them can have a well-formed sense as well, so their
4. Reference: Foundational issues
55
meaningfulness is not a problem. We should note, though, that Frege's theory predicts
that such sentences will not have a truth value.That is because the truth value, as we have
noted, is determined by the references of the constituent expressions in a sentence, and if
one of those expressions doesn't have a reference then the whole sentence will not have
one either.This means that true negative existential sentences, such as (4) repeated here:
(4) The round square does not exist.
remain a problem for Frcgc. Since the subject docs not have a reference, the whole
sentence should not have a truth value, but it does.
3.3. Further comments on Frege's work
Several further points concerning Frege's work will be relevant in what follows.
3.3.1. Presupposition
As wc have just observed, Frege's principle of compositionality at the level of reference,
together with his conclusion that the reference of a sentence is its truth value, means
that a sentence containing an empty NP will fail to have a truth value. Frege held that
the use of an NP involves a presupposition, rather than an assertion, that the NP in
question has a reference. (See article 91 (Beaver & Geurts) Presupposition.) Concerning
example (6)
(6) The one who discovered the elliptical shape of the planetary orbits died in misery.
Frege said that if one were to hold that part of what one asserts in the use of (6) is that
there is a person who discovered the elliptical shape of the planetary orbits, then one
would have to say that the denial of (6) is (7).
(7) Either the one who discovered the elliptical shape of the planetary orbits did not
die in misery, or no one discovered the elliptical shape of the planetary orbits.
But the denial of (6) is not (7) but rather simply (8).
(8) The one who discovered the elliptical shape of the planetary orbits did not die in
misery.
Instead, both (6) and (8) presuppose that the definite description the one who discovered
the elliptical shape of the planetary orbits has a reference, and if the NP were not to have
a reference, then neither sentence would have a truth value.
3.3.2. Proper names
It was mentioned briefly above that Mill's views were very similar to Frege's in holding
that referring expressions have two kinds of semantic significance - both sense and
56
I. Foundations of semantics
reference, or in Mill's terms, connotation and denotation. Mill, however, made an
exception for proper names, which he believed did not have connotation but only denotation.
Frege appeared to differ from Mill on that point. We can see that, given Frege's principle
of compositionality of sense, it would be important for him to hold that proper names
do have a sense, since sentences containing them can clearly have a sense, i.e. express a
proposition. His most famous comments on the subject occur in a footnote to "On sense
and reference" in which he appeared to suggest that proper names have a sense which is
similar to the sense which a definite description might have, but which might vary from
person to person. The name Aristotle, he seemed to suggest, could mean 'the pupil of
Plato and teacher of Alexander the Great' for one person, but 'the teacher of Alexander
the Great who was born in Stagira' for another person (Frcgc 1892, fn. 2).
3.3.3. Propositions
According to Frcgc, the sense of a sentence is the proposition it expresses, but it has
been very difficult to determine what propositions are. Frege used the German word
"Gedanke" ('thought'), and as we have seen, for Frege (as for many others) propositions
are not only what sentences express, they are also the objects of propositional attitudes.
Subsequent formalizations of Frege's ideas have used the concept of possible worlds to
analyze them. Possible worlds arc simply alternative ways things (in the broadest sense)
might have been. E.g. I am sitting in my office at this moment, but I might instead have
gone out for a walk; there might have been only 7 planets in our solar system, instead of
8 or 9. Using this notion, propositions were analyzed as functions from possible worlds
to truth values (or equivalently, as sets of possible worlds). This meshes nicely with the
idea that the sense of a sentence combined with facts about the way things are (a
possible world) determine a truth value. However there are problems with this view; for
example, all mathematical truths are necessarily true, and thus true in every possible
world, but the sentences expressing them do not seem to have the same meaning (in
some pre-theoretic sense of meaning), and it seems that one can know the truth of one
without knowing the truth of all of them - e.g. someone could know that two plus two is
four, but not know that there are an infinite number of primes. (For continued defense of
the possible worlds view of the objects of thought, see Stalnaker 1984,1999.) Following
Carnap (1956), David Lewis (1972) suggested that sentence meanings are best viewed
as entities with syntactic structure, whose elements are the senses (intensions) of the
constituent expressions. We will see an additional proposal concerning what at least some
propositions are shortly, in section 4 on Russell.
3.4. Summary comments on Frege
Frege's work was neglected for some time, both within and outside Germany.
Eventually it received the attention it deserved, especially during the development of formal
semantic treatments of natural language by Carnap, Kripke, Montague, and others (see
article 10 (Newen& Schroder) Logic and semantics, article 11 (Kempson) Formal
semantics and representationalism). Although the distinction between sense and reference (or
intension and extension) is commonly accepted, Frege's analysis of propositional
attitude contexts has fallen out of favor; Donald Davidson has famously declared Frege's
4. Reference: Foundational issues
57
theory of a shift of reference in propositional attitude contexts to be "plainly incredible"
(Davidson 1968/1984:108), and many others seem to have come to the same conclusion.
4. Reference vs. quantification and Russell's theory of descriptions
In his classic 1905 paper "On denoting", Bcrtrand Russell proposed an alternative to
Frege's theory of sense and reference. To understand Russell's work it helps to know
that he was fundamentally concerned with knowledge. He distinguished knowledge
by acquaintance, which is knowledge we gain directly via perception, from knowledge
by description (cf. Russell 1917), and he sought to analyze the latter in terms of the
former. Russell was a direct reference theorist, and rejected Frege's postulation of senses
(though he did accept properties, or universals, as the semantic content of predicative
expressions). The only genuine referring expressions, for Russell, were those that could
guarantee a referent, and the propositions expressed by sentences containing such
expressions are singular propositions, which contain actual entities. If I were to point at
Mary, and utter (9a), I would be expressing the singular proposition represented in (9b).
(9) a. She is happy.
b. <Mary, happiness>
Note that the first element of (9b) is not the name Mary, but Mary herself. Any NP which
is unable to guarantee a referent cannot contribute an entity to a singular proposition.
Russell's achievement in "On denoting" was to show how such NPs could be analyzed
away into the expression of general, quantificational propositions.
4.1. Quantification
In traditional predicate logic, overtly quantificational NPs like every book, no chair do
not have an analysis per se, but only in the context of a complete sentence. (In logics
with generalized quantifiers, developed more recently, this is not the case; sec article 43
(Keenan) Quantifiers.) Look at the examples in (10) and (11).
(10) a. Every book is blue.
b. Vx[book(x) 3 blue(x)]
(11) a. No table is sturdy.
b. ~3x[table(x) & sturdy(x)]
(10b), the traditional logical analysis of (10a), says (when translated back into English)
For every x, if x is a book then x is blue. We can see that every, the quantificational
element in (10a), has been elevated to the sentence level, in effect, so that it expresses a
relationship between two properties - the property of being a book and the property of
being blue. Similarly (lib) says, roughly, It is not the case that there is an x such thatx is a
table and x is sturdy. It can be seen that this has the same truth conditions as No table is
sturdy, and once again the quantificational element {no in this case) has been analyzed as
expressing a relation between two properties, in this case the properties of being a table
and being sturdy.
58
I. Foundations of semantics
4.2. Russell's analysis of definite descriptions
Russell's analysis of definite descriptions was called "the paradigm of philosophy"
(by Frank Ramsey), and if analysis is the heart of philosophy then indeed it is that. One
of the examples Russell took to illustrate his method is given in (12a), and its analysis is
in (12b).
(12) a. The present king of France is bald.
b. 3x[king-of-France(x) & Vy[king-of-France(y) 3 y=x] & bald(x)]
The analysis in (12b) translates loosely into the three propositions expressed by the
sentences in (13).
(13) a. There is a king of France.
b. There is at most one king of France.
c. He is bald.
(He, in (13c) must be understood as bound by the initial There is a... in (13a).) As can be
seen, the analysis in (12b) contains no constituent that corresponds to the present king of
France. Instead the is analyzed as expressing a complex relation between the properties
of being king of France and being bald. Let us look now at how Russell's analysis solves
the problems for the direct reference theory.
4.3. Failure of substitutivity
4.3.1. Identity sentences
Although Russell was a direct reference theorist, we can see that, under his analysis,
English definite descriptions have more to contribute to the sentences in which they
occur than simply their reference. In fact they no longer contribute their reference at all
(because they are no longer referring expressions, and do not have a reference). Instead
they contribute the properties expressed by each of the predicates occurring in the
description. It follows that two different definite descriptions, such as the morning star
and the evening star, will make two different contributions to their containing sentences.
And thus it is no mystery why the two identity sentences from (1) above, repeated here
in (14), have different cognitive impact.
(14) a. The morning star is the morning star,
b. The morning star is the evening star.
The meaning of the second sentence involves the property of being seen in the evening
as well as that of being seen in the morning.
4.3.2. Propositional attitude sentences
When we come to propositional attitude sentences the story is a little more complicated.
Recall that Russell's analysis does not apply to a definite description by itself, but only
in the context of a sentence. It follows that when a definite description occurs in an
4. Reference: Foundational issues
59
embedded sentence, as in the case of propositional attitude sentences, there will be two
ways to unpack it according to the analysis.Thus Russell predicts that such sentences are
ambiguous. Consider our example from above, repeated here as (15).
(15) Mary knows that the morning star is a planet.
According to Russell's analysis we may unpack the phrase the morning star with respect
to either the morning star is a planet or Mary knows that the morning star is a planet. The
respective results are given in (16).
(16) a. Mary knows that 3x[morningstar(x) &
Vy[morning star(y) 3 y=x] & planet(x)]
b. 3x[morning star(x) &
Vy[morning star(y) 3 y=x] & Mary knows that planet(x)]
The unpacking in (16a) is what is called the narrow scope or de dicto (roughly, about
the words) interpretation of (15).The proposition that Mary is said to know involves he
semantic content that the object in question is the star seen in the morning.The unpacking
in (16b) is called the wide scope or de re (roughly, about the thing) interpretation of (15).
It attributes to Mary knowledge concerning a certain entity, but not under any particular
description of that entity. The short answer to the question of how Russell's analysis
solves the problem of failure of substitutivity in propositional attitude contexts is that,
since there are no referring constituents in the sentence after its analysis, there is nothing
to substitute anything for. However Russell acknowledged that one could, in English,
make a verbal substitution of one definite description for a corcfcrcntial one, but only on
the wide scope, or de re interpretation. If we consider a slightly more dramatic example
than (15), we can see that there seems to be some foundation for Russell's prediction of
ambiguity for propositional attitude sentences. Observe (17):
(17) Oedipus wanted to marry his mother.
Our first reaction to this sentence is probably to think that it is false - after all, when
Oedipus found out that he had married his mother, he was very upset.This reaction is to
the narrow scope, or de dicto reading of the sentence which attributes to Oedipus a desire
which involves being married to specifically his mother and which is false. However there
is another way to take the sentence according to which it seems to be true: there was a
woman, Jocasta, who happened to be Oedipus's mother and whom he wanted to marry.
This second interpretation is the wide scope, or de re, reading of (17), according to which
Oedipus has a desire concerning a particular individual, but where the individual is not
identified for the purposes of the desire itself by any description.
4.4. Empty NPs
Recall our initial illustration of Russell's analysis of definite descriptions, repeated here.
(18) a. The present king of France is bald.
b. 3x[king-of-France(x) & Vy[king-of-France(y) 3 y=x] & bald(x)]
60
I. Foundations of semantics
The example shows Russell's solution to the problem of empty NPs. While for Frege such
sentences have a failed presupposition and lack a truth value, under Russell's analysis
they assert the existence of the entity in question, and are therefore simply false.
Furthermore Russell's analysis of definite descriptions solves the more pressing problem of
empty NPs in existence sentences. Under his analysis The round square does not exist
would be analyzed as in (19)
(19) ~3x[round(x) & square(x) & Vy[[round(y) & square(y)] 3 y=x]]
which is meaningful and true.
4.5. Proper names
Russell's view of proper names was very similar to Fregc's view; he held that they
are abbreviations for definite descriptions (which might vary from person to person)
and thus that they have semantic content in addition to, or more properly in lieu of, a
reference (cf. Russell 1917).
4.6. Referring and denoting
As we have seen, for Russell, definite descriptions are not referring expressions, though
he did describe them as denoting. For Russell almost any NP is a denoting phrase,
including, e.g. every hat and nobody. One might ask what, if any, expressions were
genuine referring expressions for Russell. Ultimately he held that only a demonstrative like
this, used demonstratively, would meet the criterion, since only such an expression could
guarantee a referent. These were the only true proper names, in his view. (See Russell
1917: 216 and fn. 5). It seems clear that we often use language to convey information
about individual entities; if Russell's analysis is correct, it means that the propositions
containing that information must almost always be inferred rather than being directly
expressed. Russell's analysis of definite descriptions (though not of proper names) has
been defended at length by Neale (1990).
5. Strawson's objections to Russell
Russell's paper "On Denoting" stood without opposition for close to 50 years, but in
1950 P.F. Strawson's classic reply "On Referring" appeared. Strawson had two major
objections to Russell's analysis - his neglect of the indexicality of definite descriptions
like the king of France, and his claim that sentences with definite descriptions in them
were used to assert the existence of a reference for the description. Let us look at each
of these more closely.
5.1. Indexicality
Indcxical expressions arc those whose reference depends in part on aspects of the
utterance context, and thus may vary depending on context. Obvious examples are pronouns
like / and you, and adverbs like here, and yesterday. Such expressions make vivid the
4. Reference: Foundational issues
61
difference between a sentence and the use of a sentence to make a statement - a
difference which may be ignored for logical purposes (given that mathematical truths are non-
indexical) but whose importance in natural language was stressed by Strawson. Strawson
pointed out that a definite description like the king of France could have been used at
different past times to refer to different people - Louis XV in 1750, but Louis XVI in
1770, for example. Hence he held that it was a mistake to speak of expressions as
referring; instead we can only speak of using an expression to refer on a particular occasion.
This is the pragmatic view of reference that was mentioned at the outset of this article.
Russell lived long enough to publish a rather tart response, "Mr. Strawson on referring",
in which he pointed out that the problem of indexicality was independent of the
problems of reference which were his main concern in "On denoting". However indexicality
does raise interesting and relevant issues, and we return to it below, in section 7. (See also
article 61 (Schlenker) Indexicality and de se.)
5.2. Empty NPs and presupposition
The remainder of Strawson's paper was primarily concerned with arguing that Russell's
analysis of definite descriptions was wrong in its implication that sentences containing
them would be used to assert the existence (and uniqueness) of entities meeting their
descriptive content. Instead, he said, a person using such a sentence would only imply
"in a special sense of'imply'" (Strawson 1950:330) that such an entity exists. (Two years
later he introduced the term presuppose for this special sense of imply, Strawson 1952:
175.) And in cases where there is no such entity - that is for sentences with empty NPs,
like The king of France is bald as uttered in 1950 - one could not make either a true or
a false statement. The question of truth or falsity, in such cases, simply does not arise.
(See article 91 (Beaver & Geurts) Presupposition.) We can see that Strawson's position
on empty NPs is very much the same as Frege's, although Strawson did not appear to be
familiar with Frege's work on the subject.
6. Donnellan's attributive-referential distinction
In 1966 Keith Donnellan challenged Russell's theory of definite descriptions, as well as
Strawson's commentary on that theory. He argued that both Russell and Strawson had
failed to notice that there are two distinct uses of definite descriptions.
6.1. The basic distinction
When one uses a description in the attributive way in an assertion, one "states
something about whoever or whatever is the so-and-so" (Donnellan 1966:285). This use
corresponds pretty well to Russell's theory, and in this case, the description is an essential
part of the thought being expressed. The main novelty was Donnellan's claim of a
distinct referential use of definite descriptions. Here one "uses the description to enable
his audience to pick out whom or what he is talking about and states something about
that person or thing" (Donnellan 1966:285). In this case the description used is simply a
device for getting one's addressee to recognize whom or what one is talking about, and
62
I. Foundations of semantics
is not an essential part of the utterance. Donnellan used the example in (20) to illustrate
his distinction.
(20) Smith's murderer is insane.
For an example of the attributive use, imagine the police detective at a gruesome crime
scene, thinking that whoever could have murdered dear old Smith in such a brutal way
would have to have been insane. For a referential use, we might imagine that Jones has
been charged with the murder and that everybody is pretty sure he is the guilty party.
He behaves very strangely during the trial, and an onlooker utters (20) by way of
predicating insanity of Jones. The two uses involve different presuppositions: on the
attributive use there is a presupposition that the description has a reference (or denotation in
Russell's terms), but on the referential use the speaker presupposes more specifically of
a particular entity (Jones, in our example) that it is the one meeting the description. Note
though, that a speaker can know who or what a definite description denotes and still use
that description attributively. For example I might be well acquainted with the dean of
my college, but when I advise my student, who has a grievance against the chair of the
department, by asserting (21),
(21) Take this issue to the dean of the college.
I use the phrase the dean of the college attributively. I mean to convey the thought that
the dean's office is the one appropriate for the issue, regardless of who happens to be
dean at the current time.
6.2. Contentious issues
While it is generally agreed that an attributive-referential distinction exists, there have
been several points of dispute. The most crucial one is the status of the distinction -
whether it is semantic or pragmatic (something about which Donnellan himself seemed
unsure).
6.2.1. A pragmatic analysis
Donnellan had claimed that, on the referential use, a speaker can succeed in referring to
an entity which does not meet the description used, and can make a true statement in so
doing.The speaker who used (20) referentially to make a claim about Jones, for instance,
would have said something true if Jones was indeed insane whether or not he murdered
Smith. Kripke (1977) used this aspect to argue that Donnellan's distinction is nothing
more than the difference between speaker's reference (the referential use) and semantic
reference (the attributive use). He pointed out that similar kinds of misuses or errors can
arise with proper names, for which Donnellan's distinction, if viewed as semantic, could
not be invoked (see below, section 8). Kripke argued further that since the same kind of
attributive-referential difference in use of definite descriptions would arise in a language
stipulated to be Russellian - that is, in which the only interpretation for definite
descriptions was that proposed by Russell in "On denoting" - the fact that it occurs in English
4. Reference: Foundational issues
63
does not argue that English is not Russellian, and thus does not argue that the distinction
is semantic. (See Reimer 1998 for a reply.)
6.2.2. A semantic analysis
On the other hand David Kaplan (1978) (among others, but cf. Salmon 2004) noted the
similarity of Donnellan's distinction to the de dicto/de re ambiguity which occurs in prop-
ositional attitude sentences. Kaplan suggested an analysis on which referential uses are
involved in the expression of Russellian singular propositions. Suppose Jones is Smith's
murderer; then the referential use of (20) expresses the singular proposition consisting
of Jones himself plus the property of being insane. (In suggesting this analysis Kaplan
was rejecting Donnellan's claim about the possibility of making true statements about
misdescribed entities. Others have also adopted this revised view of the referential use,
e.g. Wettstein 1983, Reimer 1998.) Kaplan's analysis of the referential understanding is
similar to the analysis of the complement of a prepositional attitude verb when it is
interpreted de re, while the ordinary Russellian analysis seems to match the analysis of
the complement interpreted de dido. However the two understandings of a sentence like
(20), without a propositional attitude predicate, always have the same truth value while,
as wc have seen, in the context of a sentence about someone's propositional attitude, the
two interpretations can result in a difference in truth value. In explaining his analysis,
Kaplan likened the referential use of definite descriptions to demonstrative NPs. This
brings us back to the topic of indexicality.
7. Kaplan's theory of indexicality
A major contribution to our understanding of reference and indexicality came with
Kaplan's (1989) classic, but mistitled, article "Demonstratives". The title should have
been "Indexicals". Acknowledging the error, Kaplan distinguished pure indexicals like
/ and tomorrow, which do not require an accompanying indication of the intended
reference, from demonstrative indexicals, e.g. this, that book, whose uses do require such
an indication (or demonstration, as Kaplan dubbed it). The paper itself was equally
concerned with both subcategories.
7.1. Content vs. character
The most important contribution of Kaplan's paper was his distinction between two
elements of meaning - the content of an expression and its character.
7.1.1. Content
The content of the utterance of a sentence is the proposition it expresses. Assuming com-
positionality, this content is determined by the contents of the expressions which go to
make up the uttered sentence. The existence of indexicals means that the content of an
utterance is not determined simply by the expressions in it, but also by the context of
utterance. Thus different utterances of, e.g., (22)
64
I. Foundations of semantics
(22) I like that.
will determine different propositions depending who is speaking, the time they are
speaking, and what they are pointing at or otherwise indicating, and would vary depending
on these parameters. In each case the proposition involved would be, on Kaplan's view
(following Russell), a singular proposition.
7.1.2. Character
Although on Kaplan's view the contribution of indcxicals to propositional content is
limited to their reference, they do have a meaning: /, for example, has a meaning involving
the concept of being the speaker. These latter types of linguistically encoded meaning
are what Kaplan referred to as "character". In general, the character of an expression is
a function which, given a context of utterance, returns the content of that expression in
that context. In a way Kaplan is showing that Frcgc's concept of sense actually needs to
be subdivided into these two elements of character and content. (Cf. Kaplan 1989, fn. 26.)
7.2. An application of the distinction
Indcxicals, both pure and demonstrative, have a variable character - their character
determines different contents in different contexts of utterance. However the content
so determined is constant (an actual entity, on Kaplan's view). Using the distinction
between character and content, Kaplan is able to explain why (23)
(23) I am here now.
is in a sense necessary, but in another sense contingent. Its character is such that anyone
uttering (23) would be making a true statement, but the content determined on any such
occasion would be a contingent proposition. For instance if I were to utter (23) now, my
utterance would determine the singular proposition containing me, my office; 2:50 pm
on January 18,2007; and the relation of being which relates an entity, a place, and a time.
That proposition is true at the actual world, but false in many others.
7.3. The problem of the essential indexical
Perry (1979), following Castaneda (1968), pointed out that indexicality seems to pose a
special problem in propositional attitude sentences. Suppose Mary, a ballet dancer, has
an accident resulting in amnesia. She sees a film of herself dancing, but docs not
recognize herself. As a result of seeing the film she comes to believe, de re, of the person in the
film (i.e. herself) that that person is a good dancer. Still she lacks the knowledge that it is
she herself who is a good dancer. As of now we have no way of representing this missing
piece of knowledge. Perry proposed recognizing belief states, in addition to the
propositions which are the objects of belief, in order to solve this problem; Mary grasps the
proposition, but does not grasp it in the first person way. Lewis (1979) proposed instead
viewing belief as attribution to oneself of a property; he termed this "belief de se". Belief
concerning a nonindexical proposition would then be self-attribution of the property of
4. Reference: Foundational issues
65
belonging to a possible world where that proposition was true. Mary has the latter kind
of belief with respect to the proposition that she is a good dancer, but does not (yet)
attribute to herself good dancing capability.
7.4. Other kinds of NP
As we have seen, Strawson pointed out that some definite descriptions which would
not ordinarily be thought of as indexical can have an element of indexicality to them.
An indexical definite description like the (present) king of France would, on Kaplan's
analysis, have both variable character and variable content. As uttered in 1770, for
example, it would yield a function whose value in any possible world is whoever is
king of France in 1770. That would be Louis XVI in the actual world, but other
individuals in other worlds depending on contingent facts about French history. As
uttered in 1950 the definite description has no reference in the actual world, but does
in other possible worlds (since it is not a necessary fact that France is a republic and
not a monarchy in 1950 - the French Revolution might have failed). A nonindexical
definite description like the inventor of bifocals has a constant character but variable
content. That is, in any context of utterance its content is the function from possible
worlds that picks out whoever it is who invented bifocals in that world. For examples
of NPs with constant character and constant content, we must turn to the category of
proper names.
8. Proper names and Kripke's return to Millian nondescriptionality
It may be said that in asserting that proper names have denotation without connotation,
Mill captured our ordinary pre-theoretic intuition.That is, it seems intuitively clear that
proper names do not incorporate or express any properties like having taught Alexander
the Great or having invented bifocals. On the other hand we can also understand why
both Frege and Russell would be driven to the view that, despite this intuition, they do
express some kind of property.That is because the alternative would be the direct
reference, or Fido-Fido view, and the two kinds of problems that we saw arising for that view
arise for proper names as well as definite descriptions. Thus identity sentences of the
form a = b arc informative with proper names as they arc with definite descriptions, as
exemplified in (24).
(24) a. Mark Twain is Mark Twain.
b. Samuel Clemens is Mark Twain.
Intuitively (24b) conveys information over and above that conveyed by (24a). Similarly
exchanging co-rcfcrcntial proper names in propositional attitude sentences can seem to
change truth value.
(25) a. Mary knows that Mark Twain wrote Tom Sawyer.
b. Mary knows that Samuel Clemens wrote Tom Sawyer.
We can well imagine someone named Mary for whom (25a) would be true yet for
whom (25b) would seem false. Furthermore there arc many proper names which arc
66
I. Foundations of semantics
non-referential, and for which negative identity sentences like (26) would seem true and
not meaningless.
(26) Santa Claus does not exist.
If proper names have a sense, or are otherwise equivalent to definite descriptions, then
some or all of these problems are solved. Thus it was an important development when
Kripke argued for a return to Mill's view on proper names. But before we get to that, we
should briefly review a kind of weakened description view of proper names.
8.1. The 'cluster' view
Both Wittgenstein (1953) andSearle (1958) argued for a view of proper names according
to which they are associated semantically with a cluster of descriptions - something like
a disjunction of properties commonly associated with the bearer of the name.
Wittgenstein's example used the name Moses, and he suggested that there is a variety of
descriptions, such as "the man who led the Israelites through the wilderness", "the man who as a
child was taken out of the Nile by Pharaoh's daughter", which may give meaning to the
name or support its use (Wittgenstein 1953, §79). No single description is assumed to give
the meaning of the name. However, as Searle noted, on this view it would be necessarily
true that Moses had at least one of the properties commonly attributed to him (Searle
1958:172).
8.2. The return to Mill's view
In January of 1970 Saul Kripke gave an important series of lectures titled "Naming and
necessity" which were published in an anthology in 1972 and in 1980 reissued as a book.
In these lectures Kripke argued against both the Russell-Frege view of proper names as
abbreviated definite descriptions and the Wittgenstein-Searle view of proper names as
associated semantically with a cluster of descriptions, and in favor of a return to Mill's
nondescriptional view of proper names. Others had come to the same conclusion (e.g.
Marcus 1961, Donnellan 1972), but Kripke's defense of the nondescriptional view was
the most thorough and influential.The heart of Kripke's argument depends on intuitions
about the reference of expressions in alternative possible worlds. These intuitions
indicate a clear difference in behavior between proper names and definite descriptions. A
definite description like the student of Plato who taught Alexander the Great refers to
Aristotle in the actual world, but had circumstances been different - had Xenocrates
rather than Aristotle taught Alexander the Great - then the student of Plato who taught
Alexander the Great would refer to Xenocrates, and not to Aristotle. Proper names, on
the other hand, do not vary their reference from world to world. Kripke dubbed them
"rigid designators".Thus sentences like (27) seem true to us.
(27) Aristotle might not have taught Alexander the Great.
Furthermore, Kripke pointed out, a sentence like (27) would be true no matter what
contingent property description is substituted for the predicate. In fact something like
(28) seems to be true:
4. Reference: Foundational issues
67
(28) Aristotle might have had none of the properties commonly attributed to him.
But the truth of (28) seems inconsistent with both the Frege-Russell definite description
view of proper names and the Wittgenstein-Searle cluster view. On the other hand a
sentence like (29) seems false.
(29) Aristotle might not have been Aristotle.
This supports Kripke's claim of rigid designation for proper names; since the name
Aristotle must designate the same individual in any possible world, there is no possible
world in which that individual is not Aristotle. And thus, to put things in Kaplan's terms,
proper names have both constant character and constant content.
8.3. Natural kind terms
Although it goes beyond our focus on NPs, it is worth mentioning that Kripke extended
his theory of nondescriptionality to at least some common nouns - those naming species
of plants or animal, like elm and tiger, as well as those for well-defined naturally
occurring substances or phenomena, such as gold and heat, and some adjectives like loud, and
red. In this Kripke's views differed from Mill, but were quite similar to those put
forward by Putnam (1975). Putnam's most famous thought experiment involved imagining
a "twin earth" which is identical to our earth except that the clear, colorless, odorless
substance which falls from the sky as rain and fills the lakes and rivers, and which is called
water by twin-English speaking twin earthlings, is not H20 but instead a complex
compound whose chemical formula Putnam abbreviates XYZ. Putnam argues that although
Oscar, on earth and Oscar2 on twin earth arc exactly the same mentally when they think
"I would like a glass of water", nevertheless the contents of their thoughts are different.
His famous conclusion:" 'Meanings'just ain't in the head" (Putnam 1975:227;see Segal
2000 for an opposing view.)
8.4. Summary
Let us take stock of the situation. We saw that the simplest theory of reference, the
Fido-Fido or direct reference theory, had problems with accounting for the apparent
semantic inequivalence of coreferential NPs - the fact that true identity statements
could be informative, and that exchanging coreferential NPs in propositional attitude
contexts could even result in a change in truth value. This theory also had a problem
with non-referring or empty NPs, a problem which became particularly acute in
the case of true negative existence statements. Frege's theory of sense seemed to
solve most of these problems, and Russell's analysis of definite descriptions seemed to
solve all of them. However, though the theories of Frcgc and Russell arc plausible
for definite descriptions, as Kripke made clear they do not seem to work well for
proper names, for which the direct reference theory is much more plausible. But the
same two groups of problems - those involving co-referential NPs and those involving
empty NPs - arise for proper names just as they do for definite descriptions. Of these
problems, the one involving substituting coreferential NPs in propositional attitude
68
I. Foundations of semantics
contexts has attracted the most attention. (See also article 60 (Swanson) Propositional
attitudes.)
9. Propositional attitude contexts
Kripke's arguments for a return to Mill's view of proper names have generally been found
to be convincing (although exceptions will be noted below).This appears to leave us with
the failure of substitutivity of coreferential names in propositional attitude contexts.
However Kripke (1979) argued that the problem was not actually one of substitutivity, but a
more fundamental problem in the attribution of propositional attitudes.
9.1. The Pierre and Peter puzzles
Kripke's initial example involved a young Frenchman, Pierre, who when young came to
believe on the basis of postcards and other indirect evidence that London was a beautiful
city. He would sincerely assert (30) whenever asked.
(30) Londres et jolie.
Eventually, however, he was kidnapped and transported to a very bad section of London,
and learned English by the direct method. His circumstances did not allow him to explore
the city (which he did not associate with the city he knew as Londres), and thus based on
his part of town, he would assert (31).
(31) London is not pretty.
The question Kripke presses us to answer is that posed in (32):
(32) Does Pierre, or does he not, believe that London is pretty?
An alternative, monolingual, version of the puzzle involves Peter, who has heard of
Paderewski the pianist, and Paderewski the Polish statesman, but who does not know
that they were the same person and who is furthermore inclined to believe that anyone
musically inclined would never go into politics.The question is (33):
(33) Docs Peter, or docs he not, believe that Paderewski had musical talent?
Kripke seems to indicate that these questions do not have answers:"...our normal
practices of interpretation and attribution of belief are subjected to the greatest possible
strain, perhaps to the point of breakdown. So is the notion of the content of someone's
assertion, the proposition it expresses" (Kripke 1979: 269; italics in original). Others,
however, have not been deterred from answering Kripke's questions in (32) and (33).
9.2. Proposed solutions
Many solutions to the problem of propositional attitude attribution have been proposed.
Wc will look here at several of the more common kinds of approaches.
4. Reference: Foundational issues
69
9.2.1. Metalinguistic approaches
Metalinguistic approaches to the problem involve linguistic expressions as components
of belief in one way or another. Quine (1956) had suggested the possibility of viewing
propositional attitudes as relations to sentences rather than propositions. This would
solve the problem of Pierre, but would seem to leave Peter's problem, given that we
have a single name Paderewski in English (but see Fiengo & May 1998). Others (Bach
1987, Katz 2001) have put forward metalinguistic theories of proper names, rejecting
Kripke's arguments for their nondescriptionality.The idea here is that a name N means
something like "the bearer of AT. This again would seem to solve the Pierre puzzle
(Pierre believes that the bearer of Londres is pretty, but not the bearer of London),
but not Peter's Paderewski problem. Bach argues that (33) would need contextual
supplementation to be a complete question about Peter's beliefs (see Bach 1987:
165ff).
9.2.2. Hidden indexical theories
The remaining two groups of theories are consistent with Kripke's nondescriptional
analysis of proper names. Hidden indexical theories involve postulating an unmen-
tioned (or hidden) element in belief attributions, which is "a mode of presentation" of
the proposition, belief in which is being attributed. (Cf. Schiffer 1992, Crimmins & Perry
1989.) Thus belief is viewed as a three-place relation, involving a believer, a
proposition believed, and a mode of presentation of that proposition. Furthermore these modes
of presentation are like indexicals in that different ones may be invoked in different
contexts of utterance. The answer to (32) or (33) could be cither Yes or No, depending
upon which kind of mode of presentation was understood. The approach of Richard
(1990) is similar, except that the third element is intended as a translation of a mental
representation of the subject of the propositional attitude verb.
9.2.3. Pragmatic theories
Our third kind of approach is similar to the hidden indexical theories in
recognizing modes of presentation. However the verb believe (like other propositional attitude
verbs) is seen as expressing a two-place relation between a believer and a proposition,
and no particular mode of presentation is entailed. Instead, this relation is defined in
such a way as to entail only that there is at least one mode of presentation under which
the proposition in question is believed. (Cf. Salmon 1986.) This kind of theory would
answer either (32) or (33) with a simple Yes since there is at least one mode of
presentation under which Pierre believes that London is pretty, and at least one under which
Peter believes that Paderewski had musical talent. A pragmatic explanation is offered
for our tendency to answer No to (32) on the basis of Pierre's English assertion London
is not pretty.
10.Indefinite descriptions
We turn now to indefinite descriptions - NPs which in English begin with the indefinite
article a/an.
70
I. Foundations of semantics
10.1. Indefinite descriptions are not referring expressions
As we saw above, Russell did not view definite descriptions as referring expressions, so it
will come as no surprise that he was even more emphatic about indefinite descriptions. He
had several arguments for this view (cf. Russell 1919:167ff). Consider his example in (34).
(34) I met a man.
Suppose that the speaker of (34) had met Mr. Jones, and that that meeting constituted
her grounds for uttering (34). In that case, were a man referential, it would have to refer
to Mr. Jones. Nevertheless someone who did not know Jones at all could easily have a full
understanding of (34). And were the speaker of (34) to add (35) to her utterance
(35) .. .but it wasn't Jones.
she would not be contradicting herself (though of course she would be lying, under the
circumstances). On the other hand if it should turn out that the speaker of (34) did not
meet Jones after all, but did meet some other man, it would be very hard to regard (34)
as false. Russell's arguments have been reiterated and augmented by Ludlow & Neale
(1991).
10.2. Indefinite descriptions are referring expressions
Since Russell's time others (e.g. Strawson 1952) have argued that indefinite descriptions
do indeed have referring uses. The clearest kinds of cases are ones in which chains of
reference occur, as in (36) (from Chastain 1975:202).
(36) A man was sitting underneath a tree eating peanuts. A squirrel came by, and the
man fed it some peanuts.
Both a man and a squirrel in (36) seem to be coreferential with subsequent expressions
that many people would consider to be referring - if not the man, then at least it.
Chastain argues that there is no reason to deny rcfcrcntiality to the indefinite NPs which
initiate such chains of reference, and that indeed, that is where the subsequent
expressions acquired their referents. It should be noted, though, that if serving as antecedent
for one or more pronouns is considered adequate evidence for referentiality, then
overtly quantificational NPs should also be considered referential, as shown in (37).
(37) a. Everybody who came to my party had a good time. They all thanked me
afterward,
b. Most people don't like apples.They only eat them for their health.
10.3. Parallels between indefinite and definite descriptions
Another relevant consideration is the fact that indefinite descriptions seem to parallel
definite descriptions in several ways.They show an ambiguity similar to the de dicto-de re
ambiguity in propositional attitude contexts, as shown in (38).
4. Reference: Foundational issues
71
(38) Mary wants to interview a diplomat.
(38) could mean either that there is a particular diplomat whom Mary is planning
to interview (where a diplomat has wide scope corresponding to the de re reading for
definite descriptions), or that she wants to interview some diplomat or other - say to
boost the prestige of her newspaper. (This reading, where a diplomat has narrow scope
with respect to the verb wants, corresponds to the de dicto reading of definite
descriptions.) Neither of these readings entails the other - either could be true while the other
is false. Furthermore indefinite descriptions participate in a duality of usage, the specific-
nonspecific ambiguity (see article 42 (von Heusinger) Specificity), which is very similar
to Donncllan's referential-attributive ambiguity for definite descriptions. Thus while
the indefinites in Chastain's example above are most naturally taken specifically, the
indefinite in (39) must be taken nonspecifically unless something further is added (since
otherwise the request would be infelicitous).
(39) Please hand me a pencil.
(See also Fodor & Sag 1982.) In casual speech specific uses of indefinite descriptions can
be unambiguously paraphrased using non-demonstrative this, as in (40).
(40) This man was sitting underneath a tree eating peanuts.
However non-demonstrative this cannot be substituted for the non-specific a in (39)
without causing anomaly. So if at least some occurrences of definite descriptions are
viewed as referential, these parallels provide an argument that the corresponding
occurrences of indefinite descriptions should also be so viewed. (Devitt 2004 argues in favor
of this conclusion.)
10.4. Discourse semantics
More recently approaches to semantics have been developed which provide an
interpretation for sentences in succession, or discourses. Initially developed independently
by Hcim (1982) and Kamp (1981), these approaches treat both definite and indefinite
descriptions as similar in some ways to quantificational terms and in some ways to
referring expressions such as proper names. Indefinite descriptions introduce new discourse
entities, and subsequent references, whether achieved with pronouns or with definite
descriptions, add information about those entities. (See article 37 (Kamp & Reyle)
Discourse Representation Theory, article 38 (Dckkcr) Dynamic semantics.)
10.5. Another puzzle about belief
We noted above that indefinite descriptions participate in scope ambiguities in proposi-
tional attitude contexts.The example below in (41) was introduced by Geach (1967), who
argued that it raises a new problem of interpretation.
(41) Hob thinks a witch has blighted Bob's marc, and Nob wonders whether she (the
same witch) killed Cob's sow.
72
I. Foundations of semantics
Neither the ordinary wide scope or narrow scope interpretation is correct for (41). The
wide scope interpretation (there is a witch such that...) would entail the existence of a
witch, which does not seem to be required for the truth of (41). On the other hand the
narrow scope interpretation (Hob thinks that there is a witch such that...) would fail to
capture the identity between Hob's witch and Nob's witch. This problem, which Geach
referred to as one of "intentional identity", like many others, has remained unsolved.
11. Summary
As we have seen, opinions concerning referentiality vary widely, from Russell's position
on which almost no NPs are referential to a view on which almost any NP has at least
some referential uses.The differences may seem inconsequential, but they are central to
issues surrounding the relations among language, thought, and communication - issues
such as the extent to which we can represent the propositional attitudes of others in our
speech, and even the extent to which our own thoughts are encoded in the sentences we
utter, as opposed to being inferred from hints provided by our utterances.
12. References
Bach, Kent 1987. Thought and Reference. Oxford: Oxford University Press.
Carnap, Rudolf 1956. Meaning and Necessity: A Study in Semantics and Modal Logic. 2nd edn.
Chicago, IL:The University of Chicago Press.
Castaneda, Hector-Neri 1968. On the logic of attributions of self knowledge to others. Journal of
Philosophy 65,439^156.
Chastain, Charles 1975. Reference and context. In: K. Gunderson (ed.). Minnesota Studies in the
Philosophy of Science, vol. 7: Language Mind and Knowledge. Minneapolis, MN: University of
Minnesota Press, 194-269.
Crinunins, Mark & John Perry 1989. The prince and the phone booth. Journal of Philosophy 86,
685-711.
Davidson, Donald 1968. On saying that. Synthese 19,130-146. Reprinted in: D. L. Davidson.
Inquiries into Truth and Interpretation. Oxford: Clarendon Press, 1984,93-108.
Devitt, Michael 2004.The case for referential descriptions. In: M. Reimer & A. Bezuidenhout (eds.).
Descriptions and Beyond. Oxford: Clarendon Press, 280-305.
Donnellan, Keith S. 1966. Reference and definite descriptions. Philosophical Review 77,281-304.
Donnellan, Keith S. 1972. Proper names and identifying descriptions. In: D. Davidson & G. Harm an
(eds.). Semantics of Natural Language. Dordrecht: Reidel, 356-379.
Fiengo, Robert & Robert May 1998. Names and expressions. Journal of Philosophy 95,377-409.
Fodor, Janet D. & Ivan Sag 1982. Referential and quantificational indefinites. Linguistics &
Philosophy 5,355-398.
Frcgc, Gottlob 1892. Ubcr Sinn und Bedcutung. Zeitschrift fur Philosophie und philosophische
Kritik, 25-50. English Translation in: P. Geach & M. Black (eds.). Translations from the
Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980,56-78.
Geach, Peter T 1967. Intentional identity. Journal of Philosophy 64,627-632.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation.
University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Grocncndijk, T M.V.
Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam:
Mathematical Centre, 277-322. Reprint in: J. Groenendijk, T.M.V. Janssen & M. Stokhof (eds.).
Truth, Interpretation and information: Selected Papers from the Third Amsterdam Colloquium.
Dordrecht: Foris, 1984,1-41.
4. Reference: Foundational issues
73
Kaplan, David 1978. Dthat. In: P. Cole (ed.). Syntax and Semantics 9: Pragmatics. New York:
Academic Press, 221-243.
Kaplan, David 1989. Demonstratives: An essay on the semantics, logic, metaphysics, and episte-
mology of demonstratives and other indexicals. In: J. Almog, J. Perry & H. Wettstein (eds.).
Themes from Kaplan. Oxford: Oxford University Press, 481-563.
Katz, Jerrold J. 2001. The end of Millianism: Multiple bearers, improper names, and compositional
meaning. Journal of Philosophy 98,137-166.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of
Natural Language. Dordrecht: Reidel, 253-355 and 763-769. Reissued separately with Preface,
Cambridge, MA: Harvard University Press, 1980.
Kripke, Saul 1977. Speaker's reference and semantic reference. In: P. A. French, T. E. Uehling,
Jr. & H. Wettstein (eds.). Midwest Studies in Philosophy, vol. II: Studies in the Philosophy of
Language. Morris, MN: University of Minnesota, 255-276.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (ed.). Meaning and Use. Dordrecht: Reidel,
139-183.
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural
Language. Dordrecht: Reidel, 169-218.
Lewis, David. 1979. Attitudes de dicto and de se. Philosophical Review 88,513-543.
Ludlow, Peter & Stephen Neale 1991. Indefinite descriptions. Linguistics & Philosophy 14,
171-202.
Marcus, Ruth Barcan 1961. Modalities and intensional languages. Synthese 13,303-322.
Mill, John Stuart 1843. A System of Logic, Ratiocinative and Inductive, Being a Connected View
of the Principles of Evidence, and the Methods of Scientific Investigation. London: John
W. Parker.
Montague, Richard 1973. The proper treatment of quantification inordinary English. In: J. Hintikka,
J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language: Proceedings of the 1970
Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, 221-242. Reprinted in:
R. Thomason (ed.). Formal Philosophy: Selected Papers of Richard Montague. New Haven,
CT:Yale University Press, 1974,247-270.
Neale, Stephen 1990. Descriptions. Cambridge. MA:The MIT Press.
Perry, John 1979. The problem of the essential indcxical. Nous 13,3-21.
Putnam, Hilary 1975. The meaning of 'meaning'. In: K. Gunderson (ed.). Language, Mind and
Knowledge. Minneapolis, MN: University of Minnesota Press. Reprinted in: H. Putnam.
Philosophical Papers, vol. 2: Mind, Language and Reality. Cambridge: Cambridge University Press,
1975,215-271.
Quine.Willaid van Orman 1956. Quantifiers and propositional attitudes. Journal of Philosophy 53,
177-187.
Reimer, Marga 1998. Donnellan's distinction/Kripke's test. Analysis 58,89-100.
Richard, Mark 1990. Propositional Attitudes. Cambridge: Cambridge University Press.
Russell, Bertrand 1905. On denoting. Mind 14,479-493.
Russell, Bertrand 1917. Knowledge by acquaintance and knowledge by description. Reprinted
in: B. Russell. Mysticism and Logic. Paperback edn. Garden City, NY: Doubleday, 1957,
202-224.
Russell, Bertrand 1919. Introduction to mathematical philosophy. London: Allen & Unwin. Reissued
n.d., New York: Touchstone Books.
Russell, Bertrand 1957. Mr. Strawson on referring. Mind 66,385-389.
Salmon, Nathan 1986. Frege's Puzzle. Cambridge, MA:The MIT Press.
Salmon, Nathan 2004. The good, the bad, and the ugly. In: M. Reimer & A. Bezuidenhout (eds.).
Descriptions and Beyond. Oxford: Clarendon Press, 230-260.
Schiffer, Stephen 1992. Belief ascription. Journal of Philosophy 89,499-521.
Searle, John R. 1958. Proper names. Mind 67,166-173.
Segal, Gabriel M. 2000. A Slim Book about Narrow Content. Cambridge, MA:The MIT Press.
74
I. Foundations of semantics
Stalnaker, Robert C. 1984. Inquiry. Cambridge, MA:The MIT Press.
Stalnaker, Robert C. 1999. Context and Content. Oxford: Oxford University Press.
Strawson, Peter F. 1950. On referring. Mind 59,320-344.
Strawson. Peter F. 1952. Introduction to Logical Theory. London: Methuen.
Wettstein, Howard K. 1983. Tlie semantic significance of the referential-attributive distinction.
Philosophical Studies 44,187-196.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated into English by G.E.M.
Ansconibe. Oxford: Blackwell.
Barbara Abbott, Lake Leelanau, MI (USA)
5. Meaning in language use
1. Overview
2. Deixis
3. The Gricean revolution: The relation of utterance meaning to sentence meaning
4. Implications for word meaning
5. The nature of context and the relationship of pragmatics to semantics
6. Summary
7. References
Abstract
In a speech community, meaning attaches to linguistic forms through the ways in
which speakers use those forms, intending and expecting to communicate with their
interlocutors. Grice's (1957) insight that conventional linguistic meaning amounts to the
expectation by members of a speech community that hearers will recognize speakers'
intentions in saying what they say the way they say it. This enabled him to sketch how this
related to lexical meaning and presupposition, and (in more detail) implied meaning. The
first substantive section of this article briefly recapitulates the work of Bar-Hillel (1954)
on indexicals, leading to the conclusion that even definite descriptions have an indexical
component. Section 3 describes Grice's account of the relation of intention to intensions
and takes up the notion of illocutionary force. Section 4 explores the implications of the
meaning-use relationship for the determination of word meanings. Section 5 touches briefly
on the consequences of the centrality of communication for the nature of context and the
relation of context and pragmatic considerations to formal semantic accounts.
1. Overview
At the beginning of the 20th century a tension existed between those who took
languages to be representable as formal systems with complete truth-conditional semantics
(Carnap, Russell, Frege,Tarski) - and who viewed natural language as rather defective
in that regard, and those who took the contextual use of language to be
determinative of the meanings of its forms (Austin, Strawson, the later Wittgenstein). (See also
article 3 (Tcxtor) Sense and reference.) For a very readable account of this intellectual
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 74-95
5. Meaning in language use 75
battleground,see Recanati (2004).The debate has been largely resolved with acceptance
(not always explicitly recognized) of the views of Bar-Hillel and Grice on the character
of the human use of language.
In a speech community, meaning attaches to linguistic forms through the ways in which
speakers use those forms, intending and expecting to communicate with their
interlocutors. This is immediately obvious in the case of the reference of indexical terms like / and
you, and the determination of the illocutionary force of utterances. It extends also to
matters of reference generally, implicature, coherence of texts and discourse understanding,
as well as to the role of linguistic forms in maintaining social relations (politeness).
Grice's (1957) insight that conventional linguistic meaning amounts to the expectation
by members of a speech community that hearers will recognize speakers' intentions in
saying what they say the way they say it enabled him to sketch how this related to lexical
meaning and presupposition, and (in more detail) implied meaning. While the view
presented here has its origins in Grice's insight, it is considerably informed by elaborations
and extensions of it over the past 40 years.
2. Deixis
Bar-Hillel (1954) demonstrated that it is not linguistic forms that carry pragmatic
information, but the facts of their utterance, and this notion was elaborated in Stalnaker
(1972). Bar-Hillel claimed that indexicality is an inherent and unavoidable aspect of
natural language, speculating that more than 90% of the declarative sentences humans utter
have use-dependent meanings in that they involve implicit references to the speaker,
the addressee and/or the speech time. The interpretation of first and second person
pronouns, tenses, and deictic adverbials are only the tip of the iceberg. A whole host of other
relational terms (not to mention deictic and anaphoric third-person references, and even
illocutionary intentions) require an understanding of the speaker's frame of reference
for interpretation.To demonstrate, it is an elementary observation that utterances of
sentences like those in (1) require an indication of when and where the sentence was uttered
to be understood enough to judge whether they are true or false.
(1) a. I am hungry,
b. It's raining.
In fact, strictly speaking, the speaker's intention is just as important as location in space-
time of the speech act.Thus, (lb) can be intended to refer to the location of the speaker
at speech-time, or to some location (like the location of a sports event being viewed on
television) that the speaker believes to be salient in the mind of the addressee, and will
be evaluated as true or false accordingly. Of course, the reference of past and future
tenses is also a function of the context in which they are uttered, referring to times that
are a function of the time of utterance.Thus, if I utter (2a) at t(), I mean that I am hungry
at t0; if I utter (2b) at t0,1 mean that I was hungry at some point before t0, and if I say (2c)
at t0,1 mean that I will be hungry at some point after t0.
(2) a. I am hungry.
b. I was hungry.
c. I will be hungry.
76
I. Foundations of semantics
Although (2a) indicates a unique time, (2b) and (2c) refer very vaguely to some time
before or after t0; nothing in the utterance specifies whether it is on the order of
minutes, days, weeks, years, decades, or millenia distant. Furthermore, it is not clear whether
the time indicated by (2a) is a moment or an interval of indefinite duration which
includes the moment of utterance. See McCawley (1971),Dowty (1979),Partee (1973), and
Hinrichs (1986) for discussion of some issues and ramifications of the alternatives.
Although Bar-Hillel never refers to the notion of intention, he seems to have
recognized in discussing the deictic this that indexicals are multiply indeterminate, despite
being linked to the context of utterance: "'This' is used to call attention to something
in the centre of the field of vision of its producer, but, of course, also to something in
his spatial neighborhood, even if not in his centre of vision or not in his field of vision
at all, or to some thing or some event or some situation, etc., mentioned by himself or
by somebody else in utterances preceding his utterance, and in many more ways" (Bar-
Hillel 1954:373).The indexical this can thus be intended (a) deictically to refer to
something gesturally indicated, (b) anaphorically to refer to a just completed bit of discourse,
(c) cataphorically, to refer to a bit of discourse that will follow directly, or (d) figuratively,
to refer to something evoked by whatever is indicated and deictically referred to (e.g.,
to evoke the content of a book by indicating an image of its dust jacket), as in (3), where
this refers not to a part of a photograph that is gesturally indicated, or to the dust jacket
on which it appears, but to the text of the book the dust jacket was designed to protect.
(3) Oh, I've read this!
Similarly, there is an indexical component to the interpretation of connectives and
relational adverbials in that their contribution to the meaning of an utterance involves
determining what bit of preceding discourse they are intended to connect whatever follows
them to, as illustrated in (4).
(4) a. Therefore, Socrates is mortal.
b. For that reason, wc oppose this legislation.
In addition to such primary indexicals - linguistic elements whose interpretation is
bound to the circumstances of their utterance, there are several classes of expressions
whose interpretation is directly bound to the interpretation of such primary indexicals.
Partee (1989) discussed covert pronominals,for example, local as in (5).
(5) Dan Rather went to a local bar.
Is the bar local relative to Rather's characteristic location? to his location at speech-
time? to the location of the speaker at speech time? to the characteristic location of the
speaker?
In addition, as has long been acknowledged in references to "the universe of
discourse," interpretation of the definite article invokes indexical reference; the
uniqueness presupposition associated with use of the definite article amounts to a belief by the
speaker that the intended referent of the definite NP is salient to the addressee. Indeed,
insofar as the interpretation of ordinary kind names varies with the expectations about
the universe of discourse that the speaker imputes to the addressee (Nunberg 1978), this
5. Meaning in language use 77
larger class of indexicals encompasses the immense class of descriptive terms, including
cat, mat, window, hamburger, red, and the like as in (6).
(6) a. The blond hamburger spilled her coffee,
b. The cat shattered next to the mat.
(Some of the conclusions of Nunberg (1978) are summarized in Nunberg (1979), but the
latter work does not give an adequate representation of Nunberg's explanation of how
the use of words by speakers in contexts facilitates reference. For example, Nunberg
(1979) does not contain accounts of either the unextractability of interpretation from
context (and the non-existence of null contexts), or the notion of a system of normal
beliefs in a speech community. Yet both notions are central to understanding Nunberg's
arguments for his conclusions about polysemy and interpretation.)
The important point here is that with these secondary indexicals (and in fact, even
with the so-called primary indexicals, Nunberg 1993), interpretation is not a simple
matter of observing properties of the speech situation (source of the speech sound,
calendric time, etc.), but involves making judgements about possible mappings from
signal to referent (essentially the same inferential processes as are involved in
disambiguation - cf. Green (1995)). Bar-Hillel (1954) and Morgan (1978) pointed out that
the interpretation of demonstratives involves choosing from among many (perhaps
indefinitely many) objects the speaker may have been pointing at, as well as what
distinct class or individual the speaker may have intended to refer to by indicating that
object (as pointed out by Nunberg 1978).Thus, the interpretation of indexicals involves
choices from the elements of a set (possibly an infinite set) that is limited differently
in each speech situation. For further discussion, see also article 90 (Diessel) Deixis and
demonstratives.
Bar-Hillel's observations on the nature of indexicals form the background for the
development of context-dependent theories of semantics by Stalnaker (1970), Kamp
(1981), Heim (1982) and others. See also articles 37 (Kamp & Reyle) Discourse
Representation Theory and 38 (Dekker) Dynamic semantics. Starting from a very different set
of considerations, Grice (1957) outlined how a speaker's act of using a linguistic form to
communicate so-called literal meaning makes critical reference to speaker and hearer,
with far-reaching consequences. Section 3 takes up this account in detail.
3. The Gricean revolution: The relation of utterance meaning
to sentence meaning
3.1. Meaning according to Grice
Grice (1957) explored the fact that we use the same word (mean) for what an event
entails, what a speaker intends to communicate by uttering something, and what a
linguistic expression denotes. How these three kinds of meaning are related was illuminated
by his later (and regrettably widely misunderstood) work on implicaturc (sec Ncalc 1992
for discussion). Grice (1957) defined natural meaning (meaning^) as natural
entailment: the subject of mean refers to an event or state, as in (7).
(7) Red spots on your body means you have measles.
78
I. Foundations of semantics
He reserved the notion non-natural meaning (meaningNN) for signification by
convention: the subject refers to an agent or a linguistic instrument, as in (8).
(8) a. By "my better half", John meant his wife.
b. In thieves' cant, "trouble and strife" means 'wife'.
Thus, Grice treated the linguistic meaning of expressions as strictly conventional. He
declined to use the word sign for it as he equated sign with symptom, which characterizes
natural meaning. He showed that non-natural meaning cannot be reduced via a behav-
iorist causal strategy to 'has a tendency to produce in an audience a certain attitude,
dependent on conditioning,' as that would not distinguish non-natural meaning from
natural meaning, and in addition, would fail to distinguish connotation from denotation.
He argued that the causal account can characterize so-called "standard" meanings, but
not meaning on an occasion of use, although meaning on an occasion of use is just what
should explain standard meaning, especially on a causal account. Grice also rejected the
notion that X meantvv Y is equivalent to 'the speaker S intended the expression X to
make the addressee H believe Y', because that would include in addition to the
conventional meaning of a linguistic expression, an agent's utterly nonconventional
manipulation of states and events. Finally, he ruled out interpretations of X meantvv Y as 'the
speaker intended X to make H believe Y, and to recognize S's intention', because it still
includes in the same class as conventional linguistic meaning any intentional,
nonconventional acts whose intention is intended to be recognized - such as presenting the head
of St. John the Baptist on a platter. Grice supported instead a theory where A meantm
something by A'entails that (a) an agent intended the utterance of X to induce a belief or
intention in H, and (b) the agent intended X to be recognized as intended to induce that
belief/intention.This formulation implies that A does not believe (a) will succeed without
recognition of A's intention. Grice's argument begins by showing that in directive cases
such as getting someone to leave, or getting a driver to stop a car (examples which do not
crucially involve language), the intended effect must be the sort of thing that is within
the control of the addressee. Thus, X meansNN something amounts to 'people intend X
to induce some belief/intention P by virtue of H taking his recognition of the intention
to produce effect P as a reason to believe/intend P'. Grice pointed out that only primary
intentions need to be recognized: the utterer/agent is held to intend what is normally
conveyed by the utterance or a consequence of the act, and ambiguous cases are resolved
with evidence from the context that bears on identifying a plausible intention.
Grice's (1975) account of communicated meaning is a natural extension of his (1957)
account of meaning in general (cf. also Green 1990, Neale 1992), in that it characterizes
meaning as inherently intentional: recognizing an agent's intention is essential to
recognizing what act she is performing (i.e., what she meantNN by her act). Grice's reference
to the accepted purpose or direction of the talk exchange (Grice 1975: 45) in the
characterization of the Cooperative Principle implies that speaker and hearer are constantly
involved (usually not consciously) in interpreting what each other's goals must be in
saying what they say. Disambiguating structurally or lexically ambiguous expressions like
old men and women, or ear (i.e., of corn, or to hear with), inferring what referent a speaker
intends to be picked out from her use of a definite noun phrase like the coffee place, and
inferring what a speaker meant to implicate by an utterance that might seem unnecessary
or irrelevant all depend equally on the assumptions that the speaker did intend something
5. Meaning in language use 79
to be conveyed by her utterance that was sufficiently specific for the goal of the utterance,
that she intended the addressee to recognize this intention, and by means of recognizing
the intention, to recognize what the speaker intended to be conveyed.
In semantics, as intended, the idea that the act of saying something communicates
more than just what is said allowed researchers to distinguish constant, truth-conditional
meanings that are associated by arbitrary convention with linguistic forms from aspects
of understanding that are a function of a meaning being conveyed in a particular context
(by whatever means). This in turn enabled syntacticians to abandon the hopeless quest
for hidden structures whose analysis would predict non-truth-conditional meanings, and
to concentrate on articulating syntactic theories that were compatible with theories of
compositional semantics, articulating the details of the relation between form and
conventional, truth-conditional meaning. Finally, it inspired a prodigious amount of research
in language behavior (e.g., studies of rhetoric and politeness), where it has, unfortunately,
been widely misconstrued.
The domain of the principles described in Grice (1975) is actually much broader than
usually understood: all intentional use of language, whether literal or not, and regardless
of purpose. That is, Grice intended a broad rather than a narrow interpretation of the
term conversation. Grice's view of the overarching Cooperative Principle: "Make your
conversational contribution such as is required, at the stage at which it occurs, by the
accepted purpose or direction of the talk exchange in which you are engaged" (Grice
1975: 45) is in fact that it is just the linguistic reflex of a more general principle which
governs, in fact, defines, rational behavior: behavior intended to accomplish or enable the
achievement of some purpose or goal. Insofar as these are notions universally
attributable to human beings, such a principle should be universally applicable with regard to
language use. Supposed counterexamples have not held up; for discussion, see Keenan
(1976), Prince (1983), Green (1990). Thus, the Cooperative Principle will figure in the
interpretation of language generally, not just clever talk, and in fact, will figure in the
interpretation of behavior generally, whether communicative or not. Additional facets
of this perspective are discussed in articles 91 (Beaver & Geurts) Presupposition, and 92
(Simons) Implicature.
3.2. The Cooperative Principle
Since Grice was explicit about the Cooperative Principle not being restricted to
linguistic acts, and because the imperative formulation has led to so much
misunderstanding, it is useful to rephrase it more generally, and declaratively, where it amounts
to this:
Individuals act in accordance with their goals.
Grice described four categories (Quantity, Quality, Relevance and Manner) of special
cases of this principle, that is, applications of it to particular kinds of requirements, and
gave examples of their application in both linguistic and non-linguistic domains. It is
instructive to translate these into general, declarative formulations as well:
An agent will do as much as is required for the achievement of the current goal.
(QUANTITY I)
80
I. Foundations of semantics
An agent will not do more than is required. (QUANTITY II)
Agents will not deceive co-agents. (QUALITY)
Consequently, an agent will try to make any assertion one that is true.
(I) An agent will not say what she believes to be false.
(II) An agent will not say that for which she lacks adequate evidence.
An agent's action will be relevant to and relative to an intention of the agent. (RELATION)
An agent will make her actions perspicuous to others who share a joint intention.
(MANNER)
(I) Agents will not disguise actions from co-agents. Consequently, agents will not
speak obscurely in attempting to communicate.
(II) Agents will act so that intentions they intend to communicate are unambiguously
reconstructible.
(III) Agents will spend no more energy on actions than is necessary.
(IV) Agents will execute sub-parts of a plan in an order that will maximize the
perceived likelihood of achieving the goal.
The status of the maxims as just special cases of the Cooperative Principle implies that
they are not (contra Lycan 1984: 75) logical consequences (corollaries), because they
don't follow as necessary consequences in all possible worlds. Nor are they
additional stipulations, or an exhaustive list of special cases. Likewise, the maxims do not
constitute the Cooperative Principle, as some writers have thought (e.g., Sperber & Wilson
1986: 36). On the contrary, the Cooperative Principle is a very general principle which
determines, depending on the values shared by participants, any number of maxims
instantiating ways of conforming to it.
The maxims are not rules or norms that are taught or learned, as some writers would
have it (e.g., Pratt 1981: 11, Brown & Yule 1983:32, Blum-Kulka & Olshtain 1986:175,
Allwood, Anderson & Dahl 1977: 37, Ruhl 1989: 96, Sperber & Wilson 1986: 162). See
Green (1990) for discussion. Rather, they are just particular ways of acting in accordance
with one's goals; all other things being equal, conforming to the Cooperative Principle
involves conforming to all of them. When you can't conform to all of them, as Grice
discusses, you do the best you can.
The premise of the Cooperative Principle, that individuals act in accordance with
their goals is what allows Grice to refer to the Cooperative Principle as a definition of
what it means to be rational (Grice 1975:45,47,48—49), and the maxims as principles that
willy-nilly govern interpersonal human behavior. If X's goal is to get Y to do some act
A, or believe some proposition P, it follows that X will want to speak in such a way that
A or P is clearly identifiable (Maxims of Quality, Quantity, Manner), and X will not say
things that will distract Y from getting the point (Maxim of Relevance). Furthermore,
most likely, X will not want to antagonize Y (everybody's Maxim of Politeness). Cohen
& Levesque (1991) suggest that their analysis of joint intention enables one to
understand "the social contract implicit in engaging in a dialogue in terms of the conversants'
jointly intending to make themselves understood, and to understand the other" (Cohen
& Levesque 1991:509), observing that this would predict the back-channel
comprehension checks that pervade dialogue, as "means to attain the states of mutual belief that
discharge this joint intention of understanding" (Cohen & Levesque 1991:509).
5. Meaning in language use 81_
The characterization of the Cooperative Principle as the assumption that speakers
act rationally (i.e., in accordance with their goals) makes a variety of predictions about
the interpretation of behavior. First of all, it predicts that people will try to interpret
weird, surprising, or unanticipated behavior as serving some unexpected goal before
they discount it as irrational. The tenacity with which we assume that speakers observe
the Cooperative Principle, and in particular the maxim of relevance was illustrated
in Green (1990). That is, speakers assume that other speakers do what they do, say
what they say, on purpose, intentionally, and for a reason (cf. Brown & Levinson 1978:
63). In other words, they assume that speech behavior, and indeed, all behavior that
isn't involuntary, is goal-directed. Speakers "know" the maxims as strategies and
tactics for efficiently achieving goals, especially through speech. A person's behavior will
be interpreted as conforming to the maxims, even when it appears not to, because
of the assumption of rationality (goal-directedness). Hearers will make inferences
about the world or the speaker, or both, whenever novel (that is, previously unas-
sumed) propositions have to be introduced into the context to make the assumption
of goal-directedness and the assumption of knowledge of the strategies consistent with
the behavior. Implicatures arise when the hearer additionally infers that the speaker
intended those inferences to be made, and intended that intention to be recognized (cf.
article 92 (Simons) Implicature). If no such propositions can be imagined, the speaker
will be judged irrational, but irrationality will consist in believing the unbelievable, or
believing something unfathomable, not in having imperfect knowledge of the maxims,
or in not observing the maxims.
Only our imagination limits the goals, and beliefs about the addressee's goals, that we
might attribute to the speaker. If we reject the assumption that the speaker is irrational,
then we must at the very least assume that there is some goal to which his utterance is
relevant in the given context, even if we can't imagine what it is.
Another illustration of the persistence of the assumption that utterances arc acts
executed in accordance with a plan to achieve a goal: even if we know that a sequence
of sentences was produced by a computer, making choices entirely at random within the
constraints of some grammar provided to it, if the sentences can be construed as
connected and produced in the service of a single goal, it is hard not to understand them that
way. That is why output like (9) from random sentence generators frequently produces
giggles.
(9) a. Sandy called the dog.
b. Sandy touched the dog.
c. Sandy wanted the dog.
d. The dog arrived.
e. The dog asked for Kim.
One further point requires discussion here. Researchers eager to challenge or to apply
a Griccan perspective have often failed to appreciate how important it is that discourse
is considered purposive behavior. Grice presumes that participants have goals in
participating (apparently since otherwise they wouldn't be participating). This is the gist of his
remark that "each participant recognizes in [talk exchanges], to some extent, a common
purpose or set of purposes, or at least a mutually accepted direction" (Grice 1975: 45).
82
I. Foundations of semantics
This is perhaps the most misunderstood passage in "Logic and Conversation". Grice is
very vague about these purposes: how many there are, how shared they have to be. With
decades of hindsight, we can see that the purposes are first of all not unique. Conversants
typically have hierarchically embedded goals.
Second, goals are not so much shared or mutual, as they are mutually modelled
(Cohen & Perrault 1979, Cohen & Levesque 1980, Green 1982, Appelt 1985, Cohen &
Levesque 1990, Perrault 1990): for George to understand Martha's utterance of "X" to
George, George must have beliefs about Martha which include Martha's purpose in
uttering "X" to George, which in turn subsumes Martha's model of George, including
George's model of Martha, etc. Grice's assertion (1975: 48) that "in characteristic talk-
exchanges, there is a common aim even if [...] it is a second-order one, namely, that each
party should, for the time being, identify himself with the transitory conversational
interests of the other" is an underappreciated expression of this view.The idea that participants
will at least temporarily identify with each other's interests, i.e., infer what each other is
trying to do, is what allows quarrels, monologues, and the like to be included in the talk
exchanges that the Cooperative Principle governs. Grice actually cited quarrels and letter-
writing as instances that did not fit an interpretation of the Cooperative Principle that
he rejected (Grice 1975: 48), vitiating critiques by Pratt (1981) and Schauber & Spolsky
(1986).The participants may have different values and agendas, but given Grice's (1957)
characterization of conventional meaning, for any communication to occur, each must
make assumptions about the other's goals, at least the low-lcvcl communicative goals.
This is the sense in which participants recognize a "common goal". When the
assumptions participants make about each other's goals are incorrect, and this affects non-trivial
beliefs about each other, we say they are "talking at cross-purposes", precisely because of
the mismatch between actual goals and beliefs, and attributed goals and beliefs.
Interpreting the communicative behavior of other human beings as intentional, and
as relevant, necessary, and sufficient for the achievement of some presumed goal seems
to be unavoidable. As Gould (1991:60) noted, in quite a different context, "humans are
pattern-seeking animals. We must find cause and meaning in all events." This is, of course,
equally true of behavior that isn't intended as communicative. Crucially, we do not seem
to entertain the idea that someone might be acting for no reason.That alternative, along
with the possibility that he isn't even doing what we perceive him to be doing, that it's
just happening to him, is one that we seem reluctant to accept without any independent
support, such as knowledge that people do that sort of thing as a nervous habit, like
playing with their hair.
Thus, "cooperative" in the sense of the Cooperative Principle does not entail each
party accepting all of their interlocutor's goals as their own and helping to achieve them.
Rather, it is most usefully understood as meaning no more - and no less - than 'trying
to understand the interaction from the other participants' 'point of view', i.e., trying to
understand what their goals and assumptions must be. When Grice refers to the
Cooperative Principle as "rational", it is just this assumption that actions are undertaken to
accomplish goals that he has in mind.
3.3. Illocutionary intentions
The second half of the 20th century saw focusscd investigation of speech acts in general,
and illocutionary aspects of meaning in particular, preeminently in the work of Austin
5. Meaning in language use 83
(1962),Searle (1969) and Bach & Harnish (1979). Early on, the view within linguistics
was that (i) illocutionary force was indicated by performative verbs (Austin 1962) or
other Illocutionary-Force-Indicating-Devices (IFIDs,Stampe 1975) such as intonation or
"markers" like preverbal please), and that (ii) where illocutionary force was ambiguous
or unclear, that was because the performative clause prefixed to the sentence (Lakoff
1968, Ross 1970, Fraser 1974, Sadock 1974) was "abstract" and invisible. This issue has
not been much discussed since the mid-1970s, and Dennis Stampe's (1975) conclusion
that so-called performative verbs are really constative, so that the appearance of perfor-
mativity is an inference from the act of utterance may now be the default assumption.
It makes pcrformativity more a matter of the illocutionary intentions of the speaker
than of the classification of visible or invisible markers of illocutionary force. The term
"illocutionary intentions" is meant to include the relatively large but limited set of
illocutionary forces described and classified by Austin (1962) and Searle (1969) and many
others (stating, requesting, promising and the like). So, expositives are utterances which
a speaker (S) makes with the intention that the addressee (A) recognize S's intention
that A believe that S believes their content. Promises are made with the intention that A
recognize S's intention that A believe that S will be responsible for making their content
true. Interrogatives are uttered with the belief that A will recognize S's intention that A
provide information which S indicates.
From the hearer's point of view such illocutionary intentions do not seem significantly
different from intentions about the essentially unlimited class of possible pcrlocutionary
effects that a speaker might intend an addressee to recognize as intended on an occasion
of use. So, reminders are expositives uttered with the intention that A will recognize that
S believes A has believed what S wants A to believe. Warnings are utterances the uttering
of which is intended to be recognized as intended to inform A that some imminent or
contingent state of affairs will be bad for A. An insult is uttered with the intention that
A recognize S's intent to convey S's view of A as having properties normally believed to
be bad. Both kinds of intentions figure in conditions on the use of linguistic expressions
of various sorts. For example, there are a whole host of phrasal verbs (many meaning
roughly 'go away') such as butt out, that can be described as directive-polarity
expressions (instantiating, reporting, or evoking directives), as illustrated in (10), where the
asterisk prefixed to an expression indicates that it is ungrammatical.
(10) a. Butt out!
b. They may butt out.
c. *Why do they butt out?
d. They were asked to butt out.
e. They refused to butt out.
f. Why don't you butt out?
But there are also expressions whose distribution depends on mental states of the
speaker regarding the addressee that are not matters of illocutionary force. Often it
is an intention to produce a particular pcrlocutionary effect, as with threat-polarity
expressions like Or else!, exemplified in (11).
(11) a. Get out, or else!
b. *I want to get out, or else!
84
I. Foundations of semantics
c. *Who's on first, or else?
d. They said we had to get out, or else.
e. They knew they had to get out, or else.
f. They were surprised that we had to get out, or else.
Sometimes, however, the relevant mental state of the speaker docs not involve
intentions at all. This is the case with ignorance-polarity constructions and idioms which are
acceptable only in contexts where a relevant being (typically the speaker) is ignorant of
the answer to an indicated question, as in (12) (cf. Horn 1972,1978).
(12) a. Where the hell is it?
b. I couldn't figure out where the hell it was.
c. *We went back to the place where the hell we left it.
d. *I knew where the hell it would be found.
It is possible to represent this kind of selection formally, say, in terms of conjoined and/or
nested propositions representing speaker intentions and speaker and addressee beliefs
about intentions, actions, existence, knowledge and values. But it is not the linguistic
sign which is the indicator of illocutionary intentions, it is the act of uttering it. An
utterance is a warning just in case S intends A to recognize the uttering of it as intended to
cause A to recognize that some situation may result in a state which A would consider
bad. In principle, such conditions could be referenced as constraints in the lexical entries
for restricted forms. A threat-polarity item like or else! presupposes a warning context,
with the additional information that the speaker intends the addressee to recognize (that
the speaker intends the addressee to recognize) that if the warning is not heeded, some
individual or situation will be responsible for some state of affairs that endangers the
referent of the subject of the clause to which or else is appended.
Illocutionary force, like dcixis, is an aspect of meaning that is fairly directly derivative
of the use of a linguistic expression. At the other end of the spectrum, lexical meaning,
which appears much more concrete and fixed, has also been argued to depend on the sort
of social contract that the Cooperative Principle engenders, as explicated by Nunberg
(1978,2004). This is the topic of Section 4.
4. Implications for word meaning
In general, to the extent that we are able to understand each other, it is because we
all use language in accordance with the Cooperative Principle. This entails (cf. Grice
1957) that we will only use a referential term when we believe that our addressee will
be able to identify our intended referent from our reference to it by that term in that
context, and will believe that we intended him to do so. But the bottom line is that the
task of the addressee is to deduce what sense the speaker most likely intended, and he
will use all available clues to do so, without regard to whether they come from within
or outside the immediate sentence or bit of discourse at hand. This means that even
so-called literal meanings have an indexical character in depending on the speaker's
ability to infer correctly what an addressee will assume a term is intended to refer to on
an occasion of use (cf. Nunberg 1978). Even the sense of predicative lexical items in an
utterance is not fixed by or in a linguistic system,but can only be deduced in connection
5. Meaning in language use 85
with assumptions about the speaker's beliefs about the knowledge and expectations of
the addressee.
To illustrate, as has been often noted (cf. Ruhl 1989, Green 1998), practically any word
can be used to denote an almost limitless variety of kinds of objects or functions: in
addition to referring to a fruit, or a defective automobile, lemon might refer to the wood of
the lemon tree, as in (13a), to the flavor of the juice of the fruit (13b), to the oil from the
peel of the fruit (13c), to an object which has the color of the fruit (13d), to something the
size of the fruit (13e), and to a substance with the flavor of the fruit (13f).These are only
the most obvious uses from an apparently indefinitely large set.
(13) a. Lemon has an attractive grain, much finer than beech or cherry.
b. I prefer the '74 because the '73 has a lemon aftertaste.
c. Lemon will not penetrate as fast as linseed.
d. The lemon is too stretchy, but the coral has a snag in it.
e. Shape the dough into little lemons, and let rise.
f. Two scoops of lemon, please, and one of Rocky Road.
The idea that what a word can be used to refer to might vary indefinitely is clearly
unsettling. It makes the fact that we (seem to) understand each other most of the time
something of a miracle, and it makes the familiar, comfortable Conduit Theory of
communication (critiqued in Reddy 1979), according to which speakers encode ideas in
words and sentences and send them to addressees to decode, quite irrational. But the
conclusion that as language users we are free to use any word to refer to anything at all,
any time we want is unwarranted. Lexical choice is always subject to the pragmatic
constraint that we have to consider how likely it is that our intended audience will be able to
correctly identify our intended referent from our use of the expression we choose. What
would really be irrational would be using a word to refer to anything other than what we
estimate our addressee is likely to take it to refer to, because it would be self-defeating.
Thus, in spite of all the apparent freedom afforded to language users, rationality severely
limits what a speaker is likely to use a term to refer to in a given context. Since people
assume that people's actions are goal-directed (so that any act will be assumed to have
been performed for a reason), a speaker must be assumed to believe that, all things
considered, the word she chooses is the best word to further her goals in its context and with
respect to her addressee.
Speakers frequently exploit the freedom they have, within the bounds of this
constraint, referring to movies as turkeys, cars as lemons, and individuals in terms of objects
associated with them, as when we say that the flute had to leave to attend his son's soccer
game, or that the corned beef spilled his beer. If this freedom makes communication
sound very difficult to effect, and very fragile, it is important to keep in mind that we are
probably less successful at it than we think we are, and generally oblivious of the work
that is required as well. But it is probably not really that fragile. Believing as an
operational principle in the convenient fiction that words have fixed meanings is what makes
using them to communicate appear to require no effort. If we were aware of how much
interpretation we depended on each other to do to understand us, we might hesitate to
speak. Instead, wc all act as if wc believe, and believe that everyone else believes, that
the denotation an individual word may have on an occasion of use is limited, somewhat
arbitrarily, as a matter of linguistic convention. Nunberg (1978), extending observations
86
I. Foundations of semantics
made by Lewis (1969), called this sort of belief a normal belief, defined so that the
relation normally-believe holds of a speech community and a proposition P when people
in that community believe that it is normal (i.e., unremarkable, to be expected) in that
community to believe P and to believe that everyone in that community believes that
it is normal in that community to believe P. See also Stalnaker (1974) and Atlas (2004).
(The term speech community, following Nunberg (1978), is not limited to geographical
or political units, or even institutionalized social units, but encompasses any group of
individuals with common interests. Thus, we all belong simultaneously to a number of
speech communities, depending on our interests and backgrounds; we might be women
and mothers and Lutherans and lawyers and football fans and racketball players, who
knit and surf the internet, and arc members of countless speech communities besides
these.) Illustrating the relevance to traditional semantic concerns of the notion of normal
belief, it is normal beliefs about cars and trucks, and about what properties of them good
old boys might find relevant that would lead someone to understand the coordination
in (14a) with narrow adjectival scope and that in (14b) with wider scope: for example,
because of the aerodynamic properties of trucks relative to cars, fast truck is almost an
oxymoron.
(14) a. The good ol'boys there drive fast cars and trucks,
b. The good ol' boys there drive red cars and trucks.
This technical use of normal belief should not be confused with other notions that
may have the same name. A normal belief in the sense intended is only remotely related
to an individual's belief about how things normally are, and only remotely related (in
a different direction) to a judgement that it is unremarkable to hold such a belief. The
beliefs that are normal within a community are those that "constitute the background
against which all utterances in that community are rationally made" (Nunberg 1978:
94-95).
Addressing the issue of using words to refer to things, properties, and events, what it
is considered normal to use a word like tack or host or rock or metal to refer to varies
with the community. These are social facts, facts about societies, and only incidentally
and contingently and secondarily facts about words. More precisely, they are facts about
what speakers believe other speakers believe about conventions for using words.Thus, it
is normal among field archaeologists to use mesh bound in frames to sift through
excavated matter for remnants of material culture, and it is normally believed among them
that this is normal, and that it is normal to refer to the sieves as screens. Likewise, among
users of personal computers, it is normally believed that the contents of a data file maybe
inspected by projecting representations of portions of it on an electronic display, and it is
normally believed that this belief is normally held, and that it is normal to refer to the
display as a screen. Whether screen is (intended to be) understood as (normally) referring to
a sort of sieve or to a video display depends on assumptions made by speaker and hearer
about the assumptions each makes about the other's beliefs, including beliefs about what
is normal in a situation of the sort being described, and about what sort of situation (each
believes the other believes) is being discussed at the moment of utterance. This is what
makes word meaning irrcducibly a matter of language use. Although different senses
of a word may sometimes have different syntactic distributions (so-called selectional
restrictions), McCawley (1968) showed that this is not so much a function of the words
5. Meaning in language use 87
themselves as it is a function of properties that language users attribute to the presumed
intended referents of the words.
Normal use is defined in terms of normal belief, and normal belief is an intensional
concept. If everybody believes that everybody believes that it is normal to believe P,
then belief in P is a normal belief, even if nobody actually believes P. In light of this, we
are led to a view of word usage in which, when a speaker rationally uses a word w to
indicate some intended individual or class a, she must assume that the addressee will
consider it rational to use w to indicate a in that context. She must assume that if she and
her addressee do not in fact have the same assumptions about what beliefs are normal
in the community-at-large, and in every relevant subgroup, at least the addressee will be
able to infer what relevant beliefs the speaker imputes to the addressee, or expects the
addressee to impute to the speaker, and so on, in order to infer the intended referent.
If we define an additional, recursive relation mutually-believe as holding among two
sentient beings A and B and a proposition when A believes the proposition, believes that
B believes the proposition, believes that B believes that A believes the proposition, and
so on (cf. Cohen & Levesque 1990), then we can articulate the notion normal meaning
(not to be confused with 'normal referent out of context' - cf. Green 1995: 14f, Green
1996:59 for discussion):
some set (or property) m is a normal meaning (or denotation) of an expression w insofar as
it is normally believed that w is used to indicate m.
A meaning m for an expression w is normal in a context insofar as speaker and addressee
mutually believe that it is normally believed that w is used to indicate m in that context.
We can then say that 'member of the species canis familiaris' is a normal meaning for the
word dog insofar as speaker and addressee mutually believe that it is normally believed
in their community that such entities are called dogs.
Some uses of referential expressions like those exemplified in (13) are not so much
abnormal or less normal than others as they are normal in a more narrowly defined
community. In cases of systematic polysemy, all the use-types (or senses), whether normal in
broadly defined or very narrowly exclusive communities, are relatable to one another in
terms of functions like 'source of,'product of,'part of,'mass of, which Nunberg (1978)
characterized as referring functions (for discussion, see Nunberg 1978,2004, Green 1996,
1998, Pelletier & Schubert 1986, Nunberg & Zaenen 1992, Copestake & Briscoe 1995,
Helmreich 1994). For example, using the word milkshake as in (15) to refer to someone
who orders a milkshake exploits the referring function 'purchaser of, and presumes a
mutual belief that it is normal for restaurant personnel to use the name of a menu item to
refer to a purchaser of that item, or more generally, for sales agents to use a description
of a purchase to refer to the purchaser.
(15) The milkshake claims you kicked her purse.
This is in addition, of course, to the mutual belief it presumes about what the larger class
of English speakers normally use milkshake to refer to, and the mutual belief that the
person identified as the claimant ordered a milkshake.
The assumption that people's actions are purposeful, so that any act will be assumed
to have been performed for a reason, is a universal normal belief - everyone believes
88
I. Foundations of semantics
it and believes that everyone believes it (cf. Green 1993). The consequence of this for
communicative acts is that people intend and expect that interpreters will attribute
particular intentions to them, so consideration of just what intention will be attributed
to speech actions must enter into rational utterance planning (cf. Green 1993, also
Sperber & Wilson 1986). This is the Gricean foundation of this theory (cf. also Neale
1992).
If the number of meanings for a given lexical term is truly indefinitely extendable (as
it appears to be), or even merely very large, it is impractical in the extreme to try to list
them. But the usual solution to the problem of representing an infinite class in a finite
(logical) space is as available here as anywhere else, at least to the extent that potential
denotations can be described in terms of composablc functions on other denotations, and
typically, this is the case (Nunberg 1978: 29-62). It is enough to know, Nunberg argues,
that if a term can be used to refer to some class X, then it can be used, given appropriate
context, to refer to objects describable by a recognizable function on X. This principle
can be invoked recursively, and applies to functions composed of other functions, and to
expressions composed of other expressions, enabling diverse uses like those for lemon in
(13) to be predicted in a principled manner.
Because the intended sense (and thence the intended referent) of an utterance of any
referential term ultimately reflects what the speaker intends the hearer to understand
from what the speaker says by recognizing that the speaker intends him to understand
that, interpreting an utterance containing a polysemic ambiguity (or indeed, any sort of
ambiguity) involves doping out the speaker's intent, just as understanding a speaker's
discourse goals does. For additional discussion,see also articles 25 (de Swart) Mismatches
and coercion and 26 (Tyler & Takahashi) Metaphors and metonymies.
5. The nature of context, and the relationship of pragmatics
to semantics
5.1. Disambiguation and interpretation
Pragmatic information is information about the speaker's mental models.
Consequently, such semantic issues as truth conditions and determination of ambiguity
are interdependent with issues of discourse interpretation, and the relevant contexts
essential to the resolution of both are not so much the surrounding words and phrases
as they are abstractions from the secular situations of utterance, filtered through the
minds of the participants (cf. Stalnaker 1974). As a result, it is unreasonable to expect
that ambiguity resolution independent of models of those situations can be
satisfactory. Linguistic pragmatics irrcducibly involves the speaker's model of the addressee,
and the hearer's model of the speaker (potentially recursively). For George to
understand Martha's utterance of "X" to him, he must not only recognize (speech
perception, parsing) that she has said "X," he must have beliefs about her which allow him to
infer what her purpose was in uttering "X," which means that he has beliefs about her
model of him, including her model of his model of her, and so on. Any of these beliefs is
liable to be incorrect at some level of granularity. George's model of Martha (and hers
of him) is more like a sketch than a photograph: there are lots of potentially relevant
things they don't know about each other, and they most likely have got a few
(potentially relevant) things wrong right off the bat as well. Consequently, even under the
5. Meaning in language use 89
best of circumstances, whatever proposition George interprets Martha's utterance to
be expressing may not exactly match the proposition she intended him to understand.
The difference, which often is not detected, may or may not matter in the grand scheme
of things. Since acts are interpreted at multiple levels of granularity, this holds for the
interpretation of acts involved in choosing words and construction types, as well as for
acts of uttering sentences containing or instantiating them. From this it follows that the
distinction between pragmatic effects called "intrusive pragmatics" (Levinson 2000)
or explicature (Sperber & Wilson 1986) and those described as implicature proper has
little necessary significance for an information-based account of language structure.
Because the computation of pragmatic effects by whatever name involves analysis of
what is undcrspccificd in the actual utterance, and how what is uttered compares to
what might reasonably have been expected, it involves importing propositions, a
process beyond the bounds of computation within a finite domain, even if supplemented
by a finite set of functions.
When a reader or hearer recognizes that an utterance is ambiguous, resolving that
ambiguity amounts to determining which interpretation was intended. When
recognizing an ambiguity affects parsing, resolving it may involve grammar and lexicon, as
for example, when it involves a form which could be construed as belonging to different
syntactic categories (e.g., The subdivision houses most of the officers vs. The subdivision
houses are very similar, or Visiting relatives is a lot of fun vs. Visiting relatives are a lot of
fun). But grammar and lexicon may not be enough to resolve such ambiguities, as in the
case of familiar examples like (16).
(16) a. I saw her duck.
b. Visiting relatives can be a lot of fun.
They will rarely suffice to resolve polysemies or attachment ambiguities like / returned
the key to the library. In all of these cases, it is necessary to reconstruct what it would
be reasonable for the speaker to have intended, given what is known or believed about
the beliefs and goals of the speaker, exactly as when seeking to understand the
relevance of an unambiguous utterance in a conversation - that is, to understand why the
speaker bothered to utter it, or to say it the way she said it.The literature on natural
language understanding contains numerous demonstrations that the determination of how
an ambiguous or vague term is intended to be understood depends on identifying the
most likely model of the speaker's beliefs and intentions about the interaction. Nunberg
(1978: 84-87) discusses the beliefs and goals that have to be attributed to him in order
for his uncle to understand what he means by jazz when he asks him if he likes jazz.
Crain & Steedman (1985), and Altmann & Steedman (1988) offer evidence that
experimentally controllable aspects of context that reflect speakers' beliefs about situations
affect processing in more predictable ways than mechanistic parsing strategies. Green
(1996:119-122) describes what is involved in identifying what is meant by IBM, at, and
seventy-one in Sandy bought IBM at 71.
At the other end of the continuum of grammatical and pragmatic uncertainty is
Sperber & Wilson's (1986: 239-241) discussion of the process of understanding irony
and sarcasm, as when one says I love people who don't signal, intending to convey'I hate
people who don't signal.' (A further step in the process is required to interpret / love
people who signal! as intended to convey the same thing; the difference is subtle, because
90
I. Foundations of semantics
the contextual conditions likely to provoke the two utterances are in a subset relation.
One might say / love people who don't signal to inform an interlocutor of one's
annoyance at someone who the speaker noticed did not signal, but / love people who signal
is likely to be used sarcastically only when the speaker believes that it is obvious to the
addressee that someone should have signaled and did not.)
Nonetheless, it may be instructive here to examine the resolution of a salient lexical
ambiguity. Understanding the officials' statement in example (17) involves comparing
how different assumptions about mutual beliefs about the situation are compatible
with different interpretations, in order to determine whether plant refers to a vegetable
organism or to the production facility of a business (or perhaps just to the apparatus
for controlling the climate within it, or maybe to the associated grounds, offices and
equipment generally), or even to some sort of a decoy.
(17) Officials at International Seed Co. becfed up security at their South
Carolina facility in the face of rumors that competitors would stop at nothing to get
specimens of a newly-engineered variety, saying, "That plant is worth $5 million."
If the interpreter supposes that what the company fears is simply theft of samples of the
organism, she will take the official as intending plant to refer to the (type of the) variety:
being able to market tokens of that type represents a potential income of $5 million.
On the other hand, if the interpreter supposes that the company fears damage to their
production facility or the property surrounding it - say, because she knows that samples
of the organism are not even located at the production facility any more, and/or that
extortionists have threatened to vandalize company property if samples of the organism
are not handed over, she is likely to take plant as intended to refer to buildings and
grounds or equipment. Believing that the company believes that potential income from
marketing the variety is many times greater than $5 million would have the same effect.
If the interpreter believes that the statement was made in the course of an interview
where a company spokesperson discussed the cost of efforts to protect against industrial
espionage, and mentioned how an elaborate decoy system had alerted them to a threat
to steal plant specimens, she might even take plant as intended to refer to a decoy. The
belief that officials believe that everyone relevant believes that both the earnings
potential of the organism, and the value of relevant structures and infrastructure are orders of
magnitude more or less than $5 million would contribute to this conclusion, and might
even suffice to induce it on its own.
Two points are relevant here. First, depending on how much of the relevant
information is salient in the context in which the utterance is being interpreted, the sentence
might not even be recognized as ambiguous. This is equally true in the case of
determining discourse intents. For example, identifying sarcastic intent is similar to
understanding the reference of an expression in that it depends on attributing to the speaker
intent to be sarcastic. (Such an inference is supported by finding a literal meaning to
be in conflict with propositions assumed to be mutually believed, but this is neither
necessary nor sufficient for interpreting an utterance as sarcastic.) Being misled in
the attribution of intent is a common sort of misunderstanding, indeed, a sort that is
likely to go undetected. These issues are discussed at length in articles 23 (Kennedy)
Ambiguity and vagueness, 38 (Dekker) Dynamic semantics, 88 (Jaszczolt) Semantics
5. Meaning in language use 91_
and pragmatics, 89 (Zimmermann) Context dependency and 94 (Potts) Conventional
implicature.
Second, there is nothing linguistic about the resolution of the lexical ambiguity in (17).
All of the knowledge that contributes to the identification of a likely intended referent
is encyclopedic or contextual knowledge of (or beliefs about) the relevant aspects of the
world, including the beliefs of relevant individuals in it (e.g., the speaker, the (presumed)
addressees of the quoted speech, the reporter of the quoted speech, and the (presumed)
addressees of the report).That disambiguation of an utterance in its context may require
encyclopedic knowledge of a presumed universe of discourse is hardly a new
observation; it has been a commonplace in the linguistics and Artificial Intelligence literature
for decades. Its pervasiveness and its significance sometimes seem to be surprisingly
underappreciated.
A similar demonstration could be made for many structural ambiguities, including
some of the ones mentioned at the beginning of this section. Insofar as language users
resolve ambiguities that they recognize by choosing the interpretation most consistent
with their model of the speaker and of the speaker's model of the world, modelling this
ability of theirs by means of probability-based grammars and lexicons (e.g., Copestake
& Briscoe 1995) is likely to provide an arbitrarily limited solution. When language users
fail to recognize ambiguities in the first place, it is surely because beliefs about the
speaker's beliefs and intentions in the context at hand which would support alternative
interpretations are not salient to them.
This view treats disambiguation, at all levels, as indistinct from interpretation, insofar
as both involve comparing an interpretation of a speaker's utterance with goals and
beliefs attributed to that speaker, and rejecting interpretations which in the context are
not plausibly relevant to the assumed joint goal for the discourse. This is a conclusion
that is unequivocally based in Gricc's seminal work (Gricc 1957,1975), and yet it is one
that he surely did not anticipate. See also Atlas (2004).
5.2. Computational modelling
Morgan (1973) showed that determining the presuppositions of an utterance depends
on beliefs attributed to relevant agents (e.g., the speaker, and agents and experiences
of propositional attitude verbs) and is not a strictly linguistic matter. Morgan's account,
and Gazdar's (1979) formalization of it, show that the presuppositions associated with
lexical items are filtered in being projected as presuppositions of the sentence of which
they form a part, by conversational implicature, among other things. (See also articles
37 (Kamp & Reyle) Discourse Representation Theory, 91 (Beaver & Geurts)
Presupposition and 92 (Simons) Implicature.) Conversational implicature, as discussed in section
3 of this article is a function of a theory of human behavior generally, not something
specifically linguistic, because it is based on inference of intentions for actions generally,
not on properties of the artifacts that are the result of linguistic actions: conversational
implicatures arise from the assumption that it is reasonable (under the particular
circumstances of the speech event in question) to expect the addressee A to infer that the
speaker S intended A to recognize S's intention from the fact that the speaker uttered
whatever she uttered. It would be naive to expect that the filtering in the projection of
presuppositions could be represented as a constraint or set of constraints on values of
92
I. Foundations of semantics
any discrete linguistic feature, precisely because conversational implicaturc is inherently
indeterminate (Grice 1975, Morgan 1973,Gazdar 1979).
Despite the limitations on the computation and attribution of assumptions, much
progress has been made in recent years on simultaneous disambiguation and parsing of
unrestricted text. To take only the example that I am most familiar with, Russell (1993)
describes a system which unifies syntactic and semantic information from partial parses
to postulate (partial) syntactic and semantic information for unfamiliar words.This
progress suggests that a promising approach to the problem of understanding unrestricted
texts would be to expand the technique to systematically include information about
contexts of utterance, especially since it can be taken for granted that words will be
encountered which arc being used in unfamiliar ways.The chief requirement for such an
enterprise is to reject (following Reddy 1979) the simplistic view of linguistic expressions
as simple conduits for thoughts, and model natural language use as action of rational
agents who treat the exchange of ideas as a joint goal, as Grice, Cohen, Perrault, and
Levesque have suggested in the articles cited. This is a nontrivial task, and if it does not
offer an immediate payoff in computational efficiency, ultimately it will surely pay off in
increased accuracy, not to mention in understanding the subtlety of communicative and
interpretive techniques.
6. Summary
Some of the conclusions outlined here are surely far beyond what Bar-Hillcl articulated in
1954, and what Grice may have had in mind in 1957 or 1968, the date of the William James
Lectures in which the Cooperative Principle was articulated. Nonetheless, our present
understanding rests on the shoulders of their work. Bar-Hillel (1954) demonstrated that it
is not linguistic forms that carry pragmatic information, but the facts of their utterance.The
interpretation of first and second person pro nouns, tenses, and deictic adverbials are only the
most superficial aspect of this. Bar-Hillel's observations on the nature of indexicals form the
background for the development of context-dependent theories of semantics by Stalnaker
(1970), Kamp( 1981 ),Hcim (1982) and others.Starting from a very different set of
considerations, Grice (1957) outlined how a speaker's act of using a linguistic form to communicate
so-called literal meaning makes critical reference to speaker and hearer, with far-reaching
consequences. Grice's (1957) insight that conventional linguistic meaning depends on
members of a speech community recognizing speakers' intentions in saying what they say the
way they say it enabled him to sketch how this related to lexical meaning and
presupposition, and (in more detail) implied meaning. While the view presented here has its origins in
Grice's insight, it is considerably informed by elaborations and extensions of it over the past
40 years. The distinction between natural meaning (entailment) and non-natural
meaning (conventional meaning) provided the background against which his theory of the
social character of communication was developed in the articulation of the Cooperative
Principle. While the interpersonal character of illocutionary acts was evident from the
first discussions, it was less obvious that lexical meaning, which appears much more
concrete and fixed, could also be argued to depend on the sort of social contract that the
Cooperative Principle engenders. Subsequent explorations into the cooperative aspects of
action generally make it reasonable to anticipate a much more integrated understanding
of the interrelations of content and context, of meaning and use, than seemed likely forty
years ago.
5. Meaning in language use 93
7. References
Allwood, Jens, Lars-Gunnar Anderson & Osten Dahl 1977. Logic in Language. Cambridge:
Cambridge University Press.
Altmann, Gerry & Mark Steedman 1988. Interaction with context during sentence processing.
Cognition 30,191-238.
Appelt, Douglas E. 1985. Planning English Sentences. Cambridge: Cambridge University Press.
Atlas, Jay David 2004. Presupposition. In: L. R. Horn & G. Ward (cds.). The Handbook of
Pragmatics. Oxford: Blackwell, 29-52.
Austin, John L. 1962. How to Do Things with Words. Cambridge, MA: Harvard University Press.
Bach. Kent & Robert M. Harnish 1979. Linguistic Communication and Speech Acts. Cambridge,
MA:The MIT Press.
Bar-Hillel, Yehoshua 1954. Indexical expressions. Mind 63,359-379.
Blum-Kulka.Shoshana & Elite Olshtain 1986.Too many words: Length of utterance and pragmatic
failure. Studies in Second Language Acquisition 8,165-180.
Brown, Gillian & George Yule 1983. Discourse Analysis. Cambridge: Cambridge University Press.
Brown, Penelope & Stephen Levinson 1978. Universals in language usage: Politeness phenomena.
In: E. Goody (ed.). Questions and Politeness: Strategies in Social Interaction. Cambridge:
Cambridge University Press, 56-311. (Expanded version published 1987 in book form as
Politeness, by Cambridge University Press.)
Cohen, Philip R. & Hector J. Levesque 1980. Speech acts and the recognition of shared plans.
Proceedings of the Third National Conference of the Canadian Society for Computational Studies of
Intelligence. Victoria, BC: University of Victoria. 263-271.
Cohen, Philip R. & Hector J. Levesque 1990. Rational interaction as the basis for communication.
In: P. Cohen, J. Morgan & M. Pollack (eds.). Intentions in Communication. Cambridge, MA:The
MIT Press,221-256.
Cohen. Philip R. & Hector J. Levesque 1991. Teamwork. Noils 25,487-512.
Cohen, Philip R. & C. Ray Peirault 1979. Elements of a plan based theory of speech acts. Cognitive
Science 3,177-212.
Copestake, Ann & Ted Briscoe 1995. Semi-productive polysemy and sense extension. Journal of
Semantics 12,15-67.
Crain, Stephen & Mark Steedman 1985. On not being led up the garden path: The use of context
by the psychological parser. In: D. R. Dowty, L. Karttunen & A. M. Zwicky (cds.). Natural
Language Parsing. Cambridge: Cambridge University Press, 320-358.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Reidel.
Fraser, Bruce 1974. An examination of the performative analysis. Papers in Linguistics 7,1-40.
Gazdar, Gerald 1979. Pragmatics, Implicature, Presupposition, and Logical Form. New York:
Academic Press.
Gould, Stephen J. 1991. Bully for Brontosaurus. New York: W. W. Norton.
Green, Georgia M. 1982. Linguistics and the pragmatics of language use. Poetics 11,45-76.
Green. Georgia M. 1990. On the universality of Gricean interpretation. In: K. Hall et al. (eds.).
Proceedings of the 16th Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley
Linguistics Society, 411-428.
Green, Georgia M. 1993. Rationality and Gricean Inference. Cognitive Science Technical Report
UIUC-BI-CS-93-09. Urbana. IL, University of Illinois.
Green, Georgia M. 1995. Ambiguity resolution and discourse interpretation. In: K. van Deemter &
S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications,
1-26.
Green, Georgia M. 1996. Pragmatics and Natural Language Understanding. 2nd edn. Hillsdale, NJ:
Lawrence Erlbaum Associates.
Green, Georgia M. 1998. Natural kind terms and a theory of the lexicon. In: E. Antonsen (ed.).
Studies in the Linguistic Sciences 28. Urbana, IL: Department of Linguistics, University of
Illinois, 1-26.
94
I. Foundations of semantics
Grice, H. Paul 1957. Meaning. Philosophical Review 66,377-388.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole & J. L. Morgan (eds.). Syntax and Semantics
3: Speech Acts. New York: Academic Press, 41-58.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation,
University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms.
Hehnreich, Stephen 1994. Pragmatic Referring Functions in Montague Grammar. Ph.D.
dissertation. University of Illinois, Urbana, IL.
Hinrichs. Erhard 1986. Temporal anaphora in discourses of English. Linguistics & Philosophy 9,
63-82.
Horn, Laurence R. 1972. On the Semantic Properties of Logical Operators in English. Ph.D.
dissertation. University of California, Los Angeles, CA. Reprinted: Bloomington, IN: Indiana
University Linguistics Club. 1976.
Horn, Laurence R. 1978. Some aspects of negation. In: J. Greenberg, C. Ferguson & E. Moravcsik
(eds.). Universals of Human Language, vol. 4, Syntax. Stanford. CA: Stanford University Press,
127-210.
Kainp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T M.
V. Janssen & M. Stokhof (eds.). Formal Methods in the Study of Language. Amsterdam:
Mathematical Centre, 277-321.
Keenan. Elinor O. 1976. The universality of conversational implicature. Language in Society 5,
67-80.
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural
Language. Dordrecht: Reidel, 253-355 and 763-769.
Lakoff, Robin L. 1968. Abstract Syntax and Latin Complementation. Cambridge, MA: The MIT
Press.
Levinson, Stephen C. 2000. Presumptive Meanings: The Theory of Generalized Conversational
Implicature. Cambridge, MA:Thc MIT Press.
Lewis, David 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University
Press.
Lycan.William G. 1984. Logical Form in Natural Language. Cambridge, MA:The MIT Press.
McCawlcy. James D. 1968. The role of semantics in a grammar. In: E. Bach & R.T. Harms (eds.).
Universals in Linguistic Theory. New York: Holt, Rinchart & Winston, 124-169.
McCawley, James D. 1971.Tense and time reference in English. In: C. Fillmore & D.T Langendoen
(eds.). Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston, 97-113.
Morgan. Jerry L. 1973. Presupposition and the Representation of Meaning: Prolegomena. Ph.D.
dissertation. University of Chicago, Chicago, IL.
Morgan, Jerry L. 1978. Two types of convention in indirect speech acts. In: P. Cole (ed.). Syntax and
Semantics 9: Pragmatics. New York: Academic Press, 261-280.
Neale, Stephen 1992. Paul Grice and the philosophy of language. Language and Philosophy 15,
509-559.
Nunberg, Geoffrey 1978. The Pragmatics of Reference. Ph.D. dissertation. CUNY, New York.
Reprinted: Bloomington, IN: Indiana University Linguistics Club. (Also published by Garland
Publishing, Inc.)
Nunberg, Geoffrey 1979. The non-uniqueness of semantic solutions: Polysemy. Linguistics &
Philosophy 3,145-185.
Nunberg. Geoffrey 1993. Indexicality and deixis. Linguistics & Philosophy 16,1-44.
Nunberg. Geoffrey 2004. The pragmatics of deferred interpretation. In: L. R. Horn & G. Ward
(eds.). The Handbook of Pragmatics. Oxford: Blackwcll, 344-364.
Nunberg, Geoffrey & Annie Zaenen 1992. Systematic polysemy in lexicology and lexicography. In:
H.Toinmola, K. Varantola, & J. Schopp (eds.). Proceedings of Euralex 92. Part II. Tampere: The
University of Tampere, 387-398.
Partee, Barbara 1973. Some structural analogues between tenses and pronouns in English. Journal
of Philosophy 70,601-609.
5. Meaning in language use 95
Partee, Barbara 1989. Binding implicit variables in quantified contexts. In: C. Wiltshire, B. Music
& R. Graczyk (eds.). Papers from the 25th Regional Meeting of the Chicago Linguistic Society.
Chicago, IL: Chicago Linguistic Society, 342-365.
Pelletier, Francis J. & Lenhart K. Schubert 1986. Mass expressions. In: D. Gabbay & F. Guenthner
(eds.). Handbook of Philosophical Logic, vol. 4. Dordrecht: Reidel, 327-407.
Perrault, Ray 1990. An application of default logic to speech act theory. In: P. Cohen, J. Morgan &
M. Pollack (eds.). Intentions in Communication. Cambridge, MA:The MIT Press, 161-186.
Pratt.Mary L. 1981. The ideology of speech-act theory. Centrum (N.S.) 1:1,5-18.
Prince, Ellen 1983. Grice and Universality. Ms. Philadelphia, PA. University of Pennsylvania, http.7/
www.ling.upenn.edu/-ellen/grice.ps.August 3,2008.
Putnam, Hilary 1975. The meaning of 'meaning.' In: K. Gunderson (ed.). Language, Mind, and
Knowledge. Minneapolis, MN: University of Minnesota Press, 131-193.
Recanati, Francois 2004. Pragmatics and semantics. In: L. R. Horn & G. Ward (eds.). The Handbook
of Pragmatics. Oxford: Blackwell, 442^62.
Reddy, Michael 1979.The conduit metaphor - a case of frame conflict in our language about
language. In: A. Ortony (ed.). Metaphor and Thought. Cambridge: Cambridge University Press,
284-324.
Ross, John R. 1970. On declarative sentences. In: R. Jacobs & P. S. Roscnbaum (eds.). Readings in
English Transformational Grammar. Waltham, MA: Ginn, 222-272.
Ruhl. Charles 1989. On Monosemy. Albany, NY: SUNY Press.
Russell, Dale W. 1993. Language Acquisition in a Unification-Based Grammar Processing System
Using a Real-World Knowledge Base. Ph.D. dissertation. University of Illinois, Urbana, IL.
Sadock, Jcrrold M. 1974. Toward a Linguistic Theory of Speech Acts. New York: Academic Press.
Schauber, Ellen & Ellen Spolsky 1986. The Bounds of Interpretation. Stanford, CA: Stanford
University Press.
Seal le. John. 1969. Speech Acts. Cambridge: Cambridge University Press.
Sperber, Dan & Deirdre Wilson 1986. Relevance: Communication and Cognition. Cambridge, MA:
Harvard University Press.
Stalnaker, Robert C. 1970. Pragmatics. Synthese 22,272-289.
Stalnaker, Robert C. 1972. Pragmatics. In: D. Davidson & G. Harman (eds.). Semantics of Natural
Language. Dordrecht: Reidel, 380-397.
Stalnaker, Robert C. 1974. Pragmatic presuppositions. In: M. K. Munitz & P. K. Ungcr (eds.).
Semantics and Philosophy. New York: New York University Press, 197-214.
Stanipe, Dennis 1975. Meaning and truth in the theory of speech acts. In: P. Cole & J. L. Morgan
(eds.). Syntax and Semantics 3: Speech Acts. New York: Academic Press, 1-40.
Georgia M. Green, Urbana, IL (USA)
96
I. Foundations of semantics
6. Compositionality
1. Background
2. Grammars and semantics
3. Variants and properties of compositionality
4. Arguments in favor of compositionality
5. Arguments against compositionality
6. Problem cases
7. References
Abstract
This article is concerned with the principle of compositionality, i.e. the principle that the
meaning of a complex expression is a function of the meanings of its parts and its mode of
composition. After a brief historical background, a formal algebraic framework for syntax
and semantics is presented. In this framework, both syntactic operations and semantic
functions are (normally) partial. Using the framework, the basic idea of compositionality
is given a precise statement, and several variants, both weaker and stronger, as well as
related properties, are distinguished. Several arguments for compositionality are discussed,
and the standard arguments are found inconclusive. Also, several arguments against
compositionality, and for the claim that it is a trivial property, are discussed, and are found
to be flawed. Finally, a number of real or apparent problems for compositionality are
considered, and some solutions are proposed.
1. Background
Compositionality is a property that a language may have and may lack, namely the
property that the meaning of any complex expression is determined by the meanings of its
parts and the way they are put together. The language can be natural or formal, but it
has to be interpreted. That is, meanings, or more generally, semantic values of some sort
must be assigned to linguistic expressions, and compositionality concerns precisely the
distribution of these values.
Particular semantic analyses that are in fact compositional were given already in
antiquity, but apparently without any corresponding general conception. For instance, in
Sophist, chapters 24-26, Plato discusses subject-predicate sentences, and suggests (pretty
much) that such a sentence is true [false] if the predicate (verb) attributes to what the
subject (noun) signifies things that are [are not]. Notions that approximate the modern
concept of compositionality did emerge in medieval times. In the Indian tradition, in the
4th or 5th century CE, Sabara says that
The meaning of a sentence is based on the meaning of the words.
and this is proposed as the right interpretation of a sutra by Jaimini from sometime 3rd-
6th century BCE (cf. Houben 1997:75-76).The first to propose a general principle of this
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 96-123
6. Compositionality 97
nature in the Western tradition seems to have been Peter Abelard (2008,3.00.8) in the
first half of the 12th century, saying that
Just as a sentence materially consists in a noun and a verb, so too the understanding of it is
put together from the understandings of its parts.
(Translation by and information from Peter King 2007:8.)
Abelard's principle directly concerns only subject-predicate sentences, it concerns the
understanding process rather than meaning itself, and he is unspecific about the nature
of the putting-together operation. The high scholastic conception is different in all three
respects. In early middle 14th century John Buridan (1998,2.3,Soph. 2 Thesis 5, QM 5.14,
fol. 23vb) states what has become known as the additive principle:
The signification of a complex expression is the sum of the signification of its non-logical
terms.
(Translation by and information from Peter King 2001:4).
The additive principle, with or without the restriction to non-logical terms, appears to
have become standard during the late middle ages (for instance, in 1372, Peter of Ailly
refers to the common view that it 'belongs to the [very] notion of an expression that
every expression has parts each one of which, when separated, signifies something of
what is signified by the whole'; 1980:30).The medieval theorists apparently did not
possess the general concept of a function, and instead proposed a particular function, that
of summing (collecting). Mere collecting is inadequate, however, since the sentences
All A's are B's and All B's are A's have the same parts, hence the same collection of
part-meanings and hence by the additive principle have the same meaning.
With the development of mathematics and concern with its foundations came a
renewed interest in semantics. Gottlob Frege is generally taken to be the first person to
have formulated explicitly the notion of compositionality and to claim that it is an
essential feature of human language (although some writers have doubted that Frege really
expressed,or really believed in, compositionality;e.g. Pelletier 2001 and Janssen2001).In
"Uber Sinn und Bedeutung", 1892, he writes:
Let us assume for the time being that the sentence has a reference. If we now replace one
word of the sentence by another having the same reference, this can have no bearing upon
the reference of the sentence.
(Frege 1892: 62)
This is (a special case of) the substitution version of the idea of semantic values being
determined; if you replace parts by others with the same value, the value of the whole
doesn't change. Note that the values here are Bedeutungen (referents), such as truth
values (for sentences) and individual objects (for individual-denoting terms).
Both the substitution version and the function version (see below) were explicitly
stated by Rudolf Carnap in (1956) (for both extension and intension), and collectively
labeled 'Frege's Principle'.
The term 'compositional', was introduced by Hilary Putnam (a former student of
Carnap's) in Putman (1975a: 77), read in Oxford in 1960 but not published until in the
98
I. Foundations of semantics
collection Putnam (1975b). Putnam says "[...] the concept of a compositional mapping
should be so defined that the range of a complex sentence should depend on the ranges
of sentences of the kinds occurring in the 'derivational history' of the complex sentence."
The first use of the term in print seems to be due to Jerry Fodor (a former student of
Putnam's) and Jerrold Katz (1964), to characterize meaning and understanding in a similar
sense.
Today, compositionality is a key notion in linguistics, philosophy of language, logic,
and computer science, but there are divergent views about its exact formulation,
methodological status, and empirical significance. To begin to clarify some of these views we
need a framework for talking about compositionality that is sufficiently general to be
independent of particular theories of syntax or semantics and yet allows us to capture the
core idea behind compositionality.
2. Grammars and semantics
The function version and the substitution version of compositionality are two sides of
the same coin: that the meaning (value) of a compound expression is a function of
certain other things (other meanings (values) and a 'mode of composition'). To formulate
these versions, two things are needed: a set of structured expressions and a semantics for
them.
Structure is readily taken as algebraic structure, so that the set E of linguistic
expressions is a domain over which certain syntactic operations or rules are defined, and
moreover E is generated by these operations from a subset A of atoms (e.g. words). In the
literature there are essentially two ways of fleshing out this idea. One, which originates
with Montague (see 1974a), takes as primitive the fact that linguistic expressions are
grouped into categories or sorts, so that a syntactic rule comes with a specification of
the sorts of each argument as well as of the value. This use of many-sorted algebra as
an abstract linguistic framework is described in Janssen (1986) and Hendriks (2001).
The other approach, first made precise in Hodges (2001), is one-sorted but uses partial
algebras instead, so that rather than requiring the arguments of an operation to be of
certain sorts, the operation is simply undefined for unwanted arguments. (A many-sorted
algebra can in a straightforward way be turned into a one-sorted partial one (but not
always vice versa), and under a natural condition the sorts can be recovered in the partial
algebra (see Westerstahl 2004 for further details and discussion. Some theorists combine
partiality with primitive sorts; for example, Keenan & Stabler 2004 and Kracht 2007.)
The partial approach is in a sense simpler and more general than the many-sorted one,
and we follow it here.
Thus, let a grammar
E = (E,A,1.)
be a partial algebra, where E and A are as above and Z is a set of partial functions over
E of finite arity which generate all expressions in E from A.To illustrate, the familiar
rules
NP -» Det N (NP-rule)
S->NPVP (S-rule)
6. Compositionality 99
correspond to binary partial functions, say a, ft el, such that, if most, dog, and bark are
atoms in A, one derives as usual the sentence Most dogs bark in E, by first applying a
to most and dog, and then applying ft to the result of that and bark. These functions are
necessarily partial; for example, /J is undefined whenever its second argument is dog.
It may happen that one and the same expression can be generated in more than
one way, i.e. the grammar may allow structural ambiguity. So it is not really the
expressions in E but rather their derivation histories, or analysis trees, that should be assigned
semantic values. These derivation histories can be represented as terms in a (partial)
term algebra corresponding to E, and a valuation function is then defined from terms to
surface expressions (usually finite strings of symbols). However, to save space we shall
ignore this complication here, and formulate our definitions as if semantic values were
assigned directly to expressions. More precisely, the simplifying assumption is that each
expression is generated in a unique way from the atoms by the rules. One consequence is
that the notion of a subexpression is well-defined: the subexpressions of t are t itself and
all expressions used in the generation of t from atoms (it is fairly straightforward to lift
the uniqueness assumption, and reformulate the definitions given here so that they apply
to terms in the term algebra instead; see e.g.Westerstahl 2004 for details).
The second thing needed to talk about compositionality is a semantics for E.We take
this simply to be a function fl from a subset of E to some set M of semantic values
('meanings'). In the term algebra case,// takes grammatical terms as arguments.
Alternatively, one may take disambiguated expressions such as phrase structure markings by
means of labeled brackets. Yet another option is to have an extra syntactic level, like
Logical Form, as the semantic function domain.The choice between such alternatives is
largely irrelevant from the point of view of compositionality.
The semantic function fi is also allowed to be partial. For example, it may
represent our partial understanding of some language, or our attempts at a semantics for a
fragment of a language. Further, even a complete semantics will be partial if one wants
to maintain a distinction between meaningfulness (being in the domain of //) and
grammaticality (being derivable by the grammar rules).
No assumption is made about meanings. What matters for the abstract notion of
compositionality is not meanings as such, but synonymy, i.e. the partial equivalence relation
on £ defined by:
u = t iff /i(u),/i(0 are both defined and /i(n) = ft(t).
(Wc use s, t, u, with or without subscripts, for arbitrary members of E.)
3. Variants and properties of compositionality
3.1. Basic compositionality
Both the function version and the substitution version of compositionality can now be
easily formulated, given a grammar E and a semantics // as above.
Funct(//) For every rule a e I there is a meaning operation ra such that
if a(uv ..., un) is meaningful, then
/i(a(u„ ..., un)) = r£n(ux),.. .,/i(M/()).
100
I. Foundations of semantics
Note that Funct(//) presupposes the Domain Principle (DP): subexpressions of meaningful
expressions are also meaningful. The substitution version of compositionality is given by
Subst(=^) If s[uv . . .,i/J ands[tv . . ., fj are both meaningful expressions,and if ui=ti
for 1 < i < n, then s[uv . . ., i/J =^ s[tv . . . , /J.
The notation s[ur ..., un] indicates that s contains (not necessarily immediate) disjoint
occurrences of subexpressions among u{,. .., un, and s[tv ..., fj results from replacing
each u. by r. Restricted to immediate subexpressions Subst(= ) says that = is a partial
congruence relation:
If a(uv..., uj and a(tr ..., tn) are both meaningful and ut = f(
for 1 s/ ^n,then a(uv...,un) =fla{tv...,tn).
Under DP, this is equivalent to the unrestricted version.
Subst(= ) does not presuppose DP, and one can easily think of semantics for which DP
fails. However, a first observation is:
(1) Under DP,Funct(//) and Subst(=^) are equivalent.
That Rule(//) implies Subst(^) is obvious when Subst(=^) is restricted to immediate
subexpressions, and otherwise proved by induction over the generation complexity of
expressions. In the other direction, the operations ramust be found. For m,, . . ., mn e
A/, let ra(mv ..., mn) = fi(a(ur ..., un)) if there are expressions ui such that //(m.) = m{,
1 <, i <,n, and fj(a(ur ..., uj) is defined. Otherwise, ra(m,..., mj can be undefined (or
arbitrary).This is enough, as long as wc can be certain that the definition is independent
of the choice of the w., but that is precisely what Subst(= ) says.
The requirements of basic compositionality are in some respects not so strong, as can
be seen from the following observations:
(2) If /J gives the same meaning to all expressions, then Funct(//) holds.
(3) If H gives different meanings to all expressions, then Funct(//) holds.
(2) is of course trivial. For (3), consider Subst(= ) and observe that if no two expressions
have the same meaning, then ut = t. entails u. = t., so Subst(= ), and therefore Funct(//),
holds trivially.
3.2. Recursive semantics
The function version of compositional semantics is given by recursion over syntax, but
that does not imply that the meaning operations are defined by recursion over meaning,
in which case we have recursive semantics. Standard semantic theories are typically
both recursive and compositional, but the two notions are mutually independent. In the
recursive case wc have:
Rec(//) There is a function b and for every ore £ an operation rasuch that for every
meaningful expression s,
6. Compositionality K)l_
(b(s) if s is atomic
ra(n(ul),...,n(un),ui,...,un)ifs = a(ut,...,un)
For fj to be recursive, the basic function b and the meaning composition operation ra
must themselves be recursive, but this is not required in the function version of
compositionality. In the other direction, the presence of the expressions u{, ..., un themselves
as arguments to ra has the effect that the compositional substitution laws need not hold
(cf.Janssen 1997).
If we drop the recursiveness requirement on b and r^ Rec(//) becomes vacuous. This
is because ra(ml,...,mn,ul,...,un) can simply be defined to be fi(a(ur.. .,«„)) whenever
mi = fj(ut) for all / and a(u{,..., un) is meaningful (and undefined otherwise). Since inter-
substitution of synonymous but distinct expressions changes at least one argument of r^
no counterexample is possible.
3.3. Weaker versions
Basic (first-level) compositionality takes the meaning of a complex expression to be
determined by the meanings of the immediate subexpressions and the top-level syntactic
operation. We get a weaker version - second-level compositionality - if we require only
that the operations of the two highest levels, together with the meanings of expressions
at the second level, determine the meaning of the whole complex expression. A
possible example comes from constructions with quantified noun phrases where the
meanings of both the determiner and the restricting noun - i.e. two levels below the head of
the construction in question - are needed for semantic composition, a situation that
may occur with possessives and some reciprocals. In Peters & Westerstahl (2006, ch. 7)
and in Westerstahl (2008) it is argued that, in general, the corresponding semantics is
second-level but not (first-level) compositional.
Third-level compositionality is defined analogously, and is weaker still. In the extreme
case we have bottom-level, or weak functional compositionality, if the meaning of the
complex term is determined only by the meanings of its atomic constituents and the
entire syntactic construction (i.e. the derived operation that is extracted from a complex
expression by knocking out the atomic constituents). A function version of this becomes
somewhat cumbersome (but see Hodges 2001, sect. 5), whereas the substitution version
becomes simply:
AtSubst(= ) Just like Subst(= ) except that the w. and r are all atomic.
Although weak compositionality is not completely trivial (a language could lack the
property), it does not serve the language users very well: the meaning operation ra that
corresponds to a complex syntactic operation a cannot be predicted from its build-up
out of simpler syntactic operations and their corresponding meaning operations. Hence,
there will be infinitely many complex syntactic operations whose semantic significance
must be learned one by one.
It may be noted here that terminology concerning compositionality is somewhat
fluctuating. David Dowty (2007) calls (an approximate version of) weak functional
compositionality Frege's Principle, and refers to Funct(//) as homomorphism compositionality,
or strictly local compositionality, or context-free semantics. In Larson & Segal (1995), this
102
I. Foundations of semantics
is called strong compositionality. The labels second-level compositionality, third-level, etc.
are not standard in the literature but seem appropriate.
3.4. Stronger versions
We get stronger versions of compositionality by enlarging the domain of the semantic
function, or by placing additional restrictions on meaningfulness or on meaning
composition operations. An example of the first is Zolt£n Szab6's (2000) idea that the same
meaning operations define semantic functions in all possible human languages, not just
for all sentences in each language taken by itself. That is, whenever two languages have
the same syntactic operation, they also associate the same meaning operation with it.
An example of the second option is what Wilfrid Hodges has called the Husserl
property (going back to ideas in Husserl 1900):
(Huss) Synonymous expressions belong to the same (semantic) category.
Here the notion of category is defined in terms of substitution; say that u -^ t if, for
every s in £,s["] e dom(n) ifT s[t] e dom(/j). So (Huss) says that synonymous terms can
be inter-substituted without loss of meaningfulness. This is often a reasonable
requirement (though Hodges 2001 mentions some putative counter-examples). (Huss) also
has the consequence that Subst(s ) can be simplified to Subst,(= ), which only deals
with replacing one subexpression by another.Then one can replace n subexpressions by
applying Subst,(= ) n times; (Huss) guarantees that all the 'intermediate' expressions are
meaningful.
An example of the third kind is that of requiring the meaning composition
operations to be computable.To make this more precise we need to impose more order on the
meaningdomain,viewing meanings too asgiven by an algebra M = (A/,B,O),where flc M
is a finite set of basic meanings, O is a finite set of elementary operations from
n-tuples of meanings to meanings, and M is generated from B by means of the
operations in O.This allows the definition of meaning operations by recursion over M.The
semantic function p is then defined simultaneously by recursion over syntax and by
recursion over the meaning domain. Assuming that the elementary meaning
operations are computable in a sense relevant to cognition, the semantic function itself is
computable.
A further step in this direction is to require that the meaning operations be easy to
compute, thereby reducing or minimizing the complexity of semantic interpretation. For
instance, meaning operations that arc either elementary or else formed from
elementary operations by function composition and function application would be of this kind
(cf. Pagin 2011 for work in this direction).
Another strengthening, also introduced in Hodges (2001), concerns Frege's so-called
Context Principle. A famous but cryptic saying by Frege (1884, x) is: "Never ask for the
meaning of a word in isolation, but only in the context of a sentence". This principle
has been much discussed in the literature (for example, Dummctt 1973, Dummctt 1981,
Janssen 2001, Pelletier 2001), and sometimes taken to conflict with compositionality.
However, if not seen as saying that words somehow lose their meaning in isolation, it can
be taken as a constraint on meanings, in the form of what we might call the Contribution
Principle:
6. Compositionality 103
(CP) The meaning of an expression is the contribution it makes to the meanings of
complex expressions of which it is a part.
This is vague, but Hodges notes that it can be made precise with an additional
requirement on the synonymy = . Assume (Huss), and consider:
InvSubst3(= ) If u * t, there is an expression s such that either exactly one of s[u] and
s[t] is meaningful, or both are and s[m] *„*[']•
So if two expressions of the same category are such that no complex expression of which
the first is a part changes meaning when the first is replaced by the second, they are
synonymous. That is, if they make the same contribution to all such complex
expressions, their meanings cannot be distinguished.This can be taken as one half of (CP), and
compositionality in the form of Subst,(= ) as the other.
Remark: Hodges' main application of these notions is to what has become known
as the extension problem: given a partial compositional semantics fi, under what
circumstances can n be extended to a larger fragment of the language? Here (CP) can be
used as a requirement, so that the meaning of a new word w, say, must respect the (old)
meanings of complex expressions of which w is a part. This is especially suited to
situations when all new items are parts of expressions that already have meanings (cofinality).
Hodges defines a corresponding notion of fregean extension of fi, and shows that in the
situation just mentioned, and given that fl satisfies (Huss), a unique fregcan extension
always exists. Another version of the extension problem is solved in Westerstahl (2004).
An abstract account of compositional extension issues is given in Fernando (2005). End
of remark
We can take a step further in this direction by requiring that replacement of
expressions by expressions with different meanings always changes meaning:
InvSubstv(= ) If for some i.Osisn,^ * r, then for every expression s, either exactly
one of s[ur . . ., un] and s[t{, . . ., tj are meaningful, or both are and
s[ur...,un]*fis[tr...,tn].
This disallows synonymy between complex expressions transformable into each other
by substitution of constituents at least some of which are non-synonymous, but it
docs allow synonymous expressions with different structure. Carnap's principle of
synonymy as intensional isomorphism forbids this, too. With the concept of intension from
possible-worlds semantics it can be stated as
(RC) t^u iff
i) t, u are atomic and co-intensional, or
ii) for some a, t = «(r,,..., tj, u = a(u{,..., un), and f. =^m., 1 £ / £ n
(RC) entails both Subst(= ) and InvSubstv(= ), but is very restrictive. It disallows
synonymy between brother and male sibling as well as between John loves Susan and Susan is
loved by John, and allows different expressions to be synonymous only if they differ at most
in being transformed from each other by substitution of synonymous atomic expressions.
104
I. Foundations of semantics
(RC) seems too strong. We get an intermediate requirement as follows. First, define
/^-congruence, => in the following way:
(=,) t^um
i) t or u is atomic, t = u, and neither is a constituent of the other, or
ii) t = a(tr...,tn),u =^(ur...,un),t.-u.,l £/£/i,and for alls,,...,sn, a(sr...,sn)
= fi(sr ..., sn), if either is defined.
Then require synonymous expressions to be congruent:
(Cong) If f =^m, then f =^11.
By (Cong), synonymous expressions cannot differ much syntactically, but they may differ
in the two crucial respects forbidden by (RC). (Cong) docs not hold for natural
language if logically equivalent sentences are taken as synonymous.That it holds otherwise
remains a conjecture (but see Johnson 2006).
It follows from (Cong) that meanings are (or can be represented as) structured
entities: entities uniquely determined by how they are built, i.e. entities from which
constituents can be extracted. We then have projection operations:
(Rev) For every meaning operation r:E—>E there arc projection operationssr. such
that sr j(r(mr...,mn)) = mr
Together with the fact that the operations rare meaning operations for a compositional
semantic function /i, (Rev) has semantic consequences, the main one being a kind of
inverse functional compositionality:
InvFunct(/y) The syntactic expression of a complex meaning m is determined, up to
/i-congruence, by the composition of m and the syntactic expressions of
its parts.
For the philosophical significance of inverse compositionality, see sections 4.6 and 5.2
below. For (= ), (Cong), InvFunct(/i), and a proof that (Rev) is a consequence of (Cong)
(really of the equivalent statement that the meaning algebra is a free algebra), see Pagin
(2003a). (Rev) seems to be what Jerry Fodor understands by 'reverse compositionality'
in e.g. Fodor (2000:371).
3.5. Direct and indirect compositionality
Pauline Jacobson (2002) distinguishes between direct and indirect compositionality, as
well as between strong direct and weak direct compositionality. This concerns how the
analysis tree of an expression maps onto the expression itself, an issue we have avoided
here, for simplicity. Informally, in strong direct compositionality, a complex expression t
is built up from sub-expressions (corresponding to subtrees of the analysis tree for t)
simply by means of concatenation. In weak direct compositionality, one expression may
6. Compositionality 105
wrap around another (as call up wraps around him in call him up). In indirect
compositionality, there is no such simple correspondence between the composition of analysis
trees and elementary operations on strings.
Even under our assumption that each expression has a unique analysis, our notion
of compositionality here is indirect in the above sense: syntactic operations may delete
strings, reorder strings, make substitutions and add new elements. Strictly speaking,
however, the direct/indirect distinction is not a distinction between kinds of semantics, but
between kinds of syntax. Still, discussion of it tends to focus on the role of
compositionality in linguistics, e.g. whether to let the choice of syntactic theory be guided by
compositionality (cf. Dowty 2007 and Kracht 2007. For discussion of the general significance of
the distinction, see Barker & Jacobson 2007).
3.6. Compositionality for "interpreted languages"
Some linguists, among them Jacobson, tend to think of grammar rules as applying to
signs, where a sign is a triple (e, k, m) consisting of a string, a syntactic category, and a
meaning.This is formalized by Marcus Kracht (see 2003,2007), who defines an interpreted
language to be a set L of signs in this sense, and a grammar G as a set of partial functions
from signs to signs,such that L is generated by the functions in G from a subset of atomic
(lexical) signs.Thus, a meaning assignment is built into the language, and grammar rules
are taken to apply to meanings as well.
This looks like a potential strengthening of our notion of grammar, but is not really
used that way, partly because the grammar is taken to operate independently (though in
parallel) at each of the three levels. Let /?,, pv and p3 be the projection functions on
triples yielding their first, second, and third elements, respectively. Kracht calls a grammar
compositional if for each w-ary grammar rule or there are three operations ral,ra2, and ra3
such that for all signs av ...,on for which a is defined,
a(<rv...,an) =
<r« l(P.(CTl)' • ■ •'/'■K))'^^0!)' • • •^2((Tn))'/'a3(P3(CTl)' • • ^P3(an)))
and moreover a(ar . . ., a J is defined if and only if each r . is defined for the
corresponding projections.
In a sense, however, this is not really a variant of compositionality but rather another
way to organize grammars and semantics. This is indicated by (4) and (5) below, which
are not hard to verify. First, call G strict if a(cr ..., cn) defined and />,(<r) = />,(*,) for 1 £
i £ n entails «(!,,..., t/() defined, and similarly for the other projections. All compositional
grammars are strict.
(4) Every grammar G in Kracht's sense for an interpreted language L is a grammar
(E, A, L) in the sense of section 2 (with E = L, A = the set of atomic signs in L, and
S = the set of partial functions of G). Provided G is strict, G is compositional (in
Kracht's sense) iff each oi prp2, and py seen as assignments of values to signs (so
p3 is the meaning assignment), is compositional (in our sense).
106
I. Foundations of semantics
(5) Conversely, if E = (E,A,S) is a grammar and/i a semantics for E, let L = {(u,u,fi(u)):
u e dom(n)\. Define a grammar G for L (with the obvious atomic signs) by letting
a((ur urn(u{)),.. .,(un, un,n(un))) = (a(ur...,un), a(ur...,un),ft(a(ur...,M/i))>
whenever ore Sis defined for m,,.. ., wn and a(uv...,un) e dom(ji) (undefined
otherwise). Provided /i is closed under subexpressions and has the Husserl property,/i
is compositional iff G is compositional.
3.7. Context dependence
In standard possible-worlds semantics the role of meanings are served by intensions:
functions from possible worlds to extensions. For instance, the intension of a sentence
returns a truth value, when the argument is a world for which the function is defined.
Montague (1968) extended this idea to include not just worlds but arbitrary indices i
from some set I, as ordered tuples of contextual factors relevant to semantic evaluation.
Speaker, time, and place of utterance are typical elements in such indices. The semantic
function /i then assigns a meaning /i(f) to an expression t, which is itself a function such
that for an index / e I,/i(f)(/) gives an extension as value. Kaplan's (1989) two-level
version of this first assigns a function (character) to t taking certain parts of the index (the
context, typically including the speaker) to a content, which is in turn a function from
selected parts of the index to extensions.
In both versions, the usual concept of compositionality straightforwardly applies.The
situation gets more complicated when semantic functions themselves take contextual
arguments, e.g. if a meaning-in-context for an expression t in context c is given as ft(t, c).
The reason for such a change might be the view that the contextual meanings are contents
in their own right, not just extensional fall-outs of the standing, context-independent
meaning. But with context as an additional argument we have a new source of variation.
The most natural extension of compositionality to this format is given by
C-Funct(/i) For every rule ore I there is a meaning operation rasuch that for every
context c, if a(ur ..., un) has meaning in c, then
fi(a(ur..., un), c)= rjji(ur c),.. .,fi(uH, c)).
C-Funct(/i) seems like a straightforward extension of compositionality to a contextual
semantics, but it can fail in a way non-contextual semantics cannot, by a context-shift
failure. For we can suppose that although /ifw., c) = /ifw., c'), 1 £ / £ n, we still have
H(a(ur..., un), c) ± n(a(ur..., un), c'). One might see this as a possible result of so-called
unarticulated constituents. Maybe the meaning of the sentence
(6) It rains.
is sensitive to the location of utterance, while none of the constituents of that sentence
(say,/r and rains) is sensitive to location.Then the contextual meaning of the sentence at
a location / is different from the contextual meaning of the sentence at another location
/',even though there is no such difference in contextual meaning for any of the parts.This
may hold even if substitution of expressions is compositional.
6. Compositionality 102
There is therefore room for a weaker principle that cannot fail in this way, where the
meaning operation itself takes a context argument:
C-Funct(/i)r For every rule a e I there is a meaning operation ra such that for every
context c, if a(ur . . ., uj has meaning in c, then ft(a(ur . . ., uj, c) =
ra(v(urc),...,n(un,c),c).
The only difference is the last argument of ra. Because of this argument, C-Funct(/i)ris
not sensitive to the counterexample above, and is more similar to non-contextual
compositionality in this respect.
This kind of semantic framework is discussed in Pagin (2005); a general format,
and properties of the various notions of compositionality that arise, arc presented in
Westerstahl (2011). For example, it can be shown that (weak) compositionality for
contextual meaning entails compositionality for the corresponding standing meaning, but
the converse does not hold.
So far, we have dealt with extra-linguistic context, but one can also extend
compositional semantics to dependence on linguistic context. The semantic value of some
particular occurrence of an expression may then depend on whether it is an occurrence in,
say, an cxtcnsional context, or an intcnsional context, or a hypcrintcnsional context, a
quotation context, or yet something else.
A framework for such a semantics needs a set C of context types, an initial null context
type 0 e C for unembedded occurrences, and a binary function y/ from context types and
syntactic operators to context types. If a(f,, ..., tn) occurs in context type cy, then f,,..., tn
will occur in context type t//(cr or). The context type for a particular occurrence t" of an
expression Mn a host expression t is then determined by its immediately embedding
operator or,, its immediately embedding operator, and so on until the topmost operator
occurrence.
The semantic function /i takes an expression t and a context type c into a semantic
value.The only thing that will differ for linguistic context from C-Funct(/i)t.above is that
the context of the subexpressions may be different (according to the function y/) from the
context of the containing expression:
LC-Funct(/i)r For every a e S there is an operation rasuch that for every context c, if
a(uv..., un) has meaning in c, then
H(a(uv..., un), c) = rjjt(uvc%.. .,v(un, c'), c),
where c' = y/(c, a).
4. Arguments in favor of compositionality
4.1. Learnability
Perhaps the most common argument for compositionality is the argument from
learnability: A natural language has infinitely many meaningful sentences. It is impossible for
a human speaker to learn the meaning of each sentence one by one. Rather, it must be
108
I. Foundations of semantics
possible for a speaker to learn the entire language by learning the meaning of a finite
number of expressions, and a finite number of construction forms. For this to be
possible, the language must have a compositional semantics. The argument was to some
extent anticipated already in Sanskrit philosophy of language. During the first or second
century BCE Patanjali writes:
... Brhaspati addressed India during a thousand divine years going over the grammatical
expressions by speaking each particular word, and still he did not attain the end. . . . But
then how are grammatical expressions understood? Some work containing general and
particular rules has to be composed ...
(Cf. Staal 1969:501-502.Thanks to Brendan Gillon for the reference.)
A modern classical passage plausibly interpreted along these lines is due to Donald
Davidson:
It is conceded by most philosophers of language, and recently by some linguists, that a
satisfactory theory of meaning must give an account of how the meanings of sentences depend
upon the meanings of words. Unless such an account could be supplied for a particular
language, it is argued, there would be no explaining the fact that we can learn the language:
no explaining the fact that, on mastering a finite vocabulary and a finite set of rules, we are
prepared to produce and understand any of a potential infinitude of sentences. I do not
dispute these vague claims, in which I sense more than a kernel of truth. Instead I want to
ask what it is for a theory to give an account of the kind adumbrated.
(Davidson 1967:17)
Properly spelled out, the problem is not that of learning the meaning of infinitely many
meaningful sentences (given that one has command of a syntax), for if I learn that they
all mean that snow is white, I have already accomplished the task. Rather, the problem is
that there are infinitely many propositions that are each expressed by some sentence in
the language (with contextual parameters fixed), and hence infinitely many equivalence
classes of synonymous sentences.
Still, as an argument for compositionality, the learnability argument has two main
weaknesses. First, the premise that there are infinitely many sentences that have a
determinate meaning although they have never been used by any speaker, is a very strong
premise, in need of justification. That is, at a given time tQ, it may be that the speaker or
speakers employ a semantic function fj defined for infinitely many sentences, or it may
be that they employ an alternative function /i0 which agrees with fi on all sentences
that have in fact been used but is simply undefined for all that have not been used. On
the alternative hypothesis, when using a new sentence s, the speaker or the
community gives some meaning to s, thereby extending /i0 to /i,, and so on. Phenomenologi-
cally, of course, the new sentence seemed to the speakers to come already equipped
with meaning, but that was just an illusion. On this alternative hypothesis, there is
no infinite semantics to be learned. To argue that there is a learnability problem, we
must first justify the premise that we employ an infinite semantic function.This cannot
be justified by induction, for we cannot infer from finding sentences meaningful that
they were meaningful before we found them, and exactly that would have to be the
induction base.
6. Compositionality 109
The second weakness is that even with the infinity premise in place, the conclusion of
the argument would be that the semantics must be computable, but computability does
not entail compositionality, as we have seen.
4.2. Novelty
Closely related to the learnability argument is the argument from novelty: speakers are
able to understand sentences they have never heard before, which is possible only if the
language is compositional.
When the argument is interpreted so that, as in the learnability argument, we need
to explain how speakers reliably track the semantics, i.e. assign to new sentences the
meaning that they independently have, then the argument from novelty shares the two
main weaknesses with the learnability argument.
4.3. Productivity
According to the pure argument from productivity, we need an explanation of why we
are able to produce infinitely many meaningful sentences, and compositionality offers
the best explanation. Classically, productivity is appealed to by Noam Chomsky as an
argument for generative grammar. One of the passages runs
The most striking aspect of linguistic competence is what we may call the 'creativity of
language', that is, the speaker's ability to produce new sentences that are immediately
understood by other speakers although they bear no physical resemblance to sentences that are
'familiar'. The fundamental importance of this creative aspect of normal language use has
been recognized since the seventeenth century at least, and it was the core of Huinboldtian
general linguistics.
(Chomsky 1971:74)
This passage does not appeal to pure productivity, since it makes an appeal to the
understanding by other speakers (cf. Chomsky 1980:76-78). The pure productivity aspect has
been emphasized by Fodor (e.g. 1987:147-148), i.e. that natural language can express an
open-ended set of propositions.
However, the pure productivity argument is very weak. On the premise that a human
speaker can think indefinitely many propositions, all that is needed is to assign those
propositions to sentences. The assignment does not have to be systematic in any way,
and all the syntax that is needed for the infinity itself is simple concatenation. Unless the
assignment is to meet certain conditions, productivity requires nothing more than the
combination of infinitely many propositions and infinitely many expressions.
4.4. Systematicity
A related argument by Fodor (1987: 147-150) is that of systematicity. It can be stated
either as a property of speaker understanding or as an expressive property of a language.
Fodor tends to favor the former (since he is ultimately concerned with the mental). In
the simplest case, Fodor points out that if a language user understands a sentence of the
110
I. Foundations of semantics
form tRu, she will also understand the corresponding sentence uRt, and argues that this
is best explained by appeal to compositionality.
Formally, the argument is to be generalized to cover the understanding of any new
sentence that is formed by recombination of constituents that occur, and construction
forms that are used, in sentences already understood. Hence, in this form it reduces to
one of three different arguments; either to the argument from novelty, or to the
productivity argument, or finally, to the argument from intersubjectivity (below), and only spells
out a bit the already familiar idea of old parts in new combinations.
It might be taken to add an element, for it not only aims at explaining the
understanding of new sentences that is in fact manifested, but also predicts what new sentences
will be understood. However, Fodor himself points out the problem with this aspect, for
if there is a sentence s formed by a recombination that we do not find meaningful, we will
not take it as a limitation of the systematicity of our understanding, but as revealing that
the sentence s is not in fact meaningful, and hence that there is nothing to understand.
Hence, we cannot come to any other conclusion than that the systematicity of our
understanding is maximal.
The systematicity argument can alternatively be understood as concerning natural
language itself, namely as the argument that sentences formed by grammatical
recombination are meaningful. It is debatable to what extent this really holds, and sentences (or
so-called sentences) like Chomsky's Colorless green ideas sleep furiously have been used
to argue that not all grammatical sentences are meaningful.
But even if we were to find meaningful all sentences that we find grammatical, this
does not in itself show that compositionality, or any kind of systematic semantics, is
needed for explaining it. If it is only a matter of assigning some meaning or other, without
any further condition, it would be enough that we can think new thoughts and have a
disposition to assign them to new sentences.
4.5. Induction on synonymy
We can observe that our synonymy intuitions conform to Subst(= ). In case after case, we
find the result of substitution synonymous with the original expression, if the new part
is taken as synonymous with the old. This forms the basis of an inductive generalization
that such substitutions are always meaning preserving. In contrast to the argument from
novelty, where the idea of tracking the semantics is central, this induction argument may
concern our habits of assigning meaning to, or reading meaning into, new sentences: we
tend to do it compositionally.
There is nothing wrong with this argument, as far as it goes, beyond what is in general
problematic with induction. It should only be noted that the conclusion is weak.Typically,
arguments for compositionality aim at the conclusion that there is a systematic pattern to
the assignment of meaning to new sentences, and that the meaning of new sentences can
be computed somehow.This is not the case in the induction argument, for the conclusion
is compatible with the possibility that substitutivity is the only systematic feature of the
semantics. That is, assignment to meaning of new sentences may be completely random,
except for respecting substitutivity. If the substitutivity version of compositionality holds,
then (under DP) so does the function version, but the semantic function need not be
computable, and need not even be finitely specifiable. So, although the argument may
6. Compositionality 1_1_1_
be empirically sound, it does not establish what arguments for compositionality usually
aim at.
4.6. Intersubjectivity and communication
The problems with the idea of tracking semantics when interpreting new sentences can
be eliminated by bringing in intersubjective agreement in interpretation. For by our
common sense standards of judging whether we understand sentences the same way or
not, there is overwhelming evidence (e.g. from discussing broadcast news reports) that
in an overwhelming proportion of cases, speakers of the same language interpret new
sentences similarly. This convergence of interpretation, far above chance, does not
presuppose that the sentences heard were meaningful before they were used.The
phenomenon needs an explanation, and it is reasonable to suppose that the explanation involves
the hypothesis that the meaning of the sentences are computable, and so it isn't left to
guesswork or mere intuition what the new sentences mean.
The appeal to intersubjectivity disposes of an unjustified presupposition about
semantics, but two problems remain. First, when encountering new sentences, these are almost
invariably produced by a speaker, and the speaker has intended to convey something by
the sentence, but the speaker hasn't interpreted the sentence, but fitted it to an antecedent
thought. Secondly, we have an argument for computability, but not for compositionality.
The first observation indicates that it is at bottom the success rate of linguistic
communication with new sentences that gives us a reason for believing that sentences
are systematically mapped on meanings. This was the point of view in Frege's famous
passage from the opening of'Compound Thoughts':
It is astonishing what language can do. With a few syllables it can express an incalculable
number of thoughts, so that even a thought grasped by a terrestrial being for the very first
time can be put into a form of words which will be understood by someone to whom the
thought is entirely new. This would be impossible, were we not able to distinguish parts in
the thoughts corresponding to the parts of a sentence, so that the structure of the sentence
serves as the image of the structure of the thought.
(Frege 1923:55)
As Frege depicts it here, the speaker is first entertaining a new thought, or proposition,
finds a sentence for conveying that proposition to a hearer, and by means of that
sentence the hearer comes to entertain the same proposition as the speaker started out
with. Frege appeals to semantic structure for explaining how this is possible. He claims
that the proposition has a structure that mirrors the structure of the sentence (so that
the semantic relation may be an isomorphism), and goes on to claim that without this
structural correspondence, communicative success with new propositions would not be
possible.
It is natural to interpret Frege as expressing a view that entails that compositionality
holds as a consequence of the isomorphism idea.The reason Frege went beyond
compositionality (or homomorphism, which does not require a one-one relation) seems to be
an intuitive appeal to symmetry: the speaker moves from proposition to sentence, while
the hearer moves from sentence to proposition. An isomorphism is a one-one relation,
so that each relatum uniquely determines the other.
112
I. Foundations of semantics
Because of synonymy, a sentence that expresses a proposition in a particular language
is typically not uniquely determined within that language by the proposition expressed.
Still, we might want the speaker to be able to work out what expression to use, rather
searching around for suitable sentences by interpreting candidates one after the other.
The inverse functional compositionality principle, InvFunct(//), of section 3.4, offers such
a method. Inverse compositionality is also connected with the idea of structured
meanings, or thoughts, while compositionality by itself isn't, and so in this respect Frege is
vindicated (these ideas are developed in Pagin 2003a).
4.7. Summing up
Although many share the feeling that there is "more than a kernel of truth" (cf. section
4.1) in the usual arguments for compositionality, some care is required to formulate and
evaluate them. One must avoid question-begging presuppositions; for example, if a
presupposition is that there is an infinity of propositions, the argument for that had better
not be that standardly conceived natural or mental languages allow the generation of
such an infinite set. Properly understood, the arguments can be seen as inferences to the
best explanation, which is a respectable but somewhat problematic methodology. (One
usually hasn't really tried many other explanations than the proposed one.)
Another important (and related) point is that virtually all arguments so far only
justify the principle that the meaning is computable or recursive, and the principle that up to
certain syntactic variation, an expression of a proposition is computable from that
proposition. Why should the semantics also be compositional, and possibly inversely
compositional? One reason could be that compositional semantics, or at least certain simple
forms of compositional semantics, is very simple, in the sense that a minimal number
of processing steps are needed by the hearer for arriving at a full interpretation (or, for
the speaker, a full expression, cf. Pagin 2011), but these issues of complexity need to be
further explored.
5. Arguments against compositionality
Arguments against compositionality of natural language can be divided into four main
categories:
a) arguments that certain constructions are counterexamples and make the principle
false,
b) arguments that compositionality is an empirically vacuous, or alternatively trivially
correct, principle,
c) arguments that compositional semantics is not needed to account for actual
linguistic communication,
d) arguments that actual linguistic communication is not suited for compositional
semantics.
The first category, that of counterexamples, will be treated in a separate section dealing
with a number of problem cases. Here we shall discuss arguments in the last three
categories.
6. Compositionality 113
5.1 Vacuity and triviality arguments
Vacuity. Some claims about the vacuity of compositionality in the literature are based
on mathematical arguments. For example, Zadrozny (1994) shows that for every
semantics // there is a compositional semantics vsuch that vit)(t) = /J(t) for every expression
t, and uses this fact to draw a conclusion of that kind. But note that the mathematical
fact is itself trivial: let v(f) = // for each t and the result is immediate from (2) in section
3.1 above (other parts of Zadrozny's results use non-wellfounded sets and are less
trivial).
Claims like these tend to have the form: for any semantics // there is a compositional
semantics vfrom which // can be easily recovered. But this too is completely trivial as it
stands: if we let v(f) = (//(f), f), vis 1-1, hence compositional by (3) in section 3.1, and//is
clearly recoverable from v.
In general, it is not enough that the old semantics can be computed from the new
compositional semantics: for the new semantics to have any interest it must agree with
the old one in some suitable sense. As far as we know there are no mathematical results
showing that such a compositional alternative can always be found (see Westerstahl 1998
for further discussion).
Triviality. Paul Horwich (e.g. in 1998) has argued that compositionality is not a
substantial property of a semantics, but is trivially true. He exemplifies with the sentence
dogs barks, and says (1998:156-157) that the meaning property
(7) x means DOGS BARK
consists in the so-called construction property
(8) x results from putting expressions whose meanings are DOG and BARK, in that
order, into a schema whose meaning is NS V.
As far as it goes, the compositionality of the resulting semantics is a trivial consequence
of Horwich's conception of meaning properties. Horwich's view here is equivalent to
Carnap's conception of synonymy as intensional isomorphism. Neither allows that an
expression with different structure or composed from parts with different meanings
could be synonymous with an expression that means DOGS BARK. However, for
supporting the conclusion that compositionality is trivial, these synonymy conditions must
themselves hold trivially, and that is simply not the case.
5.2. Superfluity arguments
Mental processing. Stephen Schiffer (1987) has argued that compositional semantics,
and public language semantics altogether, is superfluous in the account of linguistic
communication. All that is needed is to account for how the hearer maps his mental
representation of an uttered sentence on a mental representation of meaning, and
that is a matter of a syntactic transformation, i.e. a translation, rather than
interpretation. In Schiffer's example (1987:192-200), the hearer Harvey is to infer from his belief
that
114
I. Foundations of semantics
(9) Carmen uttered the sentence 'Some snow is white',
the conclusion that
(10) Carmen said that some snow is white.
Schiffer argues that this can be achieved by means of transformations between
sentences in Harvey's neural language M. M contains a counterpart a to (9), such that a
gets tokened in Harvey's so-called belief box when he has the belief expressed by (9). By
an inner mechanism the tokening of or leads to the tokening of /?, which is Harvey's M
counterpart to (10). For this to be possible for any sentence of the language in question,
Harvey needs a translation mechanism that implements a recursive translation function
/from sentence representations to meaning representations. Once such a mechanism is
in place, we have all we need for the account, according to Schiffer.
The problem with the argument is that the translation function / by itself tells us
nothing about communicative success. By itself it just correlates neural sentences of
which we know nothing except for their internal correlation. We need another recursive
function g that maps the uttered sentence Some snow is white on a, and a third recursive
function h that maps ft on the proposition that some snow is white, in order to have a
complete account. But then the composed function h(J{g(...))) seems to be a recursive
function that maps sentences on meanings (cf. Pagin 2003b).
Pragmatic composition. According to Franqois Recanati (2004), word meanings are
put together in a process of pragmatic composition. That is, the hearer takes word
meanings, syntax and contextual features as his input, and forms the interpretation that best
corresponds to them. As a consequence, semantic compositionality is not needed for
interpretation to take place.
A main motivation for Rccanati's view is the ubiquity of those pragmatic operations
that Recanati calls modulations, and which intuitively contribute to "what is said", i.e. to
communicated content before any conversational implicatures. (Under varying terms
and conceptions, these phenomena have been described e.g. by Sperber & Wilson 1995,
Bach 1994,Carston 2002 and by Recanati himself.) To take an example from Recanati, in
reply to an offer of something to eat, the speaker says
(11) I have had breakfast.
thereby saying that she has had breakfast in the morning of the day of utterance, which
involves a modulation of the more specific kind Recanati calls free enrichment, and
implicating by means of what she says that she is not hungry. On Recanati's view,
communicated contents are always or virtually always pragmatically modulated. Moreover,
modulations in general do not operate on a complete semantically derived proposition,
but on conceptual constituents. For instance, in (11) it is the property of having breakfast
that is modulated into having breakfast this day, not the proposition as a whole or even
the property of having had breakfast. Hence, it seems that what the semantics delivers
docs not feed into the pragmatics.
However, if meanings, i.e. the outputs of the semantic function, are structured
entities, in the sense specified by (Rev) and InvFunct(//) of section 3.4, then the last
objection is met, for then semantics is able to deliver the arguments to the pragmatic
6. Compositionality U_5
operations, e.g. properties associated with VPs. Moreover, the modulations that are in
fact made appear to be controlled by a given semantic structure: as in (11), the
modulated part is of the same category and occupies the same slot in the overall structure
as the semantically given argument that it replaces. This provides a reason for thinking
that modulations operate on a given (syntactically induced) semantic structure, rather
than on pragmatically composed material (this line of reasoning is elaborated in Pagin
& Pelletier 2007).
5.3. Unsuitability arguments
According to a view that has come to be called radical contextualism, truth evaluable
content is radically underdetermined by semantics, i.e. by literal meaning. That is, no
matter how much a sentence is elaborated, something needs to be added to its semantic
content in order to get a proposition that can be evaluated as true or false. Since there
will always be indefinitely many different ways of adding, the proposition expressed
by means of the sentence will vary from context to context. Well-known proponents
of radical contextualism include John Searle (e.g. 1978), Charles Travis (e.g. 1985), and
Sperber & Wilson (1995). A characteristic example from Charles Travis (1985: 197) is
the sentence
(12) Smith weighs 80 kg.
Although it sounds determinate enough at first blush, Travis points out that it can be
taken as true or as false in various contexts, depending on what counts as important in
those contexts. For example, it can be further interpreted as being true in case Smith
weighs
(12') a. 80 kg when stripped in the morning.
b. 80 kg when dressed normally after lunch.
c. 80 kg after being force fed 4 liters of water.
d. 80 kg four hours after having ingested powerful diuretic.
e. 80 kg after lunch adorned in heavy outer clothing.
Although the importance of such examples is not to be denied, their significance for
semantics is less clear. It is in the spirit of radical contextualism to minimize the
contribution of semantics (literal meaning) for determining expressed content, and thereby
the importance of compositionality. However, strictly speaking, the truth or falsity of
the compositionality principle for natural language is orthogonal to the truth or falsity
of radical contextualism. For whether the meaning of a sentence s is a proposition or not
is irrelevant to the question whether that meaning is determined by the meaning of the
constituents of s and their mode of composition. The meaning of s may be unimportant
but still compositionally determined.
In an even more extreme version, the (semantic) meaning of sentence s in a context
c is what the speaker uses s to express in c. In that case meaning itself varies from
context to context, and there is no such thing as an invariant literal meaning. Not even the
extreme version need be in conflict with compositionality (extended to context
dependence), since the substitution properties may hold within each context by itself. Context
116
I. Foundations of semantics
shift failure, in the sense of section 3.7, may occur, if e.g. word meanings are invariant but
the meanings of complex expressions vary between contexts.
It is a further question whether radical contextualism itself, in either version, is a
plausible view. It appears that the examples of contextualism can be handled by other
methods, e.g. by appeal to pragmatic modulations mentioned in section 5.2 (cf. Pagin
& Pelletier 2007), which does allow propositions to be semantically expressed. Hence,
the case for radical contextualism is not as strong as it may prima facie appear. On top,
radical contextualism tends to make a mystery out of communicative success.
6. Problem cases
A number of natural language constructions present apparent problems for
compositional semantics. In this concluding section we shall briefly discuss a few of them, and
mention some others.
6.1. Belief sentences
Belief sentences offer diffculties for compositional semantics, both real and merely
apparent. At first blush, the case for a counterexample against compositionality seems
very strong. For in the pair
(13) a. John believes that Fred is a child doctor,
b. John believes that Fred is a pediatrician.
(13a) may be true and (13b) false, despite the fact that child doctor and pediatrician are
synonymous. If truth value is taken to depend only on meaning and on extra-semantic
facts, and the extra-semantic facts as well as the meanings of the parts and the modes
of composition are the same between the sentences, then the meaning of the sentences
must nonetheless be different, and hence compositionality fails.This conclusion has been
drawn by Jeff Pelletier (1994).
What would be the reason for this difference in truth value? When cases such as
these come up, the reason is usually that there is some kind of discrepancy in the
understanding of the attributee (John) between synonyms. John may e.g. erroneously
believe that pediatrician only denotes a special kind of child doctors, and so would
be disposed to assent to (13a) but dissent from (13b) (cf. Mates 1950 and Burge 1978;
Mates took such cases as a reason to be skeptical about synonymy).This is not a
decisive reason, however, since it is what the words mean in the sentences, e.g. depending
on what the speaker means, that is relevant, not what the attributee means by those
words.The speaker contributes with words and their meanings, and the attributee
contributes with his belief contents. If John's belief content matches the meaning of the
embedded sentence Fred is a pediatrician, then (13b) is true as well, and the problem
for compositionality is disposed of.
A problem still arises, however, if belief contents are more fine-grained than sentence
meanings, and words in belief contexts are somehow tied to these finer differences in grain.
For instance, as a number of authors have suggested, perhaps belief contents are
propositions under modes of presentation (see e.g. Burdick 1982, Salmon 1986. Salmon,
however, existentially quantifies over modes of presentations, which preserves substitutivity).
6. Compositionality 117
It may then be that different but synonymous expressions are associated with different
modes of presentation. In our example, John may believe a certain proposition under a
mode of presentation associated with child doctor but not under any mode of presentation
associated with pediatrician, and that accounts for the change in truth value.
In that case, however, there is good reason to say that the underlying form of a belief
sentence such as (13a) is something like
(14) Bel(John, the proposition that Fred is a child doctor, M('Fred is a child doctor'))
where M(-) is a function from a sentence to a mode of presentation or a set of modes of
presentation. In this form, the sentence Fred is a pediatrician occurs both used and
mentioned (quoted), and in its used occurrence, child doctor may be replaced by pediatrician
without change of truth value. Failure of substitutivity is explained by the fact that the
surface form fuses a used and a mentioned occurrence. In the underlying form, there is
no problem for compositionality, unless caused by quotation.
Of course, this analysis is not obviously the right one, but it is enough to show that the
claim that compositionality fails for belief sentences is not so easy to establish.
6.2. Quotation
Often quotation is set aside for special treatment as an exception to ordinary
semantics, which is supposed to concern used occurrences of expressions rather than
mentioned ones. Sometimes, this is regarded as cheating, and quotation is proposed as a
clear counterexample to compositionality: brother and male sibling are synonymous, but
^brother" and 'male sibling" are not (i.e. the expressions that include the opening and
closing quote). Since enclosing an expression in quotes is a syntactic operation, we have
a counterexample.
If quoting is a genuine syntactic operation, the syntactic rules include a total unary
operator vsuch that, for any simple or complex expression t,
(15) K(t) = r
The semantics of quoted expressions is given simply by
(Q) //(*(*)) = *
Then, since t = u does not imply t = u, substitution of u for t in fc(t) may violate
compositionality.
However, such a non-compositional semantics for quotation can be transformed into
a compositional one, by adapting Frege's view in (1892) that quotation provides a special
context type in which expressions refer to themselves, and using the notion of
linguistically context-dependent compositionality from section 3.7 above. We shall not give the
details here, only indicate the main steps.
Start with a grammar E = (E, A, I) (for a fragment of English, say) and a
compositional semantics/i for E. First, extend E to a grammar containing the quotation operator
118
I. Foundations of semantics
K, allowing not only quote-strings of the form 'John','likes',"Mary", etc., but also things
like John likes 'Mary' (meaning that he likes the name), whereas we disallow things like
John 'likes' Mary or 'John likes' Mary as ungrammatical. Let E' be the closure of E under
the thus extended operations and k, and let I' = [a': ae S}u{/c}.Then we have a new
grammar E' =(E',A,Z') that incorporates quotation.
Next, extend ft to a semantics ft' for E', using the semantic composition operations
that exist by Funct(//), and letting (Q) above take care of K. As indicated, the
semantics ft' is not compositional: even if Mary is the same person as Sue, John likes 'Mary'
doesn't mean the same as John likes 'Sue'. However, we can extend ft' to a semantics ft"
for E'which is compositional in the sense of LC-Funct(/y)t.in section 3.7. In the simplest
case, there are two context types: c(i, the use context type, which is the default type (the
null context), and the quotation context type c -The function y/ from context types and
operators to context types is given by
\c ifB*K
for fi e I' and c equal to cu or c . ft" is obtained by redefining the given composition
operations in a fairly straightforward way, so that LC-Funct(/i")c is automatically
insured./i" then extends ft in the sense that if t e £ is meaningful, ft" (t, c(i) = ft(t), and
furthermore ft" (i<(t), c(i) = ft" (t, cq) = t.
So ft" is compositional in the contextually extended sense. That t = u holds does not
license substitution of u for t in K(t), since t there occurs in a quotation context, and we
may have ft"(t,c ) 4- ft" (u,c ).This approach is further developed in Pagin & Westerstahl
(2009).
6.3. Idioms
Idioms are almost universally thought to constitute a problem for compositionality. For
example, the VP kick the bucket can also mean 'die', but the semantic operation
corresponding to the standard syntax of, say, fetch the bucket, giving its meaning in terms of the
meanings of its immediate constituents/efc/i and the bucket, cannot be applied to give the
idiomatic meaning of kick the bucket.
This is no doubt a problem of some sort, but not necessarily for compositionality.
First, that a particular semantic operation fails doesn't mean that no other operation
works. Second, note that kick the bucket is ambiguous between its literal and its idiomatic
meaning, but compositionality presupposes non-ambiguous meaning bearers. Unless we
take the ambiguity itself to be a problem for compositionality (see the next subsection),
we should first find a suitable way to disambiguate the phrase, and only then raise the
issue of compositionality.
Such disambiguation may be achieved in various ways. We could treat the whole
phrase as a lexical item (an atom), in view of the fact that its meaning has to be learnt
separately. Or, given that it does seem to have syntactic structure, we could treat it as formed
by a different rule than the usual one. In neither case is it clear that compositionality
would be a problem.
6. Compositionality 1_1^
To see what idioms really have to do with compositionality, think of the following
situation. Given a grammar and a compositional semantics for it, suppose we decide to give
some already meaningful phrase a non-standard, idiomatic meaning. Can we then extend
the given syntax (in particular, to disambiguate) and semantics in a natural way that
preserves compositionality? Note that it is not just a matter of accounting for one particular
phrase, but rather for all the phrases in which the idiom may occur. This requires an
account of how the syntactic rules apply to the idiom, and to its parts if it has structure,
as well as a corresponding semantic account.
But not all idioms behave the same. While the idiomatic kick the bucket is fine in
John kicked the bucket yesterday, or Everyone kicks the bucket at some point, it is not
good in
(16) The bucket was kicked by John yesterday.
(17) Andrew kicked the bucket a week ago, and two days later, Jane kicked it too.
By contrast, pull strings preserves its idiomatic meaning in passive form, and strings is
available for anaphoric reference with the same meaning:
(18) Strings were pulled to secure Henry his position.
(19) Kim's family pulled some strings on her behalf, but they weren't enough to get her
the job.
This suggests that these two idioms should be analyzed differently; indeed the latter kind
is called "compositional" in Nunberg, Sag & Wasow (1994) (from which (19) is taken),
and is analyzed there using the ordinary syntactic and semantic rules for phrases of this
form but introducing instead idiomatic meanings of its parts (pull and string), whereas
kick the bucket is called "non-compositional".
In principle, nothings prevents a semantics that deals differently with the two kinds
of idioms from being compositional in our sense. Incorporating idioms in syntax and
semantics is an interesting task. For example, in addition to explaining the facts noted
above one has to prevent kick the pail from meaning 'die' even if bucket and pail are
synonymous, and likewise to prevent the idiomatic versions of pull and string to
combine illegitimately with other phrases. For an overview of the semantics of idioms, see
Nunberg, Sag & Wasow (1994). Westerstahl (2002) is an abstract discussion of various
ways to incorporate idioms while preserving compositionality.
6.4. Ambiguity
Even though the usual formulation of compositionality requires non-ambiguous meaning
bearers, the occurrence of ambiguity in language is usually not seen as a problem
for compositionality. This is because lexical ambiguity seems easily dealt with by
introducing different lexical items for different meanings of the same word, whereas structural
ambiguity corresponds to different analyses of the same surface string.
120
I. Foundations of semantics
However, it is possible to argue that even though there are clear cases of structural
ambiguity in language, as in Old men and women were released first from the occupied
building, in other cases the additional structure is just an ad hoc way to avoid ambiguity.
In particular, quantifier scope ambiguities could be taken to be of this kind. For example,
while semanticists since Montague have had no trouble inventing different underlying
structures to account for the two readings of
(20) Every critic reviewed four films.
it may be argued that this sentence in fact has just one structural analysis, a simple
constituent structure tree, and that meaning should be assigned to that one structure. A
consequence is that meaning assignment is no longer functional, but relational, and hence
compositionality either fails or is just not applicable. Pelletier (1999) draws precisely this
conclusion.
But even if one agrees with such an account of the syntax of (20), abandonment
of compositionality is not the only option. One possibility is to give up the idea that
the meaning of (20) is a proposition, i.e. something with a truth value (in the actual
world), and opt instead for underspecified meanings of some kind. Such meanings can
be uniquely, and perhaps compositionally, assigned to simple structures like constituent
structure trees, and one can suppose that some further process of interpretation of
particular utterances leads to one of the possible specifications, depending on various
circumstantial facts. This is a form of context-dependence, and wc saw in section 3.7 how
similar phenomena can be dealt with compositionally. What was there called standing
meaning is one kind of underspecified meaning, represented as a function from indices
to 'ordinary' meanings. In the present case, where several meanings are available, one
might try to use the set of those meanings instead. A similar but more sophisticated way
of dealing with quantifier scope is so-called Cooper storage (sec Cooper 1983). It should
be noted, however, that while such strategies restore a functional meaning assignment,
the compositionality of the resulting semantics is by no means automatic; it is an issue
that has to be addressed anew.
Another option might be to accept that meaning assignment becomes relational and
attempt instead to reformulate compositionality for such semantics. Although this line
has hardly been tried in the literature, it may be an option worth exploring (For some
first attempts in this direction, sec Wcstcrstahl 2007).
6.5. Other problems
Other problems than those above, some with proposed solutions, include possessives
(cf. Partcc 1997; Peters & Wcstcrstahl 2006), the context sensitive use of adjectives (cf.
Lahav 1989; Szab6 2001; Reimer 2002), noun-noun compounds (cf. Weiskopf 2007),
iw/ess+quantifiers (cf.Higginbotham 1986; Pelletier 1994),a/t>'embeddings (cf. Hintikka
1984), and indicative conditionals (e.g. Lewis 1976).
All in all, it seems that the issue of compositionality in natural language will remain
live, important and controversial for a long time to come.
6. Compositionality 12J_
7. References
Abelard, Peter 2008. Logica 'Ingredientibus'. 3. Commentary on Aristotele's De Interpretatione.
Corpus Christianorum Contiunatio Medicvalis,Turnhout: Brepols Publishers.
Ailly, Peter of 1980. Concepts and Insolubles. Dordrecht: Reidel. Originally published as Conceptus
et Insolubilia, Paris, ca. 1500.
Bach, Kent 1994. Conversational iinpliciture. Mind & Language 9,124-162.
Barker, Chris & Pauline Jacobson (eds.) 2007. Direct Compositionality. Oxford: Oxford University
Press.
Burdick, Howard 1982. A logical form for the propositional attitudes. Synthese 52.185-230.
Burge, Tyler 1978. Belief and synonymy. Journal of Philosophy 75,119-138.
Buridan, John 1998. Summulae de Dialectica 4. Summulae de Suppositionibus, volume 10-4 of Artis-
tarium. Nijmegen: Ingeniuin.
Carnap, Rudolf 1956. Meaning and Necessity. 2nd edn. Chicago, IL: The University of Chicago
Press.
Carston, Robyn 2002. Thoughts and Utterances. The Pragmatics of Explicit Communication. Oxford:
Oxford University Press.
Chomsky, Noam 1971. Topics in the theory of Generative Grammar. In: J. Searle (ed.). Philosophy
of Language. Oxford: Oxford University Press, 71-100.
Chomsky, Noam 1980. Rules and Representations. Oxford: Blackwell.
Cooper, Robin 1983. Quantification and Syntactic Theory. Dordrecht: Reidel.
Davidson, Donald 1967. Truth and meaning. Synthese 17, 304-323. Reprinted in: D. Davidson.
Inquiries into Truth and Interpretation. Oxford: Clarendon Press, 1984, 17-36. Page reference
to the reprint.
Dowty, David 2007. Compositionality as an empirical problem. In: C. Barker & P. Jacobson (eds.).
Direct Compositionality. Oxford: Oxford University Press, 23-101.
Dununett, Michael 1973. Frege. Philosophy of Language. London: Duckworth.
Dununett, Michael 1981. The Interpretation ofFrege's Philosophy. London: Duckworth.
Fernando,Tun 2005. Compositionality inductively, co-inductively and contextually. In: E. Machery,
M. Werning & G. Schurz (eds.). The Compositionality of Meaning and Content: Foundational
Issues, vol. I. Frankfui t/M.: Ontos, 87-96.
Fodor, Jerry 1987. Psychosemantics. Cambridge, MA:The MIT Press.
Fodor, Jerry 2000. Reply to critics. Mind & Language 15,350-374.
Fodor, Jerry & Jerrold Katz 1964. The structure of a semantic theory. In: J. Fodor & J. Katz (eds.).
The Structure of Language. Englewood Cliffs, NJ: Prentice Hall, 479-518.
Frege, Gottlob 1884. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung
uber den Begriff der Zahl. Brcslau: W. Kocbncr. English translation in: J. Austin. The
Foundations of Arithmetic: A logico-mathematical enquiry into the concept of number. 1st edn. Oxford:
Blackwell, 1950.
Frege, Gottlob 1892. Uber Sinn und Bedeutuiig. Zeitschrift fiir Philosophic und philosophische
Kritik 100,25-50. English translation in: P. Geach & M. Black (eds.). Translations from the
Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980,56-78.
Frege, Gottlob 1923. Logische Untersuchungen. Dritter Teil: Gedankengefiige. Beitrage zur
Philosophic des deutschen Idealismus ///(1923-1926), 36-51. English translation in: P. Geach (ed.).
Logical Investigations. Oxford: Blackwell, 1977,55-77.
Hendriks, Herman 2001. Compositionality and inodel-theoretic interpretation. Journal of Logic,
Language and Information 10,29-48.
Higginbotham. James 1986. Linguistic theory and Davidson's program in semantics. In:
E. Lepore (ed.). Linguistic Theory and Davidsons Program in Semantics. Oxford: Blackwell,
29^8.
Hintikka, Jaakko 1984. A hundred years later: The rise and fall of Frege's influence in language
theory. Synthese 59,27^9.
122
I. Foundations of semantics
Hodges, Wilfrid 2001. Formal features of compositionality. Journal of Logic, Language and
Information 10,7-28.
Horwich, Paul 1998. Meaning. Oxford: Oxford University Press.
Houben, Jan 1997. The Sanskrit tradition. In: W. van Bekkuin et al. (eds.). The Sanskrit Tradition.
Amsterdam: Benjamins, 49-145.
Husserl, Edmund 1900. Logische Untersuchungen II/l. Translated by J. N. Findlay as Logical
Investigations. London: Routledge & Kegan Paul, 1970.
Jacobson, Pauline 2002. The (dis)organisation of the grammar: 25 years. Linguistics & Philosophy
25,601-626.
Janssen, Theo 1986. Foundations and Applications of Montague Grammar. Amsterdam: CWITracts
19 and 28.
Janssen, Theo 1997. Compositionality. In: J. van Benthem & A. ter Meulen (eds.). Handbook of
Logic and Language. Amsterdam: Elsevier, 417-473.
Janssen, Theo 2001. Frcgc, contcxtuality and compositionality. Journal of Logic, Language and
Information 10,115-136.
Johnson, Kent 2006. On the nature of reverse compositionality. Erkenntnis 64,37-60.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from
Kaplan. Oxford: Oxford University Press, 481-563.
Keenan, Edward & Edward Stabler 2004. Bare Grammar: A Study of Language Invariants.
Stanford, CA: CSLI Publications.
King. Peter 2001. Between logic and psychology. John Buridan on mental language. Paper
presented at the conference John Buridan and Beyond. Copenhagen, September 2001.
King, Peter 2007. Abelard on mental language. American Catholic Philosophical Quarterly 81,
169-187.
Kracht, Marcus 2003. The Mathematics of Language. Berlin: Mouton de Gruyter.
Kracht, Marcus 2007. The emergence of syntactic structure. Linguistics & Philosophy 30,
47-95.
Lahav, Ran 1989. Against compositionality: The case of adjectives. Philosophical Studies 57,
261-279.
Larson, Richard & Gabriel Segal 1995. Knowledge of Meaning. An Introduction to Semantic Theory.
Cambridge, MA:Thc MIT Press.
Lewis, David 1976. Probabilities of conditionals and conditional probabilities. The Philosophical
Review 85,297-315.
Mates, Benson 1950. Synonymity. University of California Publications in Philosophy 25, 201-226.
Reprinted in: L. Linsky (ed.). Semantics and the Philosophy of Language. Urbana, IL: University
of Illinois Press, 1952,111-136.
Montague. Richard 1968. Pragmatics. In: R. Klibanski (ed.). Contemporary Philosophy: A Survey
I: Logic and foundations of Mathematics. La Nuove Italia Editice, Florence, 1968, 102-122.
Reprinted in: R. Thornason (ed.). Formal Philosophy. Selected Papers of Richard Montague.
New Haven, CT: Yale University Press, 1974,95-118.
Nunberg, Geoffry, Ivan Sag & Thomas Wasow 1994. Idioms. Language 70,491-538.
Pagin, Peter 2003a. Communication and strong compositionality. Journal of Philosophical Logic
32,287-322.
Pagin, Peter 2003b. Schiffer on communication. Facta Philosophica 5,25-48.
Pagin, Peter 2005. Compositionality and context. In: G. Preyer & G. Peter (eds.). Contextualism in
Philosophy. Oxford: Oxford University Press. 303-348.
Pagin. Peter 2011. Communication and the complexity of semantics. In: W. Hinzcn. E. Machcry
& M. Werning (eds.). The Oxford Handbook of Compositionality. Oxford: Oxford University
Press.
Pagin. Peter & Francis J. Pelletier 2007. Content, context and communication. In: G. Preyer &
G. Peter (eds.). Context-Sensitivity and Semantic Minimalism. New Essays on Semantics and
Pragmatics. Oxford: Oxford University Press, 25-62.
6. Compositionality 123
Pagin, Peter & Dag Westerstahl 2009. Pure Quotation, Compositionality. and the Semantics of
Linguistic Context. Submitted.
Partee, Barbara H. 1997. The genitive. A case study. In: J. van Benthem & A. ter Meulen (eds.).
Handbook of Logic and Language. Amsterdam: Elsevier, 464-470.
Pelletier, Francis J. 1994. The principle of semantic compositionality. Topoi 13, 11-24. Expanded
version reprinted in: S. Davis & B. Gillon (eds.), Semantics:A Reader. Oxford: Oxford University
Press, 2004,133-157.
Pelletier, Francis J. 1999. Semantic compositionality: Free algebras and the argument from
ambiguity. In: M. Faller. S. Kaufmann & M. Pauly (eds.). Proceedings of the Seventh CSLI Workshop
on Logic, Language and Computation. Stanford, CA: CSLI Publications, 207-218.
Pelletier, Francis J. 2001. Did Frcge believe Frege's Principle? Journal of Logic, Language and
Information 10,87-114.
Peters, Stanley & Dag Westerstahl 2006. Quantifiers in Language and Logic. Oxford: Oxford
University Press.
Putnam, Hilary 1975a. Do true assertions correspond to reality? In: H. Putnam. Mind, Language
and Reality. Philosophical Papers vol. 2. Cambridge: Cambridge University Press, 70-84.
Putnam, Hilary 1975b. Mind, Language and Reality. Philosophical Papers vol. 2. Cambridge:
Cambridge University Press.
Recanati, Francois 2004. Literal Meaning. Cambridge: Cambridge University Press.
Reiiner, Marga 2002. Do adjectives conform to compositionality? Nous 16,183-198.
Salmon, Nathan 1986. Frege's Puzzle. Cambridge, MA:The MIT Press.
Schiffer, Stephen 1987. Remnants of Meaning. Cambridge, MA:Thc MIT Press.
Scai 1c, John 1978. Literal meaning. Erkenntnis 13,207-224.
Sperber, Dan & Deirdre Wilson 1995. Relevance. Communication & Cognition. 2nd edn. Oxford:
Blackwell.
Staal. J. F. 1969. Sanskrit philosophy of language. In: T A. Sebeok (ed.). Linguistics in South Asia.
The Hague: Mouton. 499-531.
Szabo. Zoltin 2000. Compositionality as supervenience. Linguistics & Philosophy 23.475-505.
Szab6. Zoltin 2001. Adjectives in context. In: R. M. Harnich & I. Kenesei (eds.). Adjectives in
Context. Amsterdam: Benjamins, 119-146.
Travis, Charles 1985. On what is strictly speaking true. Canadian Journal of Philosophy 15,
187-229.
Weiskopf, Daniel 2007. Compound nominals, context, and compositionality. Synthese 156,161-204.
Westerstahl, Dag 1998. On mathematical proofs of the vacuity of compositionality. Linguistics &
Philosophy 21,635-643.
Westerstahl, Dag 2002. On the compositionality of idioms: An abstract approach. In: J. van
Benthem, D. I. Beaver & D. Barker-Plummer (ed.). Words, Proofs, and Diagrams. Stanford, CA:
CSLI Publications, 241-271.
Westerstahl. Dag 2004. On the compositional extension problem. Journal of Philosophical Logic
33,549-582.
Westerstahl, Dag 2007. Remarks on scope ambiguity. In: E. Ahls£n et al. (eds.). Communication -
Action - Meaning. A Festschrift to Jens Allwood. Goteborg: Department of Linguistics,
University of Gothenburg. 43-55.
Westerstahl. Dag 2008. Decomposing generalized quantifiers. Review of Symbolic Logic 1:3,
355-371.
Westerstahl, Dag 2011. Compositionality in Kaplan style semantics. In: W. Hinzen, E. Machery &
M. Werning (eds.). The Oxford Handbook of Compositionality. Oxford: Oxford University Press.
Zadrozny, Wlodek 1994. From compositional to systematic semantics. Linguistics & Philosophy 17,
329-342.
Peter Pagin, Stockholm (Sweden)
Dag Westerstfihl, Gothenburg (Sweden)
124
I. Foundations of semantics
7. Lexical decomposition: Foundational issues
1. The purpose of lexical decomposition
2. The early history of lexical decomposition
3. Theoretical aspects of decomposition
4. Conclusion
5. References
Abstract
Theories of lexical decomposition assume that lexical meanings are complex. This
complexity is expressed in structured meaning representations that usually consist of
predicates, arguments, operators, and other elements of propositional and predicate logic.
Lexical decomposition has been used to explain phenomena such as argument linking,
selectional restrictions, lexical-semantic relations, scope ambiguites, and the inference
behavior of lexical items. The article sketches the early theoretical development from noun-
oriented semantic feature theories to verb-oriented complex decompositions. It also deals
with a number of theoretical issues, including the controversy between decompositional
and atomistic approaches to meaning, the search for semantic primitives, the function of
decompositions as definitions, problems concerning the interpretability of decompositions,
and the debate about the cognitive status of decompositions.
1. The purpose of lexical decomposition
1.1. Composition and decomposition
The idea that the meaning of single lexical units is represented in the form of lexical
decompositions is based on the assumption that lexical meanings arc complex. This
complexity is expressed as a structured representation often involving predicates,
arguments, operators, and other elements known from propositional and predicate logic. For
example, the noun woman is represented as a predicate that involves the conjunction of
the properties of being human, female, and adult, whereas the verb empty can be thought
of as expressing a causal relation between x and the becoming empty of y.
(1) a. woman: A.x[human(x) & female(x) & adult(x)]
b. to empty: A.yA.x[cAUSE(x, BECOME(EMPTY(y)))]
The structures involved in lexical decompositions resemble semantic structures on the
phrasal and sentential level.There is of course an important difference between semantic
decomposition and semantic composition; semantic complexity on the phrasal and
sentential level mirrors the syntactic complexity of the expression while the assumed
semantic complexity on the lexical level - at least as far as non-derived words are
concerned - need not correspond to any formal complexity of the lexical expression.
Next, we give an overview of the main linguistic phenomena treated within
decompositional approaches (section 1.2). Section 2 looks at the origins of the idea of
lexical decomposition (section 2.1) and sketches some early formal theories on the lexical
decomposition of nouns (sections 2.2, 2.3) and verbs (section 2.4). Section 3 is devoted
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK33.1), deGruyter, 124-144
7. Lexical decomposition: Foundational issues 125
to a discussion of some long-standing theoretical issues of lexical decomposition, the
controversy between decompositional and non-decompositional approaches to lexical
meaning (section 3.1), the location of decompositions within a language theory (section
3.2), the status of semantic primitives (section 3.3), the putative role of decompositions
as definitions (section 3.4), the semantic interpretation of decompositions (section 3.5),
and their cognitive plausibility (section 3.6). The discussion relies heavily on the
overview of frameworks of lexical decomposition of verbs given in article 17 (Engelberg)
Frameworks of decomposition that can be consulted for a more detailed description of
the theories mentioned in the present article.
1.2. The empirical coverage of lexical decompositions
Which phenomena lexical decompositions are supposed to explain varies from
approach to approach. The following have been tackled fairly often in decompositional
theories:
(i) Argument linking: One of the main purposes for decomposing verbs has been
the attempt to form generalizations about the relationship between semantic
arguments and their syntactic realization. In causal structures like those given
for empty (lb) the first argument of a cause relation becomes the subject of the
sentence and is marked with nominative in nominative-accusative languages or
absolutive in ergative-absolutive languages. Depending on the linking theory
pursued, this can be expressed in different kinds of generalizations, for example, by
simply claiming that the first argument of cause always becomes the subject in
active sentences or - more general - that the least deeply embedded argument
of the decomposition is associated with the highest function in a corresponding
syntactic hierarchy.
(ii) Selectional restrictions: Lexical decompositions account for semantic co-occurrence
restrictions. The arguments selected by a lexical item are usually restricted to
particular semantic classes. If the item filling the argument position is not of the
required class, the resulting expression is semantically deviant. For instance, the verb
preach selects an argument filler denoting a human being for its first argument slot.
The decompositional features of woman (la) account for the fact that the woman
preached is semantically unobtrusive while the hope I chair I tree preached is not.
(iii) Ambiguity resolution: Adverbs often lead to a kind of sentential ambiguity
that was attempted to be resolved by reference to lexical decompositions. In a
scenario where Rebecca is pointing a gun at Jamaal, sentence (2a) may describe three
possible outcomes.
(2) a. Rebecca almost killed Jamaal.
b. kill: X.yX.x[Do(x,CAUSE(BECOME(DEAD(y))))]
Assuming a lexical decomposition for kill as in (2b), ambiguity resolution is achieved by
attaching almost to different predicates within the decomposition, yielding a scenario
where Rebecca almost pulled the trigger (almost do ...), a scenario where she pulled the
trigger but missed Jamaal (almost cause ...),and a scenario where she pulled the trigger,
hit him but did not wound him fatally (almost become dead ...).
126
I. Foundations of semantics
(iv) Lexical relations: Lexical decompositions have also been employed in the
analysis of semantic relations like hyperonymy, complementarity, synonymy, etc. (cf.
Bierwisch 1970:170). For example, assuming that a lexeme A is a hyperonym of a
lexeme B iff the set of properties conjoined in the lexical decomposition of lexeme
A is a proper part of the set of properties conjoined in the lexical decomposition of
lexeme B, we derive that child (3a) is a hyperonym of girl (3b).
(3) a. c/i/W:X.x[human(x) &->adult(x)]
b. girl: A.x[human(x) & ->adult(x) & female(x)]
(v) Lexical field structure: Additionally, lexical decompositions have been used in order
to uncover the structure of lexical fields (cf. section 2.2).
(vi) Inferences: Furthermore, lexical decompositions allow for semantic inferences that
can be derived from the semantics of primitive predicates. For example, a predicate
like become, properly defined, allows for the inference that Jamaal in (2a) was not
dead immediately before the event.
2. The early history of lexical decomposition
2.1. The roots of lexical decomposition
The idea that a meaning of a word can be explained by identifying it with the meaning of a
more complex expression is deeply rooted not only in linguistics but also in our common
sense understanding of language. When asked to explain to a non-native speaker what
the German word Junggeselle means, one would probably say that a Junggeselle is an
unmarried man. A decompositional way of meaning explanation is also at the core of the
Aristotelian conception of word meaning in which the meaning of a noun is sufficiently
explained by its genus proximum (here man) and its differentia specifica (here
unmarried). Like the decompositions in (1), this conception attempts to define the meaning
of a word. However, the distinction between genus proximum and differentia specifica
is not explicitly expressed in lexical decompositions: From a logical point of view, each
predicate in a conjunction as in (la) qualifies as a genus proximum.
The Aristotelian distinction is also an important device in lexicographic meaning
explanations as in (4a), where the next superordinate concept (donkey) of the lexical
item in question (jackass) and one or more distinctive features (male) are given (cf. e.g.,
Svensen 1993: 120ff). Interestingly, meaning explanations based on genus proximum
and differentia specifica have provoked some criticism within lexicography (Wiegand
1989) as well, and a closer look into standard monolingual dictionaries reveals that
many meaning explanations are not of the Aristotelian kind represented in (4a): They
involve near-synonyms (4b), integration of encyclopaedic (4c) and pragmatic
information (4d), extensional listings of members of the class denoted by the lexeme (4e),
pictorial illustrations (cf. numerous examples, e.g., in Harris 1923), or any combinations
thereof.
(4) a. jackass [...] 1. male donkey [...] (Thorndike 1941:501)
b. grumpy [...] surly;ill-humoured;gruff.[...] (Thorndike 1941:413)
7. Lexical decomposition: Foundational issues 12^7
c. scimitar [...] A saber with a much-curved blade with the edge on the convex
side, used chiefly by Mohammedans, esp. Arabs and Persians. [...] (Webster's,
Harris 1923:1895)
d. Majesty [...] title used in speaking to or of a king, queen, emperor, empress,
etc.; as, Your Majesty, His Majesty, Her Majesty. [...] (Thorndike 1941:562)
e. cat [...] 2. Any species of the family Felidae, of which the domestic cat is the
type, including the lion, tiger, leopard, puma, and various species of tiger cats,
and lynxes, also the cheetah. [...] (Webster's, Harris 1923:343)
This foreshadows some persistent problems of later approaches to lexical
decomposition.
2.2. Semantic feature theories and the semantic structure of nouns
As we have seen, the concept of some kind of decomposition has been around ever
since people began to systematically think about word meanings. Yet, it was not until the
advent of Structural Semantics that lexical decompositions have become part of more
restrictive semantic theories. Structural Semantics emerged in the late 1920s as a
reaction to the semantic mainstream, which, at the time, was oriented towards psychological
explanations of idiolectal variation and the diachronic change of single word meanings.
It conceived of lexical semantics as a discipline that revealed the synchronic structure of
the lexicon from a non-psychological perspective. The main tenet was that the meaning
of a word can only be captured in its relation to the meaning of other words.
Within Structural Semantics, lexical decompositions developed in the form of
breaking down word meanings into semantic features (depending on the particular
approach also called'semantic components','semantic markers', or'sememes'). An early
analysis of this sort can be found in Hjelmslev's Prolegomena from 1943 (Hjelmslev
1963: 70) who observed that systematic semantic relationships can be traced back
to shared semantic components (cf. Tab. 7.1). He favored a strict decompositional
approach in that (i) he explicitly conceived of decompositions like the ones in Tab. 7.1 as
definitions of words and (ii) assumed that content-entities like 'ram', 'woman', 'boy'
have to be eliminated from the inventory of content-entities if they can be defined by
decompositions (Hjelmslev 1963:72ff).
Tab. 7.1: Semantic components (after Hjelmslev 1963:70)
'sheep'
'human
'child'
'horse'
being'
'he'
'ram'
'man'
'boy'
'stallion'
'she'
'ewe'
'woman'
'girl'
'mare'
Following the Prague School's feature-based approach to phonology, it was later assumed
that semantic analyses should be based on a set of functional oppositions like [±human],
[±male], etc. (cf. also article 16 (Bierwisch) Semantic features and primes). Semantic
feature theories developed along two major lines. In Europe, structuralists like Pottier
(1963,1964), Coseriu (1964), and Greimas (1966) employed semantic features to reveal
128
I. Foundations of semantics
the semantic structure of lexical fields. A typical example for a semantic feature
analysis in the European structuralist tradition is Pottier's (1963) analysis of the lexical field
of sitting furniture with legs (French siege) that consists of the lexemes chaise, fauteuil,
tabouret,canape, andpouf(ci.Tab.7.2). Six binary features serve to define and structure
the field: s1 = avec dossier 'with back', s2 = sur pied 'on feet', s3 = pour 1 personne 'for
one person', s4 = pour s'asseoir 'for sitting', s5 = avec bras 'with armrest', and s6 = avec
matdriau rigide 'with rigid material'.
Tab. 7.2: Semantic feature analysis of the lexical field siege ('seat with legs') in French (Pottier 1963:16)
s1 s2 s* s4 ss s6
chaise + + + + - +
fauteuil + + + + + +
tabouret - + + + - +
canape + + - + + +
pouf - + + +
In North America, Katz, Fodor, and others tried to develop a formal theory of the
lexicon as a part of so-called Interpretive Semantics that constituted the semantic
component of the Standard Theory of Generative Grammar (Chomsky 1965). In this tradition,
semantic features served, for example, as targets for selectional restrictions (Katz &
Fodor 1963). The semantic description of lexical items consists of two types of features,
'semantic markers' and 'distinguishes', by which the meaning of a lexeme is decomposed
exhaustively into its atomic concepts: "The semantic markers assigned to a lexical item
in a dictionary entry are intended to reflect whatever systematic semantic relations hold
between that item and the rest of the vocabulary of the language. On the other hand,
the distinguishes assigned to a lexical item arc intended to reflect what is idiosyncratic
about its meaning." (Katz& Fodor 1963:187). An example entry is given in Tab. 7.3.
Tab. 7.3: Readings of the cnglish noun bachelor distinguished by semantic markers (in parentheses)
and distinguishcrs (in square brackets) (Katz, after Fodor 1977:65)
bachelor, [+N,...],
— (Human), (Male), [who has never married]
— (Human), (Male), [young knight serving under the standard of another knight]
— (Human), [who has the first or lowest academic degree]
— (Animal), (Male), [young fur seal when without a mate during the breeding time]
Besides this feature-based specification of the meaning of lexical items, Interpretive
Semantics assumed recursive rules that operate over syntactic deep structures and build
up meaning specifications for phrases and sentences out of lexical meaning specifications
(Katz & Postal 1964).
As we have seen, semantic feature theories make it possible to tackle phenomena
in the area of lexical fields and selectional restrictions (cf. also article 21 (Cann) Sense
relations). They can also be used in formal accounts of lexical-semantic relations. For
example, expression A is incompatible with expression B iff A and B have different
7. Lexical decomposition: Foundational issues 129
values for at least one of their semantic features: boy [+human, -adult, -female] is
incompatible with woman [+human, +adult, +female]. Expression A is
complementary to expression B iff A and B have different values for exactly one of their semantic
features: for instance,girl [+human, -adult, +female] is complementary to boy.
Expression A is hyperonymous to expression B iff the set of feature-value assignments for A
is included in the set of feature-value assignments for B: thus, child [+iiuman, -adult] is
hyperonymous to boy.
In European structuralism, the status of semantic features was a matter of debate.
They were usually conceived of as part of a descriptive, language-independent semantic
metalanguage, but were also treated as cognitive entities. In Generative Grammar,
Katz conceived of semantic features as derived from a universal conceptual
structure: "Semantic markers must [...] be thought of as theoretical constructs introduced
into semantic theory to designate language invariant but language linked components
of a conceptual system that is part of the cognitive structure of the human mind." (Katz
1967: 129). In a similar vein, Bierwisch (1970: 183) assumed that the basic semantic
components "are not learned in any reasonable sense of the term, but are rather an
innate predisposition for language acquisition." Thus, language-specific semantic
structures come about by the particular combination of semantic features that yield a
lexical item.
2.3. Some inadequacies of semantic feature analyses
Semantic feature theories considerably stimulated research in lexical semantics. Beyond
that, semantic features had found their way into contemporary generative syntax as a
target for sclcctional restrictions (cf. Chomsky 1965). Yet, a number of empirical
weaknesses have quickly become evident.
(i) Relational lexemes: The typical cases of semantic feature analyses seem to
presuppose that all predicates are one-place predicates. Associating woman with the
feature bundle [+human, +adult, +female] means that the referent of the sole
argument of woman{\) has the three properties of being human, adult, and female.
Simple feature bundles cannot account for relational predicates like mother(\,y) or
devour(x,y), because the argument to which a semantic feature attaches has to be
specified. With mother(x,y), the feature [+female] applies to the first, but not to the
second argument.
(ii) Structure of verb decompositions: Semantic feature analyses usually represent
word meanings as unordered sets of features. However, in particular with verbal
predicates, the decomposition cannot be adequately formulated as a flat structure
(cf. section 2.4).
(iii) Undecomposable words: It has been criticized that cohyponyms in larger
taxonomies, such as lion, tiger, puma, etc. as cohyponyms of cat (cf. 4e) or rose, tulip,
daffodil, carnation, etc. as cohyponyms of flower, cannot be differentiated by semantic
features in a non-trivial way. If one of the semantic features of rose is [+flower],
then what is its distinguishing feature? This feature should abstract from a rose
being a flower since [+flower] has already been added to the feature list.
Moreover, it should not be unique for the entry of rose or else the number of features
130
I. Foundations of semantics
threatens to surpass the number of lexical entries. In other words, there does not
seem to be any plausible candidate for P that would make Vx[rose(x) <-> (p(x) &
flower(x))] a valid definition (cf. Fodor 1977:150; Roelofs 1997:46ff for arguments
of this sort). Besides cohyponymy in large taxonomies, there are other lexical
relations as well that cannot be adequately captured by binary feature descriptions, for
instance the scalar nature of antonymy and lexical rows like hot > warm > tepid >
cool > cold. In general, it is simply unclear for many lexical items what features
might be used to distinguish them from near-synonyms (cf. grumpy in (4b)).
(iv) Exhaustiveness: For most lexical items, it seems to be impossible to give an
exhaustive lexical analysis, that is, one that provides the features that are necessary and
sufficient to distinguish the item from all other lexical items of the language without
introducing features that are used solely for the description of this one particular
item. Katz's distinction between markers and distinguishes does not solve the
problem. Apart from the fact that the difference between 'markers' as targets for
selectional restrictions and 'distinguishes' as lexeme-specific idiosyncratic
features is not supported by the data (cf. Bierwisch 1969:177ff; Fodor 1977:144ff), this
concession to semantic diversity weakens the explanatory value of semantic
feature theories considerably since no restrictions are provided for what can occur as
a distinguishes
(v) Finiteness: Only rarely have large inventories of features been assembled (e.g., by
Lorcnz & Wotjak 1977). Moreover, semantic feature theory has not succeeded in
developing operational procedures by which semantic features can be discovered.
Thus, it has not become evident that there is a finite set of semantic features that
allows a description of the entire vocabulary of a language, in particular that this set
is smaller than the set of lexical items.
(vi) Universality: Another point of criticism has been that the alleged universality of
a set of features has not been convincingly demonstrated. As Lyons (1968: 473)
stated, cross-linguistic semantic comparisons of semantic structures rather point to
the contrary. However, the search for a universal semantic metalanguage has
continued and has become a major topic in particular among the proponents of Natural
Semantic Metalanguage (cf. article 17 (Engelberg) Frameworks of decomposition,
section 8).
(vii) Theoretical status: The often unclear theoretical status of semantic features has
drawn some criticism as well. Among other things, it has been argued that in order
to express that a mare is a female horse it is not necessary to enrich the
metalanguage by numerous features. The relation can equally well be expressed on the level
of object language by assuming a meaning postulate in form of a biconditional:
oVx[mare(x) <-> horse(x) & female(\)].
2.4. Lexical decomposition and the semantic structure of verbs
The rather complex semantic structure of many verbs could not be adequately captured
by semantic feature approaches for two reasons: They focused on one-place lexemes, and
they expressed lexical meaning in flat structures, that is, by simply conjoining semantic
features. Instead, hierarchical structures were needed. Katz (1971) tackled this problem
in the form of decompositions that also included aspectually relevant features such as
'activity' (cf.Tab. 7.4).
7. Lexical decomposition: Foundational issues 13J_
Tab. 7.4: Decomposition of chase (after Katz 1971:304)
chase -»Verb, Verb transitive....;
(((Activity) (Nature: (Physical)) of*), ((Movement) (Rate: Fast)) (Character: Following)),
(Intention of*: (Trying to catch ((V) ((Movement) (Rate: Fast))))); (SR).
While this form of decomposition never caught on, other early verb decompositions (cf.
Bendix 1966, Fillmore 1968, Bierwisch 1970) look more familiar to semantic
representations still employed in many theories of verb semantics:
(5) a. g/ve(x,y,z): x cause (y have z) (after Bendix 1966:69)
b. persuade(x,y^.): x cause (y believe z) (after Fillmore 1968:377)
It was the rise of Generative Semantics in the late 1960s that caused a shift in interest
from decompositional structures of nouns to lexical decompositions of verbs. The
history of lexical decompositions of verbs that emerged from these early approaches is
reviewed in article 17 (Engelberg) Frameworks of decomposition. Starting from early
generative approaches, verb decompositions have been employed in theories as different
as Conceptual Semantics (cf. also article 30 (Jackendoff) Conceptual Semantics),
Natural Semantic Metalanguage, and Distributed Morphology (cf. also article 81 (Harlcy)
Semantics in Distributed Morphology).
3. Theoretical aspects of decomposition
3.1. Decompositionalism versus atomism
Directly opposed to decompositional approaches to lexical meaning stands a theory of
lexical meaning that is known as lexical atomism or holism and whose main proponent is
Jerry A.Fodor (1970,1998,Fodor et al. 1980). According to a decompositional concept of
word meaning, knowing the meaning of a word involves knowing its decomposition, that
is, the linguistic or conceptual entities and relations it consists of. Fodor's atomistic
conception of word meaning rejects this view and assumes instead that there is a direct
correspondence between a word and the mental particular it stands for. A lexical meaning
does not have constituents, and - in a strict formulation of atomism - knowing it does not
involve knowing the meaning of other lexical units.
Fodor (1998: 45) observes in favor of his atomistic, anti-definitional approach that
there arc practically no words whose definition is generally agreed upon - an argument
that cannot be easily dismissed. Atomists are also skeptical about the claim that
decompositions/definitions are simpler than the words they are attached to: "Does anybody
present really think that thinking bachelor is harder than thinking unmarried? Or that
thinking father is harder than thinking parent?" (Fodor 1998:46).
Discussing Jackendoff s (1992) decompositional representation of keep, Fodor (1998:
55) comments on the relation that is expressed in examples as different as someone kept
the money and someone kept the crowd happy: "I would have thought, saying what
relation they both instance is precisely what the word 'keep' is for; why on earth do you
suppose that you can say it 'in other words'?" And he adds: "I can't think of a better way to
say what 'keep' means than to say that it means keep. If, as I suppose, the concept keep is
132
I. Foundations of semantics
an atom, it's hardly surprising that there's no better way to say what 'keep' means than
to say that it means keep" More detailed arguments for and against atomistic positions
will appear throughout this article.
The controversy between decompositionalists and atomists is often connected to the
question whether decompositions or meaning postulates should be employed to
characterize lexical meaning. Meaning postulates are used to express analytic knowledge
concerning particular semantic expressions (Carnap 1952: 67). Lexical meaning postulates
are necessarily true. They consist of entailments where the antecedent is an open lexical
proposition (6a).
(6) a. dVx[bachelor(x) -> man(x)]
dVx[bacmelor(x) -» ->married(x)]
b. dVx[baciielor(x) <-> (man(x) & ->married(x))]
c. bachelor: A.x[man(x) & unmarried(x)]
Meaning postulates can also express bidirectional entailments as in (6b) where the
biconditional expresses a definition-like equivalence between a word and its
decomposition. I will assume that in the typical case on a pure lexical level of meaning
description dccompositional approaches like (6c) conceive of word meanings as
bidirectional entailments as in (6b) while atomistic approaches involve monodirectional
entailments as in (6a) (cf. similarly Chierchia & McConnell-Ginet 1990: 360ff). Thus,
meaning postulates do not per se characterize atomistic approaches to meaning, but it
is rather the kind of meaning postulate that serves to distinguish the two basic stances
on word meaning. Informally, one might say that bidirectional meaning postulates
provide definitions, monodirectional ones single aspects of word meaning in form of
relations to other semantic elements. Three caveats are in order here: (i) Semantic
reconstruction on the basis of meaning postulates is not uniformly accepted either in
the dccompositional or in the atomistic camp. Some proponents of dccompositional
approaches do not adhere to a definitional view of decompositions; they claim that
their decompositions do not cover the whole meaning of the lexical item (cf. section
3.4). At the same time, some radical approaches to atomism reject lexical meaning
postulates completely (Fodor 1998). (ii) Decompositions and bidirectional meaning
postulates are only equivalent on the level of meaning explanation (cf. Chierchia
& McConnell-Ginet 1990: 362). They differ, however, in that in dccompositional
approaches the word defined (bachelor in (6c)) is not accessible within the semantic
derivation while the elements of the decomposition (man(x) & unmarried(x))
are. This can have an effect, for example, on the explanation of scope phenomena,
(iii) Furthermore, decompositions as in (6c) and bidirectional meaning postulates
as in (6b) can give rise to different predictions with respect to language processing
(cf. section 3.6).
3.2. Decompositions and the lexicon
One of the most interesting differences in the way verb decompositions are used in
different language theories concerns their location within the theory. Some approaches
locate decompositions and the principles and rules that build them up in syntax (e.g.,
7. Lexical decomposition: Foundational issues 133
Generative Semantics, Distributed Morphology), some in semantics (e.g., Dowty's
Montague-based theory, Lexical Decomposition Grammar, Natural Semantic
Metalanguage), and others in conceptual structure (e.g., Conceptual Semantics) (cf. article
17 (Engelberg) Frameworks of decomposition). Decompositions are sometimes
constrained by interface conditions as well. These interface relations between linguistic
levels of representation are specified to a different degree in different theories. Lexical
Decomposition Grammar has put some effort into establishing interface conditions
between syntactic, semantic, and conceptual structure. In syntactic approaches to
decomposition (Lexical Relational Structures, Distributed Morphology), however,
the relation between syntactic decompositions and semantic representations often
remains obscure - one of the few exceptions being von Stcchow's (1995) analysis of the
scope properties of German wieder 'again' in syntactic decompositions (cf. article 17
(Engelberg) Frameworks of decomposition).
The way decompositions are conceived has an impact on the structure of the
lexicon and its role within the architecture of the language theory pursued. While some
approaches advocate rather rich meaning representations (e.g., Natural Semantic
Metalanguage, Conceptual Semantics), others downplay semantic representation and reduce it
to a rather unstructured domain of encyclopaedic knowledge (Distributed Morphology)
(cf. the overview in Ramchand 2008). Meaning representation itself can occur on more
than one level. Sometimes the distinction is between semantics proper and some sort
of conceptual representation (e.g., Lexical Decomposition Grammar); sometimes
different levels of conceptual representation are distinguished such as Jackendoffs (2002)
Conceptual Structure and Spatial Structure.Theories also differ in how much the lexicon
is structured by rules and principles. While syntactic approaches often conceive of the
lexicon as a mere inventory, other approaches (e.g. Levin & Rappaport Hovav's Lexical
Conceptual Structures, Wunderlich's Lexical Decomposition Grammar, Pustejovsky's
Event Structures) assume different kinds of linking principles, interface conditions and
structure rules for decompositions (for references, cf. article 17 (Engelberg) Frameworks
of decomposition).
3.3. Decompositions and primitives
Decompositional approaches to lexical meaning usually claim that all lexical items can be
completely reduced to their component parts; that is, they can be defined. This requires
certain conditions on decompositions in order to avoid infinite regress: (i) The
predicates used in the decompositions are either semantic primitives or can be reduced to
semantic primitives by definitions. It is necessary that the primitives are not reduced to
other elements within the vocabulary, but are grounded elsewhere, (ii) Another condition
is that the set of primitives be notably smaller than the lexicon (cf. Fodor et al. 1980:268).
(iii) Apart from their primitivity, it is often required that predicates within decompositions
be general, that is, distinctive for a large number of lexemes, and universal, that is, relevant
to the description of lexemes in all or most languages (cf. Lobner 2002:132ff) (cf. also the
discussion in article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure).
The status of these primitives has been a constant topic within decompositional
semantics. Depending on the particular theories, the vocabulary of semantic primitives
is located on different levels of linguistic structure. Theories differ as to whether these
predicates are elements of the object language (e.g., Natural Semantic Metalanguage),
134
I. Foundations of semantics
of a semantic metalanguage (e.g., Montague Semantics) or of some set of conceptual
entities (e.g., Lexical Conceptual Semantics). A finite set of primitives is rarely given,
the notable exception being Natural Semantic Metalanguage. Most theories obviously
assume a core of the decompositional vocabulary, including such items as cause, become,
do, etc., but they also include many other predicates like alive, believe, in, mouth, write.
Since they are typically not concerned with all subtleties of meaning, most theories often
do not bother about the status of these elements.They might be conceived of as definable
or not. While in Dowty's (1979) approach cause gets a counterfactual interpretation in
the vein of Lewis (1973), Lakoff (1972: 6l5f) treats cause as a primitive universal and,
similarly, in Natural Semantic Metalanguage because is taken as a primitive.
However, no matter how many primitives a theory assumes, something has to be
said about how these elements can be grounded. Among the possible answers are the
following: (i) Semantic primitives are innate (or acquired before language acquisition)
(e.g., Bierwisch 1970). To my knowledge, no evidence from psycholinguistics or neuro-
linguistics has been obtained for this claim, (ii) Semantic primitives can be reduced to
perceptual features. Considering the abstract nature of some semantic features, a
complete reduction to perception seems unlikely (cf. Jackendoff 2002: 339). (iii) Semantic
primitives are conceptually grounded (e.g., Natural Semantic Metalanguage, Conceptual
Semantics).This is often claimed but rarely pursued empirically (but cf. Jackendoff 1983,
Engelberg 2006).
3.4. Decompositions and definitions
If lexical decompositions are semantically identified with biconditional meaning
postulates, they can be regarded as definitions: They provide necessary and sufficient
conditions. It has been questioned whether word meaning can be captured this way. The
non-equivalence of kill and cause to die had been a major argument against
Generative Semantics (cf. article 17 (Engelberg) Frameworks of decomposition, section 2). It
has been observed that the definitional approach simply fails on a word-by-word basis:
"There are practically no defensible examples of definitions; for all the examples we've
got, practically all words (/concepts) are undefinable.And,of course, if a word (/concept)
doesn't have a definition, then its definition can't be its meaning." (Fodor 1998:45) Given
what was said about lexicography in section 2.1, we must concede that even by a less
strict view on semantic equivalence, the definitional approach can only be applied to a
subgroup of lexical items. Atomistic and prototype-based approaches to lexical meaning
thrive on these observations. Decompositionalist approaches to word meaning have
reacted differently to these problems. Some have just denied them: Natural Semantic
Metalanguage claims that based on a limited set of semantic primitives and their
combinatorial potential a complete decomposition is possible. Other approaches, in particular
Conceptual Semantics, point to the particular descriptive level of their decompositions.
They claim that cause does not have the same meaning as cause, the former being a
conceptual entity. This allows Jackendoff (2002: 335f) to state that decompositions are
not definitions since no equation between a word and a synonymous phrasal expression
is attempted. However, the meanings of the decompositional predicates employed are
all the more in need of explanations since our intuition about the meaning of natural
language lexemes does not account for them anymore. As Pulman (2005) puts it in a
discussion of Jackendoff's approach: "[...] if your intuition is that part of the meaning
7. Lexical decomposition: Foundational issues 135
of'drink' is that liquid should enter a mouth, then unless there is some explicit
connection between the construct mouth and the meaning of the English word 'mouth', that
intuition is not accounted for." Finally, some researchers assume that decompositions
only capture the core meaning of a word (e.g., Kornfilt & Correra 1993:83).Thus,
decompositions do not exhaust a word's meaning, and they are not definitions. This, of course,
poses the questions what they are decompositions of and by what semantic criteria core
aspects of lexical meaning can be identified. In any case, a conception of
decompositions as incomplete meaning descriptions weakens the approach considerably."It is, after
all, not in dispute that some aspects of lexical meanings can be represented in quite an
exiguous vocabulary; some aspects of anything can be represented in quite an exiguous
vocabulary," Fodor (1998:48) remarks and adds: "It is supposed to be the main virtue of
definitions that, in all sorts of cases, they reduce problems about the defined concepts
to corresponding problems about its primitive parts. But that won't happen unless each
definition has the very same content as the concept it defines." (Fodor 1998:49) In some
approaches the partial specification of meaning in semantic decompositions results from
a distinction between semantic and conceptual representation (e.g., Bierwisch 1997).
Semantic decompositions are underspecified and are supplemented on the utterance
level with information from a conceptual representation.
The completeness question is sometimes tied to the attempt to distinguish those
aspects of meaning that are grammatically relevant from those that are not. This is done
in different ways. Some assume that decompositions arc incomplete and represent only
what is grammatically relevant. Others differentiate between levels of representation;
Lexical Decomposition Grammar distinguishes Semantic Form, which includes the
grammatically relevant information, from Conceptual Structure. Finally, some assume
one level of representation in which only particular parts are grammatically relevant;
in Rappaport Hovav & Levin (1998), general decompositional templates contain the
grammatically relevant information whereas idiosyncratic aspects arc reflected by
lexeme-specific constants that are inserted into these templates.
However, the distinction between grammatically relevant and irrelevant properties
within a semantic representation raises a serious theoretical question. It is a truism that
not all subtleties of lexical meaning show grammatical effects. With to eat, the implied
aspect of intentional agentivity is grammatically relevant in determining which
argument becomes the subject while the implied aspect of biological food processing is not.
However, distinguishing the grammatically relevant from the irrelevant by assigning
them different locations in representations is not more than a descriptive convention
unless one is able to show that grammatically relevant meaning is a particular type of
meaning that can be distinguished on semantic grounds from grammatically irrelevant
meaning. But do all the semantic properties that have grammatical effects (intentional
agentivity and the like) form one natural semantic class and those that do not (biological
food processing and the like) another? As it stands, it seems doubtful that such classes
will emerge. As Jackendoff (2002: 290) notes, features with grammatical effects form a
heterogeneous set. They include well-known distinctions such as those between agent
and experiencer or causation and non-causation but also many idiosyncratic properties
such as the distinction between emission verbs where the sound can be associated with
an action of moving and which therefore allow a motion construction (the car squealed
around the corner) and those where this is not the case (*the car honked around the
corner) (cf. Levin & Rappaport Hovav 1996).
136
I. Foundations of semantics
3.5. Decompositions and their interpretation
It is evident that in order to give empirical content to theoretical claims on the basis of
lexical semantic representations, the meaning of the entities and configurations of
entities in these semantic representations must be clear. This is a major problem not only
for decompositional approaches to word meaning.To give an example from Distributed
Morphology (DM), Harlcy & Noycr (2000: 368) notice that cheese is a mass noun and
sentences like I had three cheeses for breakfast are unacceptable. The way DM is set up
requires deriving the mass noun restriction from encyclopaedic knowledge (cf. article
17 (Engelberg) Frameworks of decomposition, section 10). Thus, it is listed in the
encyclopaedia that "cheese does not typically come in discrete countable chunks or types".
However, to provide a claim like that with any empirical content, we need a methodology
for determining what the encyclopaedic knowledge for cheese looks like. As we will see,
lexical-semantic representations often exhibit a conflict when it comes to determining
what a word actually means. If we use its syntactic behaviour as a guideline for its meaning,
the semantic representation is not independently motivated but circularly determined by
the very structures it purports to determine. If we use our naive everyday conception of
what a word means, we lack an objective methodology of determining lexical meaning.
Moreover, the two paths lead to different results. If we take a naive, syntax-independent
look at the encyclopaedic semantics of cheese, a visit to the next supermarket will tell us
that - contrary to what Harlcy & Noyer claim - all cheese comes in chunks or slices, so
does all sausage and some of the fruit. Thus, it seems that the encyclopaedic knowledge
in DM is not arrived at by naively watching the world around you. If it were, the whole
architecture of Distributed Morphology would probably break down since, as corpus
evidence shows, the German words for'cheese' Kdse, 'sausage' Wurst, and 'fruit' Obst are
different with respect to the count/mass distinction although their supermarket
appearance with respect to chunks and slices is very similar: Wurst can be freely used as a count
and a mass noun; Kdse is a mass noun that can be used as a count noun, especially, but
not only, when referring to types of cheese, and Obst is obligatorily a mass noun.Thus, the
encyclopaedic knowledge about cheese in Distributed Morphology seems to be forced
by the fact that cheese is grammatically a mass noun. This two-way dependency between
grammar and encyclopaedia immunizes DM against falsification and, thus, renders a
central claim of DM empirically void.
The neglect of semantic methodology and theory particularly in those approaches
that advocate a semantically constrained but otherwise free syntactic generation of
argument structures has of course been pointed out before: "Although lip service is often
paid to the idea that a verb's meaning must be compatible with syntactically determined
meaning [...], it is the free projection of arguments that is stressed and put to work, while
the explication of compatibility is taken to be trivial." (Rappaport Hovav & Levin 2005:
275). It has to be emphasized that it is one thing to observe differences in the syntactic
behaviour of words in order to build assumptions about hitherto unnoticed semantic
differences between them but it is a completely different matter to justify the existence of a
particular semantic property of a word on the basis of a syntactic construction it occurs
in and then use this property to predict its occurrence in this construction. The former
is a useful heuristic method for tracing semantic properties that would then need to be
justified independently; the latter is a circular construction of explanations that can rob
a theory of most of its empirical value (cf. Engelberg 2006). This becomes particularly
7. Lexical decomposition: Foundational issues 137
obvious in approaches that distinguish grammatically relevant from grammatically
irrelevant meaning. For example, Grimshaw (2005: 75f) claims that "some meaning
components have a grammatical life" ('semantic structure') while "some are linguistically inert"
('semantic content'). Only the former have to be linguistically represented. She then
discusses Jackendoff's (1990:253) representation of ear as a causative verb, which according
to Jackendoff means that xcauses^ to go into x's mouth.While she agrees that Jackendoff's
representation captures what eat "pretheoretically" means, she does not consider
causation as part of the representation of eat since eat differs from other causatives (e.g., melt)
in lacking an inchoative variant (Grimshaw 2005: 85f). Thus, we are confronted with a
concept of grammatically relevant causation and a concept of grammatically irrelevant
causation, which apart from their grammatical relevance seem to be identical.The result
is a theory that pretends to explain syntactic phenomena on the basis of lexical meaning
but actually just maps syntactic distinctions onto distinctions on a putatively semantic
level, which, however, is not semantically motivated.
Further problems arise if decompositions are adapted to syntactic structures: There is
consensus among semanticists and philosophers that causation is a binary relation with
both relata belonging to the same type. Depending on the theory, the relation holds either
between events or proposition-like entities. In decompositional approaches, the causing
argument is often represented as an individual argument, cause(x,p), since purportedly
causative verbs only allow agent-denoting NPs in subject position. When this conflict
is discussed, it is usually suggested that the first argument of cause is reinterpreted as
'x does something' or 'that x does something'. Although such a reinterpretation can be
formally implemented, it raises the question why decomposition is done at all. One could
as well stay with simple predicate-argument structures like dry(x,y) and reinterpret them
as 'something that x does causes y to become dry'. Furthermore, the decision to represent
the first argument of cause as an individual argument is not motivated by the meaning
of the lexical item but, in a circular way, by the same syntactic structure that it claims to
explain. Finally, the assumption that all causative verbs require an agentive NP in subject
position is wrong. While it holds for verbs like German trocknen 'to dry' it does not hold
for verbs like vergrojkrn 'enlarge' that allow sentential subjects. In any case, the
asymmetrical representation of causation raises the problem that two cause predicates need
to be introduced where one is reinterpreted in terms of the other. The problem is even
bigger in approaches like Distributed Morphology that have done away with a mediating
lexicon. If wc assume that the bi-propositional (or bi-cventive) nature of causation is
part of the encyclopaedic knowledge of vocabulary items that express causation, then
causative verbs that only allow agentive subjects should be excluded from transitive verb
frames completely.
Even if the predicates used in decompositions are characterized in semantic terms,
the criteria are not always sufficiently precise to decide whether or not a verb has the
property expressed. Levin & Rappaport Hovav (1996) observe that there are
counterexamples to the wide-spread assumption that telic intransitive verbs are unaccusative
and agentive intransitive verbs are unergative, namely, verbs of sound, which are uner-
gative but not necessarily agentive nor telic {beep, buzz, creak, gurgle). Therefore they
propose that verbs that refer to "internally caused eventualities" arc unergative, which
is the case if "[...] some property of the entity denoted by the argument of the verb
is responsible for the eventuality" (Levin & Rappaport Hovav 1996: 501). If we want
to apply this idea to the unaccusative German zerbrechen 'break' and the unergative
138
I. Foundations of semantics
knacken 'creak', which they do not discuss, we have to check whether it is true that some
property of the twig is responsible for the creaking in der Zweig hat geknackt 'the twig
creaked' while there is no property of the twig that is responsible for the breaking in der
Zweig ist zerbrochen 'the twig broke'. In order to do that, we must know what 'internal
causation' is; that is, we have to answer questions like: What is 'causation'? What is
'responsibility'? What is 'eventuality'? Is 'responsibility', contrary to all assumptions
of theories of action, a predicate that applies to properties of twigs? What property of
twigs are we talking about? Is (internal) 'causation', contrary to all theories of
causation, a relation between properties and eventualities? As long as these questions are
not answered, proponents of the theory will agree that the creaking of the twig but not
the breaking is internally caused while opponents will deny it. And there is no way to
resolve this (cf. Engelberg 2001).
Similar circularities have been noticed by others. Fodor (1998: 51, 60ff), citing work
from Pinker (1989) and Higginbotham (1994), criticizes that claims about linking based
on interminably vague semantic properties elude any kind of evaluation. This is all the
worse since the predicates involved in decompositions are not only expressions in the
linguist's metalanguage but are concepts that are attributed to the speaker and his or her
knowledge about language (Fodor 1998:59).
Stipulations about the structure of decompositions can diminish the empirical value
of the theory, too. Structural aspects of decompositions concern the way embedded
predicates are combined as well as the argument structure of these predicates. For
example, predicates like cause or poss are binary because we conceive of causation
and possession as binary relations. Their binarity is a structural aspect that is deeply
rooted in our understanding of these concepts. However, it is just by convention that
most approaches represent the causing entity and the possessor as the first argument
of cause and poss, respectively. Which argument of multi-place predicates stands for
which entity is determined by the truth conditions for this predicates.Thus, the difference
between POss(xposs|:sso",yENTnv"POSSESSED) and poss(xENTmYPOSShSS,£D,yPOSSESSO") is just a notational
one. Whenever explanations rely on this difference, they are not grounded in
semantics but in notational conventions. This is, for example, the case in Lexical
Decomposition Grammar where the first argument of poss falls out as the higher one - with all its
consequences forTheta Structure and linking principles (Wunderlich 1997:39).
In summary, the problems with interpreting the predicates used in decompositions
and structural stipulations severely limit the empirical content of predictions based on
these decompositions. Two ways out of this situation have been pursued only rarely.
Decompositional predicates can be given precise truth conditions, as is done in Dowty
(1979), or they can be linked to cognitive concepts that are independently motivated
(e.g., Jackendoff 1983, Engelberg 2006).
3.6. Decompositions and cognition
Starting from the 1970s, psycholinguistic evidence has been used to argue for or against
lexical decompositions. While some approaches to decompositions accepted
psycholinguistic data as relevant evidence, proponents of particular lexical theories denied
that their theories are about lexical processing at all. Dowty (1979:391) emphasized that
what the main decompositional operators determine is not what the speaker/listener
7. Lexical decomposition: Foundational issues 139
must compute but to what he can infer. Similarly, Goddard (1998:135) states that "there
is no claim that people, in the normal course of linguistic thinking, compose their thoughts
directly in terms of semantic primitives; or, conversely, that normal processes of
comprehension involve real-time decomposition down to the level of semantic primitives." It has
also been suggested that in theorizing about decompositional versus atomistic theories
one should distinguish whether lexical concepts are definitionally primitive,
computationally primitive (pertaining to language processing), and/or developmentally primitive
(pertaining to language acquisition) (cf. Fodoret al. 1980:313; Carey 1982:350f).
When lexical decompositions are interpreted in psycholinguistic terms, the typical
assumption is that the components of a decomposition are processed each time the
lexical item is processed. Lexical processing efforts should emerge as a function from the
complexity of the lexical decomposition to processing time: The more complex the
decomposition, the longer the processing time. Most early psycholinguistic studies did
not produce evidence for lexical decomposition (cf. Fodor et al. 1980; Johnson-Laird
1983). Employing a forced choice task and a rating test, Fodor et al. (1980) failed to find
processing differences between causative verbs like kill, which are putatively decompo-
sitionally complex, and non-causative verbs like bite. Fodor, Fodor & Garrett (1975:522)
reported a significant difference between explicit negatives (e.g., not married) and
putatively implicit negatives (e.g., an UNMARRiED-feature in bachelor) in complex conditional
sentences like (7).
(7) a. If practically all men in the room are not married, then few of the men in the
room have wives,
b. If practically all men in the room are bachelors, then few of the men in the
room have wives.
Sentences like (7a) that contain explicit negatives gave rise to longer processing times,
thus suggesting that bachelor does not contain hidden negatives. Measuring fixation
time during reading, Rayner & Duffy (1986) did not find any differences between
putatively complex words like causatives and non-causatives. Similarly, Roelofs (1997: 48ff)
discussed several models for word retrieval and argued for a non-decompositional
spreading-activation model.
More recent studies display a more varied picture. Gennari & Poeppel (2003)
compared cvcntivc verbs like build, distort, show, which denote causally structured events,
with stative verbs like resemble, lack, love, which do not involve complex cause/
become structures. Controlling for differences in argument structure and thematic
roles, they carried out a self-paced reading study and a visual lexical decision task. They
found that semantic complexity was reflected in processing time and that elements of
decomposition-like structures were activated during processing. McKoon & MacFarland
(2002) adopted Rappaport Hovav & Levin's (1998) template-based approach to
decomposition and their distinction between verbs denoting internal causation (bloom) and
external causation (break) (Levin & Rappaport Hovav 1995,1996).They reported longer
processing times for break-iype verbs than for bloom-Xype verbs in grammatically
judgments, reading time experiments, and lexical decision tasks. They interpreted the
results as confirmation that break-iype verbs involve more complex decompositions than
bloom-type verbs.
140
I. Foundations of semantics
Different conclusions were drawn from other experiments. Applying a "release from
proactive interference" technique, Mobayyen & de Almeida (2005) investigated the
processing times for lexical causatives (bend,crack,grow), morphological causatives (thicken,
darken, fertilize), perception verbs (see, hear, smell), and repetitive perception verbs with
morphological markers (e.g., re-smell). If verbs were represented in the form of
decompositions, the semantically more complex lexical and morphological causatives should
pattern together and evoke longer processing times than perception verbs. However, this
did not turn out to be the case. Morphological causatives and the morphologically
complex re-verbs required longer processing than lexical causatives and perception verbs.
Similar results have been obtained in action-naming tasks carried out with Alzheimer
patients (cf. dc Almeida 2007). That lead Mobayyen and dc Almeida to the conclusion
that the latter two verb types are both semantically simple and refer to non-complex
mental particulars. Another line of psycholinguistic/neurolinguistic research concerns
speakers with category-specific semantic deficits due to brain damage. Data obtained
from these speakers have been used to argue for semantic feature approaches as well
as for approaches employing meaning postulates in the nominal domain (cf. the
discussion in de Almeida 1999). However, de Almeida (2001: 483) emphasizes that so far no
one has found evidence for category-specific verb concept deficits, for example, deficits
concerning features like cause or go.
Evidence for decomposition theories has also been sought in data from language
acquisition. If a meaning of a word is its decomposition then learning a decomposed
word means learning its decomposition.The most explicit early theory of decomposition-
based learning is Clark's (1973) semantic-feature based theory of word-learning. In her
view, only some of the features that make up a lexical representation are present when
a word is first acquired whereas the other features are learned while the word is already
used. The assumption that these features are acquired only successively predicts that
children ovcrgcncralizc heavily when acquiring a new word. In subsequent research, it
turned out that Clark's theory did not conform to the data: (i) Overgeneralization does
not occur as often as predicted; (ii) with recently acquired words, undergeneralization is
more typical than overgeneralization, and (iii) at some stages of acquisition, the referents
a word is applied to do not have any features in common (Barrett 1995:375ff,cf. also the
review in Carey 1982:361 ff). It has been repeatedly argued that meaning postulates are
better suited to explain acquisition processes (cf. Chierchia & McConnell-Ginet 1990:
363f, Bartsch & Vcnncmann 1972:22). However, even if Clark's theory of successive
feature acquisition is not tenable, related data are cited in favor of decompositions to show
that some kind of access to semantic features is involved in acquisition. For example, it
has been argued that a meaning component cause is extracted from verbs by children
and used in overgeneralizations like he failed it (cf. the overview in Clark 2003: 233ff).
Some research on the acquisition of argument structure alternations has been used to
argue for particular decompositional approaches, such as Pinker (1989) for Lexical
Conceptual Structures in the vein of Levin & Rappaport Hovav, and Brinkmann (1997) for
Lexical Decomposition Grammar.
In summary, the question whether decompositions are involved in language
processing or language acquisition remains open. Although processing differences for
different classes of verbs have to be acknowledged, it is often difficult to conclude from
these data what forms of lexical representation are compatible with these data.
7. Lexical decomposition: Foundational issues Hl_
4. Conclusion
From a heuristic and descriptive point of view, lexical decomposition has proven to be a
very successful device that has made it possible to discover and to tackle numerous
lexical phenomena, in particular, at the syntax-semantics interface. Yet, from a theoretical
point of view, lexical decompositions have remained a problematic concept that is not
always well grounded in theories of semantics and cognition:
- The basic predicates within decompositions are often elusive and lack truth
conditions, definitions, or an empirically grounded link to basic cognitive concepts.
- The lack of semantic grounding of decompositions often leads to circular
argumentations in linking theories.
- The cognitive status of decompositions is by and large unclear; it is not known whether
and how decompositions arc involved in lexical processing and language acquisition.
Thus, decompositions still raise many questions: "But even if the ultimate answers are
not in sight, there is certainly a sense of progress since the primitive approaches of the
1960s." (Jackendoff 2002:377)
5. References
de Almeida. Roberto G. 1999. What do category-specific semantic deficits tell us about the
representation of lexical concepts? Brain & Language 68,241-248.
de Almeida, Roberto G. 2001. Conceptual deficits without features: A view from atomism.
Behavioral and Brain Sciences 24,482-483.
de Almeida, Roberto G. 2007. Cognitive science as paradigm of interdisciplinarity: the case of
lexical concepts. In: J. L. Audy & M. Morosini (eds.). Interdisciplinarity in Science and at the
University. Porto Alegre: EdiPUCRS, 221-276.
Barrett, Martyn 1995. Early lexical development. In: P. Fletcher & B. MacWliinney (eds.). 77ie
Handbook of Child Language. Oxford: Blackwell, 362-392.
Bartsch, Renate & Theo Vennemann 1972. Semantic Structures. A Study in the Relation between
Semantics and Syntax. Frankfurt/M.: Athenauin.
Bendix, Edward H. 1966. Componential Analysis of General Vocabulary. The Semantic Structure of
a Set of Verbs in English, Hindi, and Japanese. Blooinington. IN: Indiana University.
Bierwisch,Manfred 1969. Certain problems of semantic representations. Foundations of Languages,
153-184.
Bierwisch, Manfred 1970. Semantics. In: J. Lyons (ed.). New Horizons in Linguistics. Hai mondswoi th:
Penguin, 166-184.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M.
Gartner & M. Bierwisch (eds.). 77ie Role of Economy Principles in Linguistic Theory. Berlin:
Akademie Verlag, 227-266.
Brinkmann, Ursula 1997. The Locative Alternation in German. Its Structure and Acquisition.
Amsterdam: Benjamins.
Carey, Susan 1982. Semantic development: The state of the art. In: E. Wanner & L. R. Gleitman
(eds.). Language Acquisition. The State of the Art. Cambridge: Cambridge University Press,
347-389.
Carnap, Rudolf 1952. Meaning postulates. Philosophical Studies 3,65-73.
Cliieichia, Gcnnaro & Sally McConncll-Ginet 1990. Meaning and Grammar. An Introduction to
Semantics. Cambridge, MA:Thc MIT Press.
142
I. Foundations of semantics
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA:Tlie MIT Press.
Clark, Eve V. 1973. What's in a word? On the child's acquisition of semantics in his first language.
In:T. E. Moore (ed.). Cognitive Development and the Acquisition of Language. New York:
Academic Press, 65-110.
Clark, Eve V. 2003. First Language Acquisition. Cambridge: Cambridge University Press.
Coseriu, Eugenio 1964. Pour une s£inantique diachronique structurale. Travaux de Linguistique et
de Litterature 2,139-186.
Dowty, David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times
in Generative Semantics and in Montague's PTQ. Dordrecht: Reidel.
Engelberg, Stefan 2001. Immunisicrungsstrategien in der lexikalischen Ereignissemantik. In:
J. Dolling & T. Zybatov (eds.). Ereignisstrukturen. Leipzig: Institut fur Linguistik dcr Universitat
Leipzig, 9-33.
Engelberg. Stefan 2006. A theory of lexical event structures and its cognitive motivation. In:
D. Wundcrlich (cd.). Advances in the Theory of the Lexicon. Berlin: dc Gruytcr, 235-285.
Fillmore, Charles J. 1968. Lexical entries for verbs. Foundations of Language 4,373-393.
Fodor, Janet D. 1977. Semantics: Theories of Meaning in Generative Grammar. New York: Crowell.
Fodor, Janet D., Jerry A. Fodor & Merrill F. Garrett 1975. The psychological unreality of semantic
representations. Linguistic Inquiry 6,515-531.
Fodor, Jerry A. 1970. Three reasons for not deriving'kill' from 'cause to die'. Linguistic Inquiry 1,
429^38.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Clarendon Press.
Fodor,Jerry A., Merrill F. Garrett, Edward C.T.Walker & Cornelia H. Parkes 1980. Against
definitions. Cognition 8,263-367.
Gennari, Silvia & David Poeppel 2003. Processing correlates of lexical semantic complexity.
Cognition 89, B27-B41.
Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press.
Greimas, Algirdas Julicn 1966. Semantique structurale. Paris: Larousse.
Griinshaw. Jane 2005. Words and Structure. Stanford, CA: CSLI Publications.
Harley, Heidi & Rolf Noyer 2000. Formal versus encyclopedic properties of vocabulary: Evidence
from nominalizations. In: B. Pccters (ed.). The Lexicon-Encyclopedia Interface. Amsterdam:
Elsevier, 349-375.
Harris, William T (ed.) 1923. Webster's New International Dictionary of the English Language.
Springfield, MA: Merriam.
Higginbotham, James 1994. Priorities of thought. Supplementary Proceedings of the Aristotelian
Society 68.85-106.
Hjelmslev, Louis 1963. Omkring sprogteoriens grundlxggelse. Festskrift udgivet af K0benhavns
Universitet i anledning of Universitetets Aarsfest. K0benhavn: E. Munksgaard, 1943, 3-113.
English translation: Prolegomena to a Theory of Language. Madison, WI: The University of
Wisconsin Press, 1963.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA:Thc MIT Press.
Jackendoff. Ray 1990. Semantic Structures. Cambridge, MA:The MIT Press.
Jackendoff, Ray 1992. Languages of the Mind. Essays on Mental Representation. Cambridge, MA:
The MIT Press.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford:
Oxford University Press.
Johnson-Laird, Philip N. 1983. Mental Models. Towards a Cognitive Science of Language, Inference,
and Consciousness. Cambridge, MA: Harvard University Press.
Katz, Jen old J. 1967. Recent issues in semantic theory. Foundations of Language 3.124-194.
Katz, Jerrold J. 1971. Semantic theory. In: D. D. Steinberg & L. A. Jakobovits (eds.). Semantics.
An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology. Cambridge: Cambridge
University Press, 297-307.
7. Lexical decomposition: Foundational issues 143
Katz, Jen old J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39,170-210.
Katz, Jen old J. & Paul M. Postal 1964. An Integrated Theory of Linguistic Descriptions. Cambridge,
MA:The MIT Press.
Kornfilt, Jaklin & Nelson Correa 1993. Conceptual structure and its relation to the structure of
lexical entries. In: E. Reuland & W. Abraham (eds.). Knowledge and Language, vol. II: Lexical
and Conceptual Structure. Dordrecht: Kluwer, 79-118.
Lakoff, George 1972. Linguistics and natural logic. In: D. Davidson & G. Harinan (eds.). Semantics
of Natural Language. Dordrecht: Reidel, 545-655.
Levin, Beth 1993. English Verb Classes and Alternations. A Preliminary Investigation. Chicago. IL:
The University of Chicago Press.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity. At the Syntax-Lexical Semantics
Interface. Cambridge, MA:The MIT Press.
Levin, Beth & Malka Rappaport Hovav 1996. Lexical semantics and syntactic structure. In:
S. Lappin (cd.). 77ie Handbook of Contemporary Semantic Theory. Oxford: Blackwcll, 487-507.
Lewis, David 1973. Causation. The Journal of Philosophy 70,556-567.
Lobner, Sebastian 2002. Understanding Semantics. London: Arnold.
Lorenz, Wolfgang & Gerd Wotjak 1977. Zum Verhaltnis von Abbild und Bedeutung. Uberlegungen
im Grenzfeld zwischen Erkenntnistheorie und Semantik. Berlin: Akademie Verlag.
Lyons, John 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
McKoon, Gail & Talke MacFarland 2002. Event templates in the lexical representations of verbs.
Cognitive Psychology 45.1-44.
Mobayyen, Forouzan & Roberto G. de Almeida 2005.The influence of semantic and morphological
complexity of verbs on sentence recall: Implications for the nature of conceptual representation
and category-specific deficits. Brain and Cognition 57,168-175.
Pinker. Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure. Cambridge,
MA:The MIT Press.
Pottier, Bernard 1963. Recherches sur {'analyse semantique en linguistique et en traduction meca-
nique. Nancy: Publications Linguistiques de la Faculte* de Lettres.
Pottier, Bernard 1964. Vers une semantique moderne. Travaux de Linguistique et de Litterature 2,
107-137.
Pulman. Stephen G. 2005. Lexical decomposition: For and against. In: J. I.Tait (cd.). Charting a New
Course: Natural Language Processing and Information Retrieval: Essays in Honour of Karen
Sparck Jones. Dordrecht: Kluwer. 155-174.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge:
Cambridge University Press.
Rappaport Hovav, Malka & Beth Levin 1998. Building verb meanings. In: M. Butt & W. Geuder
(eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI
Publications. 97-134.
Rappaport Hovav, Malka & Beth Levin 2005. Change-of-state verbs: Implications for theories of
argument projection. In: N. Ertcschik-Shir & T Rapoport (eds.). The Syntax of Aspect. Deriving
Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 274-286.
Rayner, Keith & Susan A. Duffy 1986. Lexical complexity and fixation times in reading: Effects of
word frequency, verb complexity, and lexical ambiguity. Memory & Cognition 14,191-201.
Roelofs, Ardi 1997. A case for nondecomposition in conceptually driven word retrieval. Journal of
Psycholinguistic Research 26,33-67.
von Stechow, Arnim 1995. Lexical decomposition in syntax. In: U. Egli ct al. (eds.). Lexical
Knowledge in the Organisation of Language. Amsterdam: Benjamins, 81-177.
Svens£n, Bo 1993. Practical Lexicography. Principles and Methods of Dictionary-Making. Oxford:
Oxford University Press.
Thorndike, Edward L. 1941. Thorndike Century Senior Dictionary. Chicago, IL: Scott, Foresman
and Company.
144
I. Foundations of semantics
Wiegand, Herbert E. 1989. Die lexikographische Definition im allgemeinen einsprachigen
Worterbuch. In: F. J. Hausmann et al. (eds.). Worterbiicher. Dictionaries. Dictionnaires. Ein
internationales Handbiich zur Lexikographie. An International Encyclopedia of Lexicography.
Encyclopedic internationale de lexicographic (HSK 5.1). Berlin: de Gruyter, 530-588.
Wundeiiich, Dieter 1997. Cause and the structure of verbs. Linguistic Inquiry 28,27-68.
Stefan Engelberg, Mannheim (Germany)
II. History of semantics
8. Meaning in pre-i9th century thought
1. Introduction
2. Semantic theories in classical antiquity
3. Hellenistic theories of meaning (ca. 300 B.C-200 A.D.)
4. Late classical sources of medieval semantics
5. Concepts of meaning in the scholastic tradition
6. Concepts of meaning in modern philosophy
7. References
Abstract
The article provides a broad survey of the historical development of western theories of
meaning from antiquity to the late 18th century. Although it is chronological and
structured by the names of the most important authors, schools, and traditions, the focus is
mainly on the theoretical content relevant to the issue of linguistic meaning, or on doctrines
that are directly related to it. I attempt to show that the history of semantic thought does
not have the structure of a continuous ascent or progress; it is rather a complex multilayer
process characterized by several ruptures, such as the decline of ancient or the expulsion
of medieval learning by Renaissance Humanism, each connected with substantial losses.
Quite a number of the discoveries of modern semantics are therefore in fact rediscoveries
of much older insights.
1. Introduction
Although it is commonly agreed that semantics as a discipline emerged in the 19th and
20th centuries, the history of semantic theories is both long and rich. In fact semantics
started with a rather narrow thematic scope, since it focused, like Reisig, on the
"development of the meaning of certain words, as well as the study of their use", or, like Brdal,
on the "laws which govern changes in meaning, the choice of new expressions, the birth
and death of phrases" (see article 9 (Nerlich) Emergence of semantics), but many of the
theoretical issues that were opened up by this flourishing discipline during the 20th
century were already traditional issues of philosophy, especially of logic.
The article aims to give an overview of the historical development of western
theories of meaning from antiquity to the late 18th century. The attempt to condense more
than 2000 years of intense intellectual labor into 25 pages necessarily leads to omissions
and selections which are, of course, to a certain degree subjective. The article should not
therefore be read with the expectation that it will tell the whole story. Quite a lot of what
could or should have been told is actually not even mentioned. Still it is, I think, a fair
sketch of the overall process of western semantic thought.
The oldest known philosophical texts explicitly concerned with the issue of linguistic
meaning evolved out of more ancient reflexions on naming and on the question whether,
and in what sense, the relation between words and things is conventional or natural. Plato
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK33.1), deGruyter, 145-172
146 II. History of semantics
and Aristotle were fairly aware that word meaning needs to be discussed together with
sentence meaning (see sections 2.2., 2.3.). The Stoic logicians later on were even more
aware of this (see section 3.), but Aristotle's approach in his most influential book Peri
hermeneias was more influential than them. His account of the relation between linguistic
expressions (written and spoken), thoughts and things, combining linguistics (avant la
lettre), epistemology, and ontology, (see section 4.2.) brought about a long-lasting
tendency within the scholastic tradition of Aristotle commentaries to focus on word meaning
and to center semantics on the question whether words signify mental concepts or things.
The numerous theories that have been designed to answer to this question (see section
5.2.) have strengthened the insight, shared by modern conceptual semantics, that
linguistic signs do not refer to the world per sc, but rather to the world as conceptualized
by language users (see article 30 (Jackendoff) Conceptual Semantics). The way in which
the question was put and answered in the scholastic tradition, however, in some respects
fell short of its own standards. For ever since Abelard (see section 5.3.) there existed a
propositional approach to meaning in scholastic logic fleshed out especially in the termi-
nist logic (see section 5.4.) whose elaborate theories of syncategorematic terms and
supposition suggested that by no means every word is related to external things, and that the
reference of terms is essentially determined by the propositional context.
Medieval philosophy gave birth to a number of innovative doctrines such as Roger
Bacons 'use theory' of meaning (see section 5.5.), the speculative grammar,
functionally distinguishing the word classes according to their different modes of signifying (sec
section 5.6.), or the late medieval mentalist approach, exploring the relations between
spoken utterances and the underlying mental structures of thought (see section 5.7.).
Compared to the abundance of semantic theories produced particularly within the
framework of scholastic logic, the early modern contribution to semantics in the narrow
sense is fairly modest (see section 6.1.). The philosophy of the 17th and 18th centuries,
however, disconnecting semantics from logic and centering language theory on
epistemology, opened up important new areas of interest, and reopened old ones. So, the idea
of a universal grammar underlying all natural languages was forcefully reintroduced by
the rationalists in the second half of the 17th century independently from the medieval
grammatica speculativa (see section 6.4.). One of the new issues, or issues that were
discussed in a new way, was the question of the function of language, or of signs in general,
in the process of thinking.The fundamental influence of language on thought and
knowledge became widely accepted, particularly, though not exclusively, within the empirist
movement (see sections 6.2., 6.3., 6.5., 6.6.). Another innovation: from the late 17th
century the philosophy of language showed a growing consciousness of and interest in the
historical dimension of language (see section 6.7.).
2. Semantic theories in classical antiquity
2.1. The preplatonic view of language
Due to the fragmentary survival of textual relics from the preplatonic period the very
beginnings of semantic thought are withdrawn from our sight. Though some passages in
the preserved texts as well as in several later sources might prompt speculations about
the earliest semantic reflexions, the textual material available seems to be insufficient for
a reliable reconstruction of a prcsocratic semantic theory in the proper sense. The case
8. Meaning in pre-19th century thought 147
of the so-called Sophists (literally: 'wise-makers') of the 5th century B.C. is somewhat
different: they earned their living mainly by teaching knowledge and skills that were
thought to be helpful for gaining private and political success. Their teachings largely
consisted of grammar and rhetoric as resources or the command of language and
argumentation. One of their standard topics was the 'correctness of names' (orthotes ono-
maton) which was seemingly treated by all the main sophists in separate books, such as
Protagoras (ca.490^20 B.C.),Hippias (ca.435 B.C.), and Prodicus (ca.465^15 B.C.), the
temporary teacher of Socrates (Schmitter 2001).
2.2. Plato (427-347 B.C.)
The problem of the correctness of names, spelled out as the question whether the
relation between names and things is natural or conventional, also makes up the main topic
of Plato's dialogue Cratylus (ca. 388 B.C.), the earliest known philosophical text on
language theory.
The position of naturalism is represented by Cratylus, claiming that "everything has a
right name of its own, which comes by nature, and that a name is not whatever people call
a thing by agreement,... but that there is a kind of inherent correctness in names, which
is the same for all men, both Greeks and barbarians." (Cratylus 383a-b).
The opposite view of conventionalism is advanced by Hermogenes who denies that
"there is any correctness of name other than convention (syntheke) and agreement
(homologia)" and holds that "no name has been generated by nature for any particular
thing, but rather by the custom and usage (thesei) of those who use the name and call
things by it" (384c-d). Democritus (ca. 420 B.C.) had previously argued against
naturalism by pointing to semantic phenomena such as homonymy, synonymy, and name
changes. The position represented by Hermogenes, however, goes beyond the
conventionalism of Democritus insofar as he does not make any difference between the various
natural languages and an arbitrary individual name-giving {Cratylus 385d-e) resulting in
autonomous idiolects or private language.
Plato's dialogue expounds with great subtlety the problems connected to both
positions. Plato advocates an organon theory of language, according to which language is
a praxis and each name is functioning as "an instrument {organon) of teaching and of
separating (i.e. classifying) reality" (388b-c). He is therefore on the one hand showing,
contrary to Hermogenes's conventionalism, that names, insofar as they stand for
concepts of things, have to be well defined and to take into account the nature of things in
order to carve reality at the joints. On the other hand he demonstrates against Cratylus's
overdrawn naturalism the absurdities one can encounter in trying to substantiate the
correctness of names etymologically, by tracing all words to a core set of primordial
or 'first names' {prota onomata) linked to the objects they denote on grounds of their
phonological qualities.
Even if Plato's own position is more on the side of a refined naturalism (Sedley
2003), the dialogue, marking the strengths and shortcomings of both positions, leaves the
question undecided. Its positive results rather consist (1) in underlining the difference
between names and things and (2) in the language critical warning that "no man of sense
should put himself or the education of his mind in the power of names" (440c).
Plato in his Cratylus (421d-e,424e,431 b-c) draws,for the first time,a distinction between
the word-classes of noun {onoma) and verb {rhema). In his later Dialogue Sophistes he
148 II. History of semantics
shows that the essential function of language does not consist in naming things but rather
in forming meaningful propositions endowed with a truth-value (Soph.26lc-262c; Baxter
1992). The question of truth is thus moved from the level of individual words and their
relation to things to the more adequate level of propositions. No less important, however,
is the fact that Plato uses this as the basis for what can be seen as the first account of
propositional attitudes (see article 60 (Swanson) Propositional attitudes). For in order to
claim that "thought,opinion, and imagination ... exist in our minds both as true and false"
he considers it necessary to ascribe a propositional structure to them. Hence Plato defines
"forming opinion as talking and opinion as talk which has been held, not with someone
else, nor yet aloud, but in silence with oneself" (Theaetet 190a), and claims, more generally,
that "thought and speech (arc) the same, with this exception, that what is called thought
is the unuttered conversation of the soul with herself", whereas "the stream of thought
which flows through the lips and is audible is called speech" (Soph. 263e).
Even if this view is suggested already by the Greek language itself in which logos
means (inter alia) both speech or discourse and thought, it is in the Sophistes where the
idea that thought is a kind of internal speech is explicitly stated for the first time. Thus,
the Sophistes marks the point of departure of the long tradition of theories on the issue
of mental speech or language of thought.
2.3. Aristotle (384-322 B.C.)
Aristotle's book Peri hermeneias (De interpretation, On interpretation), especially the short
introductory remarks (De int. 16a 3-8), can be seen as "the most influential text in the
history of semantics" (Krctzmann 1974: 3). In order to settle the ground for his attempt to
explain the logico-semantic core notions of name, verb, negation, affirmation, statement and
sentence, Aristotle delineates the basic coordinate system of semantics, comprising the four
elements he considers to be relevant for giving a full account of linguistic signification: i.e.
(1) written marks, (2) spoken words, (3) mental concepts, and (4) things. This system, which
under labels like 'order of speech' (ordo orandi) or 'order of signification' (ordo significa-
tionis,see 5.2.), became the basic framework for most later semantic theories (see Tab. 8.1.)
does not only determine the interrelation of these elements but also gives some hints about
the connection of semantics and epistemology. According to Aristotle (De int. 16a 3-8),
spoken words (ta en tephone: literally: 'those which are in the voice') are symbols (symbola)
of affections (pathemata) in the soul (i.e. mental concepts (noemata) according to De int. 16a
10), and written marks symbols of spoken words. And just as written marks are not the same
for all men, neither arc spoken sounds. But what these arc in the first place signs (semeia)
of - affections of the soul - are the same for all: and what these affections are likenesses
(homoiomata) of- actual things - are also the same.
Though the interpretation of this passage is highly controversial, the most commonly
accepted view is that Aristotle seems to claim that the four elements mentioned are
interrelated such that spoken words which are signified by written symbols are signs of mental
concepts which in turn arc likenesses (or images) of things (Wcidcmann 1982).This passage
presents at least four assumptions that are fundamental to Aristotle's semantic theory:
1. Mental concepts are natural and therefore intersubjectively the same for all human
beings.
8. Meaning in pre-19th century thought 149
2. Written or spoken words are, in contrast, conventional signs, i.e. sounds significant by
agreement (kata syntheken, De int. 16a 19-27).
3. Mental concepts which are both likenesses (homoiomata) and natural effects of
external things are directly related to reality and essentially independent from
language.
4. Words and speech are at the same time sharply distinguished from the mental concepts
and closely related to them, insofar as they refer to external things only through the
mediation of concepts. Aristotle's description of the relation between words, thoughts,
and things can therefore be seen as an anticipation of the so called 'semantic triangle'
(Lieb 1981).
All four tenets played a prominent role in philosophy of language for a long time and
yet, each of them, at one time or another, came under severe attack.The tenet that vocal
expressions signify things trough the mediation of concepts, which the late ancient
commentators saw as the core thesis of Aristotle's semantics (Ebbesen 1983), leads, if spelled
out, to the distinction of intension and extension (Graeser 1996: 39). For it seem to be
just an abbreviated form of expressing that for each word to signify something {semainei
ti) depends upon its being connected to an understanding or a formula (logos) which can
be a definition, a description or a finite set of descriptions (Metaphysics 1006a 32-1006b
5) picking out a certain thing or a certain kind of things by determining what it is to be a
such and such thing (Met. 1030a 14-17).
Due to the prominency of the order of signification it may seem as if Aristotle were
focusing on word semantics. There is, however, good reason to maintain that Aristotle's
primary intention was a semantics of propositions. Sedley (1996: 87) has made the case
that for Aristotle, just as for other teleologists like Plato and the Stoics, "who regard
the whole as onto logically prior to the part ... the primary signifier is the sentence, and
individual words are considered only secondarily, in so far as they contribute to the
sentence's function." In face of this he characterizes De interpretatione, which is usually seen
as a specimen of word semantics as "the most seriously misunderstood text in ancient
semantics" (Sedley 1996:88). This may hold for most of the modern interpretations but
not so for the medieval commentators. For quite a number of them were well aware that
the somewhat cryptic formulation "those which are in the voice" does not necessarily
stand for nouns and verbs exclusively but was likely meant to include also propositions
and speech in general. That the treatment of words is tailored to that of propositions
is indicated already by the way in which Aristotle distinguishes onomata (nouns) and
rhemata (predicative expressions) not as word classes as such but rather as word classes
regarding their function in sentences. For when, as Aristotle remarks, predicative
expressions "are spoken alone, as such, they are nouns (onomata) that signify something -
for the one who utters them forms the notion [or: arrests the thought], and the hearer
pauses" (De int. 16a 19-21).
The two classes of nouns and predicative expressions, the members of which are
described as the smallest conventionally significant units, are complemented by the
'conjunctions' (syndesmoi) which include conjunctions, the article, and pronouns and are the
direct ancestors of Priscian's syncategoremata that became a major topic in medieval
logical semantics (sec section 4.5.).
Although the natural/conventional distinction separates mental concepts from spoken
language, Aristotle describes the performance of thinking, just like Plato, in terms of an
150 II. History of semantics
internal speaking. He distinguishes between an externally spoken logos and a logos or
"speech in the soul" (Met. 1009a 20) the latter of which provides the fundament of
signification, logical argumention and demonstration (Posterior analytics 76b 24; Panaccio
1999: 34-41), and is at the same time the place where truth and falsehood are located
properly (Meier-Oeser 2004: 314). Thus, when in De interpretatione (17a 2-24) it is said
that the fundamental logical function of declarative sentences to assert or deny
something of something presupposes a combination of a name and a verb or an inflection
of a verb, this interconnection of truth or falsity and propositionality only seemingly
refers primarily to spoken language. For in other passages, as Met. 1027b 25-30, or in the
last chapter of De interpretatione, Aristotle emphasizes that combination and
separation and thus truth "arc in thought" properly and primarily. How this internal discourse
which should be 'the same for all' precisely relates to spoken language remains unclear
in Aristotle.
The close connection Aristotle has seen between linguistic signification and thought
becomes evident in his insistance on the importance of an unambiguous and well-defined
use of language the neglect of which must result in a destruction of both
communication and rational discourse; for, as he claimed (Met. 1006b 7-12): not to signify one thing
is to signify nothing, and if words do not signify anything there is an end of discourse
with others, and even, strictly speaking, with oneself; because it is impossible to think of
anything if we do not think of one thing.
Noticing that the normal way of speaking docs not conform to the ideal of unambiguity,
Aristotle is well aware about the importance of a detailed analysis of the semantic issues
of homonymy, synonymy, paronymy etc. (Categories la;Sophistical refutations I65f) and
devotes the first two books of his Topics to the presentation of rules for detecting, and
strategies for avoiding ambiguitiy.
3. Hellenistic theories of meaning (ca. 300 B.C-200 A.D.)
It is uncontrovcrsial that a fully fledged theory of language, considering all the different
phonetic, syntactic, semantic, and pragmatic aspects of language as its essential subject
matter, is the invention of the Stoic philosophers (Pohlenz 1939). However, since hardly
any authentic Stoic text has come down to us, it is still true that "the nature of the Stoics'
philosophy of language is the most tantalizing problem in the history of semantics"
(Kretzmann 1967:363). After a long period of neglect and contempt, the Stoic account of
semantics is today generally seen as superior even to Aristotle's, which was historically
more influential (Graeser 1978: 77). For it has become a current conviction that central
points of Stoic semantics are astonishingly close to some fundamental tenets that have
paved the way for modern analytical philosophy.
According to Sextus Empiricus (Adv. math. 8.11-12; Long & Sedley 1987 [=L&S]
33B) the principal elements of Stoic semantics are:
1. the semainon, i.e. that which signifies,or the signifier, which is a phoneme or grapheme,
i.e. the material configuration that makes up a spoken word or rather - because Stoic
semantics is primarily a sentence semantics - a spoken or written sentence;
2. the tynchanon (or:'name-bearer'), i.e the external material object or event referred
to; and
8. Meaning in pre-19th century thought 15J_
3. the semainomenon, i.e. that which is signified. This is tantamount to the core
concept and the most characteristic feature of the Stoic propositionalist semantics, the
lekton (which can be translated both as 'that which is said' and 'that which can be
said', i.e. the 'sayable').The lekton proper or the 'complete lekton' (lekton autoteles)
is the meaning of a sentence, whereas the meaning of a word is characterized as an
'incomplete lekton' (lekton ellipes). Even though questions and demands may have
a certain kind of lekton, the prototype of the Stoic lekton corresponds to what in
modern terminology would be classified as the propositional content of a declarative
sentence.
Whereas the semainon and the tynchanon are corporeal things or events, the lekton is
held to be incorporeal. This puts it in an exceptional position within the materialist Stoic
ontology, which considered almost everything - even god, soul, wisdom, truth, or thought
- as material entities. Hence the lekton "is not to be identified with any thoughts or with
any of what goes on in one's head when one says something" (Annas 1992: 76). This is
evident also from the fact that the lekton corresponds to the Stoic notion of pragma
which, however, in Stoic terminology does not stand for an external thing, as it does for
Aristotle, but rather for a fact or something which is the case. The Stoic complement of
the Aristotelian notion of mental concept (noema), is the material phantasia logike, i.e.
a rational or linguistically expressible presentation in the soul. The phantasia logike is a
key feature in the explanation of how incorporeal meaning is connected to the physical
world, since the Stoics maintained "that a lekton is what subsists in accordance with a
phantasia logike" (Adv. math. 8.70; L&S 33C). It should be clearly distinguished from
the lekton or meaning as such. The phantasia logike makes up the 'internal discourse'
(logos endiathetos) and is thus part of the subjective act of thinking, in contrast to the
lekton, which is the "objective content of acts of thinking (noeseis)" (Long 1971:82).The
semainon and the tynchanon more or less correspond to the written or spoken signs and
the things in the Aristotelian order of signification, but it has no equivalent of the lekton
(sec Tab. 8.1.). Meaning, as the Stoics take it, is neither some thing nor something in the
head. Nor is the Stoic lekton a quasi-Platonic entity that would exist in some sense
independent of whether or not one thinks of it, unlike Frege's notion of Gedanke (thought)
(Graeser 1978:95; Barnes 1993:61). Whitin the context of logical inference and
demonstration a lekton may function as a sign (semeion, not to be mistaken with semainonl), i.e.
as a "leading proposition in a sound conditional, revelatory of the consequent" (Sextus
Empiricus, Outlines of Pyrrhonism, L&S 35C).
Besides the notion of propositional content, one of the most interesting and
innovative elements of Stoic semantics is the discovery that to take into account propositional
content alone might be not in any case sufficient in order to make explicit what a sentence
means.This discovery, pointing in the direction of modern speech act theory, is indicated
when Plutarch (with critical intention) reports that the Stoics maintain "that those who
forbid say one thing, forbid another, and command yet another. For he who says 'do not
steal' says just this,'do not steal', forbids stealing and commands not stealing" (Plutarch,
On Stoic self-contradictions 1037d-e). In this case there are three different lekta
associated to a single sentence, corresponding to the distinct linguistic functions this
sentence can perform.Thus, the Stoics seem to have been well aware that the explication of
meaning "involves not only the things we talk about and the thoughts we express but also
the jobs we do by means of language alone" (Kretzmann 1967:365a).
152
II. History of semantics
A semantic theory which is significantly different from both the Aristotelian and the
Stoic is to be found in the Epicurean philosophers who, as Plutarch (Against Colotes;
L&S 19K) reports,"completely abolish the class of sayables ..., leaving only words and
name-bearers" (i.e. external things), so that words refer directly to the objects of
sensory perception.This referential relation still presupposes a mental'preconception' (pro-
lepsis) or a schematic presentation (typos) of the thing signified, for we would not "have
named something if we had not previously learnt its delineation (typos) by means of
preconception (prolepsis)" (Diogenes Laertius, Lives and Opinions of eminent Philosophers
L&S 17E), but signification itself remains formally a two-term relation. It is true, the
semantic relation between words and things is founded on a third; but this intermediate
third, the prolepsis or typos, unlike the passiones or concepts in the Aristotelian account,
does not enter the class of what is signified (see Tab. 8.1.).
Tab. 8.1: The ordo significationis
Labeled grey fields stand for elements (I, II, III) being signifiers and/or signified. Labeled white
fields stand for semantic relations (1, 2,3) characterizing the element on the left in regard to the
one on the right. Labeled light grey fields stand for elements involved in the process of signification
though neither being signifiers nor signified.
4. Late classical sources of medieval semantics
4.1. Augustinus (354-430)
Whereas Aristotelian logic and semantics was transmitted to the Middle Ages via
Boethius, the relevant works of Augustinus provide a complex compound of modified
Stoic, Skeptic and Neoplatonic elements, together with some genuinely new ideas.
Probably his most important and influential contribution to the history of semantics and
semiotics consists of (1) the explicit definition of words as signs, and (2) his definition of the
sign which, for the first time, included both the natural indexical sign and the
conventional linguistic sign as species of an all-embracing generic notion of sign. Augustinus
8. Meaning in pre-19th century thought 153
thus opened a long tradition in which the theory of language is viewed as a special branch
of the more comprehensive theory of signs.
The sign, in general, is defined as "something which, offering itself to the senses,
conveys something other to the intellect" (Augustinus 1963:33).This triadic sign conception
provides the general basis for Augustinus's communicative approach to language, which
holds that a word is a "sign of something, which can be understood by the hearer when
pronounced by the speaker" (Augustinus 1975: 86). In contrast to natural signs which
"apart from any intention or desire of using them as signs, do yet lead to the knowledge
of something else", words are conventional signs "living beings mutually exchange in
order to show ..., the feelings of their minds, or their perceptions, or their thoughts"
(Augustinus 1963: 34). The preeminent position of spoken words among the different
sorts of signs used in human communication, some of which "relate to the sense of sight,
some to that of hearing, a very few to the other senses", does not result from their
quantitative preponderance but rather from their significative universality, i.e. from the fact
that, as Augustinus sees it, everything which can be indicated by nonverbal signs could be
put into words but not vice versa (Augustinus 1963:35).
The full account of linguistic meaning entails four elements (Augustinus 1975: 88f):
(1) the word itself, as an articulate vocal sound, (2) the dicibile, i.e. the sayable or
"whatever is sensed in the word by the mind rather than by the ear and is retained
in the mind", (3) the dictio, i.e. the word in its ordinary significative use in contrast to
the same word just being mentioned. It "involves both the word itself and that which
occurs in a mind as the result of the word" (i.e. the dicibile or meaning), and (4) the
thing signified (res) in the broadest sense comprising anything "understood, or sensed, or
inapprehensible".
Even if the notion of dicibile obviously refers to the Stoic lekton, Augustinus has
either missed or modified the essential point. For describing it as "what happens in the
mind by means of the word" implies nothing less than the rejection of the Stoic expulsion
of meaning (lekton) from the mind. It is as if he mixed up the lekton with the phantasia
logike, which was characterized as a representation whose content "can be expressed in
language" (Sextus Empiricus,y4dv. maf/i.8.70;L&S 33C).
Augustinus's emphasis on the communicative and pragmatic aspects of language is
manifest in his concept of the 'force of word' (vis verbi) which he describes as the
"efficacy to the extent of which it can affect the hearer" (Augustinus 1975: 100). The import
spoken words have on the hearer is not confined to their bare signification but includes
a certain value and some emotive moments resulting from their sound, their familiarity
to the hearer or their common use in language - aspects which Frege called Fdrbungen
(colorations).
In his later theory of verbum mentis (mental word), especially in De trinitate,
Augustinus advocates the devaluation of the spoken word against the internal sphere
of mental cognition. It is now the mental or 'interior word' (verbum interius), i.e., the
mental concept, that is considered as word in its most proper sense, whereas the spoken
word appears as a mere sign or voice of the word (signum verbi, vox verbi; Augustinus
1968: 486). In line with the old concept of an internal speech, Augustinus claims that
thoughts (cogitationes) arc performed in mental words. The verbum mentis however,
corresponding to what later was called the conceptus mentis or intellectus, is by no means
a linguistic entity in the proper sense, for it is "nullius linguae", i.e. it does not belong to
any spoken language like Latin or Greek.
154 II. History of semantics
Between mental and spoken words there is a further level of speech, consisting of
imaginative representations of spoken words, or, as he calls them, imagines sonorum
(images of sounds), closely corresponding to Saussure's notion of image accoustique.
In Augustin's theory of language this internalized version of uttered words does not
seem to play a major role, but it will gain importance in late medieval and early modern
reflections on the influence of language on thought.
4.2. Boethius (480-528)
Boethius' translations of and comments on parts of the Aristotelian Organon (especially
De Interpretatione) are for a long time the only available source texts for the semantics
of Aristotle and his late ancient Neoplatonic commentators for the medieval world. The
medieval philosophers thus first viewed Aristotle's logic through the eyes of Boethius,
who made some influential decisions on semantic terminology, as well as on the
interpretation of the Aristotelian text. What they learned through his writings were inter
alia the conventional character of language, the view that meaning is established by an
act of 'imposition', i.e., name-giving or reference-setting, and the influential idea that to
'signify' (significare) is to "establish an understanding" (intellectum constituere).
Especially in his more elaborate second commentary on De interpretatione, Boethius
discusses at length Aristotle's four elements of linguistic scmciosis (scripta, voces,
intellects, res), which he calls the 'order of speaking' (ordo orandi) (Magee 1989:64-92).The
ordo orandi determines the direction of linguistic signification: written characters signify
spoken words, whereas spoken words primarily signify mental concepts and, by means
of the latter, secondarily denote the things, or, in short: words signify things by means of
concepts (Boethius 1880:24,33).
The first step of the process of homogenizing or systematizing the semantic relations
between these four elements, later continued in the Middle Ages (see 5.2.), is to be seen
in Boethius's translation of Aristotle's De interpretatione (De int. 16a 3-8) (Meier-Oeser
2009). For whereas Aristotle characterizes these relations with the three terms of sym-
bola, semeia, and homoiomata, Boethius translates both symbola and semeia as notae
(signs; see Tab. 8.1.).
Boethius makes an additional distinction in the work of late classical Aristotle
commentators: he distinguishes three levels of speech: besides - or rather, at the basis of -
written and spoken discourse there is a mental speech (oratio mentis) in which thinking
is performed. This mental speech is, just like Augustinus's mental word, not made up of
words of any national language but rather of transidiomatic or even non-linguistic mental
concepts (Boethius 1880:36) which are, as Aristotle had claimed, "the same for all".
5. Concepts of meaning in the scholastic tradition
The view that semantic issues are not only a subject matter of logic but its primary
and most fundamental one is characteristic for the scholastic tradition. This tradition
was not confined to the Middle Ages but continued, after a partial interruption in the
mid-l6th century, during the 17th and 18th centuries. Because it is the largest and most
elaborate tradition in the history of semantics, it seems advisable to begin with an
overview of some basic aspects of medieval semantics, such as the use and the definition of
8. Meaning in pre-19th century thought 155
the pivotal term of significatio (see 5.1.), and the most fundamental medieval debate on
that subject (see 5.2.).
5.1. The use and definition of significatio in the scholastic tradition
The problematic vagueness or ambiguity that has frequently been pointed out regarding
the notion of meaning holds, at least partly, for the Latin term of significatio as well.
There is, however, evidence that some medieval authors were aware of this
terminological difficulty. Robert Kilwardby (1215-1279) for instance noted that significatio can
designate either the 'act or form of the signifier' (actus et forma significantis), the
'signified' (significatum), or the relation between the two (comparatio signi ad significatum;
Lewry 1981:379). A look at the common scholastic use of the term significatio does not
only verify this diagnosis but reveals a number of further variants (Mcier-Ocser 1996:
763-765) which mostly resulted from debates on the appropriate ontological description
of significatio as some sort of quality, form, relation, act etc. The scholastic definitions
of significatio and significare (to signify), however, primarily took up the question what
signification is in the sense of'what it is for a word or sign to signify'.
There is a current misconception: the medieval notions of sign and signification
are often described in terms of what Biihler (1934: 40) called the "famous formula
of aliquid stat pro aliquo" (something stands for something). This description,
confusing signification with supposition (see 5.4.), falls short by reducing signification to
a two-term relation between a sign and its significate, whereas in scholastic logic the
relation to a cognitive faculty of a sign recipient is always a constitutive element of
signification. In this vein, the most widely spread scholastic definition (Meier-Oeser
1996: 765-768), based on Boethius's translation of Aristotle's De interpretatione (De
int. 16b 20), characterizes signification or the act of signifying as "to establish an
understanding" (constituere intellectum) of some thing, or "to cvoquc a concept in the
intellect of the hearer". The relation to the sign recipient remains pivotal when in the later
Middle Ages the act of signifying is primarily defined as "to represent something to an
intellect" (aliquid intellectui repraesentare). The common trait of the definitions
mentioned (and all the others not mentioned) is (1) to describe signification - in contrast to
meaning - not as something a word has, but rather as an act of signifying which is - in
contrast to the stare pro - involved in a triadic relation including both the significate
and a cognitive faculty.
5.2. The order of signification and the great altercation about whether
words are signs of things or concepts
The introductory remarks of Aristotle's De interpretatione have been described as "the
common starting point for virtually all medieval theories of semantics" (Magee 1989:
8). They did at least play a most important role in medieval semantics. In the late 13th
and early 14th centuries the order of the four principal elements of linguistic
signification (i.e. written and spoken words, mental concepts, and things) is characterized as
ordo significationis (order of signification; Aquinas 1989: 9a), or ordo in significando
(order in signifying; Ockham 1978: 347). The coinage of these expressions is the
terminological outcome of the second step in the process mentioned above of homogenizing
156 II. History of semantics
the relations between these four elements. It took place in the mid-13th century, when
mental concepts began to be described as signs. The Boethian pair of notae and similitu-
dines was thus further reduced to the single notion of sign (signum, see Tab. 8.1.), so that
the entire order of signification was then uniformly described in terms of sign relations,
or, as Antonius Andreas (ca. 1280-1320) said:"written words, vocal expressions,concepts
in the soul and things are coordinated according to the notion of sign and significate"
(Andreas 1508: fol.63va).
From the second half of the 13th century on, most scholastic logicians shared the view
that mental concepts were signs, which provided new options for solving the "difficult
question of whether a spoken word signifies the mental concept or the thing" (Roger
Bacon 1978: 132). This question made up the subject matter of the most fundamental
scholastic debate on semantic issues. John Duns Scotus (1265/66-1308) labeled it "the
great altercation" (magna altercatio). The simple alternative offered in the
formulation of the question is, however, by no means exhaustive, but only marks the extreme
positions within a highly differentiated spectrum of possible answers (Ashworth 1987;
Meier-Oeser 1996:770-777).
The various theories of word meaning turn out to be most inventive in producing
variants of the coordination of linguistic signs, concepts and things. For example
(1) while especially some early authors held the mental concept to be the only proper
significate of a spoken word, (2) Roger Bacon (ca. 1214 or 1220 - ca. 1292) as well as most
of the so-called late medieval nominalists favored an cxtcnsionalist reference semantics
according to which words signify things. (3) Especially Thomist authors took up the
formula that words signify things by the mediation of concepts (voces significant res medi-
antibus conceptibus) and answered the question along the lines of the semantic triangle
(Aquinas 1989: 11a). Some later authors put it the other way round and maintained
(4) that words signify concepts only by the mediation of their signification of things
(voces significant conceptus mediante signification rerum). For "if I do not know which
things the words signify I shall never learn by them which concepts the speaker has in
his mind" (Smiglecius 1634: 437). Sharing the view that concepts were signs of things,
(5) Scotus referred to the principle of semantic transitivity, claiming that "the sign of a
sign is [also] the sign of the significate" (signum signi est signum signati), and held that
words signify both concepts and things by one and the same act of signifying (1891:451f).
Others, in contrast, maintained (6) that there had to be two simultaneous but
distinguishable acts of signification (Conimbriccnscs 1607:2.39f). And still others tried to solve the
problem by introducing further differentiations. Some of them were related to the medi-
antibus conceptibus-foTmu\a by the claim (7) that this formula does not imply that
concepts were the immediate significates of words but rather that they were a prerequisite
condition for words to signify things (Henry of Ghent 1520:2.272v). Others were related
to the notion of things, taking (8) the 'thing conceived' (res concepta; res ut intelligitur)
as the proper significate of words (Scotus 1891:543). Further approaches tried to decide
the question either (9) by distinguishing senses of the term significare (significare sup-
positive - manifestative), claiming that spoken words stand for the thing but manifest the
concepts (Rubius 1605:21), or (10) by differentiating between things being signified and
thoughts being expressed (Soto 1554: fol.3 rb-va),or again (11) by taking into account
the different roles of the discourse participants, so that words signify concepts for the
speaker and things for the hearer (Versor 1572: fol. 8 r), or lastly (12) by distinguishing
between different types of discourse, maintaining that in familiar speech words primarily
8. Meaning in pre-19th century thought 157
refer to concepts whereas in doctrinal discourse to things (Nicolaus a S. Iohanne Baptista
1687:40.43ff). No matter how subtle the semantic doctrines behind these positions may
have been in detail; it is still true that most of the contributions to the 'great altercation1
were focusing primarily on word semantics. Scholastic semantics, however, was by no
means confined to this approach.
5.3. Peter Abelard (1079-1142) and the meaning of the proposition
As early as Peter Abelard a shift in the primary interest of scholastic logic became
apparent. His treatment of logic and semantics is determined by a decidedly proposi-
tional approach; all distinctions he draws and all discussions he conducts are guided by
his concentration on propositions (Jacobi 1983:91). In a conceptual move comparable to
the one in Frege's famous Ober Sinn unci Bedeutung, Abelard transposes the distinction
between the signification of things (which is akin to Frege's Bedeutung) and concepts
(Frege's Sinn) to the level of propositions. On the one hand, and in line with Frege's
principle ofcompositionality, the signification of a proposition is the complex comprehension
of the sentence as it is integrated from the meanings of its components (sec article 6
(Pagin & Westerstahl) Compositionality). On the other hand, it corresponds to Frege's
Gedanke, i.e. to the propositional content of a sentence. Abelard calls this, in accordance
with the Stoic Ickton, dictum propositionis ('what is said by the proposition') or res
propositionis ('thing of the proposition', i.e. the Stoic pragma).These similarities do not
concern only terminology, but also the ontological interpretation. For the res propositionis
is characterized as being essentially nothing (nullae penitus essentiae; Abelard 1927:332,
24) or as entirely nothing (nil omnino;366:1). And yet the truth value or the modal state
of a proposition depends on the res propositionis being either true or false, necessary or
possible etc. (367:9-16).
In the logical textbooks of the late 12th and 13th centuries Abelard's notion of dictum
propositionis is present under the name of enuntiabile ('what can be stated') (de Rijk
1967 2/2.208: 15ff). In the 14th century it has its analog in the theory of the 'complexly
signifiable' (complexe significabile) developed by Adam Wodeham (ca. 1295-1358),
Gregory of Rimini (ca. 1300-1358), and others in the context of intense discussions on
the immediate object of knowledge and belief (Tachau 1987).
These conceptions of the significatc of propositions correspond in important points
to Frege's notion of'thought' (Gedanke) or Bolzano's'sentences as such' (Satze an sich),
and, of course, to the Stoic Ickton. In contrast, Walter Burlcy (ca. 1275-1344) answered
the question about the ultimate significatc of vocal and mental propositions by
advocating the notion of a propositio in re (proposition in reality) or a proposition composed
of things, which points more in the direction of Wittgensteins notion of Sachverhalte
(cases) or Tatsachen (facts; or better: states of affairs) as described in his Tractatus logico-
philosophicus ("l.The world is all that is the case. 1.1.The world is the totality of facts,
not of things."). Whereas the advocates of the complexe significabile project proposition-
ality onto a Fregian-like 'third realm' of propositional content, Burley and some other
authors project it onto the real world, maintaining that what makes our propositions true
(or false) are not the things as such but rather the states of affairs, i.e. the things relating
(or not relating) to each other in the way our propositions say they do (Meier-Oeser
2009:503f).
158 II. History of semantics
5.4. The theory of supposition and the propositional approach
to meaning
The propositional approach to meaning is also characteristic of the so-called 'terminist'
or 'modern logic' (logica moderna), emerging in the late 12th and 13th centuries with
a rich and increasingly sophisticated continuation from the 14th to the early 16th
century. Most of what is genuinely novel in medieval logic and semantics is to be found
in this tradition whose two most important theoretical contributions arc (1) the theory
of syncategorematic terms and (2) the theory of the properties of terms (proprietates
terminorum).
l.The theory of syncategorematic terms is concerned with the semantic and logical
functions of those parts of speech that have been missed out in the ordo significationis
since they are neither nouns nor verbs and thus have neither a proper meaning nor a
direct relation to any part of reality (Meier-Oeser 1998). Even if syncategorematic terms
(i.e. quantifiers, prepositions, adverbs, conjunctions etc. like 'some', 'every', 'besides',
'necessarily',or the copula'est') do not signify'something' (aliquid) but only, as was later
said,'in some way' (aliqualiter), they perform semantic functions that are determinative
for the meaning and the truth-value of propositions.
Since the late 12th century the syncategorematic terms became the subject matter
of a special genre of logical textbooks, the syncategoremata tracts. They also played an
important role in the vast literature on sophismata, i.e. on propositions like "every man
is every man" or "Socrates twice sees every man besides Plato", which, due to a
syncategorematic term contained in them, need further analysis in order to make explicit their
unterlying logical form as well as the conditions under which they can be called true or
false (Kretzmann 1982).
2. The second branch of terminist semantics was concerned with those properties of
terms that are relevant for explaining truth, inference and fallacy. While signification was
seen as the most fundamental property of terms, the one to which they devoted most
attention was suppositio. Whereas any term, due to its imposition, has signification or
lexical meaning on its own, it is only within the context of a proposition that it achieves
the property of supposition or the function of standing for (supponere pro) a certain
object or a certain number or kind of objects. Thus it is the propositional context that
determines the reference of terms.
The main feature of supposition theory is the distinction of different kinds of
suppositions. The major distinction is that between 'material supposition' (suppositio materialis,
when a term stands for itself as a token, e.g. 'I write donkey', or a type; e.g. 'donkey is
a noun'),'simple supposition' (s. simplex, when a term stands for the universal form or
a concept; e.g. 'donkey is a species'), and 'personal supposition' (s. personalis; when a
term stands for ordinary objects, e.g. 'some donkey is running'). This last most
important type of supposition is further divided and subdivided depending whether the truth
conditions of the proposition in which the term appears require a particular
quantification of the term (all x, every x, this x, some x, a certain x, etc). While supposition theory
in general provides a set of rules to determine how the terms in a given propositional
context have to be understood in order to render the proposition true or an
inference valid, the treatment of suppositio personalis and its subclasses, which are at the
center of this logico-semantic approach, focuses on the extension of the terms in a given
proposition.
8. Meaning in pre-19th century thought 159
The characteristic feature of terminist logic, as it is exemplified both in the theory
of syncategorematic terms and in the theory of supposition, is commonly described as
a contextual approach (de Rijk 1967: 123-125), or, more precisely, as a propositional
approach to meaning (de Rijk 1967:552).
5.5. Roger Bacon's theory of the foundation and the change of reference
Roger Bacons theory of linguistic meaning is structured around two dominant features:
(1) his semiotic approach, according to which linguistic signification is considered in
connection to both conventional and natural sign processes, and (2) his original and
inventive interpretation of the doctrine of the 'imposition of names' (impositio nominum) as
the basis of word meaning. Bacon accentuates the arbitrariness of meaning (Frcdborg
1981:87ff). But even though the first name-giver is free to impose a word or sign on
anything whatsoever, he or she performs the act of imposition according to the paradigm of
baptizing a child, so that Bacon in this respect might be seen as an early advocate of what
today is known as causal theory of reference (see article 4 (Abbott) Reference). This
approach, if taken seriously, has important consequences for the concept of signification.
For:"all names which we impose on things we impose inasmuch as they are present to us,
as in the case of names of people in baptism" (Bacon 1988:90). Contrary to the tradition
of Aristotelian or Boethian semantics (Ebbesen 1983), Bacon favors the view that words
according to their imposition immediately and properly signify things rather than mental
concepts of things. Thus, his account of linguistic signification abandons the model of
the semantic triangle and marks an important turning point on the way from the
traditional intensionalist semantics to the extensionalist reference semantics as it became
increasingly accepted in the 14th century (Pinborg 1972:58f).
With regard to mental concepts, spoken words function just as natural signs, which
indicates that the speaker possesses some concept of the object the word refers to, for
this is a prerequisite for any meaningful use of language (Bacon 1978:85f, 1988:64).
When Bacon treats the issue of linguistic meaning as a special case of sign
relations, he considers the sign relation after the model of real relations presupposing both
the distinction of the terms related (so that nothing can be a sign of itself) and their
actual existence (so that there can be no relation to a non-existent object). As a
consequence of this account, words lose their meaning, or, as Bacon says, "fall away from their
signification" (cadunt a significatione) if their significate ceases to exist (1978:128).
But even if the disappearance of the thing signified annihilates the sign relation and
therefore must result in the corruption of the sign itself, Bacon is well aware that the
use of names and words in general is not restricted to the meaning endowed during
the first act of imposition (the term homo does not only denote those men who were
present when the original act of its imposition took place); nor do words cease to be used
when their name-bearers no longer physically exist (Bacon 1978:128). As a theoretical
device for solving the resulting difficulties regarding the continuity of reference, Bacon
introduced a distinction of two modes of imposition that can be seen as "his most
original contribution to grammar and semantics" (Fredborg 1981:168). Besides the 'formal
mode of imposition', conducted by an illocutionary expression like "I call this ..."
{modus imponendi sub forma impositionis vocaliter expressa), there is a kind of
'secondary imposition', taking place tacitly (sine forma imponendi vocaliter expressa)
whenever a term is applied (transumitur) to any object other than that which the first
160 II. History of semantics
name-giver 'baptized' (Bacon 1978:130). Whereas the formal mode of imposition refers
to acts of explicitly coining a new word, the second mode describes what happens in the
everyday use of language. Bacon (1978:130) states:
We notice that infinitely many expressions are transposed in this way; for when a man is
seeing for the first time the image of a depicted man he does not say that this image shall
be called 'man' in the way names are given to children, he rather transposes the name of
man to the picture. In the same way he who for the first time says that god is just, does
not say beforehand 'the divine essence shall be called justice', but transposes the name of
human justice to the divine one because of the similitude. In this way wc act the whole day
long and renew the things signified by vocal expressions without an explicit formula of vocal
imposition.
In fact this modification of the meaning of words is constantly taking place even
without the speaker or anyone else being actually aware of it. For by simply using
language we "all day long impose names without being conscious of when and
how" (nos tota die imponimus nomina et non advertimus quando et quomodo;
Bacon 1978: 100, l30f).Thus, according to Roger Bacon, who in this respect is playing
off a use theory of meaning against the causal approach, the common mode of
language use is too complex and irregular as to be sufficiently described solely by
the two features of a primary reference-setting and a subsequent causal chain of
reference-borrowing.
5.6. Speculative grammar and its critics
The idea, fundamental already for Bacon, that grammar is a formal science rather than a
propaedeutic art, is shared by the school of the so-called 'modist' grammarians (modistae)
emerging around 1270 in the faculty of arts of the university of Paris and culminating in
the Grammatica Speculativa of Thomas of Erfurt (ca. 1300). The members of this school
who took it for granted that the objective of any formal science was to explain the facts
by giving reasons for them rather than to simply describe them, made it their business to
deduce the 'modes of signifyng' {modi significandi), i.e. grammatical features common to
all languages, from universal 'modes of being' (modi essendi) by means of corresponding
'modes of understanding' (modi intelligendi).
Thus the tradition of 'speculative grammar' (grammatica speculativa) adopted
Aristotle's commonly accepted claim that mental concepts, just as things, are the same
for all men, and developed it further to the thesis of a universal grammar based on the
structural analogy between the 'modes of being' (modi essendi), the 'modes of
understanding' (modi intelligendi), and the 'modes of signifying' (modi significandi) that are
the same for all languages (Bursill-Hall 1971). Thus, Boethius Dacus (1969: 12): one of
the most important theoreticians of speculative grammar, states that
... all national languages are grammatically identical.The reason for this is that the whole
grammar is borrowed from the things ... and just as the natures of things are similar for
those who speak different languages, so are the modes of being and the modes of
understanding; and consequently the modes of signifying are similar, whence, so are the modes
of grammatical construction or speech. And therefore the whole grammar which is in one
language is similar to the one which is in another language.
8. Meaning in pre-19th century thought 161
Even though the words are arbitrarily imposed (whence arise the differences between
all languages), the modes of signifying are uniformly related to the modes of being by
means of the modes of understanding (whence arise the grammatical similarities among
all languages). Soon after 1300 the modistic approach came under substantial criticism.
The main point that critics like Ockham oppose is not the assumption of a basic
universal grammar, for such a claim is implied in Ockham's concept of mental grammar too,
but rather two other aspects of modism: (1) the assertion of a close structural analogy
between spoken or mental language and external reality (William of Ockham 1978:158),
and (2) the inadmissible reification of the modus significandi, which is involved in its
description as some quality or form added to the articulate voice (dictionisuperadditum)
through the act of imposition. To say that vocal expressions 'have' different modes of
signifying is, as Ockham points out, just a metaphorical manner of speaking; for what is
meant is simply the fact that different words signify whatever they signify in different
ways (Ockham 1974:798).
According to John Aurifaber (ca. 1330), a vocal term is significative, or is a sign,
solely by being used significatively, not on grounds of something inherent in the sound.
In order to assign signification a proper place in reality, it must be ascribed to the
intellect rather than to the vocal sound (Pinborg 1967:226).This criticism of modist grammar
is connected to a process that might be described as a progressive 'mentalization' of
signification.
5.7. The late medieval mentalist approach to signification
The idea behind this process is the contention that without some sort of'intentionality'
the phenomena of sign, signification, and semiosis in general must remain inconceivable.
The tendency to relocate the notions of sign and signification from the sphere of spoken
words to the sphere of the mind is characteristic for the mentalist logic, emerging in the
early 14th century and remaining dominant throughout the later Middle Ages.
The signification of spoken words and external signs in general is founded on the
natural signification instantiated in the mental concepts. The cognitive mental act as that
which makes any signification possible is now conceived as a sign or an act of
signification in its most proper sense. The introduction of the notion of formal signification (signi-
ficatio formalis), identical with the mental concept (Raulin 1500: d3vb), is the result of a
fundamental change in the conception of signification.The mental concept does not have
but rather is signification.This, however, does not imply that it is the signified of a spoken
word but, quite the contrary: the mental concept, as Ockham (1974: 7f) claimed, is the
primary significr in subordination to which a spoken word (to which again is subordinate
the corresponding written term) takes up the character of a sign (see Tab. 8.1.).Thus the
significative force of mental concepts is seen as the point at which the analysis of
signification necessarily must find its end. It is an ultimate fact for which no further rationale
can be given (Meier-Oeser 1997:141-143).
6. Concepts of meaning in modern philosophy
Whereas in Western Europe, under the growing influence of humanism, the scholastic
tradition of terminist logic and semantics came to an end in the third decade of the 16th
century, it continued vigourously on the Iberian Peninsula until the 18th century. It was
162 II. History of semantics
then reimported from there into central Europe in the late 16th and early 17th century
and dominated, though in a somewhat simplified form, academic teaching in Catholic
areas for more than a century. In what is commonly labeled 'modern philosphy',
however, logic, the former center of semantic theory, lost many of its medieval attainments
and subsided into inactivity until the middle of the 19th century. In early modern
philosophy of language the logico-semantic approach of the scholastic tradition is displaced
by an epistemological approach, so that in this context the focus is not on the issue of
meaning but rather on the cognitive function of language.
6.1. The modern redefinition of meaning
In early modern philosophy (outside the scholastic discourse) the "difficult question
of whether a spoken word signifies the mental concept or the thing" (see 5.2.) which
once had stimulated a rich variety of distinctly elaborated semantic theories was
unanimously considered as definitively answered. Due to the prevalent persuasion that the
primary function of speech was to express one's thoughts, most of the non-scholastic
early modern authors took up the view that words signify concepts rather than things
which, from a scholastic point of view, had been classified as the "more antiquated" one
(Rubius 1605: 2.18). Given, however, that concepts, ideas, or thoughts are the primary
meaning of words, the thesis that language has a formative influence on thought, which
became increasingly widely accepted during the 18th century, turns out to be a thesis of
fundamental importance to semantics.
6.2. The influence of conventional language on thought processes
Peterof Ailly (1330-1421) claimed that there is such a habitually close connection between
the concept of the thing and the concept of its verbal expression that by stimulating
one of these concepts the other is always immediately stimulated as well (Kaczmarck
1988: 403ff). Still closer is the correlation of language and thought in Giovanni Battista
Giattini's (1601-1672) account of language acquisition. Upon hearing certain words
frequently and in combination with the sensory perception of their significata, a 'complex
species' is generated, and this species comprises, just like the Saussurean sign, the sound-
image as well as the concept of its correlate object ("... generantur ... species complexae
talium vocum simul et talium obiectorum ex ipsa consuetudine"; Giattini 1651:431). In
this vein, Jean de Raey (1622-1702) sees the "union of the external vocal sound and
the inner sense" as the "immediate fundament of signification" and holds that sound
(sonus) and meaning or sense (sensus) make up "one and the same thing rather than
two things" (Racy 1692:29). Thinking, therefore, "seems to be just some kind of internal
speech or logos endiathetos without which there would be no reasoning" (30). Even if de
Reay refers to the old tradition of describing the process of thinking in terms of internal
speech (see 2.2.), a fundamental difference becomes apparent when he claims that "both
the speaker and the hearer have in mind primarily the sound rather than the meaning
and often the sound without meaning but never the meaning without sound" (29). Until
then, internal speech had generally been conceived as being performed in what Augus-
tinus had called verba nullius linguae (see 4.1.), but from de Raey (and other authors of
that time) inner speech is clearly intimately linked to spoken language.
8. Meaning in pre-19th century thought 163
The habitual connection of language and thought was the theoretical foundation for
the thesis of an influence of language on thinking in the 17th century. As the introduction
to the Port-Royal logic notes: "this custom is so strong, that even when we think alone,
things present themselves to our minds only in connection with the words to which we
have been accustomed to recourse in speaking to others." (Arnauld & Nicole 1662:30).
Because most 17th century authors adhered to the priority of thought over language
they considered this custom just a bad habit. While this critical attitude remained a
constant factor in the philosophical view of language during the following centuries, a new
and alternative perspective, taking into account also its positive effects, was opened with
Hobbes, Locke and Leibniz.
6.3. Thomas Hobbes (1588-1679)
In his Logic (Computatio sive logica), which makes up the first part (De Corpore) of his
Elementa Philosophiae (1655, English 1656), Thomas Hobbes draws a parallel between
reasoning and a mathematical calculus, the basic operations of which can be described as
an addition or subtraction of ideas, thoughts, or concepts (Hobbes 1839a: 3-5). Because
thoughts are volatile and fleeting (fluxae et caducae) they have to be fixed by means of
marks (notae) which, in principle, everyone can arbitrarily choose for himself (1839a:
1 If). Because the progress of science, however, can be obtained only in form of a
collective accumulation of knowledge, it is necessary that "the same notes be made common
to many" (Hobbes 1839b: 14). So, in addition to these marks, signs (signa) as a means of
communication are required. In fact, both functions are obtained by words: "The nature
of a name consists principally in this, that it is a mark taken for memory's sake; but
it serves also by accident to signify and make known to others what we remember
ourselves" (Hobbes 1839b: 15).
Hobbes adopts the scholastic practice of viewing linguistic signs in light of the general
notion of sign. His own concept of sign, however, according to which signs in general can
be described as "the antecedents of their consequents, and the consequents of their
antecedents", is from the outset confined to the class of indcxical signs (Hobbes 1839b: 14).
Words and names ordered in speech must therefore be indexical signs rather than
expressions of conceptions, just as they cannot be "signs of the things themselves; for
that the sound of this word stone should be the sign of a stone, cannot be understood
in any sense but this, that he that hears it collects that he that pronounces it thinks of a
stone" (Hobbes 1839b: 15). It is true, from Roger Bacon onwards we find in scholastic
logic the position that words are indexical signs of the speaker's concepts. Connected to
this, however, was always the assumption, explicitly denied by Hobbes, that the proper
significate of words are the things talked about.
Names, according to Hobbes, "though standing singly by themselves, are marks,
because they serve to recall our own thoughts to mind, ... cannot be signs, otherwise
than by being disposed and ordered in speech as parts of the same" (Hobbes 1839b: 15).
As a result of the distinction between marks and signs, any significative function of words
can be realized only in the framework of communication. These rudiments of a
sentence semantics (Hungerland & Vick 1973), however, were not elaborated any further
by Hobbes.
164 II. History of semantics
6.4. The 'Port-Royal Logic' and 'Port-Royal Grammar'
The so called Port-Royal Logic (Logique ou Vart depenser), published in 1662 by Antoine
Arnauld (1612-1694) and Pierre Nicole (1625-1695), is not only one of the most
influential early modern books on the issue of language but in some respect also the most
symptomatic one. For "it marks, better than any other, the abandonment of the
medieval doctrine of an essential connection between logic and semantics", and treats the
"most fundamental questions ... with the kind of inattention to detail that came to
characterize most of the many semantic theories of the Enlightenment" (Kretzmann 1967:
378a).
The most influential, though actually quite modest, semantic doctrine of this text is
the distinction between comprehension and extension which is commonly seen as a direct
ancestor of the later distinction between intension and extension. And yet it is different:
Whereas the comprehension of an idea is tantamount to "the attributes it comprises in
itself that cannot be removed from it without destroying it", extension is described as
"the subjects with which that idea agrees, which are also called the inferiors of the
general term, which in relation to them, is called superior; as the idea of triangle is extended
to all the various species of triangle" (Arnauld & Nicole 1662: 61f). It is manifest that
Arnauld in this passage unfolds his doctrine along the lines of the relation of genus and
species, so that the idea of a certain species would be part of the extension of the idea of
the superior genus, which does not match with the current use of the intension/extension
distinction.The extension of a universal idea, however, does not consist of species alone;
for Arnauld also notices that the extention of a universal idea can be restriced in two
different modes: either (1) "by joining another distinct or determinate idea to it" (e.g.
"right-angled" to "triangle") which makes it the idea of a certain subclass of the universal
idea (= extension l),or (2) by "joining to it merely an indistinct and indeterminate idea
of a part" (e.g. the quantifying syncategoreme "some") which makes it the idea of an
undetermined number of individuals (= extension 2) (Arnauld & Nicole 1662:62). What
Arnauld intends to convey is simply that the restriction of a general idea can be achieved
cither by specification or by quantification - which, however, result in two different and
hardly combinable notions of extension.
While empirism, according to which sense experience is the ultimate source of
all our concepts and knowledge, was prominently represented by Hobbes, the Port-
Royal Logic, taking a distinct Cartesian approach, is part of the rationalist philosophy
acknowledging the existence of innate ideas or, at least, of predetermined structures
of rational thought. This also holds for the Port-Royal Grammar (1660) (Arnauld &
Lancelot 1667/1966) that opened the modern tradition of universal grammar which
dominated linguistic studies in the 17th and 18th centuries (Schmitter 1996). The
universal grammarians aimed to reduce, in a Chomskian-style analysis, the fundamental
structures of language to universally predetermined mental structures. The distinction
of deep and surface structure seems to be heard when Nicolas Beauzde (1717-1789)
claims that since all languages are founded on an identical "mechanisme intellectuel"
the "differences qui se trouvent d'une langue a 1'autre ne sont, pour ainsi dire, que
superficielles." (Beauzee 1767: viiif).
Whereas the rationalist grammarians took language as "the exposition of the analysis
of thought" (Beauzde 1767: xxxiiif) and thus as a means of detecting the rules of thought,
empirists like Locke or Condillac saw language as a means of forming and analyzing
8. Meaning in pre-19th century thought 165
complex ideas, thus showing a pronounced tendency to ascribe a certain influence of
language on thought.
6.5. John Locke (1632-1704)
Locke's Essay Concerning Human Understanding (1690/1975) is the most influential
early modern text on language, even if the third book, which is devoted to the issue "of
words", hardly offers more than a semantics of names, differentiating between names of
simple ideas, of mixed modes, and of natural substances. In this context Locke focuses on
the question of how names and the ideas they stand for are related to external reality.
With regard to simple ideas the answer is simple as well. For their causation by external
object shows such a degree of regularity that our simple ideas can be considered as "very
near and undiscernably alike" (389). Error and dissent, therefore, turn out to be
primarily the result of inconsiderate use of language: "Men, who well examine the Ideas of
their own Minds, cannot much differ in thinking; however, they may perplex themselves
with words" (180).
Locke places emphasis on the priority of ideas over words in some passages (437;
689) and distinguishes between a language-independent mental discourse and its
subsequent expression in words (574ff). However, the thesis that thought is at least in principle
independent of language is counterbalanced by Locke's account of individual language
acquisition and the actual use of language. Even if the idea is logically prior to the
corresponding word, this relation is inverted in language learning. For in most cases the
meaning of words is socially imparted (Lenz 2010), so that we learn a word before being
acquainted with the idea customarily connected to it (Locke 1975: 437). This habitual
connection of ideas with words does not only effect an excitation of ideas by words but
quite often a substitution of the former by the latter (408). Thus, the mental
proposition made up of ideas actually turns out to be a marginal case, for "most Men, if not
all, in their Thinking and Reasoning within themselves, made use of Words instead of
Ideas" (575). Even if clear and distinct knowledge were best achieved by "examining
and judging of Ideas by themselves their Names being quite laid aside", it is, as Locke
conceeds, "through the prevailing custom of using Sounds for Ideas ... very seldom
practised" (579).
Locke's theory of meaning is often characterized as vague or even incoherent
(Kretzmann 1968; Landesman 1976). For, on the one hand, Locke states that "Words in
their primary or immediate Signification, stand for nothing, but the Ideas in the Mind
of him he uses them" (Locke 1975: 405f, 159, 378, 402, 420, 422) so that words "can be
Signs of nothing else" (408). On the other hand, like most scholastic authors unlike the
contempora trend of the 17th and 18th centuries, he considers ideas as signs of things,
and advocates the view that words "ultimatly ... represent Things" (Locke 1975:520) in
accordance with the scholastic mediantibus conceptibus-lhesis (see 5.2.; Ashworth 1984).
Whereas in the late 19th century it was considered "one of the glories of Locke's
philosophy that he established the fact that names are not the signs of things but in their
origins always the signs of concepts" (Miiller 1887: 77), it is precisely for this view that
Locke's semantic theory is often criticized as a paradigm case of private-language
philosophy and semantic subjectivism. But if there is something like semantic subjectivism
in Locke then it is more in the sense of a problem that he is pointing to than something
that his theory tends to result in.For one of his main points regarding our use of language
166 II. History of semantics
is that we should keep it consistent with the use of others (Locke 1975: 471), since
words are "no Man's private possession, but the common measure of Commerce and
Communication" (514).
Locke saw sensualism as supported by an interesting observation regarding meaning
change (see article 99 (Fritz) Theories of meaning change) of words. Many words, he
noticed, "which are made use of to stand for actions and notions quite removed from
sense, have their rise from ... obvious sensible ideas and are transferred to more
abstruse significations" (Locke 1975: 403). Therefore large parts of our vocabulary are
"metaphorical" concepts in the sense that metaphor is defined by the modern cognitive
account (see article 26 (Tyler &Takahashi) Metaphors and metonymies).That is, thinking
about a concept from one knowledge domain in terms of another domain, as is
exemplified by terms like "imagine, apprehend, comprehend, adhere, conceive, instil, disgust,
disturbance, tranquillity".
This view was substantiated by Leibniz's comments on the epistemic function of the
"analogic des choses sensibles et insensibles", as it becomes manifest in language. It
would be worthwile, Leibniz maintained, to consider "1'usage des prepositions, comme a,
avec, de, devant, en, hors, par, pour, sur, vers, qui sont toutes prises du lieu, de la distance,
et du mouvement, et transferees depuis a toute sorte de changemens, ordres, suites,
differences, convenances" (Leibniz 1875-1890: 5.256). Kant, too, agreed in the famous §59
of his Critique of Judgement that this symbolic function of language would be "worthy
of a deeper study". For "... the words ground (support, basis), to depend (to be held up
from above), to flow from (instead of to follow), substance ... and numberless others,
are ... symbolic hypotyposes, and express concepts without employing a direct intuition
for the purpose, but only drawing upon an analogy with one, i.e., transferring the
reflection upon an object of intuition to quite a new concept, and one with which perhaps no
intuition could ever directly correspond."
Thus, according to Locke, Leibniz and Kant, metaphor is not simply a case of deviant
meaning but rather, as modern semantics has found out anew, an ubiquitous feature of
language and thought.
6.6. G. W. Leibniz (1646-1716) and the tradition of symbolic knowledge
While Hobbes and Locke, at least in principle and to a certain degree, still acknowledged
the possibility of a non-linguistic mental discourse or mental proposition, Gottfried
Wilhelm Leibniz emphazised the dependency of thinking on the use of signs: "thinking
can take place without words ... but not without other signs" (1875-1890: 7.191). For
"all our reasoning is nothing but a process of connecting and substituting characters
which may be words or other signs or images" (7.31). This view became explicit and
later extremely influential under the label of cognitio symbolica (symbolic knowledge,
symbolic cognition), a term Leibniz coined in his Meditationes de cognitione, veritate et
ideis (1684; 1875-1890: 4.423). Symbolic knowledge is opposed to intuitive knowledge
(cognitio intuitiva) which is defined as a direct and simultaneous conception of a complex
notion together with all its partial notions.
Because the limited human intellect cannot comprehend more complex concepts
other than successively, the complex concept of the thing itself must be substituted by a
sensible sign in the process of reasoning always supposing that a detailed explication of
its meaning could be given if needed. Leibniz, therefore, maintains that the knowledge or
8. Meaning in pre-19th century thought 167
cognition of complex objects or notions is always symbolic, i.e. performed in the medium
of signs (1875-1890:4.422f).The possibility and validity of symbolic knowledge is based
on the principle of proportionality according to which the basic signs used in symbolic
knowledge may be choosen arbitrarily, provided that the internal relations between
the signs are analogous to the relations between the things signified (7.264). In his
Dialogue (1677) Leibniz remarks that "even if the characters are arbitrary, still the use and
interconnection of them has something that is not arbitrary - viz. a certain
proportionality between the characters and the things, and the relations among different
characters expressing the same things. This proportion or relation is the foundation of truth."
(Leibniz 1875-1890:7.192).
The notion of cognitio symbolica provides the cpistcmological foundation of both
his project of a characteristica universalis or universal language of science and his
philosophy of language. For the basic principle of analogy is also realized in natural
languages to a certain extent. Leibniz therefore holds that "languages are the best mirror
of the human mind and that an exact analysis of the signification of words would make
known the operations of the understanding better than would anything else" (5.313).
Especially through its reception by Christian Wolff (1679-1654) and his school the
doctrine of symbolic knowledge became one of the main epistemological issues of the
German Enlightenment. In this tradition it is a common conviction that the use of signs
in general and of language in particular provides an indispensable function for any
higher intellectual operation. Hence Leibniz's proposal of a characteristica universalis
has been massively taken up, even though mostly in a somewhat altered form. For what
the 18th century authors are generally aiming at is not the invention of a sign system for
obtaining general knowledge, but rather a general science of sign systems. The general
doctrine of signs, referred to with names like Characteristica, Semiotica, or Semiologia,
was considered as a most important desideratum. In 1724 Wolff's disciple Georg Bern-
hard Bilfingcr (1693-1750) suggested the name Ars semantica for this, as he saw it, until
then neglected discipline, the subject matter of which would be the knowledge of all sorts
of signs in general as well as the theory about the invention, right use, and assessment of
linguistic signs in particular (Bilfinger 1724:298f).
The first extended attempt to fill this gap was made by Johann Heinrich Lambert
(1728-1777) with his Semiotik, oder die Lehre von der Bezeichnung der Gedanken und
Dinge (Semiotics, or the doctrine of the signification of thoughts and things), published
as the second part of his Neues Organon (1764).The leading idea of this work is Leibniz's
principle of proportionality which guarantees, as Lambert claims, the interchangeability
of "the theory of the signs" and "the theory of the objects" signified. (Lambert 1764:
3.23-24).
Besides the theory of sign invention, the hermeneutica, the theory of sign
interpretation, made up an essential part of the characteristica. Within the framework of
'general hermeneutics' {hermeneutica generalis), originally designed by Johann Conrad
Dannhauer (1603-1666) as a complement to Aristotelian logic, the reflections on
linguistic meaning focused on sentence meaning. Due to Dannhauer's influential Idea
boni interpretis (Presentation of the good interpreter, 1630), some relics of scholastic
semantics were taken up by hcrmcncutical theory, for which particularly the theory of
supposition (see 5.4.), which provided the "knowledge about the modes in which the
signification of words may vary according to their connection to others" (Reusch 1734:266)
had to be of pivotal interest. The developing discipline of hermeneutics as "the science
168 II. History of semantics
of the very rules in compliance of which the meanings can be recognized from their
signs" (Meier 1757: 1), shows a growing awareness of several important semantic
doctrines, as for instance the methodological necessity of the principle of charity in form of
a presumption of consistency and rationality (Scholz 1999:35-64), the distinction between
language meaning and contextual meaning, as it is present in Christian August Crusius's
(1747:1080) distinction between grammatic meaning (grammatischer Verstand),i.e."the
totality of meanings a word may ever have in one language", and logic meaning, i.e. the
"totality of what a word can mean at a certain place and in a certain context", or
the principle of compositionality (see article 6 (Pagin & Westerstahl) Compositionality),
as it appears in Georg Friedrich Meier's claim that "the sense of a speech is the sum total
of the meanings of the words that make up this speech and which arc connected to and
determining each other" (Meier 1757:57).
6.7. Condillac (1714-1780)
One of the most decisive features of 18th century science is its historical approach. In
linguistics this resulted in a great number of works on the origin and development of
language (Gessinger & von Rahden 1989).The way in which the historic-genetic point of view
opened new perspectives on the relation between language and thought is most clearly
reflected in Etienne Bonnot de Condillac's Essai sur Vorigine des connaissances humaines
(1746). Already in the early 18th century it was widely accepted that linguistic signs provide
the basis for virtually any intellectual knowledge. Christian Thomasius (1655-1728) had
argued that the use of language plays a decisive role in the ontogenetic development of
each individual's cognitive faculties (Meier-Oeser 2008). Condillac goes even further and
argues that the same holds for the phylogenetic development of mankind as well.
Language, therefore, is not only essentially involved in the formation of thoughts or abstract
ideas but also in the formation of the subject of thought, viz. of man as an intellectual being.
According to Condillac all higher cognitive operations are nothing but 'transformed
sensation' (sensation transforme). The formative principle that effects this
transformation is language or, more generally, the use of signs (Vusage des signes). Condillac
reconstructs the historical process of language development as a process leading from a
primordial natural language of actions and gestures (langage d'action), viewed as a
language of simultaneous ideas (langage des idees simultanees), to the language of articulate
voice (langage des sons articules), viewed as a language of successive ideas (langage des
idees successives).
We are so deeply accustomed to spoken language with its sequential catenation of
articulate sounds, he notices, that wc believe our ideas would by nature come to our
mind one after another, just as we speak words one after another. In fact, however, the
discursive structure of thinking is not naturally given but rather the result of our use of
linguistic signs. The main effect of articulate language consists in the gradual analysis of
complex and indistinct sensations into abstract ideas that are connected to and make up
the meaning of words. Since language is a necessary condition of thought and knowledge,
Locke was mistaken to claim that the primary purpose of language is to communicate
knowledge (Condillac 1947-1951:1.442a):
The primary purpose of language is to analyze thought. In fact we cannot exhibit the ideas
that coexist in our mind successively to others except in so far as wc know how to exhibit
8. Meaning in pre-19th century thought 169
them successively to ourselves. That is to say, we know how to speak to others only in so far
as we know how to speak to ourselves.
It was therefore Locke's adherence to the scholastic idea of mental propositions that
prevented him "to realize how necessary the signs are for the operations of the soul"
(1.738a). Every language, Condillac claims,"is an analytic method". Due to his
comprehensive notion of language this also holds vice versa: "every analytic method is a
language" (2.119a), so that Condillac, maintaining that sciences are analytical methods by
essence, comes to his provocative and controversial thesis that "all sciences are nothing
but well-made languages" (2.419a).
Condillac's theory of language and its epistemic function became the core topic of the
so-called school of'ideology' (ideologic) that dominated the French scene in early 19th
century. Although most authors of this school rejected the absolute necessity of signs and
language for thinking, they adhered to the subjectivistic consequences of sensualism and
considered it impossible "that one and the same sign should have the same value for all
of those who use it and even for each of them at different moments of time" (Destutt de
Tracy 1803:405).
7. References
7.1. Primary Sources
Abelard, Peter 1927. Logica ingredientibus, Glossae super Peri ermenias. Ed. B. Geyer. Munstcr:
Ascliendorff.
Andreas, Antonius 1508. Scriptum in arte veteri. Venice.
Aquinas,Thomas 1989. Expositio libriperyermenias. In: Opera omnia I* 1. Ed. Coinmissio Leonina.
2nd edn. Rome.
Arnauld, Antoine & Pierre Nicole 1965. La logique ou I'art depenser. Paris.
Arnauld, Antoine & Claude Lancelot 1667/1966. Grammaire generate et raisonnee ou La Gram-
maire de Port-Royal. Ed. H. E. Brekle, t. 1, reprint of the 3rd edn. Stuttgart-Bad Cannstatt:
Frornmami. 1966.
Augustinus 1963. De docti ina Christiana. In: W. M. Green (ed.). Sancti Augustini Opera (CSEL 80).
Vienna: Osterreichische Akademie der Wissenschaften.
Augustinus 1968. De trinitate. In: W. J. Mountain & F. Glorie (eds.). Aurelii Augustini Opera (CCSL
50).Turnhout: Brepols.
Augustinus 1975. De dialectica. Ed. Jan Pinborg, translation with introduction and notes by B.
Dan el Jackson. Dordrecht: Reidel.
Bacon, Roger 1978. De signis. Ed. K.M. Fredborg, L. Nielsen & J. Pinborg. Traditio 34,75-136.
Bacon, Roger 1988. Compendium Studii Theologiae. Ed.Th. S. Maloney. Leiden: Brill.
Bilfinger, Georg Bernhard 1724. De Sermone Sinico. In: Specimen doctrinae vetentm Sinarum
moralis et politicae, Frankfurt/M.
Beauz£e, Nicolas 1767. Grammaire generate. Paris.
Boethius, Anicius M.T. S. 1880. In Perihermeneias editio secunda. Ed. C. Meiser. Leipzig:Teubner.
Boethius Dacus 1969. Modi significandi. Ed. J. Pinborg & H. Roos. Copenhagen: C. E. C. Gad.
Condillac, Eticnne Bonnot de 1947-1951. Oeuvres philosophiques. Ed. G. Lc Roy. Paris: PUF.
Conimbricenses 1607. Commentaria in universam Aristotelis dialectica. Cologne.
Crusius, Christ ian August 1747. Weg zur Gewifilheit und Zu verlassigkeit der menschlichen Erkenntnis.
Leipzig.
Dannhauer, Johann Conrad 1630. Idea boni interpretis. Strasburg.
Destutt de Tracy, Antoine 1803. Elements d'ideologie. Vol. 2. Paris.
Giattini, Johannes Baptista 1651. Logica. Rome.
170 II. History of semantics
Henry of Ghent 1520. Summa quaestionum ordinarium. Paris.
Hobbes. Thomas 1839a. Opera philosophica quae scripsit omnia. Ed. W. Molesworth, vol. 1. London.
Hobbes. Thomas 1839b. The English Works. Ed. W. Molesworth, vol. 1. London.
Hoffbauer, Johannes Christoph 1789. Tentamina semiologica sive quaedam generalem theoriam sig-
norum spectantia. Halle.
Lambert, Johann Heinrich 1764. Neues Organon oder Gedanken uber die Erforschung und Bezeich-
nung des Wahren. Vol. 2. Leipzig.
Leibniz, Gottfried Wilhelm 1875-1890. Die philosophischen Schriften. Ed. C. I. Gerhardt. Berlin.
Reprinted: Hildesheim: Olms, 1965.
Locke, John 1975. An Essay concerning Human Understand. Ed. P. H. Nidditch. Oxford: Clarendon
Press.
Meier, Georg Friedrich 1757. Versuch einer allgemeinen Auslegungskunst. Halle.
Nicolaus a S. Iohanne Baptista 1687. Philosophia augustiniana. Genova.
Ockham, William of 1974. Summa logicae. Ed. Ph. Bochncr, G. Gal & S. Brown. In: Opera
philosophica 1. St. Bona venture, NY: The Franciscan Institute.
Ockham, William of 1978. Expositio in librum Perihermeneias Aiistotelis. Ed. A. Gambatese &
S. Brown. In: Opera philosophica 2. St. Bona venture, NY: The Franciscan Institute.
Raey, Jean de 1692. Cogitata de interpretation. Amsterdam.
Raulin, John 1500. In logicam Aristotelis commentarium. Paris.
Reusch, Johann Peter 1734. Systema logicum. Jena.
Rubius. Antonius 1605. Logica mexicana. Cologne.
Scotus, John Duns 1891. In primum librum perihermenias quacstiones. In: Opera Omnia. Ed.
L. Vives. vol. 1. Paris.
Smiglecius. Martin 1634. Logica. Oxford,
de Soto, Domingo 1554. Summulae. Salamanca.
Vcrsor, Johannes 1572. Summulae logicales. Venice.
7.2. Secondary Sources
Annas. Julia E. 1992. Hellenistic Philosophy of Mind. Berkeley. LA: University of California Press.
Ashworth, E. Jennifer 1984. Locke on language. Canadian Journal of Philosophy 14,45-74.
Ashworth, E. Jennifer 1987. Jacobus Naveros on the question: 'Do spoken words signifiy concepts
or things?'. In: L. M. de Rijk & C. A. G. Braakhuis (eds.). Logos and Pragma. Nijmegen: Inge-
nium Publishers. 189-214.
Barnes, Jonathan 1993. Meaning, saying and thinking. In: K. Doling & T. Ebert (eds.). Dialektiker
und Stoiker. Zur Logik der Stoa und ihrer Vorlaufer. Stuttgart: Steiner, 47-61.
Baxter,Timothy M. S. 1992. The Cratylus: Plato's Critique of Naming. Leiden: Brill.
Biihler, Karl 1934. Sprachtheorie. Jena: Fischer.
Bursill-Hall, G. E. 1971. Speculative Grammars of the Middle Ages. The Hague: Mouton.
Ebbesen, Sten 1983. The odyssey of semantics from the Stoa to Buridan. In: A. Eschbach & J.
Trabant (eds.). History of Semiotics. Amsterdam: Benjamins. 67-85.
Fredborg, Karen Margareta 1981. Roger Bacon on "Impositio vocis ad significandum'. In: H. A. G.
Braakhuis, C. H. Kneepkens & L. M. de Rijk (eds.). English Logic and Semantics. Nijmegen:
Ingenium Publishers. 167-191.
Gessinger, Joachim & Wolfert von Rahden 1989 (eds.). Theorien vom Ursprung der Sprache. Berlin:
de Gruyter.
Graeser, Andreas 1978. The Stoic theory of meaning. In: J. M. Rist (ed.). The Stoics. Berkeley, CA:
University of California Press, 77-100.
Graeser, Andreas 1996. Aristotclcs. In:T. Borschc (cd.). Klassiker der Sprachphilosophie, Miinchcn:
C.H. Beck, 33-^7.
Hungei land, Isabel C. & George R. Vick 1973. Hobbes's theory of signification. Journal of the
History of Philosophy 11,459^82.
8. Meaning in pre-19th century thought 17j_
Jacobi, Klaus 1983. Abelard and Frege. Tlie seinanics of words and propositions. In: V.M. Abrusci,
E. Casari & M. Mugnai (eds.). Alii del Convegno Internazionale di storia della logica, San
Gimignano 1982. Bologna: CLUEB, 81-96.
Kaczinarek, Ludger 1988. 'Notitia' bei Peter von Ailly. In: O. Pluta (ed.). Die Philosophic im 14. und
15. Jahrhundert. Amsterdam: Griiner, 385-420.
Kretzmann, Norman 1967. Semantics, history of. In: P. Edwards (ed.). The Encyclopedia of
Philosophy. Vol. 7. New York: The Macmillan Company, 358b-406a.
Kretzmann, Normann 1968. The main thesis of Locke's semantic theory. Philosophical Review 78,
175-196.
Kretzmann, Norman 1974. Aristotle on spoken sound significant by convention. In: J. Corcoran
(ed.). Ancient Logic and its Modern Interpretation. Dordrecht: Kluwer, 3-21.
Kretzmann, Norman 1982. Syncategoremata. exponibilia, sophismata. In: N. Kretzmann, A.
Kenny & J. Pinborg (eds.). The Cambridge History of Later Medieval Philosophy. Cambridge:
Cambridge University Press, 211-245.
Landcsman, Charles 1976. Locke's theory of meaning. Journal of the History of Philosophy 14,
23-35.
Lenz, Martin 2010. Lockes Sprachkonzeption. Berlin: de Gruyter.
Lewry, Osmund 1981. Robert Kilwardby on meaning. A Parisian course on the logica vetus.
Miscellanea mediaevalia 13,376-384.
Lieb, Hans-Heinrich 1981. Das 'semiotische Dreieck' bei Ogden und Richards: Eine Neuformu-
lierung des Zeichenmodells von Aristoteles. In: H. Geckeler et al. (eds.). Logos Semantikos.
Studia Linguistica in Honorem Eugenio Coseriu, vol. 1. Berlin: dc Gruyter/Madrid: Gredos,
137-156.
Long, Anthony A. & David N. Sedley 1987. The Hellenistic Philosophers. Cambridge: Cambridge
University Press.
Long, Anthony A. 1971. Language and thought in Stoicism. In: A. A. Long (cd.). Problems in
Stoicism. London: The Athlonc Press, 75-113.
Magee, John 1989. Boethius on Signification and Mind. Leiden: Brill.
Meier-Oeser, Stephan 1996. Signification. In: J. Ritter & K. Gi under (eds.). Historisches Worterbuch
der Philosophic 9. Basel: Schwabc, 759-795.
Mcicr-Ocscr, Stephan 1997. Die Spur des Zeichens. Das Zeichen und seine Funktion in der
Philosophic des Mittelalters und der friihen Neuzeit. Berlin: de Gruyter.
Meier-Oeser, Stephan 1998. Synkategorem. In: J. Ritter & K. Griinder (eds.). Historisches
Worterbuch der Philosophic 10. Basel: Schwabe, 787-799.
Meier-Oeser, Stephan 2004. Sprache und Bilder im Geist. Philosophisches Jahrbuch 111,312-342.
Meier-Oeser, Stephan 2008. Das Ende der Metapher von der 'inneren Rede'. Zuin Verhaltnis von
Sprache und Denken in der deutschen Friihaufklarung. In: H. G. Bodecker (ed.). Strukturen der
deutschen Friihaufklarung 1680-1720. Gottingen: Vandenhoeck & Ruprecht, 195-223.
Meier-Oeser, Stephan 2009. Walter Buiiey's piopositio in re and the systematization of the ordo
significationis. In: S. F. Brown, Th. Dewender & Th. Kobusch (eds.). Philosophical Debates at
Paris in the Early Fourteenth Century. Leiden: Brill, 483-506.
Miiller. Max 1887. The Science of the Thought. London: Longmans, Green & Co.
Panaccio, Claude 1999. Le Discours Interieure de Platon a Guillaume Ockham. Paris: Seuil.
Pinborg, Jan 1967. Die Entwicklung der Sprachtheorie im Mittelalter. Minister: Aschendorff.
Pinborg, Jan 1972. Logik und Semantik im Mittelalter. Ein Oberblick. Ed. H. Kohlenberger.
Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Pohlcnz, Max 1939. Die Bcgrundung der abcndlandischcn Sprachlchrc durch die Stoa. (Nach-
richten von der Gesellschaft der Wissenschaften, Gottingen. Philologisch-historische Klasse,
N.F. 3). Gottingen: Vandenhoeck & Ruprecht, 151-198.
Rijk, Lambertus Maria de 1967. Logica modemorum, vol. II/l. Assen: Van Gorcum.
Schmitter, Peter (ed.) 1996. Sprachtheorien der Neuzeit II. Von der Grammaire de Port-Royal zur
Konstitution moderner linguistischer DiszipHnen.Tub'ingen: Narr.
172 II. History of semantics
Schmitter, Peter (2001).The emergence of philosophical semantics in early Greek antiquity. Logos
and Language 2,45-56.
Scholz, Oliver R. 1999. Verstehen und Rationalitat. Frankfui t/M.: Klostermann.
Sedley, David 1996. Aristotle's de interpretatione and ancient semantic. In: G. Manetti (ed.).
Knowledge Through S/gns. Turnhout: Brepols, 87-108.
Sedley, David 2003. Plato's Cratylus. Cambridge: Cambridge University Press.
Tachau, Katherine H. 1987. Wodeham, Crathorn and Holcot: The development of the complexe
significabile. In: L. M. de Rijk & H. A. G. Braakhuis (eds.). Logos and Pragma. Nijmegen: Inge-
nium Publishers. 161-189.
Weidemann, Hermann 1982. Ansatze zu einer semantischen Theorie bei Aristoteles. Zeitschrift fur
Semiotik 4,241-257.
Stephan Meier-Oeser, Berlin (Germany)
9. The emergence of linguistic semantics in the
19th and early 20th century
1. Introduction: The emergence of linguistic semantics in the context of
interdisciplinary research in the 19th century
2. Linguistic semantics in Germany
3. Linguistic semantics in France
4. Linguistic semantics in Britain
5. Conclusion
6. References
Abstract
This chapter deals with the 19th-century roots of current cognitive and pragmatic
approaches to the study of meaning and meaning change. It demonstrates that 19th-century
linguistic semantics has more to offer than the atomistic historicism for which 19th-century
linguistics became known and for which it was often criticised. By contrast, semanticists
in Germany, France and Britain in particular sought to study meaning and change of
meaning from a much more holistic point of view, seeking inspiration from philosophy,
biology, geology, psychology, and sociology to study how meaning is 'made' in the context
of social interaction and how it changes over time under pressure from changing linguistic,
societal and cognitive needs and influences.
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 172-191
9. The emergence of linguistic semantics in the 19th and early 20th century 173
The subject in which I invite the reader to follow me is so new in kind that it has not even
been given a name. The fact is that most linguists have directed their attention to the forms
of words: the laws which govern changes in meaning, the choice of new expressions, the
birth and death of phrases, have been left behind or have been noticed only in passing.
Since this subject deserves a name as much as does phonetics or morphology, I shall call it
semantics [...]. the science of meaning.
(Breal 1883/1991:137)
In memory of Peter Schmitter who first helped me to explore the history of semantics
1. Introduction: The emergence of linguistic semantics in the
context of interdisciplinary research in the 19th century
The history of semantics as a reflection on meaning is potentially infinite, starting
probably with Greek or Hindu philosophers of language and embracing more than two
thousand years of the history of mankind. It is spread over numerous disciplines, from ancient
philosophy (Schmitter 2001a), to modern cognitive science. The history of semantics as
a linguistic discipline is somewhat shorter and has been well explored (Ncrlich 1992b,
1996a, 1996b). This chapter therefore summarizes results from research dealing with the
history of semantics from the 1950s onwards and I shall follow the standard view that
semantics as a linguistic discipline began with Christian Karl Reisig's lectures on Latin
semasiology or Bedeutungslehre, given in the 1820s (Schmitter 1990,2004).
As one can see from the motto cited above, another central figure in the history of
linguistic semantics in the 19th century is undoubtedly Michel Brdal, the author of the
famous Essai de semantique, published in 1897, the cumulative product of work started
as early as 1866 (Breal 1883/1991). When this seminal book was translated into English in
1900, the name of the discipline that studied linguistic meaning and changes of meaning
became 'semantics', and other terms, such as semasiology, or sematology were sidelined
in the 20th century.
Although the focus is here on linguistic semantics, it should not be forgotten that the
study of'meaning' also preoccupied philosophers and semioticians.Two seminal figures
in the 19th century that should be mentioned in this context are Charles Sanders Peirce
in the United States who worked in the tradition of American pragmatism (Nerlich &
Clarke 1996) and Gottlob Frege who helped found mathematical logic and analytic
philosophy (see article 10 (Newcn & Schroder) Logic and semantics and article 3 (Textor)
Sense and reference). Although Peirce exerted enormous influence on the developments
of semantics, pragmatics and semiotics in the late 19th and early 20th century, his main
interest can be said to have been in epistemology. Frege too exerted great influence
on the development of formal semantics (see article 11 (Kempson) Formal semantics
and representationalism and article 14 (ter Meulen) Formal methods), truth-conditional
semantics, feature semantics and so on, especially through his distinction between Sinn
and Bedeutung or sense and reference. However, his main interest lay in logic, arithmetic
and number theory. Neither Frege nor Peirce were widely discussed in the treatises on
linguistic semantics which will be presented below, except for the philosophical and
psychological reflections of meaning around the tradition of 'signifies'. Frege was a logician,
not a linguist and he himself pointed out that "[t]o a mind concerned with the beauties
of language, what is trivial to the logician may seem to be just what is important" (Frege
1977:10).The linguists discussed below were all fascinated with the beauty of language.
174 II. History of semantics
This chapter focuses on linguistic semantics in Germany, France, and Britain, thus
leaving aside Eastern Europe, Russia, Scandinavia, and our closer neighbours, Italy,
Spain, the Benelux countries and many more. However, the work carried out in these
countries was, as far as I know, strongly influenced by, if not dependent on the theories
developed in Germany, France, and Britain. Terminologically, German linguists initially
wrote about Semasiologie, French ones about la semantique, and some English ones
about sematology. In the end the term semantics was universally adopted.
In general terms one can say that linguistic semantics emerged from a dissatisfaction
with traditional grammar on the one hand, which could not deal adequately with
questions of meaning, and with traditional lexicography and etymology on the other, which
did not give a satisfactory account of the evolution of meaning, listing the meanings of
words in a rather arbitrary fashion, instead of looking for a logical, natural or inner order
in the succession of meanings. To redefine grammar, scholars looked for inspiration in
the available traditions of philosophy; to redefine lexicography they grasped the tools
provided by rhetoric, that is the figures of speech, especially metaphor and metonymy
(Nerlich 1998).The development of the field was however not only influenced by internal
factors relating to the study of language but also by developments in other fields such
as geology and biology, for example, from which semantics imported concepts such as
'uniformitarianism', 'transformation', 'evolution', 'organism' and 'growth'. After initial
enthusiasm about ways to give semantics'scientific' credibility in this way, a debate about
whether framing the development and change of meaning in such terms was legitimate
would preoccupy scmanticists in the latter half of the 19th century.
The different traditions in the field of semantics were influenced by different
philosophical traditions on the one hand, by different sciences on the other. In the case of
German semasiology, the heritage of Kantian philosophy, idealism, the romantic
movement, and the new type of philology, or to quote some names, the works of Immanuel
Kant, especially his focus on the 'active subject', Wilhclm von Humboldt and his
concept of'ergon' and 'energeia' (Schmitter 2001b) and Franz Bopp and his research into
comparative grammar were of seminal importance. German semasiology after Reisig
was very much attached to the predominant paradigm in linguistic science, that is, to
historical-comparative philology. This might be the reason why the term 'semasiology',
as designating one branch of a prospering and internationally respected discipline, was
at first so successful in English speaking countries where an autonomous approach to
semantics was missing. This was not the case in France, where Brcal, from 1866 onwards,
used semantic research as a way to challenge the German supremacy in linguistics. Later
on, however, German Semasiologie, just like the French tradition of la semantique, began
to be influenced by psychology, and thus these two traditions moved closer together.
French semantics, especially the Brealian version, was influenced by the French
philosophy of language which was rooted in the work of Etienne Bonnot de Condillac and
his followers, the Ideologues, on words as signs. But Breal was also an admirer of Bopp
and of Humboldt. Breal first expressed his conviction that semantics should be a
psychological and historical science in his review of the seminal book, La Vie des mots, written in
1887 by Arsene Darmesteter (first published in English, in 1886, based on lectures given
in London) and advocated caution in adopting terms and concept from biology, such as
organism and transformation (Brdal 1887).
Darmesteter had derived his theory of semantic change from biological models, such
as Charles Darwin's theory of evolution and August Schleicher's model of language
as an evolving organism and of languages as organised into family trees, transforming
9. The emergence of linguistic semantics in the 19th and early 20th century 175
themselves independently from the speakers of the language. Darmesteter applied this
conception to words themselves. It is therefore not astonishing to find that Darmesteter's
booklet contains a host of biological metaphors about the birth, life and death of words,
their struggle for survival, etc.This metaphorical basis of Darmesteter's theory was noted
with skepticism by his colleagues who agreed however that Darmesteter's book was the
first really sound attempt at analysing how and why words change their meanings. To
integrate Darmesteter's insights into his own theoretical framework, Brdal had only to
replace the picture of the autonomous change of a language by the axiom that words
change their meaning because the speakers and hearers use them in different ways, in
different situations (Delesalle 1987, Nerlich 1990). In parallel with Reisig, Darmesteter
used figures of speech, such as metaphor, metonymy and synecdoche to describe the
transitions between the meanings of words (Darmesteter 1887), being inspired by the
achievements of French rhetoric, especially the work of Cesar Chesneau Du Marsais
(1757) on tropes as being used in ordinary language.
The English tradition of semantics emerged from philosophical discussions about
language and mind in the 17th and 18th centuries (John Locke), and about etymology and
the search for 'true' and original meaning (John Home Tooke). Philosophical influences
here were utilitarianism and a certain type of materialism. Semantics in its broadest sense
was also used at first to underpin religious arguments about the divine origin of language.
The most famous figure in what one might call religious semantics, was Richard Chenevix
Trench, an Anglican ccclcsiast, bishop of Dublin and later Dean of Westminster, who wrote
numerous books on the history of the English language and the history of English words.
His new ideas about dictionaries led the Philological Society in London to the creation of
the New English Dictionary, later called the Oxford English Dictionary, which is
nowadays the richest source-book for those who want to study semantic change in the English
language.
After the turn of the 19th to the 20th century one can observe in Britain a rapid
increase in books on 'words' - the trivial literature of semantics, so to speak (Nerlich
1992a) -, but also a more thoroughly philosophical reflection on meaning, as well as the
start of a new tradition of contextualism in the work of (Sir) Alan Henderson Gardiner
and John Rupert Firth, mainly influenced by the German linguist Philipp Wegener and
the anthropologist Bronislaw Malinowski (see Nerlich & Clarke 1996).
Many of those interested in semantics tried to establish classifications of types or
causes of semantic change, something that became in fact one of the main
preoccupations for 19th-century semanticists.The classifications of types of semantic change were
mostly based on logical principles, that is a typology according to the figures of speech,
such as metaphor, metonymy, and extension and restriction, specifying the type of
relation or transition between the meanings of a word; the typologies of causes of semantic
change were mostly based on psychological principles, specifying human drives and
instincts; and finally some classifications applied historical or cultural principles; but most
frequently these enterprises used a mixture of all three approaches. Later on in the
century, when the issue of phonetic laws reverberated through linguistics, semanticists tried
not only to classify semantic changes, but to find laws of semantic change that would be
as strict as the sound laws were then believed to be and in doing so, some believed to turn
linguistics into a 'science' in the sense of natural science.
After this preliminary sketch I shall now deal with the three traditions of semantics
one by one, the German, the French, and the British one. However, the reader of the
following sections has to keep in mind that the three traditions of semantics are not as
176
II. History of semantics
strictly separable as it might appear. There were numerous links of imitation, influence,
cross-fertilization, and collaboration.
2. Linguistic semantics in Germany
It has become customary to distinguish two main periods in German semantics: (1) a
logico-historical or historico-descriptive one, and (2) a psychologico-explanatory one.
The father of the first tradition is Reisig, the father of the second Steinthal.
Christian Karl Reisig (Schmitter 1987,2004) was, like many other early semanticists,
a classical philologist. In his lectures on Latin grammar given in the 1820s, and first
published by his student Friedrich Haase in 1839 (2nd edition 1881-1890), he tried to reform
the standard view of traditional grammar by adding to it a new discipline: semasiology
or Bedeutungslehre, that is the study of meaning in language. Grammar was normally
considered to consist in the study of syntax and etymology (which then meant
approximately the same as morphology or Formenlehre). Reisig claims that the word should not
only be studied with regard to its form (etymology) and in its relation to other words
(syntax), but as having a certain meaning. He points out that there are words whose
meaning is neither determined by their form alone nor by their place in the sentence,
and that the meaning of these words has to be studied by semasiology. More
specifically semasiology is the study of the development of the meaning of certain words, as
well as the study of their use, both phenomena that were covered by neither etymology
nor syntax.
Reisig puts forward thirteen principles of semantics, according to which language
precedes grammar, languages are the products of nations, not of single human beings,
language comes into being through imagination and enthusiasm in the social
interaction of people. We shall see that this dynamic view of language and semantic change
became central to historical semantics at the end of the 19th century when it was also
linked to a more contextual approach. According to Reisig the evolution of a particular
language is determined by free language-use within the limits set by the general laws of
language. These general laws of language are Kant's laws of pure intuition (space and
time), and his categories. This means that language and language change are brought
about by a dynamic interplay of several forces which Reisig derived from his
knowledge of German idealism on the one side and the romantic movement on the other.
In line with German idealistic philosophy, he wanted to find the general principles of
semantic change, assumed to be based on general laws of the human mind. In tune with
the romantic movement, he saw that every language has, however, its individuality, based
on the history of the nation and he recognized that speakers too have certain degrees of
freedom in their creative use of language. This freedom is however constrained by
certain habitual associations between ideas which had already been discussed in rhetoric,
but which semasiology should, he claims, take account of, namely synecdoche, metonymy
and metaphor. Whereas rhetoric focuses on their aesthetic function, semasiology focuses
on the way these figures have come to structure language use in a particular language.
The recognition of rhetorical figures, such as metaphor and metonymy, as habitual
cognitive associations and also as procedures of semantic innovation and change was a
very important step in the constitution of semantics as an autonomous discipline. The
figures of speech were reinterpreted in two ways and thus emancipated from their
definitions in philosophical and literary discourse: they were no longer regarded as mere
abuses of language, but as necessary for the life of language, that is, the continuous use
9. The emergence of linguistic semantics in the 19th and early 20th century 177
of it; and they were not mere ornaments of speech, but essential instruments of mind
and language to cope with ever new communicative needs - an insight deepened in
various contributions to the 19th and early 20th-century philosophy of metaphor and
rediscovered in modern cognitive semantics (Nerlich & Clarke 2000; see also article 100
(Geeraerts) Cognitive approaches to diachronic semantics).
To summarize: Reisig's approach to semantics is philosophical with regard to the
general laws of the mind, as inherited from Kant, but it is also historical, because Reisig
stressed the necessity of always studying the Latin texts very closely. One can also claim
that his semantics is to some degree influenced by psychology, in as much as Reisig adds
to Kant's purely philosophical principles of intuition and reason a third source of human
language: sensations or feelings. It is also rhetorical and stylistic, a perspective later
rejected by his followers Heerdegen and Hey.
The theory of the sign that underlies his semantics is very traditional, that is
representational: the sign signifies an idea/concept (Begriff) or a feeling (Empfindung);
language represents thought, a thought that itself represents the external world. For Reisig
thoughts and feelings exist independently of the language that represents them, a view
that Humboldt tried to destroy in his conception of language as constitutive of thought.
The study of semantic change can therefore only be the study of the development of ideas
or thoughts as reflected in the words that signify them.This development of thought can
run along the following ('logical') lines, called metaphor (the interchange of ideas, II:
6), metonymy (the interchange of representations, II: 4), or synecdoche (II: 4). To these
he adds the use of originally local prepositions to designate temporal relations, based
on the interchange of the forms of intuition, time and space, again an insight that was
rediscovered in modern cognitive semantics. Semasiology has to show how the different
meanings of a word have emerged from the first meaning (logically and historically, II:
2), that is, how thought has unfolded itself in the meaning of words.This kind of'Vorstel-
lungsscmantik' (Knobloch 1988:271) would dominate German semasiology until 1930
approximately. It was then challenged and eventually overthrown by Leo Weisgerber in
linguistics and by Karl Biihler in psychology. 19th-century diachronic and psychological
semantics was replaced by 20th-century synchronic and structural semantics.
However, Reisig does not leave it at the narrow view of semantics sketched above.
He points out that the meaning of a word is not only constituted by its function of
representing ideas, but that it is determined as well by the state of the language in general and
by the use of a word according to a certain style or register (Schmittcr 1987:126). In fact,
in dealing with the 'stylistic' problem of choosing a word from a series of words with 'the
same' meaning, he reshapes his unidimensional definition of the sign as signifying one
concept. Reisig deals here with synonyms, either complete or quasi synonyms; he even
indicates the importance of a new kind of study: synonymology.
This rather broad conception of semantics, including word semantics, but also sty-
listics and synonymology, is very similar to the one later advocated by Brdal. However,
Reisig's immediate followers in Germany gradually changed Reisig's initial conception
of semasiology in the following ways: they dropped the philosophical underpinnings
and narrowed the scope of Reisig's semasiology by abandoning the study of words in
their stylistic context, reducing semasiology more and more to a purely atomistic study
of changes in word-meaning. In this new shape and form, semasiology flowered in
Germany, especially after the new edition of Reisig's work in the 1880s. Up to the end of
the century a host of treatises on the subject were published by philologists but also by a
number of'schoolmen' (Nerlich 1992a).
178 II. History of semantics
A new impetus to the study of meaning came from the rise of psychological thought in
Germany,especially under the influence of Johann Friedrich Herbart, Heymann Steinthal,
Moritz Lazarus, and Wilhelm Wundt, the latter three fostering a return to Humboldt's
philosophy of language. At the time when Steinthal tried to reform linguistic thought
through the application of psychological principles, even the most hard-nosed linguists,
the neogrammarians themselves, were forced to turn to psychology. This was due to the
introduction of analogy as the second most important principle of language change,
apart from sound laws.
As early as 1855 Steinthal had written a book where he tried to refute the belief
held by many of his fellow linguists that language is based on logical principles and that
grammar is based on logic. According to Steinthal, language is plainly based on
psychological principles, and these principles are largely of a semantic nature. Criticizing the
view inherited from Reisig that grammar has three parts, etymology, semasiology, and
syntax, he claims that there is meaning (what he calls after Humboldt an 'inner form')
in etymology as well as syntax. In short, semasiology should be part of etymology and
syntax, not be separated from them (1855: xxi-xxii). Using the Humboldtian dichotomy
of inner and outer form, he wants to study grammar (etymology and syntax) from two
points of view: semasiology and phonetics. For him language is 'significant sound', that is,
sound and meaning can not be artificially separated.
In 1871 Steinthal wrote his Abriss der Sprachwissenschaft. Volume I was intended
to be an Introduction into psychology and linguistics. In this work, he wants to explain
the origin of language as meaningful sound. His theory of the emergence of language
can be compared to modern symbolic interactionism (Nerlich & Clarke 1998).The
principle axiom is that when we emit a sound which is understood by the other in a certain
way, we understand not only the sound we made, but we understand ourselves, attain
consciousness. The origin of language and of consciousness thus lies in understanding.
This principle became very important to Philipp Wegener who fostered a new approach
to semantics, no longer a mere word-semantics, but a semantics of communication and
understanding (Wegener 1885/1991). Steinthal's conception of psychology was the basis
of an influential book on semantic change in the Greek language by Max Hecht, which
appeared in 1888 and was extensively quoted by the classical philologists among the
semanticists.
Hermann Paul is often regarded as one of the leading figures in the neogrammarian
movement and his book, the Prinzipien der Sprachgeschichte, first published in 1880, is
regarded by some as the bible of the neogrammarians. It is true that Paul intended his
book at first to be just that. But already in the second edition (1886) he extensively
elaborated his at first rather patchy thoughts on semantic topics, such that Brdal - normally
rather critical of neogrammarian thought, especially their phonetic bias - could say in
his review of the second edition (Brdal 1887) that Paul's book constituted a major
contribution to semantics (Breal 1897:307).
How had this change of emphasis from sound change to semantic change come about?
In 1885 Wegener, like Paul a follower of Steinthal, had published his Untersuchungen iiber
die Grundfragen des Sprachlebens (Wegener 1885/1991) where he had devoted a long
chapter to semantic change, especially its origin in human communication and
interaction. Paul had read (and reviewed) this book (just as Wegener had read and reviewed
Paul's). In doing so, Paul must have discovered many affinities between his ideas and those
of Wegener, and he must have been inspired to devote more thought to semantic
questions. What were the affinities? The most direct resemblance was their insistence on the
9. The emergence of linguistic semantics in the 19th and early 20th century 179
interaction between speaker and hearer; here again their debt to Steinthal is clear, as clear
as their opposition to another very influential psychologist of language, namely Wundt.
Paul's intention was to get rid of abstractions or 'hypostasiations' such as the soul of
a people or a language, ghosts that Wundt, and even Steinthal, still tried to catch. These
entities, if indeed they are entities, escape, according to Paul, the grasp of any science that
wants to be empirical. What can be observed, from a psychological and historical
perspective, are only the psychological activities of individuals, but individuals that interact
with others. This social and psychological interaction is a mediated one; it is mediated
by physiological factors: the production and reception of sounds. Historical linguistics
(and all linguistics should be historical in Paul's eyes) as a science based on principles
is therefore closely related to two other disciplines: physiology and the psychology of
the individual. From this perspective, language use does not change autonomously as it
had been believed by a previous generation of linguists, but neither can it be changed by
an individual act of the will. It evolves through the cumulative changes occurring in the
speech activity of individuals.This speech activity normally proceeds unconsciously - we
are only conscious of what we want to say, not of how we say it or how we change what
we use in our speech activity: the sounds and the meanings. Accordingly, Paul devotes
one chapter to sound change, one to semantic change, and one to analogy (more
concerned with the changes in word-forms).
The most important dichotomy that Paul introduced into the study of semantics is that
of usual and occasional meaning (usuelle und okkasionelle Bedeutung) (Paul 1920/1975:
75), a distinction that exerted lasting influence on semantics and also the psychology and
philosophy of meaning (Stout 1891, for example).The usual signification is the
accumulated sedimentation of occasional significations, the occasional signification, based on
the usual signification is imbued with the intention of the speaker and reshaped by being
embedded in the situation of discourse. This context-dependency of the occasional
signification can have three forms: it depends on a certain perceptual background shared
by speaker and hearer; it depends on what has preceded the word in the discourse; and
finally it depends on the shared location of discourse. "Put the plates in the kitchen" is
understood because we know that we are speaking about that kitchen here and now and
no other. These contextual clues facilitate especially the understanding of words which
are ambiguous or polysemous in their usual signification (but Paul does not use term
'polysemy', which had been introduced by BnSal in 1887;Nerlich & Clarke 1997,2003).
A much more radical view of the situation as a factor in meaning construction was put
forward by Wegener in 1885 (Nerlich 1990).
Like so many linguists of the 19th century, Paul tries to state the main types of changes
of meaning, but he insists that they correspond to the possibilities we have to modify the
meaning of words on the level of occasional signification.The first type is the
specialization of meaning (what Reisig and Darmesteter would have called 'synecdoche'), which
he defines as the restriction of the extension of a word (the set of all actual things the
word describes) and enrichment of its intension (the set of all possible things a word
or phrase could describe). This type of semantic change is very common. Paul gives the
example of German Schirm, which can be used to designate any object employed as a
'screen'. In its occasional usage it may signify a 'fire-screen', a 'lamp-screen', a 'screen'
for the eyes, an 'umbrella', a 'parasol', etc. But normally, on hearing the word Schirm
we think of a 'Regenschirm', an umbrella - what cognitive semanticists would now call
its prototypical meaning. This meaning has somehow separated itself from the general
meaning of 'screen' and become independent. A second basic means to extend (and
180 II. History of semantics
restrict) word meaning is metaphor. A third type of semantic change is the transfer of
meaning upon that which is connected with the usual meaning in space, time or causally
(what would later be called via 'contiguity'). Astonishingly, Paul does not use the term
'metonymy' to refer to this type of semantic change.
One of the most important contributions to linguistics in general and semantics
in particular was made by Wegener in his Untersuchungen iiber die Grundfragen des
Sprachlebens, published in 1885. Although Wegener can to some extent be called a
neogrammarian, he never accepted their strict distinction between physiology and
psychology, as advocated for example by Hermann Osthoff. For Wegener language is a
phenomenon based on the whole human being, their psyche and their body,of a human being
who is an integral part of a communicative situation. Paul was influenced by Wegener
and so was Gardiner, the Egyptologist and general linguist who dedicated his book The
Theory of Speech and Language (Gardiner 1932/1951) to Wegener.
So much for some major contributions to German semantics. As one could easily
devote an entire book to each of the three national trends in semantic thought, of which
the German tradition was by far the most prolific, I can only indicate very briefly the
major lines of development that semantics took in Germany after Paul. Many classical
philologists continued the tradition started by Reisig. Others took Paul's achievements
as a starting point for treatises on semantic change that wanted to illustrate Paul's main
types of semantic change by more and more examples. Others still, such as Johan Stock-
lein tried to develop Paul's core theory further by stressing, for instance, the importance
of the context of the sentence for semantic change (Stocklcin 1898).
But most importantly, the influence of psychology on semantics increased strongly.
Apart from Herbart's psychology of mechanical association which had had a certain
influence on Steinthal and hence Paul and Wegener, and apart from some more
incidental influences such as that of Sigmund Freud and Carl Gustav Jung on Hans Sperber,
for example, the most important development in the field of psychology was Wundt's
Volkerpsychologie. Two volumes of his monumental work on the psychology of such
collective phenomena as language, myth and custom were devoted to language (Wundt
1900), and of these a considerable part was concerned with semantic change (on the
psychology of language in Germany see Knobloch 1988).
Wundt distinguished between regular semantic change based on social processes or,
as he said, the psyche of the people, and singular semantic change, based on the psyche
of the individual. He divided the first class into assimilative change and complicative
change, the second into name-making according to individual (or singular) associations,
individual (or singular) transfer of names, and metaphorically used words. In short, the
different types of semantic change were mainly based on different types of association
processes (similar to Reisig in this way).
However, Wundt's work attracted a substantial body of criticism, especially from the
psychologist Karl Biihler (see Nerlich & Clarke 1998) and the philosopher and general
linguist Anton Marty who developed a descriptive semasiology in opposition to Wundt's
historical approach (Marty 1908), in this comparable to Raoul de La Grasserie in France
(de La Grasserie 1908).
Two other developments in German linguistics have at least to be mentioned: the new
focus on words and things, that is on designation (Bezeichnung), and not so much on
meaning (Bedeutung) and the new focus on lexical and semantic fields instead of single
words. After the introduction of this first new perspective, the term 'semasiology' itself
9. The emergence of linguistic semantics in the 19th and early 20th century 181_
changed its meaning, standing now in opposition to 'onomasiology'.The second new
perspective lead to the flourishing new field of field semantics (Nerlich & Clarke 2000).
3. Linguistic semantics in France
After this sketch of the evolution of semasiology in Germany we now turn to France,
where a rather different doctrine was being developed by Michel Breal, the most famous
of French semanticists.
But Breal was by no means the only one interested in semantic questions.
Lexicographers, such as Emile Littre" (1880) and later Darmesteter and Adolphe Hatzfeld,
contributed to the discussion on semantic questions from a 'naturalist' point of view,
applying insights of Darwin's theory of evolution, of Lamarckian transformationism and,
in the case of Littre\ of Auguste Comte's positivism to the problems of etymology. Littre"
was one of the first to advocate uniformitarian principles in linguistics, which became
so important for Darmesteter and Breal in France and William Dwight Whitney in the
United States (Nerlich 1990). According to the uniformitarian view, inherited from
geology (Lyell 1830-1833), the laws of language change now in operation, and which
can therefore be 'observed' in living languages, were the same that structured language
change in the past. Hence, one can explain past changes by laws now in operation.
Darmesteter was indeed the first to put forward a program for semantics which
resembles in its broad scope that of Reisig before him and Br£al after him and in his emphasis
on history that of Paul. He contended that the philosophy of language should focus on the
history of languages, the transformations of syntax, grammatical forms and word
meanings, as a contribution to the history of the human mind and he also claims that these
figures of speech also structure changes in grammatical forms and syntactic constructions.
However, as early as the 1840s, before Littre and Darmesteter, the immediate
predecessors of Br£al, another group of linguists had started to do 'semantics' under the
heading of ideologic, or as one of its members later called it fonctologie, a term obviously
influenced by Schleicher's distinction between form, function and relation (Schleicher
1860). French semantics of this type focused, like the later German semasiology, on the
isolated word, but even more on the idea it incarnates, and excluded from its
investigation the sentential or other contexts. Later on de La Grasserie (1908) proposed an
'integral semantics' based on this framework. He was (with Marty) the first to point out the
difference between synchronic and diachronic semantics, or as he called it 'la s£mantique
statique' and 'la s£mantique dynamique'.
As we shall see in part 4 of this chapter, as early as 1831 the English philosopher
Benjamin Humphrey Smart was aware of the dangers of that type of 'ideology' and
wanted to replace it by his type of'sematology', stressing heavily the importance of the
context in the production and understanding of meaning - that is replacing mental
association by situational embeddedness (Smart 1831:252).
However, the real winner in this 'struggle for survival' between opposing approaches
to semantics, was the new school surrounding Breal. Language was no longer regarded
as an organism, nor did words 'live and die'. The focus was now on the language users,
their psychological make-up and the process of mutual understanding. It was in this
process that 'words changed their meanings'. Hence the laws of semantic change were no
longer regarded as'natural'or'logical' laws, but as intellectual laws (Breal 1883), or what
one would nowadays call cognitive laws. This new psychological approach to semantic
182 II. History of semantics
problems resembled that advocated in Germany by Steinthal, Paul and Wegener. Paul,
Wegener, Darmesteter, and Breal all stressed that the meaning of a word is not so much
determined by its etymological ancestry, but by the value it has in current usage, a point
of view that moved 19th-century semantics slowly from a purely diachronic to a more
synchronic and functional perspective.
Although Br£al and Wegener seem not to have known each other's work, their
conceptions of language and of semantics are in some ways astonishingly similar (they also
overlap with theories of language developed by Whitney in the United States and Johan
Nicolai Madvig (1875) in Denmark, see Hauger 1994). Brought up in the framework of
traditional German comparative linguistics, both objected to the rcification of language
as an autonomous sclf-cvolving system, both saw in psychology a valuable help to get an
insight into how people speak and understand each other and change the language as
they go along, both made fruitful use of the form-function distinction, where the function
drives the evolution of the linguistic form, both assumed that to understand a sentence
the hearer had much more to do than to decode it word by word - s/he has to draw
inferences from the context of the sentence as a whole, as well as from the context of its use
or its function in discourse -, and, finally, they both had a much broader conception of
historical semantics than their contemporaries, especially some of the semasiologists in
Germany and the 'ideologists' in France. In their eyes semantic change is a phenomenon
not only of the word or idea, but must be observed at the morphological and syntactical
level, too.The evolution of grammar or of syntax is thus an integral part of semantics.
Br£al's thoughts on semantics, gathered and condensed in the Essai de semantique
{Science des significations) (1897), had evolved over many years.The stages in the
maturation of his semantic theory were, briefly stated, the following: 1866 - lecture on the
form and function of words; 1868 - lecture on latent ideas; 1883 - introduction of the term
semantique for the study of semantic change and more particularly for the search of the
intellectual laws of language change in general; 1887 - review of Darmcstctcr's book on
the life of words, a review called quite intentionally 'history of words'. Br£al rejected all
talk about the life of words. For him words do not live and die, they change according
to the use speakers make of them. But postulating the importance of the speaker was
not enough for him, he went so far as to proclaim that the will or consciousness of the
speaker are the ultimate forces of language change. This made him very unpopular
among those French linguists who still adhered to the biological paradigm of language
change, based on Schleicher's views on the transformation of language. But Breal was
also criticised by some of his friends such as Antoine Meillct. Meillct stressed the role of
collective forces, such as social groups, over and above the individual will of the speaker,
and became as such important for a new trend in 20th-century French semantics:
sociosemantics (Meillet 1904-1905).
As mentioned before, Br£al was not the only one who wrote a review of Darmesteter's
book. Two of his friends and colleagues had done the same: Gaston Paris and Victor
Henry, and they had basically adopted the same stand as Breal. Henry should be
remembered for his criticism of Breal's insistence on consciousness, or at least certain degrees
of consciousness, as factors in language change. Henry held the view that all changes in
language arc the result of unconsciously applied procedures, a view he defended in his
booklet on linguistic antinomies (1896) and in his study of a case of glossolalia (1901).
How was Br£al received in Germany, a country where a long tradition of'semasiology'
already existed? It is not astonishing to find that the psychologically oriented Br£al was
9. The emergence of linguistic semantics in the 19th and early 20th century 183
warmly applauded by Steinthal in his 1868 review of Breal's lecture on latent ideas. He
was also mentioned approvingly by Paul (1920/1975: 78 fn. 2). Breal's division of
linguistics into phonetics and semantics as the study of meaning at the level of the lexicon,
morphology and syntax also corresponds to some extent to Steinthal's conce ption
outlined above. It did, however, disturb those who, after Ferdinand Heerdegen's narrowing
of the field of semasiology, practiced the study of semantic change almost exclusively
on the level of the word, excluding morphology and syntax. This difference between
French semantics and German semasiology was noted by Oskar Hey in his review of the
Essai (1898:551). Hey comes to the conclusion that if Br£al had not entirely ignored the
German achievements in the field of semasiology, he would have noticed that everything
he has to say about semantic change had already been said. He concedes, however, that
etymologists, morphologists and syntacticians may have a different view on some parts
of Breal's work than he has as a classical philologist and semasiologist (see p. 555). From
this it is clear that Hey had not really grasped the implications of Breal's novel approach
to semantics. Br£al tried to open up the field of historical semantics from a narrow study
of changes in word-meaning to the analysis of language change in general based on the
assumption that meaning and change of meaning are a function of discourse.
If one had to answer the question: where did Breal's thoughts on semantics come
from, if not from German semasiology (but Breal had read Reisig,whom he mentions in
the context of a discussion on pronouns, see 1897:207 fn.l), one would have to look more
closely at 18th-century philosophy of language, specifically the work of the philosopher
Eticnne Bonnot de Condillac on words as signs and about the progress of knowledge
going hand in hand with the progress of language. From this point of view words are not
'living beings' and the fate of language is not mere decay. However, Br£al did not accept
Condillac's use of etymology as the instrument to find the real, original meanings of
words and to get insights into the constitution of the human mind (a view also espoused
in Britain by Tookc, sec below). For Brdal the progress of language is linked to the
progressive forgetting of the original etymological meaning, it is based on the liberation of
the mind from its etymological burden.
The red thread that runs through Breal's semantic work is the following insight: To
understand the evolution and the structure of languages we should not focus so much
on the forms and sounds but on the functions and meanings of words and constructions,
used and built up by human beings under the influence of their will and intelligence, on
the one hand, and the influence of the society they live and talk in, on the other. Unlike
some of his contemporaries, Breal therefore looked at how ideas, how our knowledge of
a language and our knowledge of the world, shape the words we use. However, he was
also acutely aware of the fact that this semantic and cognitive side of language studies
was not yet on a par with the advances made in the study of phonetics, of the more
physiological side of language, and had much to learn from the emerging experimental
sciences of the human mind (BnSal 1883/1991:151).
Breal's most famous contribution to semantics as a discipline was probably his
discussion of polysemy, a term he invented in 1887. For him, as for Elisabeth Closs Traugott
today, all semantic change arises by polysemy, i.e., new meanings coexist with earlier
ones, typically in restricted contexts (Traugott 2005).
French semantics had its peak between 1870 and 1900. Ironically, when the term
'semantics' was created through the translation of Breal's Essai, the interest in
semantics faded slightly in France. However, there were some continuations of 19th-century
184 II. History of semantics
semantics, as for example in the work of Meillet who focused on the social aspects of
semantic change.There was also, just as in Germany, a trend to study affective and
emotional meaning, a trend particularly well illustrated by the work of the Swiss linguist and
student of Ferdinand de Saussure, Charles Bally, on'stylistics' (1951), followed finally by
a period of syntheses, of which the work of the Belgian writer Albert Joseph Carnoy is
the best example (1927).
4. Linguistic semantics in Britain
In Britain the study of semantic change was linked for a long time to a kind of etymology
that had also prevailed in 18th-century France, that is the use of etymology as the
instrument to find the real, original meanings of words and so to get insights into the
constitution of the human mind. Genuinely philological considerations only came to dominate
the scene by the middle of the century with the creation of the Philological Society in
1842, and its project to create a New English Dictionary.
The influence of Locke's Essay Concerning Human Understanding (1689) on English
thinking had been immense, strengthened by John Home Tooke's widely read
Diversions ofPurley (Tooke 1786-1805).Tooke's theory of meaning can be summarised in the
slogan "one word - one meaning". Etymology has to find this meaning, and any use that
deviates from it is regarded as'wrong' - linguistically and morally - this also has religious
implications. Up to the 1830s Tooke was much in vogue. His doctrine was, however,
challenged by two philosophers: the Scottish common sense philosopher Dugald Stewart
and, following him to some extent, Benjamin Humphrey Smart. In his 1810 essay "On
the Tendency of some Late Philological Speculations", "Stewart attacked", as Aarsleff
points out, "what might be called the atomistic theory of meaning, the notion that each
single word has a precise idea affixed to it and that the total meaning of a sentence is,
so to speak, the sum of these meanings." He "went to the heart of the matter, asserting
that words gain meaning only in context, that many have none apart from it" (Aarsleff
1967/1983: 103). According to Stewart, words which have multiple meanings in the
dictionary, or as Br£al would say, polysemous words, are easily understood in context.
This contextual view of meaning is endorsed by Smart in his anonymously published
book entitled An Outline of Sematology or an Essay towards establishing a new theory
of grammar, logic and rhetoric (1831), which was followed by a sequel to this book
published in 1851, and finally by his 1855 book on thought and language. In both his 1831 and
1855 books Smart quotes the following lines from Stewart:
(...) our words, when examined separately, are often as completely insignificant as the
letters of which they are composed, deriving their meaning solely from the connection or
relation in which they stand to others.
(Stewart 1810:208-209)
The Outline is based on Locke's Essay, but goes far beyond it. Smart takes up Locke's
threefold division of knowledge into (1) physicology or the study of nature, (2) prac-
tology or the study of human action, and (3) sematology, the study of the use of signs for
our knowledge or in short the doctrine of signs (Smart 1831:1-2). This study deals with
signs "which the mind invents and uses to carry on a train of reasoning independently of
actual existences" (1831:2, note carried over from 1).
In the first chapter of his book, which is devoted to grammar, Smart tries "to imagine
the progress of speech upwards as from its first invention" (Smart 1831:3). It starts with
9. The emergence of linguistic semantics in the 19th and early 20th century 185
natural cries which have 'the same' meaning as a 'real' sentence composed of different
parts, and this because "if equally understood for the actual purpose, [it] is, for this
purpose, quite adequate to the artificially compounded sign", the sentence (Smart 1831:8).
But as it is impossible to have a (natural) sign for every occasion or for every purpose (to
signify a perception or conception), it was necessary to find an expedient.This expedient
was to put together several signs, which each had served a particular purpose, in such a
way that they would modify each other, and could, united, serve the new purpose, signify
something new (see Smart 1831: 9-10), From these rudest beginnings language
developed gradually as an artificial instrument of communication. I cannot go into Smart's
presentation of the evolution of the different parts of speech, but it is important to point
out that Smart, like Stewart, rejected the notion that words have meaning in isolation.
Words have only meaning in the sentence, the sentence has only meaning inside a
paragraph, the paragraph only inside a text (see Smart 1831:54-55). Signs in isolation signify
notions, or what the mind knows on the level of abstraction. Signs in combination signify
perceptions, conceptions and passions (see Smart 1831:10-12). Words have thus a double
force "by which they signify at the same time the actual thought, and refer to knowledge
necessary perhaps to come at it" (Smart 1831:16).This knowledge is not God-given. "It
is by frequently hearing the same word in context with others, that a full knowledge of
its meaning is at length obtained; but this implies that the several occasions on which it
is used, are observed and compared; it implies in short, a constant enlargement of our
knowledge by the use of language as an instrument to attain it" (Smart 1831:18-19). And
thus, as only words give access to ideas, ideas do not exist antecedently to language. As
language does not represent notions, the understanding of language is not as simple as
one might think, it cannot be used to transfer notions from the head of the speaker to the
head of the hearer. Instead we use words in such a way that we adapt them to what the
hearer already knows (see Smart 1831:191).
It is therefore not astonishing to find a praise of tropes and figures of speech in the
third chapter of the Outline, devoted to rhetoric. Smart claims that they are "essential
parts of the original structure of language; and however they may sometimes serve the
purpose of falsehood, they are, on most occasions, indispensable to the effective
communication of truth. It is only by [these] expedients that mind can unfold itself to mind;
- language is made up of them; there is no such thing as an express and direct image of
thought." (Smart 1831:210).Tropes and figures of speech "are the original texture of
language and that from which whatever is now plain at first arose. All words arc originally
tropes; that is expressions turned (...) from their first purpose, and extended to others."
(Smart 1831:214, Nerlich & Clarke 2000).
In his 1855 book Smart wants to correct and extend Locke's thought even further, in
particular get rid of the mistake according to which there is a one to one relationship
between ideas and words. According to Smart, we do not add meaning to meaning to
make sense of a sentence or a text, on the contrary: we subtract (Smart 1855: 139). As
an example Smart gives the syntagm old men. Just as the French vieillards, the English
old men does not mean the same thing as vieux added to hommes. To understand a
whole sentence, we step down from what he calls premise to premise until we reach the
conclusion.
Unfortunately, Smart's conception of the construction of meaning seems to have had
little influence on English linguistic thought in the 19th century. He left, however, an
impression on philosophers and psychologists of language, such as Victoria Lady Welby
(1911) and George Frederick Stout who had also read Paul's work for example. Stout
186 II. History of semantics
picked up Paul's distinction between usual and occasional meaning for example, but
points out that it "must be noticed, however, that the usual signification is, in a certain
sense, a fiction" (Stout 1891:194) and that:"Permanent change of meaning arises from the
gradual shifting of the limits circumscribing the general significations. This shifting is due
to the frequent repetition of the same kind of occasional application" (Stout 1891:196).
What James A.H. Murray later called 'sematology' (in a different sense to Smart's
use of the term), that is the use of semantics in lexicography, received its impulses from
the progress in philology and dictionary writing in Germany and from the
dissatisfaction with English dictionary writing. This dissatisfaction was first expressed most strongly
by Richard Garnett in 1835 when he attacked English dictionaries for overlooking the
achievements of historical-comparative philology. Garnett even went as far as to call some
of his countrymen's lexicographical attempts "etymological trash" (Garnett 1835:306).
The next to point out certain deficiencies in dictionaries was the man who became
much more popular for his views on semantic matters:Trench. He used his knowledge of
language to argue against the 'Utilitarians', against the biological transformationists and
those who held 'uniformitarian' or evolutionary views of language change. For him
language is a divine gift, insofar as God has given us the power of reason, and thus the power
to name things. His most popular book was On the Study of Words (1851), used here in
its 21st edition of 1890. The difference between Trench and the up to then prevailing
philosophy of language is summarized by Aarsleff in the following way:
(...) by making the substance of language - the words - the object of inquiry,Trench placed
himself firmly in the English tradition, which had its beginning in Locke. There was one
important difference, however. Trench shared with the Lockcian school, Tookc and the
Utilitarians, the belief that words contained information about thought, feeling, and
experience, but unlike them he did not use this information to seek knowledge of the original,
philosophical constitution of the mind, but only as evidence of what had been present to the
conscious awareness of the users of words within recent centuries; this interest was not in
etymological metaphysics, not in conjectural history; not in material philosophy, but in the
spiritual and moral life of the speakers of English.
(Aarsleff 1967/1983:238)
He studied semantic change at one and the same time as historical records and a
lessons in changing morals and history.This is best expressed in this chapter-title: "On the
Morality of Words"; the chapter contains 'records of sin' and 'records of good and evil
in language'. Despite these moralizing overtones, Trench's purely historical approach to
etymology became slowly dominant in Britain and it found its ultimate expression in the
New English Dictionary.
However,Trench's book contains some important insights into the nature of language
and semantic change which would later on be treated more fully by Darmcstctcr and
Br£al. It is also surprising to find that language is for Trench as it was for Smart "a
collection of faded metaphors" (Trench 1851/1890:48), that words are for him fossilized poetry
(for very similar views, see Jean Paul 1962-1977). Trench also writes about what we
would nowadays call the amelioration orpejoration of word-meaning, about the changes
in meaning due to politics, commerce, the influence of the church, on the rise of new
words according to the needs and thoughts of the speakers, and finally we find a chapter
"On the Distinction of Words", which deals with a phenomenon called by Hey, Br£al
and Paul the differentiation of synonyms. The study of synonyms deals with the essential
9. The emergence of linguistic semantics in the 19th and early 20th century 187
(but not entire) resemblance between word-meanings (Trench 1851/1890:248-249). For
Trench there can never be perfect synonyms, and this for the following reason:
Men feel, and rightly, that with a boundless world lying around them and demanding to be
catalogued and named [...], it is a wanton extravagance to expend two or more signs on that
which could adequately be set forth by one - an extravagance in one part of their
expenditure, which will be almost sure to issue in, and to be punished by, a corresponding scantness
and straitncss in another. Some thought or feeling or fact will wholly want one adequate
sign, because another has two. Hereupon that which has been well called the process of
'desynonymizing' begins - that is, of gradually discriminating in use between words which
have hitherto been accounted perfectly equivalent, and, as such, indifferently employed.
(...)This may seem at first sight only as a better regulation of old territory: for all practical
purposes it is the acquisition of new.
(Trench 1851/1890:258-259)
Trench's books, which became highly popular, must have sharpened every educated
Englishman's and Englishwoman's awareness for all kinds and sorts of semantic changes.
On a more scientific level Trench's influence was even more profound. As Murray
wrote in the Preface to the first volume of the New English Dictionary the "scheme [for
the NED] originated in a resolution of the Philological Society, passed in 1857, at the
suggestion of the late Archbishop Trench, then Dean of Westminster" (Murray 1884: v).
In this dictionary the new historical method in philology was for the first time applied to
the "life and use of words" (ibid.). The aim was "to furnish an adequate account of the
meaning,origin,and history of English words now in general use,or known to have been in
use at any time during the last seven hundred years." (ibid.).The dictionary endeavoured
(1) to show, with regard to each individual word, when, how, in what shape, and with what
signification, it became English; what development of form and meaning it has since received;
which of its uses have, in the course of time, become obsolete, and which still survive; what
new uses have since arisen, by what processes, and when: (2) to illustrate these facts by a
series of quotations ranging from the first known occurrence of the word to the latest, or
down to the present day; the word being thus made to exhibit its own history and meaning:
and (3) to treat the etymology of each word strictly on the basis of historical fact, and in
accordance with the methods and results of modern philological science.
(Murray 1884: vi)
Etymology was no longer seen as an instrument used to get insights into the workings
of the human mind, as philosophers in France and Britain had believed at the end of the
18th century, or to discover the truly original meaning of a word. It was now put on a
purely scientific, i.e. historical, footing.
But, as we have seen, by then a new approach to semantics, fostered by Stcinthal
and Br£al under the influence of psychological thought, brought back considerations of
the human mind, of the speaker, and of communication, opening up semantics from the
study of the history of words in and for themselves to the study of semantic change in the
context of psychology and sociology. One Cambridge philosopher, who knew Scottish
common sense philosophy just as well as Kantianism, and studied meaning in the context
of communication was John Grote.In the 1860s, he developed a theory of meaning as use
and of thinking as a social activity based on communication (Gibbins 2007). His focus
on 'living meaning' as opposed to 'fossilised' or historical meaning had parallels with the
188 II. History of semantics
theory of meaning developed by Breal at the same time and can be regarded as a direct
precursor of ordinary language philosophy.
Influenced by Wegener, Ferdinand de Saussure (and his distinction between speech
and language) and the anthropologist Malinowski (and his claim that language can only
be studied as part of action), Gardiner and then Firth tried to develop a new contextualist
approach to semantics and laid the foundation for the London School of Linguistics.
However, by the 1960s Britain, like the rest of Europe, began to feel the influence of
structuralism (Pottier 1964). Gustav Stern (1931) in Denmark tried tosynthesise
achievements in semantics, especially with reference to the English language. Stephen Ullmann
in Oxford attempted to bridge the gap between the old (French and German) type of
historical-psychological and the new type of structural semantics in his most influential
books on semantics written in 1951 and 1962. Although brought up in the tradition of
Gardiner and Firth, Sir John Lyons finally swept away the old type of semantics and
advocated the new 'structural semantics' in 1963 (Lyons 1963). The focus was now both on
how meanings are shaped by their mutual relations in a system, rather than by their
evolution over time and on the way meanings are constituted internally by semantic features
which were, by some thought to be invariant (see Matthews 2001; for more information
see article 16 (Bierwisch) Semantic features and primes). Variation and change, discourse
and society, mind and metaphor which had so fascinated earlier linguists were sidelined
but later rediscovered inside frame semantics (see article 29 (Gawron) Frame Semantics),
prototype semantics (sec article 28 (Taylor) Prototype theory), cognitive semantics (sec
article 27 (Talmy) Cognitive Semantics) and studies of grammaticalisation (see article 101
(Eckhardt) Grammaticalization and semantic reanalysis).
5. Conclusion
One of the pioneers in the history of semantics, Dirk Geeraerts has, in the past,
distinguished between five stages in the history of lexical semantics, namely pre-structuralist
diachronic semantics, structuralist semantics, lexical semantics as practized in the
context of generative grammar, logical semantics, and cognitive semantics (Geeraerts 1997).
Geeraerts himself has now provided a masterly overview of the history of semantics
from its earliest beginnings up to the present (Geeraerts 2010), with chapters on
historical-philological semantics, structuralist semantics, generativist semantics, neostructur-
alist semantics and cognitive semantics. His first chapter is at the same time broader and
narrower than the history of early (pre-structuralist) semantics provided here. It
provides an overview of speculative etymology in antiquity as well as of the rhetorical
tradition, which I do not cover in this chapter; and when Geeraerts focuses on developments
between 1830 and 1930 he mainly focuses on Br£al and Paul.So I hope that by examining
discussions of what words mean and how meaning is achieved as a process between
speaker and hearer, author and reader and so on in three national traditions, the French,
the British and the German in early semantics, between around 1830 and 1930, and by
focusing more on context than cognition, I supplement Geeraerts' work to some extent.
6. References
Aarsleff, Hans 1967/1983. The Study of Language in Britain 1780-1860. 2nd edn. London: Atlilone
Press. 1983.
9. The emergence of linguistic semantics in the 19th and early 20th century 189
Bally, Charles 1951. Traite de Stylistique Francaise. 3rd edn. Geneve: George Klincksieck.
Brlal, Michel 1883. Les lois iiitellectuelles du langage. Fragment de sdmantique. Anniiaire de
I'Association pour Vencouragement des etudes grecques en France 17,132-142.
Brlal, Michel 1883/1991. The Beginnings of Semantics: Essays, Lectures and Reviews. Translated
and introduced by George Wolf. Reprinted: Stanford, CA: Stanford University Press, 1991.
BnSal, Michel 1887. L'histoire des mots. Revue des Deux Mondes 1,187-212.
Bidal, Michel 1897. Essai de Semantique (Science des Significations). Paris: Hachette.
Carnoy. Albert J. 1927. La Science du Mot. Traite de Semantique. Louvain: Editions 'Universitas'.
Darmesteter, Arsene 1887. La Vie des Mots Etudiee d'apres leurs Significations. Paris: Delagrave.
Delesalle, Sinione 1987. Vie des mots et science des significations: Arsene Darmesteter et Michel
BnSal (DRLAV). Revue de Linguistique 36-37,265-314.
Firth, John R. 1935/1957. The technique of semantics. Transactions of the Philological Society for
1935,36-72. Reprinted in: J.R. Firth, Papers in Linguistics 1934-1951. London: Oxford
University Press, 1957,7-33.
Frcgc, Gottlob 1977.Thoughts. In: P. Gcach (cd.). Logical Investigations. Gottlob Frege. New Haven,
CT: Yale University Press, 1-30.
Gardiner, Alan H. 1932/1951. The Theory of Speech and Language. 2nd edn., with additions. Oxford:
Clarendon Press, 1951.
Garnett, Richard 1835. English lexicography. Quarterly Review 54,294-330.
Geeraerts, Dirk 1997. Diachronic Prototype Semantics. A Contribution to Historical Lexicology.
Oxford: Oxford University Press.
Geeraerts, Dirk 2010. Theories of Lexical Semantics: A Cognitive Perspective. Oxford: Oxford
University Press,
de la Grasserie, Raoul 1908. Essai d'une Semantique Integrate. Paris: Leroux.
Gibbins, John R. 2007. John Grote, Cambridge University and the Development of Victorian
Thought. Exeter: Imprint Academic.
Hauger, Brigitte 1994. Johan Nicolai Madvig. The Language Theory of a Classical Philologist.
Minister Nodus.
Hecht.Max 1888. Diegriechische Bedeutungslehre:EineAufgabe der klassischen Philologie. Leipzig:
Tcubner.
Henry, Victor 1896. Antinomies Linguistique. Paris: Alcan.
Henry, Victor 1901. Le Langage Martien. Paris: Maisonneuve.
Hey, Oskar 1898. Review of Br6al (1897). Archiv fiir Lateinische Lexicographic und Grammatik
10,551-555.
Jean Paul [Johann Paul Fricdrich Richter] 1962-1977[ 1804]. Vorschule der Asthetik. [Introduction
to Aesthetics]. In: Werke [Works]. Edited by Norbert Miller. Abt. 1, Vol. 5. Darmstadt: Wissen-
schaftliche Buchgesellschaft, 7-330.
Knobloch, Clemens 1988. Geschichte der psychologischen Sprachauffassung in Deutschland von
1850 bis 1920. Tubingen: Niemeyer.
Littre\ Emile 1880. Pathologie verbalc ou lesion de certains mots dans 1c coins de 1'usage. Etudes et
Glanures pour Faire Suite a I'Histoire de la Langue Francaise. Paris: Didier. 1-68.
Locke, John 1975/1689. Essay on Human Understanding. Oxford: Oxford University Press. 1st edn.
1689.
Lyell, Charles 1830-1833. Principles of Geology, Being an Attempt to Explain the Former Changes of
the Earth's Surface by Reference to Causes now in Operation. London: Murray.
Lyons, John 1963. Structural Semantics. Oxford: Blackwell.
Madvig, Johann N. 1875. Kleine philologische Schriften. Leipzig: Tcubner. Reprinted: Hildcshcim:
Olms, 1966.
Du Maisais, Cesar de Chesneau 1757. Des Tropes ou des Differens Sens dans lesquels on Peut
Prendre un Meme Mot dans line Meme Langue. Paris: chez David.
Matthews, Peter 2001. A Short History of Structural Linguistics. Cambridge: Cambridge University
Press.
10. The influence of logic on semantics 19_1^
Steinthal, Heymann 1855. Grammatik, Logik und Psychologie: Ihre Prinzipien und ihr Verhdltnis
zueinander. Berlin: Diimmler.
Steinthal, Heymann 1871. Abriss der Sprachwissenschaft 1. Berlin: Diimmler.
Stern, Gustav 1931. Meaning and Change of Meaning. With Special Reference to the English
Language. Goteborg: Elanders Boktryckeri Aktiebolag. Reprinted: Bloomington, IN: Indiana
University Press, 1968.
Stewart, Dugald 1810. Philosophical Essays. Edinburgh: Creech.
Stocklein, Johann 1898. Bedeutungswandel der Worter. Seine Entstehung und Entwicklung.
Miinchen: Lindauer.
Stout, George Frederick 1891.Thought and language. Mind 16,181-197.
Tooke, John H. 1786-1805. Epea ptepoenta; or, the Diversions of Purley. 2nd edn. London: printed
for the author.
Traugott, Elizabeth C. 2005. Semantic change In: K. Brown (ed.). Encyclopedia of Language and
Linguistics. Amsterdam: Elsevier.
Trench, Richard C. 1851/1890. On the Study of Words. 21st edn. London: Kegan Paul, Trench,
Trubncr & Co., 1890.
Ullmann, Stephen 1951. The Principles of Semantics. A Linguistic Approach to Meaning. 2nd edn.
Glasgow: Jackson, 1957.
Ullmann, Stephen 1962. Semantics. Oxford: Blackwell.
Wegener, Philipp 1885/1991. Untersuchungen iiberdie Grundfragen des Sprachlebens. Halle/Saale:
Niemeyer.
Welby, Victoria Lady 1911. Signifies and Language. The Articulate Form of our Expressive and
Interpretative Resources. London: Macmillan & Co.
Wundt, Wilhelm 1900. Volkerspsychologie I: Die Sprache. Leipzig: Engelmann.
Brigitte Nerlich, Nottingham (United Kingdom)
10. The influence of logic on semantics
1. Overview
2. Pre-Fregean logic
3. GottlobFrege's progress
4. Bertrand Russell's criticism and his theory of definite descriptions
5. Rudolf Carnap's theory of extension and intension: Relying on possible worlds
6. Willard V. O. Quine: Logic, existence and propositional attitudes
7. Necessity and direct reference:The two-dimensional semantics
8. Montague-Semantics: Compositionality revisited
9. Generalized quantifiers
10. Intensional theory of types
11. Dynamic logic
12. References
Abstract
The aim of this contribution is to investigate the influence of logical tools on the
development of semantic theories and vice versa. Pre-19th-century logic was limited to a few
sentence forms and their logical interrelations. Modern predicate logic and later type
logic, both inspired by investigating the meaning of mathematical sentences, widened
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 191-215
192 II. History of semantics
the view for new sentence forms and thereby made logic relevant for a wider range of
expressions in natural language. In a parallel course of developments the problem of
different levels of meaning like sense and reference, or intension and extension were
studied and initiated a shift to modal contexts in natural language. Montague bundled
in his intensional type-theoretical framework a great part of these development in a
unified formal framework which had strong impact on the formal approaches in natural
language semantics. While the logical developments mentioned so far could be seen as
direct answers to natural language phenomena, the first approaches to dynamic logic
did not get their motivation from natural language, but from the semantics of
computer programming. Here, a logical toolset was adapted to specific problems of natural
language semantics.
1. Overview
The aim of this contribution is to investigate the influence of logical tools on the
development of semantic theories and vice versa. The article starts with an example of Pre-
Fregean logic, i.e. Aristotelian syllogistic.This traditional frame of logic has severe limits
for a theory of meaning (e.g. no possibility for multiple quantification, no variety of
scope). These limitations have been overcome by Frege's predicate logic which is the
root of standard modern logic and the basis for the first developments in a formal
philosophy of language: Wc will present Frege's analysis of mathematical sentences
and his transfer to the analysis of natural language sentences by introducing a theory
of sense and reference. The insufficient treatment of singular terms in Frege's logic is
one reason for Russell's theory of definite descriptions. Furthermore, Russell tries to
organize semantics without a difference between sense and reference. Carnap introduces
the first logical framework for a semantics of possible worlds and shows how one can
keep the Frcgcan semantic intuitions without using the problematic tool of "senses".
Carnap develops a systematic theory of intension and extension defining the intension of
an expression as a function from possible worlds to the relevant extension. An important
formal step was then the invention of modal logic by Kripke.This is the framework for
the theory of direct reference of proper names and for the so-called two-dimensional
semantics which is relevant to receive an adequate treatment of names, definite
descriptions, and especially indexicals. Tarski's formal theory of truth is used by Davidson to
argue that truth-conditions arc the adequate tool to characterize the meaning of
assertions. Although the idea of a truth-conditional semantics is already in the background
since Frege, with Davidson's work it became the leading idea for modern semantics.
The second part of the article (starting with section 8) will concentrate on important
progresses made in this overall framework of truth-conditional semantics. Montague
presented a compositional formal semantics including quantifiers, intensional contexts
and the phenomenon of deixis. His ideal was to offer an absolute truth-condition for
any sentence. In the next step new formal tools were invented to account not only
for the extralinguistic environment but also for the discourse as a core feature of the
meaning of utterances. Context-dependency in this sense is considered in approaches
of dynamic semantics. Formal tools arc nowadays not only used to answer the leading
question "What is the meaning of a natural language expression?" In recent
developments new logic formalisms are used to answer questions like "How is a convention
established?" and "How can we account for the pragmatics of the utterance?" It has
10. The influence of logic on semantics 193
become clear that truth-conditional semantics has to be completed by aspects of the
environment,social conventions and speaker's intentions to receive an adequate account
of meaning. Therefore the latest trend is to offer new formal tools that can account for
these features.
2. Pre-Fregean logic
The first system of logic was grounded by Aristotle (see 1992). He organized inferences
according to a syllogistic schema which consists of two premises and a conclusion. Each
sentence of such a syllogistic schema contains two predicates (F, G), a quantifier in front
of the first predicate (some, every) and a negation (not) could be added in front of the
second predicate. Each syllogistic sentence has the following structure "Some/every F is/
is not G".Then we receive four possible types of sentences: A sentence is universal if it
starts with "every" and particular if it starts with "some". A sentence is affirmative if it
docs not contain a negation in front of the second predicate otherwise it is negative. We
receive Tab. 10.1.
Tab. 10.1: Syllogistic types of sentences
NAME
a
i
e
o
FORM
Every F is G
Some F is G
Every F is not G
Some F is not G
TITLE
Universal Affirmative
Particular Affirmative
Universal Negative
Particular Negative
The name of the affirmative syllogistic sentences is due to the latin word "affirmo".The
first vowel represents the universal sentence while the second vowel represents the
particular.The name of the negative syllogistic sentences is due to the latin word "nego"
again with the same convention concerning the use of the first and second vowel. As
we will see the sequence of vowels is also used to represent the syllogistic inferences. If
we introduce the further quantifier "no" we can find equivalent representations but no
new propositions.The sentence "No F is G" is equivalent to (e) "Every F is not G" and
"No F is not G" is equivalent to (a) "Every F is G".The proposition (e) is intuitively
more easily to grasp in the form "No F is G" while proposition (a) is better
understandable in the original format "Every F is G". So we continue with these formulations.
On the basis of these sentences we can systematically arrange the typical syllogistic
inferences, e.g. the inference called "barbara" because it contains three sentences of the
form (a).
Tab. 10.2: Barbara
Premise 1 (a): Every G is H. abbreviation: GaH
Premise 2 (a): Every F is G. abbreviation: FaG
Conclusion (a): Every F is H. abbreviation: FaH
Now we can start with systematic variations of the four types of sentences (a, i, e, o).The
aim of Aristotle was to select all and only those inferences which are valid. Given the
same structure of the predicates in the premises and the conclusion only varying the kind
of sentence we receive e.g. the valid inferences of Tab. 10.3.
194 II. History of semantics
Tab. 10.3: Same predicate structure, varying types of sentences
Barbara
Every M is H
Every F is M
Every F is H
Darii
Every M is H
Some F are M
Some F are H
Ferio
No M is H
Some F is M
Some F is not H
Celarent
No M is H
Every F is M
No F is H
To present the complete list of possible syllogistic inferences we have to account for
different kinds of predicate positions in the inference. We can distinguish four general
schemata including the one we already had presented so far. Our first schema has the general
structure (I) and we also receive the other structures (II to IV) in Tab. 10.4.
Tab. 10.4: Predicate structures
I. M H II. H M III. M H IV. H M
FM FM MF MF
FH FH FH FH
For each general format we can vary the kinds of sentences that are involved in the way
presented above.This leads to all possible syllogistic inferences in the Aristotelian logic.
While making this claim wc arc ignoring the fact that Aristotle already worked out a
modal logic, cf. Nortmann (1996). Concentrating on nonmodal logic we have presented
the core of the Aristotelian system. Although it was an ingenious discovery in ancient
times, the Aristotelian system has its strong limitations: Strictly speaking, there is no
space in syllogistic inferences for (a) singular terms, (b) existence claims (like "Trees
exist") and there are only very limited possibilities for quantification (see section on
Frege's progress). Especially, there is no possibility for multiple uses of quantifiers in
one sentence. This is the most important progress which is due to the predicate logic
essentially developed by Gottlob Frege. Before we present this radical step into modern
logic, we shortly describe some core ideas of G.W. Leibniz, who invented already some
influential ideas on the way to modern logic.
Leibniz is well-known for introducing the idea of a calculus of logical inferences. He
introduced the idea that the syntax of the sentences mirrors the logical structure of the
thoughts expressed and that there can be defined a purely syntactic procedure of proving
a sentence. This leads to the modern understanding of a syntactic notion of proof which
ideally allows for all sentences to decide simply on the basis of syntactic
transformations whether they are provable or not. The logic systems developed by Leibniz are
essentially advanced compared to the Aristotelian syllogistic. It has been shown that
his logic is equivalent to the Boolean logic, i.e. the monadic predicate logic, see Lenzen
(1990). Furthermore, Leibniz introduced a calculus of concepts defining concept identity,
inclusion, containment and addition, see Zalta (2000) and Lenzen (2000). He reserved
a special place for individual concepts. Since his work had almost no influence on the
general development of logic the main ideas are only mentioned here. Ignoring a lot of
interesting developments (e.g. modal systems) wc can characterize a great deal of the
logical systems initiated by Aristotle until the 19th century by the square of opposition
(cf. article 8 (Meier-Oeser) Meaning in pre 19th-century thought).
The square of opposition (see Fig. 10.1) already involves the essential distinction
between different understandings of "oppositions": A contradiction of a sentence is
an external negation of sentence "It is not the case that ..." while the contrary involves
10. The influence of logic on semantics
195
Every S is P
Some S is P
Fig. 10.1: Square of oppositions
Some S is not P
an "internal" negation. What is meant by an "internal" negation can only be illustrated
by transforming the syllogistic sentences into modern predicate logic. The most
important general features in the traditional understanding are the following: (i) From two
contradictory sentences one must be false and one be true, (ii) For two contrary
sentences holds that they cannot both be true (but they can both be false), (iii) For two
subcontrary sentences holds that they cannot both be false (but they can both be true).
A central problem pointed out by Abelard in the Dialectica (1956) is presupposition
of existential import: According to one understanding of the traditional square of
opposition it presupposes that sentences like "Every F is G" or "Some F is G" imply "There
is at least one thing which is F".This is the so-called existential import condition which
leads into trouble.The modern transformation of the syllogistic sentences into predicate
logic, as included into the figure above, does not involve the existential presupposition.
Let us use an example: On the one hand, "Some man is black" implies that at least one
thing is a man, namely the man who has to be black if "Some man is black" is true. On
the other hand,"Some man is not black" also implies that something is a man, namely the
man who is not black if "Some man is not black" is true. But these are two subcontrary
sentences, i.e. according to the traditional view they cannot both be false; one has to be
true.Therefore (since both imply that there is a thing which is a man) it follows that men
exist. In such a logic the use of a predicate F in a sentence "Some F..." presupposes that
F is non-empty (simply given the meaning of F and the traditional square of opposition),
i.e. there are no empty predicates. But of course (as Abelard points out) surely men
might not exist. This observation leads to the modern square of opposition which uses
the reading of sentences in predicate logic and leaves out the relations of contraries and
subcontraries and only keeps the relation of the contradictories. The leading intuition to
avoid the problematic consequence mentioned above is the claim that meaningful
universal sentences like "Every man is mortal" do not imply that men exist. The existential
import is denied for both predicates in universal sentences. Relying on the interpretation
of particular sentences like "Some men are mortal" according to modern predicate logic,
the truth of the particular sentences just means that there is at least one man which is
mortal. If we want to allow nonempty predicates then we receive the modern square of
opposition.
196 II. History of semantics
3. Gottlob Frege's progress
Frege was a main figure from two points of view: He introduced a modern predicate logic
and he also developed the first systematic modern philosophy of language by transferring
his insights in logic and mathematics into a philosophy of language. His logic was first
developed in the "Begriffsschrift" (1879, in the following shortly: BS), see Frege (1977).
He used a special notation to characterize the content of a sentence by a horizontal line
(the content line) and the act of judging the content by a vertical line (the judgement line).
V A
This distinction is a clear precursor of the distinction between illocution (the type of
speech act) and proposition (the content of the speech act) in Searle's speech act theory.
Frege's main aim was to clarify the status of arithmetic sentences. Dealing with a
mathematical expression like „32" Frege analyzes it into the functional expression „()2"
und the argument expression „3". The functional expression refers to the function of
squaring something while the argument expression refers to the number 3. An essential
feature of the functional expression is its being unsaturated, i.e. it has an empty space
that needs to be filled by an argument expression to constitute a complete sentence.
The argument expression is saturated, i.e. it has no space for any addition. Frege
transferred this observation into the philosophy of language: Predicates are typical
expressions which are unsaturated, while proper names and definite descriptions are typical
examples for saturated expressions. Predicates refer to concepts while proper names and
definite descriptions refer to objects. Since predicates are unsaturated expressions which
need to be completed by a saturated expression, Frege defines concepts (as the
reference of predicates) as functions which - completed by objects (as the referents of proper
names and other singular terms) - always have a truth value (Truth or Falsity) as result.
The truth-value is the reference of the sentence composed of the proper name and the
predicate. By analogy from mathematical sentences Frege starts to analyze sentences of
natural language and dealing with natural language and develops a systematic theory of
meaning. Before outlining some basic aspects of this project, we first introduce the idea
of a modern system of logic. Frege developed the following propositional calculus (for a
detailed reconstruction of Frege's logic see von Kutschera 1989, chap. 3):
Axioms:
(1) a. A^(B^A)
b. (C^(B^A))^((C^B)^(C^A))
c. (D^(B^A))^(B^(D^A))
d. (B -+ A) -+ (^A -+ ^B)
e. ^A-^A
f. A -+ -.-.A
A rule of inference, which allows to derive theorems by starting with two axioms or
theorems:
(2) A^B,A\-B
10. The influence of logic on semantics 197
If we add an axiom and a rule we receive a system of predicate logic that is complete and
consistent. Frege suggested the following axiom:
(3) VxA[x]^A[a](BS:5\).
The additionally relevant rule of inference was not explicitly marked by Frege but
presupposed implicitly:
(4) A -* B[a] \- A -* VjcB[jc], if „a" is not involved in the conclusion (BS: 21).
Frege tried to show the semantic consistency of the predicate calculus (which was then
intensity debated) but he did not try to prove the completeness since he lacked the notion
of interpretation to develop such a proof (von Kutschera 1989: 34). A formal system of
axioms and rules is complete for first-order predicate logic (FOPL) if all sentences
logically valid in FOPL are derivable in the formal system.The completeness proof was for
the first time worked out by Kurt Godel (1930). Frege already included second-order
predicates into his system of logic. The interesting fact that second-order predicate logic
is incomplete was for the first time shown by Kurt Godel (1931).
One of the central advantages of modern predicate logic for the development of
semantics is the fact that we can now use as much quantifiers in sequence as we want.
We have of course to take care of the meaning of quantifiers given the sequence. The
following sentences which cannot be expressed in the system of Aristotelian syllogism
can be nicely expressed by the modern predicate logic using "L" as a shorthand for the
two-place predicate "( ) loves ( )".
(5) Everyone loves everyone: \/x\/yL(x,y)
Using mixed quantifiers their sequence becomes relevant:
(6) a. Someone loves everyone: 3x\/yL(x,y)
b. Everyone loves someone: \/x3yL(x, y)
[This can be a different person for everyone]
c. Someone is loved by everyone: 3y\/xL(x, y)
[There is (at least) one specific human being who is loved by all human beings]
d. Everyone is loved by someone: Vv3jcL(jc, y)
e. Someone loves someone: 3x3yL(x,y)
[There is (at least) one human being who loves (at least) one human being]
Frege's philosophy of language is based on a principle of compositionality (cf. article 6
(Pagin & Westerstahl) Compositionality), i.e. the principle that the value of a complex
expression is determined by the values of the parts plus its composition. He developed a
systematic theory of sense and reference (cf. article 3 (Tcxtor) Sense and reference). The
reference of a proper name is the designated object and the reference of a predicate is
a concept while both determine the reference of the sentence, i.e. the truth-value. Given
this framework of reference it follows that the sentences
198 II. History of semantics
(7) The morning star is identical with the morning star,
and
(8) The morning star is identical with the evening star.
have the same reference, i.e. the same truth-value: The truth-value is determined by the
reference of the name and the predicate. Each token of the predicate refers to the same
concept and the two names refer to the same object, the planet Venus. But sentence (7)
is uninformative while sentence (8) is informative. Therefore, we need a new aspect of
meaning to account for the informativity: the sense of an expression.The sense of a proper
name is a mode of presentation of the designated object, i.e. "the evening star" expresses
the mode of presentation characterized as the brightest star in the evening sky.
Furthermore the sense of a sentence is a thought. The latter (in the case of simple sentences like
"Socrates is a philosopher") is constituted by the sense of a predicate "() is a philosopher"
and the sense of the proper name "Socrates". Frege defines the sense of an expression in
general as the mode of presentation of the reference. To develop a consistent theory of
sense and reference Frege introduced different senses for one and the same expression
in different linguistic contexts, e.g. indirect speech (propositional attitude ascriptions) or
quotations are contexts in which the sense of an expression changes. Frege's philosophy
of language has at least two major problems: (1) the necessity of an infinite hierarchy of
senses to account for the recursive syntactic structure (John believes that Mary believes
that Karl believes ) and (2) the problem of indexical expressions (it is accounted for in
two-dimensional semantics and dynamic semantics, see below).
4. Bertrand Russell's criticism and his theory of definite descriptions
Russell (1903, partly in cooperation with Whitehead (Russell & Whitehead 1910-1913))
also developed himself both a system of logic and a philosophy of language in contrast
to Frege such that we nowadays speak of Neo-Fregean and Neo-Russellian theories of
meaning. Let us first have a look at Russell's logical considerations. Russell developed
his famous paradox which was a serious problem for Frege because Frege presupposes
in his system that he could produce sets of sets in an unconstrained manner. But if there
are no constraints we run into Russell's paradox: Let R be the set of all sets which are not
members of themselves.Then R is neither a member of itself nor not a member of itself.
Symbolically, let R := [x :x g JtJ.Then Re RiffRe R.To illustrate the consideration:
If R is a member of itself it must fulfill the definition of its members, i.e. it must not be
a member of itself. If R is not a member of itself then it should not fulfill the definition
of its members, i.e. it must be a member of itself. When Russell wrote his discovery in a
letter to Frege who was just completing Grundlagen der Arithmetik Frege was despaired
because the foundations of his system were undermined. Russell himself developed a
solution by introducing a theory of types (1908). The leading idea is that we always have
to clarify those objects to which the function will apply before a function can be defined
exactly. This leads to a strict distinction between object language and meta-language:
We can avoid the paradox by avoiding self-references and this can be done by arranging
all sentences (or, equivalently, all propositional functions) into a hierarchy. The lowest
level of this hierarchy will consist of sentences about individuals. The next lowest level
10. The influence of logic on semantics 199
will consist of sentences about sets of individuals. The next lowest level will consist of
sentences about sets of sets of individuals, and so on. It is then possible to refer to all
objects for which a given condition (or predicate) holds only if they are all at the same
level, i.e. of the same type. The theory of types is a central element in modern theory of
truth and thereby also for semantic theories. Russell's contribution to the philosophy of
language is essentially connected with his analysis of definite descriptions (Russell 1905).
The meaning of the sentence "The present King of France is bald" is analyzed as follows:
1. there is an x such that x is the present King of France (3x(Fx))
2. for every x that is the present King of France and every y that is the present King
of France, x equals y (i.e. there is at most one present King of France) (Vjc(Fjc —*■ Vy
(Fy^y = x)))
3. for every x that is the present King of France, x is bald. (\/x(Fx —► Bx))
Since France is no longer a kingdom, assertion Lis plainly false; and since our statement
is the conjunction of all three assertions, our statement is false.
Russell's analysis of definite descriptions involves a strategy to develop a purely exten-
sional semantics, i.e. a semantic theory that can characterize the meaning of sentences
without introducing the distinction between sense and reference or any related
distinction of intensional and extensional meanings. Definite descriptions are analyzed such
that there remains no singular term in the reformulation and ordinary proper names
are according to Russell's theory hidden definite descriptions. His strategy eliminates
singular terms with only one exception: He needs the basic singular term "this/that" to
account for our speech about sense-data (Russell 1910). Since he takes an acquaintance
relation with sense-data (and also with universals) including a sense-data ontology as a
basic presupposition of his specific semantic approach, the only core idea that survived
in modern semantics is his logical analysis of definite descriptions.
5. Rudolf Carnap's theory of extension and intension:
Relying on possible worlds
Since Russell's project was idiosyncratically connected with a sense-data theory
it was for the great majority of scientists not acceptable as a purely extensional
project. The extensional semantics had to wait until Davidson used Tarski's theory
of truth as a framework to characterize a new extensional semantics. Meanwhile it
was Rudolf Carnap who introduced the logic of extensional and intensional
meanings to modernize Frege's twofold distinction of semantics. The central progress
was made by introducing the idea of possible worlds into logics and semantics: The
actual world is constituted by a combination of states of affaires which are
constituted by objects (properties, relations etc.). If at least one state of affairs is changed
we speak of a new possible world. If the world consists of basic elements which
constitute states of affaires then the possible combinations of these elements allow us
to characterize all possible states of affaires. Thereby we can characterize all
possible worlds since a possible world can be characterized by a class of states of affaires
that is realized in this world. Using this new instrument of possible worlds Carnap
introduces a systematic theory of intension. His notion of intension should
substitute Frege's notion of sense and thereby account for the informational content of a
200 II. History of semantics
sentence. His notion of extension is closely connected to Frege's notion of reference:
The extension of a singular term is the object referred to by the use of the term, the
extension of a predicate is the property referred to and the extension of a complete
assertive sentence is its truth-value. The intension which has to account for the
informational content is characterized as a function from possible worlds to the relevant
extensions. In the case of a singular term the intension is a function from possible
worlds (p.w.) to the object referred to in the relevant possible world. In the same line
you receive the intension of predicates (as function from p.w. to sets or n-tuples) and of
sentences (as functions from p.w. to truth-values) as shown in Tab. 10.5.
Tab. 10.5: Carnap's semantics of possible worlds
Extension Intension
singular terms objects individual concepts
predicates sets of objects and n-tuples of objects properties
sentences truth-values propositions
A principle limitation of a semantic of possible worlds is that you cannot account for
so-called hyperintensional phenomena, i.e. one cannot distinguish the meaning of two
sentences which are necessarily true (e.g. two different mathematical claims) because
they arc simply characterized by the same intension (matching each p.w. onto the value
"true").
6. Willard V. O. Quine: Logic, existence and propositional attitudes
How should we relate quantifiers with our ontology? Quine (1953) is famous for his
slogan "To be is to be the value of a bound variable". Quantifiers "there is (at least)
an x (3jc)", "for all x (Vjc)" are the heart of modern predicate logic which was already
introduced by Frege (s. above). For Quine the structure of the language determines the
structure of the world: If the language which is necessary to receive the best available
complete description of the world contains several existential and universal
quantifications then these quantifications at the same time determine the objects, properties etc. we
have to presuppose. Logic, language and ontology are essentially connected according
to this view. Although Quine's special views about connecting logic, language and world
are very controversial nowadays the core of the idea of combining quantificational and
ontological claims is widely accepted.
Another problem that is essentially inspired by the development of logic is the analysis
of propositional attitude ascriptions. Quine established the following standard story:
There is a certain man in a brown hat whom Ralph has glimpsed several times under
questionable circumstances on which wc need not enter here; suffice it to say that Ralph suspects
he is a spy. Also there is a gray-haired man, vaguely known to Ralph as rather a pillar of the
community, whom Ralph is not aware of having seen except once at the beach. Now Ralph
does not know it, but the men are one and the same. Can we say of this man (Bernard J.
Ortcutt, to give him a name) that Ralph believes him to be a spy? If so. we find ourselves
accepting a conjunction of the type:
(9) w sincerely denies' '. w believes that
10. The influence of logic on semantics 201_
as true, with one and the same sentence in both blanks. For, Ralph is ready enough to say. in
all sincerity.'Bernard J. Ortcutt is no spy.' If, on the other hand, with a view to disallowing
situations of the type (9), we claim simultaneously that
(10) Ralph believes that the man in the brown hat is a spy.
(11) Ralph does not believe that the man seen at the beach is a spy.
then we cease to affirm any relationship between Ralph and any man at all.[...] 'believes
that' becomes, in a word, referentially opaque.
(Quine 1956: 179, examples renumbered)
In line with Russell, Quine starts to analyze the cognitive situation of Ralph by
distinguishing two readings of the sentence
(12) Ralph believes that someone is a spy.
namely:
(13) Ralph believes [3x(x is a spy)]
(14) 3jc(Ralph believes [jc is a spy])
Quine calls (13) the notional and (14) the relational reading of the original sentence
which is at the first glance parallel to the traditional distinction between de dicto (13) and
de re (14) reading. But he shows that the difference of these two readings is not sufficient
to account for Ralph's epistemic situation as characterized with the sentences (10) and
(11). Intuitively Ralph has a de re reading in both cases, one of the man on the beach and
the other of a person wearing a brown hat.The transformation into de re readings leads to:
(15) 3jc(Ralph believes [jc is a spy]) (out of (10))
(16) 3jc(Ralph does not believe [jc is a spy]) (out of (11))
Since both extensional quantifications are about the same object, we receive the
combined sentence which explicitly attributes contradictory beliefs to Ralph:
(17) 3jc(Ralph believes [jc is a spy] a Ralph does not believe [jc is a spy])
To avoid this unacceptable consequence Quine suggests that the ambiguity of the belief
sentences cannot be accounted for by a distinction of the scopus of the quantifier (leading
to de re and de dicto readings) but by a systematic ambiguity of the belief predicate:
He suggests to distinguish a two-place predicate "Believe2 (subject, proposition)" and a
three-place predicate "believe3 (subject, object-of-belief, property)".
(18) Believe2 (Ralph, that the man with the brown hat is a spy)
(19) believe3 (Ralph, the man with the brown hat, spy-being)
202 II. History of semantics
This distinction is the basis for Quine's further famous claim: We are not allowed to
quantify into propositional attitudes (i.e. implying (14) from (18)): if we have interpreted
a sentence such that the 'believe' predicate is used intensionally (as a two-place
predicate) then we cannot ignore that and we are not allowed to change the reading to the
sentence into one using a three-place predicate. We are not allowed to change from a
notional reading (18) into a relational reading (19) and vice versa. This line of strategy
was further improved e.g. by Kaplan (1969) and Loar (1972). It definitely made clear that
we cannot always understand the belief expressed by a belief sentence simply as a
relation between a subject and a proposition. Sometimes it has to be understood differently.
Quine's consequence is a systematic ambiguity of the predicate "believe". This is
problematic since it leads also to four-place, five-place predicates etc. (Haas-Spohn 1989:66):
for each singular term which is used in the scopus of the belief ascription we have to
distinguish a notional and a relational reading. Therefore Cresswell & von Stechow (1982)
suggested an alternative view which only needs to presuppose a two-place predicate
"believe" but therefore changes the representation of a proposition: A proposition is not
completely characterized by a set of possible worlds (according to which the relevant
state of affaires is true) but in addition by a structure of the components of the
proposition. Structured propositions are the alternative to a simple possible world semantics to
account for propositional attitude ascriptions.
7. Necessity and direct reference: The two-dimensional semantics
The development of modal logic essentially put forward by Saul A. Kripke had strongly
influenced the semantical theories. The basic intuition the modal logic started with is
rather straightforward: Each entity is necessarily identical with itself (and necessarily
different from anything else). Kripke (1972) shows that there arc sentences which
express a necessary truth but nevertheless are a posteriori: "Mark Twain is identical
with Samuel Clemens". Since "Samuel Clemens" is the civil name of Mark Twain the
sentence expresses a self-identity but it is not known a priori since knowing that both
names refer to the same object is not part of standard linguistic knowledge. There are
also sentences which express contingent facts but which can be known to be true a
priori, e.g. "I am speaking now". If I utter the sentence it is a priori graspablc that it
is true but it is not a necessary truth since otherwise I would be a necessary speaker
at this timepoint (but of course I could have been silent). To account for the new
distinction between epistemic dimension of a priori/a posteriori and the metaphysical
dimension of necessary/contingent Kripke introduced the theory of direct reference of
proper names and Kaplan (1979/1989) introduced the two-dimensional semantics. Since
Kaplan's theory of characters is nowadays a standard framework to account for names
and indcxicals we shortly introduce the core idea: We have to distinguish the
utterance context which determines a proposition which is expressed by uttering a sentence
and the circumstance of evaluation which is the relevant possible world according to
which this proposition will be evaluated as true or false. We can illustrate this two-step
approach as in Fig. 10.2:
10. The influence of logic on semantics 203
character of a sentence
1st step utterance context
intension = truth-condition
(= proposition)
2nd step circumstance of evaluation
extension = truth-value
Fig. 10.2: Two-dimensional semantics
A character of a sentence is a function from possible utterance contexts to truth-
conditions (propositions) and these truth-conditions are then evaluated relative to the
circumstances of evaluation. Especially in the cases of indcxicals wc can demonstrate
the two-dimensional semantics: Let us investigate the sentence "I am a philosopher"
according to three different utterance contexts w0, w, und w2 while in each world there
is a different speaker: in w0 Cicero is the speaker of the utterance, in w, Caesar and in
w2 Augustus. Furthermore it is in w0 the case that Cicero is a philosopher while Caesar
and Augustus are not. In w, Caesar and Augustus are philosophers while Cicero isn't. In
w2 no one is a philosopher (poor world!).The three worlds function both as utterance
contexts and as circumstances of evaluation but of course different facts are relevant in
the two different functions. Now the utterance contexts are represented veritically while
the circumstances of evaluation are represented horizontally. Then the sentence "I am a
philosopher" receives the character shown in Tab. 10.6.
Tab. 10.6: Character of "I am a philosopher"
Utterance Contexts Circumstances Truth conditions
of evaluation
^0 W\ W2
wQ w f f (Cicero: being a philosopher)
w, / w f (Caesar; being a philosopher)
w2 f w f (Augustus: being a philosopher)
Each line represents the proposition that is determined by the sentence relative to the
utterance contexts and this proposition receives a truth-value for each circumstance of
evaluation. The character of the sentences has in principle to be represented for all
possible worlds not only for the three ones selected above. This instrument of a "character"
of a sentence is useable for all expressions of a natural language. Such it is a principle
improvement and enlargement in the formal semantics that contains Carnap's theory of
intensions as a special case.
8. Montague-Semantics: Com positional ity revisited
Frege claimed in his principle of compositonality that the meaning of a complex
expression is a function of the meanings of the expression parts and their way of composition
(see section 3.). He regarded every complex expression as composed of a saturated part
and a non-saturated part. The semantic counterparts are objects and functions. Frege
204 II. History of semantics
transfers the syntactic notion of an expression which needs some complement to be a
complete and well-formed expression to the semantic realm: Functions are regarded as
incomplete and non-saturated. This leads him to the ontological view that functions are
not "objects" ("Gegenstande").This is Frege's term for any entitity that can be a value
of a meaning function mapping expressions to their extensions. Functions, however, can
be reified as "Werthverlaufe" (courses-of-values), e.g. by combining their expressions
with the expression the function,but in Frege's ontology, the "Werthverlaufe" are distinct
from the functions themselves.
Successors of Frege did not follow him in this ontological respect of his semantics. In
Tarskian semantics one-place predicates are usually mapped to sets of individuals, two-
placc predicates to sets of pairs of individuals etc. N-placc functions arc regarded as
special kinds of (n-l)-place relations. This approach allowed to make explicit the meaning
of Frege's non-saturated expressions and to give a precise compositional account of the
meanings of expressions of predicate logic in form of a recursive definition. But for every
type of composition, like connecting formulae, applying a predicate to its arguments, or
prefixing a quantifier to a formula, distinct forms of meaning composition were needed.
The notion of compositionality could be radically simplified by two developments:
The application of type theory and lambda abstraction.Type theory goes back to Russell
and was developed to avoid paradoxical notions as sets not containing themselves, see
above sec. 4. There are a lot of versions of type theory, but in natural language semantics
usually a variant is used which starts with a number of basic types, e.g. the type of objects
(entities) e and the type of truth values t for an extensional language, and provides a type
of functions (7,, 72) for every type 7, and every type 72, i.e. the type of functions from
entities of type 7, to entities of type 72. Sometimes the set of types is extended to types
composed of more than two types. But any such system can easily be reduced to this
binary system. Predicates can now be viewed as expressions of type (e, t), i.e. as functions
from objects to truth values, because they yield a true sentence if an argument is filled in
that refers to an instance of the predicate is filled in,otherwise they yield a false sentence.
That just means that we take the characteristic function of the predicate extension as its
meaning. A characteristic function of a set maps its members to the truth value true, its
non-members to false.
In the same sense the negation operator is of type (t, t), i.e. a function from a truth-
value to a truth-value (namely to the opposite one); binary sentence connectives are of
type (t,(t,t)), i.e. a function taking the first argument and yielding a function which takes
the second argument and then results in a truth value.
Frege had already recognized that first-order quantifiers (as the existential quantifier
something) are just second-order predicates, i.e. predicates applicable to first-order
predicates. The application of an existential quantifier to a one-place first-order predicate is
true if the predicate is non-empty. Therefore the existential quantifier can be regarded
as a second-order predicate which has non-empty first-order predicates as instances, it
is of type ((e, t), f).The semantics of the universal quantifier is analoguous: It yields the
truth value true for a first-order predicate which has the whole universe of discourse as
its extension.
A type problem arises with this view for predicates of arity greater than one: Their
type does not fit to the quantifier.The predicate /ove,e.g.isof lypc(e,(e,t)) because if we
add one object as an argument we receive a one-place predicate as introduced above. In
the sentence
10. The influence of logic on semantics 205
(20) Everybody loves someone.
one of the two quantifiers has to be composed with love in a compositional approach. Let
us assume someone is this quantifier.Then its meaning of type ((e,t),t) has to be applied
to the meaning of love, leading to a type clash, because the quantifier needs a type (e,
t) as its argument while the predicate love is of a different type. Analoguous problems
arise for complex formulae. How can the meaning of the two-place predicate love be
transformed into the type required?
The type theory needs an extension by the introduction of a lambda-operator,
borrowed from Alonzo Church's lambda calculus which was developed in Church (1936).
The lambda operator is used to transform a type t expression into an expression of (Tr
t) depending on the variable type 7*,. Let e.g. jc be a variable of type e and P be of type
t, then Xjc[F] has type (e, t), i.e. is a one-place first-order predicate. If we consider love as
a two-place first-order predicate and love(jc, y) as an expression of type t with two free
variables, then Xjc[love(jc, y)] is an expression of type (e, f).This is the type required by
the quantifier.The variable jc is bound to the lambda operator and y is free in this
expression. If we now take someone as a first-order quantifier, which has type ((e, t), t), then
someone(Ajc[love(jc,)>)]) is an expression of type t again with free variable y.This can be
made a predicate of type (e, t) by using the same procedure again: Xy[someone(Ajc[love
(jc, y)])], which can be used as an argument of a further quantifier everybody. We receive
the following new analysis of (20):
(21) everybody(Xy[someone(Ajc[love(jc,y)])])
The semantics of a lambda expression Xjc[F] is defined as the characteristic function
which yields true for all arguments which would make P true, if they were taken as
assignments to jc, false for the others. With this semantics we get the following logical
equivalences which we used implicitly in the above formalizations:
a-CONVERSlON
(22) Xx[P] = \y[P\y/x]]
where /'[v/jc] is the same as P besides the fact that all occurences of jc which are free in P
are changed into y. tt-conversion is just a formal renaming of variables.
^-REDUCTION
(23) \x[F](a)mF[alx]
where P[a/x] is the same as P besides the fact that all occurences of jc which are free in P
are changed into a. This, however, may be false in some non-extensional contexts, if a is
a non-rigid designator. If we take
(24) Ralph believes that jc is a spy.
as P then the de re reading of (10) is Xx[P](a).a is interpreted outside the non-extensional
context of P and its factual denotation is taken as its extension, /'[o/jc], however, is the de
206 II. History of semantics
dicto reading because a is interpreted within the belief-context. For intensional contexts
these differences are treated in the intensional theory of types, see section 10 below.
^-conversion
(25) Xx[P(x)] = P
where x docs not occur as a free variable in P. 77-convcrsion is needed for the conversion
between atomic predicates and X-expressions.
In this type-theoretical view, a lot of other linguistic expression types can easily be
integrated. Adjectives, which are applied to nouns of type (e, t), can be seen as type
((e, t), (e, t)), e.g. tasty is a modifier which takes predicates, expressed by nouns, like apple
and yields new predicates, like tasty apple.
Modifiers in general, like adverbs, prepositional phrases and relative and adverbial
clauses, are regarded as type (7,, 7,), because they modify the meaning of their
argument, but this results in an expression of the same type. This also mirrors that modifiers
can be applied in an iterated manner, as red tasty apple, where red further modifies the
predicate tasty apple.
The compositionality of meaning got a very strict interpretation in Montague's work.
The type-theoretic semantics was accompanied by a syntactic formalism whose
expression categories could be directly mapped onto semantic types, called catcgorial grammar,
which was based on ideas by Kasimierz Ajdukiewicz in the mid-1930s and Yehoshua
Bar-Hillel (1953).
Complex semantic types, i.e. semantic types needing some complemetation, like
(7,, 7,) for some types 7, and 7,, have their counterpart in complex syntactic categories
like SJSX and 52\5, which need a complementation by 5, to yield the syntactic category
S2. Given a syntactic category SJSy a complement of type 5, has to be added to the
right, in case of S2\S] to the left. Let e.g. N be the syntactic category of a noun and DP
be the category of a determiner phrase, then DP/N is the category of the determiner in
the examples above. The interaction of Montague's syntactic and semantic conception
results in the requirement that the meaning of a complex expression can be recursively
decomposed into function-argument-pairs which are always expressed by syntactically
immediately adjacent constituents.
Categorial grammars describe the same class of languages as context free grammars,
and they are subject to the same problems when applied to natural languages. Although
most phenomena in natural languages can in principle be represented by a context free
grammar, and therefore by a categorial grammar, too, both formalisms lead to quite
unintuitive descriptions when applied to a realistic fragment of natural language.
Especially with regard to semantic compositionality discontinous constituents require a quite
unintuitive multiplication of categories.
The type theoretic view of nouns and noun phrases has also consequences for the
semantic concept of quantifying expressions. Determiners like all or two as parts of
determiner phrases will have to be assigned a suitable type of meaning.
9. Generalized quantifiers
As mentioned above, already Frege regarded quantifiers as second- or higher-order
predicates. But Frege himself did not go the step from this insight to the consideration
10. The influence of logic on semantics 207
of other quantifiers than the universal and the existential ones. Without being aware of
Frege's concept of quantifiers the generalized view on quantifiers by Mostowski (1957)
and Lindstrom (1966) pathed the way to a generalized theory of quantifiers in linguistics
around 1980. This allowed for a proper treatment of syntactically first-order quantifying
expressions whose semantics was not expressible in first-order logic, e.g. most.
Furthermore, now, noun phrases which had no mere anaphoric function, could be
interpreted as a quantifier, consisting of the determiner, e.g. all, which specifies the
basic quantifier semantics, and the noun, e.g. women, or a noun-like expression which
restricts the quantification. The determiner is a function of type ((e, t), ((e, t), t)) taking
the restrictor as argument and resulting in a generalized quantifier, e.g. all(woman) for
the noun phrase all women. This generalized quantifier is (the characteristic function of)
a second-order predicate which has all (charatcristic functions of) first-order predicates
as its instances which are true for all women. As the determiner designates the principle
function in such a noun phrase some linguists prefer the term determiner phrase.
Generalized quantifiers can be studied with respect to their monotonicity properties.
Let Q be a generalized quantifier and Q{P) be true.Then it can be the case - depending
on Q's semantics - that Q(P') is always true if
(A) the extension of P' is a subset of the extension of P or
(B) the extension of P is a subset of the extension of P'
In the first case, we call Q monotone decreasing or downward entailing, in case (B)
monotone increasing or upward entailing. An example of the first quantifier type is
no women, an example of the second type (at least) two women, cf. e.g. the entailment
relations between sentences the following. (26a) entails (26b), and (27a) entails (27b).
(26) a. At least two women worked as extremely successful CEOs,
b. At least two women worked as CEOs.
(27) a. No women worked as CEOs.
b. No women worked as extremely successful CEOs.
While quantifiers also expressible in first-order logic, like those in the examples above,
always show one of the monotonicity properties, there are other generalized
quantifiers which do not. Numerical expressions providing a lower and an upper bound for a
quantity, like exact numbers, are examples for non-montonic quantifiers. If we replace at
least two women with exactly two women in the examples above, the entailment relations
between the sentences disappear.
Mononicity of quantifiers seems to play an important role in natural language
although not all quantifiers are monotonic themselves. But it is claimed that all simple
quantifiers, i.e. one-word quantifiers or quantifiers of the form determiner + noun, are
expressible as conjunctions of monotonic quantifiers. E.g. exactly three Ps is equivalent
to the conjunction at least three Ps and no more than three Ps, the first conjunct (at least
three Ps) being upward monotonic and the latter (no more than three Ps) being
downward monotonic. There arc, of course, quantifiers not bearing this property, and they
are expressible in natural language, cf.au even number of Ps, but there does not seems
to be any natural language which reserves a simple lexical item for such a purpose. The
208 II. History of semantics
theory of generalized quantifiers therefore raises empirical questions about language
universals.
Similar considerations on entailment conditions can be applied to the first argument
of the determiner. For some determiners D it might be the case that D(R)(P) entails
D(R')(P) always if
(A') the extension of R' is a subset of the extension of R or
(B') the extension of R is a subset of the extension of R'.
In the second case the determiner is called persistent, while in the first it is called anti-
persistent. Consider the entailment relations between the sentences below. Here, (28a)
entails (28b), and (29a) entails (29b).
(28) a. Some extremely successful female CEOs smoke,
b. Some female CEOs smoke.
(29) a. All female CEOs smoke.
b. All extremely successfull female CEOs smoke.
It is easy to see that some is persistent while all is antipersistent. All combinations of
monotonicity and persistence/antipersistence are realized in natural languages.Tab. 10.7
shows some examples. Note that the quantifiers of the square of opposition are part of
this scheme.
Tab. 10.7: Monotonicity and (anti-)peisistence of quantifiers
upward inonotonic downward inonotonic
antipersistent all, every no, at most three
persistent some, (at least) three not all
And the relations of being contradictory and (sub-)contrary (sec sec. 2) in the square of
oppositions are mirrored by negations of the whole determiner-governed sentence or the
second argument: -<D(R, P) is contradictory to D(R, P), while D(R, ->P) is (sub-)contrary
to D(R, P) (we use ->P as short for Xx[^P(x)].).
Analyzing quantified expressions as structures consisting of a determiner, a restrictor
and a quantifier scope provides us with a relational view of quantifiers. The determiner
type ((e, t), ((e, t), t)) is that of a second-order two-place relation. Besides the logical
properties of the argument positions discussed above, there arc a number of interesting
properties regarding the relation of the two arguments. One of them is conservativity. A
determiner is conservative iff it is always the case that
D(R, P) = D(R, P a R)
(where we use P a R as the conjunction of the predicates P and R,more precisely Xx [P(x)
a R(x)]). It is quite evident that determiners in general fulfill this condition. E.g. from
(30) Most CEOs are incompetent.
10. The influence of logic on semantics 209
follows
(31) Most CEOs are incompetent CEOs.
But are really all determiners conservative? Only is an apparent counterexample:
(32) Only CEOs are incompetent,
cannot be paraphrased as
(33) Only CEOs are incompetent CEOs.
(32) being contingent, (33) tautological. But besides this observation there are
syntactical reasons to doubt the classification of only as a determiner. Other quantifying items
whose conservativity is questioned are e.g. many and few.
The foundation for the generalization of quantifiers was in principle laid in Montague's
work, but he himself did not refer to other quantifiers than the classical existential and
universal quantifier. The theory of generalized quantifiers was recognized in linguistics in
the carlcy 1980s, cf. Barwisc & Cooper (1980) and article 43 (Kccnan) Quantifiers.
10. Intensional theory of types
Montague developed his semantics as an intensional semantics, taking into account the
non-extensional aspects of natural languages as alethic-modal, temporal and deontic
operators. The intensional theory of types builds on Carnap's concept of intensions as
functions from possible worlds to extensions. These functions are built into the type
system by adding a new functional type from possible worlds s to the other types. Type s
differs from the other types insofar as there are no expressions - constants or variables
- directly denoting objects of this type, i.e. no specific possible worlds besides the contcx-
tually given current one can be addressed.
The difference between an extensional sentential operator like negation and an
intensional like necessarily is reflected by their respective types: The meaning of the negation
operator has type (t, t) while necessarily needs ((s, t), t) because not only the truth value
of an argument p in the current world has impact on the truth value of necessarily p, but
also the truth values in the alternative worlds.
In Montague (1973) he does not directly define a model-theoretic mapping for
expressions of English, although this should be feasible in principle, but he gives a translation of
English expressions into a logical language. Besides the usual ingredients of modal
predicate logic, the lambda operator as well as variables and constants of the various types, he
introduces the intensor A and extensor v of type (T,(s, T)) and ((s, 7), T) respectively for
arbitrary typesT. A transforms a given meaning into a Carnapian intension, i.e. Aa means
the function from possible worlds to a's extensions in these worlds. In contrast, if b means
an intension then vb refers to the extension in the current world. The intensor is used if
a usually extensionally interpreted expression is used in an intensional context. E.g. a
unicorn and a centaur mean generalized quantifiers, say Qx and Qv which extensionally
are false for any argument. This may be different for other possible worlds. Therefore
210 II. History of semantics
the intensions A(?, and AQ2 may differ.This accounts for the fact that e.g. the intensional
verb seek applied to A(?, and *Q2 may result in different values, as
(34) John seeks a unicorn,
may be true while
(35) John seeks a centaur.
may be false at the same time. With the intensional extension of type logic, natural
language semantics gets a powerful tool to account for the interaction of intensions and
extensions in compositional semantics.
The same machinery which is applicable in alethic modal logic - i.e. the logic of
possibility and necessity - is transferable to other branches of intensional semantics, as e.g.
the semantics of tense and temporal expressions, cf. article 57 (Ogihara) Tense.
11. Dynamic logic
Natural language expresssions not only refer to time-dependent situations, but their
interpretation also is dependent on time-dependent contexts. A preceding sentence may
introduce referents for later anaphoric expressions (cf. article 38 (Dekker) Dynamic
semantics and article 89 (Zimmermann) Context dependency).
(36) A man walks in the park. He whistles.
Among the anaphoric expressions are - of course - nominal anaphora, like pronouns
and definite descriptions. Antecedents are typically indefinite noun phrases. But possible
antecedents can also be introduced in a less obvious way, e.g. by propositions expressing
events. The event time can then be referenced by a temporal anaphora like at the same
time.
(37) The CEO lighted her cigarette. At the same time the health manager came in.
The anaphora at the same time refers to the time when the event described in the first
proposition happens.
Anaphorical relations are not necessarily realized by overt expressions, they can be
implicit, too, or they may be indicated by morphological means, e.g. by the choice of a
grammatical tense. Anaphora pose a problem for compositional approaches to semantics
based on predicate or type logic. (36) can be formalized by an expression headed by an
existential quantifier like
(38) 3jc[man(jc) a w-i-t-p(jc) a whistle(jc)]
But the man mentioned here can be referred to anywhere in the following discourse.
Therefore the quantifier scope cannot be closed at any particular position in the discourse.
10. The influence of logic on semantics 2H_
The semantics of anaphora, however, is not just a matter of the scope of
existential quantifiers. Expressions, like indefinite noun phrases, usually meaning an existential
quantifier can in certain contexts introduce discourse referents with a universal reading.
This fact was described by Geach (1962).
(39) If a farmer owns a donkey he feeds it.
means
(40) Vjc[V^[farmer(jc) a donkey(y) a own(jc,y) -» feed(jc,y)]]
This kind of anaphora is a further challenge for compositional semantics as it has to
deal with the fact that an expression which is usually interpreted cxistcntially gets a
universal reading here.The challenge is addressed by dynamic semantics. Pre-compositional
versions were developed independently by Hans Kamp and Irene Heim as Discourse
Representation Theory and File Change Semantics respectively (cf. article 37 (Kamp &
Reyle) Discourse Representation Theory). In both approaches structures representing
the truth-conditional content of the parts of a discourse as well as the entities which are
addressable anaphorically are manipulated by the meaning of discourse constituents.
The answer to the question how the meaning of a discourse constituent is to be construed
is simply: as a function from given discourse representing structures to new structures of
the same type, cf. Muskens (1996).
This view is made explicit in dynamic logics. This kind of logics was developed in the
1970er by David Harel and others, cf. Harel (2000), and has been primarily used for the
formal interpretation of procedural programming languages.
(41) (a)q
means that statement a possibly leads to a state where q is true, while
(42) [a\q
means that statement a necessarily leads to a state where q is true. Regarding states as
possible worlds we arrive at a modal logic with as many modalities as there are
(equivalence classes of) statements a, for a recursive language usually infinitely many. For many
purposes the consideration can be constrained to such modalities where from each state
exactly one successor state is accessible. For such modalities with functional accessibility
relation the weak ((...)) and the strong operator ([...]) collapse scmantically into one
operator. If we further agree that
(43) J,A[a]j2
can be rewritten as
(44) s\a\s2
212 II. History of semantics
then this notation has the intuitive reading that state 5, is mapped by the meaning of
a into sr A simplistic application is the following: Assume that 5, is a characterization
of the knowledge state of a recipient before receiving the information given by
assertion a.Then s2 is a characterization of the knowledge state of the recipient after being
informed. If you identify knowledge states with those sets of possible worlds which
are consistent with the current knowledge, and if you consider s, and s2 as descriptions
of sets Wx and W2 of possible worlds, then they differ in exactly that respect that W2
is the intersection of Wx and the set W of possible worlds in which a is true, i.e. W2 =
W, n Wa.
The notion of an informative utterance a can be defined by the condition that W2*
Wr And in order to be consistent with the previous context, it must be true that W2 *
0. The treatment of discourse states or contexts in dynamic logics is not limited to
truth-conditionally characterizable knowledge. In principle any kind of linguistic
context parameters can be part of the states, among these the anaphorically accessible
antecedents of a sentence.
In their Dynamic Predicate Logic, as developed in Groenendijk & Stokhof (1991),
Groenendijk and Stokhof model the anaphoric phenomena accounted for in Kamp's
Discourse Representation Theory and Heim's File Change Semantics in a fully
compositional fashion. This is mainly achieved by a dynamic interpretation of the existential
quantifier:
(45) 3x[P(x)]
is semantically characterized by the usual truth conditions but has the additional effect
that free occurrences of the variable x have to be kept assigned to the same object in
subsequent expressions which are connected appropriately.The dynamic effect is limited
by the scopes of certain operators like the universal quantifier, negation, disjunction, and
implication. So x is bound to the same object ouside the syntactic scope of the existential
quantifier as in
(46) 3jc[man(jc) a w-i-t-p(jc)] a whistle(jc)
although the syntactic scope of the existential quantifier ends after w-i-t-p(jc).This prop-
sosition is true, only if there is an object x which verifies all three predicates man, w-i-t-p,
and whistle.
If we characterize the dynamic dimension, the sentences (36) and (39) can be
formalized in Dynamic Predicate Logic as
(47) 3jc[man(jc) a w-i-t-p(jc)] a whistle(jc)
and
(48) 3jc[farmer(jc) a 3y[donkey()>) a own(jc, y)]] -> feed(jc, y)
respectively. It can easily be seen, how the usual meanings of the discourse sentences
(49) A man walks in the park.
10. The influence of logic on semantics 213
and
(50) A farmer has a donkey.
enter into the composed meaning of the discourse without any changes. In order to get
the intended truth conditions for implications, it is required as truth condition that the
second clause can be verified for any assigment to x verifying the first clause.
Putting together the filtering effect of propositions on possible worlds and the
modifying effect on assignment functions, we can consider propositions in Dynamic Predicate
Logic as functions on sets of world-assigment pairs. The empty context can be
characterized by the Cartesian product of the set of possible worlds and the set of assigments.
Each proposition of a discourse filters out certain world-assignment pairs. In some
respects Dynamic Predicate Logic deviates from standard dynamic logic approaches, as
Groenendijk & Stokhof (1991, sec. 4.3) point out, but it still can be seen as a special case
of a logic in this framework.
The dynamic view of semantics can be used to model other contextual dependencies
than just anaphora. Groenendijk (1999) shows another application in the Logic of
Interrogation. Questions add felicity conditions for a subsequent answer to the discourse
context. To a great extent, these conditions can be characterized semantically. In the Logic
of Interrogation the effect of a question is understood as a partitioning of the current set
of possible worlds. Each partition stands for some alternative exhaustive answer, e.g. a
yes-no question partitions the set of possible worlds into one subset consistent with the
positive answer and a complementary subset compatible with the negative answer.
Let us e.g. take the question (51).
(51) Does a man walk in the park?
According to the formalism of Groenendijk (1999) (51) can be formalized as (52).
(52) ?3jc [man(jc) a w-i-t-p(jc)]
(52) partitions the set of possible worlds in two subsets W+ and W~, such that (53) is
true for every world in W+ and false for every world in its complement subset W~. An
appropriate answer selects exactly one of these subsets.
(53) 3jc[man(jc) a w-i-t-p(jc)]
Wh-questions like (54) partition the sets of possible worlds in more partitions than yes-
no questions.
(54) Who walks in the park?
Each partition corresponds to an exhaustive answer, which provides the full information
who walks in the park and who does not. An assertion answers the question partially if
it eliminates at least one partition. If it furthermore eliminates all partitions but one, it
answers the question exhaustively.
This chapter has given a short overview how logic provided the tools for treating
linguistic phenomena. Each logical system has its characteristic strength and its limits.
214
II. History of semantics
The limits of a logical system sometimes inspired the development of new formal tools
(e.g. the step from Aristotelian syllogistic to modern predicate logic) which inspired
a new semantics. Sometimes a change in the focus of linguistic phenomena inspired
a systematic search for new logical tools (e.g. modal logic) or a reinterpretation of
already available logical tools (e.g. dynamic logic). We hope to have illustrated the main
developments of the bi-directional influences of logical systems and semantics.
12. References
Abaclardus. Pctrus 1956. Dialectica. Ed. L.M. de Rijk. Asscn: van Gorcum.
Almog. Joseph. John Perry & Howard Wettstein (eds.) 1989. Themes from Kaplan. Oxford: Oxford
University Press.
Aristoteles 1992. Analytica priora. Die Lehre vom Schlufi oder erste Analytik. Ed. E. Rolfes.
Hamburg: Meiner.
Bar-Hillel, Yehosliua 1953. A quasi-arithmetical notation for syntactic description. Language 29,
47-58.
Bai wise, Jon & Robin Cooper 1980. Generalized quantifiers and natural language. Linguistics &
Philosophy 4,159-218.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL:The University of Chicago Press
Church, Alonzo 1936. An uiisolvable problem of elementary number theory. American Journal of
Mathematics 58,354-363.
Cresswell, Maxwell & Arnini von Stechow 1982. De re belief generalized. Linguistics & Philosophy
5,503-535.
Davidson, Donald 1967. Truth and meaning. Synthesis 17,304-323.
Frcgc, Gottlob 1966. Grundgesetze der Arithmetik. Hildcshcim: Olms.
Frege, Gottlob 1977. Be griffsschrift und andere Aufsatze. Ed. I. Angelelli. Darmstadt: Wissen-
schaftliche Buchgesellschaft.
Frege, Gottlob 1988. Die Grundlagen der Arithmetik. Eine logisch mathematische Untersuchung
iiber den Begriffder Zahl. Ed. Chr.Thiel. Hamburg: Meiner.
Frege, Gottlob 1994. Funktion, Begriff Bedeutung. Fiinf logische Studien. Gottingen: Vandenhoeck
& Ruprecht.
Geach, Peter 1962. Reference and Generality: An Examination of Some Medieval and Modern
Theories. Ithaca, NY: Cornell University Press
Godel, Kurt 1930. Die Vollstandigkeit der Axiome des logischen Funktionenkalkiils. Monatshefte
fur Mathematik und Physik 37,349-360.
Godel, Kurt 1931. Uber formal unentscheidbare Satze der Principia Mathematica und verwandter
Systeme. Monatshefte fur Mathematik und Physik 38,173-198.
Groenendijk, Jeroen 1999. The Logic of Interrogation. Research report, ILLC Amsterdam.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic Predicate Logic. Linguistics & Philosophy
14,39-100.
Haas-Spohn, Ulrike 1989. Zur Interpretation der Einstellungszuschreibungen. In: E. Falkenberg
(ed.). Wissen, Wahrnehmen, Glauben. Epistemische Ausdriicke und propositionale Einstellungen.
Tubingen: Niemeyer, 50-94.
Hard, David 2000. Dynamic logic. In: D. Gabbay & F Guenthner (eds.). Handbook of
Philosophical Logic, vol. II: Extensions of classical logic, chap. 11.10. Dordrecht: Reidel, 497-604.
Hodges, Wilfried 1991. Logic. London: Penguin.
Kaplan, David 1969. Quantifying in. In: D. Davidson & J. Hintikka (eds.). Words & Objections.
Essays on the Work ofW.V.O. Quine. Dordrecht: Reidel,206-242.
Kaplan, David 1979. On the logic of demonstratives. Journal of Philosophical Logic 8,81-98.
Kaplan, David 1989. Demonstratives. In: J. Almog, J. Perry & H. Wettstein (eds.). Themes from
Kaplan. Oxford: Oxford University Press, 481-563.
10. The influence of logic on semantics 2T5
Kripke, Saul 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of Natural
Language. Dordrecht: Reidel, 253-233 and 763-769.
von Kutschera. Franz 1989. Gottlob Frege. Eine Einfiihrung in sein Werk. Berlin: de Gruyter.
Leibniz, Gottfried Wilhelm 1992. Schriften zur Logik und zur philosophischen Grundle-
gung von Mathmatik und Naturwissenschaft. Ed. H. Herring. Darmstadt: Wissenschaftliche
Buchgesellschaft.
Lenzen, Wolfgang 1990. Das System der Leibnizschen Logik. Berlin: de Gruyter.
Lenzen. Wolfgang 2000. Guilielmi Pacidii Non plus ultra oder: Eine Rekonstruktion des
Leibnizschen Plus-Minus-Kalkiils. Philosophicgeschichte und logische Analyse 3,71-118.
Lindstrom, Per 1966. First order predicate logic with generalized quantifiers. Theoria 32,186-195.
Loar, Brian 1972. Reference and propositional attitudes. The Philosophical Review 81.43-62.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In:
J. Hintikka, J. Moravcsik & P. Suppes (eds.). Approaches to Natural Language. Dordrecht: Reidel.
221-242. Reprinted in: R. Thomason (ed.). Formal Philosophy. Selected Papers of Richard
Montague. New Haven, CT: Yale University Press, 1974,247-270.
Mostowski, Andrzcj 1957. On a generalization of quantifiers. Fundamenta Mathematicae 44,12-36.
Muskens, Reinhard 1996. Combining Montague Semantics and Discourse Representation.
Linguistics & Philosophy 19,143-186.
Nortmann, Ulrich 1996. Modale Syllogismen, mbgliche Welten, Essentialismus. Eine Analyse der
aristotelischen Modallogik. Berlin: de Gruyter.
Quine, Willard van Orman 1953. From a Logical Point of View. Cambridge, MA: Harvard
University Press
Quine, Willard van Orman 1956. Quantifiers and propositional attitudes. The Journal of Philosophy
53,177-187.
Russell, Bertrand 1903. The Principles of Mathematics. Cambridge: Cambridge University Press
Russell, Bertrand 1905. On denoting. Mind 14,479^193.
Russell, Bertrand 1908. Mathematical logic as based on the theory of types. American Journal of
Mathematics 30,222-262.
Russell, Bertrand 1910. Knowledge by acquaintance and knowledge by description. Proceedings of
the Aristotelian Society 11,108-128.
Russell, Bertrand & Alfred N. Whitehead 1910-1913. Principia Mathematica. Cambridge:
Cambridge University Press
Tarski, Alfred 1935. Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica 1,
261^105.
Zalta, Edward 2000. Leibnizian theory of concepts. Philosophiegeschichte und logische Analyse 3,
137-183.
Albert Newen, Bochum (Germany)
Bernhard Schroder, Essen (Germany)
216 II. History of semantics
11. Formal semantics and representationalism
1. Logic, formal languages and the grounding of linguistic methodologies
2. Natural languages as formal languages: Formal semantics
3. The challenge of context-dependence
4. Dynamic Semantics
5. The shift towards proof-thcorctic perspectives
6. Ellipsis: A window on context
7. Summary
8. References
Abstract
This paper shows how formal semantics emerged shortly after the explosion of interest
in formally characterising natural language in the fifties, swiftly replacing all advocacy of
semantic representations within explanations of natural-language meaning. It then charts
how advocacy of such representations has progressively re-emerged in formal semantic
characterisations through the need to model the systemic dependency on context of
natural language construal. First, the logic concepts on which subsequent debates depend are
introduced, as is the formal-semantics (model-theoretic) framework in which meaning in
natural language is defined as a reflection of a direct language-world correspondence. The
problem of context dependence is then set out which has been the primary motivation for
introducing semantic representation, with sketched accounts of pronoun construal
relative to context. It is also shown how, in addition to such arguments, proof-theoretic (hence
structural) concepts have been increasingly used in semantic modelling of natural
languages. Finally, ellipsis is introduced as a novel window on context providing additional
evidence for the need to advocate representations in semantic explanation. The paper
concludes with reflections on how the goal of modelling the incremental dynamics of natural
language interpretation forges a much closer link between competence and performance
models than has hitherto been envisaged.
1. Logic, formal languages, and the grounding
of linguistic methodologies
The meaning of natural-language (NL) expressions, of necessity, is wholly invisible; and
one of the few supposedly reliable ways to establish the interpretation of expressions is
through patterns of inference. Inference is the relation between two pieces of
information such that one can be wholly inferred from the other (if for example I assert that I
discussed my analysis with at least two other linguists, then I imply I have an analysis, that
I have discussed it with more than one linguist, and that I am myself a linguist). Inference
alone is not however sufficient to pinpoint the interpretation potential of NL
expressions. Essential to NL interpretation is the pervasive dependence of expressions on
context for how they arc to be understood, a problem which has been a driving force in the
development of formal semantics.This article surveys the emergence of formal semantics
following growth of interest in the formal characterisation of NL grammars (Chomsky
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 216-241
11. Formal semantics and representationalism 217
1955, Lambek 1958). It introduces logic concepts on which subsequent debates depend
and the formal-semantics framework (the formal articulation of what has been called
truth-conditional semantics), in which NL is claimed to be a logic. It then sets out the
problem of context dependence with its apparent need to add a level of semantic
representation over and above whatever is needed for syntax, and shows how semanticists
have increasingly turned to tools of proof theory (the syntactic mechanisms for defining
inference in logic) to project compositionality of content in natural language. Ellipsis
data are introduced as an additional basis for evaluating the status of representations
specific to characterising NL construal; and the paper concludes with reflections on how
modelling the incremental dynamics of the step-by-step way in which information is built
up in discourse forges a much closer link between competence and performance models
than has hitherto been envisaged.
During the sixties, with emergence of the new Chomskian framework (Chomsky
1965), inquiry into the status of NL semantics within the grammar of a language was
inevitable. There were two independent developments: articulation of semantics as part
of the broadly Chomskian philosophy (Katz & Fodor 1963, Katz 1972), and Montague's
extension of formal-language semantic tools to NL (Thomason (ed.) 1974). The point
of departure for both Chomskian and formal semantics paradigms was the inspiration
provided by the formal languages of logic, though, as we shall see, they make rather
different use of this background.
Logics arc defined for the formal study of inference irrespective of subject matter,
with individual formal languages defined to reflect specific forms of reasoning, modal
logic to reflect modal reasoning, temporal logic to reflect temporal reasoning, etc.
Predicate logic, as its name implies, is defined to reflect forms of reasoning that turn on
subsentential structure, involving quantification, names, and predicates: it is the logic
arguably closest to natural languages. In predicate logic, the grammar defines a system
for inducing an infinite set of propositional formulae with internal predicate-argument
structure over which semantic operations can be defined to yield a compositional account
of meaning for these formulae. Syntactic rules involve mappings from (sub)-formulae
to (sub)-formulae making essential reference to structural properties; semantic rules
assign interpretations to elementary parts of such formulae and then compute
interpretations by mapping interpretations onto interpretations from bottom to top ('bottom-
up') as dictated by the structures syntactically defined. With co-articulation of syntax
and semantics, inference as necessary, truth dependence is then defined syntactically
and semantically, the former making reference solely to properties of structure of the
formulae in question, the latter solely to truth-values assigned to such formulae
(relative to some so-called model). The syntactic characterisation of inference is defined by
rules which map one propositional structure into another, the interaction of this small
set of rules (the proof rules) predicting all and only the infinite set of valid inferences.
We can use this pattern to define what we mean by representationalism, as follows. A
representationalist account is one that involves essential attribution of structure in the
characterisation of the phenomenon under investigation. Any account of natural
language which invokes syntactic structure of natural language strings is providing a
representationalist account of language. More controversially, representationalist accounts
of meaning are those in which the articulation of structure is an integral part of the
account of NL interpretation in addition to whatever characterisation is provided of
syntactic properties of sentence-strings.
218 II. History of semantics
1.1. The Chomskian methodology
In the Chomskian development of a linguistic philosophy for NL grammar, it was the
methodology of grammar-writing for these familiar logics which was adapted to the NL
case. By analogy, NL grammars were defined as a small number of rules inducing an
infinite set of strings, success in characterising a language residing in whether all and
only the wcllformcd sentences of the language arc characterised by the given rule set.
Grammars were to be evaluated not by data of language use or corpus analysis, but by
whether the grammar induces the set of strings judged by a speaker to be grammatical.
In this, Chomsky was universally followed: linguists generally agreed that there should
be no grounding of grammars directly in evidence from what is involved in producing
or parsing a linguistic string. Models of language were logically prior to consideration of
performance factors; so data relevant to grammar construction had to be intuitions of
grammatically as made by individuals with capacity in the language. This commitment
to complete separation of competence-based grammars from all performance
considerations has been the underpinning to almost all linguistic theorising since then, though as
we shall see, this assumption is being called into question.
Following the Chomskian methodology, Katz and colleagues set out analogous
criteria of adequacy for semantic theories of NL: that they should predict relations between
word meaning and sentence meaning as judged by speakers of the language; synonymy
for all expressions having the same meaning; entailment (equivalently, inference) for all
clausal expressions displaying a (possibly asymmetric) dependence of meaning;
ambiguity for expressions with more than one interpretation. The goal was to devise rule
specifications that yield these results, with candidate theories evaluated solely by relative
success in yielding the requisite set of semantic relations/properties. Much of the focus
was on exploring appropriate semantic representations in some internalised language of
thought to assign to words to secure a basis for predicting such entailment relations as
John killed Bill, Bill died, or John is a bachelor, John is an unmarried man (Katz 1972,
Fodor 1981,1983,1998, Pustejovsky 1995, Jackendoff 2002).There was no detailed
mapping defined from such constructs onto the objects/events which the natural language
expression might be presumed to depict.
1.2. Language as logic: The formal-semantic methodology
It was Montague and the program of formal semantics that defined a truth-theoretic
grounding for natural language interpretation. Montague took as his point of
departure both the methodology and formal tools of logic (cf. article 10 (Newen & Schroder)
Logic and semantics). He argued that by extending the syntactic and semantic systems of
modal/temporal predicate logic with techniques defined in the lambda calculus, the extra
flexibility of natural language could be directly captured, with each individual natural
language defined to be a formal language no different in kind from a suitably enriched
variant of predicate logic (see Montague 1970 reprinted in Thomason (ed.) 1974). In
predicate logic, inference is definable from the strings of the language, with invocation of
syntax as an independent level of representation essentially eliminable in being no more
than a way of describing semantic combinatorics; and, in applying this concept to natural
languages, many formal scmanticists adopt similar assumptions (in particular catcgorial
grammar: Morrill 1994). Their theoretical assumptions are thus unlike Chomskian NL
11. Formal semantics and representationalism 2^9
grammars in which syntax, hence representations of structure, is central. Since these two
paradigms made such distinct use of these formal languages in grounding their NL
grammars, an essential background to appreciating the debate is a grasp of predicate logic as
a language.
1.3. Predicate logic: Syntax, semantics and proof theory
The remit of predicate logic is to express inference relations that make essential
reference to sub-propositional elements such as quantifiers and names, extending propo-
sitional logic (with its connectives a ('and'), -> ('if-then', the conditional connective),
v ('or'),—i ('not').The language has a lexicon,syntax and semantics (Gamut 1991).There
is a finite stock of primitive expressions, and a small set of operators licensed by the
grammar: the propositional connectives and the quantifiers V (universal), 3 (existential).
Syntactic rules define the properties of these operators, mapping primitive expressions
onto progressively more complex expressions; and for each such step, there is a
corresponding semantic rule so that meanings of individual expressions can be defined
as recursively combining to yield a formal specification of necessary and sufficient
truth-conditions for propositional formulae in which they are contained.
There are a number of equivalent ways of defining such semantics. In the Montague
system, the semantics is defined with respect to a model defined as (i) a set of
stipulated individuals as the domain of discourse, (ii) appropriate assignment of a denotation
(equivalently extension) for the primitive expressions from that set: individuals from the
domain of discourse for names, sets of individuals for one-place predicate expressions,
and so on. Semantic rules map these assignments onto denotations for composite expres-
sions.Such denotations are based exclusively on assignments given to their parts and their
mode of combination as defined by the syntax, yielding a truth value (True, False) with
respect to the model for each propositional formula. There is a restriction on the remit
of such semantics. Logics are defined to provide the formal vehicle over which inference
independent of subject matter can be defined. Accordingly, model-theoretic semantics
for the expressions of predicate logic takes the denotation of terminal expressions as a
primitive, hence without explanation: all it provides is a formal way of expressing com-
positionality for the language, given an assumption of a stipulated language-denotation
relation for the elementary expressions. With such semantics, relationships of inference
between propositional formulae are definable. There are two co-extensive
characterisations of inference. One, more familiar to linguists, is a semantic characterisation
(entailment). A proposition 0entails a distinct proposition ^if in all models in which 0 is true y/
is true (explicitly a characterisation in terms of truth-dependence). Synonymy, or
equivalence, =, is when this relation is two-way. The other characterisation of inference is
syntactic, i.e. proof-theoretic, defined as the deducibility of one propositional formula from
another using proof rules: all such derivations (proofs) involve individual steps that apply
strictly in virtue of structural properties of the formulae. Bringing syntax and semantics
together, a logic is sound if the proof rules derive only inferences that are true in the
intended models, and complete if it can derive all the true statements according to the
models.Thus, in logic, syntax and semantics are necessarily in a tight systematic relation.
These proof rules constitute some minimal set, and it is interaction between them
which determines all and only the correct inferences expressible in the language. In
natural deduction systems (Fitch 1951,Prawitz 1965), defined to reflect individual local steps
220 II. History of semantics
in any such derivation, each operator has an associated introduction and elimination
rule. Elimination rules map complex formulae onto simpler formulae: introduction rules
map simpler formulae onto a more complex formula. For example, there is Conditional
Elimination (Modus Ponendo Ponens), which given premises of the form 0 and 0 -> y/
licenses the deduction of y/; there is Conditional Introduction, which, conversely, from
the demonstration of a proof of ^on the basis of some assumption 0 enables the
assumption of 0 to be removed, and a weaker conclusion 0 -> ^to be derived. Of the predicate-
logic rules, Universal Elimination licenses the inference of VjcF(jc) to F(a), simplifying
the formula by removing the quantifying operator and replacing its variable with a
constructed arbitrary name. Universal Introduction enables the universal quantifier to be
rc-introduccd into a formula replacing a corresponding formula containing such a name,
subject to certain restrictions. In rather different spirit, Existential Elimination involves
a move from 3xF(x) by assumption to F(a) (a an arbitrary name) to derive some
conclusion 0 that crucially does not depend on any properties associated with the particular
name a.
A sample proof, with annotations as metalevel commentary detailing the rules used,
illustrates a characteristic proof pattern. Early steps of the proof involve eliminating
the quantificational operators and the structure they impose, revealing the propositional
structure simpliciter with names in place of variables; central steps of inference (here
just one) involve rules of propositional calculus; late steps of the proof re-introduce the
quantificational structure with suitable quantifier-variable binding, (h is the proof sign
for valid inference.)
Vx(F(x) -» G(x)),VxF(x) i-VxG(x)
1. Vx(F(x) -> G(x))
2. Vx.F(x)
3. F(a)^G(a)
4. F(a)
5. G(a)
6. VxG(x)
Assumption
Assumption
Universal-Elim, 1
Universal-Elim, 2
Modus Ponens,3,4
Universal-Intro, 5
Fig. 11.1: Sample proof by universal elimination/introduction
These rules provide a full characterisation of inference, in that their interaction yields
all and only the requisite inference relations as valid proofs of the system, with semantic
definitions grounding the syntactic rules appropriately. In sum, inference in logical
systems is characterisable both semantically, in terms of denotations, and syntactically,
in proof-theoretic terms; and, in these defined languages, the semantic and syntactic
characterisations are co-extensive, defined to yield the same results.
1.4. Predicate logic and natural language
Bringing back into the picture the relation between such formal languages and natural
languages, what first strikes a student of NL is that predicate logic is really rather UNlike
natural languages. In NL, quantifying expressions occur in just the same places as other
noun phrases, and not, as in predicate logic, in some position adjoined to fully defined
sentential units (see Vjc(F(jc) -> G(x)) in Fig. 11.1. Nonetheless, there is parallelism
between predicate logic and NL structure in the mechanisms defined for such quantified
11. Formal semantics and representationalism 221_
formulae. The arbitrary names involved in proofs for quantified formulae display a
pattern similar to NL quantifying expressions. In some sense then, NL quantifying
expressions can be seen as closer to the constructs used to model the dynamics of inferential
action than to the predicate-logic language itself.
Confirming this, there is tantalising parallelism between NL quantifiers and terms of
the epsilon calculus (Hilbert & Bernays 1939), the logic defining the properties of these
arbitrary names. In this logic, the epsilon term that corresponds to an arbitrary name
carries a record of the mode of combination of the propositional formula within which
it occurs:
(1) 3xF(x) = F(exF(x))
The formula on the right hand side of the equivalence sign is a predicate-argument
sequence and within the argument of this sequence, there is a required second token
of the predicate F as the restrictor for that argument term (e is the variable-binding term
operator that is the analogue of the existential quantifier). The effect is that the term
itself replicates the content of the overall formula. As we shall see, this internal
complexity to epsilon terms corresponds directly to so-called E-type pronouns (Gcach 1972,
Evans 1980), in which the pronoun appears to reconstruct the whole of some previous
propositional formula, despite only coreferring to an indefinite term. In (2), that is, it is
the woman that was sitting on the steps that the pronoun she is used to refer to:
(2) A woman was sitting on the steps. She was sobbing.
So there is interest in exploring links between construal of NL quantifiers and epsilon
terms (sec von Hcusingcr 1997, Kcmpson, Meyer-Viol & Gabbay 2001, von Hcusingcr&
Kempson (eds.) 2004).There is also, more generally, as we shall see, growth of interest in
exploring links between NL interpretation processes and proof-theoretic
characterisations of inference.
In the meantime, what should be remembered is that predicate logic, with its
associated syntax-semantics correspondence, is taken as the starting point for all formal-
semantic modelling of a truth-conditional semantics for NL; and this leads to very
different assumptions held in the conflicting Chomskian and formal-semantic paradigms
about the status of representations within the overall NL grammar. For Chomskians,
with the parallelism with logic largely residing in the methodology of defining a grammar
yielding predictions that are consistent and complete for the phenomenon being
modelled, the ontology for NL grammars is essentially representationalist. The grammar
is said to comprise rules, an encapsulated body of knowledge acquired by a child (in
part innate, and encapsulated from other devices controlled by the cognitive system).
Whether or not semantics might depend on some additional system of representations
is seen as an empirical matter, and not of great import.To the contrary in the Montague
paradigm, no claims about the relation between language and the mind are made, and
in particular there is no invocation of any mind-internal language of thought. Language
is seen as an observable system of patterns similar in kind to formal languages of logic
(cf. article 10 (Newen & Schroder) Logic and semantics). From this perspective, syntax
is only a vehicle over which semantic (model-theoretic) rules project interpretations for
NL strings. If natural languages are indeed to be seen as formal languages, as claimed,
222 II. History of semantics
the syntax will do no more in the grammar than yield the appropriate pairing of
phonological sequences and denotational contents, so is not the core defining property of a
grammar: it is rather the pairing of NL strings and truth-conditionally defined content
that is its core. This is the stance of categorial grammar: Lambek (1958), Morrill (1994).
Not all grammars incorporating formal semantic insights are this stringent:
Montague's defined grammar for English as a formal language had low-level rules
ensuring appropriate morphological forms of strings licensed by the grammar rules.
But even though formal semanticists might grant that structural properties of language
require an independent system of structure-inducing rules said to constitute NL syntax,
the positing of an additional mentalistic level of representation internal to the
projection of denotational content for the structured strings of the language is debarred in
principle. The core formal-semantics claim is that interpretation for NL strings is
definable over the interpretation of the terminal elements of the language and their mode of
combination as dictated by the syntax and nothing else: this is the compositionality of
meaning principle. Positing any level of representation intermediate between the system
projecting syntactic structure of strings and the projection of content for those strings is
tantamount to abandoning this claim. The move to postulate a level of semantic
representation as part of some supposed semantic component of NL grammar is thus hotly
contested. In short, though separation of competence performance considerations as a
methodological assumption is shared by all, there is not a great deal else for advocates of
the Chomskian and Montague paradigms to agree about.
2. Natural languages as formal languages: Formal semantics
The early advocacy of semantic representations as put forward by Katz and colleagues
(cf. Katz & Fodor 1963, Katz 1972) was unsuccessful (see the devastating critique of
Lewis 1970); and from then on, research in NL semantics has been largely driven by the
Montague program. Inevitably not all linguists followed this direction.Those concerned
with lexical specifications tended to resist formal-semantic assumptions, retaining
articulations of representationalist forms of analysis despite lack of formal-semantic
underpinnings, relying on linguistic, computational, or psycho-linguistic forms of justification
(Pustejovsky 1995, Fodor 1998, Jackendoff 2002).
However, there were in any case what were taken at the time to be good additional
reasons for not seeking to develop a representational alternative to the model-theoretic
program following the pattern of the syntactic characterisation of inference for predicate
logic provided by the rules of proof (though see Hintikka 1974). For any one semantic
characterisation, there are a large number of alternative proof systems for predicate and
propositional calculus, with no possible means of choosing between them, the only
unifying factor being their shared semantic characterisation. Hence it would seem that, if
a single choice for explanation has to be made, it has to be a semantic one. Moreover,
because of the notorious problems in characterising generalized quantifiers such as most
(Barwise & Cooper 1981), it is only model-theoretic characterisations of NL content that
have any realistic chance of adequate coverage.
In such debates, the putative relevance of processing considerations was not even
envisaged. Yet amongst many variant proof-theoretic methods for predicate-logic proof
systems, Fitch-style natural deduction is invariably cited as the closest to the
observable procedural nature of natural language reasoning (Fitch 1951, Prawitz 1965). If
consideration of external factors such as psycholinguistic plausibility had been taken
11. Formal semantics and representationalism 223
as a legitimate criterion for determining selection between alternative candidate proof
systems, this scepticism about the feasibility of selecting from amongst various proof-
theoretic methodologies to construct a representationalist (proof-theoretic) model of
NL inference might not have been so widespread. However, inclusion of performance-
related considerations was, and largely still is, deemed to be illegitimate; and a broad
range of subsequently established empirical results appear to confirm the decision to
retain the NL-as-formal-language methodology. A corollary has been that vocabularies
for syntactic and semantic generalisations had to be disjoint, with only the syntactic
component of the grammar involving representations, semantics notably involving no more
than a bottom-up characterisation of denotational content defined model-theoretically
over the structures determined by the syntax, hence with a non-rcprcscntationalist
account of NL semantics. Nevertheless, as we shall see, representationalist assumptions
within semantics have been progressively re-emerging as these formal-semantic
assumptions have led to ever increasing postulations of unwarranted ambiguity, suggesting that
something is amiss in the assumption that the semantics of NL expressions are directly
encapsulated in their assigned denotational content.
2.1. The Lambda calculus and NL semantics
There was one critical tool which Montague utilised to substantiate the claim that
natural languages can be treated as having denotational semantics read off their syntactic
structure: the lambda calculus (cf. also article 33 (Zimmermann) Model-theoretic
semantics).
The lambda calculus is a formal language with a function operator X which binds
variables in some open formula to yield an expression that denotes a function from the
type of the variable onto the type of the formula. For example F(x), an open predicate-
logic formula, can be used as the basis for constructing the expression Xjc[F(jc)], where
the lambda-term is identical in content to the one-place predicate expression F (square
brackets for lambda-binding visually distinguish lambda binding and quantifier binding).
Thus the formula Xjc[F(jc)] makes explicit the functional nature of the predicate term
F, as does its logical type (e, t) (equivalently e —> t): any such expression is a predicate
that explicitly encodes its denotational type mapping individual-denoting expressions
onto propositional formulae. All that is needed to be able to define truth conditions
over a syntactically motivated structure is to take the predicate logic analogue for any
NL quantification-containing sentence, and define whatever processes of abstraction are
needed over the predicate-expressions in the agreed predicate-logic representation of
content to yield a match with requirements independently needed by the NL
expressions making up that sentence. For example, on the assumption that V'x{Student{x) -»
Smoke(x)) is an appropriate point of departure for formulating the semantic content
of Every student smokes, two steps of abstraction can be applied to that predicate-
logic formula, replacing the two predicate constants with appropriately typed variables
and lambda operators to yield the term XPXQ[\/x(P(x) -» (?(•*))] as specifying the
lexical content of every. This term can then combine first with the term Student (to
form a noun-phrase meaning) and then with the term Smokes (as a verb-phrase
meaning) to yield back the predicate logic formula V'x(Student(x) -» Smokes{x)). This
derivation can be represented as a tree structure with parallel syntactic and semantic
labelling:
II. History of semantics
(3) Vx (Student(x) -> Smokes(x)):S
\PkQ[Vx(P(x) -» Q(x))]:DET
The consequence is that noun phrases are not analysed as individual-denoting
expressions of type e (for individual), but as higher-type expressions ((e ->/)-> 0- Since the
higher-typed formula is deducible from the lower-typed formula, such a lifting was taken
to be fully justified and applied to all noun-phrase contents, thereby in addition
reinstating syntactic and semantic parallelism for these expressions. This gave rise to the
generalised quantifier theory of natural language quantification (Barwise & Cooper
1981, article 43 (Keenan) Quantifiers). The surprising result is the required
assumption that the content attributable to the VP is semantically the argument (despite
whatever linguistic arguments there might be that the verb is the syntactic head in its
containing phrase), and the subject expresses the functor that applies to it, mapping it
into a propositional content, so semantic and syntactic considerations appear no longer
to coincide.
This methodology of defining suitable lambda terms that express the same content
as some appropriate predicate-logic formula was extended across the broad array of
NL structures (with the addition of possible world and temporal indices in what was
called intensional semantics). Hence the demonstrable claim that the formal semantic
method captures a concept of compositionality for natural language sentences while
retaining predicate-logic insights into the content to be ascribed, a formal analysis which
also provides a basis for characterisations of entailment, synonymy, etc. With syntactic
and semantic characterisations of formal languages defined in strictly separate
vocabulary, albeit in tandem, the Montague methodology for natural languages imposes
separation of syntactic and semantic characterisations of natural language strings, the latter
being defined exclusively in terms of combinatorial operations on denotational
contents, with any intermediate form of representation being for convenience of exegesis
only. Montague indeed explicitly demonstrated that the mapping onto intermediate
(intensional) logical forms in articulating model-theoretic meanings was eliminable.
Following predicate-logic semantics methodology, there was little concern with the
concept of meaning for elementary expressions. However, relationships between word
meanings were defined by imposing constraints on possible denotations as meaning
postulates (following Carnap 1947). For example, the be of identity was defined so that
extensions of its arguments were required to be co-extensive across all possible worlds:
conversely, the verbs look for and find were defined to ensure that they did not have
equivalent denotational contents.
3. The challenge of context-dependence
Despite the dismissal by formal semanticists in the seventies of any form of
representation within a semantic characterisation of interpretation, it was known rightaway that the
model-theoretic stance as a program for natural language semantics within the grammar
is not problem-free. One major problem is the inability to distinguish synonymous
11. Formal semantics and representationalism 225
expressions when within prepositional attitude reports (John believes groundhogs are
woodchucks vs John believes groundhogs are groundhogs), indeed any necessary truths, a
problem which led to invoking structured meanings (Cresswell 1985, Lappin & Fox 2005).
Another major problem is the context-dependency of NL construal, a truth-theoretic
semantics for NL expressions having to be defined relative to some concept of context
(cf. article 89 (Zimmermann) Context dependency). This is a foundational issue which
has been a recurrent concern amongst philosophers over many centuries (cf. article 9
(Nerlich) Emergence of semantics for a larger perspective on the same problem).
The pervasiveness of the context-dependence of NL interpretation was not taken to
be of great significance by some, in Lewis (1970) the only provision being the
stipulation of an addition to the model of an open-ended set of indices indicating objects in the
utterance context (speaker, hearer, and some finite list of individuals). Nevertheless, the
problem posed by context was recognised as a challenge; and an early attempt to meet
it, sustaining a core Montagovian conception of denotational semantics, while
nevertheless disagreeing profoundly over details, was proposed by Barwise & Perry (1983).
Their proposed enriched semantic ontology, Situation Semantics, included situations and
an array of partial semantic constructs (resource situations, infons, etc.) with sentence
meanings requiring anchoring in such situations in order to constitute contents with
context-determined values. Inference relations were then defined in terms of relations
between situations, with speakers being attuned to such relations between situations (see
the subsequent exchange between Fodor and Barwise on such direct interpretation of
NL strings: Fodor 1988, Barwise 1989).
3.1. Anaphoric dependencies
Recognition of the extent of this problem emerged in attempts to provide principled
explanations for how pronouns are understood. Early on, Partee (1973) had pointed out
that pronouns can be interpreted anaphorically, indexically or as a bound-variable. In
(4), the pronoun is subject to apparent indexical construal (functioning like a name as
referring to some intended object from the context); but in (5) it is interpreted like a
predicate-logic variable with its value determined by some antecedent quantifying
expression, hence not from the larger context:
(4) She is tired.
(5) Every woman student is panicking that she is inadequate.
Yet, as Kamp and others showed (Evans 1980, Kamp 1981, Kamp & Rcylc 1993), this is
just the tip of the iceberg, in that the phenomenon of anaphoric dependence is not restrict-
able to the domain provided by any one NL sentence in any obvious analogue of
predicate-logic semantics. There are for example E-type uses of that same pronoun in which it
picks up its interpretation from a quantified expression across a sentential boundary:
(6) A woman student left. She had been panicking about whether she was going to
pass.
If natural languages matched predicate-logic patterns, not only would we apparently be
forced to posit ambiguity as between indexical and bound-variable uses of pronouns, but
226 II. History of semantics
one would be confronted with puzzles that do not fall into either classification: this pronoun
appears to necessitate positing a term denoting some arbitrary witness of the preceding
propositional formula, in (6) a name arbitrarily denoting some randomly picked individual
having the properties of being a student, female, and having left. Such NL construal is
directly redolent of the epsilon terms underpinning arbitrary names of natural deduction
proofs, carrying a history of the compilation of content from the sentence providing the
antecedent (von Heusinger 1997). But this just adds to the problem for it seems there are
three different interpretations for one pronoun, hence ambiguity. Moreover, as Partee had
pointed out, tense specifications display all the hallmarks of anaphora construal, able to
be interpreted either anaphorically, indexically or as a bound variable, indeed with E-type
effects as well, so this threat of proliferating ambiguities is not specific to pronouns.
4. Dynamic Semantics
The responses to this challenge can all be labelled dynamic semantics (Muskens, van
Benthem & Visser 1997, Dekker 2000).
4.1. Discourse Representation Theory
Discourse Representation Theory (DRT: Kamp 1981, Kamp & Reyle 1993) was the first
formal articulation of a response to the challenge of modelling anaphoric dependence in
a way that enables its various uses to be integrated. Sentences of natural language were
said to be interpreted by a construction algorithm for interpretation which takes the
syntactic structure of a string as input and maps this by successive constructional steps onto
a structured representation called a Discourse Representation Structure (DRS), which
was defined to correspond to a partial model for the interpretation of the NL string. A
DRS contains named entities (discourse referents) introduced from NL expressions, with
predicates taking these as arguments, the sentence relative to which such a partial model
is constructed being defined to be true as long as there is at least one embedding of the
DRS into the overall model. For example, for a simple sentence-sequence such as (7), the
construction algorithm for building discourse representation structure induces a DRS for
the interpretation of the first sentence in which one discourse referent is entered into the
DRS corresponding to the name and one for the quantifying expression, together with a
set of predicates corresponding to the verb and nouns.
(7) John loves a woman. She is French.
(8) HT7
The DRS in (8) might then be extended, continuing the construal process for the overall
discourse by applying the construction algorithm to the second sentence to yield the
expanded DRS:
John=x
loves(y)(x)
woman(y)
11. Formal semantics and representationalism
(9)
x, y,z
John=x
loves(y)(x)
woman(y)
z = y
French(y)
To participate in such a process, indefinite NPs are defined as introducing a new discourse
referent into the DRS, definite NPs and pronouns require that the referent entered
into the DRS be identical to some discourse referent already introduced, and names
require a direct embedding into the model providing the interpretation. Once
constructed, the DRS is evaluated by its embeddability into the model. Any such resulting
DRS is true in a model if and only if there is at least one embedding of it within the
overall model.
Even without investigating further complexities that license the embeddability of one
DRS within another and the famous characterisation of If a man owns a donkey, he beats
it (cf. article 37 (Kamp & Reyle) Discourse Representation Theory), an immediate bonus
for this approach is apparent. The core cases of the so-called E-type pronouns fall into
the same characterisation as more obvious cases of co-reference: all that is revised is
the domain across which some associated quantifying expression can be seen to bind.
It is notable in this account that there is no structural reflex of the syntactic properties
of the individual quantifying determiner: indeed this formalism was among the first to
come to grips with the name-like properties of such quantified formulae (cf. Fine 1984).
It might of course seem that such a construction process is obliterating the difference
between names, quantifying expressions, and anaphoric expressions, since all lead to the
construction of discourse referents in a DRS. But, as we have seen, these expressions are
distinguished by differences in the construction process. The burden of explanation for
NL expressions is thus split: some aspect of their content is characterised by the mode of
construction of the intervening DRS, some of it by the embeddability conditions of that
structure into the overall model.
The particular significance of DRT lies in the Janus-faced properties of the DRS's
defined. On the one hand, a DRS corresponds to a partial model (or more weakly, is a set
of constraints on a model), defined as true if and only if it is embeddable in the overall
model (hence is the same type of construct). On the other hand, specific structural
properties of the DRS may be invoked in defining antecedent-pronoun relations, hence such
a level is an essential intermediary between NL string and the denotations assigned to its
expressions. Nonetheless, this level has a fully defined semantics constituting its
embeddability into an overall model, so its properties are explicitly defined. There is a second
sense in which DRT departs from previous theories. In providing a formal articulation
of the incremental process of how interpretation is built up relative to some previously
established context, there is implicit rejection of the methodology disallowing
reference to performance in articulations of NL competence. Indeed the DRS construction
algorithm is a formal reflection of sentence-by-sentence accumulation of content in a
discourse (hence the term Discourse Representation Theory). So DRT not only offers
a representationalist account of NL meaning, but one reflecting the incrementality of
utterance processing.
228 II. History of semantics
4.2. Dynamic Predicate Logic
This account of anaphoric resolution sparked immediate response from proponents of
the model-theoretic tradition. For example, Groenendijk & Stokhof (1991) argued that
the intervening construct of DRT was both unnecessary and illicit in making composi-
tionality of NL expressions definable not directly over the NL string but only via this
intermediate structure. Part of their riposte to Kamp involved positing Dynamic
Predicate Logic (DPL) with two variables for each quantifier and a new attendant semantics,
so that once one of these variables gets closed off in ways familiar from predicate-
logic binding, the second remains open, bindable by a quantifying mechanism introduced
as part of the semantic combinatorics associated with some preceding string, hence
obtaining cross-sentential anaphoric binding without any ancillary level of
representation as invoked in DRT (cf. article 38 (Dekker) Dynamic semantics). Both the logic and
its attendant semantics were new. Nevertheless, such a view is directly commensurate
with the stringently model-theoretic view of context-dependent interpretation for
natural language sentences provided by e.g. Stalnaker (1970, 1999): in these systems,
progressive accumulation of interpretation across sequences of sentences in a discourse is
seen exclusively in terms of intersections of sets of possible worlds progressively
established, or rather, to reflect the additional complexity of formulae containing unbound
variables, intersection of sets of pairs of worlds and assignments of values to variables
(see Heim 1982 where this is set out in detail).
In the setting out of DPL as a putative competitor over DRT in characterising the
same data without any level of representation, there was no attempt to address the
challenge which Kamp had brought to the fore in articulating DRT, that of characterizing
how anaphoric expressions contribute to the progressive accumulation of interpretation:
on the DPL account, the pronoun in question was simply presumed to be coindexed with
its antecedent. Notwithstanding this lack of take up of the challenge which DRT was
addressing, there has been continuing debate since then as to whether any intervening
level of representation is justified over and above whatever syntactic levels are posited to
explain syntactic properties of natural language expressions. Examples such as(10)-(ll)
have been central to the debate (Kamp 1996, Dekker 2000):
(10) Nine of the ten marbles are in the bag. It is under the sofa.
(11) One of the ten marbles isn't in the bag. It is under the sofa.
According to the DRT account, the reason why the pronoun it cannot successfully be
used with the interpretation that it picks up on the one marble not in the bag in (10) is
because such an entity is only inferrable from information given by expressions in the
previous sentence: no representation of any term denoting such an entity in (10) has
been made available by the construction process projecting a discourse representation
structure on the basis of which the truth conditions of the previous sentence are
compiled. So though in all models validating the truth of (10) there must be a marble not
in the bag described, there cannot be a successful act of reference to such an individual
in using the pronoun. By way of contrast, in (11), despite its being true in all the same
models that (10) is true, it is because the term denoting the marble not in the bag is
11. Formal semantics and representationalism 229
specifically introduced that anaphoric resolution is successful. Hence, it is argued, the
presence of an intermediate level of representation is essential.
4.3. The pervasiveness of context-dependence
Despite Kamp's early insight (Kamp 1981) that anaphora resolution was part of the
construction for building up interpretation, this formulation of anaphoric dependence
was set aside in the face of the charge that DRT did not provide a compositional account
of natural language semantics, and alternative generalised-quantifier accounts of
natural language quantification were provided reinstating compositionality of content over
the NL string within the DRT framework even though positing an intermediate level of
representation (van Eijck & Kamp 1997).
Despite great advances made by DRT, the issue of how to model context dependence
continues to raise serious challenges to model-theoretic accounts of NL content.The
different modes of interpretation available for pronouns extend far beyond a single syntactic
category. Definite NPs, demonstrative NPs, and tense all present a three-way ambiguity
between indexical, bound-variable and E-type forms of construal; and this ambiguity, if
not reducible to some general principle, requires that the language be presumed to
contain more than one such expression, with many discrete expression-denotation pairings.
Such ambiguity is a direct consequence of the assumption that an articulation of meaning
of an expression has to be in terms of its systematic contribution to truth conditions of
the sentences in which it occurs. Such distinct uses of pronouns do indeed need to be
expressed as contributing different truth conditions, whether as variable, as name, or
as some analogue of an epsilon term; but the very fact that this ambiguity occurs in all
context-dependent expressions in all languages indicates that something systematic
is going on: this pattern is wholly unlike the accidental homonymy typical of lexical
ambiguity.
Furthermore, it is the non-existence of context-dependence in interpretation of
classical logic which lies at the heart of the difference between the two types of system. In
predicate logic, by definition, there is no articulation of context or how interpretation is
built up relative to that.The phenomenon under study is that of inference, and the formal
language is defined to match such patterns (and not the way in which the formulae in
question might themselves have been established). Natural languages are however not
purpose-built systems; and context-dependence is essential to their success as an
economical vehicle for expressing arbitrarily rich pieces of information relative to arbitrarily
varying contexts. This perspective is buttressed by work in the neighbouring disciplines
of philosophy of language and pragmatics. The gap between intrinsic content of words
and their interpretation in use had been emphasised by the later Wittgenstein (1953),
Austin (papers collected in 1961), Grice (papers collected in 1989), Sperber & Wilson
(1986/1995), Carston (2002). The fact that context is essential to NL construal imposes
an additional condition of adequacy on accounts of NL content: a formal
characterisation of the meaning of NL expressions needs to define both the input which an
individual NL expression provides to the interpretation process and the nature of contexts
with which such input interacts. Answers to the problem of context formulation cannot,
however, be expected to come from the semantics of logics. Rather, we need some basis
for formulating specifications that UNDER-determine any assigned content. Given the
230 II. History of semantics
needed emphasis on underspecification, on what it means to be part-way through a
process whereby some content is specifiable only as output, it is natural to think in terms of
representations, or, at least, in terms of constraints on assignment of content.This is now
becoming common-place amongst linguists (Underspecified Discourse Representation
Semantics (UDRT), Reyle 1993, van Leusen & Muskens 2003). Indeed Hamm, Kamp
& van Lambalgen (2006) have argued explicitly that such linguistically motivated
semantic representations have to be construed within a broadly computational, hence
representationalist theory of mind.
5. The shift towards proof-theoretic perspectives
It might seem as though, with representationalism in semantics such a threat to the
fruitfulness of the Montague paradigm, any such claim would be deluged with
counterarguments from the formal-semantics community (Dekker 2000, Muskens 1996). But,
to the contrary, use of representationalist tools is continuing apace in both orthodox
and less orthodox frameworks; and the shift towards more representational modes of
explanation is not restricted to anaphora resolution.
5.1. The Curry-Howard isomorphism
An important proof-theoretic result pertaining to the compositionality of NL content
came from the proof that the lambda calculus and type deduction in intuitionistic logic
are isomorphic (intuitionistic logic is weaker than classical logic in that several classical
tautologies do not hold). This is the so-called Curry-Howard isomorphism. Its relevance
to linguists, which I define ostensively by illustration, is that the fine structure of how
compositionality of content for NL expressions is built up can be represented proof-
theoretically, making use of this isomorphism (Morrill 1994, Carpenter 1997).
The isomorphism is displayed in proofs of type deduction in which propositions are
types, with a label demonstrating how that proof type as conclusion was derived. The
language of the labels is none other than the lambda calculus, with functional application
in the labels corresponding to type deduction on the formula side. So in the label, we
might have a lambda term, e.g. Xx[Sneeze(x)], and in the formula its corresponding type
e -> /.The compositionality of content expressible through functional application defined
over lambda terms can thus be represented as a step of natural deduction over labelled
propositional formulae, with functional application on the labels and modus ponens on
the typed formula. For example, a two-place predicate representable as XxXy[See(x)(y)]
can be stated as a label to a typed formula:
(12) XxXy[See(x)(y)]:e^(e^t)
This, when combined with a formula
(13) Mary.e
yields as output:
(14) Xy[See{Mary){y)\ :e^t
11. Formal semantics and representationalism 231_
And this in its turn when paired with
(15) John :e
yields as output by one further step of simultaneous functional application and
Conditional Elimination:
(16) See(Mary)(John) : t
So compilation of content for the string John sees Mary can be expressed as a labelled
deduction proof (12)—(16), reflecting the bottom-up compilation of content for the NL
expression.
5.2. Proof theory as syntax: Type Logical Grammar
This method for using the fine structure of natural deduction as a means of
representing NL compositionality has had very wide applicability, in categorial grammar and
elsewhere (Dalrymple, Shieber & Pcreira 1991, Dalrymple (ed.) 1999). In categorial
grammar in particular (following Lambek 1958 who defined a simple extension of the
lambda calculus with the incorporation of two order-sensitive operators indicating
functional application with respect to some left/right placed argument), the Curry-Howard
isomorphism is central, and with later postulation of modal operators to define syntactic
domains (Morrill 1990,1994), such systems have been shown to have expressive power
sufficient to match a broad array of variation expressible in natural languages (leaving on
one side the issue of context dependence). Hence the categorial-grammar claim that NL
grammars are logics with attendant proof systems.These are characteristically presented
either in natural-deduction format or in its meta-level analogue, the sequent calculus.
The advantage of such systems is the fine level of granularity they provide for reflecting
bottom-up compositionality of content, given suitable (higher) type assignments to NL
expressions. In being logics, natural languages are presumed to have syntax and
semantics defined in tandem, with the proof display being no more than an elegant display of
how prosodic sequences (words) are paired projected by steps of labelled type deduction
onto denotational contents. In particular, no representationalist assumptions are made
vis a vis either syntax or logical representations (see Morrill 1994 for a particularly clear
statement of this strictly dcnotationalist commitment).
Nonetheless in more recent work, Morrill departs from at least one aspect of this
stringent categorial grammar proof-theoretic ideology. In all categorial grammar
formalisms, as indeed in many other frameworks, there is strict separation of competence
and performance considerations with grammar-formalisms only evaluated in terms of
their empirical predictive success; yet Morrill & Gavarro (2004) and Morrill (2010) argue
that an advantage of the particular categorial-grammar characterisation adopted (with
linear-logic proof derivations), is the step-wise correspondence of individual steps in
the linear-logic derivation to measures of complexity in processing the NL string, this
being a bonus for the account. Thus representational properties of grammar-defined
derivations would seem at least evidenced by performance data, even if such proof-
theoretically defined derivations are taken to be eliminable as a core property of the
grammar formalism itself.
232
II. History of semantics
5.3. Type Theory with Records
A distinct proof-theoretic equivalent of Montague's PTQ grammar was defined by
Ranta (1994), who adopted as a basis for his framework the methodology of Martin-
Lofs (1984) proof theory for intuitionistic type theory. By way of introducing the
Martin-Lof ontology, remember how in natural deduction proofs there are metalevel
annotations (section 1.4), but these arc only partially explicit: many do not record
dependencies between arbitrary names as they are set up. The Martin-Lof methodology to
the contrary requires that all such dependencies are recorded, and duly labelled; and
Ranta used this rich attendant labelling system to formulate analyses of anaphora
resolution and quantification, and from these he established a structural concept of context,
notably going beyond what is achievable in categorial grammar formalisms.
Furthermore, such explicit representations of dependency have been used to establish a fully
explicit proof-theoretic grounding for generalized quantifiers, from which an account of
anaphora resolution follows incorporating E-type pronouns as a subtype (Ranta 1994,
Piwek 1998, Fernando 2002).
In Cooper (2006), the Ranta framework is taken a step further, yielding the Type
Theory with Records framework (TTR). Cooper uses a concept of record and record-
type to set out a general framework for modelling both context-dependent
interpretation and the intrinsic undcrspccification that NL expressions themselves contribute
to the interpretation process. Echoing the DRT formulation, the interpretation to be
assigned the sentence A man owns a donkey is set out as taking the form of the record-
type (Cooper 2005) (the variables in these formulations, as labels to proof terms, are like
arbitrary names, expressing dependencies between one term and another:
(17) Vx:Ind
c, :man{x)
y And
c2 :donkey{y)
c, :own(y)(x)
x,y are variables of individual type,
c, is of the type of proof that x is a man, (hence a proof that is dependent on some
proof of jc),
c2 is of the type of proof that y is a donkey,...
A record of that record type would be some instantiation of variables e.g.:
11. Formal semantics and representationalism
(18)
x =a
c. =/>.
y=b
C2=/>2
.C>=P>.
px a proof of'man(a)',
p2 a proof of 'donkey(y)\
and so on.
This is a proof-theoretic reformulation of the situation-theory concepts of infon
(= situation-type) and situation. A record represents some situation that provides
values that make some record-type true. The concept of record-type corresponds to
sentence meanings in abstraction from any given context/record: a sentence meaning
is a mapping from records to record-types. It is no coincidence that such dependency
labelling has properties like that of epsilon terms, since, like them, the terms that
constitute the labels to the derived types express the history of the mode of combination.
The difference between this and DRT lies primarily in the grounding of records and
record-types in proof-theoretic rather than model-theoretic underpinnings (Cooper
2006). Like DRT, this articulation of record theory with types as a basis for NL
semantics leaves open the link with syntax. However, in recent work, Ginzburg and Cooper
have extended the concept of records and record-types yet further to incorporate full
details of linguistic signs (with phonological, syntactic and semantic information - a
multi-level representational system Ginzburg & Cooper 2004, cf. article 36 (Ginzburg)
Situation Semantics).
6. Ellipsis: A window on context
So far, the main focus has been anaphora construal, but ellipsis presents another window
on context. Elliptical fragments are those where there is only a fragment of a clause:
words and whole phrases can simply be omitted when the context fully determines what
is needed to complete the interpretation.The intrinsic interest of such elliptical fragments
is that they provide evidence that very considerable richness of labelling is required in
modelling NL construal. Like pronominal anaphora, ellipsis displays a huge diversity.
Superficially, the phenomenon might seem to be amenable to some more sophisticated
variant of anaphora construal.There are, that is, what look like indexical construals, and
also analogues of coreference and bound-variable anaphora:
(19) (Mother to a toddler stretching up to the stove above their head)
Johnny, don't.
(20) John stopped in time but I didn't.
(21) Everyone who submitted their thesis without checking it wished that the others
had too.
To capture the array of effects illustrated in (20)-(21), there is a model-theoretic
account of ellipsis (Dalrymple, Shieber & Pereira 1991) whose starting point is to take
234 II. History of semantics
the model-theoretic methodology, and from the propositional content provided by the
antecedent clause, to define some lambda term to isolate an appropriate predicate for
combining with the NP term provided by the fragment. For (20), one might define the
term Xjc[stop-in-time(jc)]; for (21) the term Xjt[3)>(thesis(y) a submit(y)(jt) a -icheck(y)
(jc))]. However, even allowing for complexities of possible sequences of quantificational
dependencies replaced at the ellipsis site as in (21), the rebinding mechanism has to
be yet more complex, as what is rebound may be not merely some subject value and
whatever quantified expressions are dependent on that, but cases where the subject
expression is itself dependent on some expression within its restrictor, so that an even
higher-order form of abstraction is required:
(22) The man who arrested Joe failed to read him his rights, as did the man who arrested
Sue.
This might seem to be an echo of the debate between DRT and DPL, with the higher-
order account having the formal tools to express the parallelism of construal carried
over from the antecedent clause to the ellipsis site as in (20)-(21), without requiring any
invocation of a representation of content. However, there are many cases of ellipsis, as
syntacticians have demonstrated, where only an account which invokes details of some
assigned structure to the string can explain the available interpretations (Fiengo & May
1994, Merchant 2004). For example, there are morphological idiosyncracies displayed
by individual languages that surface as restrictions on licensed elliptical forms. In a
case-rich language such as German, for example, the fragment has to occur with the case
form it would be expected to have in a fully explicit follow-on to the antecedent clause
(Greek also displays exactly the same type of requirement). English, where case is very
atrophied, imposes no such requirement (Morgan 1973, Ginzburg & Cooper 2004):
(23) Hans will nach London gehen. Ich/*mich auch.
(24) Hans wants to go to London. Me too.
This is not the kind of information which a higher-order unification account can express,
as its operations arc defined on the model-theoretic construal of the antecedent
conjunct, not over morphological sequences. There are many more such structure-particular
variations indicating that ellipsis would seem to have to be defined over representations
of structure, hence to be analysed as a syntactic phenomenon, in each case defined over
some suitable approximate correspondent to the surface strings (there has to be this
caveat as there are well-known cases of vehicle-change where what is reconstructed is not
the linguistic form, Fiengo & May 1994):
(25) John has checked his thesis notes carefully, but I haven't.
However, there are puzzles for both types of account. Fragments may occur in dialogue
for which arguably only a pragmatic enrichment process can capture the effects (not
based either on structured strings or on their model-theoretic contents, Stainton 2006):
(26) A (leaving hotel):The station?
B (receptionist): Left out of the door.Then second on the right.
11. Formal semantics and representationalism 235
For reasons such as this, ellipsis continues to receive a great deal of attention, with no
single account apparently able single-handedly to match the fine-grained nature of the
cross-conjunct/speaker patterning that ellipsis construal can achieve. The conclusion
generally drawn is that the folk intuition about ellipsis is simply not expressible. Indeed
the added richness in TTR of combining Ranta's type-logical formalism with the feature-
matrix vocabulary of Head Driven Phrase Structure Grammar (HPSG: Sag, Wasow &
Bender 2002) to obtain a system combining independent semantic, syntactic and
morphological paradigms was at least in part driven by the observed 'fractal heterogeneity'
of ellipsis (Ginzburg & Cooper 2004).
However, there is one last chance to capture such effects in an integrated way, and this
is to posit a system of labelling recording individual steps of information build-up with
whatever level of granularity is needed to preserve the idiosyncracies in the individual
steps. Ellipsis construal can then be defined by making reference to such records. This
type of account is provided in Dynamic Syntax (Kempson, Meyer-Viol & Gabbay 2001,
Cann, Kempson & Marten 2005, Purver, Cann & Kempson 2006, Cann, Kempson &
Purver 2007).
6.1. Dynamic Syntax: A reflection of parsing dynamics as a basis
for syntax
Dynamic Syntax (DS) models the dynamics of how interpretation is incrementally
built up following a parsing dynamics, with progressively richer representations of
content constructed as words are processed relative to context. This articulation of the
stepwise way in which interpretation is built up is claimed to be all that is needed for
explaining NL syntax: representations constructed are simultaneously the vehicle for
explaining syntactic distributions and a vehicle for representing how interpretation
is built up.
The methodology adopts a rcprcscntationalist stance vis a vis content (Fodor 1983).
Predicate-argument structures are represented in a tree format with the assumption
of progressive update of partial tree-representations of content. Context, too, is
represented in the same terms, evolving in tandem with each update. This concept of
structural growth totally replaces the semantically blind syntactic specifications characteristic
of such formalisms as Minimalism (Chomsky 1995) and HPSG. It is seen as starting
from an initial one-node tree stating the goal of the interpretation process to establish
some propositional formula (the tree representation to the left of the »-» in (27)).Then,
using both parse input and information from context, some propositional formula is
progressively built up (the tree representation to the right of the •-» in (27)).
(27) Parsing John upset Mary
?t, 0 ~ Upsef(Mary') (John'): t, 0
Upset'(Mary'): e -»t
Mary': e Upset": e -»(e -»t)
236 II. History of semantics
The output is a fully decorated tree whose topnode is a representation of some
proposition expressed with its associated type specification, and each dominated node has
a concept formula, e.g. John' representing some individual John, and an indication of
what semantic type that concept is. The primitive types are types e and t as in formal
semantics but construed syntactically as in the Curry-Howard isomorphism, labelled type
deduction determining the decorations on non-terminal nodes once all terminal nodes
are fixed and suitably decorated with formula values.There is invariably one node under
development in any partial tree, as indicated by the pointer 0. So a parse process for (27)
would constitute a transition across partial trees, the substance of this transition turning
on how the growth relation •-» is to be determined by the word sequence. The concept
of requirement IX for any decoration X is central. Decorations on nodes such as ?/, ?e,
?(e -> t), etc. express requirements to construct formulae of the appropriate type on
the nodes so decorated, and these requirements drive the subsequent tree-construction
process. A string can then be said to be wellformed if and only if there is at least one
derivation involving monotonic growth of partial trees licensed by computational (general),
lexical and pragmatic actions following the sequence of words that yields a complete tree
with no requirements outstanding.
Just as the concept of tree growth is central, so too is the concept of procedure for
mapping one partial tree to another. Individual transitions from partial tree to partial
tree are all defined as procedures for tree growth. The formal system underpinning
the partial trees that arc constructed is a logic of finite trees (LOFT: Blackburn &
Meyer-Viol 1994). There are two basic modalities, (i) and (T), and Kleene * operators
defined over these relations, e.g. (\^)Tn(a) indicating that somewhere dominating this
node is the tree-node Tn(a) (a standard tree-theoretic characterisation of'dominate').
The procedures in terms of which the tree growth processes are defined then involve
such actions as make((i)),go((i)),make((i*)),put(Ar) (for any decoration X),etc.This
applies both to general constraints on tree growth (hence the syntactic rules of the
system) and to specific tree update actions constituting the lexical content of words. So
the contribution which a word makes to utterance interpretation is expressed in the
same vocabulary as general structural growth processes, and is invariably more than
just a concept specification: it is a sequence of actions developing a sub-part of a tree,
possibly building new nodes, and assigning them decorations such as formula and type
specifications.
Of the various concepts of undcrspccification, two arc central. On the one hand, there
is underspecification of conceptual content, with anaphoric expressions being defined as
adding to a node in a tree a place-holding metavariable of a given type as a provisional
formula value to be replaced by some fixed value which the immediate context makes
available.This is a relatively uncontroversial approach to anaphora construal,equivalent to
formulations in many other frameworks (notably DRT and TTR). On the other hand, there
is underspecification and update of structural relations, in particular replacing all
movement or feature-passing accounts of discontinuity effects, with introduction of an unfixed
node, one whose structural relation is not specified at the point of construction, and whose
value must be provided from the construction process. Formally the construction of a new
node within a partial tree is licensed from some node requiring a propositional type, with
that relation being characterised only as that of domination (weakly specified tree relations
are indicated by a dashed line with 7>i(0)asthe rootnode: this is step (i) of (28).The update
to this relatively weak tree-relation for English, lacking as it does any case specifications,
11. Formal semantics and representationalism 237
becomes possible only once having parsed the verb.This is the unification step (ii) of (28),
an action which satisfies both type and structure update requirements:
(28) Parsing Mary, John upset
ft, Tn(0)
J Mary1: e
Mary'.e, <T*>7n(0)
<T*)7n(0),
0
step (i)
This process, like the substitution operation associated with pronoun construal feeds into
the ongoing process of creating a completed tree, in this case by steps of labelled type
deduction.
It might seem that all such talk of (partial) trees as representations of content could
not in principle simultaneously serve as both a syntactic explanation and a basis for
semantic interpretation, because of the problems posed by quantification, known to
necessitate a globally defined process expressing scope dependencies between
quantifying expressions, and generally agreed to involve mismatch between logical and
syntactic category (see section 2.1). However, quantified expressions are taken to map onto
epsilon terms, hence of type e (Kempson, Meyer-Viol & Gabbay 2001, ch.7); and in
having a restrictor which reflects the content of the proposition in which they are
contained, the defining property of epsilon terms is that they grow incrementally, as
additional predicates are added to the term under construction (see section 1.4). And these
epsilon terms, metavariables and tree relations, all under development, all interact to
yield the complex interactions more familiarly seen as scope and anaphoric sensitivities
to long-distance and other discontinuity effects.
The bonus of this framework for modelling ellipsis is that by taking not merely
structured representations of content as first-class citizens but the procedures used to
incrementally build up such stucture, ellipsis construal is expressible in an integrated
way: it is indeed definable as determined directly by context. What context comprises
is the richer notion that includes some sequence of words, their assigned content, and
its attendant propositional tree structure, plus the sequence of actions whereby that
structure and its content were established. Any one of these may be used to build up
interpretation of the fragment, and together they determine the range of
interpretations available: recovery of some predicate content from context (19)-(20), recovery of
some actions in order to create a parallel but distinct construal (21)-(22).This approach
to ellipsis as building up structure by reiterating content or actions can thus replace
the higher order unification account with a more general, intuitive, and yet essentially
rcprcscntationalist account.
A unique property of the DS grammar mechanisms is that production is expressed
in the same terms as parsing, as progressive build-up of semantic structure, differing
from parsing only in having also a richer representation of what the speaker is trying
to express as a filter on whether appropriate semantic structure is being induced by
the selected words.This provides a basis from which to expect fluent exchange of speaker/
Johri.e ?<e^')
Upset'. (e->(e->0)
step (ii)
238 II. History of semantics
hearer roles in dialogue, in which a speaker sets out a partial structure which their
interlocutor, parsing what they say, takes over:
(29) Q:Who did John upset?
A: Himself.
(30) A: I saw John.
B:With his mother?
A: Yes, with Sue. In the park.
Since both lexical and syntactic actions arc defined as tree-growth processes, construal
of elliptical fragments both within and across speakers is expected to allow replication
of any such actions following their use to build interpretation for some antecedent,
predicting the mixture of semantic and syntactic factors in ellipsis. Scope dependencies are
unproblematic.These are not expressed on the trees, but formulated as an incrementally
collected set of constraints on the evaluation of the emergent epsilon terms. These are
applied once the propositional formula is constructed, determining the resulting epsilon
term. So despite the apparently global nature of scope, parallel scope dependencies can
be expressed as the re-use of a sequence of actions that had earlier been used in the
construal of the antecedent string, replicating scope actions but relative to the terms
constructed in interpreting the fragment. Even case, defined as an output filter on the
resultant tree, is unproblematic, unlike for semantic characterisations of ellipsis, as in (23).
So, it is claimed, interaction of morphology, syntax, and even pragmatics in ellipsis
construal is predictable given DS assumptions.
To flesh this out would involve a full account of how DS mechanisms interact to yield
appropriate results. All this sketch provides is a point of departure for addressing ellipsis
in an integrated manner (sec also a variant of DRT: Ashcr & Lascaridcs 2002). More
significantly, the DS account of context and the emergent concept of content intrinsic to
NL words themselves are essentially representationalist.The ellipsis account succeeds
in virtue of the fact that lexical specifications are procedures that interact with context
to induce the building of representations of denotational content. Natural languages are
accordingly denotationally interpretable only via a mapping onto an intermediate logical
system. And, since compositionality of denotational content is defined over the resulting
trees, it is the incrcmcntality of projection of word meaning and the attendant monoto-
nicity of the tree growth process which constitutes compositionality definable over the
words making up sentences.
7. Summary
Inevitably,issues raised by anaphora and ellipsis remain open. But,even without resolving
these, research goals have strikingly shifted since early work in formal semantics and
the representationalism debate it engendered. The dispute remains; but answers to the
questions are very different. On the view espoused within TTR and DS formulations, a
natural language is not definable as a logic in the mould of predicate logic. Rather, it is
a set of mechanisms out of which truth-denoting objects can be built relative to what is
available in context. To use Cooper's apt turn of phrase 'natural languages are tools for
formal language construction' (Cooper & Ranta 2008): so semantic representations are
central to the form of explanation.
11. Formal semantics and representationalism 239
Whatever answers individual researchers might reach to individual questions, one
thing is certain: new puzzles are taking centre-stage. The data of semantic investigation
are not now restricted to judgements of entailment relations between sentences. The
remit includes modelling the human capacity to interpret fragments in context in
conjunction with other participants in dialogue; and defining appropriate concepts of
information update. Each of these challenges involves an assumption that the human capacity
for natural language is a capacity for language processing in context. With this narrowing
of the gap between competence and performance considerations, assumptions about the
nature of such semantic representations of content can be re-evaluated. We can again
pose the question of the status of representations in semantic modelling; but we can
now pose this as a question of the relation of such representations to those required for
modelling cognitive inference more broadly. Furthermore, such questions can be posed
from a number of frameworks as starting point: categorial grammar, Type Theory with
Records, DRT and its variants, Dynamic Syntax, to name but a few. And it is these new
avenues of research which are creating new ways of understanding the nature of natural
language and linguistic competence.
Eleni Gregoromichelaki, Ronnie Cann and two readers provided most helpful comments
leading to significant improvement of this paper. However, normal disclaimers apply.
8. References
Asher, Nicholas & Alex Lascarides 2002. Logics of Conversation. Cambridge, MA: The MIT
Press
Austin, John 1961. How to Do Things with Words. Oxford: Clarendon Press
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics &
Philosophy 4,159-219.
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA:The MIT Press
Blackburn, Patrick & Wilfried Meyer-Viol 1994. Linguistic logic and finite trees. Bulletin of the
Interest Group of Pure and Applied Logics 2,2-39.
Cann, Ronnie, Ruth Kempson & Lutz Marten 2005. Dynamics of Language. Amsterdam: Elsevier.
Cann, Ronnie, Ruth Kempson & Matthew Purver 2007. Context, wellformedness and the dynamics
of dialogue. Research on Language and Computation 5,333-358.
Carnap, Rudolf 1947. Meaning and Necessity. A Study in Semantics and Modal Logic. Chicago, IL:
The University of Chicago Press
Carpenter, Robert 1997. Type-Logical Semantics. Cambridge, M A:Tlie MIT Press
Carston, Robyn 2002. Thoughts and Utterances: The Pragmatics of Explicit Utterances. Oxford:
Blackwcll.
Chomsky, Noam 1955. Syntactic Structures.The Hague: Mouton.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA:The MIT Press
Cooper, Robin 2005. Austinian truth, attitudes and type theory. Research on Language and
Computation 3,333-362.
Cooper, Robin 2006. Records and record types in semantic theory. Journal of Logic and
Computation 15,99-112.
Cooper, Robin & Aarne Ranta 2008. Natural language as collections of resources In: R. Cooper &
R. Kempson (eds). Language in Flux: Dialogue Coordination, Language Variation, Change, and
Evolution. London: College Publications 109-120.
Crcsswcll, Max 1985. Structured Meaning. Cambridge, MA:The MIT Press
240 II. History of semantics
Dalrymple Mary, Stuart Shieber & Fernando Pereira 1991. Ellipsis and higher-order unification.
Linguistics & Philosophy 14,399^452.
Dalrymple, Mary (ed.) 1999. Semantics and Syntax in Lexical Functional Grammar: The Resource
Logic Approach. Cambridge, MA:Tlie MIT Press
Dekker, Paul 2000. Coreference and representationalism. In: K. von Heusinger & U. Egli (eds).
Reference and Anaphoric Relations. Dordrecht: Kluwer. 287-310.
Evans, Gareth 1980. Pronouns. Linguistic Inquiry 11.337-362.
Fernando, Timothy 2002. Three processes in natural language interpretation. In: W. Sieg,
R. Sonimer & C.Talcott (eds.). Reflections on the Foundations of Mathematics: Essays in Honor
of Solomon Feferman. Natick, MA: Association for Symbolic Logic, 208-227.
Fiengo, Robert & Robert May 1994. Indices and Identity. Cambridge, MA:The MIT Press
Fine, Kit 1984. Reasoning with Arbitrary Objects. Oxford: Blackwell.
Fitch, Frederic 1951. Symbolic Logic. New York:The Ronald Press Company.
Fodor, Jerry 1981. Re-Presentations. Cambridge. M A:Tlic MIT Press.
Fodor, Jerry 1983. Modularity of Mind. Cambridge, MA:Thc MIT Press
Fodor, Jerry 1988. A situated grandmother. Mind & Language 2,64-81.
Fodor, Jerry 1998. Concepts. Oxford: Oxford University Press
Gamut, L.T.F. 1991. Logic, Language and Meaning. Chicago, IL:Tlic University of Chicago Press
Geach, Peter 1972. Logic Matters. Oxford: Oxford University Press
Ginzburg, Jonathan & Robin Cooper 2004. Clarification, ellipsis and the nature of contextual
updates Linguistics & Philosophy 27,297-365.
Grice, Paul 1989. Studies in the Way of Words. Cambridge, MA: Harvard University Press
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic predicate logic. Linguistics & Philosophy
14.39-100.
Hamm. Fritz, Hans Kamp & Michiel van Lambalgen 2006. There is no opposition between formal
and cognitive semantics Theoretical Linguistics 32,1—40.
Heim, Irene 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation.
University of Massachusetts Amherst, MA. Reprinted: Ann Arbor, MI: University Microfilms
Hilbert, David & Bernays Paul 1939. Grundlagen der Mathematik. Vol. II. 2nd edn. Berlin: Springer.
Hintikka, Jaako 1974. Quantifiers vs quantification theory. Linguistic Inquiry 5,153-177.
Jackcndoff, Ray 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford:
Oxford University Press
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk, T. Janssen &
M. Stokhof (eds). Formal Methods in the Study of Language. Amsterdam: Mathematical Centre,
277-322.
Kamp, Hans 1996. Discourse Representation Theory and Dynamic Semantics: Representational
and nonrepresentational accounts of anaphora. French translation in: F. Corblin & C. Gardent
(eds). Interpreter en contexte. Paris: Hermes 2005.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Jcrrold 1972. Semantic Theory. New York: Harper & Row.
Katz, Jerrold & Jerry Fodor 1963.The structure of a semantic theory. Language 39,170-210.
Kempson, Ruth, Wilfried Meyer-Viol & Dov Gabbay 2001. Dynamic Syntax: The Flow of Language
Understanding. Oxford: Blackwell.
Lambek, Joachim 1958. The mathematics of sentence structure. American Mathematical Monthly
65,154-170.
Lappin. Shalom & Christopher Fox 2005. Foundations of Intensional Semantics. Oxford: Blackwell.
van Lcuscn, Noor & Rcinhard Muskcns 2003. Construction by description in discourse
representation. In: J. Peregrin (ed). Meaning: The Dynamic Turn, chap. 12. Amsterdam: Elsevier,
33-65.
Lewis David 1970. General semantics Synthese 22,18-67. Reprinted in: G. Harman & D. Davidson
(eds). Formal Semantics of Natural Language. Dordrecht: Reidcl, 1972,169-218.
Martin-L6f, Per 1984. Intuitionistic Type Theory, Naples: Bibliopelis
11. Formal semantics and representationalism 241
Merchant, Jason 2004. Fragments and ellipsis. Linguistics & Philosophy 27,661-738.
Montague, Richard 1970. English as a formal language. In: Bruno Visentini et al. (eds). Linguaggi
nella Societ a e nella Tecnica. Milan: Edizioni di Comunita, 189-224. Reprinted in: R.Thomason
(ed.). Formal Philosophy. Selected Papers of Richard Montague. New Haven, CT: Yale
University Press, 1974.188-221.
Morgan, Jerry 1973. Sentence fragments and the notion 'sentence'. In: H. R. Kahane, R. Kahane &
B. Kachru (eds.). Issues in Linguistics. Urbana, IL: University of Illinois Press, 719-751.
Morrill, Glyn 1990. Intensionality and boundedness Linguistics & Philosophy 13,699-726.
Morrill, Glyn 1994. Type Logical Grammar. Dordrecht: Foris
Morrill, Glyn 2010. Categorial Grammar: Logical Syntax, Semantics, and Processing. Oxford:
Oxford University Press
Morrill. Glyn & Anna Gavano 2004. On aphasic comprehension and working memory load. In:
Proceedings of Categorial Grammars: An Efficient Tool for Natural Language Processing.
Montpcllicr, 259-287.
Muskcns, Rcinhard 1996. Combining Montague semantics and Discourse Representation.
Linguistics & Philosophy 19.143-186.
Muskens, Reinhard, Johan van Benthem & Albert Visser 1997. Dynamics. In: J. van Benthem &
A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 587-648.
Partee, Barbara 1973. Some structural analogies between tenses and pronouns in English. The
Journal of Philosophy 70,601-609.
Piwek, Paul 1998. Logic, Information and Conversation. Ph.D dissertation. University of
Technology, Eindhoven.
Prawitz, Dag 1965. Natural Deduction: A Proof-Theoretical Study. Uppsala: Ahnqvist & Wiksell.
Purver, Matthew, Ronnie Cann & Ruth Kempson 2006. Grammars as parsers: The dialogue
challenge. Research on Language and Computation 4.289-326.
Pustcjovsky, James 1995. The Generative Lexicon. Cambridge, MA:Tlic MIT Press
Ranta, Aarne 1994. Type-Theoretic Grammar. Oxford: Clarendon Press
Reyle.Uwe 1993. Dealing with ambiguities by underspecification: Construction, representation and
deduction. Journal of Semantics 10,123-179.
Sag, Ivan, Tom Wasow & Emily Bender 2002. Head Phrase Structure Grammar: An Introduction.
2nd cdn. Stanford, CA: CSLI Publications.
Sperber, Dan & Deirdre Wilson 1986/1995. Relevance: Communication and Cognition. Oxford:
Blackwell.
Stainton. Robert 2006. Words and Thoughts. Oxford: Oxford University Press
Stalnaker, Robert 1970. Pragmatics Synthese 22,272-289.
Stalnaker, Robert 1999. Context and Content. Oxford: Oxford University Press
Thomason, Richmond (ed.) 1974. Formal Philosophy. Selected Papers of Richard Montague. New
Haven, CT: Yale University Press
van Eijk, Jan & Hans Kamp 1997. Representing discourse in context. In: J. van Benthem & A. ter
Meulen (eds). Handbook of Logic and Language. Amsterdam: Elsevier, 179-239.
von Heusinger, Klaus 1997. Definite descriptions and choice functions In: S. Akama (ed.). Logic,
Language and Computation. Dordrecht: Kluwer, 61-92.
von Heusinger, Klaus & Ruth Kempson (eds) 2004. Choice Functions in Linguistic Theory. Special
issue of Research on Language and Computation 2.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated by Gillian Anscombe. 3rd edn.
Oxford: Blackwell, 1967.
Ruth Kempson, London (United Kingdom)
III. Methods in semantic research
12. Varieties of semantic evidence
1. Introduction: Aspects of meaning and possible sources of evidence
2. Ficldwork techniques in semantics
3. Communicative behavior
4. Behavioral effects of semantic processing
5. Physiological effects of semantic processing
6. Corpus-linguistic methods
7. Conclusion
8. References
Abstract
Meanings are the most elusive objects of linguistic research. The article summarizes the
type of evidence we have for them: various types of metalinguistic activities like
paraphrasing and translating, the ability to name entities and judge sentences true or false, as
well as various behavioral and physiological measures such as reaction time studies, eye
tracking, and electromagnetic brain potentials. It furthermore discusses the specific type
of evidence we have for different kinds of meanings, such as truth-conditional aspects,
presuppositions, implicatures, and connotations.
1. Introduction: Aspects of meaning and possible sources
of evidence
1.1. Why meaning is a special research topic
If we ask an astronomer for evidence for phosphorus on Sirius, she will point out that
spectral analysis of the light from this star reveals bands that are characteristic of this
element, as they also show up when phosphorus is burned in the lab. If we ask a linguist
the more pedestrian question for evidence that a certain linguistic expression - say, the
sentence The quick brown fox jumps over the lazy dog - has meaning, answers arc
probably less straightforward and predictable. He might point out that speakers of English
generally agree that it has meaning - but how do they know? So it is perhaps not an
accident that the study of meaning is the subfield of linguistics that developed only very
late in the 2500 years of history of linguistics, in the 19th century (cf. article 9 (Nerlich)
Emergence of semantics).
The reason why it is difficult to imagine what evidence for meaning could be is that
it is difficult to say what meaning is. According to a common assumption,
communication consists in putting meaning into a form, a form that is then sent from the speaker
to the addressee (the conduit metaphor of communication, see Lakoff & Johnson 1980).
Aspects that are concerned with the form of linguistic expressions and their material
realization as studied in syntax, morphology, phonology and phonetics; they are generally
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 242-268
12. Varieties of semantic evidence
243
more tangible than aspects concerned with their content. But semanticists in general
hold that semantics, the study of linguistic meaning, indeed has an object to study that
is related but distinct from the forms in which it is encoded, from the communicative
intentions of the speaker and from the resulting understanding of the addressee.
1.2. Aspects of meaning
The English noun meaning is multiply ambiguous, and there are several readings that are
relevant for semantics. One branch of investigation starts out with meaning as a notion
rooted in communication. Grice (1957) has pointed out that we can ask what a speaker
meant by uttering something, and what the utterance means that the speaker uttered.
Take John F. Kennedy's utterance of the sentence Ich bin ein Berliner on June 26,1963.
What JFK meant was that in spite of the cold war, the USA would not surrender West
Berlin - which was probably true. What the utterance meant was that JFK is a citizen
of Berlin, which was clearly false. Obviously, the speaker's meaning is derived from the
utterance meaning and the communicative situation in which it was uttered. The way
how this is derived, however, is less obvious - cf. article 2 (Jacob) Meaning, intentionality
and communication, especially on particularized conversational implicatures.
A complementary approach is concerned with the meaning of linguistic forms,
sometimes called literal meanings, like the meaning of the German sentence Ich bin ein
Berliner which was uttered by JFK to convey the intended utterance meaning. With forms,
one can distinguish the following aspects of meaning (cf. also article 3 (Textor) Sense and
reference, and article 4 (Abbott) Reference). The character is the meaning independent
from the situation of utterance (like speaker, addressee, time and location - see Kaplan
1978).The character of the sentence used by JFK is that the speaker of the utterance is a
citizen of Berlin at the time of utterance. If we find a sticky note in a garbage can, reading
/ am back in five minutes - where wc don't know the speaker, the time, or location of the
utterance - we just know the character. A character, supplied with the situation of
utterance, gets us the content or intension (Frege's Sinn) of a linguistic form. In the proper
historical context, JFK's utterance has the content that JFK is a citizen of Berlin on June
26, 1963. (We gloss over the fact here that this first has to be decoded as a particular
speech act, like an assertion.) This is a proposition, which can be true or false in
particular circumstances.The extension or reference of an expression (Frege's Bedeutung) is
its content when applied to the situation of utterance. In the case of a proposition, this
is a truth value; in the case of a name or a referring expression, this is an entity.
Sometimes meaning is used in a more narrow sense, as opposed to reference; here I have used
meaning in an encompassing way.
Arguably, the communicative notion of meaning is the primary one. Meaning is
rooted in the intention to communicate. But human communication crucially relies on
linguistic forms, which are endowed with meaning as outlined, and for which speakers
can construct meanings in a compositional way (see article 6 (Pagin & Westerstahl)
Compositionality). Semantics is concerned with the meaning of linguistic forms, a
secondary and derived notion. But the use of these forms in communication is crucial data
to rc-cnginccr the underlying meaning of the forms. The ways how literal meanings arc
used in acts of communication and their effects on the participants, in general, is part of
pragmatics (cf. article 88 (Jaszczolt) Semantics and pragmatics).
244
III. Methods in semantic research
1.3. Types of access to meaning
Grounding meaning of linguistic expressions in communication suggests that there are
various kinds of empirical evidence for meaning. First, we can observe the external
behavior of the participants in, before and after the act of communication. Some kinds
of behavior can be more directly related to linguistic meaning than others, and hence
will play a more central role in discovering underlying meaning. For example,
commands often lead to a visible non-linguistic reaction, and simple yes/no-quesiions will
lead to linguistic reactions that are easily decodable. Secondly, we can measure aspects
of the external behavior in detail, like the reaction times to questions, or the speed
in which passages of text are read (cf. article 102 (Frazier) Meaning in psycholinguis-
tics). Third, we can observe physiological reactions of participants in communication,
like the changing size of their pupil, the saccades of the eyes reading a text, the eye
gaze when presented to a visual input together with a spoken comment, or the
electromagnetic field generated by their cortex (cf. article 15 (Bott, Featherston, Rad6 &
Stolterfoht) Experimental methods, and article 105 (Pancheva) Meaning in neuro-
linguistics). Fourth, we can test hypotheses concerning meaning in the output of
linguistic forms itself, using statistical techniques applied to corpora (cf. article 109 (Katz)
Semantics in corpus linguistics).
1.4. Is semantics possible?
The reader should be warned that correlations between meanings and observable
phenomena like non-linguistic patterns of behavior or brain scans do not guarantee that the
study of meaning can be carried out successfully. Leonard Bloomfield, a behavioralist,
considered the observable effects so complex and interwoven with other causal chains
that the science of semantics is impossible:
Wc have defined the meaning of a linguistic form as the situation in which the speaker
utters it and the response which it calls forth in the hearer. [...] In order to give a
scientifically accurate definition of meaning for every form of a language, we should have to have a
scientifically accurate knowledge of everything in the speaker's world. The actual extent of
human knowledge is very small compared to this. [...]The statement of meanings is
therefore the weak point in language-study, and will remain so until human knowledge advances
very far beyond its present state.
(Bloomfield 1933: 139f)
We could imagine similar skepticism concerning the science of semantics from a
neuroscientist believing that meanings are activation patterns of our head. The huge
number of such patterns, and their variation across individuals that we certainly have
to expect, seems to preclude that they will provide the foundation for the study of
meaning.
Despite Bloomfield's qualms, the field of semantics has flourished. Where he went
wrong was in believing that we have to consider the whole world of the speaker, or the
speaker's whole brain.There arc ways to cut out phenomena that stand in relation to, and
bear evidence for, meanings in much more specific ways. For example, we can investigate
whether a specific sentence in a particular context and describing a particular situation
is considered true or false; and derive from that hypotheses about the meaning of the
12. Varieties of semantic evidence
245
sentence and the meaning of the words involved in that sentence.The usual methods of
science - forming hypotheses and models, deriving predictions, making observations and
constructing experiments that support or falsify the hypotheses - have turned out to be
applicable to linguistic semantics as well.
1.5. Native semantic activities
There are many native activities that directly address aspects of meaning. When Adam
named the animals of paradise he assigned expressions to meanings, as we do today
when naming things or persons or defining technical terms. We explain the meaning of
words or idioms by paraphrasing them - that is, by offering different expressions with
the same or at least similar meanings. We can refer to aspects of meaning: We say that
one expression means the same as another one, or its opposite; we say that one
expression refers to a subcase of another. As for speaker's meanings, we can elaborate on
what someone meant by such-and-such words, and can point out differences between
that and what the words actually meant or how they were understood by the addressee.
Furthermore, for human communication to work it is crucial that untruthful use of
language can be detected, and liars can be identified and punished. For this, a notion
of what it means for a sentence or text to be true or false is crucial. Giving a statement
at court means to know what it means to speak the truth, and the whole truth. Hence,
it seems that meanings are firmly established in the pre-scientific ways we talk about
language.
We can translate, that is, rephrase an expression in one language by an expression
in another while keeping the meaning largely constant. We can teach the meaning of
words or expressions to second language learners or to children acquiring their native
language - even though both groups, in particular children in first language acquisition,
will acquire meanings to a large part implicitly, by contextual clues. The sheer
possibility of translation has been enormously important for the development of humankind.
We find records of associated practices, like the making of dictionaries, dating back to
Sumerian-Akkadian glossaries of 2300 BC.
These linguistic activities show that meaning is a natural notion, not a theoretical
concept.They also provide important source of evidence for meaning. For example, it would
be nearly impossible to construct a dictionary in linguistic field work without being able
to ask for what a particular word means, or how a particular object is called. As another
example, it would be foolish to dismiss the monumental achievements of the art of
dictionary writing as evidence for the meaning of words.
But there are problems with this kind of evidence that one must be aware of. Take
dictionary writing.Traditional dictionaries are often unsystematic and imprecise in their
description of meaning. They do not distinguish systematically between contextual (or
"occasional") meaning and systematic meaning, nor do they keep ambiguity and
polysemy apart in a rigorous way. They often do not distinguish between linguistic aspects
and more general cultural aspects of the meaning and use of words. We in re ich (1964)
famously criticized the 115 meanings of the verb to turn that can be found in Webster's
Third Dictionary. Lexicography has greatly improved since then, with efforts to define
lexical entries by a set of basic words and by recognizing regularities like systematic
variations between word meanings (e.g. the intransitive use of transitive verbs, or the
polysemy triggered in particular contexts of use).
246
III. Methods in semantic research
1.6. Talking about meanings
Pre-scientific ways to address meanings rely on an important feature of human
language, its self reference - we can use language to talk about language. This feature is
so entrenched in language that it went unnoticed until logicians like Frege, Russell and
Tarski, working with much more restricted languages, pointed out the importance of
the metalanguage / object language distinction. It is only quite recently that we
distinguish between regular language, reference to expressions, and reference to meanings by
typographical conventions and write things like "XXX means'YYY'".
The possibility to describe meanings may be considered circular - as when Tarski
states that 'Snow is white' is true if and only if snow is white. However, it does work under
certain conditions. First, the meaning of an unknown word can be described, or at least
delimited, by an expression that uses known words; this is the classical case of a
definition. If we had only this procedure available as evidence for meaning, things would be
hopeless because we have to start somewhere with a few expressions whose meanings
arc known; but once we have those, they can act as bootstraps for the whole lexicon of
a language. The theory of Natural Semantic Metalanguage even claims that a small set
of concepts (around 200) and a few modes of combining them are sufficient to achieve
access to the meanings of all words of a language (Goddard 1998).
Second, the meanings of an ambiguous word or expression can be paraphrased by
expressions that have only one or the other meaning. This is common practice in
linguistic semantics, e.g. when describing the meaning of He saw that gasoline can explode
as (a) 'He saw an explosion of a can of gasoline' and (b) 'He recognized the fact that
gasoline is explosive'. Speakers will generally agree that the original sentence has the
two meanings teased apart by the paraphrases. There are variations on this access to
meaning. For example, we might consider a sentence in different linguistic contexts
and observe differences in the meaning of the sentence by recognizing that it has to be
paraphrased differently. For the paraphrases, we can use a language that has specific
devices that help to clarify meanings, like variables. For example, we can state that
a sentence like Every man likes a woman that likes him has a reading 'Every man x
likes a woman y that likes x', but not 'There is a woman y that every man x likes and
that likes x'.The disadvantage of this is that the paraphrases cannot be easily grasped
by naive native speakers. In the extreme case, we can use a fully specified formal
language to specify such meanings, such as first-order predicate logic; the existing reading
of our example then could be specified as Vx[man(x) -> 3y[woman(y) a likes(x, y) a
likes (y,x)]].
Talking about meanings is a very important source of evidence for meanings.
However, it is limited not only by the problem mentioned above, that it describes
meanings with the help of other meanings. There are many cases where speakers cannot
describe the meanings of expressions because this task is too complex - think of children
acquiring their first language, or aphasics loosing the capacity of language. And there
are cases in which the description of meanings would be too complex for the linguist.
We may think of first fie Id work sessions in a research project on an unknown language.
Somewhat closer to home, we may also think of the astonishingly complex meanings of
natural language determiners such as a, some, a certain, a particular, a given or indefinite
this in there was this man standing at the door whose meanings had to be teased apart by
careful considerations of their acceptability in particular contexts.
12. Varieties of semantic evidence
247
2. Fieldwork techniques in semantics
In this section we will discuss various techniques that have been used in linguistic field-
work, understood in a wide sense as to include work on one's own language and on
language acquisition, for example. There are a variety of sources that reflect on
possible procedures; for example, the authors in McDaniel, McKee & Cairns (eds.) (1996)
discuss techniques for the investigation of syntax in child language, many of which also
apply to semantic investigations, and Matthewson (2004) is concerned with techniques
for semantic research in American languages which, of course, are applicable for work
on other languages as well (cf. also article 13 (Matthewson) Methods in cross-linguistic
semantics, and article 103 (Crain) Meaning in first language acquisition).
2.1. Observation, transcription and translation
The classical linguistic fieldwork method is to record conversations and texts in natural
settings, transcribe them, and assign translations, ideally with the help of speakers that
are competent in a language that they share with the investigator. In classical American
structuralism, this has been the method de rigueur, and it is certainly of great importance
when we want to investigate natural use of language.
However, this technique is also severely limited. First, even large text collections
may not provide the evidence that distinguishes between different hypothesis. Consider
superlatives in English; is John is the tallest student true if John and Mary both are
students that are of the same height and taller than any other student? Competent English
speakers say no, superlatives must be unique - but it might be impossible to find out on
the basis of a corpus of non-elicited text.
Secondly, there is the problem of translation. Even when we grant that the translation
is competent according to usual standards, it is not clear how we should deal with
distinctions in the object language that are not easily made in the meta language. For example,
Matthewson (2004) shows that in Menominee (Algonquian, Northern Central United
States of America), inalienable nouns can have a prefix me-indicating an arbitrary owner, as
contrasted with a prefix o- indicating a specific 3rd person owner.This difference could not
be derived from simple translations of Menominee texts into English, as English does not
make this distinction.There is also the opposite problem of distinctions that are forced on
us by the meta language; for example, pronouns in English referring to humans distinguish
two genders, which may not be a feature of the object language. Hence, as Matthewson puts
it, translations should be seen as clues for semantic analysis, rather as its result.
Translations, or more generally paraphrases, are problematic for more fundamental
reasons as evidence for meaning, as they explain the meaning of an expression a by way
of the meaning of an expression (3, hence it presupposes the existence and knowledge
of meanings, and a judgment of similarity of meaning. However, it appears that without
accepting this type of hcrmcncutic circle the study of semantics could not get off the
ground. But there are methods to test hypotheses that have been generated first with the
help of translations and paraphrases by independent means.
2.2. Pointing
Pointing is a universal non-linguistic human behavior that aligns with aspects of
meanings of certain types of linguistic expressions (cf. also article 90 (Diessel) Deixis and
248
III. Methods in semantic research
demonstratives). Actually, pointing may be as characteristic for humans as language, as
humans appear to be the only apes that point (cf.Tomascllo 2008).
Pointing is most relevant for referring expressions, with names as the prototypical
example (cf. article 4 (Abbott) Reference). These expressions denote a particular entity
that is also identified by the pointing gesture, and hence pointing is independent
evidence for the meaning of such expressions. For example, if in a linguistic fieldwork
situation an informant points to a person and utters Max, this might be taken to be the name
of that person. We can conclude that Max denotes that person, in other words, that the
meaning of Max is the person pointed at.
Simple as this scenario is, there are certain prerequisites for it to work. For example,
the pointing gesture must be recognized as such; in different cultures, the index finger,
the stretched-out hand, or an upward movement of the chin may be used, and in some
cultures there may be a taboo against pointing gestured when directed at humans.
Furthermore, there must be one most salient object in the pointing cone (cf. Kranstedt et al.
2006) that will then be identified.This presupposes a pre-linguistic notion of objects, and
of saliency.This might work well when persons or animals are pointed at, who are cog-
nitively highly salient. But mistakes can occur when there is more than one object in the
pointing cone that are equally salient. When Captain Cook on his second voyage visited
an island in the New Hebrides with friendly natives and tried to communicate with them,
he pointed to the ground. What he heard was tanna, which he took as the name of the
island, which is still known under this name. Yet the meaning of tana in all Melanesian
languages is simply "earth".The native name for Tanna is reported to be parei (Gregory
2003); it is not in use anymore.
Pointing gestures may also help to identify the meaning of common nouns, adjectives,
or verbs - expressions that denote sets of entities or events. The pointing is directed
towards a specimen, but reference is at entities of the same type as the one pointed at.
There is an added source of ambiguity or vagueness here: What is "the same type as"?
On his first voyage, Captain Cook made landfall in Australia, and observed creatures
with rabbit-like ears hopping on their hind legs. When naturalist Joseph Banks asked the
local Guugu Yimidhirr people how they are called, presumably with the help of some
pointing gesture, he was the first to record the word kangaroo. But the word gangurru
actually just refers to a large species of black kangaroo, not to the marsupial family in
general (cf.Haviland 1974).
Quine (1960: ch. II), in an argument to discount the possibility of true translation,
famously described the problems that even a simple act like pointing and naming might
involve. Assume a linguist points to a white rabbit, and gets the response gavagai. Quine
asks whether this may mean 'rabbit', or perhaps 'animal', or perhaps 'white', or
perhaps even 'non-detached rabbit parts'. It also might mean 'rabbit stage', in which case
repeated pointing will identify different reference objects. All these options are
theoretical possibilities under the assumption that words can refer to arbitrary aspects of
reality. However, it is now commonly assumed that language is build on broad cognitive
commonalities about entities and classes. There is evidence that pre-linguistic babies
and higher animals have concepts of objects (as contrasted to substances) and animals
(as contrasted to lifeless beings) that preclude a conceptualization of a rabbit as a set
of rabbit legs, a rabbit body, a rabbit head and a pair of rabbit ears moving in unison.
Furthermore, there is evidence that objects are called with terms of a middle layer of a
12. Varieties of semantic evidence
249
taxonomic hierarchy, the so-called "generic level", avoiding terms that are too general
or too specific (cf. Berlin, Breedlove & Raven 1973). Hence a rabbit will not be called
thing in English, or animal, and it will not be called English angora either except
perhaps by rabbit breeders that work with a different taxonomy. This was the reason for
Captain Cooks misunderstanding of gangurru; the native Guugu Yimidhirr people had
a different, and more refined, taxonomic hierarchy for Australian animals, where species
of kangaroo formed the generic level; for the British visitors the family itself belonged
to that level.
Pointing, or related gestures, have been used to identify the meaning of words. For
example, in the original study of Berlin & Kay (1969) on color terms subjects were
presented with a two-dimensional chart of 320 colors varying according to spectral color
and saturation. The task was to identify the best specimen for a particular color word
(the focal color) and the extent to which colors fall under a particular color word. Similar
techniques have been used for other lexical fields, for example for the classification of
vessels using terms like cup, mug or pitcher (cf. Kempton 1981;see Fig. 12.1).
1
oaooaooo
ODOOOOoo
PffOPOOOc?
o»
Cf CPODDOOoo
V'uvuovnz?
VVVVWvk?
Coffee cup
Fig. 12.1: Vessel categories after Kempton (1981:103). Bold lines: Identification of
>80% agreement between subjects for mug and coffee cup. Dotted lines:
Hypothetical concept that would violate connectedness and convexity (see below)
Tests of this type have been carried out in two ways: Either subjects were presented with
a field of reference objects ordered after certain dimensions; e.g. Berlin & Kay (1969)
presented colors ordered after their wave length (the order they present themselves in
a rainbow) and after their saturation (with white and black as the extremes). Kempton's
250
III. Methods in semantic research
vessels were presented as varying in two dimensions: The relation between the upper
and lower diameters, and the relation between height and width. When judging whether
certain items fall under a term or not, the neighboring items that already have been
classified might influence the decision. Another technique, which was carried out in the
World Color Survey (see Kay et al. 2008), presented color chips in random order to avoid
this kind of influence.
The pointing test can be used in two ways: Either we point at an entity in order to
get the term that is applicable to that entity, or we have a term and point to various
objects to find out whether the term is applicable.The first approach asks an onomasio-
logical question; it is concerned with the question: How is this thing called? The second
approach asks the complementary semasiological question: What docs this expression
mean?
Within a Fregean theory of meaning, a distinction is made between reference and
sense (cf. article 3 (Textor) Sense and reference). With pointing to concrete entities we
gain access to the reference of expressions, and not to the sense, the concept that allows
us to identify the reference. But by varying potential reference objects we can form
hypotheses about the underlying concept, even though we can never be certain that by
a variation of reference objects we will uncover all aspects of the underlying concept.
Goodman (1955) illustrated this with the hypothetical adjective grue that, say, refers
to green objects when used before the year 2100 and to blue objects when used after
that time; no pointing experiment executed before 2100 could differentiate grue from
green. Meaning shifts like that do happen historically: The term Scotia referred to
Ireland before the 11th century, and after to Scotland; the German term gelb was reduced
in extension when the term orange entered the language (cf. the traditional local term
Gelbe Ruben 'yellow turnips' for carrots). But these are language changes, and not
meanings of items within a language. A meaning like the hypothetical grue appears as
strange as a reference towards non-detached rabbit parts. We work under the hypothesis
that meanings of lexical items are restricted by general principles of uniformity over
time. There are other such principles that restrict possible meanings, for example
connectedness and, more specifically, convexity (Gardenfors 2000). In the vessel example
above, where potential reference objects were presented following certain dimensions,
we expect that concepts do not apply to discontinuous areas and have the general
property that when x is an a and y is an a, then everything in between x and y is an a as
well.The dotted lines in Fig. 12.1 represent an extension of a concept that would violate
connectedness and convexity.
In spite of all its problems, pointing is the most elementary kind of evidence for
meaning without which linguistic field work, everyday communication and language
acquisition would be impossible. Yet it seems that little research has been done on
pointing and language acquisition, be it first or second. Its importance, however, was
recognized as early as in St. Augustin's Confessions (5th century AD), where he writes
about his own learning of language:
When they [the elders] called some thing by name and pointed it out while they spoke.
I saw it and realized that the thing they wished to indicate was called by the name they then
uttered. And what they meant was made plain by the gestures of their bodies, by a kind of
natural language, common to all nations [...]
(Confessions, Book 1:8)
12. Varieties of semantic evidence
251
2.3. Truth value judgments (TVJ)
Truth value judgments do the same job for the meaning of sentences as pointing does for
referring expressions. In the classical setup, a situation is presented with non-linguistic
means together with a declarative sentence, and the speaker has to indicate whether
this sentence is true or false with respect to the situation.This judgement is an linguistic
act by itself, so it can be doubted that this provides a way to base the study of meaning
wholly outside of language. But arguably, agreeing or disagreeing arc more primitive
linguistic acts that may even rely on simple gestures, just as in the case of pointing.
The similarity between referential expressions - which identify objects - and
declarative sentences - which identify states of affairs in which they are true - is related to Frege's
identification of the reference of sentences with their truth value with respect to a
particular situation (even though this was not the original motivation for this identification,
cf. Frege 1892). This is reflected in the two basic extensional types assumed in sentence
semantics: Type e for entities referred to by names, and type t for truth values referred
to by sentences. But there is an important difference here:There are many distinct objects
- De, the universe of discourse, is typically large; but there are just two (basic) truth
values - D(, the set of truth values, standardly is {0, 1), falsity and truth. Hence we can
distinguish referring expressions more easily by their reference than wc can distinguish
declarative sentences. One consequence of this is that onomasiological tests do not work.
We cannot present a "truth value" and expect a declarative sentence that is true. Also,on
presenting a situation in a picture or a little movie we cannot expect that the linguistic
reactions are as uniform as when we, say, present the picture of an apple. But the sema-
siological direction works fine: We can present speakers with a declarative sentence and
a situation or a set of situations and ask whether the sentence is true in those situations.
Truth values are not just an ingenious idea of language philosophers to reduce the
meaning of declarative sentences to judgments whether a sentence is true or false in
given situations.They are used pre-linguistically, e.g. in court procedures. Within
linguistics, they are used to investigate the meaning of sentences in experiments and in linguistic
field work. They have been particularly popular in the study of language acquisition
because they require a rather simple reaction by the child that can be expected even
from two-year olds.
The TVJ task comes in two flavors. In both, the subjects are presented with a sentence
and a situation, specified by a picture, an acted-out scene with hand puppets or a movie,
or by the actual world provided that the subjects have the necessary information about
it. In the first version, the subjects should simply state whether the sentence is true or
false. This can be done by a linguistic reaction, by a gesture, by pressing one of two
buttons, or by ticking off one of two boxes. We may also record the speed of these reactions
in order to get data about the processing of expressions. In the second version, there is a
character, e.g. a hand puppet, that utters the sentence in question, and the subjects should
reward or punish the character if the sentence is true or false with respect to the situation
presented (see e.g.Crain 1991). A reward could be, for example, feeding the hand puppet
a cookie. Interestingly, the second procedure taps into cognitive resources of children
that are otherwise not as easily accessible.
Gordon (1998), in a description of TVJ in language acquisition, points out that this
task is quite natural and easy. This is presumably so because truth value judgment is an
elementary linguistic activity, in contrast to,say,grammaticality judgments.TVJ also puts
252
III. Methods in semantic research
less demands on answers than wh-questions (e.g. Who chased the zebra? vs. Did the lion
chase the zebra?) This makes it the test of choice for children and for language-impaired
persons.
But there are potential problems in carrying out TVJ tasks. For example, Crain et
al. (1998) have investigated the phenomenon that children seem to consider a
sentence like Every farmer is feeding a donkey false if there is a donkey that is not fed
by the farmer. They argue that children are confused by the extra donkey and try to
reinterpret the sentence in a way that seems to make sense. A setup in which attention
is not drawn to a single object might be better; even adding a second unfed donkey
makes the judgments more adult-like. Also, children respond better to scenes that are
acted out than to static pictures. In designing TVJ experiments, one should consider
the fact that positive answers are given quicker and more easily than negative ones.
Furthermore, one should be aware that unconscious reactions of the experimenter may
provide subtle clues for the "right" answer (the "Clever Hans" effect, named after the
horse that supposedly could solve arithmetic problems). For example, when acting out
and describing a scene, the experimenter may be more hesitant when uttering a false
statement.
2.4. TVJ and presuppositions/implicatures
There are different aspects of meaning beyond the literal meaning, such as
presuppositions, conventional implicatures, conversational implicatures and the like, and it would
be interesting to know how such meaning components fare in TVJ tasks.Take
presuppositions (cf. also article 91 (Beaver & Geurts) Presupposition).Theories such as Stalnaker
(1974) that treat them as preconditions of interpretation predict that sentences cannot
be interpreted with respect to situations that violate their presuppositions.The TVJ test
does not seem to support this view. The sentence The dog is eating the bone will most
likely be judged true with respect to a picture showing two dogs, where one of the dogs
is eating a bone. This may be considered evidence for the case of accommodation, which
consists of restricting the context to the one dog that is eating a bone. Including a third
option or truth value like "don't know" might reveal the specific meaning contribution
of presuppositions
As for conversational implicature (cf. article 92 (Simons) Implicature) we appear to
get the opposite picture.TVJ tests have been used to check the relevance of scalar
implicatures. For example, Noveck (2001), building on work of Smith (1980), argued that
children are "more logical" than adults because they can dissociate literal meanings from
scalar implicatures. Children up to 11 years react to statements like some giraffes have
long necks (where the picture shows that all giraffes have long necks) with an affirmative
answer, while most adults find them inappropriate.
2.5. TVJ variants: Picture selection and acting out
The picture selection task has been applied for a variety of purposes beyond truth values
(cf. Gerken & Shady 1998). But for the purpose of investigating sentence meanings, it
can be seen as a variant to the TVJ task:The subject is exposed to a declarative sentence
and two or more pictures and has to identify the picture for which the sentence is true.
It is good to include irrelevant pictures as filler items, which can test the attention of
12. Varieties of semantic evidence
253
the subjects. The task can be used to identify situations that fit best to a sentence. For
example, for sentences with presuppositions it is expected that a picture will be chosen
that does not only satisfy the assertion, but also the presupposition. So, if the sentence
is The dog is eating a bone, and if a picture with one or two dogs is shown, then
presumably the picture with one dog will be preferred. Also, sentences whose scalar implicature
is satisfied will be preferred over those for which this is not the case. For example, if the
sentence is some giraffes have long necks, a picture in which some but not all giraffes
have long necks will be preferred over a picture in which all giraffes are long-necked.
Another relative of the TVJ task is the Act Out task in which the subject has to "act
out" a sentence with a scene such that the sentence is true. Again, we should expect
that sentences arc acted out in a way as to satisfy all meaning components - assertion,
presupposition, and implicature - of a sentence.
2.6. Restrictions of the TVJ methodology
One restriction of the various TVJ methodologies appears to be that they just target
expressions that have a truth value, that is, sentences. However, they allow to investigate
the meaning of subsentential expressions, under the assumption that the meaning of
sentences is computed in a compositional way from the meanings of their syntactic parts
(cf. article 6 (Pagin & Westerstahl) Compositionality). For example, the meaning of
spatial presuppositions like on, on top of, above or over can be investigated with scenes in
which objects are arranged in particular ways.
Another potential restriction of TVJ as discussed so far is that we assumed that the
situations are presented by pictures. Language is not restricted to encoding
information that can be represented by visual stimuli. But we can also present sounds, movie
scenes or comic strips that represent temporal developments, or even olfactory and
tactile stimuli to judge the range of meanings of words (cf. e.g. Majid et al. 2006 for verbs of
cutting and breaking).
TVJ is also difficult to apply when deictic expressions arc involved, as they often
require reference to the speaker, who is typically not part of the picture. For example,
in English the sentence The ball is in front of the tree means that the ball is in between
the speaker that faces the tree and the tree; the superficially corresponding sentence in
Hausa means that the ball is behind the tree (cf. Hill 1982). In English, the tree is seen
as facing the speaker, whereas in Hausa the speaker aligns with the tree (cf. article 98
(Pederson) The expression of space). Such differences are not normally represented in
pictures, but it can be done. One could either represent the picture from a particular
angle, or represent a speaker with a particular position and orientation in the picture
itself and ask the subject to identify with that figure.
The TVJ technique is systematically limited for sentences that do not have truth
values, such as questions, commands, or exclamatives. But we can generalize it to a
judgment of appropriateness of sentences given a situation, which sometimes is done to
investigate politeness phenomena and the like. There are also subtypes of declarative
sentences that are difficult to investigate with TVJ, namely modal statements, e.g. Mary
must be at home, or habituals and generics that allow for exceptions, like Delmer walks
to school, or Birds fly (cf. article 47 (Carlson) Genericity). This is arguably so because
those sentences require to consider different possible worlds, which cannot be easily
represented graphically.
254
III. Methods in semantic research
2.7. TVJ with linguistic presentation of situation
The TVJ technique can be applied for modal or generic statements if we present the
situation linguistically, by describing it. For example, we could ask whether Delmer walks
to school is true if Delmer walks every day except Fridays, when his father gives him a
ride. Of course, this kind of linguistic elicitation technique can be used in nearly all the
cases described so far. It has clear advantages: Linguistic descriptions arc easy and cheap
to produce and can focus the attention of the subject to aspects that are of particular
relevance for the task. For this reason it is very popular for quick elicitations whether a
sentence can mean such-and-such.
Matthewson (2004) argues that elicitation is virtually the only way to get to more
subtle semantic phenomena. She also argues that it can be combined with other
techniques, like TVJ and grammaticality judgments. For example, in investigating aspect
marking in St'aYimcets Salish (Salishan, Southwestern Canada) the sentence Have you
been to Seattle? is translated using an adverb Ian that otherwise occurs with the meaning
'already'; a follow-up question could be whether it is possible to drop Ian in this context,
retaining roughly the same meaning.
The linguistic presentation of scenes comes with its own limitations.There is the
foundational problem that we get at the meaning of an expression a by way of the meaning
of an expression (3. It cannot be applied in case of insufficient linguistic competence, as
with young children or language-impaired persons.
2.8. Acceptability tests
In this type of test, speakers are given an expression and a linguistic context and/or
an description of an extralinguistic situation, and are asked whether the expression
is acceptable with respect to this context or the situation. With it, we can explore the
felicity conditions of an expression, which often are closely related to certain aspects of
its meaning.
Acceptability tests are the natural way to investigate presuppositions and
conventional implicaturcs of expressions. For example, additive focus particles like also
presuppose that the predication holds for an alternative to the focus item. Hence in a context
like John went to Paris, the sentence John also went to PRAGUE is felicitous, but the
sentence Mary also went to PRAGUE is not. Acceptability tests can also be used to
investigate information-structural distinctions. For example, in English, different accent
patterns indicate different focus structures; this can be seen when judging sentences
like JOHN went to Paris \s.John went to PARIS in the context of questions like Who
went to Paris? and John went where? (cf. article 66 (Krifka) Questions). As another
example, Portner & Yabushita (1998) discussed the acceptability of sentences with a
topic-comment structure in Japanese where the topic was identified by a noun phrase
with a restrictive relative clause and found that such structures are better if the relative
clause corresponds to a comment on the topic in the preceding discourse. Acceptability
tests can also be used to test the appropriateness of terms with honorific meaning,
or various shades of expressive meaning, which have been analyzed as conventional
implicatures by Potts (2005).
When applying acceptability judgments, it is natural to present the context first,
to preclude that the subject first comes up with other contexts which may influence
12. Varieties of semantic evidence
255
the interpretation. Another issue is whether the contexts should be specified in the
object language, or can also be given in the meta-language that is used to carry out the
investigation. Matthewson (2004) discusses the various advantages and disadvantages -
especially if the investigator has a less-than-perfect command over the object language -
and argues that using a meta-language is acceptable, as language informants generally can
resist the possible influence of the metalanguage on their responses.
2.9. Elicited production
We can turn the TVJ test on its head and ask subjects to describe given situations with
their own words. In language acquisition research, this technique is known as "elicited
production", and encompasses all linguistic reactions to planned stimuli (cf. Thornton
1998). In this technique the presumed meaning is fixed, and controls the linguistic
production; wc can hypothesize about how this meaning can be represented in language.
The best known example probably is the retelling of a little movie called the Pear Story,
which has unearthed interesting differences in the use of tense and aspect distinctions
in different languages (cf. Chafe 1980 for the original publication). Another example,
which allows to study the use of meanings in interaction, is the "map task", where one
person explains the configuration of objects or a route on a map to another without
visual contact.
The main problem of elicited production is that the number of possible reactions
by speakers is, in principle, unlimited. It might well be that the type of utterances one
expects do not occur at all. For example, we could set up a situation in which person A
thinks that person B thinks that person C thinks that it is raining, to test the recursivity
of propositional attitude expressions, but we will have to wait long till such utterances are
actually produced. So it is crucial to select cues that constrain the linguistic production in
a way that ensures that the expected utterances will indeed occur.
2.10. From sentence meanings to word meanings
The TVJ technique and its variants test the meaning of sentences, not of words or sub-
sentential expressions. Also, with elicitation techniques, often we will get sentence-like
reactions. With elicited translations, it is also advisable to use whole sentences instead of
single words or simpler expressions, as Matthewson (2004) argues. It is possible to elicit
the basic meaning of nouns or certain verbs directly, but this is impossible for many other
words. The first ten most frequent words in English are often cited as being the, of, and,
a, to, in, is, you, that; it would be impossible to ask a na'ive speaker of English what they
mean or discover there meanings in other more direct ways, with the possible exception
of you.
We can derive hypotheses about the meaning of such words by using them in
sentences and judging the truth value of the sentences with respect to certain situations, and
their acceptability in certain contexts. For example, we can unearth the basic uses of the
definite article by presenting pictures containing one or two barking dogs, and ask to pick
out the best picture for the dog is barking. The underlying idea is that the assignment
of meanings to expressions is compositional, that is, that the meaning of the complex
expression is a result of the meaning of its parts and the way they arc combined.
256
III. Methods in semantic research
3. Communicative behavior
Perhaps the most important function of language is to communicate, that is, to transfer
meanings from one mind to another. So we should be able to find evidence for meaning
by investigating communicative acts.This is obvious in a trivial sense: If A tells B
something, B will often act in certain ways that betray that B understood what A meant. More
specifically, we can investigate particular aspects of communication and relate them
to particular aspects of meaning. We will look at three examples here: Presuppositions,
conversational implicatures and focus-induced alternatives.
Presuppositions (cf. article 91 (Beaver & Geurts) Presupposition) are meaning
components that are taken for granted, and hence appear to be downtoned.This shows up in
possible communicative reactions. For example, consider the following dialogues:
A: Unfortunately, it is raining.
B: No, it isn't.
Here, B denies that it is raining; the meaning component of unfortunate expressing
regret by the speaker is presupposed or conventionally implicated.
A: It is unfortunate that it is raining.
B: No, it isn't.
Here, B presupposes that it is raining, and states that this is unfortunate. In order to
deny the presupposed part, other conversational reactions are necessary, like But that's
not unfortunate, or But it doesn't rain. Simple and more elaborate denials are a fairly
consistent test to distinguish between presupposed and proffered content (cf. van der
Sandt 1988).
For conversational implicatures (cf. article 92 (Simons) Implicature) the most
distinctive property is that they are cancelable without leading to contradiction. For example,
John has three children triggers the scalar implicature that John has exactly three
children. But this meaning component can be explicitly suspended: John has three children,
if not more. It can be explicitly cancelled: John has three children, in fact he has four. And
it does not arise in particular contexts, e.g. in the context of People get a tax reduction if
they have three children. This distinguishes conversational implicatures from
presuppositions and semantic entailments:/o/m has three children, [if not two /in fact, two) is judged
contradictory.
Our last example concerns the introduction of alternatives that are indicated by focus,
which in turn can be marked in various ways,e.g. by sentence accent. A typical procedure
to investigate the role of focus is the question-answer test (cf. article 66 (Krifka)
Questions). In the following four potential question-answer pairs (Al-Bl) and (A2-B2) are
well-formed, but (A1-B2) and (A2-B1) are odd.
Al: Who ate the cake?
A2: What did Mary eat?
Bl: MARYate the cake.
B2: Mary ate the CAKE.
12. Varieties of semantic evidence
257
This has been interpreted as saying that the alternatives of the answer have to
correspond to the alternatives of the question.
To sum up, using communicative behavior as evidence for meaning consists in
evaluating the appropriateness of certain conversational interactions. Competent speakers
generally agree on such judgments. The technique has been used in particular to
identify, and differentiate, between different meaning components having to do with the
presentation of meanings, in particular with information structure.
4. Behavioral effects of semantic processing
When discussing evidence for the meaning of expressions we have focused so far on the
meanings themselves. We can also investigate how semantic information is processed,
and get a handle on how the human mind computes meanings. To get information on
semantic processing, judgment tasks are often not helpful, and might even be deceiving.
We need other types of evidence that arguably stand in a more direct relation to semantic
processing. It is customary to distinguish between behavioral data on the one hand, and
neurophysiology data that directly investigates brain phenomena on the other. In this
section we will focus on behavioral approaches (cf. also article 15 (Bott, Featherston,
Rad6 & Stolterfoht) Experimental methods).
4.1. Reaction times
The judgment tasks for meanings described so far can also tap into the processing of
semantic information if the timing of judgments is considered. The basic assumption is
that longer reaction times, everything else being equal, are a sign for semantic processing
load.
For example, Clark & Lucy (1975) have shown that indirect speech acts take longer
for processing than direct ones, and attribute this to the additional inferences that they
require. Noveck (2004) has shown that the computation of scalar implicature takes time;
people that reacted to sentences like Some elephants are mammals with a denial (because
all elephants and not just some arc) took considerably longer. Kim (2008) has
investigated the processing of ow/y-sentences, showing that the affirmative content is evaluated
first, and the presupposition is taken into account only after.
Reaction times are relevant for many other psycholinguistic paradigms, beyond tasks
like TVJ, and can provide hints for semantic processing. One notable example is the
semantic phenomenon of coercion, changes of meanings that are triggered by the
particular context in which meaning-bearing expressions occur (cf. article 25 (dc Swart)
Mismatches and coercion). One well-known example is aspectual coercion: Temporal
adverbials of the type until dawn select for atelic verbal predicates, hence The horse
slept until dawn is fine. But The horse jumped until dawn is acceptable as well, under an
iterative interpretation of jump that is not reflected overtly. This adaptation of the basic
meaning to fit the requirements of the context should be cognitively costly, and there is
indeed evidence for the additional semantic processing involved. Pifiango et al. (2006)
report on various studies and their own experiments that made use of the dual task
interference paradigm: Subjects listen to sentences and, at particular points, deal with
an unrelated written lexical decision task.They were significantly slower in deciding this
258
III. Methods in semantic research
task just after an expression that triggered coercion (e.g. until in the second example, as
compared to the first). This can be taken as evidence for the cognitive effort involved
in coercion; notice that there is not syntactic difference between the sentences to which
such reaction time difference could be attributed.
4.2. Reading process: Self-paced reading and eye tracking
Another window into semantic processing is the observation of the reading process.
There are two techniques that have been used: (i) Self-paced reading, where subjects
are presented with a text in a word-by-word or phrase-by-phrasc fashion; the subject has
control over the speed of presentation, which is recorded, (ii) Eye tracking, where the
reading movements of the subject are recorded by cameras and matched with the text
being read. While self-paced reading is easier to handle as a research paradigm, it has
the disadvantage that it might not give fine-grained data, as subjects tend to get into a
rhythmical tapping habit.
Investigations of reading have provided many insights into semantic processing;
however, it should be kept in mind that by their nature they only help to investigate one
particular aspect of language use that lacks many features of spoken language.
For example, reading speed has been used to determine how speakers deal with
semantic ambiguity: Do they try to resolve it early on, which would mean that they slow
down when reading triggers of ambiguity,or do they entertain an undcrspccificd
interpretation? Frazier & Rayncr (1990) have shown that reading slows down after ambiguous
words, as e.g. in The records were carefully guarded {after they were scratched / after the
political takeover), showing evidence for an early commitment for a particular reading.
However, with polysemous words, no such slowing could be detected; an example is
Unfortunately the newspaper was destroyed, [lying in the rain / managing advertising so
poorly).
The newspaper example is a case of coercion, which shows effects for semantic
processing under the dual task paradigm (see discussion of Pifiango et al. 2006 above).
Indeed, Pickering, McElree & Frisson (2006) have shown that the aspectual coercion
cases do not result in increased reading times; thus different kinds of tests seem to differ
in their sensitivity.
Another area for which reading behavior has been investigated is the time course of
pronoun resolution: Arc pronouns resolved as early as possible, at the place where they
occur, or is the semantic processor procrastinating this decision? According to Ehrlich &
Rayner (1983), the latter is the case. They manipulated the distance between an
antecedent and its pronoun and showed that distance had an effect on reading times, but only
well after the pronoun itself was encountered.
4.3. Preferential looking and the visual world paradigm
Visual gaze and eye movement can be used in other ways as windows to meaning and
semantic processing.
One technique to investigate language understanding is the preferential looking
paradigm, a version of the picture selection task that can be administered to young infants.
Preferential looking has been used for the investigation of stimulus discrimination, as
infants look at new stimuli longer than at stimuli that they are already accustomed to. For
12. Varieties of semantic evidence
259
the investigation of semantic abilities, so-called "Intermodal Preferential Looking" is used:
Infants hear an expression and are presented at the same time with two pictures or movie
scenes side by side; they preferentially look at the one that fits the description best. Hirsh-
Pasek & Golinkoff (1996) have used this technique to investigate the understanding of
sentences by young children that produce only single-word utterances.
A second procedure that uses eye gaze is known as "Visual World Paradigm". The
general setup is as follows: Subjects are presented with a scene and a sentence or text,
and have to judge whether the sentence is true with respect to the scene. In order
to perform this verification, subjects have to glance at particular aspects of the scene,
which yields clues about the way how the sentence is verified or falsified, that is, how it
is scmantically processed.
In an early study, Eberhard et al. (1995) have shown that eye gaze tracks semantic
interpretation quite closely. Listeners use information on a word-by-word basis to reduce
the set of possible visual referents to the intended one. For example, when instructed to
Touch the starred yellow square, subjects were quick to look at the target in the left-hand
situation, slower in the middle situation, and slowest in the right-hand situation. Sedivy
et al. (1999) have shown that there are similar effects of incremental interpretation even
with non-intersective adjectives, like tall.
pink blue
n + n
yellow red
ffl □
pink red
ffl ffl
yellow blue
ffl ffl
blue red
ffl ffl
yellow yellow
ffl ffl
Fig. 12.2: Stimulus of eye gaze test (from Eberhard et al. 1995)
Altman & Kamide (1999) have shown that eye gaze is not just cotemporaneous with
interpretation, but may jump ahead; subjects listening to The boy will eat the... looked
preferentially at the picture of a cake than at the picture of something non-edible. In a
number of studies, including Weber, Braun & Crocker (2006), the effect of contrastive
accent has been studied. When listeners had already fixated one object - say, the purple
scissors - and now are asked to touch the RED scissors (where there is a competing
red vase), they gaze at the red scissors more quickly, presumably because the square
property is given. This effect is also present, though weaker, without contrastive accent,
presumably because the use of modifying adjectives is inherently contrastive.
For another example of this technique, see article 15 (Bott, Featherston, Rado &
Stolterfoht) Experimental methods.
5. Physiological effects of semantic processing
There is no clear-cut way to distinguishing physiological effects from behavioral effects.
With the physiological phenomena discussed in this section it is evident that they are
truly beyond conscious control, and thus may provide more immediate access to semantic
processing.
260
III. Methods in semantic research
Physiological evidence can be gained in a number of ways: From lesions of the
brain and how they affect linguistic performance, from excitations of brain areas
during surgery, from the observable metabolic processes related to brain activities, and
from the electro-magnetic brain potentials that accompany the firing of bundles of
neurons. There are other techniques that have been used occasionally, such as pupillary
dilation, which correlates with cognitive load. For example, Kriiger, Nuthmann & van der
Meer (2001) show with this measure that representations of event sequences following
their natural order are cognitively less demanding than when not following the time line.
5.1. Brain lesions and stimulations
Since the early discoveries of Broca and Wernicke, it has been assumed that specific
brain lesions affect the relation between expressions to meanings. The classical picture
of Broca's area responsible for production and Wernicke's area responsible for
comprehension is now known to be incomplete (cf. Damasio ct al. 2004), but it is still assumed
that Broca's aphasia impedes the ability to use complex syntactic forms to encode and
also to decode meanings. From lesion studies it became clear that areas outside the
classical Broca/Wernicke area and the connecting Geschwind area are relevant for language
production and understanding. Brain regions have been identified where lesions lead to
semantic dementia (also known as anomic aphasia) that selectively affects the
recognition of names of persons, nouns for manipulable objects such as tools, or nouns of natural
objects such as animals.These regions are typically situated in the left temporal lobe, but
the studies reported by Damasio et al. also indicate that regions of the right hemisphere
play an important role.
It remains unclear, however, whether these lesions affect particular linguistic abilities
or more general problems with the pre-linguistic categorization of objects. A serious
problem with the use of brain lesions as source of evidence is that they are often not
sufficiently locally constrained as to allow for specific inferences.
Stimulation techniques allow for more directed manipulations, and hence for more
specific testing of hypothesis.There are deep stimulation techniques that can be applied
during brain surgery. There is also a new technique, Transcranial Magnetic Stimulation
(TMS), which affects the functioning of particular brain regions by electromagnetic fields
applied from outside of the skull.
5.2. Brain imaging of metabolic effects
The last decades have seen a lively development of methods that help to locate brain
activity by identifying correlated metabolic effects. Neuronal activity in certain brain
regions stimulate the flow of oxygen-rich blood, which in turn can be localized by various
means. While early methods like PET (Positron-Electron Tomography) required the use
of radioactive markers, the method of fMRI (functional Magnetic-Resonance Imaging)
is less invasive; it is based on measuring the electromagnetic fields of water molecules
excited by strong magnetic fields. A more recent method, NIRS (Near Infrared
Spectroscopy), applies low-frequency laser light from outside the skull; it is currently the least
invasive technique. All the procedures mentioned have a low temporal resolution, as
metabolic changes are slow, within the range of a second or so. However, their spatial
resolution is quite acute, especially for fMRI using strong magnetic fields.
12. Varieties of semantic evidence
261
Results of metabolic brain-image techniques often support and refine findings derived
from brain lesions (cf. Damasio et al. 2004). As an example of a recent study, Tyler,
Randall & Stamatakis (2008) challenge the view that nouns and verbs are represented
in different brain regions; they rather argue that inflected nouns and verbs and minimal
noun phrases and minimal verb phrases, that is,specific syntactic uses of nouns and verbs,
are spatially differentiated. An ongoing discussion is how general the findings about
localizations of brain activities are, given the enormous plasticity of the brain.
5.3. Event-related potentials
This family of procedures investigates the electromagnetic fields generated by the
cortical activity. They are observed by sensors placed on the scalp that either track minute
variations of the electric field (EEG) or the magnetic field (MEG).The limitations of this
technique are that only fields generated by the neocortex directly under the cranium can
be detected. As the neocortex is deeply folded, this applies only to a small part of it.
Furthermore, the number of electrodes that can be applied on the scalp is limited (typically
16 to 64, sometimes up to 256), hence the spatial resolution is weak even for the
accessible parts of the cortex. Spatial resolution is better for MEG,but the required techniques
are considerably more complex and expensive. On the positive side, the temporal
resolution of the technique is very high, as it docs not measure slow metabolic effects of brain
activity, but the electric fields generated by the neurons themselves (more specifically, the
action potentials that cause neurotransmitter release at the synapses). EEG electrodes
can record these fields if generated by a large number of neurons in the pyramidal
bundles of neurons in which the cortex is organized, in the magnitude of at least 1000 neurons.
ERP (Event-related potentials), the correlation of EEG signals with stimuli events,
has been used for thirty years in psycholinguistic research, and specifically for semantic
processing since the discovery by Kutas & Hillyard (1980) of a specific brain potential,
the N400.This is a frequently observed change in the potential leading to higher
negativity roughly 400ms after the onset of a relevant stimulus. See Kutas, van Petten &
Klucndcr (2006) for a review of the vast literature, and Lau, Phillips & Pocppcl (2008)
for a partially critical view of standard interpretations.
The N400 effect is seen when subjects are presented in an incremental way with
sentences like / like my coffee with cream and [sugar / socks), and the EEG signals of the
first and the second variant is compared. In the second variant, with a semantically
incongruous word, a negativity around 400ms after the onset of the anomalous word (here:
socks) appears when the brain potential development is averaged over a number of trials.
-5pVH
Fig. 12.3: Averaged EEG over sentences with no semantic violation (solid line) and with semantic
violation (dotted line): vertical axis at the onset of the anomalous word (from Lau, Phillips & Poeppel 2008)
262
III. Methods in semantic research
There are at least two interpretations of the N400 effect: Most researchers see it as a
reflex of the attempt to integrate the meaning of a subexpression into the meaning of the
larger expression, as constructed so far. With incongruous words, this task is hard or even
fails, which is reflected by a stronger N400.The alternative view is that the N400 reflects
the effort of lexical access.This is facilitated when the word is predictable by the context,
but also when the word is frequent in general. There is evidence that highly frequent
words lead to a smaller N400 effect. Also, N400 can be triggered by simple word priming
tasks; e.g. in coffee - [tea/chair], the non-primed word chair leads to an N400. See Lau,
Phillips & Poeppel (2008) for consequences of the integration view and the lexical access
view of the N400.
The spatial location of the N400 is also a matter of dispute. While Kutas, van Petten &
Kluender (2006) claim that its origins are in the left temporal lobe and hence can be
related to established language areas, the main electromagnetic field can be observed
rather in the centroparietal region, and often on the right hemisphere. Lau, Phillips &
Poeppel (2008) discuss various possible interpretations of these findings.
There are a number of other reproducible electrophysiological effects that point at
additional aspects of language processing. In particular, Early Left Anterior Negativity
(ELAN) has been implicated in phrase structure violations (150ms), Left Anterior
Negativity (LAN) appears with morphosyntactic agreement violations (300-500ms), and P600,
a positivity after 600ms, has been seen as evidence for difficulties of syntactic
integration, perhaps as evidence for attempts at syntactic restructuring. It is being discussed
how specific N400 is for semantics; while it is triggered by phenomena that are clearly
related to the meaning aspects of language, it can be also found when subjects perform
certain non-linguistic tasks, as in melody recognition. Interestingly, N400 can be masked
by syntactic inappropriateness, as Hahne & Friederici (2002) have shown. This can be
explained by the plausible assumption that structures first have to make syntactic sense
before semantic integration can even start to take place.
There are a number of interesting specific findings around N400 or related brain
potentials (cf. Kutas, van Petten & Kluender 2006 for an overview). Closed-class words
generally trigger smaller N400 effects than open-class words, and the shape of their
negativity is different as well - it is more drawn out up to about 700ms. As already mentioned,
low-frequency words trigger greater N400 effects, which may be seen as a point in favor
for the lexical access theory; however, we can also assume that low frequency is a general
factor that impedes semantic integration. It has been observed that N400 is greater for
inappropriate concrete nouns than for inappropriate abstract nouns. With auditory
presentations of linguistic structures, it was surprising to learn that N400 effects can appear
already before the end of the triggering word; this is evidence that word recognition and
semantic integration sets in very early, after the first phonemes of a word.
The larger context of an expression can modulate the N400 effect, that is, the
preceding text of a sentence can determine whether a particular word fits and is easy to
integrate, or does not fit and leads to integration problems. For example, in a context in
which piercing was mentioned, earring triggers a smaller N400 than necklace. This has
been seen as evidence that semantic integration does not differentiate between lexical
access, the local syntactic fit and the more global semantic plausibility; rather, all factors
play a role at roughly the same time.
N400 has been used as evidence for semantic features. For example, in the triple The
pizza was too hot to [eat /drink / kill), the item drink elicits a smaller N400 than kill, which
12. Varieties of semantic evidence
263
can be interpreted as showing that the expected item eat and the test item drink have
semantic features in common (ingestion), in contrast to eat and kill.
Brain potentials have also been used to investigate the semantic processing of
negative polarity items (cf. article 64 (Giannakidou) Polarity items). Saddy, Drenhaus &
Frisch (2004) and Drenhaus et al. (2006) observe that negative polarity items in
inappropriate contexts trigger an N400 effect (as in [A /no] man was ever happy]. With NPIs
and with positive polarity items, a P600 could be observed as well, which is indicative for
an attempt to achieve a syntactic structure in which there is a suitable licensing
operator in the right syntactic configuration. Incidentally, these findings favor the semantic
integration view of the N400 over the lexical access view.
There are text types that require special efforts for semantic integration - riddles and
jokes. With jokes based on the reinterpretation of words, it has been found that better
comprehenders of jokes show a slightly higher N400 effect on critical words, and a larger
P600 effect for overall integration. Additional effort for semantic integration has also
been shown for metaphorical interpretations.
A negativity around 320ms has been identified by Fischler et al. (1985) for statements
known to the subjects to be false, even if they were not asked to judge the truth value.
But semantic anomaly clearly overrides false statements; as Kounios & Holcomb (1992)
have showed, in the examples like No dogs are [animals / fruits], the latter triggers an
N400 effect.
More recent experiments using MEG have discovered a brain potential called AMF
(Anterior Midline Field) situated in the frontal lobe, an area that is not normally implied
in language understanding. The effect shows up with coercion phenomena (cf. article 25
(de Swart) Mismatches and coercion). Coercion does not lead to an N400 effect; there is no
anomaly with John began the book (which has to be coerced to read or write the book). But
Pylkkancn & McElrcc (2007) found an AMF effect about 350ms after onset. This effect is
absent with semantically incongruous words, as well with words that do not require
coercion. Interestingly, the same brain area has been implied for the understanding of sarcastic
and ironic language in lesion studies (Shamay-Tsoory et al.2005).
6. Corpus-linguistic methods
Linguistic corpora, the record of past linguistic production, is a valuable source of
evidence for linguistic phenomena in general, and in case of extinct languages, the only
kind of source (cf. article 109 (Katz) Semantics in corpus linguistics). This includes the
study of semantic phenomena. For the case of extinct languages we would like to
mention, in particular, the task of deciphering, which consists in finding a mapping between
expressions and meanings.
Linguistic corpora can provide for evidence of meaning in many different ways. An
important philosophical research tradition is hermeneutics, originally the art of
understanding of sacred texts. Perhaps the most important concept in the modern hermeneutic
tradition is the explication of the so-called hermeneutic circle (cf. Gadamer 1960): The
interpreter necessarily approaches the text with a certain kind of knowledge that is
necessary for an initial understanding, but the understanding of the text in a first reading will
influence and deepen the understanding in subsequent readings.
With large corpora that are available electronically, new statistical techniques have
been developed that can tap into aspects of meaning that might otherwise be difficult
264
III. Methods in semantic research
to recognize. In linguistic corpora, the analysis of word co-occurrences and in particular
collocations can yield evidence for meaning relations between words.
For example, large corpora have been investigated for verb-NP collocations using the
so-called Expectation Maximation (EM) algorithm (Rooth et al. 1999).This algorithms
leads to the classification of verbs and nouns into clusters such that verbs of class X
frequently occur with nouns of class Y.The initial part of one such cluster, developed from
the British National Corpus, looks as in the following table. The verbs can be
characterized as verbs that involve scalar changes, and the nouns as denoting entities that can
move along such scales.
increase.as:s
increase:aso:o
fall.as:s
pay.aso:o
reduce. aso:o
rise.as.s
exceed.aso.o
exceed.aso:s
affect.aso.s
grow.as.s
number
rate
price
cost
level
amount
sale
value
interest
demand
• • • • •
Fig. 12.4: Clustering analysis of nouns and verbs; dots represent pairs that occur in the
corpus. "as:s" stands for subjects of intransitive verbs, "aso.s" and "aso:o" for subjects and
objects of transitive verbs, respectively (from Rooth et al. 1999)
We can also look at the frequency of particular collocations within this cluster, as
illustrated in the following table for the verb increase.
Tab. 12.1: Frequency of nouns occurring with INCREASE (from Rooth et al. 1999)
increase
number
demand
pressure
temperature
cost
134.147
30.7322
30.5844
25.9691
23.9431
propoi
size
rate
level
price
tion
23.8699
22.8108
20.9593
20.7651
17.9996
While pure statistical approaches as Rooth et al. (1999) are of considerable interest,
most applications of large-scale corpus-based research are based on a mix between
hand-coding and automated procedures. The best-known project that has turned into
an important application is WordNet (Fellbaum (ed.) 1998). A good example for the
mixed procedure is Gildea & Juravsky (2002), a project that attempted semi-automatic
assignment of thematic roles. In a first step, thematic roles were hand-coded for a large
number of verbs, where a large corpus provided for a wide variety of examples.These
initial examples, together with the coding, were used to train an automatic syntactic parser,
which then was able to assign thematic roles to new instances of known predicates and
even to new, unseen predicates with reasonable accuracy.
12. Varieties of semantic evidence
265
Yet another application of corpus-linguistic methods involves parallel corpora,
collections of texts and their translations into one or more other languages. It is
presupposed that the meanings of the texts are reasonably similar (but recall the problems
with translations mentioned above). Refined statistical methods can be used to train
automatic translation devices on a certain corpus, which then can be extended to new
texts that then are translated automatically, a method known as example based machine
translation.
For linguistic research, parallel corpora have been used in other ways as well. If a
language a marks a certain distinction overtly and regularly, whereas language p marks that
distinction only rarely and in irregular ways, good translations pairs of texts from a into (3
can be used to investigate the ways and frequency in which the distinction in (3 is marked.
This method is used, for example, in von Heusinger (2002) for specificity, using Umberto
Eco's Ilnome della rosa, and Behrens (2005) for genericity, using Sait-Exup£ry's Le petit
prince. The articles in Cysouw & Walchli (2007) discuss the potential of the technique,
and its problems, for typological research.
7. Conclusion
This article, hopefully, has shown that the elusive concept of meaning has many reflexes
that we can observe, and that semantics actually stands on as firm grounds as other
disciplines of linguistics. The kinds of evidence for semantic phenomena are very diverse,
and not always as convergent as semanticists might wish them to be. But they provide for
a very rich and interconnected area of study that has shown considerable development
since the first edition of the Handbook Semantics in 1991. In particular, a wide variety
of experimental evidence has been adduced to argue for processing of meaning. It is to
be hoped that the next edition will show an even richer and, hopefully, more convergent
picture.
8. References
Altniann, Gerry & Yuki Kamide 1999. Incremental interpretation at verbs: Restricting the domain
of subsequent reference. Cognition 73,247-264.
Behrens, Leila. 2005. Genericity from a cross-linguistic perspective. Linguistics 43,257-344.
Berlin, Brent, Dennis E. Breedlove & Peter H. Raven 1973. General principles of classification and
nomenclature in folk biology. American Anthropologist 75,214-242.
Berlin, Brent & Paul Kay 1969. Basic Color Terms. Their Universality and Evolution. Berkeley, CA:
University of California Press
Bloomfield. Leonard 1933. Language. New York: Henry Holt and Co.
Chafe. Wallace (ed.) 1980. The Pear Stories: Cognitive, Cultural, and Linguistic Aspects of Narrative
Production. Norwood, NJ: Ablex.
Clark, Herbert H. & Peter Lucy 1975. Understanding what is meant from what is said: A study in
conversationally conveyed requests. Journal of Verbal Learning and Verbal Behavior 14,56-72.
Crain, Stephen 1991. Language acquisition in the absence of experience. Behavioral and Brain
Sciences 14,597-650.
Crain, Stephen, Rosalind Thornton, Laura Conway & Diane Lillo-Martin 1998. Quantification
without quantification. Language Acquisition 5,83-153.
Cysouw, Michael & Bernhard Walchli 2007. Parallel texts: Using translational equivalents in
linguistic typology. Special issue of Sprachtypologie und Universalienforschung 60,95-99.
266
III. Methods in semantic research
Damasio, Hanna. Daniel Tranel, Thomas Grabowski, Ralph Adolphs & Antonio Damasio 2004.
Neural systems behind word and concept retrieval. Cognition 92,179-229.
Drenhaus, Heiner, Peter beim Graben, Doug Saddy & Stefan Frisch 2006. Diagnosis and repair of
negative polarity constructions in the light of symbolic resonance analysis. Brain & Language
96,255-268.
Eberhard, Kathleen, Michael J. Spivey-Knowlton, Judy Sedivy & Michael Tanenhaus 1995. Eye
movement as a window into real-time spoken language comprehension in a natural context.
Journal of Psycholinguistic Research 24,409-436.
Ehrlich, Kate & Keith Rayner 1983. Pronoun assignment and semantic integration during reading:
Eye movement and immediacy of processing. Journal of Verbal Learning and Verbal Behavior
22,75-87.
Fellbaum. Christiane (ed.) 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: The
MIT Press
Fischlcr, Ira, Donald G. Childcrs,Tecra Achariyapaopan & Nathan W. Perry 1985. Brain potentials
during sentence verification: Automatic aspects of comprehension. Biological Psychology 21,
83-105.
Frazier, Lyn & Keith Rayner 1990.Taking on semantic commitments: Processing multiple meanings
vs. multiple senses. Journal of Memory and Language 29,181-200.
Frege, Gottlob 1892. Uber Sinn und Bedeutung. Zeitschrift fiir Philosophic und philosophische
Kritik 100,25-50.
Gadamer. Hans Georg 1960. Wahrheit und Methode. Grundzuge einer philosophischen Herme-
neutik. Tubingen: Mohr.
Gaidenfors, Peter 2000. Conceptual Spaces - The Geometry of Thought. Cambridge, MA:The MIT
Press
Gei ken, Lou Ann & Michele E. Shady 1998. The picture selection task. ImD.McDaniel.C.McKee &
H. S. Cairns (eds). Methods for Assessing Children's Syntax. Cambridge, MA:The MIT Press,
125-146.
Gildea, Daniel & Daniel Jurafsky 2002. Automatic labeling of semantic roles. Computational
Linguistics 28,245-288.
Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University
Press
Goodman, Nelson 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press
Gordon. Peter 1998. The truth-value judgment task. In: D. McDaniel, C. McKee & H. S. Carins
(eds). Methods for Assessing Children's Syntax. Cambridge, MA:The MIT Press, 211-235.
Gregory, Robert J. 2003. An early history of land on Tanna, Vanuatu. The Anthropologist 5,
67-74.
Grice. Paul 1957. Meaning. The Philosophical Review 66,377-388.
Hahne, Anja & Angela D. Friederici 2002. Differential task effects on semantic and syntactic
processes are revealed by ERPs Cognitive Brain Research 13,339-356.
Haviland, John B. 1974. A last look at Cook's Guugu-Yimidhirr wordlist. Oceania 44,216-232.
von Heusinger, Klaus 2002. Specificity and definiteness in sentence and discourse structure. Journal
of Semantics 19,245-274.
Hill, Clifford. 1982. Up/down, front/back, left/right: A contrastive study of Hausa and English. In:
J. Weissenborn & W. Klein (eds). Here and There: Cross-Linguistic Studies on Deixis and
Demonstration. Amsterdam: Benjamins 11-42.
Hirsh-Pasck,Kathryn & Roberta M. Golinkoff 1996. The Origin of Grammar: Evidence from Early
Language Comprehension. Cambridge, MA:Thc MIT Press
Kaplan, David 1978. On the logic of demonstratives Journal of Philosophical Logic VIII, 81-98.
Reprinted in: P. French, T Uehling & H.K. Wettstein (eds). Contemporary Perspectives in the
Philosophy of Language. Minneapolis MN: University of Minnesota Press 1979,401-412.
Kay, Paul, Brent Berlin, Luisa Maffi & Wiliam R. Merrifield 2008. The World Color Survey.
Stanford, CA: CSLI Publications
Kempton, William. 1981. The Folk Classification of Ceramics. New York: Academic Press
12. Varieties of semantic evidence
267
Kim, Christina 2008. Processing presupposition: Verifying sentences with 'only'. In: J. Tauberer,
A. Eilam & L. MacKenzie (eds.). Proceedings of the 31st Penn Linguistics Colloquium.
Philadelphia, PA: University of Pennsylvania.
Kounios, John & Phillip J. Holcomb 1992. Structure and process in semantic memory: Evidence
from event-related brain potentials and reaction times. Journal of Experimental Psychology.
General 121,459-479.
Kranstedt, Alfred, Andy Lucking,Thies Pfeiffer & Hannes Rieser 2006. Deixis: How to determine
demonstrated objects using a pointing cone. In: Sylvie Gibet, Nicolas Courty & Jean-Franqois
Kamp (eds.). Gesture in Human-Computer Interaction and Simulation. Berlin: Springer. 300-311.
Kriiger. Frank. Antje Nuthmann & Elke van der Meer 2001. Pupillometric indices of the temporal
order representation in semantic memory. Zeitschrift fur Psychologie 209,402-415.
Kutas, Marta & Steven A. Hillyard 1980. Reading senseless sentences: Brain potentials reflect
semantic incongruity. Science 207.203-205.
Kutas, Marta, Cyma K. van Pcttcn & Robert Kluender 2006. Psycholinguistics electrified II 1994-
2005. In: M.A. Gcrnsbachcr & M. Traxlcr (eds.). Handbook of Psycholinguistics. New York:
Elsevier, 83-143.
Lakoff, George & Mark Johnson 1980. Metaphors we Live by. Chicago, IL: The University of
Chicago Press
Lau. Ellen F, Colin Phillips & David Poeppel 2008. A cortical network for semantics: (de)
constructing the N400. Nature Reviews Neuroscience 9,920-933.
McDaniel, Dena, Cecile McKee & Helen Smith Cairns (eds.) 1996. Methods for Assessing
Children's Syntax. Cambridge, MA:The MIT Press
Majid.Asifa, Melissa Bowerman, Miriam van Staden & James Boster 2006.The semantic categories
of cutting and breaking events: A crosslinguistic perspective. Cognitive Linguistics 18,133-152.
Mattliewson, Lisa 2004. On the methodology of semantic fieldwork. International Journal of
American Linguistics 70,369-415.
Noveck, Ira 2001. When children arc more logical than adults: Experimental investigations of scalar
implicatures. Cognition 78,165-188.
Noveck, Ira 2004. Pragmatic inferences linked to logical terms. In: I. Noveck & D. Sperber (eds.).
Experimental Pragmatics. Basingstoke: Palgravc Macmillan, 301-321.
Pickering, Martin J., Brian McElrcc & Steven Frisson 2006. Undcrspccification and aspectual
coercion. Discourse Processes 42,131-155.
Pinango, Maria M.. Aaron Winnick, Rash ad Ullah & Edgar Zurif 2006. Time course of semantic
composition:The case of aspectual coercion. Journal of Psycholinguistic Research 35,233-244.
Portlier, Paul & Katsuhiko Yabushita 1998.The semantics and pragmatics of topic phrases.
Linguistics & Philosophy 21.117-157.
Potts, Christopher 2005. The Logic of Conventional Implicatures. Oxford: Oxford University Press
Pylkkanen, Liina & Brian McElree 2007. An MEG study of silent meaning. Journal of Cognitive
Neuroscience 19,1905-1921.
Quine, Willard van Ornian 1960. Word and Object. Cambridge, MA:Thc MIT Press, 26-79.
Rayner, Keith 1998. Eye movements in reading and information processing: 20 years of research.
Psychological Bulletin 124,372-422.
Rooth, Mats, Stefan Riezler, Detlef Prescher & Glenn Carroll 1999. Inducing a semantically
annotated lexicon via EM-based clustering. In: R. Dale & K. Church (eds.). Proceedings of the 37th
Annual Meeting of the Association for computational Linguistics. College Park, MD: ACL.
104-111.
Saddy, Douglas, Hcincr Drcnhaus & Stefan Frisch 2004. Processing polarity items: Contrastivc
licensing costs. Brain & Language 90,495-502.
van der Sandt, Rob A. 1988. Context and Presupposition. London: doom Helm.
Sedivy, Julie C, Michael K. Tanenhaus, Craig C. Chambers & Gregory N. Carlson 1999. Achieving
incremental semantic interpretation through contextual representation. Cognition 71,109-147.
Shamay-Tsoory. Simone, Rachel Tomer & Judith Aharon-Peretz 2005. The neuroanatomical basis
of understanding sarcasm and its relationship to social cognition. Neuropsychology 19,288-300.
268
III. Methods in semantic research
Smith, Carlotta L. 1980. Quantifiers and question answering in young children. Journal of
Experimental Child Psychology 30,191-205.
Stalnaker, Robert 1974. Pragmatic presuppositions. In: M. K. Munitz & P. K. Unger (eds.). Semantics
and Philosophy. New York: New York University Press, 197-214.
Thornton, Rosalind 1998. Elicited production. In: D. McDaniel, C. McKee & H. S. Carins (eds.).
Methods for Assessing Children's Syntax. Cambridge, MA: The MIT Press, 77-95.
Tomasello, Michael 2008. Why don't apes point? In: R. Eckardt, G. Jager & Tonjes Veenstra (eds.).
Variation, Selection, Development. Probing the Evolutionary Model of Language Change. Berlin:
Mouton de Gruyter, 375-394.
Tyler, Lorraine K., Billi Randall & Emmanuel A. Stamatakis 2008. Cortical differentiation for nouns
and verbs depends on grammatical markers. Journal of Cognitive Neuroscience 20,1381-1389.
Weber, Andrea, Bettina Braun & Matthew W. Crocker 2006. Finding referents in time: Eye-tracking
evidence for the role of contrastive accents. Language and Speech 49.367-392.
Weinreich, Uriel 1964. Webster's Third. A critique of its semantics. International Journal of
American Linguistics 30,405-409.
Manfred Krifka, Berlin (Germany)
13. Methods in cross-linguistic semantics
1. Introduction
2. The need for cross-linguistic semantics
3. On semantic fieldwork methodology
4. What counts as a universal?
5. How do we find universals?
6. Semantic universals in the literature
7. Variation
8. References
Abstract
This article outlines methodologies for conducting research in cross-linguistic semantics,
with an eye to uncovering semantic universals. Topics covered include fieldwork
methodology, types of evidence for semantic universals, different types of semantic universals
(with examples from the literature), and semantic parameters.
1. Introduction
This article outlines methodologies for conducting research in cross-linguistic
semantics, with an eye to uncovering semantic universals. Section 2 briefly motivates the need
for cross-linguistic semantic research, and section 3 outlines fieldwork methodologies
for work on semantics. Section 4 addresses the issue of what counts as a universal, and
section 5 discusses how one finds universals. Section 6 provides examples of semantic
universals, and section 7 discusses variation. The reader is also referred to articles 12
(Krifka) Varieties of semantic evidence and 95 (Bach & Chao) Semantic types across
languages.
Maienborn, von Heusmger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 268-285
13. Methods in cross-linguistic semantics 269
2. The need for cross-linguistic semantics
It is by now probably uncontroversial that our field's empirical base needs to be as broad
as possible. Typologists have always known this;gcncrativists have also mostly advanced
beyond the view that we can uncover Universal Grammar by doing in-depth study of a
single language. Although semantics was the last sub-field of formal linguistics to
undertake widespread cross-linguistic investigation, such work has been on the rise ever since
the synthesis of generative syntax and Montague semantics in the 1980s. In the 20 years
since Comrie (1989: 4) wrote that for a 'Chomskyist' for whom universals are abstract
principles, "there is no way in which the analysis of concrete data from a wide range
of languages would provide any relevant information", we have seen many instances
where empirical cross-linguistic work has falsified abstract universal principles. Among
many other others, Maria Bittner's pioneering work on Kalaallisut (Eskimo-Aleut,
Greenland) is worth mentioning here (e.g., Bittncr 1987,1994,2005,2008), as well as the
papers in Bachet al.'s (1995) quantification volume, and the body of research responding
to Chierchia's (1998) Nominal Mapping Parameter. It is clear that we will only come close
to discovering semantic universals by looking at data from a wide range of languages.
3. On semantic fieldwork methodology
The first issue for a cross-linguistic semantic researcher is how to obtain usable data in
the field. The material presented here draws on Matthcwson (2004); the reader is also
referred to article 12 (Krifka) Varieties of semantic evidence. The techniques described
here are theory-neutral; I assume that regardless of one's theoretical assumptions,
similar data-collection processes are appropriate.
3.1. In support of direct elicitation
It was once widely believed that direct elicitation is an illegitimate methodology; see for
example Harris & Voegelin (1953: 59). Even in this century we find statements like the
following:
The referential meaning of nouns (in terras of definiteness and specificity) is an intricate
topic that is extremely hard to investigate on the basis of elicitation. In the end it is texts or
connected discourse in general in the language under investigation which provide the most
important clues for analysis of these grammatical domains.
(Dimmcndaal2001:69)
It is true that connected discourse is an indispensable part of any investigation into noun
phrase semantics. However, this methodology alone is insufficient.
Within semantics, there are two main reasons why textual materials are insufficient
as an empirical base. The first is shared with syntactic research, namely that one cannot
obtain negative evidence from texts. One cannot claim with any certainty that any
structure is impossible, if one only investigates spontaneously produced data. The second
reason is specific to semantics, namely that texts provide little direct evidence about
meaning. A syntactician working with a textual corpus has at least a number of sentences
which may be assumed to be grammatical. But a text is paired with at most a translation -
an incomplete representation of meaning. Direct elicitation results in more detailed,
270
III. Methods in semantic research
targeted information than a translation. (This is not to deny that examination of texts is
useful; texts are good sources of information about topic-tracking devices or reference
time maintenance, for example.)
3.2. Eliciting semantic judgments
The semantics of utterances or parts of utterances are not consciously accessible to
native speakers. Comments, paraphrases and translations offered by consultants are
all useful clues for the fieldworker (see article 12 (Krifka) Varieties of semantic
evidence). However, just as a syntactician does not ask a native speaker to provide a tree-
structure (but rather asks for grammaticality judgments of sentences which are designed
to test for constituency, etc.), so a semanticist does not ask a native speaker to conduct
semantic analysis.This includes generally avoiding questions such as 'what does this word
mean?','when is it okay to use this word?' and 'docs this sentence have two meanings?'
Instead, we mainly proceed by requesting judgments on the acceptability of sentences in
particular discourse contexts.
A judgment is an acceptance or rejection of some linguistic object, and ideally reflects
the speaker's native competence. (However, see Carden 1970, Schiitze 1996 on the
problem of speaker variability in judgments.) I assume there is only one kind of
judgment within semantics, that of whether a grammatical sentence (or string of sentences) is
acceptable in a given discourse context. Native speakers are not qualified to give
'judgments' about technical concepts such as entailment, tautology, ambiguity or vagueness
(although of course consultants' comments can give valuable clues about these issues).
All of these concepts can and should be tested for via acceptability judgment tasks.
The acceptability judgment task is similar to the truth value judgment task used in
language acquisition research (cf. Crain & Thornton 1998 and much other work; see
article 103 (Crain) Meaning in first language acquisition). The way to elicit an
acceptability judgment is to first provide a discourse context to the consultant, then present an
utterance, and ask whether the utterance is acceptable in that context. The consultant's
answer potentially gives us information about the truth-value of the sentence in that
context, and about the pragmatic felicity of the sentence in that context (e.g., whether its
presuppositions are satisfied).
The relation of the acceptability judgment task to truth values derives from Grice's
(1975) Maxim of Quality: we assume that a speaker will only accept a sentence S in a
discourse context C if S is true in C. Conversely, if S is false in C, a speaker will reject S
in C. From this it follows that acceptance of a sentence gives positive information about
truth value (the sentence is true in this discourse context), but rejection of a sentence
gives only partial information: the sentence may be false, but it may also be rejected on
other grounds. Article 103 (Crain) Meaning in first language acquisition offers useful
discussion of the influence of the Cooperative Principle on responses to acceptability
judgment tasks. And see von Fintel (2004) for the claim that speakers will assign the truth-
value 'false' to sentences which involve presupposition failure, and which are therefore
actually infelicitous (and analyzed as having no truth value).
When conducting an acceptability judgment task, it is important to present the
consultant with the discourse context before presenting the object language sentence(s). If the
consultant hears the sentence in the absence of a context, s/he will spontaneously think
of a suitable context or range of contexts for the sentence. If the speaker's imagined
13. Methods in cross-linguistic semantics 271_
context differs from the one the researcher is interested in, false negatives can arise,
particularly if the researcher is interested in a dispreferred or non-obvious reading.
The other important question is how to present the discourse context. In section 3.4
I will discuss the use of non-verbal stimuli when presenting contexts. Concentrating for
now on verbal strategies, we have two choices: explain the context in the object language,
or in a meta-language (a language the researcher is fluent in). Either method can work,
and I believe that there is little danger in using a meta-language to describe a context.
The context is merely background information; its linguistic features are not relevant. In
fact, presenting the context in a meta-language has the advantage that the consultant is
unlikely to copy structural features from the context-description to the sentence being
judged. (In Matthcwson 2004, I argue that even in a translation task, the influence of
meta-language syntax and semantics is less than has sometimes been feared. Based on
examples drawn from Salish, I argue that consultants will only copy structures up to the
point allowed by their native grammar, and that any problems of cross-language influence
are easily solvable by means of follow-up elicitation.) See article 12 (Krifka) Varieties of
semantic evidence for further discussion of issues in presenting discourse contexts.
3.3. Eliciting translations
Translations have the same status as paraphrases or additional comments about meaning
offered by the consultant: they are a clue to meaning, and should be taken seriously, but
they are not primary data. A translation represents the consultant's best effort to express
the same truth conditions in another language - but often, there is no way to express
exactly the same truth conditions in two different languages. Even if we assume effability,
i.e. that every proposition expressible in one language is expressible in any language
(Katz 1976), a translation task will usually fall short of pairing a proposition in one
language with a truth-conditionally identical counterpart in the other language.The reasons
for this include the fact that what is easily expressible in one language may be expressible
only with difficulty in another, or that what is expressed using an unambiguous
proposition in one language is best rendered in another language by an utterance which is
ambiguous or vague. Furthermore, what serves as a good translation of a sentence in one
discourse context may be a poor or inappropriate translation of the same sentence in a
different discourse context. As above, the only direct evidence about truth conditions are
acceptability judgments in particular contexts.
The problem of ambiguity is acute when dealing with a translation or paraphrase task.
If an object-language sentence is ambiguous, its translation or paraphrase often only
produces the preferred reading. And a consultant will often want to explicitly disambiguate;
after all, Gricc's Manner Maxim exhorts a cooperative speaker to avoid ambiguity. If
one suspects that a sentence may be ambiguous, one should not simply give the sentence
and ask what it means. Instead, present a discourse context, and elicit an acceptability
judgment. Eliciting the dispreferred reading first is a good idea (or, if the status of the
readings is not known, using different elicitation orders on different days). It is also a
good idea to construct discourse contexts which pragmatically favour the dispreferred
reading. Manipulating pragmatic plausibility can also establish the absence of ambiguity:
if a situation pragmatically favours a certain possible reading, but the consultant rejects
the sentence in that context, one can be pretty sure that the reading is absent.
272
III. Methods in semantic research
When dealing with presuppositions, or other aspects of meaning which affect felicity
conditions, translations provide notoriously poor information. Felicity conditions are
usually ignored in the translation process; the reader is referred to Matthewson (2004)
for examples of this.
When eliciting translations, the basic rule is only to ask for translations of complete,
grammatical sentences. (See Nida 1947:140 for this point; Harris &Voegelin 1953:70-71,
on the contrary, advise that one begin by asking for translations of single morphemes.)
Generally speaking, any sub-sentential string will probably not be translatable with
any accuracy by a native speaker. A sub-sentential translation task rests on the faulty
assumption that the meaning which is expressed by a certain constituent in one language
is expressible by a constituent in another language. Even if the syntactic structures are
roughly parallel in the two languages, the meaning of a sub-sentential string may depend
on its environment (cf. for example languages where bare noun phrases are interpreted
as specific or definite only in certain positions).
The other factor to be aware of when eliciting translations is that a sentence which
would result in ungrammatically if it were translated in a structure-preserving way, will
be restructured. In Matthewson (2004) I provide an example of this involving testing for
Condition C violations in Salish. Due to pro-drop and free word order, the St'at'imcets
(Lillooet Salish, British Columbia) sentence which translates Mary saw her mother is
structurally ambiguous with She saw Mary's mother. The wrong way to test for
potential Condition C-violating structures (which has been tried!) is to ask for translations
of the potentially ambiguous sentence into English. Even if the object language does
allow Condition C violations (i.e., does allow the structure Shet saw Mary^s mother),
the consultant will never translate the sentence into English as She saw Mary's mother.
(See Davis 2009 for discussion of strategies for dealing with the particular elicitation
problems of Condition C.)
3.4. Elicitation using non-verbal stimuli
It was once widely believed that elicitation should proceed by means of non-verbal stimuli,
which are used to generate spontaneous speech in the object language, and that a
metalanguage should be entirely avoided (see for example Hayes 1954, Yegerlehner 1955, or
Aitken 1955). It will already be clear that I reject this view, as do most modern researchers.
Elicitation using only visual stimuli cannot provide negative evidence, for example.
Visual stimuli are routinely used in language acquisition experiments; see Crain &
Thornton (1998), among others.The methodology is equally applicable to fieldwork with
adults. The stimuli can be created using computer technologies, but they do not need
to be high-tech; they can involve puppets, small toys, or line-drawings. Usually, these
methodologies arc not intended to replace verbal elicitation, but serve as a support and
enhancement for it. For example, a picture or a video will be shown, and may be used
to elicit spontaneous discourse, but is also usually followed up with judgment questions.
The advantage of visual aids is obvious: the methodology avoids potential interference
from linguistic features of the context description, and it can potentially more closely
approximate a real-life discourse context. This is particularly relevant for pragmatically
sensitive questions, where it is almost impossible to hold cues in the metalanguage (e.g.,
intonation) constant.This methodology also allows standardization across different
elicitation sessions, different fieldworkers, and different languages. When researchers share
13. Methods in cross-linguistic semantics 273
their video clips, for example, we can be certain that the same contexts are being tested
and can compare cross-linguistic results with more confidence.
Another advantage of visual stimuli concerns felicity conditions. It is quite challenging
to elicit information about presupposition failure using only verbal descriptions of
discourse contexts, as the necessary elements of the context include information about the
interlocutors' belief or knowledge states. However, video, animation or play-acting give a
potential way to represent a character's knowledge state, and if carefully constructed, the
stimulus can rule out potential presupposition accommodation.The consultant can then
judge the acceptability of an utterance by one of the characters in the video/play which
potentially contains presupposition failure.
The main disadvantage of these methodologies is logistics: compared to verbally
describing a context, visually representing it takes more time and effort. Many hours can
go into planning (let alone filming or animating) a video for just one discourse context. In
many cases, verbal strategies are perfectly adequate and more efficient. For example, when
trying to establish whether a particular morpheme encodes past tense, it is relatively simple
to describe a battery of discourse contexts for a single sentence and obtain judgments for
each context. One can quickly and unambiguously explain that an event took place
yesterday, is taking place at the utterance time, or will take place tomorrow, for example. In
sum, verbal elicitation still remains an indispensable and versatile methodology.
The next section turns to semantic universale We first discuss what semantic
universal look like, and then how one gets from cross-linguistic data to the postulation of
universale
4. What counts as a universal?
Universals come in two main flavours: absolute/unconditional ('all languages have x'),
and implicational ('if a language has x, it has y'). Examples of each type are given in (1)
and (2); all of these except (lb) can be found in the Konstanz Universals Archive.
(1) Absolute/unconditional universals:
a. Every language has an existential element such as a verb or particle (Ferguson
1972:78-79).
b. Every language has expressions which are interpreted as denoting individuals
(Article 95 (Bach & Chao) Semantic types across languages).
c. Every language has N's and NPs of the type e —> t (Partee 2000).
d. The simple NP of any natural language express monotone quantifiers or
conjunctions of monotone quantifiers (Barwise & Cooper 1981:187).
(2) Implicational universals:
a. If a language has adjectives for shape, it has adjectives for colour and size (Dixon
1977).
b. A language distinguishes between count and mass nouns iff a language
possesses configurational NPs (Gil 1987).
c. There is a simple NP which expresses the [monotone decreasing] quantifier ~Q
if and only if there is a simple NP with a weak non-cardinal determiner which
expresses the [monotone increasing] quantifier Q (Barwise & Cooper 1981:
186).
274
III. Methods in semantic research
Within the typological literature, 'universals' are often actually tendencies ('statistical
universals'). Many of the entries in the Konstanz Universals Archive contain phrases like
'with more than chance frequency','more likely to','most often','typically' or 'tend to'.
For obvious reasons, one does not find much attention paid to statistical universals in the
generativist literature.
Another difference between typological and formal research relates to whether more
importance is attached to implicational or unconditional universals.Typological research
finds the implicational kind more interesting, and there is a strand of belief according
to which it is difficult to distinguish unconditional universals from the preconditions of
one's theory. For example, take (lc) above. Is this a semantic universal, or a theoretical
decision under which wc describe the semantic behaviour of languages? (Thanks to a
reviewer for discussion of this point.)
There are certainly unconditional universals which seem to be empirically unfalsi-
fiable. One example is semantic compositionality, which is assumed by many formal
semanticists to hold universally. As discussed in von Fintel & Matthewson (2008) and
references therein, compositionality is a methodological axiom which is almost impossible
to falsify empirically. On the other hand, many unconditional universals are falsifiable;
for example, Barwise and Cooper's NP-Quantifier Universal, introduced in section 5
below. Even (lc) does make a substantive claim, although worded in theoretical terms. It
asserts that all languages share certain meanings; it is in a sense irrelevant if one doesn't
agree that wc should analyze natural language using the types c and t.
If typologists are suspicious of unconditional universals, formal semanticists may tend
to regard implicational universals as merely a stepping stone to a deeper discovery. Many
of the implications as they are stated are probably not primitively part of Universal
Grammar. They may either have a functional explanation, or they may be descriptive
generalizations which should be derivable from the theory rather than forming part of it.
For example, take the semantic implicational universal in (3) (provided by a reviewer):
(3) If a language has a definite article, then a type shift from <e,t> to «e,t>,t> is not
freely available (without using the article).
(3) is ideally merely one specific instance of a general ban on type-shifting when there is
an overt element which does the job.This in turn may follow from some general economy
principle, and not need to be stated separately.
A final point of debate relates to level of abstractness. Formal semanticists often propose
constraints which are only statable at a high level of abstractness, but some researchers
believe that universals should always be surface testable. For example, Comrie (1989) argues
that abstractly-stated universals have the drawback that they rely on analyses which may
be faulty. And Levinson (2003:320) argues for surface-testability because "highly abstract
generalizations ... are impractical to test against a reasonable sample of languages."
Abstractness is required in the field of semantics because superficial statements almost
never reveal what is really going on.This becomes clear if one considers how much
reliable information about meaning can be gleamed from sources like descriptive grammars.
In fact, the inadequacy of superficial studies is illustrated by the very examples cited
by Levinson (2003) in support of his claim that "Whorf's (1956: 218) emphasis on the
'incredible degree of linguistic diversity of linguistic systems over the globe' looks
considerably better informed than the opinions of many contemporary thinkers." Lcvinson's
13. Methods in cross-linguistic semantics 275
examples include variation in lexical categories, and the fact that some languages have
no tense (Levinson 2003: 328). However, these are areas where examination of
superficial data can lead to hasty conclusions about variation, and where deeper investigation
often reveals underlying similarities.
With respect to lexical categories, Jelinek (1995) famously advanced the strong and
interesting hypothesis that Straits Salish (Washington and British Columbia) lacks cat-
egorial distinctions. The claim was based on facts such as that any open-class lexical
item can function as a predicate in Straits, without a copular verb. However, subsequent
research has found, among other things, that the head of a relative clause cannot be
occupied by just any open-class lexical item. Instead, this position is restricted to elements
which correspond to nouns in languages like English (Dcmirdachc & Matthcwson 1995,
Davis & Matthewson 1999, Davis 2002;see also Baker 2003).This is evidence for catego-
rial distinctions, and Jelinek herself has since rejected the category-neutral view of Salish
(orally,at the 2002 International Conference on Salish and Neighbouring Languages;see
also Montler 2003).
As for tense, while tenseless languages may exist, we cannot determine whether a
language is tensed or not based on superficial evidence. I will illustrate this based on
St'at'imcets; see Matthewson (2006b) for details. St'at'imcets looks like a tenseless
language on the surface, as there is no obligatory morphological encoding of the distinction
between present and past. (4) illustrates this for an activity predicate; the same holds for
all aspectual classes:
(4) mets-c31=lhkan
write-ACT=lsG.suBJ
'I wrote /1 am writing.'
St'at'imcets does have obligatory marking for future interpretations; (4) cannot be
interpreted as future.The main way of marking the future is with the future modal kelh.
(5) mets-cai=lhkan=A;e//j
Write-ACT=lSG.SUBJ=fCT
'I will write.'
The analytical problem is non-trivial here: the issue is whether there is a phonologically
unpronounccd tense in (4).The assumption of null tense would make St'at'imcets similar
to English, and therefore would be the null hypothesis (see section 5.1).The hypothesis is
empirically testable language-internally, but only by looking beyond obvious data such as
that in (4)-(5). For example, it turns out that St'at'imcets and English behave in a
strikingly parallel manner with respect to the temporal shifting properties of embedded
predicates. In English, when a stative past-tense predicate is embedded under another past
tense, a simultaneous reading is possible.The same is true in St'at'imcets, as shown in (6).
(6) tsut=tu7 s=Pauline [kw=s=guy't-aTmen=s=tu7]
say=thcn NOM=Paulinc [DET=NOM=slccp-want=3poss=thcn]
'Pauline said that she was tired.'
OK: Pauline said at a past time t that she was tired at t
276
III. Methods in semantic research
On the other hand, in English a future embedded under another future does not allow
a simultaneous reading, but only allows a forward-shifted reading (Enc, 1996, Abusch
1998, among others).The same is true of St'aYimcets, as shown in (7). Just as in English,
the time of Pauline's predicted tiredness must be later than the time at which Pauline
will speak.
(7) lsHl=kelh s=Pauline [kw=s=guy't-aTmen=s=A;e//j]
say=FUT NOM=Pauline [DET=NOM=sleep-want=3Poss=fcr]
'Pauline will say that she will be tired.'
♦Pauline will say at a future time t that she is tired at t.
OK: Pauline will say at a future time t that she will be tired at t' after t.
The parallel behaviour of the two languages can be explained if we assume that the
St'aYimcets matrix clauses in (6)-(7) contain a phonologically covert non-future tense
morpheme (Matthewson 2006b). The embedding data can then be afforded the same
explanation as the corresponding facts in English (see e.g., Abusch 1997,1998, Ogihara
1996 for suggestions).
We can see that there is good reason to believe that St'aYimcets is tensed, but the
evidence for the tense morpheme is not surface-obvious. Of course, there are other
languages which have been argued to be tense less (see e.g., Bohnemeyer 2002, Bittner 2005,
Lin 2006, Ritter & Wiltschko 2005, and article 97 (Smith) Tense and aspect). However,
I hope to have shown that it is useless to confine one's examination of temporal systems
to superficial data. Doing this would lead to the possibly premature rejection of core
universal properties of temporal systems in human language. This in turn suggests that
we cannot say that semantic universals must be surface-testable.
5. How do we find universals?
Given that we must postulate universals based on insufficient data (as data is never
available for all human languages), we employ a range of strategies to get from data to
universals. I will not discuss sources of evidence such as language change or cognitive-
psychological factors; the reader is referred to article 99 (Fritz) Theories of meaning
change, article 100 (Geeraerts) Cognitive approaches to diachronic semantics, and article
27 (Talmy) Cognitive Semantics for discussion. Article 12 (Krifka) Varieties of semantic
evidence is also relevant here.
5.1. Assume universality
One approach to establishing universals is simply to assume them, based on study of
however many languages one has data from. This is actually what all universals research
does, as no phenomenon has ever been tested in all of the world's languages. A point
of potential debate is the number of languages which need to be examined before
a universal claim should be made. This issue arises for semanticists because the nature
of the fieldwork means that testing semantic universals necessarily proceeds quite
slowly.
13. Methods in cross-linguistic semantics 277
My own belief is that one should always assume universality in the absence of
evidence to the contrary - and then go and look for evidence to the contrary. In earlier
work I dubbed this strategy the 'No variation null hypothesis' (Matthewson 2001); see
also article 95 (Bach & Chao) Semantic types across languages. The assumption of
universality is a methodological strategy which tells us to begin with the strongest
empirically falsifiable hypothesis. It does not entail that there is an absence of cross-linguistic
variation in the semantics, and it does not mean that our analyses of unfamiliar languages
must look parallel to those of English. On the contrary, work which adopts this strategy
is highly likely to uncover semantic variation, and to help establish the parameters within
which natural language semantics may vary.
The null hypothesis of universality is assumed by much work within semantics. In
fact, some of the most influential semantic universals to have been proposed, those of
Barwise & Cooper (1981),advance data only from English in support. (See also Keenan &
Stavi 1986 for related work, and article 95 (Bach & Chao) Semantic types across
languages for further discussion.) We saw two of Barwise and Cooper's universals above;
here are two more for illustration. Ul is a universal semantic definition of the noun
phrase (DP, in modern terminology):
Ul. NP-Quantifier universal (Barwise & Cooper 1981: 177)
Every natural language has syntactic constituents (called noun-phrases) whose semantic
function is to express generalized quantifiers over the domain of discourse.
U3 contains the well-known conservativity constraint, as well as asserting that there are
no languages which lack determiners in the semantic sense.
U3. Determiner universal (Barwise & Cooper 1981: 179)
Every natural language contains basic expressions, (called determiners) whose semantic
function is to assign to common count noun denotations (i.e., sets) A a quantifier that lives on A.
Barwise and Cooper do not offer data from any non-English language, yet postulate
universal constraints.This methodology was clearly justified, and the fact that not all of their
proposed universals have stood the test of time is largely irrelevant. By proposing explicit
and empirically testable constraints, Barwise and Cooper's work inspired a large amount
of subsequent cross-linguistic research, which has expanded our knowledge of quantifica-
tional systems in languages of the world. For example, several of the papers in the Bach
et al. volume (1995) specifically address the NP-Quantifier Universal; see also Bittner &
Trondhjem (2008) for a recent criticism of standard analyses of generalized quantifiers.
It is sometimes asserted that the strategy of postulating universals as soon as one can,
or even of claiming that an unfamiliar language shares the same analysis as English, is
euro-centric (see e.g., Gil 2001). But this is not correct. Since English is a well-studied
language, it is legitimate to compare lesser-studied languages to analyses of English.
It is our job to see whether we can find evidence for similarities between English and
unfamiliar languages, especially when the evidence may not be superficially obvious.The
important point here is that the null hypothesis of universality does not assign analyses
of any language any priority over language-faithful analyses of any other languages.
The null hypothesis is that all languages arc the same - not that all languages arc like
English. Thus, it is just as likely that study of a non-Indo-European language will cause us
278
III. Methods in semantic research
to review and revise previous analyses of English. One example of this is Bittner's (2008)
analysis of temporal anaphora. Bittner proposes a universal system of temporal anaphora
which "instead of attempting to extend an English-based theory to a typologically distant
language ... proceeds in the opposite direction -extending a Kalaallisut-based theory to
English" (Bittner 2008:384). Another example is the analysis of quantification presented
in Matthewson (2001). I observe there that the St'aVimcets data are incompatible with
the standard theory of generalized quantifiers, because St'aYimcets possesses no
determiners which create a generalized quantifier by operating on a common noun
denotation, as English every or most do. (The insight that Salish languages differ from English
in their quantificational structures is originally due to Jelinck 1995.) I propose that the
analysis suggested by the St'at'imccts data may be applicable also to English. In a similar
vein, see Bar-el (2005) for the claim that the aspectual system of Skwxwu7mesh invites
us to reanalyze the aspectual system of English.
The null hypothesis of universality guides empirical testing in a fruitful way; it gives
us a starting hypothesis to test, and it allows us to use knowledge gained from study of
one language in study of another. It inspires us to look beyond the superficial for
underlying similarities, yet it does not force us to assume similarity where none exists. It also
explicitly encodes the belief that there are limits on cross-linguistic variation. In contrast,
an extreme non-universalist position (cf. Gil 2001) would essentially deny the idea that
we have any reason to expect similarity between languages. This can lead to premature
acceptance of exotic analyses.
5.2. Language acquisition and learnability
Generative linguists are interested in universals which are likely to have been provided
by Universal Grammar, and clues about this can come from acquisition research and
learnability theory. As argued by Crain & Pietroski (2001: 150) among others, the
features which are most likely to be innately specified by UG arc those which are shared
across languages but which arc not accessible to a child in the Primary Linguistic Data
(the linguistic input available to the learner). Crain and Pietroski offer the examples
of coreference possibilities for pronouns and strong crossover; the restrictions found in
these areas are by hypothesis not learnable based on experience.The discussion of tense
above is another case in point, as the data required to establish the existence of the null
tense morpheme are not likely to be often presented to children in their first few years
of life.
5.3. A typological survey
It may seem that a typological survey is at the opposite end of the spectrum from the
research approaches discussed so far. However, even nativists should not discount the
benefits of typological research. Such studies can help us gain perspective on which
research questions arc of central interest from a cross-linguistic point of view.
One example of this involves determiner semantics. A preliminary study by
Matthewson (2008) (based on grammars of 33 languages from 25 language families)
suggests that English is in many ways typologically unusual with respect to the syntax
and semantics of determiners. For example, cross-linguistically, articles are the exception
rather than the rule. Demonstratives arc present in almost all languages, but rarely seem
13. Methods in cross-linguistic semantics 279
to occupy determiner position as they do in English; strong quantifiers often appear to
occupy a different position again. In many languages, the only strong quantifiers are
universals. While distributive universal quantifiers are frequently reduplications or affixes,
non-distributive ones are not. What we see is that a broad study can lead us to pose
questions that would perhaps not otherwise be addressed, such as what the reason is for
the apparent correlation between the syntax of a universal quantifier and its semantics.
There are many examples of typological studies which have inspired formal semantic
analyses (e.g.,Cusic 1981,Comrie 1985,Corbett 2000, to name a few, and see also Haspelmath
et al.2001 for a comprehensive overview of typology and universals).
6. Semantic universals in the literature
It is occasionally implied that since there is no agreed-upon set of semantic universals,
they must not exist. For example, Levinson (2003:315) claims: "There are in fact very few
hypotheses about semantic universals that have any serious, cross-linguistic backing."
However, I believe that the reason there is not a large number of agreed-upon semantic
universals is simply that semantic universals have so far only infrequently been explicitly
proposed and subjected to cross-linguistic testing. And the situation is changing: there
is a growing body of work which addresses semantic variation and its limits, and the
literature has reached a level of maturity where the relevant questions can be posed in an
interesting way. (Some recent dissertations which employ semantic fieldwork are Faller
2002, Wilhelm 2003, Bar-el 2005, Gillon 2006,Tonhauser 2006, Deal 2010, Murray 2010,
Peterson 2010, among others.) Many more semantic universals will be proposed as the
field continues to conduct in-depth, detailed study of a wide range of languages.
In this section I provide a very brief overview of some universals within semantics,
illustrating universals from a range of different domains.
In the area of lexical semantics we have for example Levinson's (2003:315) proposal
restricting elements encoding spatial relations. He argues that universally, there are at
most three frames of reference upon which languages draw, each of which has precise
characteristics that could have been otherwise. Levinson's proposals arc discussed
further in article 107 (Landau) Space in semantics and cognition; see also article 27 (Talmy)
Cognitive Semantics for discussion of location.
A proposal which restricts composition methods and semantic rules is that of
Bittner (1994). Bittner claims that there are no language-specific or construction-
specific semantic rules. She proposes a small set of universal operations which
essentially consist of functional application, type-lifting, lambda-abstraction, and a rule
interpreting empty nodes. Although her proposal may seem to be worded in a theory-
dependent way, claims about possible composition rules can be, and have been,
challenged on at least partly empirical grounds; sec for example Hcim & Kratzcr (1998),
Chung & Ladusaw (2003) for challenges to the idea that the only compositional rule is
functional application.
Another proposal of Bittner's involves a set of universals of temporal anaphora,
based on comparison of English with Kalaallisut. These include statements about the
location of states, events, processes and habits relative to topical instants or periods, and
restrictions on the default topic time for different event-types.The constraints are
explicitly formulated independently of syntax: "Instead of aligning LF structures, this strategy
aligns communicative functions" (Bittner 2008:383).
280
III. Methods in semantic research
A relatively common type of semantic universal involves constraints on semantic
types. For example, Landman (2006) proposes that universally, traces and pro-forms can
only be of type e.This is a purely semantic constraint: it is independent of whether the
syntax of a language allows elements occupying the relevant positions to be of different
categories.
A universal constraint on the syntax-semantics mapping is proposed by Gillon (2006).
Gillon argues that universally, all and only elements which occupy D(eterminer)
position introduce a contextual domain restriction variable (von Fintel 1994). (Gillon has to
claim, contra Chierchia 1998, that languages like Mandarin have null determiners.) An
interesting constructional universal is suggested by Faller (2007). She argues that the
semantics of reciprocal constructions (involving plurality, distinctness of co-arguments,
universal quantification and reflexivity) may be cross-linguistically uniform.
7. Variation
The search for universals is a search for limits on variation. If we assume a null
hypothesis of universality, any observed variation necessitates a rejection of our null hypothesis
and leads us to postulate restrictions on the limits of variation. Restrictions on variation
are known in the formal literature as'parameters', but the relation between parameters
and universals is so tight that restricted sets of options from which languages choose can
be classified as a type of universal (cf. article 95 (Bach & Chao) Semantic types across
languages).
One prime source for semantic variation is independent syntactic variation. Given
compositionality,the absence of certain structures in a language may result in the absence
of certain semantic phenomena. Within the semantics proper, one well-known parameter
is Chierchia's (1998) Nominal Mapping Parameter, which states that languages vary in
the denotation of their NPs. A language may allow NPs to be mapped into predicates,
into arguments (i.e., kinds), or both. These choices correlate with a cluster of co-varying
properties across languages. The Nominal Mapping Parameter represents a weakening
of the null hypothesis that the denotations of NPs will be constant across languages.
Whether or not it is correct is an empirical matter, and Chierchia's work has inspired a
large amount of discussion (e.g., Cheng & Sybesma 1999, Schmitt & Munn 1999, Chung
2000, Longobardi 2001, 2005, Doron 2004, Dayal 2004, Krifka 2004, among others). As
with Barwise and Cooper's universals, Chierchia's parameter has done a great deal to
advance our understanding of the denotations of NPs across languages.
The Nominal Mapping Parameter also illustrates the restricted nature of the
permitted variation; there are only three kinds of languages. Relevant work on restricted
variation within semantics is discussed in article 96 (Doetjes) Count/mass distinction,
article 98 (Smith) Tense and aspect, and article 98 (Pcdcrson) The expression of space.
There is often resistance to the postulation of semantic parameters. One possible
ground for this resistance is the belief that semantics should be more cross-linguistically
invariant than other areas of the grammar. For example, Chomsky (2000: 185) argues
that "there are empirical grounds for believing that variety is more limited for semantic
than for phonetic aspects of language." It is not clear to me, however, that we have
empirical grounds for believing this. We are still in the early stages of research into semantic
variation; we probably do not know enough yet to say whether semantics varies less than
other areas of the grammar.
13. Methods in cross-linguistic semantics 281_
Another possible ground for skepticism about semantic parameters is the belief that
semantic variation is not learnable. However, there is no reason why this should be so.
Chierchia, for example, outlines a plausible mechanism by which the Nominal Mapping
Parameter is learnable, with Chinese representing the default setting in line with
Manzini & Wexler's (1987) Subset Principle (Chierchia 1998:400-401). Chierchia argues
that his parameter "is learned in the same manner in which every other structural
difference is learned: through its overt morphosyntactic manifestations. It thus meets fully the
reasonable challenge that all parameters must concern live morphemes."
A relatively radical semantic parameter is the proposal that some languages lack
pragmatic presuppositions in the sense of Stalnaker (1974) (Matthcwson 2006a, based
on data from St'at'imcets). This parameter fails to be tied to specific lexical items, as it
is stated globally. However, the presupposition parameter places languages in a subset-
superset relation, and as such is potentially learnable, as long as the initial setting is
the St'aYimcets one. A learner who assumes that her language lacks pragmatic
presuppositions can learn that English possesses such presuppositions on the basis of overt
evidence (such as'Hey, wait a minute!' responses to presupposition failure, cf. von Fintel
2004).
One common way in which languages vary in their semantics is in degree of under-
specification. For example, I argued above that St'at'imcets shares basic tense semantics
with English.The languages differ in that the St'at'imcets tense morpheme is semantically
undcrspccificd, failing to distinguish past from present reference times. Another case of
variation in underspecification involves modality. According to Rullmann,Matthewson &
Davis (2008), St'at'imcets modals allow similar interpretations to English modal
auxiliaries. However, the interpretations are distributed differently across the lexicon: while in
English, a single modal allows a range of different conversational backgrounds (giving
rise to deontic, circumstantial orepistemic interpretations, Kratzer 1991), in St'at'imcets,
the conversational background is lexically encoded in the choice of modal. Conversely,
while in English, the quantificational force of a modal is lexically encoded, in St'at'imcets
it is not;each modal is compatible with both existential and universal interpretations.
One strong hypothesis would be that all languages possess the same functional
categories, but that languages differ in the levels of underspecification in the lexical entries
of the various morphemes, and in the way in which they 'bundle' the various aspects
of meaning into morphemes. As an example of the latter case, Lin's (2006) analysis of
Chinese involves partially standard semantics for viewpoint aspect and for tense, but a
different bundling of information from that found in English (for example, the perfective
aspect in Chinese includes a restriction that the reference time precedes the utterance
time).
In terms of lexical aspect, there appears to be cross-linguistic variation in the semantics
of basic aspectual classes. Bar-el (2005) argues that stative predicates in Skwxwu7mesh
Salish include an initial change-of-state transition. Bar-el implies that while the basic
building blocks of lexical entries are provided (e.g., that events can either begin or end
with BECOME transitions (cf. Dowty 1979, Rothstein 2004), languages combine these
in different ways to give different semantics for the various aspectual classes. Thus, the
semantics we traditionally attribute to 'accomplishments' or to 'states' arc not primitives
of the grammar. While the Salish/English aspectual differences appear not to be
reducible to underspecification or bundling, further decompositional analysis may reveal that
they are.
282
III. Methods in semantic research
/ would like to thank Henry Davis for helpful discussion during the writing of this article,
and Klaus von Heusinger and Noor van Leusen for helpful feedback on the first draft.
8. References
Abusch, Dorit 1997. Sequence of tense and temporal de re. Linguistics & Philosophy 20,1-50.
Abusch, Dorit 1998. Generalizing tense semantics for future contexts. In: S. Rothstein (ed.). Events
and Grammar. Dordrecht: Kluwer. 13-33.
Aitken, Barbara 1955. A note on eliciting. International Journal of American Linguistics 21,83.
Bach, Emmon et al. (eds) 1995. Quantification in Natural Languages. Dordrecht: Kluwer.
Baker, Mark 2003. Lexical Categories: Verbs, Nouns and Adjectives. Cambridge: Cambridge
University Press
Bar-cl. Lcora 2005. Aspectual Distinctions in Skwxwii7mesh. Ph.D. dissertation. University of
British Columbia, Vancouver, BC.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics &
Philosophy 4,159-219.
Bittner, Maria 1987. On the semantics of the Greenlandic antipassive and related constructions.
International Journal of American Linguistics 53,194-231.
Bittner, Maria 1994. Cross-linguistic semantics. Linguistics & Philosophy 17,53-108.
Bittner, Maria 2005. Future discourse in a tenseless language. Journal of Semantics 22,339-388.
Bittner, Maria 2008. Aspectual universals of temporal anaphora. In: S. Rothstein (ed.). Theoretical
and Crosslinguistic Approaches to the Semantics of Aspect. Amsterdam: Benjamins, 349-385.
Bittner, Maria & Naja Trondhjem 2008. Quantification as reference: Kalaallisut Q-verbs In:
L. Matthewson (ed.). Quantification: A Cross-Linguistic Perspective. Amsterdam: Emerald.
7-66.
Bohnemeyer, Jiirgen 2002. The Grammar of Time Reference in Yukatek Maya. Munich: Lincom
Europa.
Carden, Guy 1970. A note on conflicting idiolects. Linguistic Inquiry 1,281-290.
Cheng, Lisa & Rint Sybcsma 1999. Bare and not-so-barc nouns and the structure of NP. Linguistic
Inquiry 30,509-542.
Chierchia, Gennaro 1998. Reference to kinds across languages. Natural Language Semantics 6,
339-405.
Chomsky, Noam 2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge
University Press
Chung, Sandra 2000. On Reference to kinds in Indonesian. Natural Language Semantics 8,157-171.
Chung, Sandra & William Ladusaw 2003. Restriction and Saturation. Cambridge, MA: The MIT
Press
Comrie, Bernard 1985. Tense. Cambridge: Cambridge University Press
Comrie, Bernard 1989. Language Universals and Linguistic Typology: Syntax and Morphology.
Oxford: Blackwell.
Corbett, Greville 2000. Number. Cambridge: Cambridge University Press
Crain, Stephen & Paul Pietroski 2001. Nature, nurture, and universal grammar. Linguistics &
Philosophy 24,139-186.
Crain, Stephen & Rosalind Thornton 1998. Investigations in Universal Grammar:A Guide to
Experiments on the Acquisition of Syntax. Cambridge, MA: The MIT Press
Cusic, David 1981. Verbal Plurality and Aspect. Ph.D. dissertation. Stanford University, Stanford,
CA.
Davis Henry 2002. Categorial restrictions in St'aTimcets (Lillooet) relative clauses In: C. Gillon,
N. Sawai & R. Wojdak (eds). Papers for the 37th International Conference on Salish and
Neighbouring Languages. Vancouver, BC: University of British Columbia, 61-75.
13. Methods in cross-linguistic semantics 283
Davis, Henry 2009. Cross-linguistic variation in anaphoric dependencies: Evidence from the Pacific
Northwest. Natural Language and Linguistic Theory 27,1-43.
Davis, Henry & Lisa Matthewson 1999. On the functional determination of lexical categories.
Revue Quebecoise de Linguistique 27,27-67.
Dayal, Veneeta 2004. Number marking and (in)definiteness in kind terms. Linguistics & Philosophy
27,393-450.
Deal, Amy Rose 2010. Topics in the Nez Perce Verb. Ph.D. dissertation. University of Massachusetts,
Amherst, MA.
Demirdache, Hamida & Lisa Matthewson 1995. On the universality of syntactic categories. In:
J. Beckman (ed.). Proceedings of the North Eastern Linguistic Society (= NELS) 25. Amherst,
MA: GLSA, 79-93.
Dimmendaal, Gerrit 2001. Places and people: Field sites and informants. In: P. Newman & M. Ratliff
(eds.). Linguistic Fieldwork. Cambridge: Cambridge University Press. 55-75.
Dixon, Robert M.W. 1977. Where have all the adjectives gone? Studies in Language 1,19-80.
Doron, Edit 2004. Bare singular reference to kinds. In: R. Young & Y. Zhou (cds.).
Proceedings of Semantics and Linguistic Theory (= SALT) XIII. Ithaca, NY: Cornell University,
73-90.
Dowty, David 1979. Word Meaning in Montague Grammar. Dordrecht: Reidel.
E119, Murvet 1996. Tense and modality. In: S. Lappin (ed.). The Handbook of Contemporary
Semantic Theory. Oxford: Blackwell, 345-358.
Faller, Martina 2002. Semantics and Pragmatics of Evidentials in Cuzco Quechua. Ph.D.
dissertation. Stanford University, Los Angeles, CA.
Faller, Martina 2007. The ingredients of reciprocity in Cuzco Quechua. Journal of Semantics 243,
255-288.
Ferguson, Charles 1972. Verbs of 'being' in Bengali, with a note on Amharic. In: J. Verhaar (ed.).
The Verb 'Be' and its Synonyms: Philosophical and Grammatical Studies, 5. Dordrecht: Rcidcl,
74-114.
von Fintel, Kai 1994. Restrictions on Quantifier Domains. Ph.D. dissertation. University of
Massachusetts, Amherst, MA.
von Fintel, Kai 2004. Would you believe it? The king of France is back! Presuppositions and truth
value intuitions. In: A. Bczuidcnhout & M. Rcimcr (cds.). Descriptions and Beyond. Oxford:
Oxford University Press. 261-296.
von Fintel, Kai & Lisa Matthewson 2008. Universals in semantics. The Linguistic Review 25,
49-111.
Gil, David 1987. Definitencss, noun phrase configurationality, and the count-mass distinction. In:
E. Reuland & A. ter Meulen (eds.). The Representation of (In)Definiteness. Cambridge, MA:
The MIT Press, 254-269.
Gil, David 2001. Escaping Euro-centrism: Fieldwork as a process of unlearning. In: P. Newman &
M. Ratliff (eds.). Linguistic Fieldwork. Cambridge: Cambridge University Press, 102-132.
Gillon, Carrie 2006. The Semantics of Determiners: Domain Restriction in Skwxwii7mesh. Ph.D.
dissertation. University of British Columbia, Vancouver, BC.
Grice, H. Paul 1975. Logic and conversation. In: P. Cole & J. L. Morgan (eds.). Syntax and
Semantics 3: Speech Acts. New York: Academic Press, 41-58. Reprinted in: S. Davis (ed.).
Pragmatics. Oxford: Oxford University Press, 1991,305-315.
Harris, Zellig S. & Charles F. Voegelin 1953. Eliciting in linguistics. Southwestern Journal of
Anthropology 9,59-75.
Haspchnath, Martin ct al. (cds.) 2001. Language Typology and Language Universals: An
International Handbook (HSK 20.1-2). Berlin: de Gruyter.
Hayes, Alfred 1954. Field procedures while working with Diegueno. International Journal of
American Linguistics 20,185-194.
Heim, Irene & Angelika Kratzcr 1998. Semantics in Generative Grammar. Oxford: Blackwell.
284
III. Methods in semantic research
Jelinek, Eloise 1995. Quantification in Straits Salish. In: E. Bach et al. (eds.). Quantification in
Natural Languages. Dordrecht: Kluwer, 487-540.
Katz. Jerrold 1976. A hypothesis about the uniqueness of human language. In: S. Harnad, H. Steklis
& J. Lancaster (eds.). Origins and Evolution of Language and Speech. New York: New York
Academy of Sciences, 33-41.
Keenan, Edward & Jonathan Stavi 1986. A semantic characterization of natural language
determiners. Linguistics & Philosophy 9,253-326.
Konstanz Universals Archiv 2002-. The Universals Archive, maintained by the University of
Konstanz, Department of Linguistics, http://typo.uni-konstanz.de/archive/intro/, March 5,
2009.
Kratzer, Angelika 1991. Modality. In: D. Wunderlich & A. von Stechow (eds.). Semantics: An
International Handbook of Contemporary Research (HSK 6). Berlin: de Gruyter, 639-650.
Krifka, Manfred 2004. Bare NPs: Kind-referring, indefinites, both, or neither? Empirical issues in
syntax and semantics. In: R. Young & Y Zhou (eds.). Proceedings of Semantics and Linguistic
Theory (= SALT) XIII. Ithaca, NY: Cornell University, 180-203.
Landman, Meredith 2006. Variables in Natural Language. Ph.D. dissertation. University of
Massachusetts, Amherst, MA.
Levinson. Stephen 2003. Space in Language and Cognition: Explorations in Cognitive Diversity.
Cambridge: Cambridge University Press
Lin, Jo-Wang 2006. Time in a language without tense: The case of Chinese. Journal of Semantics 23,
1-53.
Longobardi, Giuseppe 2001. How comparative is semantics? A unified parametric theory of bare
nouns and proper names. Natural Language Semantics 9,335-361.
Longobardi, Giuseppe 2005. A minimalist program for parametric linguistics? In: H. Broekhuis
et al. (eds.). Organizing Grammar: Linguistic Studies for Henk van Riemskijk. Berlin: de Gruyter,
407-414.
Manzini, M. Rita & Kenneth Wcxler 1987. Parameters, binding theory, and learnability. Linguistic
Inquiry 18,413-444.
Matthewson, Lisa 1998. Determiner Systems and Quantificational Strategies: Evidence from Salish.
The Hague: Holland Academic Graphics.
Matthewson, Lisa 2001. Quantification and the nature of cross-linguistic variation. Natural
Language Semantics 9,145-189.
Matthewson, Lisa 2004. On the methodology of semantic fieldwork. International Journal of
American Linguistics 70,369-415.
Matthewson, Lisa 2006a. Presuppositions and cross-linguistic variation. In: C. Davis, A. Deal & Y
Zabbal (eds.). Proceedings of the North Eastern Linguistic Society (= NELS) 36. Amherst, MA:
GLSA, 63-76.
Matthewson, Lisa 2006b. Temporal semantics in a superficially tenseless language. Linguistics &
Philosophy 29.673-713.
Matthewson, Lisa 2008. Strategies of quantification in St'aYimcets and the rest of the world. To
appear in Strategies of Quantification. Oxford: Oxford University Press
Montler, Timothy 2003. Auxiliaries and other categories in Straits Salishan. International Journal of
American Linguistics 69,103-134.
Murray, Sarah 2010. Evidentiality and the Structure of Speech Acts. Ph.D. dissertation. Rutgers
University, New Brunswick, NJ.
Nida, Eugene 1947. Field techniques in descriptive linguistics. International Journal of American
Linguistics 13,138-146.
Ogihara,Toshi-Yuki 1996. Tense, Attitudes and Scope. Dordrecht: Kluwer.
Parlee. Barbara 2000. 'Null' determiners vs. no determiners. Lecture read at the 2nd Winter Typology
School, Moscow.
Peterson, Tyler 2010. Epistemic Modality and Evidentiality in Gitksan at the Semantics-Pragmatics
Interface. Ph.D. dissertation. University of British Columbia, Vancouver, BC.
14. Formal methods in semantics
285
Ritter, Elizabeth & Martina Wiltschko 2005. Anchoring events to utterances without tense. In:
J. Alderete, C-h. Han & A. Kochetov (eds.). Proceedings of the 24th West Coast Conference on
Formal Linguistics (= WCCFL). Somerville, MA: Cascadilla Proceedings Project, 343-351.
Rothstein, Susan 2004. Structuring Events. Oxford: Blackwell.
Rullmann, Hotze, Lisa Matthewson & Henry Davis 2008. Modals as distributive indefinites. Natural
Language Semantics 16,317-357.
Schmitt. Cristina & Alan Munn 1999. Against the nominal mapping parameter: Bare nouns in
Brazilian Portuguese. In: P. Tamanji, M. Hirotani & N. Hall (eds.). Proceedings of the North
Eastern Linguistic Society (=NELS) 29. Amherst, MA: GLSA, 339-353.
Schiitze, Carson 1996. The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic
Methodology. Chicago, IL:Tlie University of Chicago Press
Stalnaker, Robert 1974. Pragmatic presuppositions. In: M.K. Munitz & P.K. Unger (eds.). Semantics
and Philosophy. New York: New York University Press, 197-213.
Tonhauser, Judith 2006. The Temporal Semantics of Noun Phrases: Evidence from Guarani. Ph.D.
dissertation. Stanford University, Stanford, CA.
Whorf, Benjamin 1956. Language, Thought and Reality: Selected Writings of Benjamin Lee Whorf
Edited by J. Carroll. Cambridge, MA:The MIT Press
Wilhelm, Andrea 2003. Telicity and Durativity: A Study of Aspect in Dene Suline (Chipewyan) and
German. Ph.D. dissertation, University of Calgary, Calgary, AL.
Yegerlehner, John 1955. A note on eliciting techniques. International Journal of American
Linguistics 21,286-288.
Lisa Matthewson, Vancouver (Canada)
14. Formal methods in semantics
1. Introduction
2. First order logic and natural language
3. Formal systems, proofs and decidability
4. Semantic models, validity and completeness
5. Formalizing linguistic methods
6. Linguistic applications of syntactic methods
7. Linguistic applications of semantic methods
8. Conclusions
9. References
Abstract
Covering almost an entire century, this article reviews in general and non-technical terms
how formal, logical methods have been applied to the meaning and interpretation of
natural language. This paradigm of research in natural language semantics produced
important new linguistic results and insights, but logic also profited from such
innovative applications. Semantic explanation requires properly formalized concepts, but only
provides genuine insight when it accounts for linguistic intuitions on meaning and
interpretation or the results of empirical investigations in an insightful way. The creative
tension between the linguistic demand for cognitively realistic models of human linguistic
competence and the logicians demand for a proper and explicit account of all and only the
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 285-305
286
III. Methods in semantic research
valid reasoning patterns initially led to an interesting divergence of methods and associated
research agendas. With the maturing of natural language semantics as a branch of
cognitive science an increasing number of logicians trained in linguistics and linguists apt in
using formal methods are developing more convergent empirical issues in interdisciplinary
research programs.
1. Introduction
The scientific analysis of patterns of human reasoning properly belongs to the ancient
discipline of logic, bridging more than twenty centuries from its earliest roots in the
ancient Greek philosophical treatises of Plato and Aristotle on syllogisms to its
contemporary developments in connecting dynamic reasoning in context to underlying neuro-
biological and cognitive processes. In reasoning with information from various sources
available to us, we systematically exploit (i) the meaning of the words, (ii) the way they
are put together in clauses, as well as (iii) the relations between these clauses and (iv)
the circumstances in which we received the information in order to arrive at a
conclusion (Frege 1892;Tarski 1956).The formal analysis of reasoning patterns not only offers
an important window on the meaning and interpretation of logical languages, but also
of ordinary, acquired, i.e. natural languages. It constitutes a core component of cognitive
science, providing the proper scientific methods to model human information processing
as constitutive structure and form.
Any explanatory scientific theory of the meaning and interpretation of natural
language must at some level aim to characterize all and only those patterns of reasoning
that guarantee to preserve in one way or another the assumed truth of the information
on which the conclusions are based. In analyzing patterns of inferences, content and
specific aspects of the interpretation can only be taken into consideration, if they can be
expressed in syntactic form and constitutive structure or in the semantic meta-language.
The contemporary research program of natural language semantics has significantly
expanded the expressions of natural language to be subjected to such formal methods of
logical analysis to cover virtually all syntactic categories, as well as relations between
sentences and larger sections of discourse or text, mapping syntactic, configurational
structures to sophisticated notions of semantic content or information structure (Barwise &
Perry 1983; Chierchia 1995; Cresswell 1985; Davidson & Harman 1972; Dowty, Wall &
Peters 1981; Gallin 1975; Partee, ter Meulen & Wall 1990).
The classical division of labor between syntactic and semantic theories of reasoning
is inherited from the mathematical logical theories developed in the early twentieth
century, when logical, i.e. unnatural and purposefully designed languages were the
primary subject of investigation. Syntactic theories of reasoning exploit as explanatory
tools merely constitutive, configurational methods, structural, i.e. formal operations
such as substitution and pure symbol manipulation of the associated proof theories. The
semantic theories of reasoning require an interpretation of such constitutive, formal
structure in models to characterize truth-conditions and validity of reasoning as
systematic interaction between form and meaning (Boolos & Jeffrey 1980).The syntactic, proof
theoretic strategy with its customary disregard for meaning as intangible, has originally
been pursued most vigorously for natural language in the research paradigm of
generative grammar. Its earliest mathematical foundational research in automata theory,
14. Formal methods in semantics
287
formal grammars and their associated design languages regarded structural operations
as the only acceptable formal methods. Reasoning or inference was as such not the target
of their investigations, as grammars were principally limited to characterize sentence
internal properties and their computational complexity (Chomsky 1957; Chomsky 1959;
Chomsky & Miller 1958, 1963; Hopcroft & Ullman 1979; Gross & Lentin 1970). The
semantic, model-theoretic strategy of investigating human reasoning has been pursued
most vigorously in natural language semantics, Lambek and Montague grammars and
game theoretic semantics, and their 21st century successors, the various dynamic theories
of meaning and interpretation founded on developments in intensional and epistemic
logicsof (sharing) belief and knowledge (Barwise 1989;van Benthem 1986;van Benthem &
tcr Mculcn 1997;Crcsswcll 1985; Davidson & Harman 1972; Kamp 1981; Kamp & Reyle
1993; Lambek 1958; Lewis 1972,1983; Montague 1974;Stalnaker 1999).
Formal methods deriving from logic are also applied in what has come to be known as
formal pragmatics, where parameters other than worlds or situations, such as context or
speaker/hearer, time of utterance or other situational elements may serve in the models
to determine meaning and situated inference.This article will address formal methods in
semantics as main topic, as formal pragmatics may be considered a further
generalization of these methods to serve wider linguistic applications, but as such docs not in any
intrinsic way differ from the formal methods used in semantics.
In both logical and natural languages, valid forms of reasoning in their most general
characteristics exploit the information available in the premises, assumed to be true in
an arbitrary given model, to draw a conclusion, guaranteed to be also true in that model.
Preserving this assumed truth of the premises is a complex process that may be modeled
in various formal systems, but if their admitted inference rules are somehow violated in
the process the conclusion of the inference is not guaranteed to be true. This common
deductive approach accounts for validity in forms or patterns of reasoning based on the
stable meaning of the logical vocabulary, regardless of the class of models under
consideration. It has constituted the methodological corner stone of the research program
of natural language semantics, where ordinary language expressions are translated into
logical expressions, their 'logical form', to determine their truth conditions in models as a
function of their form and subsequently characterize their valid forms of reasoning (May
1985; Montague 1974; Dowty, Wall & Peters 1981). Some important formal methods of
major schools in this research program are reviewed, mostly to present their conceptual
foundations and discuss their impact on linguistic insights, while referring the reader for
more technical expositions and formal details to the relevant current literature and
articles 10 (Newen & Schroder) Logic and semantics, 11 (Kempson) Formal semantics and
representationalism, 33 (Zimmermann) Model-theoretic semantics, 43 (Keenan)
Quantifiers, 37 (Kamp & Reyle) Discourse Representation Theory and 38 (Dekker) Dynamic
semantics (see also van Benthem & ter Meulen 1997;Gabbay & Guenthner 1983;Partee,
terMeulen& Wall 1990).
Excluded from consideration as formal methods in the sense intended here arc
other mathematical methods, such as statistical, wductive inference systems, where the
assumptions and conclusions of inferences are considered more or less likely, or various
quantitative approaches to meaning based on empirical studies, or optimality systems,
which ordinarily do not account for inference patterns, but rank possible interpretations
according to a given set of constraints of diverse kinds. In such systems form does not
directly and functionally determine meaning and hence the role of inference patterns, if
288
III. Methods in semantic research
any, is quite different from the core role in logical, deductive systems of reasoning which
constitute our topic. Of course, as any well defined domain of scientific investigation,
such systems too may be further formalized and perhaps eventually even axiomatized
as a logical, formal system. But in the current state of linguistics their methods are often
informal, appealing to semantic notions as meaning only implicitly, resulting in
fragmented theories, which, however unripe for formalization, may still provide genuinely
interesting and novel linguistic results.
In section 2 of this article the best known logical system of first order logic is discussed.
It served as point of departure of natural language semantics, in spite of its apparent
limitations and idealizations. In section 3 the classical definition of a formal system is
specified with its associated notions of proof and theorem, and these arc related to its early
applications in linguistic grammars.The hierarchy of structural complexity generated by
the various kinds of formal grammars is still seen to direct the current quest in natural
language semantics for proper characterizations of cognitive complexity. Meta-logical
properties such as decidability of the set of theorems are introduced. In section 4 the
general notion of a semantic model with its definition of truth conditions is presented,
without formalization in set-theoretic notation, to serve primarily in conceptually
distinguishing contingent truth (at a world) in a model from logical truth in regardless which
model to represent valid forms of reasoning. Completeness is presented as a desirable,
but perhaps not always feasible property of logical systems that have attained a
perfect harmony between their syntactic and semantic sides. Section 5 discusses how the
syntactic and semantic properties of pronouns first provided a strong impetus for using
formal methods in separate linguistic research programs, that have later converged on a
more integrated account of their behavior, currently still very much under investigation.
In section 6 we review a variety of proof theoretic methods that have been developed in
logic to understand how each of them had an impact on formal linguistic theories later.
This is where constraints on derivational complexity and resource bounded generation
of expressions is seen to have their proper place, issues that have only gained in
importance in contemporary linguistic research. Section 7 presents an overview of the semantic
methods that have been developed in logic over the past century to see how they have
influenced the development of natural language semantics as a flourishing branch of
cognitive science, where formal methods provide an important contribution to their
scientific methods. The final section 8 contains the conclusion of this article stating that the
application of formal methods, deriving from logical theories, has greatly contributed
to the development of linguistics as an independent academic discipline with an
integrated, interdisciplinary research agenda. Seeking a continued convergence of syntactic
and semantic methods in linguistic applications will serve to develop new insights in the
cognitive capacity underlying much of human information processing.
2. First order logic and natural language
Many well known systems of first order logic (FOL), in which quantifiers may only range
over individuals, i.e. not over properties or sets of individuals, have been designed to
study inference patterns deriving from the meaning of the classical Boolean connectives
of conjunction (... and ...), disjunction (... or...), negation {not...) and conditionals {if
... then ...) and biconditionals (... if and only if...), besides the universal {every N) and
existential {some N) quantifiers. Syntactically these FOL systems differed in the number
14. Formal methods in semantics
289
of axioms, logical vocabulary or inference rules they admitted, some were optimal for
simplicity of the proofs, others more congenial to the novel user, relying on the intuitive
meaning of the logical vocabulary (Boolos & Jeffrey 1980; Keenan & Faltz 1985; Link
1991).
The strongly reformist attitudes of the early 20th century logicians Bertrand Russell,
Gottlob Frege, and Alfred Tarski, today considered the great-grandfathers of modern
logic, initially steered FOL developments away from natural language, as it was regarded
as too ambiguous, hopelessly vague,or content-and context-dependent. Natural language
was even considered prone to paradox, since you can explicitly state that something is
or is not true. This is what is meant when natural language is accused of "containing its
own truth-predicate", for the semantics of the predicate "is true" cannot be formulated
in a non-circular manner for a perfectly grammatical, but self-referential sentence like
This statement is false, which is clearly true just in case it is false. Similarly treacherous
forms of self-referential acts with circular truth-conditions are found in simple
statements such as/am lying,resembling the ancient Cretense Liar Paradox,and the
syntactically overtly self-referential, but completely comprehensible The claim that this sentence
contains eleven words is false, that lead later model theoretic logicians to develop
innovative models of self-reference and logically sound forms of circularity, abandoning the
need for rock bottom atomic elements of sets, as classical set theory had always required
(Barwise & Etchemendy 1987).
Initially FOL was advocated as a 'good housekeeping' act in understanding
elementary forms of reasoning in natural language, although little serious attention was paid
on just how natural language expressions should be systematically translated into FOL,
given their syntactic constituent structure. This raised objections of what linguists often
called 'miraculous translation', appealing to the implicit intuitions on truth-functional
meaning only trained logicians apparently had easy access to. Although this limited
logical language of FOL was never intended to come even close to modeling the wealth of
expressive power in natural languages, it was considered the basic Boolean algebraic
core of the logical inference engine. The descriptive power of FOL systems was
subsequently importantly enriched to facilitate the Fregean adagio of compositional
translation by admitting lambda abstraction, representing the denotation of a predicate as a
set A by a characteristic function /that tells you for each element d in the underlying
domain D whether or not d is an element of that set A, i.e.fA(d) is true if and only if d is
an clement of the set A (Partcc, tcr Mculcn & Wall 1990). Higher order quantification
with quantifiers ranging over sets of properties, or sets of those ad infinitum, required
a type structure, derived from the abstract lambda calculus, a universal theory of
functional structure.Typing formal languages resembled in some respects Bertrand Russell's
solution to avoid vicious circularity in logic (Barendregt 1984; Carpenter 1997; Montague
1974; Link 1991).
More complex connectives or other truth functional expressions and operators were
added to FOL, besides situations or worlds to the models to analyze modal, temporal
or epistemic concepts (van Benthem 1983; Carnap 1947; Kripke 1972; Lewis 1983;
McCawley 1981). The formal methods of FOL semantics varied from classical Boolean
full bi-valuation with total functions that ultimately take only true and false as values, to
three or more valued models in which formulas may not always have a determined truth
value, initially proposed to account for presupposition failure of definite descriptions
that did not have a referent in the domain of the intended model. Weaker logical systems
290
III. Methods in semantic research
were also proposed, admitting fewer inference rules.The intuitionistic logics are perhaps
the best known of these, rejecting for philosophical and perhaps conceptual reasons the
classical law of double negation and the correlated rule of inference that allowed you to
infer a conclusion, if its negation had been shown to lead to contradictions. In classical
FOL the truth functional definition of the meaning of negation as set-theoretic
complement made double negation logically equivalent to no negation at all, e.g. It is not the
case that every student did not hand in a paper should at least truth-functionally mean the
same as Some student handed in a paper (Partee, ter Meulen & Wall 1990). Admitting
partial functions that allowed quantifiers to range over possible extensions of their already
fixed, given range, ventured into for logic also innovative higher order methods, driven
by linguistic considerations of pronoun resolution in discourse and various sophisticated
forms of quantification found in natural language, to which we return below (Chierchia
1995;Gallin 1975;Groenendijk & Stokhof 1991; Kamp & Reyle 1993).
3. Formal systems, proofs and decidability
Whatever its exact language and forms of reasoning characterized as valid, any particular
logical system must adhere to some very specific general requirements, if it is to count as
a formal system. A formal system must consist of four components:
(i) a lexicon specifying the terminal expressions or words, and a set of non-terminal
symbols or categories,
(ii) a set of production rules which determine how the well formed expressions of any
category of the formal language may be generated,
(iii) a set of axioms or expressions of the lexicon that are considered primitive,
(iv) a set of inference rules, determining how expressions may be manipulated.
A formal system may be formulated purely abstractly without being intended as
representation of anything, or it may be designed to serve as a description or simulation
of some domain of real phenomena or, as intended in linguistics, modeling aspects of
empirical, linguistic data.
A formal proof is the product of a formal system, consisting of (i) axioms,
expressions of the language that serve as intuitively obvious or in any case unquestionable first
principles, assumed to be true or taken for granted no matter what, and (ii) applications
of the rules of inference that generate sequences of steps in the proof, resulting in its
conclusion, the final step, also called a theorem. The grammar of a language, whether
logical or natural, is a system of rules that generates all and only all the grammatical
or well-formed sentences of the language. But this does not mean that we can always
get a definite answer to the general question whether an arbitrary string belongs to a
particular language, something a child learning its first language may actually need.
There is no general decision procedure determining for any arbitrary given expression
whether it is or is not derivable in any particular formal system (Arbib 1969; Davis 1965;
Savitch,Bach& Marsh 1987). However, this question is pro\ab\y decidable for sizable
fragments of natural language, even if some form of higher order quantification is permitted
(Nishihara, Morita & Iwata 1990; Pratt-Hartmann 2003; Pratt-Hartmann & Third 2006).
Current research on generative complexity is focused on expanding fragments of
natural language that are known to be decidable to capture realistic limitations on search
14. Formal methods in semantics
291
complexity,in attempting to characterize formal counterparts to experimentally obtained
results on human limitations of cognitive resources, such as memory or processing time.
Automated theorem provers certainly assist people in detecting proofs for complex
theorems or huge domains. People actually use smart, but still little understood heuristics in
finding derivations, trimming down the search space of alternative variable assignments,
for example, by marked prosody and intonational meaning. Motivating much of the
contemporary research in applying formal methods to natural language is seeking to restrain
the complexity or computational power of the formal systems to less powerful, learnable
and decidable fragments. In such cognitively realistic systems the inference rules
constrain the search space of valuation functions or exploit limited resources in linguistically
interesting and insightful ways.
The general research program of determining the generative or computational
complexity of natural languages first produced in the late 1950s the well known Chomsky
Hierarchy of formal language theory, which classifies formal languages, their
corresponding automata and the phrase-structure grammar that generate the languages as
regular, context-free, context-sensitive or unrestricted rewrite systems (Arbib 1969;
Gross & Lentin 1970; Hopcroft & Ullman 1979). Initially, Chomsky (1957, 1959)
claimed that the rich structure of natural languages with its complex forms of
agreement required the strength of unrestricted rewrite systems, or Turing machines. Quickly
linguists realized that for grammars to be learnable and to claim to model in any
cognitively realistic way our human linguistic competence, whether or not innate, their
generative power should be substantially restricted (Peters & Ritchie 1973). Much of the
contemporary research on resource-bounded categorial grammars and the economy of
derivation in minimalist generative grammar, comparing formal complexity of
derivations, is still seeking to distill universal principles of natural languages, considered
cognitive constants of human information processing, linguistically expressed in structurally
very diverse phenomena (Adcs & Stccdman 1982; Moortgat 1996; Morrill 1994, 1995;
Pentus 2006; Stabler 1997).
4. Semantic models, validity and completeness
On the semantic side of formal methods, the notion of a model M plays a crucial role in
the definition of truth-conditions for formulas generated by the syntax.The meaning of
the Boolean connectives is considered not to vary from one model to a next one, as it is
separated in the vocabulary as the closed class of expressions of the logical vocabulary.
The conjunction (... and ...) is true just in case each of the conjuncts is, the disjunction
(... or ...) is false only in the case neither disjunct is true. The conditional is only false
in the case the antecedent (//... clause) is true, but the consequent (then ... clause) is
false. The bi-conditional (... if and only if...) is true just in case the two parts have the
same truth value. Negation (not ...) simply reverses the truth value of the expression it
applies to. For the interpretation of the quantifiers an additional tool is required, a
variable assignment function /, which assigns to each variable x in the vocabulary of variables
a referent d, an element of the domain D of the model M. Proper names and the
descriptive vocabulary containing all kinds of predicates, i.e. adjectives, nouns, and verbs, are
interpreted by a function P, given with model M, that specifies who was the bearer of the
name, or who had a certain property corresponding to a one place predicate, or stood in
a certain relation for relational predicates in the given model A/.
292
III. Methods in semantic research
Linguists were quick to point out that in natural language at least conjunctions and
disjunctions connect not only full clauses, but also noun phrases, which is not always
reducible to a sentential connective, e.g. John and Mary met * John met and Mary met.
This kind of linguistic criticism quickly led to generalizations of Boolean operations in a
logical system, where variable assignment functions are generalized and given the
flexibility to assign referents of appropriate complex types in a richer higher order logic, but
such complexities need not concern us here.
A formal semantic model M consists hence of: (i) a domain D of semantic objects,
sometimes classified into types or given a particular internal structure, and (ii) a function
P that assigns appropriate denotations to the descriptive vocabulary of the language
interpreted. If the language also contains quantifiers, the model M comes equipped with
a given variable assignment function g, often considered arbitrary, to provide a referent,
which is an element of D, for all free variables.The given variable assignment function is
sometimes considered to represent the current context in some contemporary systems
that investigate context dependencies and indexicals. In FOL the universal quantifier
every x [N(x)J is interpreted as true in the model M, if all possible alternative assignment
functions g' to the variable x that it binds provide a referent d in the denotation of N. So
not only the given variable assignment function g provides a d in the denotation of N,
but all alternative assignments g' that may provide another referent, but could equally
well have been considered the given one, also do. For the existential quantifier some
x [N(x)J only one such variable assignment, the given one or another alternative variable
assignment function, suffices to interpret the quantifier as true in the model M. An easy
way to understand the effect of the interpretation of quantifiers in clauses is to see that
for a universal NP the set denoted by N should be a subset of the set denoted by the VP,
e.g. every student sings requires that the singers include all the students. Similarly, the
existential quantifier requires that the intersection of the denotation of the N and the
denotation of the VP is not empty, e.g. some student sings means that among the singers
there is at least one student.
Intensional models generalize this elementary model theory for extensional FOL
to characterize truth in a model relative to a possible world, situation or some other
index, representing, for instance, temporal or epistemic variability. Intensional
operators typically require a clause in its scope to be true at all or at only some such indices,
mirroring strong, universal and existential quantifiers respectively.The domain of
intensional models must be enriched with the interpretation of the variables referring to such
indices, if such meta-variables are included in the language to be interpreted. Otherwise
they are considered to be external to the model, added as a set of parameters indexing
the function interpreting the language. Domains may also be structured as (semi)lat-
tices or other partial orders, which has proven useful for the semantics of plurals, mass
terms and temporal reference to events, but such further variations on FOL must remain
outside the scope of the elementary exposition in this article.
To characterize logical truths, reflecting the valid reasoning patterns, one simply
generalizes over all formulas true in all logically possible models, to obtain those formulas
that must be true as a matter of necessity or form only, due to the meaning assigned to
their logical vocabulary, irrespective of what is actually considered to be factually the
case in the models. By writing out syntactic proofs with all their premises conjoined as
antecedents of a conditional of which the conclusion is the consequent one may test in
14. Formal methods in semantics
293
a semantic way the validity of the inference. If such a conditional cannot be falsified, the
proof is valid and vice versa.
A semantic interpretation of a language, natural or otherwise, is considered formal
only if it provides such precise logical models in which the language can be
systematically interpreted. Accordingly, contingent truths, which depend for their truth on what
happens to be the case in the given model, are properly distinguished from logical truths,
which can never be false in any possible model, because the meaning of the logical
vocabulary is fixed outside the class of models and hence remains invariable.
A logical system in which every proof of a theorem can be proven to correspond to a
semantically valid inference pattern and vice versa is called a complete system, as FOL
is. If a formal system contains statements that arc true in every possible model but
cannot be proven as theorems within that system by the admitted rules of inference
and the given axiom base, the system is considered incomplete. Familiar systems of
arithmetic have been proven to be incomplete, but that does not disqualify them
from their sound use in practice and they definitely still serve as a valuable tool in
applications.
5. Formalizing linguistic methods
Given the fundamental distinction between syntactic, proof-theoretic and semantic,
model-theoretic characterizations of reasoning in theories that purport to model
meaning and interpretation, the general question arises what (dis)advantages these two
methodologically distinct approaches respectively may have for linguistic applications.
Although in its generality this question may not be answerable in a satisfactory way, it is
clear that at least in the outset of linguistics as its own, independent scientific discipline,
quite different and mostly disconnected, if not antagonizing research communities were
associated with the two strategies. This separation of minds was only too familiar to
logicians from the early days of modern logic, where proof theorists and model theoretic
scmanticists often drew blood in their disputes on the priority, conceptual or otherwise,
of their respective methods. Currently seeking convergence of research issues in syntax
and semantics is much more en vogue and an easy go-between in syntactic and semantic
methods has already proven to pay off in obtaining the best linguistic explanations and
new insights.
One of the best examples of how linguistic questions could fruitfully be addressed
both by syntactic and semantic formal methods, at first separately, but later in tandem, is
the thoroughly studied topic of binding pronouns and its associated concept of quantifier
scope. Syntacticians focused primarily on the clear configurational differences between
free (la) and bound pronouns (lb), and reflexive pronouns (lc), which all depend in
different ways on the subject noun phrase that precedes and commands them in the same
clause. Co-indexing was their primary method of indicating binding, though no
interpretive procedure was specified with it, as meaning was considered elusive (Reinhart 1983a,
1983b).
(1) a. [Every student]; who knows [John/a professor] loves [him]#j .
b. [Every student^ loves [[his]i teacher]^ .
c. [Every student^ who knows [John/a professor], loves [himself^ ...
294
III. Methods in semantic research
This syntactic perspective on the binding behavior of pronouns within clauses had deep
relations to constraints on movement as a transformation on a string, to which we return
below. It limited its consideration of data to pronominal dependencies among clauses
within sentences, disregarding the fact that singular universal quantifiers cannot bind
pronouns across sentential boundaries (2a,b), but proper names, plurals and indefinite
noun phrases typically do (2c).
(2) a. [Every student^ handed in a paper. [He]#j . passed the exam.
b. [Every student^ who handed in a paper [ t \ passed the exam.
c. [John/A student/All students], handed in a paper.
[Hc/thcyjj passed the exam.
Semanticists had to understand how pronominal binding in natural language was in
some respects similar, but in other respects quite different from the ordinary variable
binding of FOL. From a semantic perspective the first puzzle was how ordinary proper
names and other referring expressions, including freely referring pronouns, that were
considered to have no logical scope and hence could not enter into scope ambiguities,
could still force pronouns to corefer (3a), even in intensional contexts (3b).
(3) a. John/he loves his mother.
b. John believes that Peter loves his mother.
c. Every student believes that Peter loves his mother.
In (3a) the reference of the proper name John or the contextually referring free
pronoun he fixes the reference of the possessive pronoun his. In (3b) the possessive pronoun
his in the subordinate clause could be interpreted as dependent on Peter, but equally
easily as dependent upon John in the main clause. If FOL taught semanticists to identify
bound variables with those variables that were syntactically within the scope of a
existential or universal quantifier, proper names had to be reconsidered as quantifiers having
scope, yet referring rigidly to the same individual, even across intensional contexts. By
considering quantifying in as a primary semantic technique to bind variables
simultaneously, first introduced in Montague (1974),semanticists fell into the logical trap of
identifying binding with configurational notions of linear scope.This fundamental connection
ultimately had to be abandoned, when the infamous Gcach sentence (4)
(4) Every farmer who owns a donkey beats it.
made syntacticians as well as semanticists realize that existential noun phrases in
restrictive relative clauses of universal subject noun phrases could inherit, as it were, their
universal force, creating for the interpretation of (4) cases of farmers and donkeys over
which the quantifier was supposed to range.This is called unselective binding, introduced
first by David Lewis (Lewis 1972,1983), but brought to the front of research in semantic
circles in Kamp (1981). Generalizing quantifying in as a systematic procedure to account
also for intcrscntcntial binding of pronouns, as in (2), was soon also realized to produce
counterintuitive results.This motivated an entirely new development of dynamic
semantics, where the interpretation of a given sentence would partially determine the
interpretation of the next sentence in a text and pronominal binding was conceptually once
14. Formal methods in semantics
295
and for all separated from the logical or configurational notion of scope (Groenendijk &
Stokhof 1991; Kamp & Reyle 1993). The interested reader is referred to article 38
(Dekker) Dynamic semantics and article 40 (Biiring) Pronouns for further discussion
and details of the resulting account.
6. Linguistic applications of syntactic methods
In logic itself quite a few different flavors of formal systems had been developed based
on formal proof-theoretic (syntactic) or model-theoretic (semantic) methods. The best
known on the proof-theoretic side are axiomatic proof theory (Jeffrey 1967), Gentzen
sequent calculus (Gentzen 1934), combinatorial logic (Curry 1961), and natural
deduction (Jeffrey 1967; Partee, ter Meulen & Wall 1990), each of which have made distinct
and significant contributions in various developments in semantics. Of the more semanti-
cally flavored developments most familiar are Tarski's classical notion of satisfaction in
models (Tarski 1956), the Lambek and other categorial grammars (Ajdukiewicz 1935;
Bar Hillel 1964; van Benthem 1987,1988;Buszkowski 1988;Buszkowski & Marciszewski
1988; Lambek 1958;Oehrle, Bach & Wheeler 1988), tightly connecting syntax and
semantics via type theory (Barendregt 1984; van Benthem 1991; Carpenter 1997; Morrill 1994),
Beth's tableaux method (Beth 1970),game theoretic semantics (Hintikka & Kulas 1985),
besides various intensional logical systems, enriched with indices representing possible
worlds or other modal notions, which each also have led to distinctive semantic
applications (Asher 1993; van Benthem 1983;Cresswell 1985; Montague 1974).The higher order
enrichment of FOL with quantification over sets, properties of individuals or
properties of properties in Montague Grammar (Barwise & Cooper 1981; van Benthem & ter
Meulen 1985; Montague 1974) at least initially directed linguists' attention away from
the global linguistic research program of seeking to constrain the generative capacity
and computational complexity of the formal methods in order to model realistically the
cognitive capacities of human language users, as it focused everyone's attention on com-
positionality, type theory and type shifting principles, and adopted a fully generalized
functional structure.
The next two sections selectively review some of the formal methods of these logical
systems that have led to important semantic applications and innovative developments
in linguistic theory.
An axiomatic characterization of a formal system is the best way to investigate its
logic, as it matches its semantics to prove straightforwardly its soundness and
completeness, i.e. demonstrating that every provable theorem is true in all models (semantical!)/
valid) and vice versa. For FOL a finite, in fact small number of logically true expressions
suffice to derive all and only all valid expressions, i.e. FOL is provably complete. But in
actually constructing new proofs an axiomatic characterization is much less useful, for it
does not offer any reliable heuristics as guidance for an effective proof search.The main
culprit is the rule of inference guaranteeing the transitivity of composition, i.e. to derive
A -> C from A -> B and B -> C requires finding an expression B which has no trace
in the conclusion A —> C. Since there are infinitely many possible such Bs, you cannot
exhaustively search for it.The Gentzen sequent calculus (Gentzen 1934) is known to be
equivalent to the axiomatic characterization of FOL, and it proved that any proof of a
theorem using the transitivity of composition, or its equivalent, the so called Cut
inference in Gentzen sequent calculus, may be transformed into a proof that avoids using this
296
III. Methods in semantic research
rule. Therefore, transitivity of composition is considered 'logically harmless', since the
Gentzen sequent calculus effectively limits searching for a proof of any theorem to the
expressions constituting the theorem you want to derive, called the subformula property.
The inference rules in Gentzen sequent calculus are hence guaranteed to decompose the
complexity of the expressions in a derivation, making the question whether an expression
is a theorem decidable. But from the point of view of linguistic applications the original
Gentzen calculus harbored another drawback as generative system. It allowed structural
inference rules that permuted the order of premises in a proof or rebracketed any triple
(associativity), making it hard to capture linguistically core notions of dominance,
governance or precedence between constituents of a sentence to be proven grammatical.To
restore constitutive order to the premises, the premises had to be regarded to create an
linearly ordered sequence or w-tuple, or a multiset (in mathematics, a multiset (or bag)
is a generalization of a set. A member of a multiset can have more than one instances,
while each member of a set has only one, unique instance).This is now customary within
the current categorical grammars deriving from Gentzen's system, reviewed below in
the semantic methods that affected developments in natural language semantics
(Moortgat 1997).
Perhaps the most familiar proof theoretic characterization of FOL is Natural
Deduction, in which rules of inference systematically introduce or eliminate the connectives
and quantifiers (Jeffrey 1967;Partee,ter Meulen & Wall 1990).This style of constructing
proofs is often taught in introductory logic classes, as it docs provide a certain
intuitive heuristics in constructing proofs and closely follows the truth-conditional meaning
given to the logical vocabulary, while systematically decomposing the conclusion and
given assumptions until atomic conditions are obtained. Proving a theorem with natural
deduction rules still requires a certain amount of ingenuity and insight, which may be
trained by practice. But human beings will never be perfected to attain logical
omniscience, i.e. the power to find each and every possible proof of a theorem. No actual
success in finding a proof may mean either you have to work harder at finding it or that
the expression is not a theorem, but you never know for sure which situation you are in
(Boolos & Jeffrey 1980;Stalnaker 1999).
The fundamental demand that grammars of natural languages must realistically
model the human cognitive capacities to produce and understand language has led
to a wealth of developments in searching how to cut down on the generative power
of formal grammars and their corresponding automata. Early in the developments of
generative grammar, the unrestricted deletion transformation was quickly considered
the most dangerously powerful operation in an unrestricted rewrite system or Turing
machine, as it permitted the deletion of any arbitrary expressions that were
redundant in generating the required surface expression (Peters & Ritchie 1973). Although
deletion as such has still not been eliminated altogether as possible effect of
movement, it is now always constrained to leave a trace or some other formal expression,
making deleted material recoverable. Hence no expressions may simply disappear in
the context of a derivation. The Empty Category Principle (ECP) substantiated this
requirement further, stating that all traces of moved noun phrases and variables must be
properly governed (Chomsky 1981; Hacgcman 1991; van Ricmsdijk & Williams 1986).
This amounts to requiring them to be c-commanded by the noun phrase interpreted as
binding them (c-command is a binary relation between nodes in a tree structure defined
as follows: Node A c-commands node B iff A * B, A does not dominate B and B does
14. Formal methods in semantics
297
not dominate A, and every node that dominates A also dominates B.). Extractions of an
adjunct phrase out a w/i-island as in
(5) *How. did Mary ask whether someone had fixed the car f.?
or moving w/j-expressions out of a f/iaf-clause as in
(6) *Who( does Mary believe that f. will fix the car?
are clearly ungrammatical, because they violate this ECP condition, as the traces f. are
co-indexed with and hence intended to be interpreted as bound by expressions outside
their proper governance domain.
The classical logical notion of quantifier scope is much less restricted, as ordinarily
quantifiers may bind variables in intensional context without raising any semantic
problems of interpretation, as we saw above in (3c) (Dowty, Wall & Peters 1981; Partee, ter
Meulen & Wall 1990). For instance, in intensional semantics the sentence (7)
(7) Mary believes that someone will fix the car.
has at least one so called 'de re' interpretation in which someone is 'quantified in1 and
assigned a referent in the actual world, of whom Mary believes that he will fix the car
in a future world, where Mary's beliefs have come true. Such wide scope interpretations
of noun phrases occurring inside a complementizer that-C? is considered in generative
syntax a form of opaque or hidden movement at LF, regarded a matter of semantic
interpretation, and hence external to grammar, i.e. not a question of derivation of syntactic
surface word order (May 1985). Surface wide scope w/i-quantifiers in such 'de re'
constructions binding overt pronouns occurring within the intensional context i.e. within the
that-clause are perfectly acceptable, as in (8).
(8) Of whonv does Mary believe that hef will fix the car?
Anyone intending to convey that Mary's belief regarded a particular person, rather
than someone hypothetically assumed to exist, would be wise to use such an overt wide
scope clause as in (8), according to a background pragmatic view that speakers should
select the optimal syntactic form to express their thought and to avoid any possible
misunderstanding.
This theoretical division of labor between tangible, syntactic movement to generate
proper surface word order and intangible movement to allow disambiguation of the
semantic scope of quantifiers has perhaps been the core bone of contention over many
years between on the one hand the generative formal methods, in which semantic
ambiguity does not have to be syntactically derived, and on the other hand,categorial grammar
and its later developments in Montague grammar, which required full compositionality,
i.e. syntactic derivation must determine semantic interpretation, disambiguating
quantifier scope by syntactic derivation. In generative syntax every derivational difference had
to be meaningful, however implicit this core notion remained, but in categorial
grammars certain derivational differences could be provably scmantically equivalent and
hence meaningless, often denigratingly called the problem of spurious ambiguities. To
298
III. Methods in semantic research
characterize the logical equivalence of syntactically distinguished derivations required
an independent semantic characterization of their truth conditions, considered a
suitable task of logic, falling outside the scope of linguistic grammar proper, according to
most generativists.This problem of characterizing which expressions with different
derivational histories would be true in exactly the same models, hence would be logically
equivalent, simply does not arise in generative syntax, as it hides semantic ambiguities
as LF movement not reflected in surface structure, avoiding syntactic disambiguation of
semantic ambiguities.
A much stronger requirement on grammatical derivations is to demand
methodologically that all and only the constitutive expressions of the derived expression must be
used in a derivation, often called surface compositionality (Cresswell 1985; Partee 1979).
This research program is aiming to eliminate from linguistic theory anything that is not
absolutely necessary. Chomsky (1995) claimed that both deep structure, completely
determined by lexical information, and surface structure, derived from it by
transformations, may be dispensed with. Given that a language consists of expressions which
match sound structure to representations of their meaning, Universal Grammar should
consist merely of a set of phonological, semantic, and syntactic features, together with
an algorithm to assemble features into lexical expressions and a small set of operations,
including move and merge, that constitute syntactic objects, the computational system
of human languages. The central thesis of this minimalist framework is that the
computational system is the optimal, most simple, solution to legibility conditions at the
phonological and semantic interface. The goal is to explain all the observed properties of
languages in terms of these legibility conditions, and properties of the computational
system. Often advocated since the early 1990s is the Lexicalist Hypothesis requiring that
syntactic transformations may operate only on syntactic constituents, and can only insert
or delete designated elements, but cannot be used to insert, delete, permute, or
substitute parts of words.This Lexicalist Hypothesis, which is certainly not unchallenged even
among generativists, comes in two versions: (a) a weak one, prohibiting transformations
to be used in derivational morphology, and (b) a strong version prohibiting use of
transformations in inflection. It constitutes the most fundamentally challenging attempt from
a syntactic perspective to approach surface compositionality as seen in the well-known
theories of natural language semantics to which we now turn.
7. Linguistic applications of semantic methods
The original insight of the Polish logician Alfred Tarski was that the truth conditional
semantics of any language must be stated recursively in a distinct meta-language in
terms of satisfaction of formulas consisting of predicates and free variables to avoid
the paradoxical forms of self-reference alluded to above (Tarski 1956; Barwise &
Etchemendy 1987). By defining satisfaction directly, and deriving truth conditions
from it, a proper recursive definition could be formulated for the semantics of any
complex expression of the language. For instance, in FOL an assignment satisfies the
complex sentence S and S' if and only if it satisfies S and it also satisfies 5'. For
universal quantification it required an assignment/to satisfy the sentence 'Every x sings'
if and only if for every individual that some other assignment/' assigns to the variable
x, while assigning the same things as/to the other variables,/' satisfies sings(x), i.e.
the value of every such f'(x) is an element in the set of singers. Tarski's definition of
14. Formal methods in semantics
299
satisfaction is compositional, since for an assignment to satisfy a complex expression
depends only on the syntactic composition of its constituents and their semantics, as
Gottlob Frege had originally required (Frege 1892).Truth conditions can subsequently
be stated relative to a model and an arbitrary given assignment, assigning all free
variables their reference. Truth cannot be compositionally defined directly for 'Every
x sings' in terms of the truth of sings(x), because sings(x) has a free variable x, so its
truth depends on which assignment happens to be the given one. The Tarskian truth-
conditional semantics of FOL also provided the foundation for natural language
semantics, limited to fragments that do not contain any truth or falsity predicate, nor
verbs like to lie, nor other expressions directly concerned with veridicality. The
developments of File Change Semantics (Hcim 1982), Discourse Representation Theory
(Kamp 1981;Kamp & Reyle 1993),SituationTheory (Barwise & Perry 1983;Seligman &
Moss 1997) and dynamic Montague Grammar (Chierchia 1995;Groenendijk& Stokhof
1991), that all allowed free variables or reference markers representing certain use of
pronouns to be interpreted as if bound by a widest scope existential quantifier, even if
they occurred in different sentences, fully exploit this fundamental Tarskian approach
to compositional semantics by satisfaction.
Other formal semantics methods for FOL were subsequently developed in the second
half of the 20th century as alternatives to Tarskian truth-conditional semantics. Beth
(1970) designed a tableaux method in which a systematic search for counterexamples to
the assumed validity of a reasoning pattern seeking to verify the premises, but falsify its
conclusion leads in a finite number of decompositional steps either to such a
counterexample, if one exists, or to closure, tantamount to the proof that no such counterexample
exists (Beth 1970; Partee, ter Meulen & Wall 1990).This semantic tableaux method
provided a procedure to enumerate the valid theorems of FOL, because it only required
a finite number of substitutions in deriving a theorem: (i) the expression itself, (ii) all
of its constituent expressions, and (iii) certain simple combinations of the constituents
depending on the premises. Hence any tableau for a valid theorem eventually closes,
and the method produces a positive answer. It does not however constitute a decision
procedure for testing the validity of any derivation, since it does not enumerate the set of
expressions that are not theorems of FOL.
Game-theoretic semantics characterizes the semantics of FOL and richer, intensional
logics in terms of rules for playing a verification game between a truth-seeking player
and falsification seeking, omniscient Nature (Hintikka & Kulas 1985; Hintikka & Sandu
1997; Hodges 1985). Its interactive and epistemic flavor made it especially suitable for
the semantics of interrogatives in which requests for information are acts of inquiry
resolved by the answerer, providing the solicited information (Hintikka 1976). Such
information-theoretic methods are currently further explored in the generalized context
of dynamic epistemic logic, where communicating agents each have access to partial,
private and publicly shared information and seek to share or hide information they may
have depending on their communicative needs and intentions (van Ditmarsch, van der
Hoek & Kooi 2007). Linguistic applications to the semantics of dialogue or multi-agent
conversations in natural language already seem promising (Ginzburg & Sag 2000).
It was first shown in Skolcm (1920) how second order methods could provide novel
tools for logical analysis by rewriting any linear FOL formula with an existential
quantifier in the scope of a universal quantifier into a formula with a quantification prefix
consisting of existential quantifiers ranging over assignment functions, followed by
300
III. Methods in semantic research
only monadic (one place) universal quantifiers binding individual variables. The
dependent first order existential quantifier is eliminated by allowing such quantification over
second-order choice functions that assign the value of the existentially quantified
dependent variable as a function of the referent assigned to the monadic, universally
quantified individual variable preceding it. Linguistic applications using such Skolem functions
have been given in the semantics of questions (Engdahl 1986) and the resolution of
functional pronouns (Winter 1997). The general strategy to liberate FOL from the linear
dependencies of quantifiers by allowing higher order quantification or partially ordered,
i.e. branching quantifier prefixes, was linguistically exploited in the semantic research on
branching quantifiers (Hintikka & Sandu 1997; Barwise 1979). From a linguistic point of
view the identification of linear quantifier scope with bound occurrences of variables in
their bracketed ranges never really seemed justified, since informational dependencies
such as coreference of pronouns bound by an indefinite noun phrase readily cross
sentential boundaries, as we saw in (2c). Furthermore, retaining perfect information on the
referents already assigned to all preceding pronouns smells of unrealistic logical
omniscience, where human memory limitations and contextual constraints are disregarded. It
is obviously much too strong as epistemic requirement on ordinary people sharing their
necessarily always limited, partial information (Seligman & Moss 1997). Game-theoretic
semantics rightly insisted that a proper understanding of the logic of information
independence and hence of the lack of information was just as much needed for natural
language applications, as the logic of binding and other informational dependencies.
Such strategic reconsiderations of the limitations of foundational assumptions of logical
systems have prompted innovative research in logical research programs, considerably
expanding the formal methods available in natural language semantics (Muskens, van
Benthem&Visser 1997).
By exploiting the full scale higher order quantification of the type-theoretic categorial
grammars Montague Grammar first provided a fully compositional account of the
translation of syntactically disambiguated natural language expressions to logical expressions
by treating referential noun phrases semantically on a par with quantificational ones as
generalized quantifiers denoting properties of sets of individuals.This was accomplished
obviously at the cost of generating spurious ambiguities ad libitum, giving up on the
program of modeling linguistic competence realistically (van Benthem & ter Meulen 1985;
Keenan & Westerstahl 1997; Link 1991; Montague 1974; Partee 1979). Its type theory,
based only on two primitive types, e for individual denoting expressions and t for truth-
value denoting expressions, forged a perfect fit between the syntactic categories and the
function-argument structure of their semantics. For instance, all nouns are considered
syntactic objects that require a determiner on their left side to produce a noun phrase
and semantically denote a set of entities, of type <e, t>, which is an element in the
generalized quantifier of type «e, t>, t > denoted by the entire noun phrase. Proper names,
freely referring pronouns, universal and existential NPs are hence treated semantically
on a par as denoting a set of sets of individuals. This fruitful strategy led to a significant
expansion of the fragments of natural languages that were provided with a
compositional model-theoretic semantics, including many kinds of adverbial phrases, degree
and measurement expressions, unusual and complex quantifier phrases, presuppositions,
questions, imperatives, causal and temporal expressions, but also lexical relations that
affected reasoning patterns (Chierchia 1995; Krifka 1989). Logical properties of
generalized quantifiers prove to be very useful in explaining, for instance, not only which noun
14. Formal methods in semantics
301
phrases are acceptable in pleonastic or existential contexts, but also why the processing
time of noun phrases may vary in a given experimental situation and how their semantic
complexity may also constrain their learnability. In pressing on for a proper,
linguistically adequate account of pronouns in discourse, and for a cognitively realistic logic of
information sharing in changing contexts, new tools that allowed for non-linear
structures to represent information content play an important conceptually clarifying role
in separating quantifier scope from the occurrence of variables in the linear or partial
order of formulas of a logical language, while retaining the core model-theoretic insights
in modeling inference as concept based on Tarskian satisfaction conditions.
8. Conclusions
The development of formal methods in logic has contributed essentially to the
emancipation of linguistic research into an academic community where formal methods
were given their proper place as explanatory tool in scientific theories of meaning and
interpretation. Although logical languages arc often designed with a particular
purpose in mind, they reflect certain interesting computational or semantic properties also
exhibited, though sometimes implicitly, in natural languages. The properties of natural
languages that lend themselves for analysis and explanation by formal methods have
increased steadily over the past century, as the formal tools of logical systems were more
finely chiseled to fit the purpose of linguistic explanation better. Even more properties
will most likely become accessible for linguistic explanation by formal methods over the
next century. The issues of cognitive complexity, characterized at many different levels
from the neurobiological, molecular structure detected in neuro-imaging to interactive
behavioral studies, and experimental investigations of processing time provide a new set
of empirical considerations in the application of formal methods to natural language.
They drive experimental innovations and require an interdisciplinary research agenda
to integrate the various modes of explanation into a coherent model of human language
use and communication of information.
The current developments in dynamic natural language semantics constitute major
improvements in expanding linguistic application to a wider range of discourse
phenomena. The forms of reasoning in which context dependent expressions may change
their reference during the processing of the premises are now considered to be interesting
aspects of natural languages, that logical systems are challenged to simulate, rather than
avoid, as our great-grandfathers' advice originally directed us to do. There is renewed
attention to limit in a principled and empirically justified way the search space complexity
to decidable fragments of FOL and to restrict the higher order methods in order to reduce
the complexity to model cognitively realistic human processing power. Such
developments in natural language will converge eventually with the syntactic research programs
focusing on universals of language as constants of human linguistic competence.
9. References
Ades, Anthony E. & Mark J. Steedinan 1982. On the order of words. Linguistics & Philosophy 4,
517-558.
Ajdukiewicz, Kazimierz 1935. Die syntaktisclie Konnexitat. Stiidia Philosophica 1. 1-27. English
translation in: S. McCall (ed.). Polish Logic. Oxford: Clarendon Press, 1967.
302
III. Methods in semantic research
Arbib, Michael A. 1969. Theories of Abstract Automata. Englewood Cliffs, NJ: Prentice Hall.
Asher, Nicholas 1993. Reference to Abstract Objects in Discourse. Dordrecht: Kluwer.
Bar-Hillel, Yehoshua 1964. Language and Information: Selected Essays on their Theory and
Application. Reading, MA: Addison-Wesley.
Barendregt. Hendrik P. 1984. The Lambda Calculus. Amsterdam: North-Holland.
Barwise, Jon 1979. On branching quantifiers in English. Journal of Philosophical Logic 8,47-80.
Barwise, Jon 1989. The Situation in Logic. Stanford, CA: CSLI Publications.
Barwise, Jon & Robin Cooper 1981. Generalized quantifiers and natural language. Linguistics &
Philosophy 4,159-219.
Barwise. Jon & John Etchemendy 1987. The Liar. An Essay on Truth and Circularity. Oxford:
Oxford University Press.
Barwise. Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA:The MIT Press.
van Benthem, Johan 1983. The Logic of Time: A Model-Theoretic Investigation into the Varieties of
Temporal Ontology and Temporal Discourse. Dordrecht: Reidel.
van Bcnthcm, Johan 1986. Essays in Logical Semantics. Dordrecht: Rcidcl.
van Benthem, Johan 1987. Categorial grammar and lambda calculus. In: D. Skordev (ed.).
Mathematical Logic and its Applications. New York: Plenum, 39-60.
van Benthem, Johan 1988.The Lambek calculus. In: R.T. Oehrle, E. Bach & D. Wheeler (eds.).
Categorial Grammars and Natural Language Structures. Dordrecht: Reidel, 35-68.
van Benthem, Johan 1991. Language in Action: Categories, Lambdas and Dynamic Logic.
Amsterdam: North-Holland.
van Benthem, Johan & Alice ter Meulen 1985. Generalized Quantifiers in Natural Language.
Dordrecht: Foris.
van Benthem, Johan & Alice ter Meulen 1997. Handbook of Logic and Language. Amsterdam:
Elsevier.
Beth, Evert 1970. Aspects of Modern Logic. Dordrecht: Rcidcl.
Boolos, George & Richard Jeffrey 1980. Computability and Logic. Cambridge: Cambridge
University Press.
Buszkowski, Wojciech 1988. Generative power of categorial grammars. In: R.T. Oehrle. E. Bach &
D. Wheeler (eds.). Categorial Grammars and Natural Language Structures. Dordrecht: Rcidcl,
69-94.
Buszkowski, Wojciech & Witold Marciszewski 1988. Categorial Grammar. Amsterdam: Benjamins.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago, IL:The University of Chicago Press.
Carpenter, Bob 1997. Type-Logical Semantics. Cambridge, MA:The MIT Press.
Chierchia, Gennaro 1995. Dynamics of Meaning: Anaphora, Presupposition, and the Theory of
Grammar. Chicago, IL:The University of Chicago Press.
Chomsky, Noam 1957. Syntactic Structures.1\\e Hague: Mouton.
Chomsky, Noam 1959. On certain formal properties of grammars. Information and Control 2,
137-167.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA:The MIT Press.
Chomsky, Noam & George A. Miller 1958. Finite-state languages. Information and Control 1,
91-112.
Chomsky, Noam & George A. Miller 1963. Introduction to the formal analysis of natural languages.
In: R.D. Luce, R. Bush & E. Galanter (eds.). Handbook of Mathematical Psychology, vol. 2. New
York: Wiley, 269-321.
Crcsswcll, Max J. 1985. Structured Meanings: The Semantics of Propostitional Attitudes. Cambridge,
MA: The MIT Pi ess.
Curry, Haskell B. 1961. Some logical aspects of grammatical structure. In: R. Jakobson (ed.).
Structure of Language and its Mathematical Aspects. Providence, RI: American Mathematical Society,
56-68.
Davidson, Donald & Gilbert Harinan 1972. Semantics of Natural Language. Dordrecht: Reidel.
14. Formal methods in semantics
303
Davis, Martin 1965. The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable
Problems and Computable Functions. Hewlett, NY: Raven Press
van Ditmarsch, Hans, Wiebe van der Hoek & Barteld Kooi 2007. Dynamic Epistemic Logic.
Dordrecht: Springer.
Dowty, David, Robert Wall & Stanley Peters 1981. Introduction to Montague Semantics. Dordrecht:
Reidel.
Engdahl, Elisabeth 1986. Constituent Questions. Dordrecht: Reidel.
Frege, Gottlob 1892. Uber Sinn und Bedeutung. Zeitschrift fur Philosophic und philosophische
Kritik 100,25-50. Reprinted in: G. Patzig (ed.). Funktion, Begriff Bedeutung. Fiinf logische Stu-
dien. 3rd edn. Vandenhoeck Hoeck & Ruprecht: Gottingen, 1969, 40-65. English translation
in: P. Geach & M. Black (eds.). Translations from the Philosophical Writings of Gottlob Frege.
Oxford: Blackwell, 1980,56-78.
Gabbay, Dov & Franz Guenthner 1983. Handbook of Philosophical Logic, vol. 1-4. Dordrecht:
Reidel.
Gallin, Daniel 1975. Intensional and Higher-Order Modal Logic. With Applications to Montague
Semantics. Amsterdam: North-Holland.
Gentzen, Gerhard 1934. Untersuchungen iiber das logische SchlieBen I & II. Mathematische
Zeitschrift 39,176-210,405-431.
Ginzburg, Joliiiathan & Ivan Sag 2000. Interrogative Investigations: The Form, Meaning and Use of
English Interrogatives. Stanford, CA: CSLI Publications.
Groenendijk, Jeroen & Martin Stokhof 1991. Dynamic Predicate Logic. Linguistics & Philosophy
14,39-100.
Gross, Maurice & Andre Lentin 1970. Introduction to Formal Grammars. Berlin: Springer.
Haegeman, Liliane 1991. Introduction to Government and Binding Theory. Oxford: Blackwell.
Heim, Irene R. 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D.
dissertation. University of Massachusetts, Amherst, MA. Reprinted: Ann Arbor, MI: University
Microfilms.
Hintikka, Jaakko 1976. The Semantics of Questions and the Questions of Semantics: Case Studies in
the Interrelations of Logic, Semantics, and Syntax. Amsterdam: North-Holland.
Hintikka, Jaakko & Jack Kulas 1985. Anaphora and Definite Descriptions: Two Applications of
Game-Theoretical Semantics. Dordrecht: Reidel.
Hintikka, Jaakko & Gabriel Sandu 1997. Game-theoretical semantics. In: J. van Benthein & A. ter
Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 361-410.
Hodges, Wilfrid 1985. Building Models by Games. Cambridge: Cambridge University Press
Hopcroft, John E. & Jeffrey D. Ullman 1979. Introduction to Automata Theory, Languages and
Computation. Reading, MA: Addison-Wesley.
Jeffrey, Richard 1967. Formal Logic: Its Scope and Limits. New York: McGraw-Hill.
Kamp, Hans 1981. A theory of truth and semantic representation. In: J. Groenendijk (ed.). Formal
Methods in the Study of Language. Amsterdam: Mathematical Centre, 277-322.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Keenan, Edward L. & Leonard M. Faltz 1985. Boolean Semantics for Natural Language. Dordrecht:
Reidel.
Keenan, Edward & Dag Westerstahl 1997. Generalized quantifiers in linguistics and logic. In: J.
van Benthem & A. ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier,
837-893.
Krifka, Manfred 1989. Nominal reference, temporal constitution and quantification in event
semantics. In: R. Bartsch, J. van Bcnthcm & P. van Enidc Boas (eds.). Semantics and Contextual
Expressions. Dordrecht: Foris, 75-115.
Kripke, Saul A. 1972. Naming and necessity. In: D. Davidson & G. Haimaii (eds.). Semantics of
Natural Language. Dordrecht: Reidel, 253-355 and 763-769.
Lambek, Joachim 1958. The mathematics of sentence structure. American Mathematical Monthly
65,154-170.
304
III. Methods in semantic research
Lewis, David 1972. General semantics. In: D. Davidson & G. Harman (eds.). Semantics of Natural
Language. Dordrecht: Reidel, 169-218.
Lewis, David 1983, Philosophical Papers, vol. I. Oxford: Oxford University Press.
Link, Godehard 1991. Formale Methoden in der Semantik. In: A. von Stechow & D. Wunderlich
(eds.). Semantik. Ein internationales Handbuch der zeitgenossischen Forschung (HSK 6). Berlin:
de Gruyter, 835-860.
May, Robert 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: The MIT
Press.
McCawley, James D. 1981. Everything that Linguists Have Always Wanted to Know About Logic But
Were Ashamed to Ask. Chicago, IL:Tlie University of Chicago Press.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and
with an introduction by Richard H.TIiomason. New Haven, CT: Yale University Press.
Moortgat, Michael 1996. Multimodal linguistic inference. Journal of Logic, Language and
Information 5,349-385.
Moortgat, Michael 1997. Categorial type logics. In: J. van Benthem & A. ter Meulen (eds.).
Handbook of Logic and Language. Amsterdam: Elsevier, 93-179.
Morrill, Glyn 1994. Type Logical Grammar. Categorial Logic of Signs. Dordrecht: Kluwer.
Morrill, Glyn 1995. Discontinuity in categorial grammar. Linguistics & Philosophy 18,175-219.
Muskens, Re in hard, Johan van Benthein & Albert Visser 1997. Dynamics. In: J. van Benthem & A.
ter Meulen (eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 587-648.
Nishihara, Noritaka, Kcnichi Morita & Shigenori Iwata 1990. An extended syllogistic system
with verbs and proper nouns, and its completeness proof. Systems and Computers in Japan 21,
96-111.
Oehrle, Richard T, Ennnon Bach & Deirdre Wheeler 1988. Categorial Grammars and Natural
Language Structures. Dordrecht: Reidel.
Partee, Barbara 1979. Semantics - mathematics or psychology? In: R. Bauerle, U. Egli & A. von
Stechow (eds.). Semantics from Different Points of View. Berlin: Springer.
Partee, Barbara, Alice ter Meulen & Robert Wall 1990. Mathematical Methods in Linguistics.
Dordrecht: Kluwer.
Pentus, Mati 2006. Lambek calculus is NP-complete. Theoretical Computer Science 357,186-201.
Peters, Stanley & Richard Ritchie 1973. On the generative power of transformational grammars.
Information Sciences 6,49-83.
Pratt-Hartmann, Ian 2003. A two-variable fragment of English. Journal of Logic, Language &
Information 12,13-45.
Pratt-Hartmann, Ian & Allan Third 2006. More fragments of language. Notre Dame Journal of
Formal Logic 47,151-177.
Reinhart,Tanya 1983a. Anaphora and Semantic Interpretation. London: doom Helm.
Reinhart,Tanya 1983b. Coreference and bound anaphora: A restatement of the anaphora question.
Linguistics & Philosophy 6,47-88.
van Riemsdijk, Henk & Edwin Williams 1986. Introduction to the Theory of Grammar. Cambridge.
MA:The MIT Press.
Savitch, Walter J.. Emmon Bach & Wiliam Marsh 1987. The Formal Complexity of Natural
Language. Dordrecht: Reidel.
Seligman, Jerry & Lawrence S. Moss 1997. Situation theory. In: J. van Benthem & A. ter Meulen
(eds.). Handbook of Logic and Language. Amsterdam: Elsevier, 239-280.
Skolcm, Thoralf 1920. Logisch-kombinatoi isclic Untcrsuchungen iibcr die Erfullbarkcit odcr
Bcwcisbarkcit mathematischer Satzc nebst cincm Theorem iibcr dichtc Mcngcn. Videnskaps-
selskapets Skrifter I, Matem-naturv. KI.I4,1-36.
Stabler. Edward 1997. Derivational minimalism. In: C. Retore" (ed.). Logical Aspects of
Computational Linguistics. Berlin: Springer. 68-95.
Stalnaker, Robert 1999. Context and Content Essays on Intentionality in Speech and Thought.
Oxford: Oxford University Press.
15. The application of experimental methods in semantics 305
Tarski, Alfred 1956. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Oxford:
Clarendon Press.
Winter, Yoad 1997. Choice functions and the scopal semantics of indefinites. Linguistics &
Philosophy 20,399-467.
Alice G.B. ter Meulen, Geneva (Switzerland)
15. The application of experimental
methods in semantics
1. Introduction
2. The stumbling blocks
3. Off-line evidence for scope interpretation
4. Underspecification vs. full interpretation
5. On-line evidence for representation of scope
6. Conclusions
7. References
Abstract
The purpose of this paper is twofold. On the methodological side, we shall attempt to
show that even relatively simple and accessible experimental methods can yield significant
insights into semantic issues. At the same time, we argue that experimental evidence, both
the type collected in simple questionnaires and measures of on-line processing, can inform
semantic theories. The specific case that we address here concerns the investigation of
quantifier scope. In this area, where judgements are often subtle and controversial, the gradient
data that psycholinguistic experiments provide can be a useful tool to distinguish between
competing approaches, as we demonstrate with a case study. Furthermore, we describe
how a modification of existing experimental methods can be used to test predictions of
underspecification theories. The program of research we outline here is not intended to be
a prescriptive set of instructions for researchers, telling them what they should do; rather it
is intended to illustrate some problems an experimental semanticist may encounter but also
the profit of this enterprise.
1. Introduction
A wide range of data types and sources are used in the field of semantics, as is
demonstrated by the related article 12 (Krifka) Varieties of semantic evidence in this volume.
The aim of this article is to show with an example research study series what sort of
questions can be addressed with experimental tools and suggest that these methods can
deliver valuable data which is relevant to basic assumptions in semantics. This text also
attempts to address the constraints on and limits to such an approach. These are both
methodological and theoretical: it has long been recognized that links between empirical
measures and theoretical constructs require careful argumentation to establish.
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 305-321
306
III. Methods in semantic research
The authors therefore have two aims: one related to experimental methodologies
and the other to do with the value of processing data. They first seek to show that even
relatively simple and accessible experimental methods can yield significant insights into
semantic issues. They second wish to illustrate that experimental evidence such as that
gathered in their eye-tracking study has the potential to inform semantic theory.
Semanticists have of course always sought confirmatory evidence to support their
analyses. There is, on the one hand, fairly extensive use of computational techniques
and corpus data in the field, and a growing body of experimental work on semantic
processing, language acquisition, and pragmatics, but in the area of theoretical and formal
semantics the experimental methods are less frequently employed.
Now there arc good reasons for this. There arc inherent factors related to the
accessibility of the relevant measures why controlled data gathering techniques are still
somewhat less frequent in this field than in some others. We shall discuss what these reasons
are and demonstrate with a case study what constraints they place on empirical studies,
particularly experimental studies.The example research program that we shall report is
thus not simply a recipe for others for what should be done, rather it is an illustration
of the difficulties involved, which aims to explore some of the boundaries of what is
accessible to experimental studies.
The specific case that we address here concerns the investigation of quantifier scope, a
perennial issue in semantics. Previous attempts to account for the complex data patterns
to be found in natural languages have met with the difficulty that the causal factors and
preferences need first to be identified before a realistic model can be developed. This
requires as an initial step the capture and measurement of the relevant effects and their
interactions, which is no trivial task.
The next section lays out a range of reasons why semanticists do not routinely seek to
test the empirical bases of their theories with simple experiments. Section 3 reports the
scries of empirical investigations on quantifier scope carried out by Bott and Rado in
ongoing research. Section 4 lays out some of the theoretical background and importance of
these studies for current theory (the underspecification debate).The final section takes
as a starting point Bott and Rad6 (2009) to suggest how some of the problems noted in
section 3 may be overcome with a more sophisticated experimental procedure.
2. The stumbling blocks
As Manfred Krifka notes in his neighbouring article 12 (Krifka) Varieties of semantic
evidence, a major problem with investigating meaning is that we cannot yet fully define
what it is. This is indeed a root cause of difficulty, but here we shall attempt to illustrate
in more practical detail what effects this has on attempts to conduct experiments in this
field.
2.1. Specifying meaning without using language
The essential feature distinguishing experiment procedure is control. In language
experiments we may distinguish three (sets of) variables: linguistic form, context, and meaning.
In the typical experiment we will keep two of them constant and systematically vary
the other. Much semantic research concerns the systematic interdependence of form,
context, and meaning. These issues can be investigated for example by:
15. The application of experimental methods in semantics
a) keeping form and context constant, manipulating meaning systematically, and
measuring the felicity of the outcome (in judgements, or reaction times, or processing
effort), or
b) manipulating (at least one of) form and context, and measuring perceived meaning.
The first requires the experimenter to manipulate meaning as a variable, which entails
expressing meaning in a form other than language, (pictures, situation descriptions, etc);
the second requires the experimenter to measure perceived meaning, which again
normally demands reference to meanings captured in non-linguistic form. But precisely this
expression of tightly constrained meaning in non-linguistic form is very difficult.
To show how this factor affects studies in semantics disproportionately, it is worth
noting how this makes controlled studies in semantics more challenging than in syntax.
Work in experimental syntax is often interested in addressing precisely those effects of
form change which are independent of meaning. The variable meaning can thus be held
constant, but this does not require it to be exactly specified.Thus only the syntactic
analysis need be controlled, which makes empirical studies in syntax a whole parameter less
difficult than those in semantics.
2.2. The boundaries of form, context, and meaning
A further problem of exact studies concerning meaning is that the three variables are not
always clearly distinguished, in part because they systematically covary, but also in part
because linguists do not always agree about the boundaries. This is particularly visible
when we seek to identify where an anomaly lies. Views have changed over time in
linguistics about the nature and location of ill-formedness (e.g. the discussion of the status
of I am lurking in a culvert in Ross 1970) but the fundamental ambiguity is still with us.
For example, Weskott & Fanselow (2009) give the following examples and judgements
of syntactic and semantic well-formedness: (la) is syntactically ill-formed (*), (lb) is
semantically ill-formed (#), and (lc) is ill-formed on both accounts (*#).
(1) a. *Die Suppe wurde gegen versalzen.
the soup was against oversalted
b. #Der Zug wurde gekaut.
the train was chewed
c. *#Das Eis wurde seit entziindet.
the ice was since inflamed
Our own judgements suggest that the structures in (1-a) and (1-c) have no acceptable
syntactic analysis, and therefore no semantic analysis can be constructed - they are
thus both syntactically and semantically ill-formed. Crucially, the semantic anomaly is
dependent upon the syntactic problem; the lack of a recognizable compositional
interpretation is a result of the lack of a possible structural analysis. We would therefore
regard these examples as primarily syntactically unacceptable. This contrasts with (1-b),
which we regard as well-formed on both parameters, being merely implausible, except
in a small child's playroom, where a train being chewed is an entirely normal situation
(cf. Hahne & Friederici 2002).
308
III. Methods in semantic research
2.3. Plausibility
Such examples highlight another problem in manipulating meaning as an experimental
variable: the human demand to make sense of linguistic forms. We associate possible
meanings with things that we can accept as being true or plausible. So 'the third-floor
appartment reappeared today', which is both syntactically and semantically flawless, will
cause irrelevant experimental effects since subjects will find it difficult to fit the meaning
into their mental model of the world. Zhou & Gao (2009) for example argue that
participants interpret Every robber robbed a bank in the surface scope reading because it is
more plausible that each robber robbed a different bank.
This links in to a wider discussion of the role of plausibility as a factor in semantic
processing and as a filter on possible readings. Zhou & Gao (2009) claim that such doubly
quantified sentences arc ambiguous in Mandarin, since their experimental evidence
suggests that both interpretations are built up in parallel, but one reading is subsequently
filtered out by plausibility, which accounts for the contrary judgements in work on semantic
theory (e.g. Huang 1982, Aoun & Li 1989).
2.4. Meaning as a complex measure
The meaning of a structure is not fixed or unique, even when linguistic, social, and
discourse context are fixed. First, a single expression may have multiple readings, which
compete for dominance. Often a specific relevant reading of a structure needs to be
forced in an experiment. Some readings of theoretical interest may be quite inaccessible,
though nevertheless real.This raises the issue of expert knowledge, which again contrasts
with the situation in syntax. Syntactic well-formedness judgements are generally
available and accessible to any native speaker and require no expertise. On the other hand, it
can require specialist knowledge to 'get1 some readings since the access to variant
readings is usually via different analyses. This is a crucial point in semantics, since it reduces
the likelihood that the intuitions of the naive native speaker can be the final arbiter
in this field, as they can reasonably be argued to be in syntax (Chomsky 1965). A fine
example of this is from Hobbs & Schieber (1987):
(2) Two representatives of three companies saw most samples.
They claim that this sentence is five-ways ambiguous. Park (1995) however denies the
existence of one of these readings {three > most > two). It is doubtful whether this
question is solvable by asking naive informants.
Even within a given analysis of a construction, the meaning may not be fully
determined. Aspects of meaning are left unspecified, which means that two different per-
ceivers can interpret a single structure in different ways.This too requires great care and
attention to detail when designing experiments which aim to be exact.
2.5. The observer's paradox
A frequent aim in semantic experiments is to discover how subjects interpret linguistic
input under normal conditions. A constant problem is how experimenters can access
this information, because whatever additional task we instruct the subjects to carry out
renders the conditions abnormal. For example, if we ask them to choose which one of a
15. The application of experimental methods in semantics 309
pair of pictures illustrates the interpretation that they have gathered, or even if we just
observe their eye movements, the very presence of two pictures is likely to make them
more aware that more than one interpretation is possible, thus biasing the results. Even
a single picture can alter or trigger the accessibility of a reading.
2.6. Inherent meaning and inferred meaning
One last linguistic distinction which we should note here is that between the inherent
meaning of an expression ("what is said") and the inferred meaning of a given utterance
of an expression.This distinction is fundamental in the division of research into meaning
into separate fields, but it is in practice very difficult to apply in experimental work, since
naive informants do not naturally differentiate the two.The recent 'literal Lucy' approach
of Larson et al. (2010) is a promising solution to this problem; in this paradigm
participants must report how 'literal Lucy', who only ever perceives the narrowly inherent
meaning of utterances and makes no inferences, would understand example sentences.
This distinction is particularly important when an experimental design requires a
disambiguation, and extreme care must be taken that its content is not only inferred. For
example, in (3), it is implicated that every rugby player broke one of their own fingers, but
this is not necessarily the case.This example cannot thus offer watertight disambiguation.
(3) Every rugby player broke a finger.
Implication: Every rugby player broke one of their own fingers.
2.7. Experimental measures and the object of theory
As a rule, semantic theory makes no predictions about semantic processing. Instead it
concerns itself with the final stable interpretation which is achieved after a whole
linguistic expression, usually at the sentence level, has been processed and all reanalyses,
for example as a result of garden paths, have been resolved. It fundamentally concerns
the stative, holistic result of the processing of an expression, indeed many
theoretical approaches regard meaning as only coming about in a full sentence (cf. article 8
(Meier-Oeser) Meaning in pre-19th century thought). But the processing of a sentence
is made up of many steps which are incremental and which interact strongly with each
other, partly predicting, partly parsing input as it arrives, partly confirming or revising
previous analyses. Much of the experimental evidence available to us provides direct
evidence only of these processing steps.
It thus follows that for many semantics practitioners much of the empirical evidence
which we can gather concerns at best out predictions about what the sentence is going to
mean, not really aspects of its actual meaning. The time course of our arriving at a
particular reading, whether it be remote or readily accessible, has no direct implications for
the theory, since this makes no predictions about processing speed (cf. Phillips & Wagers
2007). One aim of this article is to show that experimental techniques can deliver data
which can contribute to theory building.
2.8. Categorical predictions and gradient data
Predictions of semantic theories typically concern the availability of particular
interpretations. Experiments deliver more fine-grained data that reflect the relative preferences
310
III. Methods in semantic research
among the interpretations.Mapping these gradient data onto the categorical predictions,
that is, drawing the line between still available and impossible readings is a non-trivial
task. At the same time, the ability to distinguish preferences among the "intermediate"
interpretations may be highly relevant for testing predictions concerning readings that
fall between the clearly available and the clearly impossible.
2.9. Outlook
In the remainder of this paper we will discuss two ways in which systematically collected
experimental data can contribute to semantic theorizing. We will use quantifier scope as
an example of a phenomenon where results of psycholinguistic experiments can make
significant contributions to the theoretical discussions. We will not attempt to review
here the considerable psycholinguistic literature on the processing of quantifiers (for a
comprehensive survey cf. article 102 (Frazier) Meaning in psycholinguistics). Instead we
will concentrate on a small set of studies that show the usefulness of end-of-sentence
judgements in establishing the available interpretations of quantified sentences.Then we
will sketch an experiment to address aspects of the unfolding interpretation of quantifier
scope which are of interest to theoretical semanticists as well.
3. Off-line evidence for scope interpretation
Semantic theories are typically based on introspective judgements of a handful of
theoreticians. The judgements concern available readings of a sentence, possibly ranked
as to how easily available these readings are. Not surprisingly, judgements of this sort
are subtle and often controversial. For instance, the sentence Everyone loves someone
has been alternately considered to only allow the wide-scope universal reading (e.g.
Hornstein 1995; Beghelli & Stowell 1997) or to be fully ambiguous (May 1977, 1985;
Hornstein 1984;Higginbotham 1985). Example (2) above illustrates the same point. Park
(1995) and Hobbs & Shieber (1987) disagree about the number of available readings.
The data problem has been known for a long time. Studies as early as Ioup (1975) and
VanLehn (1978) have used the intuitions of naive speakers in developing an empirically
motivated theory. However, it has been clear from the beginning that "obvious" tasks
such as paraphrasing a presumably ambiguous doubly-quantified sentence or asking
informants to choose a (preferred) paraphrase are rather complex and that linguistically
untrained participants may not be able to carry them out reliably.
Another purely linguistic task has been problematic for a different reason.
Researchers have tried to combine the quantified sentence with a disambiguating
continuation, as in (4).
(4) Every kid climbed a tree.
(a) The tree was full of apples.
(b) The trees were full of apples.
Disambiguation of this type was used by Gillen (1991), Kurtzman & MacDonald (1993),
Tunstall (1998) and Filik, Paterson & Liversedge (2004), for instance. Here the plural
continuation is only acceptable if multiple trees are instantiated, that is, the wide-scope
universal interpretation (every kid > a tree) is chosen, whereas the singular continuation
15. The application of experimental methods in semantics 31J_
is intended to only fit the wide-scope existential interpretation (a tree > every kid).
Unfortunately the singular continuation fails to disambiguate the sentence, as Tunstall
(1998) points out: the tree (4a) can easily be taken to mean the tree the kid climbed,
thus making it compatible with the wide-scope universal interpretation as well (see also
Bott & Rad6 2007 and article 102 (Frazier) Meaning in psycholinguistics).
Problems of these kinds have prompted researchers to look for non-linguistic methods
of disambiguation. Gillen (1991) used, among other methods, simple pictures resembling
set diagrams. In her experiments subjects either drew diagrams to represent the meaning
of quantified sentences, chose the diagram that corresponded to the (preferred) reading
or judged how well the situation depicted in the diagram fitted the sentence. Bott & Rad6
(2007) tested a somewhat modified form of the last of these methods using diagrams like
those in Fig. 15.1, to see whether they constitute a reliable mode of disambiguation that
naive informants can use easily.They found that participants consistently delivered the
expected judgements both for scopally unambiguous quantified sentences (i.e. sentences
where one scope reading was excluded due to an intervening clause boundary) and for
ambiguous quantified sentences where expected preferences could be determined based
on theoretical considerations and corpus studies. These results show that there is no a
priori reason to exclude the judgements of non-linguist informants from consideration.
A) exactly one > each B) each > exactly one
students novels students novels
Fig. 15.1: Disambiguating diagrams for the sentence: "Exactly one novel was read by each student"
For informative experiments, however, we need to be able to derive testable hypotheses
based on existing semantic proposals. Although semantic theories are not formulated
to make predictions about processing, it is still possible to identify areas where
different approaches lead to different predictions concerning the judgement of particular
constructions.The interpretation of quantifiers provides an example here as well.
One possible way of classifying theories of quantifier scope has to do with the way
different factors are supposed to affect the scope properties of quantifiers. In configura-
tional models such as Reinhart (1976, 1978, 1983, 1995) and Beghelli & Stowell (1997),
quantifiers move to/are interpreted in different structural positions. A quantifier higher
in the (syntactic) tree will always outscope lower ones.The absolute position in the tree
is irrelevant; what matters is the position relative to the other quantifier(s). While
earlier proposals only considered syntactic properties of quantifiers, Beghelli and Stowell
also include semantic factors in the hierarchy of quantifier positions.Taking distributivity
as an example, assuming that a +dist quantifier is interpreted in Spec,QP which is the
312
III. Methods in semantic research
highest position available for quantifiers, Ql will outscope Q2 if only Ql is +dist,
regardless of what other properties Ql or Q2 may have. An effect of other factors will only
become apparent if neither of the quantifiers is +dist.
By contrast, the basic assumption in multi-factor theories of quantifier scope is that
each factor has a certain amount of influence on quantifier scope regardless of the
presence or absence of other factors (cf. Ioup 1975; Kurtzman & MacDonald 1993; Kuno 1991
and Pafel 2005).The effects of different factors can be combined, resulting in greater or
lesser preference for a particular interpretation. Theories differ in whether one of the
readings disappears when it is below some threshold, or whether sentences with multiple
quantifiers are always necessarily ambiguous.
Let us assume that the two scope-relevant factors we arc interested in arc
distributivity and discourse-binding, the latter indicated by the partitive NP one of these N, see
(6). Crossing these factors yields four possible combinations: +dist/+d-bound, +dist/-d-
bound, -dist/+d-bound, and -dist/-d-bound. In a configurational theory presumably there
will be a structural position reserved for discourse-bound phrases. Let us consider the
case where this position is lower than that for +dist, but higher than the lowest scope
position available for quantifiers. Thus Ql should outscope Q2 in the first two
configurations, Q2 should outscope Ql in the third, and the last one may in fact be fully scope
ambiguous unless some additional factors are at play as well. Moreover, as
configurational theories of scope have no mechanism to predict relative strength of scope
preference, the first two configurations should show the same size preference for a wide-scope
interpretation of Ql.In statistical terms, we expect an interaction: d-binding should have
an effect when Ql is -dist, but not when it is +dist.
In multi-factor theories, on the other hand, the prediction would usually be that the
effects of the different factors should add up.That is, the difference in scope bias between
a d-bound and a non-d-bound +dist quantifier should be the same as between a d-bound
and a non-d-bound -dist quantifier. A given factor should be able to exert its influence
regardless of the other factors present.
Bott and Rad6 have been testing these predictions in on-going work. In two
questionnaire studies subjects read doubly-quantified German sentences and used magnitude
estimation to indicate how well disambiguating set diagrams fitted the interpretation of
the sentence. Experiment 1 manipulated distributivity and linear order and used
materials like (5). Experiment 2 tested the factors distributivity and d-binding using sentences
like (6).
(5) a. Genau einen dieser Professoren haben alle Studentinnen verehrt.
Exactly one these professors have all female students adored.
All female students adored exactly one of these professors.
b. Genau einen dieser Professoren hat jede Studcntin verehrt.
Exactly one these professors has each female students adored.
Each female student adored exactly one of these professors.
c. Alle Studentinnen haben genau einen dieser Professoren verehrt.
All female students have exactly one these professors adored.
All female students adored exactly one of these professors.
15. The application of experimental methods in semantics 313
d. Jede Studentin hat genau einen dieser Professoren verehrt.
Each female student has exactly one these professors adored.
Each female student adored exactly one of these professors.
(6) a. Genau einen Professor haben alle diese Studentinnen verehrt.
Exactly one professor have all these female students adored.
All of these female students adored exactly one professor.
b. Genau einen dieser Professoren haben alle Studentinnen verehrt.
Exactly one these professors have all female students adored.
All female students adored exactly one of these professors.
c. Genau einen Professor hat jede dieser Studentinnen verehrt.
Exactly one professor has each these female students adored.
Each of these female students adored exactly one professor.
d. Genau einen dieser Professoren hat jede Studentin verehrt.
Exactly one these professors has each female student adored.
Each female student adored exactly one of these professors.
Bott and Rad6 found clear evidence for the influence of all three factors. The
distributive quantifier jeder took scope more easily than alle, d-binding of a quantifier and
linear precedence both resulted in a greater tendency to take wide scope. Crucially, the
effects were additive, which is compatible with the predictions of multi-factor theories
but unexpected under configurational approaches.
These results show that even simple questionnaire studies can deliver theoretically
highly relevant data.This is particularly important in an area like quantifier scope, where
the judgements are typically subtle and not always accessible to introspection. Of course
the study reported here cannot address all possible questions concerning the
interpretation of quantified sentences like those in (5)-(6). It cannot for example clarify whether
the processor initially constructs a fully specified representation of quantifier scope or
whether it first builds only a underspecified structure which is compatible with both
possible readings, an outstanding question of much current interest in semantics. The data
that wc have presented so far is off-line, in that it measures preferences only at the end
of the sentence, when its content has been disambiguated. In section 5 we present an
experimental design which allows investigation of the on-going (on-line) processing of
scope ambiguities. In the next section we relate the semantic issue of underspecification
to experimental data and predictions for on-line processing.
4. Underspecification vs. full interpretation
It is generally agreed that syntactic processing is incremental in nature (e.g. van Gompel &
Pickering 2007) i.e. a full-fledged syntactic representation is assigned to every incoming
word. Whether semantic processing is incremental in the strict sense, is far from being
314
III. Methods in semantic research
beyond dispute and is still an empirical question. To formulate hypotheses about the
time-course of semantic processing, we will now look at the on-going debate in semantic
theory on underspecification in semantic representations.
Underspecified semantic representations are a tool intended to handle the problem
of ambiguity. The omission of parts of the semantic information allows one single
representation to be compatible with a whole set of different meanings (for an overview of
underspecification approaches, see e.g. Pinkal 1999; articles 24 (Egg) Semantic
underspecification and 108 (Pinkal & Koller) Semantics in computational linguistics). It is thus
an economical method of dealing with ambiguity in that it avoids costly reanalysis, used
above all in computational applications.
Taking the psycholinguistic perspective, one would predict that constructing under-
specified representations in semantically ambiguous regions of a sentence avoids
processing difficulties in ambiguous regions and at the point of disambiguation (Frazier &
Rayner 1990).
Underspecification can be contrasted with an approach that assumes strict incre-
mentality and thus immediate full interpretation even in ambiguous regions.This would
predict processing difficulties in cases of disambiguations to non-preferred readings.
A candidate for a semantic processing principle guiding the choice of one specified
semantic representation would be a complexity-sensitive one (for example .-"Avoid
quantifier raising" captured in Tunstall's Principle of Scope Interpretation 1998 and
Anderson's 2004 Processing Scope Economy).
In the psycholinguistic investigation of coercion phenomena, the experimental
evidence is interpreted along these lines. Processing difficulties at the point of
disambiguation are taken as evidence for full semantic interpretation (see e.g. Pinango, Zurif &
Jackendoff 1999;Todorova, Straub, Badecker & Frank 2000) whereas the lack of
measurable effects is seen as support for an underspecified semantic representation (see e.g.
Pylkkancn & McElrcc 2006; Pickering, McElrcc, Frisson, Chen & Traxlcr 2006).
Analogously, in the processing of quantifier scope ambiguities, experimental
evidence for processing difficulties at the point of disambiguation will be interpreted as
support for full interpretation. However, this need not be taken as final. If we look at
underspecification approaches in semantics, non-semantic factors are mentioned which
might explain (and predict) difficulties in processing local scope ambiguities (see article
24 (Egg) Semantic underspecification, section 6.4.1.). And these are exactly the factors
which arc assumed by multi-factor theories to have an impact on quantifier scope:
syntactic structure and function, context, and type of quantifier. The relative weighting and
interaction of these factors are not made fully explicit, however.
For the full picture, it would be necessary to examine not only the point of
disambiguation but also the ambiguous part of the input, for it is there that the effects of these
factors might be identified. Underspecification is normally only temporary, however, and a
full interpretation will presumably be constructed at some stage (but see Sanford & Sturt
2002). This might be recognizable for example in behavioral measures, but the precise
predictions of underspecification theory are not always clear. For example, it might be
assumed that even representations which are never fully specified by the input signal (or
context) do receive more specific interpretations at some later stagc.This of course raises
the question what domains of interpretation are relevant here (sentence boundary,
utterance, ...). In the next section we present experimental work which may offer a starting
point for the empirical investigation of such issues.
15. The application of experimental methods in semantics 315
5. On-line evidence for representation of scope
The underspecification view would predict that relative scope should remain underspeci-
ficd as long as neither interpretation is forced. Indeed there should not even be any
preference for one reading. The results of the questionnaire studies reported in Section
3 already indicate that this view cannot be right: a particular combination of factors was
found to systematically support a certain reading. Furthermore it is unlikely that the
task itself introduced a preference towards one interpretation - although the diagram
representing the wide-scope existential reading was somewhat more complex, this did
not seem to interfere with participants' performance. The observed preferences must
thus be due to the experimental manipulation.That is, even if all possible interpretations
are available up to the point where disambiguating information arrives, there must be
some inherent ranking of the various scope-determining factors that results in certain
interpretations being more activated than others.
Off-line results such as those discussed above are thus equally compatible with two
different explanations; one where quantifier scope is fully determined (at least) by the
end of the sentence, and another one where several (presumably all combinatorially
possible) interpretations are available but weighted differently. A different methodology is
needed to find out whether there is any psycholinguistic support for an underspecified
view of quantifier scope.
As it turns out, the currently existing results of on-line studies are no more able to
distinguish the two alternatives than are offline studies. In on-line experiments a scope-
ambiguous initial clause is followed by a second one that is only compatible with one
scope reading. An indication of difficulty during the processing of the second sentence
is typically taken as evidence that the disambiguation is incompatible with the (sole)
interpretation that had been entertained up to that point. However, there is another
way to look at such effects. When the disambiguation is encountered, the underspecified
representation needs to be enriched to allow only one reading and exclude all others. It is
conceivable that updating the representation may require more or less effort depending
on the ultimate interpretation that is required.
This situation poses a dilemma for researchers investigating the interpretation of
quantifier scope. If explicit disambiguation is provided we can only test how easily the
required reading is available - the results don't tell us what other reading(s) may have
been constructed. Without explicit disambiguation, however, reading time (or other) data
cannot be interpreted, since we do not know what reading(s) the participants had in mind.
Bott & Rad6 (2009) approached this problem by tracking the eye-movements of
participants while they read ambiguous sentences and then asking them to report the
interpretation they computed. Although their results are only partly relevant for the
underspecification debate, we will describe the experiment in some detail,since it provides
a good starting point for a more conclusive investigation. We will then sketch a
modification of the method that makes it possible to avoid some problems with the original study.
The scope-ambiguous sentences in Bott and Rado's study were instructions like those
in (7):
(7) a. Genau ein Tier auf jedem Bild sollst du nennen!
Exactly one animal on each picture should you name!
Name exactly one animal from each picture!
316
HI. Methods in semantic research
b. Genau ein Tier auf alien Bildern sollst du nennen!
Exactly one animal on all pictures should you name!
Name exactly one animal from all pictures!
Fig. 1S.2: Display following inverse linking constructions
The first quantifier (Ql) was always the indefinite genau ein "exactly one".The second
(Q2) was either distributive (jeder) or not (alle). In one set of control conditions Ql was
replaced by a definite NP (das Tier "the animal"). In another set of control conditions the
two possible interpretations of (7) (one animal that is present in all fields vs. a possibly
different animal from each field on a display) were expressed by scope-unambiguous
quantified sentences, as in (8).
(8) a. Name exactly one animal that is found on all pictures,
b. From each picture name exactly one animal.
In each experimental trial participants first read one of these instruction sentences and
their eye-movements were monitored. Then the instruction sentence disappeared and
a picture display as in Fig. 15.2, replaced it. Participants inspected this and had to
provide an answer within four seconds.The displays were constructed to be compatible with
both possible readings: the wide-scope universal reading where the participant should
select any one animal per field, but also the wide-scope existential reading where the
one element common to all fields must be named (e.g. the monkey in Fig. 15.2).To make
the quantifier exactly one felicitous, the critical displays always allowed two potential
answers for the wide-scope existential interpretation.
The scope-ambiguous instructions were so-called inverse linking constructions, in
which the two quantifiers are contained within one NP. It has been assumed (e.g. May &
15. The application of experimental methods in semantics 317
Bale 2006) that in inverse linking constructions the linearly second quantifier
preferentially takes scope over the first. The purpose of the study was to test this prediction and
to investigate to what extent the distributivity manipulation is able to modulate it. Based
on earlier results (Bott & Rado 2007) it was assumed that jeder would prefer wide scope,
which should further enhance the preference for the inverse reading. When alle occurred
as Q2, there should be a conflict between the preferences inherent to the construction
and those arising from the particular quantifiers.
The experimental setup made it possible to look at both the process of computing the
relative scope of the quantifiers (eye-movement behavior while reading the instructions)
and at the final interpretation (the answer participants gave) without providing any
disambiguation. Thus the answers could be taken to reflect the scope preferences at the end
of the sentence, whereas processing difficulty during reading would serve as an indication
that scope preferences are computed at a point where no decision is yet required.
The off-line answers showed the expected effects. There was an overall preference
for the inverse scope reading, which was significantly stronger with jeder than with alle.
Crucially, the reading time data showed clear evidence of a conflict between the scope
factors: there was a significant slow-down at the second quantifier in (7b). The effect
was present already in first-pass reading times, suggesting that scope preferences were
computed immediately. Bott and Rado interpret these results as strong indication that
readers regularly disambiguate sentences during normal reading.
However, this conclusion may be too strong. In Bott and Rad6's experiment
participants had to choose a particular interpretation in order to carry out the instructions (i.e.
name an animal). Although they did not have to settle on that interpretation while they
were reading the instruction, they had to make a decision as to the preferred reading
immediately after the end of the sentence. This may have caused them to disambiguate
constructions that are typically left ambiguous during normal interpretation.
Moreover, the instructions used in the experiment were highly predictable in
structure: they always contained a complex NP with two quantifiers (experimental items), a
definite NP1 followed by a quantified NP2 (fillers A), or else an unambiguous sentence
with two quantifiers. Although the content of NP1 (animal, vehicle, flag) and
distributivity of Q2 was varied, the rest of the instruction was the same: sollst du nennen 'you
should name'.This pattern was easy to recognize and may have resulted in a strategy of
starting to compute the scope preferences as soon as the second NP had been received.
To rule out this explanation Bott and Rado compared responses provided in the first and
the last third of each experimental session and failed to find any indication of strategic
behavior. Still the possibility remains that consistent early disambiguation in the
experiment resulted from the task of having to choose a reading quickly in order to provide
an answer. The ultimate test of underspecification would have to avoid such pressure to
disambiguate fast.
We are currently conducting a modification of Bott and Rad6's experiment that may
not only avoid this pressure but actually encourage participants to delay disambiguation.
In this experiment participants have to judge the accuracy of sentences like those in (9):
(9) a. Genau eine geometrische Form auf alien Bildern ist rechteckig.
Exactly one geometrical shape on all pictures is rectangular.
Exactly one geometrical shape on all pictures is rectangular.
318
HI. Methods in semantic research
b. Genau eine geometrische Form auf jedem Bild ist rechteckig.
Exactly one geometrical shape on each picture is rectangular.
Exactly one geometrical shape on each picture is rectangular.
A) wide scope existential disambiguation B) wide scope universal disambiguation
Fig. 1S.3: Disambiguating displays in the proposed experiment
The experiment procedure is as before. The sentences will be paired with unambiguous
displays supporting either the wide-scope universal or the wide-scope existential reading
(Fig. 15.3.). In (9) full processing of the semantic content is not possible until the critical
information (rechteckig) has been received. Since the display following the sentence is
only compatible with one reading which the participant cannot anticipate, they are better
off waiting to see which interpretation will be required for the answer. If
underspecification is indeed the preferred strategy, there should be no difference in reading times
across the different conditions, nor should there be any difficulty in judging any kind
of sentence-display pair. Assuming immediate full specification of scope, however, we
would expect the same pattern of results as in Bott and Radd's study: slower reading times
in (9a) than in (9b) at the second quantifier, as well as slower responses to displays
requiring the wide-scope existential interpretation, the latter presumably modulated by
distributivity of Q2.
This experiment should be able to distinguish intermediate positions between the
two extremes of complete underspecification and immediate full interpretation. It is
conceivable, for instance, that scope interpretation is only initiated when the perceiver
can be reasonably sure that they have received all (or at least sufficient) information.
This would correspond to the same reading time effects (and same answering behavior)
as predicted under immediate full interpretation, but the effects would be somewhat
delayed. Another possibility is an initial underspecification of scope, but the
construction of a fully specified interpretation at the boundary of some interpretation domain
such as the clause boundary. That would predict a complete lack of reading time effects
but answer times showing the same incompatibility effects as under versions of the full
interpretation approach.
It is worth emphasizing how this design differs from existing studies. First, it looks
at the ambiguous region and not just the disambiguation point. Second, it differs from
15. The application of experimental methods in semantics 3_19
Filik, Paterson & Liversedge (2004), who also measured reading times in the ambiguous
region, but who used the kind of disambiguation that we criticized in section 3.
6. Conclusions
In this article we have attempted to show that experimentally obtained data can, in
spite of certain complicating and confounding factors, be of relevance to semantic
theory and provide both support for and in some cases falsification of its assumptions
and constructs. In section 2 we noted that the field of theoretical semantics has made
less use of experimental verification of its analyses and assumptions. We have seen that
there are some quite good reasons for this and laid out what some of the problematic
factors are. While some of these are shared to a greater or lesser degree with other
branches of linguistics, some of them are peculiar to semantics or are especially severe
in this case.
The main part of our paper reports a research program addressing the issue of relative
scope in doubly quantified sentences. We present this work as an example of the ways
in which experimental approaches can contribute to the development of theory. They
also illustrate some of the practical constraints upon such studies. For example, we have
seen that clear disambiguation is not always easy to achieve, in particular, it is difficult to
achieve without biasing the interpretational choices of the experiment participant. The
use of eye-tracking and fully ambiguous picture displays is a real advance on previous
practice (Bott & Rad6 2009).
Section 3 shows how experimental procedures which are simple enough for non-
specialist experimenters can nevertheless yield evidence of value for the development
of semantic theories: a carefully constructed and counter-balanced design can produce
data of sufficient quality to answer outstanding questions with some degree of finality. In
this particular case the configurational account of scope can be seen as failing to account
for data that the multi-factor account succeeds in capturing.The unsupported account is
demonstrated to need adaptation or development. Experimentation can make the field
of theory more dynamic and adaptive; an account which repeatedly fails to capture
evidence gathered in controlled studies and which cannot economically be extended to do
so will eventually need to be reconsidered.
In section 5 we describe an experiment designed to provide evidence which
distinguishes between two accounts (section 4) of the way that perceivers deal with ambiguity
in the input signal: underspecification vs. full interpretation. This is an example of how
processing data can under certain circumstances provide decisive evidence which
distinguishes between theoretical accounts. It is of course often the case that theory does not
make any direct predictions about psycholinguistically testable measures of processing.
The collaboration of psycholinguists and scmanticists may yet reveal testable predictious
more often than has sometimes been assumed.
We therefore argue for experimental linguists and semanticists to cooperate more
and take more notice of each other's work for their mutual benefit. Semanticists will
gain additional ways to falsify theoretical analyses or aspects of them, which can deliver
a boost to theory development. This will be possible, because experimenters can tailor
experimental methods, tasks, and designs to their specific requirements.
Experimenters for their part will benefit by having the questioning eye of the seman-
ticist look over their experimental materials, which will surely avoid many experiments
320
III. Methods in semantic research
being carried out whose materials fail to uniquely fulfill the requirements of the design.
An example of this is the mode of disambiguation which we discussed in section 3.
Further to this, experimenters will doubtless be able to derive more testable predictions
from semantic theories, if they discuss the finer workings of these with specialist seman-
ticists.We might mention here the example of semantic underspecification: can we find
evidence for its psychological reality? Further questions might be: if some feature of
an expression remains underdetermined by the input, how long can the representation
remain underspecified? Is it possible for a final representation of a discourse to have
unspecified features and nevertheless be fully meaningful?
We conclude, therefore, that controlled experimentation can provide a further source
of evidence for semantics. This data can under certain circumstances give a more detailed
picture of the states of affairs which theories aim to account for.This additional evidence
could be the catalyst for some advances in semantic theory and explanation, in the same
way that it has in syntactic theory.
7. References
Anderson. Catherine 2004. The Structure and Real-time Comprehension of Quantifier Scope
Ambiguity. Ph.D. dissertation. Northwestern University, Evanstone, IL.
Aoun, Joseph & Yen-hui Audrey Li 1989. Scope and constituency. Linguistic Inquiry 16,623-637.
Bcghclli, Filippo &Tim Stowcll 1997. Distributivity and negation:The syntax of each and every. In:
A. Szabolcsi (ed.). Ways of Scope Taking. Dordrecht: Kluwer, 71-107.
Bott, Oliver & Janina Rad6 2007. Quantifying quantifier scope. In: S. Featherston & W. Sternefeld
(eds.). Roots. Linguistics in Search of its Evidential Base. Berlin: Mouton dc Gruyter, 53-74.
Bott, Oliver & Janina Rad6 2009. How to provide exactly one interpretation for every sentence, or
what eye movements reveal about quantifier scope. In: S. Winkler & S. Featherston (eds.). The
Fruits of Empirical Linguistics, Volume I: Process. Berlin: de Gruyter, 25-46.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge. MA:Thc MIT Press
Filik, Ruth, Kevin B. Paterson & Simon P. Liversedge 2004. Processing doubly quantified sentences:
Evidence from eye movements. Psychonomic Bulletin & Review 11,953-959.
Frazier, Lyn & Keith Rayner 1990.Taking on semantic commitments: Processing multiple meanings
vs. multiple senses. Journal of Memory and Language 29,181-200.
Gillen, Kathryn 1991. The Comprehension of Doubly Quantified Sentences. Ph.D. dissertation.
Durham University.
van Gompel, Roger P.G. & Martin J. Pickering 2007. Syntactic parsing. In: G. Gaskell (ed.). The
Oxford Handbook of Psycholinguistics. Oxford: Oxford University Press 455-504.
Hahnc, Anja & Angela D. Fricdcrici 2002. Differential task effects on semantic and syntactic
processes as revealed by ERPs. Cognitive Brain Research 13,339-356.
Higginbotham, James 1985. On semantics. Linguistic Inquiry 16,547-594.
Hobbs, Jerry & Stuart M. Shieber 1987. An algorithm for generating quantifier scopings.
Computational Linguistics 13,47-63.
Hornstein, Norbert 1984. Logic as Grammar. Cambridge, MA:The MIT Press.
Hornstein, Norbert 1995. Logical Form: From GB to Minimalism. Oxford: Blackwell.
Huang. Cheng-Teh James 1982. Logical Relations in Chinese and the Theory of Grammar. Ph.D.
dissertation. MIT, Cambridge, MA.
Ioup, Georgette 1975. The Treatment of Quantifier Scope in Transformational Grammar. Ph.D.
dissertation. The University of New York, New York.
Kuno,Susumu 1991. Remarks on quantifier scope. In: H. Nakajima (ed.). Current English Linguistics
in Japan. Berlin: Mouton de Gruyter, 261-287.
15. The application of experimental methods in semantics 321
Kurtzman. Howard S. & Maryellen C. MacDonald 1993. Resolution of quantifier scope ambiguities.
Cognition 48,243-279.
Larson, Meredith. Ryan Doran, Yaron McNabb, Rachel Baker, Matthew Berends, Alex Djalali &
Gregory Ward 2010. Distinguishing the said from the implicated using a novel experimental
paradigm. In: U. Sauerland & K. Yatsushiro (eds.). Semantics and Pragmatics. From Experiment
to Theory. Houndmills: Palgrave Macmillan.
May, Robert 1977. The Grammar of Quantification. Ph.D. dissertation. MIT, Cambridge, MA.
Reprinted: Bloomington, IN: Indiana University Linguistics Club, 1982.
May, Robert 1985. Logical Form: Its Structure and Derivation. Cambridge, MArTlie MIT Press
May, Robert & Alan Bale 2006. Inverse linking. In: M. Everaert & H. van Riemsdijk (eds.). Black-
well Companion to Syntax. Oxford: Blackwell, 639-667.
Pafel, Jurgen 2005. Quantifier Scope in German. (Linguistics Today 84). Amsterdam: Benjamins.
Park, Jong C. 1995. Quantifier scope and constituency. In: H. Uszkoreit (ed.). Proceedings of the
33rd Annual Meeting of the Association of Computational Linguistics. Boston, MA: Morgan
Kaufmann, 205-212.
Phillips, Colin & Matthew Wagers 2007. Relating structure and time in linguistics and
psycholinguist ics. In: M.G. Gaskell (ed.)- The Oxford Handbook of Psycholinguistics. Oxford: Oxford
University Press, 739-756.
Pickering, Martin J., Brian McElree, Steven Frisson, Lilian Chen & Matthew J.Traxler 2006. Under-
specification and aspectual coercion. Discourse Processes 42,131-155.
Pinango, Maria, Edgar Zurif & Ray Jackendoff 1999. Real-time processing implications of enriched
composition at the syntax-semantics interface. Journal of Psycholinguistic Research 28,395-414.
Pinkal, Manfred 1999. On semantic underspecification. In: H. Bunt & R. Muskens (eds.).
Computing Meaning, Dordrecht: Kluwer, 33-55.
Pylkkanen, Liina & Brian McElree 2006. The syntax-semantics interface: On-line composition of
meaning. In: M. A. Gcrnsbachcr & M.Traxlcr (eds.). Handbook of Psycholinguistics. 2nd cdn.
New York: Elsevier, 537-577.
Reinhart. Tanya 1976. The Syntactic Domain of Anaphora. Ph.D. dissertation. MIT, Cambridge, MA.
Reinhart,Tanya 1978. Syntactic domains for semantic rules. In: F. Guenthner & S.J. Schmidt (eds.).
Formal Semantics and Pragmatics for Natural Language. Dordrecht: Rcidcl, 107-130.
Reinhart,Tanya 1983. Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya 1995. Interface Strategies (OTS Working Papers in Linguistics). Utrecht: Utrecht
University.
Ross, John R. 1970. On declarative sentences. In: R. Jacobs & P. Rosenbaum (eds). Readings in
English Transformational Grammar. Waltham, MA: Ginn, 222-272.
Sanford, Anthony J. & Patrick Sturt 2002. Depth of processing in language comprehension: Not
noticing the difference. Trends in Cognitive Neuroscience 6,382-386.
Todorova, Marina, Kathy Straub, William Badecker & Robert Frank 2000. Aspectual coercion and
the online computation of sentential aspect. In: L. R. Gleitman & A. K. Joshi (eds.).
Proceedings of the 22nd Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence
Erlbaum Associates, 3-8.
Tunstall, Susanne L. 1998. The Interpretation of Quantifiers: Semantics and Processing. Ph.D.
dissertation. University of Massachussetts, Amherst, MA.
VanLehn, Kurt A. 1978. Determining the Scope of English Quantifiers. Technical Report (AI-TR
483). Cambridge, MA, Artificial Intelligence Laboratory, MIT.
Weskott,Thomas & Gisbcrt Fansclow 2009. Scaling issues in the measurement of linguistic
acceptability. In: S. Winkler & S. Fcathcrston (eds.). The Fruits of Empirical Linguistics, Volume I:
Process. Berlin: de Gruyter, 229-246.
Zhou, Peng & Liqun Gao 2009. Scope processing in Chinese. Journal of Psycholinguistic Research
38,11-24.
Oliver Bott, Sam Featherston, Janina Rado, and Britta Stolterfoht, Tubingen (Germany)
IV. Lexical semantics
16. Semantic features and primes
1. Introduction
2. Background assumptions
3. The form of semantic features
4. Semantic aspects of morpho-syntactic features
5. Interpretation of semantic features
6. Kinds of elements
7. Primes and universals
8. References
Abstract
Semantic features or primes, like phonetic and morpho-syntactic features, are usually
considered as basic elements from which the structure of linguistic expressions is built
up. Only a small set of semantic features seems to be uncontroversial, however. In this
article, semantic primes are considered as basic elements of Semantic Form, the interface
level between linguistic expressions and the full range of mental structures representing the
content to be expressed. Semantic primes are not just features, but elements of a functor-
argument-structure, on which the internal organization of lexical items and their
combinatorial properties, including their Thematic Roles, is based. Three types of semantic primes
are distinguished: Systematic elements, that are related to morpho-syntactic conditions like
tense or causativity; idiosyncratic features, not corresponding to grammatical distinctions,
but likely to manifest primitive conceptual conditions like color or taste; and a large range
of elements called dossiers, which correspond to hybrid mental configurations, integrating
varying modalities, but providing unified conceptual entities. (Idiosyncratic features and
dossiers together account for what is sometimes called distinguishers or completers.) A
restricted subsystem of semantic primes can reasonably be assumed to be directly fixed by
Universal Grammar, while the majority of semantic primes is presumably due to general
principles of mental organization and triggered by experience.
1. Introduction
The concept of semantic features, although frequently used in pertinent discussion, is
actually in need of clarification with respect to both of its components. The term feature,
referring in ordinary discourse to a prominent or distinctive aspect, quality or
characteristic of something, became a technical term in the structural linguistics of the 1930s,
primarily in phonology, where it identified the linguistically relevant properties as opposed
to other aspects of sound shape.The systematic extension of the concept from phonology
to other components of linguistic structure led to a wide variety of terms with similar
but not generally identical interpretation, including component, category, atom, feature
value, attribute, primitive element and a number of others. The qualification semantic,
which competes with a smaller number of alternatives, conceptual being one of them, is
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 322-357
16. Semantic features and primes 323
not less in need of elucidation though, depending among others on the delimitation of
the relevant domains, including syntactic, semantic, conceptual, discourse and pragmatic
structure, and the intricate distinction between linguistic and encyclopedic knowledge.
In what follows, I will use the term semantic feature in the sense of basic component of
linguistic meaning, adding amendments and changes, where necessary.
Different approaches and theoretical frameworks of linguistic analysis recognized the
need for an overall concept of basic elements in terms of which linguistic phenomena
are to be analyzed. Hjelmslev (1938) for instance based his theory of Glossematics on
the assumption that linguistic expressions are made up of ultimate, irreducible
invariants called glossemes. Following the Saussurean view of content and expression as the
interdependent planes of linguistic structure, he furthermore distinguished kenemes and
pleremes as the glossemes of the expression and the content plane, respectively. A recent
and rather different case in point is the Minimalist Program discussed in Chomsky (1995,
2000), where Universal Grammar is assumed to make available "a set Fof features
(linguistic properties) and operations C,IL (the computational procedure for human
language) that access F to generate expressions", which are pairs (PF, LF) of Phonetic Form
and Logical Form, determining the sound shape and the meaning of linguistic expressions.
The general notion of features as primitive components constituting the structure
of language must not obscure the fundamental differences between basic elements of
the phonetic, morphological, syntactic and semantic aspect of linguistic expressions. On
the phonetic side, the nature of distinctive features as properties of segments and
perhaps syllables is fairly clear in principle and subject to dispute only with respect to
interesting detail. The nature and role of primitive elements on the semantic side however
is subject to problems that are unsolved in crucial respects. A number of questions
immediately arise:
(1) a. What is the formal character of basic semantic elements?
b. Can linguistic meaning be exhaustively reduced to semantic features?
c. Is there a fixed set of semantic features?
d. What is the origin and interpretation of semantic features?
Any attempt to deal with these questions must obviously rely on general assumptions
about the framework which at least makes it possible to formulate the problems.
2. Background assumptions
An indispensable assumption of all approaches concerns the fact that a natural language
L provides a sound-meaning correspondence, relating an unlimited range of signals to an
equally unlimited range of objects, situations, and conditions the expressions arc about.
What linguistics is concerned with, however, is neither the set of signals nor the range
of things they are about, but rather the invariant patterns on which the production and
recognition of signals is based and the distinctions, in terms of which things and
situations are experienced, organized or imagined and related to the linguistic expressions.
Schematically, these general assumptions can be represented as in (2), where PF and SF
(for Phonetic Form and Semantic Form, respectively) indicate the structure of the sound
shape and meaning, with A-P (for Articulation and Perception) and C-I (for
Conceptualization and Intention) abbreviating the complex mental systems by which linguistic
324
IV. Lexical semantics
expressions are realized and related to their intended interpretations. Although I will
comply with general terminology as far as possible, tension cannot always be avoided.
Thus, the Semantic Form SF corresponds in many respects to the Logical Form LF
of Chomsky (1986 and subsequent work), to the Conceptual Structure CS of Jackendoff
(1984 and later work), and to the Discourse Representation Structure DRS of Kamp
and Reyle (1993), to mention just a few comparable approaches, (cf. also article 30
(Jackendoff) Conceptual Semantics, article 31 (Lang & Maienborn) Two-level Semantics
and article 37 (Kamp & Reyle) Discourse Representation Theory).
(2) Signal <?=======*> PF«-
A-P .
This schema is a simplification in several respects, but it enables us to fix a number of
relevant points. First, PF and SF represent the conditions that L imposes on or extracts from
extra-linguistic systems. Hence PF and SF are what is often called interfaces, by which
language interacts with other mental systems, abridged here as A-P and C-I. Semantic
and phonetic features can now be identified as the primitive elements of SF and PF,
respectively. Hence their formal and substantive nature is determined by the character
and role of these interfaces.
Second, the function of language, to systematically assign meaning to signals, is
accomplished by the component indicated here as Morpho-Syntax, which establishes the
connection between PF and SF. For obvious reasons, the rules and principles of Morpho-
Syntax are the primary concern of all pertinent theories, which in spite of differences
that will not be dealt with here agree on the observation that the relation between PF
and SF depends on the lexical system LS of L and is organized by rules and principles
which make up the Grammar G of L. The content of Morpho-Syntax is relevant in the
present context to the extent to which it relies on morphological and syntactic features
that relate to features of SF.
Third, according to (2), PF and SF and their primes are abstract in the sense that they
do not reflect properties of the signal or the external world directly, but represent the
units and configurations in terms of which the mental mechanisms of P-A and C-I
perceive or organize external phenomena. Features of PF should according to Halle (1983)
most plausibly be construed as instructions for vocal gestures in terms of which speech
sounds are articulated and perceived. In a similar vein, features of SF must be construed
not as elements of the external reality, but as conditions according to which elements,
properties, and situations are experienced or construed. Hence semantic features as
components of SF require an essentially mentalistic approach to meaning. This is clearly at
variance with various theories of meaning, among them in particular versions of formal
or model theoretic semantics like, e.g., Lewis (1972) or Montague (1974), which consider
semantics as the relation of linguistic expressions to strictly non-mental,external entities
and conditions. It should be emphasized, though, that the mentalistic view underlying
(2) docs not deny external objects and their properties, corresponding to structures in
SF under appropriate conditions. The crucial point is that it is the mental organization
of C-I which provides the correspondence to the external reality - if the correspondence
External
Morpho-Syntax ► SF <?========> and Internal
^ , C-I Environment
L
Mind/Brain
16. Semantic features and primes 325
actually obtains. Analogous considerations apply, by the way, to PF and its relation to the
external signal. For further discussion of these matters and the present position see, e.g.,
Jackendoff (2002, chapter 10).
It must finally be noted that the symmetrical status of PF and SF suggested in (2) is
in need of modification in order to account for the fundamental differences between
the two interfaces. While the mechanisms of A-P, which PF has access to, constitute a
complex but highly specialized mental system, the range of capacities abbreviated as C-I
is not restricted in any comparable way. As a matter of fact, SF has access to the whole
range of perceptual modalities, spatial orientation, motor control, conceptual
organization, and social interaction, in short: to all aspects of experience. This is not merely a
quantitative difference in diversity, size, and complexity of the domains covered, it raises
problems for the very status of SF as interface, as we will see. In any case, essential
differences concerning the nature of basic elements and their combination in PF and SF have
to be recognized.
Two remarks must be added about the items of the lexical system LS. First, whether
elements like killed, left, rose, or went are registered in LS as possibly complex but
nevertheless fixed lexical items or are derived by morphological rules of G from the
underlying elements kill, leave, rise, go plus abstract morphemes for Person, Number, and Tense
is a matter of dispute with different answers in different theories. In any case, regular
components like kill and -ed as well as idiosyncratic conditions holding for left, rose,
and went need to be captured, and it must be acknowledged that the items in question
consist of component parts. This is related to, but different from, the question whether
basic lexical items like kill, leave, or go are themselves complex structures. As to PF, the
analysis of words into segments and features is obvious, but for SF their decomposition
is a matter of debate, to which we will return in more detail, claiming that lexical items
are semantically complex in ways that clearly bear on the nature of semantic features.
The second remark concerns the question whether and to what extent rules and
principles of G determine the internal structure of lexical items and of the complex
expressions made up of them. Elements of LS are stored in long-term memory and thus
plausibly assumed to be subject to conditions that differ from those of complex
expressions. Surprisingly, though, the character of features and their combination at least seems
to be the same within and across lexical items, as we will see.
As to notation, I will adopt the following standard conventions: Features of PF will be
enclosed in square brackets like [ + nasal ], [ - voiced ], etc.; morphological and syntactic
features are enclosed in square brackets and marked by initial capitals like [ + Past ],
[ - Plural ], [ + Nominal ], etc.; and semantic features are given in small capitals, like
[HUMAN ], [ MALE ], [ CAUSE ].
3. The form of semantic features
3.1. Features in phonology and morpho-syntax
It is useful to first consider the primitive elements of the phonetic side, where the
conditions determining their formal character arc fairly clear. PF is based on sequentially
organized segments, which represent abstract time slots corresponding to the temporal
structure of the phonetic signal. Now features are properties of segments, i.e. one-place
predicates specifying articulatory (and perceptual) conditions on temporal units. The
326
IV. Lexical semantics
predicates either do or do not apply to a given segment, whence basic elements of PF are
mostly conceived as binary features. Thus, a feature combination like [+ nasal, + labial,
+ voiced] would specify a segment by means of the simultaneous articulatory conditions
indicated as nasal, labial, and voiced. Systematic consequences and extensions of these
conditions need to be added, such as the difference between presence and absence of a
property, marked by the plus- and minus-value of the features leading to the markedness-
asymmetry in Chomsky & Halle (1968). Further extensions include relations between
features in the feature geometry of Clements (1985), the predictability of unspecified
features, the supra-segmental properties of stress and intonation in Halle & Vergnaud
(1987), and the arrangement of segmental properties along different tiers, leading to
three-dimensional representations, to mention the more obvious extensions of PF.
These extensions, however, do not affect the basic conditions on phonetic features,
which can be summarized as follows:
(3) Phonetic features are
a. binary one-place predicates, represented as [ ± F ];
b. simultaneous conditions on sequentially connected segments (or syllables);
c. interpreted as instructions on the mechanisms in A-R
Some of the considerations supporting distinctive features in phonology have also
been applied to syntax and morphology. Thus in Jakobson (1936), Hjelmslev (1938), or
Bierwisch (1967) categories like Case, Number, Gender, or Person are characterized in
terms of binary features. Similarly, Chomsky (1970) reduced categories like Noun, Verb,
Adjective, and their projections Noun Phrase etc. to binary syntactic features. While
features of this kind are one-place predicates and thus in line with condition (3a), they
clearly don't meet condition (3b): morphological and syntactic features are not
conditions on segments of PF connected by sequential ordering, but on lexical and syntactic
units, which are related by syntactic conditions like dominance or constituency according
to the morpho-syntactic rules of G. The most controversial aspect of morpho-syntactic
features is their interpretation analogously to (3c). As these features do not belong to
PF or SF directly, but are essentially elements of the mediation between PF and SF, they
can only indirectly participate in the interface structures and their interpretation. But
they do affect PF as well as SF, although in rather different ways. As to PF, morphological
features determine the choice of segments or features by means of inflectional rules like
(4), where / - d / abbreviates the features identifying the past tense suffix d in cases like
climbed, rolled, etc.
(4) [ + Past]-W-d/
Actual systems of rules are more complex, taking into account intricate dependencies
between different categories, features, and syntactic conditions. For systematic accounts
of these matters see e.g., Bierwisch (1967), Halle & Marantz (1993), or Wunderlich
(1997a). In any case, morphological or syntactic features are not part of PF, but can only
influence its content via particular grammatical rules.
As to SF, it is often assumed that features of morphological categories like tense or
number are subject to direct conceptual interpretation, alongside with other semantic
elements, and are therefore considered as a particular type of semantic features. This
16. Semantic features and primes 327
was the position of Jakobson (1936), Hjelmslev (1938), and related work of early
structuralism, where in fact no principled distinction between morphological and semantic
features was made, supporting the notion of semantic primes as binary features. This
view cannot be generally upheld for various reasons, however. To be sure, a feature like
[+ Past ] has a more stable and motivated relation to the temporal condition attached to
it than to its various phonetic realizations e.g., in rolled, left, went, or was, still there is no
reasonable way to treat morphological or syntactic features as elements of SF, as will be
discussed in section 4.
3.2. Features and types
Turning to the actual elements of SF, we notice that they differ from the features of PF in
crucial respects.To begin with,condition (3a), according to which features are binary one-
place predicates, seems to be satisfied by apparently well-established ordinary predicates
like [ male ], [ alive ] or [ open ]. There are, however, equally well-established elements
like [ parent-of ], [ part-of ], [ perceive ], [ before ], or [ cause ], representing relations
or functions of a different type, which clearly violate this condition. More generally, SF
must be based on an overall system of types integrating the different kinds of
properties, relations, and operators. This leads directly to condition (3b), concerning the overall
organization of PF in terms of sequentially ordered segments. It should be obvious that
SF cannot rely on this sort of organization: it does neither consist of segments, nor does
it exhibit linear ordering. Whatever components one might identify in the meaning of
e.g., nobody arrived in time, or he needs to know it, or any other expression, there is no
temporal or other sequential ordering among them, neither for the meaning of words nor
their components. Although semantic processing in comprehension and production does
of course have temporal characteristics, they do not belong to the resulting structure of
meaning,in contrast to the linearstructure related to phonetic processing.See Bierwisch &
Schreuder (1992) for further discussion of this point.
Hence SF is organized according to other conditions, which naturally follow from
the type system just mentioned: Elements of SF, including in particular its primitive
elements, are connected by the functor-argument relation based on the type-structure of
the elements being combined. Notice that this is the minimal assumption to be made,
just like the linear organization of PF, with no arbitrary stipulations. We now have the
following conditions on semantic features, corresponding to the conditions (3a) and (3b)
on the phonetic side:
(5) Semantic elements, including irreducible features, are
a. members of types determining their combinatorial conditions;
b. participants of type-based hierarchical functor-argument structures.
These are just two interdependent aspects of the general organization of linguistic
meaning, which is, by the way, the counterpart of linear concatenation of segments in
PF. There are various notational proposals to make these conditions explicit, usually
extending or adapting basic conventions of predicate logic. The following considerations
seem to be indispensable in order to meet the requirements in (5). First, each element of
SF must belong to some specific type; second, if the element is a functor, its type
determines two things: (a) the type of the elements it combines with, and (b) the type the
328
IV. Lexical semantics
resulting combination belongs to. Thus functor-types must be of the general form (ot,p),
where a is the type of the required argument, and p that of the resulting complex. This
kind of type-structure was introduced in Ajdukicwicz (1935) in a related but different
context, and has since been adapted in various notational ways by Lewis (1972), Cresswell
(1973), Montague (1974), to mention just a few; cf. also article 85 (de Hoop) Type shifting.
Following standard assumptions, at least two basic types are to be recognized: e (for
entity, i.e. objects in the widest sense) and t (for truth-value bearer, roughly propositions
or situations). From these basic types, functor types like (e,t) and (e,(e,t)) for one- and
two-place predicates, (t,t) and (t,(t,t)) for one- and two-place propositional functions, etc.
are built up.To summarize:
(6) Elements of SF belong to types, where
a. e and t are basic types
b. if a and p are types, then (a,p) is a type.
A crucial point to be added is the possibility of empty slots, i.e. positions to be filled in
by way of the syntactic combination of lexical items with the complements they admit or
require. Empty slots are plausibly represented as variables, which are assigned to types
like elements of SF in general. The kind of resulting structure is illustrated in (6) by an
example adapted from Jackendoff (1990: 46), representing the SF of the verb enter in
three notational variants, indicating the relevant hierarchy of types by a labeled tree (7a),
and the corresponding labeled bracketing in (7b) and (7c):
(7) a. x GO TO IN y
II III
e ^.(e.1)) (e>e) (e>e) e
b- UcXlUlU^GollJ^TollJ^iNl^y]]]]
c- [, [<c.,> [[<e,e,» go] [e [<e.e> to] [J<ee> ,n] [e y]]] [e x ]]
While (7b) simply projects the tree branching of (7a) into the equivalent labeled
bracketing, (7c) follows the so-called Polish notation, where functors systematically precede
their arguments. This becomes more obvious if the type-indices are dropped, as in:
[go [to [ in y]]x]. Notice, however, that in all three cases the linear ordering is
theoretically irrelevant. What counts is only the functor-argument-relation. For the sake of
illustration,GO,TO, and in are assumed to be primitive constants of SF with more or less
obvious interpretation: in is a function that picks up the interior region of its argument,
to turns its argument into the end of a path, and go specifies the motion of its "higher"
or external argument along the path indicated by its "lower" or internal argument.
16. Semantic features and primes 329
Jackendoff's actual treatment of this example is given in (8a), with the equivalent
tree representation (8b). In spite of a gross similarity between (7) and (8), a different
conception of the combinatorial type system seems to be involved.
W 3- [Even,G°([pa,h king L' [pa,hTO dpiace ,N (king ]j)])])l
GO "TO IN J
III II
Event-Function Thing Path-Function Place-Function Thing
Putting minor notational details aside (such as Jackendoff's notation [ x([ y])] to indicate
that a component [y] is the argument of a functor x, or his use of indices i and j instead
of variables x and y to identify the relevant slots), the main difference between (7) and
(8) consists in the general attachment of substantive content to type nodes in (8), in
contrast to the minimal conditions assumed in (7), where types merely determine the
combinatorial aspect of their elements. Thus [ in [y ]] in (7) is of the same type e as the
variable y, while its counterpart [ in [j]] in (8) is assigned to the type Place, differing from
the type Thing of the variable [j]. It is not obvious whether the differences between (7)
and (8) have any empirical consequences, after stripping away notational redundancies
by which e.g., the functor Path-Function generates a complex of type Path, or the Place-
Function a Place without specifying substantial information. With respect to the form of
semantic features, however, the two systems can be taken as largely equivalent versions,
together with a fair number of further alternatives, including Miller & Johnson-Laird
(1976), Kamp & Reyle (1993), and to some extent Dowty (1979), to mention some well-
known proposals. Katz (1972) pursues similar goals by means of a somewhat clumsy and
mixed notational system.
3.3. SF and argument structure
One of the central consequences emerging from the systems exemplified in (7) and
(8) is the account of combinatorial properties of lexical items, especially the selection
restrictions they give rise to. Although a systematic survey of these matters would go
far beyond the concern for the nature of semantic features, at least the following points
must be made. First, as already mentioned, variables like x and y in (7) determine the
conditions syntactic complements have to meet semantically, as their SF has to be
compatible with the position of the corresponding variables. Second, the position the
variables occupy within the hierarchy of a lexical item's SF determines to a large extent the
syntactic conditions that the complements in question must meet. What is at issue here
is the semantic underpinning of what is usually called the argument structure of a lexical
item. As Jackendoff (1990:48) puts it,"argument structure can be thought of as an
abbreviation for the part of conceptual structure that is "visible" to syntax." There are various
ways in which parts of SF can be made accessible to syntax. Jackendoff (1990) relies
330
IV. Lexical semantics
on coindexing of semantic slots with syntactic subcategorization, Bierwisch (1996) and
Wunderlich (1997b) extend the type system underlying (8) by standard lambda
operators, as illustrated in (9) for open, where the features [+V(erbal), -N(ominal)] identify
the entry's morpho-syntactic categorization, and act, cause, become, and open make up
its semantic form:
(9) /open/;[+V,-N];
ty ^X [, [, [<e.,> ACTJ le XH [<t.t)l<t.<t.t» CAUSE1 It [<,.,)BECOME] I, l<e.t> OPEN 1 Ie y 1 1 1 1 1
Xy and Xx mark the argument positions for object and subject, respectively, defining in
this way the semantic selection restrictions the complements arc subject to. It is worth
noting, incidentally, that these operators in some way correspond to the functional
categories AgrQ and Agrs assumed in Chomsky (1995 and related work) to relate the verb to
its complements. In spite of relevant similarities, though, there are important differences
that can not be pursued here. Representations like (9) also provide a plausible account
for the way in which the transitive and intransitive variants of open and other "(un)erga-
tive" verbs like close, change, break etc. are related:The intransitive version is derived by
dropping the causative elements [act x ] and [ cause ] and the operator Xx, turning Xy
into the subject position:
(10) / open /; [+ V,-N] ; Xy [,[ ^become ] [t[<el> open ] [e y ] ] ]
These observations elucidate how syntactic properties of lexical items are routed in
configurations of their SF, determining thereby the semantic effect of the syntactic
head-complement-construction.
3.4. Extensions
The principles exemplified in (7) to (10) provide the skeleton of SF, determining the form
of semantic features, the structure of meaning of lexical items and complex expressions.
They do not account for all aspects of linguistically determined meaning, though, but
must be augmented with respect to phenomena like shift or continuation of reference,
topic-focus-articulation, presupposition, illocutionary force, and discourse relations. To
capture these facts, Jackendoff (1990, 2002) enriched semantic representations by
distinguishing a referential tier and information structure from the descriptive tier. Similar
ideas are pursued in Kamp & Reyle (1993), where a referential "universe of discourse"
is distinguished from proper semantic representation, which in Kamp (2001) is
furthermore split up into a descriptive and a presuppositional component. Distinctions of this
type do not affect the nature of basic semantic components, which they necessarily rely
on. Hence we need not go into the details of these extensions. We only mention that
components like become involve presuppositions, as observed in Bierwisch (2010),or that the
extensively discussed definiteness operator def depends on the universe of discourse.
Two further points must be added to this exposition, both of which show up in ordinary
nouns like bottle, characterized by the following highly provisional lexical entry:
(11) /bottle /;[-V,+N ] ;Xx [[physical x] [artifact x] [container x] [bottle x]]
16. Semantic features and primes 331_
The first point concerns the role of standard classificatory features like [physical] [object],
[artifact], [furniture], [container], etc, which are actually the root of the notion
feature, descending from the venerable tradition of analyzing concepts in terms of genus
proximum and differentia specifica. Although natural concepts of objects or any other
sort of entities obviously do not comply with orderly hierarchies of classification, the
logical relations and the (at least partial) taxonomies that classificatory features give rise
to clearly play a fundamental role in conceptual structure and lexical meaning. Entries
like (11) indicate how these conditions are reflected in SF: Features like [person],
[artifact], [physical] etc. are elements of type (e,t), i.e. properties that classify the argument
they apply to. Different conditions that jointly characterize an object x, e.g. as a bottle,
must according to the general structure of SF make up an integrated unit of a resulting
common type. Intuitively, the four conditions in (11), which are all propositions of type
t, are connected by logical conjunction, yielding a complex proposition, again of type t.
This can be made explicit by means of an (independently needed) propositional functor
& of type (t,(t,t)), which turns two propositions into simultaneous conditions of one
complex proposition. It might be noted that & is asymmetrical,in line with the generaldefini-
tion of functor types in (6b), according to which a functor takes exactly one argument,
such that two-place functors combine with their arguments in two hierarchical steps.This
is due to the lack of linear order in SF, by which the relation between elements becomes
structurally asymmetrical. In case of conjunction, which is generally assumed to be
symmetrical with regard to the conjuncts, this asymmetry might to some extent correspond
to the specificity of conditions, such that the SFof (11) would come out as (12):
(12) A.X [ [ PHYSICAL X ] [ & [ ARTIFACT X ] [ & [ CONTAINER X ] [ & [ BOTTLE X ] ] ] ] ]
Thus [ & [ container x ] [ & [ bottle x ] ] ] is the more specific condition added to the
condition [ artifact x ]. Whether this is in fact the appropriate way to reflect hierarchical
classification must be left open, especially as there are innumerable cases of joint
conditions that do not make up hierarchies but reflect properties of cross-classification or just
incidental combinations of conditions, differences the asymmetry of & cannot reflect.
It might be added that various proposals have been made to formally represent joint
conditions within lexical items as well as in standard types of adverbial or adnominal
modification like little, green bottle or walk slowly through the garden.
The second point to be added concerns elements like [ bottle ], whose status is
problematic and characteristic in various respects. On the one hand, standard dictionary
definitions of bottle like with a narrow neck, usually made of glass suggest that [ bottle ]
is a heuristic abbreviation to be replaced by a more systematic analysis, turning it into
a complex condition consisting of proper elementary features. On the other hand, any
further analysis might run into problems that cannot reasonably be captured in SF.Thus
even if the conditions indicated by narrow neck could be accounted for by appropriate
basic elements, it is unclear whether the features of neck, designating the body-part
connecting head and shoulders, can be used in bottle, without creating a vicious circle, which
requires bottle to identify the right sort of neck. What is important here is not a matter
of particular detail, but a kind of problem which shows up in innumerable places and in
fact marks the limits of semantic structure. The issue has been treated in different ways.
Laurence & Margolis (1999) appropriately called it "the problem of Completers",
332
IV. Lexical semantics
dealing with the residue of systematic analysis. It will be taken up below,cf. also article 32
(Hobbs) Word meaning and world knowledge.
Features of PF and SF were compared in §3.2. with respect to form and combination,
but not with respect to their interpretation, which for PF according to condition (3c)
consists in instructions on mechanisms in A-P. A straightforward counterpart for SF would
consider its basic elements as conditions on mechanisms of C-I.This analogy, apparently
suggested in schema (1), would be misleading for various reasons, though.The size of the
repertoire, the diversity of the involved components of C-I, and the status of SF as
interface all raise problems wildly differing from those of PF. The subsequent sections deal
with the conditions on interpretation of semantic features in more detail. Distinctions
between at least three kinds of basic elements will have to be made, not only because of
the ambivalent status of the completers just mentioned.
4. Semantic aspects of morpho-syntactic features
4.1. Interpretability of features and feature instances
The semantic aspect of morphological and syntactic categories is a matter of
continuous debate. As already mentioned, morphological features specifying categories like
Case, Number, Gender, Tense etc. were considered by Jakobson, Hjclmslcv, and others
as grammatical elements with essentially semantic content, independently of the PF-
realization assigned to them by rules like (4). We cannot deal here with the interesting
results of these approaches in any detail. It must be noted, however, that the insights into
conceptual aspects of morphological categories were never incorporated into
systematic and coherent semantic representations, their integration was left to common sense
understanding - mainly because appropriate frameworks of semantic representation
were not available.
To account for conceptual aspects of morpho-syntactic features, two important
distinctions must be recognized. First, two different roles of feature instances are to be
distinguished, which might be called "dominant" and "subdominant", for lack of better
terms. Dominant instances are essentially elements that categorize the expression they
belong to, subdominant instances are conditions on participants of various syntactic
relations including selection of complements, agreement, and concord. Thus the pronoun
her is categorized by the dominant features [+Feminine, -Plural, +Accusative], among
others, while prepositions like with or verbs like know specify their selection restriction
by subdominant instances of the feature [+Accusative]. Similarly, the proper name John
is categorized by the dominant features [+N, + 3.Person,-Plural ], while the (inflected)
verb knows has the subdominant features [+ 3.Person,-Plural]. The intricate details of
the relevant morpho-syntactic relations and the technicalities of their formal treatment
are notoriously complex and cannot be dealt with here. We merely note that different
feature instances are decisive in mediating the relation between PF and SF.The crucial
point is that subdominant feature instances have no semantic interpretation, neither in
selection restrictions nor as conditions in agreement or concord.Thus, in a case like these
boys knew her, the feature [+Plural] of boys, the feature [+Past] in knew, and the feature
[+ Feminine ] in her can participate in semantic interpretation, but the feature [+Plural ]
of these or the Number- and Person-features of the verb cannot.
16. Semantic features and primes 333
The second distinction to be made concerns the interpretability of dominant feature
instances. Three possibilities have to be recognized: First, features that have a constant
interpretation, with all dominant instances determining a fixed semantic effect, second,
features with a conditional interpretation, subject to different interpretations under
different circumstances, and third, features with no interpretation, whether dominant or
not. Clear cases of the first type are Tense and Person, whose interpretation is invariant.
Gender and Number in many languages are examples of the second type.Thus [+Plural]
has a clear conceptual effect in nouns like boys, nails, clouds, or women, which it doesn't
have in scissors, glasses or trousers, and [+ Feminine] is semantically vacuous in the case
of nouns designating ships. The paradigm case of the third type is structural Case. Lack
of conceptual content holds even for instances where a clear semantic effect seems to be
related to Case-distinctions, such as the contrast between German Dative and
Accusative in locative aufdem Dach (on the roof) vs. directional aufdas Dach (onto the roof),
because the contrast is actually due to the SF of the preposition, not the dominant Case
of its object, as shown in Bierwisch (1997). We will return to abstract Case shortly; see
also article 78 (Kiparsky & Tonhauser) Semantics of inflection.
This three-way-distinction is realized quite differently in different languages. A
particularly intriguing case in point is Gender in German, which has dominant instances in
the categorization of all nouns. Now, the feature [+Feminine] corresponds to the concept
female in cases like Frau (woman) Schwester (sister), but has no interpretation in the
majority of inanimate nouns like Zeit (time), Briicke (bridge), Wut (rage), many of which
inherit the feature [+Feminine] from derivational suffixes like -t, -ung, -heit, as in Fahr-t
(drive), Wirk-ung (effect), Dumm-heit (stupidity), etc. Moreover, some nouns like Weib
(woman) or Madchen (girl) are categorized as [-Feminine], irrespective of their SF. Even
more intricate are [+Feminine]-cases like Katze (cat), Ratte (rat), which designate either
the species (without specified sex) or the female animal. Further complications come
with nouns like Person (person) with the feature [+ Feminine], which arc animate and
may or may not be female.
Three conclusions follow from these observations: First, morpho-syntactic features
must be distinguished from their possible semantic impact, because these features
sometimes do, sometimes do not have a conceptual interpretation. Second, the relation
between morpho-syntactic features and elements of SF, which mediate the conceptual
content, must be determined by conditions comparable to those sketched in (4) relating
morphological features to PF, even if the source and the content of the conditions
is remarkably different. Third, the relation of morphological features to SF must be
conditional, applying to dominant instances depending on sometimes rather special
morphological or semantic circumstances.
4.2. The interpretation of morphological features
Morpho-syntactic features are binary conditions of the computational system that
accounts for the combinatorial matching of PF and SF.The semantic value corresponding
to these features consists of elements of the functor-argument-structure of SF. Hence the
conditions that capture this correspondence must correlate elements and configurations
of two systems, both of which are subject to their respective principles. (13) illustrates
the point for the feature [+Past] occurring, e.g., in the door opened indicating that an
334
IV. Lexical semantics
utterance u of this sentence denotes an event s the time T of which precedes the time
T of u, where T is a functor of type (e,e) and before a relation of type (e,(e,t)):
(13) [ +Past ]h[Ts][ before [ T u ] ]
This is a simplification which does not reflect the complex details discussed in the huge
literature about temporal relations, but (13) should indicate in principle how the feature
[+Past],syntactically categorizing the verb (or whatever one's syntactic theory requires),
can contribute to meaning. More specifically, most semantic analyses agree that tense
information applies to events instantiating in one way or another the propositional
condition specified by the verb and its complements. With this proviso, we get (14) as the
SF of the door opened, if we expand (10) by the event-instantiation expressed by the
operator inst of type (t,(e,t)).
(14) [ [ T S [ BEFORE [ T U ] ] ] & [ S [ INST [ BECOME [ OPEN [ DEF X [DOOR X ]]]]]] ]
As [+Past] has a constant interpretation in English, (13) needs no restricting condition:
Instances of [+Past] are always subject to (13). It might be considered as a kind of lexical
entry for [+Past] which applies to all occurrences of the feature, whether its morpho-
phonological realization is regular or idiosyncratic. There are further non-trivial
questions related to conditions like (13). One concerns the appropriate choice of the event
argument s, which obviously depends on the "scope" of [+Past], i.e. the constituent it
categorizes. Another one is the question what (13) means for the interpretation of [-Past],
i.e. morphological present tense. One option is to take [-Past] to determine the negation
not[Ts [ beforeTu] ].A more general strategy would consider [-Past] as the unmarked
case, which is semantically unspecified and simply doesn't determine a temporal position
of the event s.These issues concerning the conceptual effect of morphological categories
are highly controversial and need further clarification.
Conditional interpretation of morphological features is exemplified by [+Plural] in
(15), where [ collection ] of type (e,t) must be construed as imposing the condition that
its argument consists of elements of equal kind, rather than being a set in the sense of set
theory, since the denotation of a plural NP like these students is of the same type as that
of the singular the student. See, e.g., Kamp & Reyle (1993) for discussion of these mattes
and some consequences.
(15) [ + Plural ] <-> [ collection x ]
Like (13), condition (15) might be construed as a kind of lexical information about the
feature [+Plural], which would have the effect that for a plurale tantum like people the
regular collective reading follows from the idiosyncratic categorization by [+Plural]. Now,
the conditional character of (15) would have to be captured by lexically blocking nouns
like glasses or measles from the application of (15). Thus measles would be marked by
two exceptions: idiosyncratic categorization as [+Plural], but at the same time exemption
from the feature's interpretation.This again raises the question of interpreting [-Plural],
and again, the strategy to leave the unmarked value of the feature [±Plural] without
semantic effect seems to be promising, assuming that the default type of reference is
just neutral with respect to the condition represented as [ collection ]. A revealing
16. Semantic features and primes 335
illustration of lexical idiosyncrasies is the contrast between German Ferien (vacation)
and its near synonym Urlaub, where Ferien is a plurale tantum with idiosyncratic
singular reading, much like measles, while Urlaub allows for [+Plur] and [-Plur] with (15)
applying in the former case.
Still more complex conditions obtain, as already noted, for the category Gender in
German. The two features [± Masculine ] and [± Feminine ] identify three Genders by
two features as follows:
(16) a. Masculine: [+Masculine ] ; Mann (man), Loffel (spoon)
b. Feminine: [+ Feminine ]; Frau (women), Gabel (fork)
c. Neuter: [-Masculine,-Feminine ] ; Kind (child), Messer (knife)
Only plus values of Gender features are related to semantic conditions, and only for
animate nouns.This can be expressed as follows:
(17) a. [ + Masculine ] <-> [ male x ] / [ animate x ]
b. [ + Feminine ] <-> [ female x ] / [ animate x ]
The correspondence expressed in (17) holds only for cases where the semantic
condition [ animate x ] is present, hence the Gender features of nouns like Loffel, Gabel
have no semantic impact. According to the strategy that leaves minus-valued features
semantically uninterpreted, nouns like Kind and Messer would both be equally
unaffected by (17), while derivational affixes like -ung, -heit, -t, -schaft project their inherent
Gender-specification even when it is not interpretable by (17). Conditions like (17) and
(15) might reasonably be considered as saving lexical redundancy, such that the
categorization as [+Masculine] is predictable for nouns like Vater(father) the SF of which
contains [ animate x & [ male x ] ]. Idiosyncratic specifications would then be involved
in various cases of blocking (17). Thus Weib (woman) is [ female x ] in SF, but
categorized as [-Masculine, -Feminine]. On the other hand, Zeuge (witness) is categorized as
[+Masculine], but blocked for (17), although marked as [animate x]. Similarly Person
(person) is [+ Feminine ], but still not [ female x ], as eine mannliche Person (a male
person) is not contradictory.
Finally a short remark is in order about structural or abstract Case, the paradigm case
of features devoid of SF-interpretation. Two points have to be made. First, structural
Case must be distinguished from semantic or lexical Case, e.g., in Finno-Ugric and
Caucasian languages, representing locative and directional relations of the sort expressed by
prepositions in Indo-European languages. Semantic Cases like Adessiv, Inessiv, etc.
correspond to in, on, at, etc. Although the borderline between abstract and semantic Case is
not always clear-cut, raising non-trivial questions of detail, the role of abstract Case we
are concerned with is clear enough in principle.
There is, of course, a natural temptation to extend successful analyses of semantic
Cases to the Case system in general, relying on increasingly abstract characterizations,
which are compatible with practically all concrete conditions. Hjelmslev (1935) is an
impressive example of an ingenious strategy of this sort;cf. also article 78 (Kiparsky &
Tonhauser) Semantics of inflection.
Second, the semantic aspect of morphological features must not be confused with
their participation in expressions whose semantic structure differs for independent
336
IV. Lexical semantics
reasons, as already noted with regard to the locative/directive-distinction of the German
prepositions in, an, auf, etc. The most plausible candidates for possible semantic effects
of abstract Case are thematic roles in constructions like he hit him, where Nominative
and Accusative relate to Agent and Patient. Identity or difference of meaning is not due
to semantic features assigned to abstract Case, however, as can be seen from
constructions like (18), where the verb anziehen (put on, dress) assigns the same role Recipient
alternatively by means of Dative, Accusative, Nominative, and Genitive in the four cases
in (18a-d), while in (18b) and (18e) the different roles Recipient and Theme are marked
by the same Case Accusative.
(18) a. cr zicht dcm PaticntcnDat den MantclAcc an (he puts the coat on the patient)
b. er zieht den PatientenAcc an (he dresses the patient)
c. der PatientNom wird angezogen (the patient is dressed)
d. das Anziehen des PatientenGen (the dressing of the patient)
e. Er zieht den MantelAcc an (he puts on the coat)
Hence the roles of complements cannot derive from Case features,but must be determined
by the SF of the verb in the way indicated in (9) and (10). The selection restrictions the
verb imposes on its complements are inherent lexical conditions of the verb, modulated
by passivization in (18c) and nominalization in (18d).See Wunderlich (1996b) for a
discussion of further interactions between grammatical and semantic conditions; cf. also
article 84 (Wunderlich) Operations on argument structure.
To sum up, morphological features must clearly be distinguished from elements of
SF, but may correspond to them by systematic conditions; and they may participate in
making semantic distinctions explicit, even if they don't have any semantic purport at all.
This point is worth emphasizing in view of considerations like those in Svenonius (2007),
who adumbrates conceptual content for morphological features in general, including
Gender and abstract Case, not distinguishing instances with and without semantic
purport (as in Gender), or semantic purport and the grammatical distinction it depends on
(as in structural Case).
4.3. Syntactic categories
While categories like Noun, Verb, Preposition, Determiner, etc. are generally assumed to
determine the combinatorial properties of lexical items and complex expressions, the
identification of their features is still a matter of debate. Incidentally, the terminology sways
between syntactic and lexical categories, depending on whether functional categories like
Complementizer, Determiner, Tense, etc. and phrasal categories like NP, VP, DP, etc. are
included. For the sake of clarity, I will talk about syntactic categories and their features.
Alternatives of the initial proposal [ ± Verbal, ± Nominal ] generally recognize the
determination of argument structure as the major effect of the features in question. One
point to be noted is that nouns and adjectives do not have strong argument positions,
i.e. they allow their complements to be dropped, in contrast to verbs and prepositions,
which normally require their argument positions to be syntactically realized. Hence the
feature [+Nominal] could be construed as indicating the argument structure to be weak
in this sense. This is more appropriately expressed by an inverse feature like [-Strong
Arguments]. Furthermore, verbs and nouns are recognized to require functional heads
16. Semantic features and primes 337
(roughly Determiner for nouns and Tense and Complementizer for verbs), ultimately
supporting the referential properties of their constituents, a condition that does not
apply to adjectives and prepositions. This can be expressed by a feature [+Referential],
which replaces the earlier feature [Verbal]. A similar proposal is advocated in Wun-
derlich (1996a). It might be noted that the feature [+Referential] - or whatever
counterpart one might prefer - has a direct bearing on the categorization of the functional
categories Determiner,Tense, and Complementizer, and the constituents they dominate.
These matters go beyond the present topic, though. We would thus get the following still
provisional proposal to capture the distinctions just noted:
(19) Verb Noun Adjective Preposition
Strong Arg + - - +
Referential + + -
Further systematic effects these features impose on the argument structure and
selection restrictions depend on general principles of Universal Grammar and on language-
particular morphological categories. Thus in English and German Nominative- and
Accusative-complements of verbs must be realized as Genitive-attributes of
corresponding nouns, as shown by well-known examples like John discovered the solution vs.
John's discovery of the solution.
On the whole, then, the features in (19) have syntactic consequences and regulate
morphological conditions, and don't seem to call for any semantic interpretation. There
is, however, a traditional, if not very clear view, according to which syntactic categories
have an essentially semantic or conceptual aspect, where nouns, verbs, adjectives, and
prepositions denote, roughly speaking, things, events, properties, and relations,
respectively. Even Hale & Keyser (1993) adopt within the "Minimalist Program" a notional
view of syntactic categories, assuming verbs to have the type "(dynamic) event" e,
prepositions the type "interrelation" r, adjectives the type "state" s, with only the type n of
nouns left without further specification. There are, however, various kinds of
counterexamples such as adjectives denoting relations like similar or dynamic aspects of events
like sudden. Primarily, though, all these conditions are neatly accounted for by the SF
and the argument structure of the items in question, as shown by a rough illustration like
(20). According to (20a), a transitive verb like open denotes an event s which instantiates
an activity relating x to y; (20b) shows that the relational noun father denotes a person x
related to the (syntactically optional) individual y;(20c) indicates that the adjective open
denotes a property of the entity y, and (20d) sketches the preposition in, which denotes a
relation locating the entity x in the interior of y.
(20) a. /open/ X y X x X s [ s inst [ [ act x ] [ cause [ become [ open y ] ] ] ] ]
b. /father/ X y X x [ person x [ & [ male x ] [ & [ x [parent-of y ] ] ] ] ]
c. /open/ X y [ open y ]
d. /in/ X y X x [ x loc [ interior y ] ]
Clearly enough, the relevant sortal or ontological aspects of the items arc taken care of
without further ado.This is not the whole story, though. Even if the semantic neutrality
of the features in (19) might be taken for granted, one is left with the well-known but
intriguing asymmetry between nouns and verbs, both marked [+Referential], but subject
338
IV. Lexical semantics
to different constraints: While nouns have access to practically all sortal or ontological
domains, verbs are restricted to events, processes, and states. The asymmetry is
highlighted by basic lexical items like (21), which can show up as verbs as well as nouns, but
with different semantic consequences:
(21) a. run, walk, jump, rise, sleep, use,....
b. bottle, box, shelf, fence, house,saddle,....
Items like (21a) have essentially the same SF as nouns and verbs, differing only by their
morpho-syntactic realization, as in Eve runs vs. Eve's run. By contrast, items like (21b),
occurring as verbs differ scmantically from the corresponding noun, as illustrated by Eve
bottles the wine, Max saddles the horse, etc. In other words, nouns allow for object- as
well as event-reference, while verbs cannot refer to objects. Hence items like run need
only allow for alternative values of the feature [ ± Strong Arg ] to yield nominal or verbal
realizations, while nouns like bottle with object denotation achieve the necessary event
reference only through additional semantic components like cause, become, and loc.
These conditions can be made explicit in various ways. What we are left with in any case
is the observation that the event-reference of verbs seems to be constrained by syntactic
features. Event reference is used here (as before) in the general sense of eventuality
discussed in Bach (1986), including events, processes, states, activities - in short, entities
that instantiate propositions and arc subject to temporal identification as assumed in
(13) above for Tense features.
Two points should be made in this respect: First, the constraint is unidirectional, as
verbs require event-reference, but event reference is not restricted to verbs in view of
nouns like decision, event, etc Second, as it concerns verbs but none of the other
categories, it cannot be due to one feature alone, whether (19) or any other choice is adopted.
Using the present notation, the constraint could be expressed as follows:
(22) [ + Referential, + Strong Arg ] -> X s [ s inst [ p ] ]
This condition singles out verbs, as desired, and it determines the relevant notional
condition on their referential capacity, if we consider inst as imposing the sortal requirement
of eventualities on its first argument. It also has the correct automatic result of fixing
the event reference for verbs, which is the basis for the semantic interpretation of Tense,
Aspect, and Mood. Something like (22) might, in fact, not only be part of the (perhaps
universal) conditions on syntactic features in general and those of verbs in particular, but
can also be considered as a redundancy condition on lexical entries, providing verbs with
the event instantiation as their general semantic property. If so, the lexical information
of entries like (20a) could be reduced to (9), with the event reference being supplied
automatically.
5. Interpretation of semantic features
5.1. Conditions on interpretation
Basic elements of SF, including those corresponding to morpho-syntactic features, are
eventually related to or interpreted by the structures of C-I. As already mentioned, this
16. Semantic features and primes 339
interpretation, its conditions, and consequences are analogous to the interpretation
of phonetic features in A-P, yet different in fundamental respects. Analogies and
differences can be characterized by four major points.
First, phonetic and semantic features alike are primitive elements of the
representational systems they belong to, they cannot be reduced to smaller elements. Structural
differences of PF cannot be due to distinctions within elements like [-voiced] or [+nasal],
just as distinct representations of SF cannot be due to distinctions within components
like [ animate ], [ cause ], or [ interior ], although acoustic properties of an utterance
(say, because of different speakers), or variants of animacy or causation might well be
at issue (e.g., due to differences in intentionality). In other words, representations of PF
and SF and their correspondence via morpho-syntax cannot affect or depend on internal
structures of the primitive elements of PF and SF.
Second, the interpretation of basic elements is nevertheless likely to be complex,
and in any case subject to additional, rather different principles and relations. Basically,
primitive elements of PF and SF recruit patterns based on different conditions of mental
organization for the structural building blocks of language: The correlates of phonetic
features in A-P integrate conditions and mechanisms of articulation and auditory
perception, whereas correlates of semantic features in C-I involve patterns of various domains
of perception, intention, and complex structural dependencies, as shown by cases like
[ animate ] or [ artifact ], combining sensory with abstract, purpose oriented conditions.
This is the very gist of positing interface elements, which deal with items of two
currencies, so to speak, valid under intra- and extra-linguistic conditions, albeit in different ways.
As the present article necessarily focuses on the interpretation of basic elements, it has
to leave aside encroaching conditions involving larger structures, where compositional
principles of PF and SF are partially suspended. With respect to PF, overlay-effects can
be seen, e.g., in whispering, where suspending [ ± voiced ] has consequences for various
other features. With respect to SF, the wide range of metaphor, for instance, is based on
more or less systematic shifts within conceptually coherent areas. Projecting, e.g.,
conditions of animacy into inanimate domains, as is frequently done in talk about computers
or other technology, obviously alters the interpretation of features beyond [ animate ].
Phenomena of this sort may partially suspend the compositionality of interpretation, still
presupposing the base line of compositional matching, though.
Third, PF interacts with one complex but unified domain, viz. production and
processing of acoustic signals, and it shares with A-P the linear, time dependent structure as
the basic principle of organization. SF on the other hand deals with practically all
dimensions of experience, motivating the abstract functor-argument-structure sketched above.
It necessarily cannot match the different organizational principles inherent in domains
as diverse as visual perception, social relations, emotional values or practical intentions,
hence it must be neutral with respect to all of them. In other words, the relation of SF to
the diverse subsystems of C-I cannot mean that it shares the specific structure of different
domains of interpretation. To mention just one case in point: Color perception, though
highly integrated with other aspects of vision, is structured by autonomous principles,
which are completely different from the distinctions of color categories and their
relations in SF, as discussed in Berlin & Kay (1969) and subsequent work. It follows from
these observations that the interpretation of SF must be compatible with conditions
obtaining in disparate domains - not only with regard to the compositionality of complex
configurations, but also with regard to the nature and content of its primitive elements.
340
IV. Lexical semantics
Fourth, while for good reasons PF is assumed to be the interface with both articulation
and perception, possibly even granting their integration, it is unclear whether in the same
way SF could be the interface integrating all the subsystems and modules of C-I. Notice
that this is not the same issue as the fact that two or more systems organized along
different principles might well have a common interface. Singing for instance integrates
language and music, and visual perception integrates shape and color, which originate from
quite different neuronal mechanisms. On the one hand, for reasons of mental economy
one would expect language, which forces SF to interface with the whole range of different
modules of mental organization anyway, to be the designated representational system
unifying the systems of C-I, setting the stage for coherent experience and mental
operations like planning, reasoning, orientation.The central system posited in Fodor (1983) as
the global domain the modular input/output systems interface with, might be taken to
stipulate this kind of overall interface. On the other hand, there are well-known systems
integrating different modules of C-I independently of SF and according to autonomous,
different principles of organization. The most extensively studied case in point is spatial
orientation, which integrates various modes of perception and locomotion and is
fundamental for cognition in general, as it also provides a representational format for various
other domains, including patterns of social relation or abstract values, as documented
in the vast literature about many aspects of spatial orientation, its neurophysiological
basis, its computational realization and psychological representation, including the
particular conditions of spatial problem solving. An instructive survey is found in Jackcndoff
(2002) and the contributions in Bloom et al. (1996), which deal in particular with the
representation of space in language;see also article 107 (Landau) Space in semantics and
cognition. Similar considerations hold for other domains, like music, emotion, or plans
of action.The question whether SF serves as the central instance, providing the common
interface that directly integrates the different modules and subsystems of C-I, or whether
there arc separate interfaces mediating among parts of C-I independently of language
must be left open here. In any case, SF and its basic elements must eventually interface
with all aspects of experience we can talk about. In other words, color, music, emotions,
other people's minds, and theories about everything must all in some way participate in
interpreting the basic elements of SF.
5.2. Remarks on Conceptual Structure
One of the major problems originating from these considerations is the difficulty to
specify at least tentatively the format of interpretations which elements of SF receive
in C-I and its subsystems. To be sure, different branches of psychology and cognitive
sciences provide important and elaborate theories for particular mental domains, such
as Marr's (1982) seminal theory of vision. But because of their genuine task and
orientation, they deal with specific domains within the mental world. Thus they are far from
covering the whole range of areas that elements of SF have to deal with, and have little
to say about the interpretation of linguistic expressions. Proposals seriously dealing with
the question of how extra-linguistic structures correspond to linguistic elements and
configurations are largely linguo-centric, i.e. they approach the problem via insights and
hypotheses developed and motivated by the analysis of linguistic expressions.
An impressive large scale approach of this sort is Miller & Johnson-Laird (1976),
providing an extensive survey of perceptual and conceptual structures that interpret
16. Semantic features and primes 341_
linguistic expressions.The approach systematically distinguishes the "Sensory" Field and
representations of the "Perceptual World" it gives rise to from the semantic structure of
linguistic expressions, the format of which is very similar to that of SF as discussed here.
(23) illustrates elements of the perceptual world, (24) the semantic elements based on
them.
(23) a. Cause (e, e') Event e causes event e'
b. Event (e) e is an event
(24) a. happen (S): An event x characterized by S happens at time t if:
(i) Event, (jc)
b. cause (S,S'): Something characterized by S causes something S' if:
(i) happen (S)
(ii) happen (S1)
(iii) Cause ((i),(ii))
These examples cannot do justice to the general approach, but they correctly show that
perceptual and semantic representations have essentially the same format, suggesting
that semantic structures and their interpretation are of roughly the same sort and are
related by conditions as in (24). (This is surprising in view of the fact that Johnson-Laird
(1983) explores the mental structure of spatial representations and the inferences based
on them, showing that they are systematically different from the semantic structure of
verbal expressions describing the same spatial situations, thus supporting different
inferences.) In other words, perceptual interpretation could be (mis)construed as turning the
organization of SF into principles of perception, something Miller and Johnson-Laird
clearly do not suggest.
In a different but comparable approach, Jackendoff (1984,2002) assumes
representations of the sort illustrated in (8), which he calls Conceptual Structures, to cover semantic
representations and much of their perceptual interpretation.Thus, like Miller & Johnson-
Laird, he takes the format of semantic representations to be the model of mental systems
at large - with one important exception. In Jackendoff (1996, 2002) a system of Spatial
Representation SR, which Conceptual Structure CS interfaces with, is explicitly assumed
as a system based on separate principles, some of which, like deictic frameworks,
identification of axes, or orientation of objects, are made explicit.The systematic interaction
of CS and SR, however, is left implicit, except that its roots are fixed by means of hybrid
combinations inside lexical items, as indicated by the following entry for dog, adapted
from Jackendoff (1996:11).
(25) PF: / dog /
Categorization: [ +N,-V,+Count,... ]
CS: [ ANIMAL X & [ [ CARNIVORE X ] & [ POSSIBLE [ PET X ] ] ] ]
SR: [ 3-D model with motion affordances ]
Auditory: [ sound of barking ]
The notion of a 3-D model invoked here is taken from the theory of vision in Marr
(1982), extended by a mental representation of motion types Marr's theory does not
342
IV. Lexical semantics
deal with. The actual model must be construed as the prototype of dogs in the sense of
Rosch & Mervis (1975) and related work. Jackendoff's inclusion of auditory
information points (without further comment) to the natural condition that the concept of dogs
includes the characteristic voice, the mental representation of which requires access to
the organization of auditory perception. Besides this purely sensory quality, however,
barking is a linguistically classified aspect of dogs (and foxes) and as such belongs to the
SF-information of the entry of dog.
In general, then, what kind of representations SF must serve as interface for, is anything
but obvious and clear. Principles of propositional structure, on which SF is based, are
certainly not excluded from mental systems outside of language, but it is more than unlikely
that they can all be couched in this format, as it appears to be suggested in Miller &
Johnson-Laird (1976), Jackendoff (2002) and (at least implicitly) many others. Spatial
reasoning, to mention just one domain, does not rely on descriptions using the principles
of predicate logic, as shown in Johnson-Laird (1983).
5.3. The inventory
The diversity of domains SF has to cope with leads to difficulties in identifying the
repertoire of its primes - a problem that does not arise in PF, where the repertoire of primes
is uncontroversial, at least in principle. For SF, two contrary tendencies can be
recognized, which might be called minimal and maximal decomposition. Fodor et al. (1980)
defend the view that concepts (which are roughly the meaning of simple lexical items)
are essentially basic, unstructured elements. On this account, the inventory of semantic
primes (for which the term features ceases to be appropriate) is by and large identical
to the repertoire of semantically distinct lexical entries. Although Fodor (1981) admits a
certain amount of lexical decomposition, recognizing some undisputable structure within
lexical items, for the majority of cases he considers word meanings as primes. As a matter
of fact, the problem boils down to the role of Completers, noted above - a point to which
we will return.
The opposite view requires the SF of all lexical items to be reducible to 1 fixed
repertoire of basic elements, corresponding to the features of PF. The re are various versions of
this orientation, but explicit reflections on its feasibility are rare. Jackendoff (2002:334ff)
at least sketches its consequences, even contemplating the possibility of combinatorial
principles inside lexical items that differ from lexicon-external combination, a view that
is clearly at variance with detailed proposals he pursues elsewhere. On closer
inspection, the claims of minimal and maximal decomposition are not really incompatible,
though, allowing for intermediate positions with more or less extensive decomposition.
As a matter of fact, the contrary perspectives end up, albeit in rather different ways, with
the need to account for the nature of irreducible elements. Two points arc to be made,
however, before these issues will be taken up.
First, with regard to the size of the repertoire, there is no reliable estimate on the
market. It is clear that the number of PF-features is on the order of ten or so, but for
SF-features the order of magnitude is not even a matter of serious debate. If
decomposition is denied, the repertoire of basic elements is on the order of the lexical items, but
the decompositional view per se does not lead to interesting estimates, either, as there is
no direct relation between the number of lexical items and the number of primes. The
combinatorial principles of SF would allow for any number of different lexical items on
16. Semantic features and primes 343
the basis of whatever number of primes is proposed. Hence for standard
methodological reasons, parsimonious assumptions should be expected. However, since reasonable
assumptions must be empirically motivated, they would have to rely on systematically
analyzed, representative sets of lexical items. But so far, available results have not led to
any converging guesses.To give at least an idea about possible estimates, the
comprehensive analysis of larger domains of linguistic expressions in Miller & Johnson-Laird (1976)
can be taken - with all necessary caveats - to deal with remarkably more than a hundred
basic elements of the sort illustrated in (24).This repertoire is due to well-motivated
considerations of the authors, leading to interesting results with respect to central cognitive
domains but with no general implications, except that huge parts of the English
vocabulary could not even be touched. One completely different attempt to set up and motivate
a general repertoire of basic elements should at least be mentioned: Wierzbicka (1996)
has a growing list of 55 primes on the basis of what is called Natural Semantic
Metalanguage, a rather idiosyncratic framework a more detailed presentation of which would
by far exceed the present limits, but cf. article 17 (Engelberg) Frameworks of
decomposition. Among the peculiar assumptions of the approach is the tenet that semantic primes
are the meaning of designated, irreducible lexical items, like I, you, this, not, can, good,
BAD, DO, HAPPEN, WHERE, etc.
Second, as to the content, the repertoire of primes has to respond not only to the
richness of the vocabulary of different languages, but first of all to the diversity and nature
of the mental domains to be covered. Distinctions must be captured and categorized
not only for different perceptual modalities, but also for "higher order", more central
domains like social dependencies, goals of action, generating explanations, etc. There is,
after all, no simple and unconditional pattern to order sets of primes.
Two perspectives emerge from the contrary views about decomposition. On the one
hand, even strong opponents of decomposition grant a certain amount of formal analysis,
which requires a limited, not strictly closed core of basic semantic elements. These
elements show up in different guises in practically all pertinent approaches. Besides
components related to morpho-syntactic features as noted above, there is a small set of primes
like cause, become, and a few others that have been treated as direct syntactic elements
in different ways. The "light verbs" in Hale & Keyser (1993) and much related work
around the Minimalist Program, for instance, are just syntactically realized semantic
components. These different perspectives, emphasizing either the semantic or the
syntactic role of Logical Form, arc the reason for the terminological tensions mentioned at
the outset.Three decades earlier, an even stronger syntactification of semantic elements
had been proposed in Generative Semantics by McCawley (1968) and subsequent work,
also relying on cause, become, not, and the like. In spite of deep differences between the
alternative frameworks, there are clearly recurring reasons to identify central semantic
elements and the structures they give rise to; cf. also article 7 (Engelberg) Lexical
decomposition, article 17 (Engelberg) Frameworks of decomposition, and article 81
(Harley) Semantics in Distributed Morphology.
On the other hand, even if one keeps to the program according to which lexical items
are completely decomposed into primitive elements, the problem of idiosyncratic
residues, identified as Distinguishes in Katz & Fodor (1963) and Katz (1972) or as
Completers in Laurence & Margolis (1999), must be taken seriously. One might consider
elements like [ bottle ] in (11), or taxonomic specifications like [ canine ] for dog or
[ feline ] for cat, or distinguishes like [ knight serving under the standard of another
344
IV. Lexical semantics
knight ] in Katz & Fodor (1963) as provisional candidates, to be replaced by more
systematic items. But there are different domains where one cannot get rid of elements of
this sort for quite principled reasons. Taxonomies of animals, plants, or artifacts are by
their very nature based on elements that must be identified by idiosyncratic conditions.
For different reasons, distinctions in perceptual dimensions like color, heat, pressure, etc.
and other mental domains must be identified by primitive elements. In short, there is a
complex variety of possibly systematic sources for Completers, such that an apparently
undetermined set of basic elements of this sort must be acknowledged. For two reasons,
these elements cannot be considered as a side issue, to be dismissed in view of the more
principled, linguistically motivated semantic elements. First, they make up the majority
of the inventory of basic semantic elements, which is, moreover, not closed on principled
grounds, but open to amendments within systematic limits. Second, they are of
remarkably different character with respect to their interpretation and their position within SF
and the role of SF as the interface with C-I.
To sum up, the overall repertoire that SF-representations are made up of comprises
elements of different kinds. They can reasonably be grouped according to two
conditions into (a) elements that are related to language-internal systematic conditions of
L as opposed to ones that are not, and (b) elements with homogeneous as opposed to
potentially heterogeneous interpretation - an important aspect that will be made explicit
below.The emerging kinds of elements can be characterized as follows:
(26) Systematic Features Idiosyncratic Features Dossiers
L-systematic + - -
Homogeneous + + -
The distinctions between these kinds of elements are based on systematic conditions with
principled theoretical foundations, although the resulting differences need not always be
clear-cut, a usual phenomenon with regard to empirical phenomena. As closer
inspection will show, however, the classification reasonably reflects the fact that linguistically
entrusted elements exhibit increasingly homogeneous and systematic properties.
6. Kinds of elements
6.1. Systematic semantic features
The distinction between systematic elements and the rest seems reminiscent of that
between semantic markers and distinguishes in Katz & Fodor (1963), but it differs
fundamentally, both in principle and in empirical detail. There are at least two ways in
which semantic primes can be systematic in the sense of bearing on morpho-syntactic
distinctions. First, there are elements that participate in the interpretation of
morphological and syntactic features as discussed in §4, including both categorization and
the various types of selection. These include properties like [ female ], [ animate ],
[ physical ],relations like [ before ], [ location ],or functions like [ surface ], [ proximate ],
[ vertical ], etc. Second, there arc components on which much of the argument
structure of expressions, and hence their syntactic behavior depends. Besides well-known and
widely discussed elements like [ cause ], [ become ], [ act ], also features of relational
nouns like [ parent ], [ color ], [ extension ] and others belong to this group. For general
16. Semantic features and primes 345
reasons, negation, conjunction, possibility, and a number of other connectives and logical
operators must be subsumed here, as they are clearly involved in the way in which SF
accounts for the logical organization of cognitive structures. It must be noted, however,
that logical constants, explored in pertinent theories of logic, are not necessarily identical
with their natural counterparts in SF.The functor & assumed in (12), for example,differs
from standard conjunction at least by its formal asymmetry. Further differences between
logical and linguistic perspectives have been recognized with regard to quantification
and modality.
In any case, the interpretation of these elements consists in conceptually basic,
integrated conditions, which may or may not be related to particular sensory domains.
Causality, for instance, is a fundamental dimension of experience, which lives on integrated
perceptual conditions, incorporating information from various domains like change
and position, as discussed, e.g., in Miller & Johnson-Laird (1976). More specifically, it
is not extracted from, but imposed on perceptual information. Dowty (1979)
furthermore shows that the conceptual conditions of causality and change have straightforward
external, model-theoretic correlates. Similarly, shape, location, relative size, part-whole-
relation, or possession are integrated and often fairly abstract conditions organizing
perception and action. For instance, the relation [ loc ] or functions like [ interior ],
[ surface ], etc. cover conditions of rather different kind, depending on the sortal aspect
of their arguments. Thus a page in the novel, a chapter in the novel, an error in the novel,
or a scene in the novel pick out different aspects by means of the same semantic
conditions represented by the features assumed for in in (20d). Other and perhaps still more
complex and integrated aspects of experience are involved in the recognition of animacy
or the identification of male and female sex. Even if some ingredients of these
complex patterns can be sorted out and verbalized separately (identifying, e.g., physiological
properties or behavioral characteristics), the features are holistic components with no
internal structure within SF.This corresponds to the fact that the content of PF-fcaturcs
is sometimes accessible to conscious control and verbalization.The articulatory gestures
of features like [voiced] or [labial] for instance may well be reflected on, without their
encapsulated character being undercut.
In general, systematic features are to be construed as the stabilized, linguistic reflex of
mechanisms involved in organizing and conceptualizing experience. Fundamental
conditions of this sort seem to provide a core of possibilities the language faculty can draw
on. Which of these options arc eventually implemented in a given language obviously
depends to some extent on its morphological and lexical regularities.
6.2. Idiosyncratic semantic features
Two factors converge in this kind of features. First, there is the fundamental
observation, granted even under a radical denial of lexical decomposition, that the meanings of
linguistic expressions respond to distinctions within basic sensory domains, from which
more complex concepts are constructed. An extensively debated and explored domain
in this respect is color perception. Items like yellow, green, blue differ semantically on the
basis of perceptual conditions, but with no other linguistic consequences, except common
properties of the group as a whole, which is captured by the common,systematic feature [
color ], while the different chromatic categories are represented by idiosyncratic features
like [ red ], [ green ], [ black ], etc. It might be noted in this respect that color terms -
346
IV. Lexical semantics
unlike those for size, value, speed, etc. - are adjectives that can also occur as nouns, as
shown by the red of the car versus *the huge of the car, due to the component [ color ] as
opposed to the idiosyncratic values [ red ], [ blue ], which still are by no means arbitrary,
as Berlin & Kay (1969) and subsequent work has shown.
Various other domains categorizing sensory distinctions have attracted different
degrees of attention, with limited systematic insight. Intriguing problems arise, e.g., with
respect to haptic qualities, concerning texture and plasticity of surfaces or even the
perceptual classification of temperature. It might be added that besides perceptual domains,
there are biologically determined motor patterns, as in grasp or the distinction between
walk and run, which also provide the interpretation of basic elements of SF.
The second factor supporting idiosyncratic features is the obvious need to account for
differences of meaning that obviously have no systematic status in morpho-syntax or more
general semantic patterns. Whether or not the distinctions between break, shatter, smash
can be traced back to homogeneous sensory modalities need not be a primary concern,
since in any case besides basic modes of perception and motor control, more complex
conditions like those involved in the processes denoted by cough, breath, or even sleep,
laugh, cry and smile are likely to require semantic features that are motivated by nothing
else than the distinctions assigned to them in C-I.These distinctions might well be based on
processes that are physiologically or physically complex - they still lead to self-contained
features, as long as their interpretation is experientially homogeneous. One easily realizes
that lexical knowledge abounds with elements like shatter, splinter, smash, and plenty of
other sets of items that cannot get along without features of this idiosyncratic kind. They
seem to be the paradigm cases demonstrating the need for Distinguishes or Completers.
This is only half of the truth, though, because distinguishing elements like [ canine ],
[ feline ] etc. are usually assumed to be equally typical for Completers as are [ green ]
or [ smooth ] and the like. Characteristic elements identifying the peculiarities of a
particular species or a specific artifact arc, of course, plausibly considered as Completers
or Distinguishes. But they do have an essentially different character, as their
interpretation can be heterogeneous for principled reasons, combining information from
separate mental modalities. Similar considerations apply to what Pustejovsky (1995) calls
the Qualia structure of lexical items, which he supposes to characterize their particular
distinctive properties, using the term Qualia, however, in a fairly different sense than the
original notion of subjective (usually monomodal) percepts, e.g., in Goodman (1951).
The integration of different types of information acknowledged in these
considerations might be a matter of degree, which yields a fuzzy boundary with respect to the
kind of elements to be considered next.
6.3. Dossiers
The notion of Dossiers, proposed in Bierwisch (2007), takes up a problem that is present
but not made explicit in much of the literature that deals with the phenomena in
question. The point is illustrated by the entry (25) proposed by Jackendoff for dog, the
semantically relevant parts of which are repeated here as (27):
(27) CS: [ ANIMAL X & [ [ CARNIVORE X ] & [ POSSIBLE [ PET X ] ]] ]
SR: [ 3-D model with motion affordances ]
Auditory: [ sound of barking ]
16. Semantic features and primes 347
The elements in CS (Jackendoff's counterpart of SF) are fairly abstract conceptual
conditions, and in fact plausible candidates for systematic features classifying animals. But
instead of the distinguisher [ canine ] that would have to identify dogs, we have SR and
Auditory information to account for their specificity. Now, SR would either require the
domain of 3-D models to be subdivided in prototypes of dogs, cats, horses, spiders, etc. -
comparable to the division of the color space by cardinal categories like red, green,
yellow, etc. -, clearly at variance with the theory of 3-D representations, or it would leave
the information about shape and movement of dogs outside the range of the features of
SF. Similar remarks apply to the auditory information. The essential point is, however,
that 3-D models (of any kind of object, not just of dogs or animals) arc Spatial
Representations, which arc subject to completely different principles of organization than SF (or
CS,for that matter). SR-representations by their very nature preserve three-dimensional
relations, allowing for spatial properties and inferences, as discussed in Johnson-Laird
(1983) and elsewhere, conditions that SF is incapable to support. Moreover, besides
shape, movement, and acoustic characteristics, the distinguishing information about dogs
includes a variety of other aspects of behavior, capacities, possible functions, etc. some of
which might require propositional specifications that can be represented in the format of
SF. In other words, the replacement or interpretation of [ canine] does not only concern
SR and Auditory in (27), but involves a presumably extendible, heterogeneous cluster of
conditions, i.e. a dossier of different conditions, which is nevertheless a unified element
within SF, as the abbreviation [ canine ] in the SF of dog would correctly indicate.
The comments made on the SF of dog are easily extended to a large range of lexical
items. Take, for the sake of illustration, nouns like ski or bike, which would be classified
by systematic features like [ artifact ] and [ for motion ], while the more-or-less
complex specificities are indicated by dossiers like [ ski ], or [ bicycle ], respectively. Again,
what is involved are different domains of C-I, indicating e.g., conditions on substance,
bits and pieces, and technology, in addition to spatial models, and interpreting integrated
elements, which, incidentally,enter into operations of verbalization noted with respect to
(21b),such that ski and bike become verbs denoting movement by means of the specified
objects. This in turn shows that dossiers are by no means restricted to the SF of nouns.
Three comments should be made here. First, as noted in (26), dossiers are elements
with inhomogeneous interpretation that allow for conditions from different modules of
C-I. To the extent to which these conditions may or may not be integrated, the
borderline between dossiers and idiosyncratic features gets fuzzy, as noted earlier. With respect
to SF, however, they are basic elements, subject to the general type structure and the
combinatorial conditions it imposes, as shown by constructions like we were skiing or
she biked home. As a consequence, the Lexical System LS of a given language L must
be able to contain a remarkable collection of primitive elements whose idiosyncratic,
heterogeneous interpretation is attached to, but not really part of, the lexical entries. In
other words, lexical entries are connected to C-I not only by means of the general,
systematic primes, but also by means of idiosyncratic files calling up different modules. One
might observe that entries like (25) reflect this situation by making the extra-linguistic
(e.g. spatial and auditory) information part of the lexical items directly. That way one
would avoid ambivalent primes of the sort called Dossier, but the Lexical System would
become a hybrid combination of linguistic and other knowledge - which perhaps it is.
On that view, dossiers would behave as unified basic elements, a point that will be taken
up below. Under the opposite view, dossiers might be construed to a large extent as
348
IV. Lexical semantics
the meaning of lexical entries without decomposition. And that is in fact what dossiers
are - precisely to the extent to which primes resist language-internal decomposition.
Second, for this very reason dossiers are crucial joints for the role of SF as the
interface with the different modules of C-I. To the extent to which they include, e.g., 3-D
models that allow for spatial characteristics and spatial similarity judgments, dossiers
situate the elements they specify in SF with respect to spatial possibilities, whose actual
conditions, however, are given by the system of Spatial Representation. Similar
considerations apply to auditory or interpersonal information. In general then, insofar as dossiers
address different mental subsystems, they directly mediate between language and the
different modes of experience.
Third, as already noted, dossiers may integrate propositional information, in
particular if they are enriched through individual experience, such that, e.g., the file [ frog ]
includes conditions of procreation and development or usual habitats besides the
characteristic visual and auditory information. Much of this information might be of
propositional character, sharing in principle the format of SF. With respect to these components,
dossiers are transparent in the sense that on demand their content is available for
verbalization. In a way, dossiers might thus be means of sluicing information into the
proper lexical representations.
It is worth noting in this respect that Fodor (1981) supports the claim that lexical
meanings are basic, indefinable elements by explaining them as a kind of Janus-headed
entities, which arc logically primitive but may nevertheless depend on and integrate
other (basic or complex) elements. The decisive point is that the dependence in
question must not be construed as a logical combination of the integrated elements, since
this would lead to analyzable complex items, but as something which Fodor calls mental
chemistry in the sense of John Stuart Mill (1967), who suggests that by way of mental
chemistry simple ideas generate, rather than compose, the complex ones. In other words,
the way in which the prerequisites of such a double-faced clement arc involved is not
the kind of combination on which SF is based, but a type of integration by which the
mental architecture supports hybrid elements that are primitive in one respect, still
recruiting resources that are complex in other respects. This is exactly what Fodor's
lexical meanings share with dossiers or completers.
Mill's and Fodor's mental chemistry, the obviously necessary amalgamation of
different mental resources and modalities, is a permanent challenge for the cognitive
sciences dealing with perception and conceptual organization. The notion of frames
introduced, e.g., in Fillmore (1985) and systematically developed in Barsalou (1992,
1999) is one such proposal to relate the meaning of linguistic expressions to the different
dimensions of their interpretation, where frames are assumed to integrate and organize
the different dimensions of experience; cf. also article 29 (Gawron) Frame Semantics.
In conclusion, in view of the controversial properties of (in)decomposable complex
semantic elements it might be unavoidable to recognize ambivalent elements, which are
at the same time primitive features of the internal architecture of L and heterogeneous
elements at the border of SF.
6.4. Combinatorial effects
The phenomena just noted are related to a number of widely discussed problems. One of
them concerns the variation on the basis of the so-called literal meaning, in particular the
16. Semantic features and primes 349
combinatorial effect that SF-features may exert on their interpretation. (28) is a familiar
exemplification of one type of interaction:
(28) a. Tom opened the door.
b. Sally opened her eyes
c. The carpenters opened the wall
d. Sam opened his book to page 37
e. The surgeon opened the wound
f. The chairman opened the meeting
g. Bill opened a restaurant
Searle (1983) remarks about these examples that open has the same literal meaning in
(28a-e), adumbrating a somewhat different meaning for (28f) and (28g), as incidentally
suggested by the German glosses offiien for (28a-e), but eroffnen for (28f/g), although
he takes it as obvious that the verb is understood differently in all cases. As far as this
is correct, the differences are connected to the objects the verb combines with, inducing
different acts and processes of opening. If the SF of open given in (20a), repeated as (29),
represents the invariant meaning of the occurrences in question, then the differences
noted by Searle must arise from the feature-interpretation in C-I:
(29) / open / X y X x X s [ s inst [ [ act x ] [cause [ become [ open y ] ] ] ] ]
To sort out the relevant points, we first notice that the different types of transition towards
the resulting state of y and their causation by the act of x are intimately related to the
nature of the resulting state and the sort of activity that brings it about. In other words,
the time course covered by [ become ] and the act causing the change are different if the
eventual state is that of the door, the eye, a book, a bottle, or a meeting. Similarly the
character of s instantiating the transition is determined by the nature of x and y and their
involvement. Hence [sinst...] and [cause [ become ... ] ] don't contribute to the variants
of interpretation independently of [ act x] and [ open y ]. As noted earlier, [ become p ]
moreover involves a presupposition. It requires the preceding state to be [ not p ], which
in turn depends on the interpretation of p, i.e. [ open y ] in the present case. Hence the
actual choice in making sense of open depends-not surprisingly-on the values of x and y,
which arc provided by the subject and object of the verb. Now, the subject in all cases at
hand has the feature [ human x ], allowing for intentional action and thus imposing no
relevant differences on the interpretation of x. Variation in the interpretation of [ act x ]
is therefore not due to the value of x, but to the interpretation of [ open y ], which in turn
depends on y and the property abbreviated by open.This property is less clear-cut than it
appears, however. What it requires is that y does not preclude access, where y is either the
container whose interior is at issue, as in he opened the bottle, or its boundary or possible
barrier, as in he opened the door, while in cases like she opened her eyes or he opened the
hand the alternative seems altogether inappropriate. Bowerman (2005) shows that
differences of this sort may lead to different lexical items in different languages according to
conditions on the specification of y. How these and further variations arc to be reflected
in SF must be left open here. As a first approximation, [ open y ] could be replaced by
something like [ y allow-for [ access-to [ interior ] ] ], leaving undecided how the
interior relates to y, i.e. to the container or its boundary. In any case, the interpretation of
350
IV. Lexical semantics
[ open y ] depends on how the object of open, delivering the value of y, is to be
interpreted in C-I in all relevant respects.This in turn modulates the interpretation of [ act x ],
fostering the specific activity by which the resulting state can be brought about.
Without going through further details involved in cases similar to (28), it should be
clear that the SF-features are not interpreted in isolation, but only with regard to
connected scenarios C-I must provide on the basis of experience from different modules.
Searle (1983) calls these conditions the "Background" of meaning, without which literal
meaning could not be understood.The Background cannot itself be part of the meaning,
i.e. of the semantic structure, without leading to an infinite regress, as the background
elements would in turn bring in their background. It can be verbalized, however, to the
extent to which it is accessible to propositional representation. Roughly the same
distinction is made in Bierwisch & Lang (1989), Bierwisch (1996) between Semantic Form and
Conceptual Structure; cf. also article 31 (Lang & Maienborn) Two-level Semantics. A
different conclusion with respect to the same phenomena is drawn by Jackendoff (1996,
2002, and related work, cf. article 30 (Jackendoff) Conceptual Semantics), who proposes
Conceptual Structure, i.e. the representation of literal meaning, to include spatial as well
as any other sensory and motor representation. Carefully comparing various versions
to relate semantic, spatial, and other conceptual information, Jackendoff (2002) ends
up with the assumption that no principled separation of linguistic meaning from other
aspects of meaning is warranted. But besides the fact that there are different types of
reasoning based e.g., on spatial and propositional representations, as Jackendoff is well
aware, the problem remains whether and how to capture the different interpretations
related to the same lexical meaning, as illustrated by cases like (28) and plenty of others,
without corresponding differences in Conceptual Structure. In any case, if one does not
assume the verb open to be indefinitely ambiguous, with equally many different
representations in SF (or CS), it is indispensable to have some way to account for the unified
literal meaning as opposed to the multitude of its interpretations.
This problem has many facets and consequences, one of which is exemplified by the
following contrast:
(30) a. Mary left the institute two hours ago.
b. Mary left the institute two years ago.
Like open in (28), leave has the same literal meaning in (30a) and (30b), which can
roughly be indicated as follows, assuming that [ x at y ] provisionally represents the
condition that x is in some way connected to y.
(31) / leave / X y X x X s [ s inst [ [ act x ] [ cause [ become [ not [x at y ] ] ] ] ] ]
Under the preferred interpretation, (30a) denotes a change of place, while (30b) denotes
a change of affiliation.The alternatives rely on the interpretation of institute as a building
or as a social institution. Pustejovsky (1995) calls items of this sort dot-objects. A dot-
object connects ontologically heterogeneous conditions by means of a particular type of
combination which makes them compatible with alternative qualifications.The entry for
institute has the feature [ building x ] in "dotted" combination (hence the name) with
[ institution x ],which are picked up alternatively under appropriate combinatorial
conditions, as shown in (32). Whether [ building ] and [ institution ] are actually primes
16. Semantic features and primes 351^
or rather configurations of SF built on more basic elements such as [ physical ] and
[ social ] etc. can be left open. The point is that they have complementary conditions
with incompatible sortal properties.
(32) a. The institute has six floors and an elevator.
b. The institute has three directors and twenty permanent employees.
The cascade of dependencies in (30) turns on the different values for y in (31) - the
"dotted" institute -, which induces a different interpretation of the relation at.This in turn
implies a different type of change, caused by different sorts of activity, involving crucially
different aspects of the actor Mary, who must cither cause a change of her location or of
her social relations, participating as a physical or a social, i.e. intentional agent. More
generally, the feature [ human x ] must be construed as a "dotted" combination tantamount
to the very basis of the mind-body-problem. Now, the choice between the physical and the
social interpretation is triggered by the time adverbials two hours ago vs. two years ago,
which sets the limits for the event e, which in turn affects the interpretation of the time
course covered by [change ...].The intriguing point is that the contrasting elements hour
\s.year have a fixed temporal interpretation with no dotted properties. Hence they must
trigger the relevant cascades via general, extra-linguistic background knowledge.
An intriguing consequence of these observations is the fact that there are clearly
violations of the principle of compositionality, according to which the interpretation of a
complex expression derives from the interpretation of its constituent parts. Simple and
straightforward cases like (31) illustrate the point: The interpretation of Mary, institute
and leave - or rather of the components [ person ], [ act ], [ building ] etc. they contain -
depends on background- or context-information not part of SF at all, and definitely
outside the SF of the respective constituents. Hence even if SF is not supposed to be
systematically separate from conceptual structure and background information, its
compositional aspect, following the logic of its type-structure, must be distinguished from
differently organized aspects of knowledge.
7. Primes and universals
As noted at the beginning, there are strong and plausible tendencies to consider the
primitive elements that make up linguistic expressions as substantive universals,
provided by the language faculty, the formal organization of which is characterized by
Universal Grammar. In other words, UG is assumed to contain a universal repertoire of
basic possibilities, which are activated or triggered by individual experience. Thus
individual, ontogenetic processes select the actual distinctions from the general repertoire,
which is part of the language capacity as such. This means that the distinctions indicated
by features like [ tense ] or [ round ] may or may not appear in the system of PF-features
in a given system L, but if the features appear, they just realize options UG provides as
one of the prerequisites for the acquisition and use of language.
If these considerations are extended to features of SF, two aspects must be
distinguished, which at PF are often considered as essentially one phenomenon, namely
features and their interpretation. For articulation this identification is a plausible
abbreviation, but it doesn't hold at the conceptual side. For SF, the mental systems that
provide the interpretation of linguistic expressions must be conceived as having a rich and
352
IV. Lexical semantics
to a large extent biologically determined structure of their own, independent of (or in
parallel to) the language capacity. Spatial structure is the most obvious, but by no means
the only case in point. General conditions of experience such as three-dimensionality of
spatial orientation, identification of verticality, dimensions and distinctions of color
perception, and many other domains and distinctions may directly correspond to possible
features in SF, very much like parameters of articulation correspond to possible features
of PF. Candidates in this respect are systematic features like [ vertical ] or [ before ] and
their interpretation in C-I, as discussed earlier. Similarly, idiosyncratic features, which
correspond to biologically determined conditions of perception, motor control, or
emotion might be candidates of this sort.To which extent observations about the role of focal
colors or the body-schema and natural patterns of motor activity like walking, grasping,
chewing, swallowing, etc. can be considered as options recruited as features of SF, similar
to articulatory conditions for features of PF, must be left open. Notice, however, that we
are talking about features of SF predetermined by UG, not merely about perceptual or
motor correlates in C-I.
In any case, because of the number and diversity of phenomena to be taken into
account, there is a problem of principle, which precludes generally extending SF from
the notion of universal options predetermined to be triggered by experience. The
problem comes from the wide variety of basic elements that must be taken to be
available in principle, but cannot be conceived without distinct experience. It primarily
concerns dossiers, but also a fair range of idiosyncratic features, and it arises independently
of the question whether one believes in decomposition or takes, like Fodor (1981), all
concepts or possible word meanings to be innate. It is most easily demonstrated with
regard to 3-D models (or visual prototypes) that must belong to many dossiers. We easily
identify cats, dogs, birds, chairs, trumpets, trees, flowers, etc. on the basis of limited
experience, but it does not make sense to stipulate that the characteristic prototypes are all
biologically fixed, ready to be triggered like for instance the prototype of the human
face, which is known to unfold along a fixed maturational path. The same holds for the
wide variety of auditory patterns (beyond those recruited for features of PF), by which
we distinguish frogs, dogs, nightingales, or flutes and trombones, cars, bikes, and trains.
Without adding further aspects and details, it should be obvious that whatever enters
the interpretation of these kinds of basic elements, it cannot belong to conditions that
are given prior to experience and need only be activated. At the same time, the range of
biologically predisposed capacities, by means of which features can be interpreted in C-I
and distinguished in SF, is by no means arbitrary, chaotic, and unstructured, but clearly
determined by principles organizing experience.
Taken together, these considerations suggest a modified perspective on the nature of
primitive elements. Instead of taking features to be generally part of UG, triggered by
experience, one might look at UG as affording principles, patterns, or guidelines along
which features or primes are constructed or extracted, if experience provides the relevant
information. The envisaged distinction between primes and principles is not merely a
matter of terminology. It corresponds, metaphorically speaking, to the difference between
locations fixed on a map and the principles from which a map indicating the locations
would emerge on the basis of exploration. More formally, the difference corresponds to
that between the actual set of natural numbers and its construction from the initial
element by means of the successor operation. Less metaphorically, the substantial content
of the distinction can be explained by analogy with face recognition. Human beings can
16. Semantic features and primes 353
normally distinguish and recognize a large number of faces on the basis of limited and
often short exposure. The resulting knowledge is due to a particular, presumably innate
disposition, but it cannot be assumed to result from triggering individual faces that were
innately known, just waiting for their eventual activation. In other words, the acquisition
of faces depends on incidental, personal information, processed by the biologically
determined capacity to identify and recognize faces, a capacity that ontogenetically emerges
along a biologically determined schema and a fixed developmental path. With these
considerations in mind, UG need not be construed as containing a fixed and finite system of
semantic features, but as providing conditions and principles according to which
distinctions in C-I can make up elements of SF.The origin and nature of these conditions are
twofold, corresponding to the interface character of the emerging elements.
On the one hand the conditions and principles must plausibly be assumed to rely on
characteristic structures of the various interpretive domains which are independently
given by the modules of C-I. Fundamentals of spatial and temporal orientation, causal
explanation, principles of good form, discovered in Gestalt psychology, focal colors,
principles of pertinence and possession, or the body schema and its functional determinants are
likely candidates. On the other hand, conditions on primes are given by the organization
of linguistic expressions, i.e. the format of representations and the principles by which they
are built up and mapped onto each other. In this respect, even effects of morpho-syntax
come into play through constraints on government, agreement, or selection restrictions,
which depend on features at least partially interpreted in C-I, as discussed in § 4. In
general, then, a more or less complex configuration of conditions originating from principles
and distinctions within C-I serves as a feature of SF just in case this element plays a proper
systematic role within the combinatorial structure of linguistic expressions.
Under this perspective, semantic primes would emerge from the interaction of
principles of strictly linguistic structure with those of sensory-motor and various other domains
of mental organization -just as one should expect from their role as components of an
interface level. The effect of this interaction, although resulting from universal
principles of linguistic structure and general cognition, is determined (to different degrees) by
particular languages and their lexical inventory, as convincingly demonstrated by Bow-
erman (2000), Slobin et al. (2008) and much related work.This applies already to fairly
elementary completers like those indicated by [ interior ] in (20d),or [ at ] in (31).Thus
Bowerman (2005) shows that conditions like containment, (vertical) support, tight fit,
or flexibility can make up configurations matched differently by (presumably basic)
elements in different languages, marking conditions that children are aware of at the age of 2.
(33) is a rather incomplete illustration of distinctions that figure in the resulting situation
of actions placing x in/on y.
(33) English Dutch German Korean
block
book
ring
Lego
cup
hat
towel
->
->
->
->
->
->
->
pan
fitted case
finger
Lego stack
table
head
hook
in
in
on
on
on
on
on
in
in
om
op
op
op
aan
in
in
an
auf
auf
auf
an
nehta
kkita
kkita
kkita
nohta
ssuta
kelta
354
IV. Lexical semantics
In a similar vein, joints of distinctions and generalizations are involved in elements like
[ open ] in (20c) and (29),or [ animal ] and [ pet] in (25),but also [ artifact],[ container ],
and [ bottle ] in (12) and many others. Further complexities along the same lines must
be involved in Completers or dossiers specifying concepts like car, desk, bike, dog, spider,
eagle, murky, clever, computer, equation, exaggerate, occupy and all the rest. In this sense,
the majority of lexical items is likely to consist of elements that comprise a basic
categorization in terms of systematic features and a dossier indicating its specificity, similarly to
proper names, which combine - as argued in Bierwisch (2007) - a categorization of their
referent with an individuating dossier. Remember that these specifying conditions are
based on cross-modal principles in terms of which experience is organized, integrated
by what is metaphorically called mental chemistry, such that dossiers arc not in general
logically combined necessary and sufficient criteria of classification.
The upshot of these considerations is that primes of SF are clearly determined by UG,
as they emerge from principles of the faculty of language and are triggered by conditions
of mental organization in general. They are not, however, necessarily elements fixed in
UG prior to experience. They are not learned in the sense of inductive learning, but
induced or triggered by structures of experience, which linguistic expressions interface
with.This may lead, among others, to cross-linguistic differences within the actual
repertoire of primes, as suggested in (33). It does not exclude, however, certain core-elements
like [ cause ], [ become ], [ not ], [ loc ] and quite a few others to be explicitly fixed and
prc-cstablishcd in UG, providing the initial foundation to interface linguistic expressions
with experience.
This view leads to a less paradoxical reading of Fodor's (1981) claim that lexical
concepts must all be innate, including not only nose and triangle, but also elephant,
electron or grandmother, because they cannot logically be decomposed into basic sensory
features. Since Fodor concedes a (very restricted) amount of lexical decomposition, his
claim concerns essentially the status of dossiers. Now, these elements can well be
considered as innate, if at least the principles by which they originate are innate - either
via UG or by conditions of mental organization at large. Under this construal, the
specific dossier of e.g., horse or electron can be genetically determined without being
actually represented prior to triggering experience, just as, e.g., the knowledge of all prime
numbers could be considered as innate if the successor-operation and multiplication are
innate, identifying any prime number on demand, without an infinite set of them being
actually represented. As to the general principles of possible concepts, which under this
view are the innate aspect of possible entities (analogous to prime numbers fixed by
their indivisibility), Fodor assumes a hierarchy of triggering configurations, with a
privileged range of Basic-level concepts in the sense of Rosch (1977) as the elements most
directly triggered. Their particular status is due to two general conditions, ostensibility
and accessibility, both presupposing conceptual configurations to depend on each other
along the triggering-hierarchy (or perhaps network). Ostensibility requires a
configuration to be triggered by way of ostension or a direct presentation of exemplars, given the
elements triggered so far. Accessibility relates to dependence on prior (ontognetic or
intellectual) acquisition of other configurations.Thus the basic-level concept tree is prior
to the supcrordinatc plant, but also to the subordinates oak or elm, etc. Fodor is well
aware that ostension and accessibility are plausible, descriptive constraints on the
underlying principles, the actual specification of which is still a research program rather than a
simple result. What needs to be clarified are at least three problems: First the functional
16. Semantic features and primes 355
integration of conditions from different domains (the mental chemistry) within
complex but unified configurations of CS, second the abstraction or filtering, by which these
configurations are matched with proper, basic elements of SF, and third the dependency
of clusters or configurations in CS along the triggering hierarchy, which cannot
generally be reduced to logical entailment. Some of these dependencies are
characteristically reflected within SF by means of ordinary semantic elements and relations, while
much of crosslinguistic variation, metaphor and other aspects of reasoning are a matter
of essentially extralinguistic principles.
It might finally be added that for quite different reasons a similar orientation seems
to be indicated even with respect to features of PF. As pointed out in Bicrwisch (2001),
if the discoveries about sign language arc taken seriously, the language capacity cannot
be restricted to articulation by means of the vocal tract. As has been shown by Klima &
Bellugi (1979) and subsequent work, sign languages are organized by the same general
principles and with the same expressive power as spoken languages. Now, if the faculty of
language includes the option of sign languages as a possibility activated under particular
conditions, the principles of PF could still be generally valid, using time slots and articu-
latory properties, but the choice and interpretation of features can no longer be based
on elements like [ nasal ], [ voiced ], [ palatal ] etc. Hence even for PF, what UG is likely
to provide are principles that create appropriate structural elements from the available
input information, rather than a fixed repertoire of articulatory features to be triggered.
Logically, of course, UG can be assumed to contain a full set of features for spoken and
signed languages, of which normally only those of PF are triggered, but it is not a very
plausible scenario.
To sum up, semantic primitives are based on principles of Universal Grammar, even
if they do not make up a finite list of fixed elements. UG is assumed to provide the
principles that determine both the mapping of an unlimited set of structured signals to
an equally infinite array of meanings and the format of representations this mapping
requires. These principles specify in particular the formal conditions for possible
elements with interpretation provided by different aspects of experience, such as spatial
orientation, motor control, social interaction, etc. Although this view does not exclude
UG from supporting fixed designated features for certain aspects of linguistic structure,
it allows for the repertoire of primitive semantic elements to be extended on demand.
8. References
Ajdukiewicz, Kazimierz 1935. Uber die syntaktisclie Konnexitat. Studia Philosophica 1,1-27.
Bach, Enunon 1986. The algebra of events. Linguistics & Philosophy 9,1-16.
Barsalou. Lawrence W. 1992. Frames, concepts, and conceptual fields. In: A. Lehrer & E. Kittay
(eds.). Frames, Fields, and Contrasts. Hillsdale, NJ: Eiibaum, 21-74.
Barsalou, Lawrence W. 1999. Perceptual symbol systems. Behavioral and Brain Sciences 22,577-660.
Berlin, Brent & Paul Kay 1969. Basic Color Terms. Berkeley, CA: University of California Press
Biei'wisch, Manfred 1967. Syntactic features in morphology: General problems of so-called
pronominal inflection in German. In: To Honor Roman Jakobson, vol. /.The Hague: Mouton, 239-270.
Biei'wisch, Manfred 1996. How much space gets into language? In: P. Bloom et al. (eds.). Language
and Space. Cambridge, MA:Tlie MIT Press, 31-76.
Biei'wisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M.
Gartner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin:
Akademie Verlag, 227-266.
356
IV. Lexical semantics
Bierwisch, Manfred 2001. Repertoires of primitive elements - prerequisite or result of
acquisition? In: J. Weissenborn & B. Hohle (eds). Approaches to Bootstrapping, vol. II. Amsterdam:
Benjamins, 281-307.
Bierwisch, Manfred 2007. Semantic form as interface. In: A. Spath (ed.). Interfaces and Interface
Conditions. Berlin: de Gruyter, 1-32.
Bierwisch, Manfred 2010. become and its presuppositions. In: R. Bauerle, U. Reyle & T. E.
Zimmermann (eds.). Presupposition and Discourse. Bingley: Emerald, Group Publishing,
189-234.
Bierwisch, Manfred & Ewald Lang 1989 (eds.). Dimensional Adjectives. Berlin: Springer.
Bierwisch, Manfred & Rob Schreuder 1992. From concepts to lexical items. Cognition 42,23-60.
Bloom, Paul, Mary A. Peterson, Lynn Nadel & Merrill F. Garrett 1996. Language and Space.
Cambridge, MA:The MIT Press
Bowerman, Melissa 2000. Where do children's meanings come from? Rethinking the role of
cognition in early semantic development. In: P. Nucci, G. Saxc & E.Turicl (eds.). Culture, Thought, and
Development. Mahwah, NJ: Erlbaum, 199-230.
Bowerman, Melissa 2005. Why can't you 'open' a nut or 'break' a cooked noodle? Learning covert
object categories in action word meanings. In: L. Gershkoff-Stowe & D. H. Rakison (eds.).
Building Object Categories in Developmental Time. Mahwah, NJ: Erlbaum, 209-243.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA:Tlie MIT Press
Chomsky, Noam 1970. Remarks on nominalization. In: R.A.Jacobs & P. S. Rosenbaum (eds.).
Readings in English Transformational Grammar. The Hague: Mouton, 11-61.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris
Chomsky, Noam 1985. Knowledge of Language: Its Nature, Origin, and Use. New York: Praeger.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA:Tlie MIT Press
Chomsky, Noam 2000. Minimalist inquiries. In: R. Martin, D. Michaels & J. Uriagereka (eds.). Step
by Step. Cambridge, MA:Tlic MIT Press, 89-155.
Chomsky, Noam & Morris Halle 1968. The Sound Pattern of English. New York: Harper & Row.
Clements, George N. 1985.The geometry of phonological features. Phonology Yearbook 2,225-252.
Cresswell, Max 1973. Logics and Languages. London: Methuen.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Rcidcl.
Fillmore, Charles J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6,
222-255.
Fodor, Jerry A. 1981. Representations. Cambridge, MA:Tlie MIT Press
Fodor, Jerry A. 1983. The Modularity of Mind. Cambridge, MA:Tlie MIT Press
Fodor, Jerry A., Merrill F. Garrett, Edward Walker & Cornelia Parkes 1980. Against definitions
Cognition 8,263-367.
Goodman, Nelson 1951. The Structure of Appearance. Cambridge, MA: Harvard University Press
Hale. Kenneth & Samuel J. Keyser 1993. Argument structure and the lexical expression of syntactic
relations In. K. Hale & S. J. Keyser (eds). The View from Building 20. Cambridge, M A:The MIT
Press, 53-109.
Halle, Morris 1983. On distinctive features and their articulatory implementation. Natural
Language and Linguistic Theory 1,91-105.
Halle, Morris & Alec Marantz 1993. Distributed morphology and the pieces of inflection.
In: K. Hale & S. J. Keyser (eds). The View from Building 20. Cambridge, MA: The MIT Press,
111-176.
Halle, Morris & Jean-Roger Vcrgnaud 1987. An Essay on Stress. Cambridge. MA: The MIT Press
Hjelmslev, Louis 1935. La Categorie des Cas I. Aarhus: Univcisitctsfoi lagct.
Hjelmslev, Louis 1938. Essai d'une thdorie des morphemes In: Actes du IV* Congres international
de linguists 1936. Kopenhagen, 140-151. Reprinted in: L. Hjelmslev. Essais linguistiques. Copen-
hague: Nordisk Sprog- og Kulturforlag, 1959,152-164.
Hjelmslev. Louis 1953. Prolegomena to a Theory of Language. Madison. WI: The University of
Wisconsin Press
16. Semantic features and primes 357
Jackendoff, Ray 1984, Semantics and Cognition. Cambridge, MA:Tlie MIT Press
Jackendoff, Ray 1990, Semantic Structures. Cambridge, MA:The MIT Press
Jackendoff, Ray 1996. The architecture of the linguistic-spatial interface. In: P Bloom et al. (eds).
Language and Space. Cambridge, MA:The MIT Press, 1-30.
Jackendoff. Ray 2002. Foundations of Language. Oxford: Oxford University Press
Jakobson. Roman 1936. Beitrag zur allgemeinen Kasuslehre. Travaux du Cercle Linguistique de
Prague 6,240-288.
Johnson-Laird, Philip N. 1983. Mental Models: Toward a Cognitive Science of Language, Inference
and Consciousness. Cambridge, MA: Harvard University Press
Kamp, Hans 2001. The importance of presupposition. In: C. Rohrer, A. RoBdeutscher & H. Kamp
(eds). Linguistic Form and its Computation. Stanford, CA: CSLI Publications 207-254.
Kamp, Hans & Uwe Reyle 1993. From Discourse to Logic. Dordrecht: Kluwer.
Katz, Jerrold J. 1972. Semantic Theory. New York: Harper & Row.
Katz, Jcrrold J. & Jerry A. Fodor 1963. The structure of a semantic theory. Language 39,170-210.
Klima, Edward S. & Ursula Bcllugi 1979. The Signs of Language. Cambridge, MA: Harvard
University Press
Laurence, Stephen & Eric Margolis 1999. Concepts and cognitive science. In: E. Margolis &
S. Laurence (eds). Concepts: Core Readings Cambridge, MA:The MIT Press, 3-81.
Lewis David 1972. General semantics In: D. Davidson & G Harnian (eds). Semantics of Natural
Language. Dordrecht: Reidel, 169-218.
Marr, David 1982. Vision. San Francisco, CA: Freeman.
McCawley, James D. 1968. Lexical insertion in a transformational grammar without deep structure.
In: B. J. Darden, C.-J.N. Bailey & A. Davison (eds). Papers from the Regional Meeting of the
Chicago Linguistic Society (= CLS) 4. Chicago, IL: Chicago Linguistic Society, 71-80.
Mill, John S. 1967. System of Logic. Collected Works. Vol. III. Toronto. ON: Toronto University
Press
Miller, George & Philip Johnson-Laird 1976. Language and Perception. Cambridge, MA: Harvard
University Press
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited
and with an introduction by Richmond H.Thomason. New Haven, CT: Yale University Press
Pustcjovsky, James 1995. The Generative Lexicon. Cambridge, MA:The MIT Press
Rosch, Eleanor. 1977. Classification of real-world objects: Origins and representations in
cognition. In: P. N. Johnson-Laird & P. C. Wason (eds). Thinking - Readings in Cognitive Science.
Cambridge: Cambridge University Press, 212-222.
Rosch, Eleanor & Carolin Mervis 1975. Family resemblances: Studies in the internal structure of
categories Cognitive Psychology 7,573-605.
Seal le, John R. 1983. Intentionality. Cambridge: Cambridge University Press
Slobin, Dan I., Melissa Bowerman, Penelope Brown, Sonja EisenbeiB & Bhuvana Narasimlian
2008. Putting things in places: Developmental consequences of linguistic typology. In:
J. Bohnemeyer & E. Pcdcrson (eds). Event Representation. Cambridge, MA: Cambridge
University Press, 1-28.
Svenonius Peter 2007. Interpreting uninterpretable features Linguistic Analysis 33,375-413.
Wierzbicka, Anna 1996. Semantics: Primes and Universals. Oxford: Oxford University Press
Wunderlich, Dieter 1996a. Lexical categories Theoretical Linguistics 22,1-48.
Wunderlich, Dieter 1996b. Dem Freund die Hand auf die Schulter legen. In: G Han as &
M. Bierwisch (eds). Wenn die Semantik arbeitet.Tubingen: Nicmcycr, 331-360.
Wunderlich, Dieter 1997a. A minimalist model of inflectional morphology. In: C. Wilder, H.-M.
Gartner & M. Bierwisch (eds). The Role of Economy Principles in Linguistic Theory. Berlin:
Akademie Verlag, 267-298.
Wunderlich, Dieter 1997b. Cause and the structure of verbs Linguistic Inquiry 28,27-68.
Manfred Bierwisch, Berlin (Germany)
358
IV. Lexical semantics
17. Frameworks of lexical decomposition of verbs
1. Introduction
2. Generative Semantics
3. Lexical decomposition in Montague Semantics
4. Conceptual Semantics
5. LCS decompositions and the MIT Lexicon Project
6. Event Structure Theory
7. Two-level Semantics and Lexical Decompositional Grammar
8. Natural Semantic Metalanguage
9. Lexical Relational Structures
10. Distributed Morphology
11. Outlook
12. References
Abstract
Starting from early approaches within Generative Grammar in the late 1960s, the article
describes and discusses the development of different theoretical frameworks of lexical
decomposition of verbs. It presents the major subsequent conceptions of lexical
decompositions, namely, Dowty's approach to lexical decomposition within Montague
Semantics, Jackendoff s Conceptual Semantics, the LCS decompositions emerging from the MIT
Lexicon Project, Pustejovsky's Event Structure Theory, Wierzbicka's Natural Semantic
Metalanguage, Wunderlich's Lexical Decompositional Grammar, Hale and Kayser's
Lexical Relational Structures, and Distributed Morphology. For each of these approaches,
(i) it sketches their origins and motivation, (ii) it describes the general structure of
decompositions and their location within the theory, (Hi) it explores their explanative value for
major phenomena of verb semantics and syntax, (iv) and it briefly evaluates the impact
of the theory Referring to discussions in article 7 (Engelberg) Lexical decomposition, a
number of theoretical topics are taken up throughout the paper concerning the
interpretation of decompositions, the basic inventory of decompositional predicates, the location of
decompositions on the different levels of linguistic representation (syntactic, semantic,
conceptual), and the role they play for the interfaces between these levels.
1. Introduction
The idea that word meanings are complex has been present ever since people have tried
to explain and define the meaning of words. When asked the meaning of the verb
persuade, a competent speaker of the language would probably say something like (la).This
is not far from what semanticists put into a structured lexical decomposition as in (lb):
(1) a. You persuade somebody if you make somebody believe or do something.
b. persuade(\,y,z): x cause (y believe z) (after Fillmore 1968a: 377)
However, it was not until the mid-1960s that intuitions about the complexity of verb
meanings lead to formal theories of their lexical decomposition.This article will review the
history of lexical decomposition of verbs from that time on. For some general discussion of
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 358-399
17. Frameworks of lexical decomposition of verbs 359
the concept of decomposition and earlier decompositional approaches, cf. article 7 (Engel-
berg) Lexical decomposition. The first theoretical framework to systematically develop
decompositional representations of verb meanings was Generative Semantics (section
2), where decompositions were representations of syntactic deep structure. Later theories
did not locate lexical decompositions on a syntactic level but employed them as
representations on a lexical-semantic level as in Dowty's Montague-based approach (section
3) and the decompositional approaches emerging from the MIT Lexicon Project (section
5) or on a conceptual level as in Jackendoff's Conceptual Semantics (section 4). Other
lexical approaches were characterized by the integration of decompositions into an Event
Structure Theory (section 6), the conception of a comprehensive Natural Semantic
Metalanguage (section 7), and the development of a systematic structure-based linking
mechanism as in Lexical Decomposition Grammar (section 8). Parallel to these developments,
new syntactic approaches to decompositions emerged such as Hale and Kayser's Lexical
Relational Structures (section 9) and Distributed Morphology (section 10).
Throughout the paper a number of theoretical topics will be touched upon that are
discussed in more detail in article 7 (Engelberg) Lexical decomposition. Of particular
interest will be the questions on which level of linguistic representation (syntactic,
semantic, conceptual) decompositions are located, how the interfaces to other levels
of linguistic representation are designed, what evidence for the complexity of word
meaning is assumed, how decompositions are semantically interpreted, what role the
formal structure of decompositions plays in explanations of linguistic phenomena, and
what the basic inventory of decompositional predicates is.
In the following sections, the major theoretical approaches involving lexical
decompositions will be presented. Each approach will be described in four subsections that
(i) sketch the historical development that led to the theory under discussion, (ii) describe
the place lexical decompositions take in these theories and their structural
characteristics, (iii) present the phenomena that arc explained on the basis of decompositions, and
(iv) give a short evaluation of the impact of the theory.
2. Generative Semantics
2.1. Origins and motivation
Generative Semantics was a school of syntactic and semantic research that opposed
certain established views within the community of Generative Grammar. It was active
from the mid-1960s through the mid-1970s. Its major proponents were George Lakoff,
James D. McCawley, Paul M. Postal, and John Robert Ross. (For the history of Generative
Semantics, cf.Binnick 1972; McCawley 1994.)
At that time, the majority view held within Generative Grammar was that there is
a single, basic structural level on which generative rules operate and to which all other
structural levels arc related by interpretive rules. This particular structural level was
syntactic deep structure from which semantic interpretations were derived. It was this view of
'interpretive semantics' that was not shared by the proponents of Generative Semantics.
Although there never was a "standard theory" of Generative Semantics, a number
of assumptions can be identified that were wide-spread in the GS-community
(cf. Lakoff 1970; McCawley 1968; Binnick 1972): (i) Deep structures arc more abstract
than Chomsky (1965) assumed. In particular, they are semantic representations of
sentences, (ii) Syntactic and semantic representations have the same formal status:They are
360
IV. Lexical semantics
structured trees, (iii) There is one system of rules that relates semantic representations
and surface structures via intermediary representations.
Some more specific assumptions that are important when it comes to lexical
decomposition were the following: (iv) In semantic deep structure, lexical items occur as
decompositions where the semantic elements of the decomposition are distributed over
the structured tree (cf. e.g., Lakoff 1970; McCawley 1968; Postal 1971). (v) Some
transformations take place before lexical insertion (prelexical transformations, McCawley 1968).
(vi) Semantic deep structure allows only three categories: V (corresponding to
predicates), NP (corresponding to arguments), and S (corresponding to propositions). Thus,
for example, verbs, adjectives, quantifiers, negation, etc. are all assigned the category V
in semantic deep structure (cf. Lakoff 1970: 115ff; Bach 1968; Postal 1971). (vii)
Transformations can change syntactic relations: Since Floyd broke the glass contains a
structure expressing the glass broke as part of its semantic deep structure, the glass occurs as
subject in semantic deep structure and as object in surface structure.
2.2. Structure and location of decompositions
In Generative Semantics, syntactic and semantic structures do not constitute different
levels or modules of linguistic theory. They are related by syntactically motivated
transformations; that is, although semantic by nature, the lexical decompositions occurring
in semantic deep structure and intermediate levels of sentence derivation must be
considered as parts of syntactic structure.
In contrast to the "Aspects"-model of Generative Syntax (Chomsky 1965), the
terminal constituents of semantic deep structure are semantic and not morphological
entities. Particularly interesting for the development of theories of lexical decomposition
is the fact that semantic deep structure contained abstract verbs like cause or change
(Fig. 17.1) that were sublcxical in the sense that they were part of a lexical
decomposition. Moreover, it was assumed that all predicates that appear in semantic deep
structure are abstract predicates. A basic abstract predicate like believe resembles the actual
word believe in its meaning and its argument-taking properties, but unlike actual words,
it is considered to be unambiguous.These abstract entities attach to the terminal nodes
in semantic deep structure (cf. Fig. 17.1 and for structures and derivations of this sort
Lakoff 1965; McCawley 1968;Binnick 1972).
S
NP V S
I I /\
david CAUSE V S
I Ax
BECOME V S
NOT V NP
I I
ALIVE goliath
Fig. 17.1: Semantic deep structure for David killed Goliath
17. Frameworks of lexical decomposition of verbs
361
Abstract predicates can be moved by a transformation called 'predicate raising', a form
of Chomsky adjunction that has the effect of fusing abstract predicates into predicate
complexes (cf. Fig. 17.2).
I I
NOT ALIVE
Fig. 17.2: Adjunction of ALIVE to NOT by predicate raising
Performing a series of transformations of predicate raising, the tree in Fig. 17.2 is
transformed into the tree in Fig. 17.4 via the tree in Fig. 17.3.
david CAUSE v NP
I I
NOT ALIVE
Fig. 17.3: Adjunction of NOT ALIVE to BECOME by predicate raising
Fig. 17.4: Adjunction of BECOME NOT ALIVE to CAUSE by predicate raising
362
IV. Lexical semantics
Finally, the complex of abstract predicates gets replaced by a lexical item. The lexical
insertion transformation *[cause[become[not[alive]]]] —> kilV yields the tree in Fig. 17.5.
S
NP V NP
I I I
david kill goliath
Fig. 17.5: Lexical insertion of kill, replacing [ CAUSE [ BECOME [ NOT ALIVE ] ] ]
According to McCawley (1968), predicate raising is optional. If it does not take place,
the basic semantic components do not fuse.Thus,A:i//isonly one option of expressing the
semantic deep structure in Fig. 17.1 besides cause to die, cause to become dead, cause to
become not alive, etc.
2.3. Linguistic phenomena
Since Generative Semantics considered itself a general theory of syntax and semantics,
a large array of phenomena were examined within this school, in particular
quantification, auxiliaries, tense, speech acts, etc. (cf. Immlcr 1974; McCawley 1994; Binnick 1972).
In the following, a number of phenomena will be illustrated whose explanation is closely
related to lexical decomposition.
(i) Possible words: Predicate raising operates locally. A predicate can only be adjoined
to an adjacent higher predicate. For a sentence like (2a), Ross (1972:109ff) assumes
the semantic deep structure in (2b). The local nature of predicate raising predicts
that the decompositional structure can be realized as try to find or in case of
adjunction of find to try as look for. A verb conveying the meaning of'try-entertain', on
the other hand, is universally prohibited since entertain cannot adjoin to try.
(2) a. Fritz looked for entertainment.
b. [tkv fritz [find fritz [entertain someone fritz]]]
(ii) Lexical gaps: Languages have lexical gaps in the sense that not all abstract predicate
complexes can be replaced by a lexical item. For example, while the three admitted
structures cause become red (redden), become red (redden), and red (red) can be
replaced by lexical items, the corresponding structures for blue show accidental
gaps:cause become blue (no lexical item), become blue (no lexical item), and blue
(blue) (cf. McCawley 1968). Lexical items missing in English may exist in other
languages, for example, French bleuir. Since the transformation of predicate raising
is restricted in the way described above, Generative Semantics can distinguish
between those non-existing lexical items that are ruled out in principle, namely, by
the restrictions on predicate raising and those that arc just accidentally missing.
(iii) Related sentences: Lexical decompositions allow us to capture identical sentence-
internal relations across a number of related sentences.The relation between Goliath
and not alive in the three sentences David kills Goliath (cf. Fig. 17.1), Goliath dies,
17. Frameworks of lexical decomposition of verbs 363
and Goliath is dead is captured by assigning an identical subtree expressing this
relation to the semantic deep structures of all three sentences. Lakoff (1970: 33ff)
provides an analysis of the adjective thick, the intransitive verb thicken, and the
transitive verb thicken that systematically bridges the syntactic differences between
the three items by exploring their semantic relatedness through decompositions.
(iv) Cross-categorical transfer of polysemy: The fact that the liquid cooled has two
readings ('the liquid became cool' and 'the liquid became cooler') is explained by
inserting into the decomposition the adjective from which the verb is derived where
the adjective can assume the positive or the comparative form (Lakoff 1970).
(v) Selectional restrictions (cf. e.g., Postal 1971:204ff): Generative Semantics is capable
of stating generalizations over selectional restrictions. The fact that the object of
kill and the subjects of die and dead share their selectional restrictions is due to the
fact that all three contain the abstract predicate not alive in their decomposition
(cf. Postal 197 l:204ff).
(vi) Derivational morphology:Terminal nodes in lexical decompositions can be
associated with derivational morphemes. McCawley (1968) suggests a treatment of lexical
causatives like redden in which the causative morpheme en is inserted under the
node become in the decomposition cause become red.
(vii) Reference of pronouns: Particular properties of the reference of pronouns are
explained by relating pronouns to subtrees within lexical decompositions.
(3) a. Floyd melted the glass though it surprised me that he would do so.
b. Floyd melted the glass though it surprised me that it would do so.
In (3b) in contrast to (3a), the pronoun picks up the decompositional subtree the-glass
become melted within the semantic structure [Floyd cause [the-glass become melted]]
(cf. Lakoff 1970; Lakoff & Ross 1972).
(viii)Semantics of adverbials: Adverbials often show scopal ambiguities (cf. Morgan
1969). Sentences like Rebecca almost killed Jamaal can have several readings
depending on the scope of almost (cf. article 7 (Engclbcrg) Lexical decomposition,
section 1.2). Assuming the semantic deep structure [do [Rebecca cause [become
[Jamaal not alive]]]], the three readings can be represented by attaching the adverb
above to either do, cause, or not alive. Similar analyses have been proposed for
adverbs to like again (Morgan 1969), temporal adverbials (cf. McCawley 1971),
and durative adverbials like for four years in the sheriff of Nottingham jailed Robin
Hood for four years, where the adverbial only modifies the resultative substructure
indicating that Robin Hood was in jail (Binnick 1968).
2.4. Evaluation
Generative Semantics has considerably widened the domain of phenomena that syntactic
and semantic theories have to account for. It stimulated research not only in syntax but
particularly in lexical semantics, where structures similar to the lexical decompositions
proposed by Generative Semantics are still being used. Yet, Generative Semantics
experienced quite vigorous opposition, in particular from formal semantics, psycholinguis-
tics, and, of course, proponents of interpretive semantics within Generative Grammar
364
IV. Lexical semantics
(e.g., Chomsky 1970). Some of the critical points pertaining to decompositions were the
following:
(i) Generative Semantics was criticized for its semantic representations not conforming
to standards of formal semantic theories. According to Bartsch & Vennemann
(1972: lOff), the semantic representations of Generative Semantics are not logical
forms: The formation rules for semantic deep structures are uninterpreted;
operations like argument deletion lead to representations that are not well-formed; and
the treatment of quantifiers, negation, and some adverbials as predicates instead of
operators is inadequate. Dowty (1972) emphasizes that Generative Semantics lacks
a theory of reference.
(ii) While the rules defining deep structure and the number of categories were
considerably reduced by Generative Semantics, the analyses were very complex,
and semantic deep structure differed extremely from surface structure (Binnick
1972:14).
(iii) It was never even approximately established how many and what transformations
would be necessary to account for all sentential structures of a language (Immler
1974:121).
(iv) It often remained unclear how the reduction to more primitive predicates should
proceed, that is, what criteria allow one to decide whether dead is decomposed as
NOTALivEora/iveasNOTDEAD (Bartsch & Vennemann 1972:22)-an objection that
also applies to most other approaches to decompositions.
(v) In most cases, a lexical item is not completely equivalent to its decomposition. De
Rijk has shown that while forget and its presumed decomposition 'cease to know'
are alike with respect to presuppositions, there are cases where argument-taking
properties and pragmatic behaviour are not interchangeable. If somebody has
friends in Chicago who suddenly move to Australia, it is appropriate to say / have
ceased to know where to turn for help in Chicago but not / have forgotten where
to turn for help in Chicago (de Rijk, after Morgan 1969: 57ff). More arguments of
this sort can be found in Fodor's (1970) famous article Three Reasons for Not
Deriving "Kill" from "Cause to Die" which McCawley (1994) could partly repudiate
by reference to pragmatic principles.
(vi) A lot of the phenomena that Generative Semantics tried to explain by regular
transformations on a decomposed semantic structure exhibited lexical
idiosyncrasies and were less regular than would be expected under a syntactic approach.
Here are some examples. Pronouns are sometimes able to refer to substructures
in decompositions. Unlike example (3) above, in sentences with to kill, they
cannot pick up the corresponding substructure x become dead (Fodor 1970:429ff);
monomorphemic lexical items often seem to be anaphoric islands:
(4) a. John killed Mary and it surprised me that he did so.
b. John killed Mary and it surprised me *that she did so.
While to cool shows an ambiguity related to the positive and the comparative form of
the adjective (cf. section 2.3), to open only relates to the positive form of the adjective
(Immler 1974: 143f). Sometimes selectional restrictions carry over to related
sentences displaying the same decompositional substructure, in other cases they do not.
17. Frameworks of lexical decomposition of verbs 365
While the child grew is possible, a decompositionally related structure does not allow
child as the corresponding argument of grow: *the parents grew the child (Kandiah
1968). Advcrbials give rise to structurally ambiguous sentence meanings, but they
usually cannot attach to all predicates in a decomposition (Shibatani 1976: ll;Fodor
et al. 1980: 286ff). In particular, Dowty (1979) showed that Generative Semantics
overpredicted adverbial scope, quantifier scope, and syntactic interactions with cyclic
transformations. He concluded that rules of semantic interpretation of lexical items
are different from syntactic transformations (Dowty 1979:284).
(vii) Furthermore, Generative Semantics was confronted with arguments derived
from psycholinguistic evidence (cf. article 7 (Engelberg) Lexical decomposition,
section 3.6).
3. Lexical decomposition in Montague Semantics
3.1. Origins and motivation
The interesting phenomena that emerged from the work of Generative Semanticists,
on the one hand, and the criticism of the syntactic treatment of decompositional
structures, on the other, led to new approaches to word-internal semantic structure. Dowty's
(1972; 1976; 1979) goal was to combine the methods and results of Generative Semantics
with Montague's (1973) rigorously formalized framework of syntax and semantics where
truth and denotation with respect to a model were considered the central notions of
semantics.
Dowty refuted the view that all interesting semantic problems only concern the so-
called logical words and compositional semantics. Instead, he assumed that
compositional semantics crucially depends on an adequate approach to lexical meaning. While
he acknowledged the value of lexical decompositions in that, he considered
decompositions as incomplete unless they come with "an account of what meanings really
are." Dowty (1979: v, 21) believed that the essential features of Generative Semantics
can all be accommodated within Montague Semantics where the logical structures of
Generative Semantics will get a model-theoretic interpretation and the weaknesses of
the syntactic approaches, in particular overgeneration, can be overcome. In Montague
Semantics,sentences are not interpreted directly but are first translated into expressions
of intensional logic.These translations are considered the semantic representations that
correspond to the logical structure (i.e., the semantic deep structure) of Generative
Semantics (Dowty 1979: 22). Two differences between Dowty's approach and classical
Generative Semantics have to be noted: Firstly, directionality is inverse. While
Generative Semantics maps semantic structures onto syntactic surface structure, syntactic
structures are mapped onto semantic representations in Dowty's theory. Secondly, in
Generative Semantics but not in Dowty's theory, derivations can have multiple stages
(Dowty 1979:24ff).
3.2. Structure and location of decompositions
Lexical decompositions are used in Dowty (1979) mainly in order to reveal the different
logical structures of verbs belonging to the several so-called Vendler classes. Vendler
366
IV. Lexical semantics
(1957) classified verbs (and verb phrases) into states, activities, accomplishments, and
achievements according to their behaviour with respect to the progressive aspect
and temporal-aspectual adverbials (cf. also article 48 (Filip) Aspectual class and Aktion-
sart). Dowty (1979: 52ff) extends the list of phenomena associated with these classes,
relates verbs of different classes to decompositional representations as in (5a-c), and also
distinguishes further subtypes of these classes as in (5d-f) (cf. Dowty 1979:123ff).
(5) a. simple statives
*nK «„)
John knows the answer.
b. simple activities
Do(ap[7Cn(<x, <xn)])
John is walking.
c. simple achievements
BECOME[7Cn(a, <Xn)]
John discovered the solution.
d. non-intentional agentive accomplishments
[[D0(ai,[7Cn(a, <Xn)])] CAUSE [BECOME[pm(p, PJ]]]
John broke the window,
c. agentive accomplishments with secondary agent
[[DO(o|t K(a, <Xn)])] CAUSE [D0(P|t [Pm (P, PJ])]]
John forced Bill to speak.
f. intentional agentive accomplishment
do(<x,, [do(<x,,7cn(a, <xn)) cause <|>])
John murdered Bill.
The different classes are built up out of stative predicates (nn, pm) and a small set of
operators (do, become, cause). Within an aspect calculus, the operators involved in the
decompositions are given model-theoretic interpretations, and the stative predicates
are treated as predicate constants. The interpretation of become (6a) is based on von
Wright's (1963) logic of change; the semantics of cause (6b) as a bisentential operator
(cf. Dowty 1972) is mainly derived from Lewis' (1973) counterfactual analysis of
causality, and the less formalized analysis of do (6c) relates to considerations about will and
intentionality in Ross (1972).
(6) a. [become 0] is true at / if there is an interval J containing the initial bound of /
such that —i 0is true at J and there is an interval K containing the final bound of
/such that 0is true at K (Dowty 1979:140).
b. [0cause y/\ is true if and only if (i) 0is a causal factor for if/, and (ii) for all other
0' such that 0' is also a causal factor for f.some -i 0-world is as similar or more
similar to the actual world than any -i 0'-world is.
<f>is a causal factor for if/if and only if there is a series of sentences 0, 0, 0n, \ff
(for n > 0) such that each member of the series depends causally on the previous
member.
17. Frameworks of lexical decomposition of verbs 367
<f> depends causally on if/ if and only if <f>, if/ and -i <f> a -» -i if/ are all true
(Dowty 1979:108f).
C. □[Do(a,0)<-»0 A UNDER_THE_UNMEDlATED_CONTROL_OF_THE_AGENT_0f (<f>)]
(Dowty 1979:118)
By integrating lexical decompositions into Montague Semantics, Dowty wants to expand
the treatment of the class of entailments that hold between English sentences (Dowty
1979:31); he aims to show how logical words interact with non-logical words (e.g., words
from the domain of tense, aspect, and mood with Vendler classes), and he expects that
lexical decompositions help to narrow down the range of possible lexical meanings
(Dowty 1979:34f,125ff).
With respect to the semantic status of lexical decompositions, Dowty (1976: 209ff)
explores two options; namely, that the lexical expression itself is decomposed into a
complex predicate, or that it is related to a predicate constant via a meaning postulate
(cf. article 7 (Engelberg) Lexical decomposition, section 3.1). While he mentions cases
where a strong equivalence between predicate and decomposition provides evidence
for the first option, he also acknowledges that the second option might often be
empirically more adequate since it allows the weakening of the relation between predicate
and decomposition from a biconditional to a conditional. This would account for the
observation that the complex phrase cause to die has a wider extension than kill.
3.3. Linguistic phenomena
Dowty provides explanations for a wide range of phenomena related to Vendler classes.
A few examples are the influence of mass nouns and indefinites on the membership of
expressions in Vendler classes (Dowty 1979: 78ff), the interaction of derivational
morphology with sublexical structures of meaning (Dowty 1979:32,206f, 256ff), explanations
of adverbial and quantifier scope (Dowty 1976:213ff), the imperfective paradox (Dowty
1979:133), the progressive aspect (Dowty 1979:145ff), resultative constructions (Dowty
1979: 219ff), and temporal-aspectual adverbials (for an hour, in an hour) (Dowty 1979:
332ff).
Dowty also addresses aspectual composition. Since he conceives of Vendler classes in
terms of lexical decomposition, he shows that dccompositional structures involving cause,
become, and do are not only introduced via semantically complex verbs but also via
syntactic and morphological processes. For example, accomplishments that arise when verbs
are combined with prepositional phrases get their become operator from the preposition
(7a) where the preposition itself can undergo a process that adds an additional cause
component (7b) (Dowty 1979:21 If). In (7c), the morphological process forming deadjec-
tival inchoatives is associated with a become proposition (Dowty 1979: 206) that can be
expanded with a cause operator in the case of deadjectival causatives (Dowty 1979:307f).
(7) a. John walks to Chicago.
walk'(John) a become be-at'(John, Chicago)
b. John pushes a rock to the fence.
3x[rock'(x) a 3y[Vz[fence'(z) <=> y = z] a push'(John, x) cause become beat'
(x.y)]]
368
IV. Lexical semantics
c. The soup cooled.
3x[Vy[soup'(y) <=> x = y] a become cool'(x)]
3.4. Evaluation
Of the pre-80s work on lexical decomposition, besides the work of Jackendoff (cf. 2.4), it
is probably Dowty's "Word Meaning and Montague Grammar" that still exerts the most
influence on semantic studies. It has initiated a long period of research dominated by
approaches that located lexical decompositions in lexical semantics instead of syntax. It
must be considered a major advancement that Dowty was committed to providing formal
truth-conditions for operators involved in decompositions. Many of the approaches
preceding and following Dowty (1979) lack this degree of explicitness. His account of the
compositional nature of many accomplishments, the interaction of aspectual adverbials
with Vcndlcr classes, and many other phenomena mentioned in section 3.3 served as a
basis for discussion for the approaches to follow. Among the approaches particularly
influenced by Dowty (1979) are van Valin's (1993) decompositions within Role and
Reference Grammar, Levin and Rappaport Hovav's Lexical Conceptual Structures
(cf. section 5), and Pustejovsky's Event Structures (cf. section 6).
4. Conceptual Semantics
4.1. Origins and motivation
Along with lexical decompositions in Generative Semantics, another line of research
emerged where semantic arguments of predicates were associated with the roles they
played in the events denoted by verbs (Gruber 1965; Fillmore 1968b). Thematic roles
like agent, patient, goal, etc. were used to explain how semantic arguments arc mapped
onto syntactic structures (cf. also article 18 (Davis) Thematic roles). Early approaches to
thematic roles assumed that there is a small set of unanalyzable roles that are semanti-
cally stable across the verbal lexicon. Thematic role approaches were confronted with a
number of problems concerning the often vague semantic content of roles, their
coarsegrained nature as a descriptive tool, the lack of reliable diagnostics for them, and the
empirically inadequate syntactic generalizations (cf. the overview in Levin & Rappaport
Hovav2005:35ff; Dowty 1991:553ff).As a consequence, thematic role theories developed
in different ways by decomposing thematic roles into features (Rozwadowska 1988),
by reducing thematic roles to just two generalized macroroles (van Valin 1993) or to
proto-rolcs within a prototype approach based on lexical entailments (Dowty 1991), by
combining them with event structure representations (Grimshaw 1990; Reinhart 2002),
and, in particularly conceiving of them as notions derived from lexical decompositions
(van Valin 1993).This last approach was pursued by Jackendoff (1972; 1976) and was one
of the foundations of a semantic theory that came to be known as Conceptual
Semantics, and which, over the years, has approached a large variety of phenomena beyond
thematic roles (cf. also article 30 (Jackendoff) Conceptual Semantics).
According to Jackendoff's (1983; 1990; 2002) Conceptual Semantics, meanings are
essentially conceptual entities and semantics is "the organization of those thoughts that
language can express" (Jackendoff 2002:123). Meanings are represented on an
autonomous level of cognitive representation called "conceptual structure" that is related to
17. Frameworks of lexical decomposition of verbs 369
syntactic and phonological structure,on the one side,and to non-linguistic cognitive levels
like the visual, the auditory, and the motor system, on the other side. Conceptual
structure is conceived of as a universal model of the mind's construal of the world (Jackendoff
1983:18ff).Thus, Conceptual Semantics differs from formal, model-theoretic semantics
in locating meanings not in the world but in the mind of speakers and hearers.
Therefore, notions like truth and reference do not play the role they play in formal semantics
but are relativized to the speaker's conceptualizations of the world (cf. Jackendoff 2002:
294ff). Jackendoff's approach also differs from many others in not assuming a strict
division between semantic and encyclopaedic meaning and between grammatically relevant
and irrelevant aspects of meaning (Jackendoff 2002:267ff).
4.2. Structure and location of decompositions
Conceptual structure involves a decomposition of meaning into conceptual primitives
(Jackendoff 1983:57ff). Since Jackendoff (1990: lOf) assumes that there is an indefinitely
large number of possible lexical concepts, conceptual primitives must be combined by
generative principles to determine the set of lexical concepts.Thus, most lexical concepts
are considered to be conceptually complex. Decompositions in Conceptual Semantics
differ considerably in content and structure from lexical structure in Generative
Semantics and Montague-based approaches to the lexicon.The sentence in (8a) would yield the
meaning representation in (8b):
(8) a. John entered the room.
b- Uven, GO (Ling JOHN]> [p..|, TO dpi.ee IN (Ling ROOM])])])l
Each pair of square brackets encloses a conceptual constituent, where the capitalized
items denote the conceptual content, which is assigned to a major conceptual category
like Thing, Place, Event, State, Path, Amount, etc. (Jackendoff 1983: 52ff). There are a
number of possibilities for mapping these basic ontological categories onto functor-
argument structures (Jackendoff 1990: 43). For example, the category Event can be
elaborated into two-place functions like go ([Thing], [Path]) or cause ([Thing/Event],
[Event]). Furthermore, each syntactic constituent maps into a conceptual constituent,
and partly language-specific correspondence rules relate syntactic categories to the
particular conceptual categories they can express. In later versions of Conceptual
Semantics, these kinds of structures are enriched by referential features and modifiers,
and the propositional structure is accompanied by a second tier encoding elements of
information structure (cf. Jackendoff 1990:55f; Culicover & Jackendoff 2005:154f).
A lexical entry consists of a phonological, syntactic, and conceptual representation.
As can be seen in Fig. 17.6, most of the conceptual structure in (8b) is projected from
the conceptual structure of the verb that provides a number of open argument slots
(Jackendoff 1990:46).
C enter "^
V
<NP;>
|^ [Event GO ([Thing]/), [path TO ([P,ace IN ([Thing]/)])]] J
Fig. 17.6: Lexical entry for enter
370
IV. Lexical semantics
The conceptual structure is supplemented with a spatial structure that captures finer
distinctions between lexical items in a way closely related to non-linguistic cognitive
modules (Jackendoff 1990:32ff; 2002:345ff).
The same structure can also come about in a compositional way. While enter already
includes the concept of a particular path (in), this information is contributed by a
preposition in the following example:
^9) a. John ran into the room.
b- [Even. GO (Ling JOHND' [path TO dpiace IN (Ling RO<H)])]]
The corresponding lexical entries for the verb and the preposition (Fig. 17.7) account for
the structure in (9b) (Jackendoff 1990:45).
run
V
<PPy>
L [Event GO ([Thing ]/), [path TO ]y] J
[ into ""1
P
<NPy>
L [Path TO ([Race IN ([Thing]/)])] J
Fig. 17.7: Lexical entries for run and into
An important feature of Jackendoff's (1990: 25ff; 2002:356ff) decompositions is the use
of abstract location and motion predicates in order to represent the meaning of words
outside the local domain. For example, a changc-of-statc verb like melt is rendered as 'to
go from solid to liquid'. Building on Gruber's (1965) earlier work, Jackendoff thereby
relates different classes of verbs by analogy. In Jackendoff (1983: 188), he states that
in any semantic field in the domain of events and states "the principle event-, state-,
path-, and place-functions are a subset of those used for the analysis of spatial location
and motion".To make these analogies work, predicates are related to certain fields that
determine the character of the arguments and the sort of inferences. For example, if the
two-place function BE(x,y) is supplemented by the field feature spatial, it indicates that
x is an object and y is its location while the field feature possession indicates that x is an
object and y the person who owns it. Only the latter one involves inferences about the
rights of y to use x (Jackendoff 2002:359ff).
Jackendoff (2002: 335f) dissociates himself from approaches that compare lexical
decompositions with dictionary definitions; the major difference is that the basic
elements of decompositions need not be words themselves, just as phonological features as
the basic components on phonological elements are not sounds. He also claims that the
speaker does not have conscious access to the decompositional structure of lexical items;
this can only be revealed by linguistic analysis.
The question of what elements should be used in decompositions is answered to the
effect that if the meaning of lexeme A entails the meaning of lexeme B, the
decomposition of A includes that of B (Jackendoff 1976). Jackendoff (2002:336f) admits that it is
hard to tell how the lower bound of decomposition can be determined. For example,
17. Frameworks of lexical decomposition of verbs 37_1
some approaches consider cause to be a primitive; others conceive of it as a family of
concepts related through feature decomposition. In contrast to approaches that are only
interested in finding those components that are relevant for the syntax-semantics
interface, he argues that from the point of learnability the search for conceptual primitives
has to be taken seriously beyond what is needed for the syntax. He takes the stance that
it is just a matter of further and more detailed research before the basic components are
uncovered.
4.3. Linguistic phenomena
Lexical decompositions within Conceptual Semantics serve much wider purposes than
in many other approaches. First of all, they are considered a meaning representation in
their own right, that is, not primarily driven by the need to explain linking and other
linguistic interface phenomena. Moreover, as part of conceptual structure, they arc linked
not only to linguistic but also to non-linguistic cognitive domains.
As a semantic theory, Conceptual Semantics has to account for inferences. With
respect to decompositions, this is done by postulating inference rules that link decom-
positional predicates. For example, from any decompositional structure involving
Go(x,y,z), we can infer that BE(x,y) holds before the go event and be(x,z) after it. This
is captured by inference rules as in (10a) that resemble meaning postulates in truth-
conditional semantics (Jackcndoff 1976: 114). Thus, wc can infer from the train went
from Kankakee to Mattoon that the train was in Kankakee before and in Mattoon after
the event. Since other sentences involving a Go-type predicate like the road reached
from Altoona to Johnstown do not share this inference, the predicates are subtyped
to the particular fields transitional versus extensional.lhe extensional version of go is
associated with the inference that one part of x in Go(x,y,z) is located in y and the other
in z (10b) (Jackendoff 1976:139):
(10) a. GOTrans(x,y,z) at /, => for some times t2 and ty such that t2 < /, < ty
BETrans(x,y) at f2and BETrans(x^) at ty
b. GOExl(x,y,z) => for some v and w such that v <zx and wax,
BEEx.(v'y) and BEnx.(w'z)-
One of Jackendoffs main concerns is the mapping between conceptual and syntactic
structure. Part of this mapping is the linking of semantic arguments into syntactic
structures. In Conceptual Semantics this is done via thematic roles. Thematic roles are not
primitives in Conceptual Semantics but can be defined on the basis of decompositions
(Jackcndoff 1972; 1987). They "arc nothing but particular structural configurations in
conceptual structure" (Jackendoff 1990: 47). For example, Jackendoff (1972: 39)
identifies the first argument of the cause relation with the agent role. In later versions of his
theory, Jackendoff (e.g., 1990:125ff) expands the conceptual structure of verbs by adding
an action tier to the representation. While the original concept of decomposition (the
'thematic tier') is couched in terms of location and motion, thereby rendering thematic
roles like theme, goal, or source, the action tier expresses how objects are affected and
accounts for roles like actor and patient. Thus, hit as in the car hit the tree provides a
theme (the car, the "thing in motion" in the example sentence) and a goal (the tree) on
372
IV. Lexical semantics
the thematic tier, and an actor (the car) and a patient (the tree) on the action tier. These
roles can be derived from the representation in Fig. 17.8 (after Jackendoff 1990:125ff).
tINCH [BE ([CAR], [AT [TREE]])]
Ev*nl AFF ([CAR], [TREE]) J
Fig. 17.8: Thematic tier and action tier for the car hit the tree
The thematic roles derived from the thematic and the action tiers are ordered within a
thematic hierarchy.This hierarchy is mapped onto a hierarchy of syntactic functions such
that arguments are linked to syntactic functions according to the rank of their thematic
role in the thematic hierarchy (Jackendoff 1990:258,268f; 2002:143). Strict subcatcgori-
zation can largely be dispensed with. However, Jackendoff (1990: 255ff; 2002:140f) still
acknowledges subcategorizational idiosyncrasies.
Among the many other phenomena treated within Conceptual Semantics and related
to lexical decompositions are argument structure alternations (Jackendoff 1990: 71ff),
aspectual-temporal advcrbials and their relation to the boundedness of events (Jackendoff
1990: 27ff), the semantics of causation (Jackendoff 1990: 130ff), and phenomena at the
border between adjuncts and arguments (Jackendoff 1990:155ff).
4.4. Evaluation
Jackendoff's approach to lexical decomposition has been a cornerstone in the
development of lexical representations since it coveres a wide domain of different classes of
lexical items and phenomena associated with these classes. The wide coverage forced
Jackendoff to expand the structures admitted in his decompositions. This in turn evoked
criticism that his theory lacks sufficient restrictiveness. Furthermore, it has been criticized
that the locational approach to decomposition needs to be stretched too far in order
to make it convincing that it includes all classes of verbs (Levin 1995: 84). Wunderlich
(1996:171) considers Jackendoff's linking principle problematic since it cannot easily be
applied to languages with case systems. In general, Jackendoff s rather heterogeneous set
of correspondence rules has attracted criticism because it involves a considerable
weakening of the idea of semantic compositionality (cf. article 30 (Jackendoff) Conceptual
Semantics for Jackendoff's position).
5. LCS decompositions and the MIT Lexicon Project
5.1. Origins and motivation
An important contribution to the development of decompositional theories of lexical
meaning originated in the MIT Lexicon Project in the mid-eighties. Its main proponents
are Beth Levin and Malka Rappaport Hovav. Their approach is mainly concerned with
the relation between semantic properties of lexical items and their syntactic behaviour.
Thus, it aims at "developing a representation of those aspects of the meaning of a
lexical item which characterize a native speaker's knowledge of its argument structure and
determine the syntactic expression of its arguments" (Levin 1985:4).The meaning
representations were supposed to lead to definitions of semantic classes that show a uniform
syntactic behaviour:
17. Frameworks of lexical decomposition of verbs 373
(1) All arguments bearing a particular semantic relation are systematically expressed in
certain ways. (2) Predicates fall into classes according to the arguments they select and
the syntactic expression of these arguments. (3) Adjuncts are systematically expressed in
the same way(s) and their distribution often seems to be limited to semantically coherent
classes of predicates. (4) There are regular extended uses of predicates that are correlated
with semantic class. (S) Predicates belonging to certain semantic classes display regular
alternations in the expression of their arguments.
(Levin 1985:47)
Proponents of the approach criticized accounts that were solely based on thematic roles,
as they were incapable of explaining diathesis alternations (Levin 1985:49ff; 1995:76ff).
Instead, they proposed decompositions on the basis of Jackcndoff's (1972; 1976)
conceptual structures, Generative Semantics, earlier work by Carter (1976) and Joshi (1974),
and ideas from Hale & Keyser (1987).
5.2. Structure and location of decompositions
The main idea in the early MIT Lexicon Project Working Papers was that two levels
of lexical representation have to be distinguished, a lexical-semantic and a lexical-
syntactic one (Rappaport, Laughren & Levin 1987, later published as Rappaport, Levin &
Laughren 1993; Rappaport & Levin 1988). The lexical-syntactic representation, PAS
("predicate argument structure"), "distinguishes among the arguments of a predicator
only according to how they combine with the predicator in a sentence". PAS, which is
subject to the projection principle (Rappaport & Levin 1988: 16), expresses whether
the role of an NP-argument is assigned (i) by the verb ("direct argument"), (ii) by a
different theta role assigner like a preposition ("indirect argument"), or (iii) by the VP via
predication ("external argument") (Rappaport, Laughren & Levin 1987:3).These three
modes of assignment are illustrated in the PAS for the verb put:
(11) a. WPAS:x<y,P|ocz>
b. put, LCS: [ x cause [ y come to be at z ]]
The lexical-semantic basis of PAS is a lexical decomposition, LCS ("Lexical Conceptual
Structure") (Rappaport, Laughren & Levin 1987: 8). The main task for an LCS-based
approach to lexical semantics is to find the mapping principles between LCS and PAS
and between PAS and syntactic structure (Rappaport 1985:146f).
Not all researchers associated with the MIT Lexicon Project distinguished two levels
of representation. Carter (1988) refers directly to argument positions of predicates within
dccompositionsinordcrtocxplainlinkingphcnomcna.Inlatcrwork,Lcvin and Rappaport
do not make reference to PAS as a level of representation anymore. The distinction
between grammatically relevant and irrelevant lexical information is now reflected in a
distinction between primitive predicates that are embedded in semantic templates, which
are claimed to be part of Universal Grammar, and predicate constants, which reflect the
idiosyncratic part of lexical meaning. The templates pick up distinctions known from
Vendler classes (Vendler 1957) and are referred to as event structure representations.
For example, (12a) is a template for an activity, and (12b) is a template for a particular
kind of accomplishment (Rappaport Hovav & Levin 1998:108):
374
IV. Lexical semantics
(12) a. [x act<imvvaJ
b. [[x act<imvvar>] cause [become [y <STATE>]]]
The templates in (12) illustrate two main characteristics of this approach: templates
can embed other templates and constants can function as modifiers of predicates (e.g.,
<MANNER> with respect to ACT) or as arguments of predicates (e.g., <STATE> with
respect to BECOME). The representations are augmented by well-formedness
conditions that require that each subevent in a template is represented by a lexical head in
syntax and that all participants in lexical structure and all argument XPs in syntax are
mapped onto each other. The principle of Template Augmentation makes it possible
to build up complex lexical representations from simple ones such that the variants of
sweep reflected in (13a-13c) are represented by the decompositions in (14a-14c):
(13) a. Phil swept the floor.
b. Phil swept the floor clean.
c. Phil swept the crumbs onto the floor.
(14) a. [xACT<TOf>y]
b. [[x act <sm:Kf9, y] cause [become [y <STATE>]]]
c. [[x act <xwtKfl> y] cause [become [z <PLACE>]\]
It is assumed that the nature of the constant can determine the range of templates that
can be associated with it (cf. article 19 (Levin & Rappaport Hovav) Lexical Conceptual
Structure).
It should be noticed that the approach based on LCS-type decompositions aims
primarily at explaining the regularities of argument realization. Particularly in its later
versions, it is not intended to capture different kinds of entailments, aspectual behaviour,
or restrictions on adverbial modification. It is assumed that all and only those meaning
components that are relevant to grammar can be isolated and represented as LCS
templates.
5.3. Linguistic phenomena
Within this approach, most research was focused on argument structure alternations that
verbs may undergo.
(15) a. Bill loaded cartons onto the truck,
b. Bill loaded the truck with cartons.
With respect to alternations as in (15) Rappaport & Levin (1988:19ff) argued that a
lexical meaning representation has to account for (i) the near paraphrase relation between
the two variants, (ii) the different linking behaviour of the variants, (iii) and the
interpretation of the goal argument in (15b) as completely affected by the action. They argued
that a theta role approach could not fulfill these requirements: If the roles for load were
considered identical for both variants, for example, <agent, locatum, goal>, requirement
(i) is met but (ii) and (iii) are not since the different argument realizations cannot follow
from identical theta role assignments and the affectedness component in (15b) is not
17. Frameworks of lexical decomposition of verbs 375
expressed. If the roles for the two variants are considered different, for example, <agent,
theme, goal> for (15a) versus <agent, locatum, goal> for (15b), the near paraphrase
relation gets lost, and the completeness interpretation of (15b) still needs stipulativc
interpretation rules.
Within an LCS approach, linking rules make reference not to theta roles but to
substructures of decompositions, for example, "When the LCS of a verb includes one of
the substructures in [16], link the variable represented by x in either substructure to the
direct argument variable in the verb's PAS.
(16) a. ... [ x come to be at LOCATION ] ...
b. ... [ x come to be in STATE ] ..." (Rappaport & Levin 1988:25)
With respect to load, it is assumed that the variant in (17b) with its additional meaning
component of completion entails the variant in (17a) giving rise to the following
representations:
(17) a. load: [ x cause [ y to come to be at z ] /LOAD ]
b. load: [[ x cause [ z to come to be in STATE ]] BY MEANS OF [ x cause [ y to
come to be at z ]] /LOAD ] (Rappaport & Levin 1988:26)
Assuming that the linking rules apply to the main clause within the decomposition, the
two decompositions lead to different PAS representations in which the direct argument
is associated with the theme in (18a) and the goal in (18b):
(18) a. load:x<y,Piocz>
b. load: x<z,Pwilhy>
Thus the three observations, (i) near-paraphrase relation, (ii) different linking behaviour,
and (iii) complete affectedness of the theme in one variant, are accounted for.
Among the contemporary studies that proceeded in a similar vein are Hale &
Kcyscr's (1987) work on the middle construction and Gucrsscl ct al.'s (1985) crosslinguistic
studies on causative, middle, and conative alternations.
It is particularly noteworthy that Levin and Rappaport have greatly expanded the
range of phenomena in the domain of argument structure alternations that a lexical
semantic theory has to cover. Their empirical work on verb classes determined by the
range of argument structure alternations they allow is documented in Levin (1993):
About 80 argument structure alternations in English lead to the definition of almost 200
verb classes.The theoretical work represented by the template approach to LCS focuses
on finding the appropriate constraints that guide the extension of verb meanings and
explain the variance in argument structure alternations.
5.4. Evaluation
The work of Levin, Rappaport Hovav, and other researchers working with LCS-likc
structures had a large influence on later work on the syntax-semantics interface. By
uncovering the richness of the domain of argument structure alternations, they defined
what theories at the lexical syntax-semantic interface have to account for today. Among
376
IV. Lexical semantics
the work inspired by Levin and Rappaport Hovav's theory are approaches whose goal is
to establish linking regularities on more abstract, structural properties of decompositions
(e.g., Lexical Decomposition Grammar, cf. section 8) and attempts to integrate elements
of lexical decompositions into syntactic structure (cf. section 9).
Levin and Rappaport Hovav's work is also typical of a large amount of lexical
semantic research in the 1980s and 90s that has largely given up the semantic rigorous-
ness characteristic of approaches based on formal semantics like Dowty (1979). Less
rigorous semantic relations make theories more susceptible to circular argumentations
when semantic representations are mapped onto syntactic ones (cf. article 7 (Engelberg)
Lexical decomposition, section 3.5). It has also been questioned whether Levin and
Rappaport Hovav's approach allows for a principled account of cross-linguistic variation
and universals (Croft 1998:26;Zubizarreta & Oh 2007:8).
6. Event Structure Theory
6.1. Origins and motivation
In the late 1980s, two papers approaching verb semantics from a philosophical point of
view inspired much research in the domain of aspect and Aktionsart, namely, Vendler's
(1957) classification of expressions based on predicational aspect and Davidson's (1967)
suggestion to reify events in order to explain adverbial modification. In connection
with Dowty's (1979) work on decompositions within Montague semantics, the
intensification of research on grammatical aspect, predicational aspect, and Aktionsarten also
stimulated event-based research in lexical semantics. In particular, Pustejovsky's (1988;
1991a; 1991b) idea of conceiving of verbs as referring to structured events added a new
dimension to decompositional approaches to verb semantics.
6.2. Structure and location of decompositions
According to Pustejovsky (1988; 1991a; 1991b), each verb refers to an event that can
consist of subevents of different types, where 'processes' (P) and 'states' (S) are simple
types that can combine to yield the complex type 'transition' [P S]T via event
composition. A process is conceived of as "a sequence of events identifying the same semantic
expression", a state as "a single event, which is evaluated relative to no other event", and
a transition as "an event identifying a semantic expression, which is evaluated relative
to its opposition" (Pustejovsky 1991a: 56). In addition to this event structure (ES),
Pustejovsky assumes a level LCS', where each subevent is related to a decomposition.
Out of this, a third level of Lexical Conceptual Structure (LCS) can be derived, which
contains a single lexical decomposition. The following examples illustrate how the
meaning of sentences is based on these representational levels:
(19) a. Mary ran.
b. Mary ran to the store.
c. The door is closed.
d. The door closed.
e. John closed the door.
17. Frameworks of lexical decomposition of verbs
377
LCS": [ run(mary) ]
LCS: run(mary)
Fig. 17.9: Representation of Mary ran
T
LCS": [ run(mary) ] [ at(mary, the-store) ]
LCS: cause(act(mary), become(at (mary, the-store)) by run)
Fig. 17.10: Representation of Mary ran to the store
S
ES: I
LCS": [ closed(the-door) ]
LCS: closed(the-door)
Fig. 17.11: Representation of the door is closed
T
ES:
LCS": [ ->closed(the-door) ] [ closed(the-door) ]
LCS: become(closed(the-door))
Fig. 17.12: Representation of the door closed
T
ES:
LCS": [ act(john, the-door) & ->closed(the-door) ] [ closed(the-door) ]
LCS: cause(act(john, the-door), become(closed(the-door)))
Fig. 17.13: Representation of John closed the door
378
IV. Lexical semantics
In terms of Vendler classes, Fig. 17.9 describes an activity, Fig. 17.11 a state, Figs. 17.10
and 17.13 accomplishments, and Fig. 17.12 an achievement. According to Pustejovsky,
achievements and accomplishments have in common that they lead to a result state
and are distinguished in that achievements do not involve an act-predicate at LCS'. As
in many other decompositional theories (Jackendoff 1972; van Valin 1993), thematic
roles are considered epiphenomenal and can be derived from the structured lexical
representations (Pustejovsky 1988:27).
Pustejovsky's event structure theory is part of his attempt to construct a theory of the
Generative Lexicon (Pustejovsky 1995) that, besides Event Structure, also comprises
Qualia Structure, Argument Structure, and Inheritance Structure (Pustejovsky 1991b,
1995). He criticises contemporary theories for focussing too much on the search for a
finite set of semantic primitives:
Rather than assuming a fixed set of primitives, let us assume a fixed number of
generative devices that can be seen as constructing semantic expressions. Just as a formal
language is described in terms of the productions in the grammar rather than its accompanying
vocabulary, a semantic language should be defined by the rules generating the structures for
expressions rather than the vocabulary of primitives itself.
(Pustejovsky 1991a: 54)
6.3. Linguistic phenomena
The empirical coverage of Pustejovsky's theory is wider than many other
decompositional theories: (i) The ambiguity of adverbials as in Lisa rudely departed is explained by
attaching the adverb cither to the whole transition T ('It was rude of Lisa to depart') or
to the embedded process P ('Lisa departed in a rude manner') (Pustejovsky 1988:31f).
(ii) The mapping of Vendler classes onto structural event representations allows for a
formulation of the restrictions on temporal-aspectual adverbials (in five minutes, for five
minutes, etc.) (Pustejovsky 1991a: 73). (iii) The linking behaviour of verbs is related to
LCS' components; for example, the difference between unaccusatives and unergatives
is accounted for by postulating that a participant involved in a predicate opposition (as
in Fig. 12) is mapped onto the internal argument position in syntax while the agentive
participant in an initial subevent (as in Fig. 17.13) is realized as the external argument
(Pustejovsky 1991a: 75). Furthermore, on the basis of Event Structure and Qualia
Structure, a theory of aspectual coercion is developed (Pustejovsky & Bouillon 1995) as well
as an account of lexicalizations of causal relations (Pustejovsky 1995).
6.4. Evaluation
Pustejovsky's concept of event structures has been taken up by many other lexical
semanticists. Some theories included event structures as an additional level of
representation. Grimshaw (1990) proposed a linking theory that combined a thematic hierarchy
and an aspectual hierarchy of arguments based on the involvement of event participants
in Pustcjovsky-stylc subevents. In Lexical Decomposition Grammar, event structures
were introduced as a level expressing sortal restrictions on events in order to explain
the distribution and semantics of adverbials (Wunderlich 1996). It has sometimes been
criticised that Pustejovsky's event structures were not fine-grained enough to explain
17. Frameworks of lexical decomposition of verbs 379
adverbial modification. Consequently, suggestions have been made how to modify and
extend event structures (e.g.,Wunderlich 1996; cf. also Engelberg 2006).
Apart from those studies and theories that make explicit reference to Pustejovsky's
event structures, a number of other approaches emerged in which phasal or mereolog-
ical properties of events are embedded in lexical semantic representations, among them
work by Tenny (1987; 1988), van Voorst (1988), Croft (1998), and some of the syntactic
approaches to be discussed in section 9. Even standard lexical decompositions are often
conceived of as event descriptions and referred to as 'event structures', for example, the
LCS structures in Rappaport Hovav & Levin (1998).
Event structures by themselves can of course not be considered full decompositions
that exhaust the meaning of a lexical item. As we have seen above, they arc always
combined with other lexical information, for example, LCS-style decompositions or thematic
role representations. Depending on the kind of representation they are attached to it is
not quite clear if they constitute an independent level of representation. In Pustejovsky's
approach, event structures are probably by and large derivable from the LCS structures
they are linked to.
7. Two-level Semantics and Lexical Decompositional Grammar
7.1. Origins and motivation
Two-level-Semantics originated in the 1980s with Manfred Bierwisch, Ewald Lang and
Dieter Wunderlich being its main proponents (Bierwisch 1982; 1989; 1997; Bierwisch &
Lang 1989; Wunderlich 1991; 1997a; cf. article 31 (Lang & Maienborn) Two-level
Semantics). In particular, Bierwisch's contribution is remarkable for his attempt to define the
role of the lexicon within Generative Grammar. Lexical Decomposition Grammar
(LDG) emerged out of Two-level Semantics in the early 1990s. LDG has been particularly
concerned with the lexical decompositon of verbs and the relation between semantic,
conceptual, and syntactic structure. It has been developed by Dieter Wunderlich (1991;
1997a; 1997b; 2000; 2006) and other linguists stemming from the Diisseldorf Institute for
General Linguistics (Sticbcls 1996; 1998; 2006; Kaufmann 1995a; 1995b; 1995c; Joppcn &
Wunderlich 1995; Gamerschlag 2005) - with contributions by Paul Kiparsky (cf. article
78 (Kiparsky & Tonhauser) Semantics of inflection).
7.2. Structure and location of decompositions
Two-level semantics argues for separating semantic representations (semantic form, SF)
that are part of the linguistic system, and conceptual representations that are part of the
conceptual system (CS). Only SF is seen as a part of grammar that is integrated into its
computational mechanisms while conceptual structure is a level of reasoning that builds
on more general mental operations. How the interplay between SF and CS can be spelled
out is shown in Maienborn (2003). She argues that spatial PPs can either function as
event-external modifiers, locating the event as a whole as in (20a), or as event-internal
modifiers, specifying a spatial relation that holds within the event as in (20b). While
external event location is semantically straightforward, internal event location is subject
to conceptual knowledge. Not only does the local relation expressed in (20b) require
world knowledge about spatial relations in bike riding events, it is also re-interpreted as
380
IV. Lexical semantics
an instrumental relation that is not lexically provided by the verb or the preposition.
Furthermore, in sentences like (20c) the external argument of the preposition, the woman's
hand or some instrument the woman uses is not even mentioned in the sentence but has
to be supplied by conceptual knowledge.
(20) a. Der Bankraubcr ist aufderlnsel geflohen.
the bank robber has on the island escaped
b. Der Bankrauber ist auf dem Fahrrad geflohen.
the bank robber has on the bicycle escaped
c. Maria zog Paul an den Haaren aus dem Zimmer.
Maria pulled Paul at the hair out of the room
Several tests show that event-internal modifiers attach to the edge of V and event external
modifiers to the edge of VP. Maienborn (2003:487) suggests that the two syntactic
positions trigger slightly different modification processes at the level of SF. While in both
cases the lexical entries entering the semantic composition have the same decomposi-
tional representation (21a, b), the process of external modification identifies the external
argument of the preposition with the event argument of the verb (21c; XQ applying to
the PP, XP to the verb), whereas the process of internal modification turns the external
argument of the preposition into a free variable, a so-called SF parameter (variable v in
21d) that is specified as a constituent part (part-of) of what will later be instantiated with
the event variable.
(21) a. [pauf]: Xy Xx [LOC (x,ON (y))]
b. [vfliehen]: Xx Xe [ESCAPE (e) & THEME (e,x)]
c. MOD: XQXPXx[P(x)&Q(x)]
d. MOD': XQXPXx[P(x)& PART-OF (x,v)&Q(v)]
The compositional processes yield the representation in (22a) for the sentence (20b), the
variable v being uninstantiated.This representation will be enriched at the level of CS
which falls back on a large base of shared conceptual knowledge.The utterance meaning
is achieved via abduction processes which lead to the most economical explanation that
is consistent with what is in the knowledge base. Spelling out the relevant part of the
knowledge base, i.e. knowledge about spatial relations, about event types in terms of
participants serving particular functions, and about the part-whole organization of physical
objects, Maienborn shows how the CS representation for (20b), given in (22b), can be
fomally derived (for details cf. Maienborn 2003:492ff).
(22) a. SF: 3e [ESCAPE (e) & THEME (e, r) & BANK-ROBBER (r)
& PART-OF (e, v) & LOC (v, ON (b)) & BIKE (b)]
b. CS: 3e [EXTR-MOVE (e) & ESCAPE (e) & THEME (e,r)
& BANK-ROBBER (r) & INSTR (e, b) & VEHICLE (b)
& BIKE (b) & SUPPORT (b, r, x(c)) & LOC (r, ON (b))]
The emergence of Lexical Decomposition Grammar out of Two-level Semantics is
particularly interesting for the development of theories of decompositional approaches
17. Frameworks of lexical decomposition of verbs
381
to verb meaning. LDG locates decompositional representations in semantics and
rejects syntactic approaches to decomposition arguing that they have failed to provide
logically equivalent paraphrases and an adequate account of scopal properties of
adverbials (Wunderlich 1997a: 28f). It assumes four levels of representation:
conceptual structure (CS), semantic form (SF), theta structure (TS), and
morphological/syntactic structure (MS, in earlier versions of LDG also called phrase structure, PS). SF is
a decomposition based on type logic that is related to CS by restrictive lexicalization
principles;TS is derived from SF by lambda abstraction and encodes the argument
hierarchy. TS in turn is mapped onto MS by linking principles (Wunderlich 1997a: 32; 2000:
249ff).The four levels are illustrated in Fig. 17.14 with respect to the German verb geben
'give' in (23).
(23) a. (als) [derTorwart [dem Jungen [den Ball gab]]]
(when) the goalkeeper the boy the ball gave
b. [DPxnom [DP>dat [DPzacc geb-AGR* ]]]
TS
Xz
+hr
-Ir
1
Xy
+hr
+lr
' i
ACC
Xx Xs
-hr
+lr
' i
DAT
<^=
'
AGR
NOM
MS
SF
{act(x) & bec Poss(y,z)}(s)
1
CS
x = Agent or Controller
y = Recipient
z = Patient or Affected
Causal event: act(x,s,)
R
esult state: Poss(y,z)(s2)
Fig. 17.14: The representational levels of LDG (Wunderlich 2000:250)
Semantic form docs not provide a complete characterization of a word's meaning. It
serves to represent those properties of predicate-argument structures that make it
possible to account for their grammatical properties (Wunderlich 1996: 170). This level
of representation must be finite and not subject to contingent knowledge. In contrast
to semantic form, conceptual structures draw on an infinite set of properties and can
be subject to contingent knowledge (Wunderlich 1997a: 29). Since SF decompositions
consist of hierarchically ordered binary structures - assuming that a & b branches as
[a [& b]] - arguments can be ranked according to how deeply they are embedded within
this structure.TS in turn preserves the SF hierarchy of arguments in inverse order so that
arguments can be discharged by functional application (Wunderlich 1997a: 44). Each
argument role in TS is characterized as to whether there is a higher or lower role.
Besides their thematic arguments, nouns and verbs also have referential arguments,
which do not undergo linking. Referential arguments are subject to sortal restrictions
that are represented as a structured index on the referential argument (Wunderlich
1997a: 34). With verbs, this sortal index consists of an event structure, similar in form
to Pustejovsky's event structures but slightly differing with respect to the distinctions
expressed (Wunderlich 1996:175ff).
382
IV. Lexical semantics
The relation between the different levels is mediated by a number of principles. For
example, Argument Hierarchy regulates the inverse hierarchical mapping from SF to
TS, and Coherence requires that the subevents corresponding to SF predicates be
interpreted as contemporaneous or causally related (Kaufmann 1995c).Thus, the causal
interpretation of geben in (23) is not explicitly given in SFbut left to Coherence as a general
CS principle of interpretation.
7.3. Linguistic phenomena
The main concern of LDG is argument linking, and its basic assumption is that syntactic
properties of arguments follow from hierarchical structures within semantic form.
Structural linking is based on the assignment of two binary features to the arguments in TS,
[± hr] 'there is a / no higher role' and [± lr] 'there is a / no lower role'. The syntactic
features are associated with these two binary features, dative with [+ hr, + lr], accusative with
[+hr], ergative with [+lr], and nominative/absolutive with [ ] (cf. Fig. 17.14). All and only
the structural arguments have to be matched with a structural linker. Besides structural
linking, it is taken into account that arguments can be suppressed or realized by oblique
markers.This also motivates the distinction between SF andTS (Wunderlich 1997b: 47ff;
2000: 252). The following examples show how structural arguments are matched with
structural linkers in nominative-accusative (NA) and absolutive-ergative (AE) languages
(Wunderlich 1997a: 49):
intransitive verbs:
transitive verbs:
ditransitive verbs:
NA:
AE:
NA:
AE:
NA:
AE:
Xx
[-hr,
nom
abs
Xy
[+hr,
ace
abs
Xz
[+hr,
ace
abs
-lr]
-lr]
-lr]
Xx
[-hr,+lr]
nom
erg
Xy
[+hr,+lr]
dat
dat
Xx
[-hr,+lr]
nom
erg
LDG pursues a strictly lexical account of argument extensions such as possessors,
beneficiaries, or arguments introduced by word formation processes or resultative formation.
These argument extensions are all handled within SF formation by adding predicates to an
existing SF (Stiebels 1996;Wunderlich 2000).Thus, the complex verb in (25a) is represented
as the complex SF (25c) on the basis of (25b) and an argument extension principle.
(25) a. Sie erschrieb sich den Pulitzer-Preis.
she "er"-wrote herself the Pulitzer Prize
'She won the Pulitzer Prize by her writing.'
b. schreib- 'write': XyXxXs WRiTE(x,y)(s)
c. erschreib: XvXuXxXsBy {wRiTE(x,y)(s) & become poss(u,v)}(s)
17. Frameworks of lexical decomposition of verbs 383
These processes are restricted by two constraints on possible verbs, Coherence and
Connexion, the latter one requiring that each predicate in SF share at least one, possibly
implicit, argument with another predicate in SF (Kaufmann 1995c). In (25c), Coherence
guarantees the causal interpretation, and Connexion accounts for the identification of
the agent of writing with the possessor of the prize. The resulting SF is then subject to
Argument Hierarchy and the usual linking principles. As we have seen in (25c), the
morphological operation adds semantic content to SF as it does with other operations
like resultative formation. In other cases, morphology operates on TS in order to change
linking conditions (e.g., passive) (Wunderlich 1997a: 52f).
During the last 20 years, LDG has produced numerous studies on phenomena in
a number of typologically diverse languages, dealing with agreement (Wunderlich
1994), word formation of verbs (Wunderlich 1997b; Stiebels 1996; Gamerschlag 2005),
locative verbs (Kaufmann 1995a), causatives and resultatives (Wunderlich 1997a;
Kaufmann 1995a),dative possessors (Wunderlich 2000),ergative case systems (Joppen &
Wunderlich 1995), and nominal linking (Stiebels 2006).
7.4. Evaluation
In contrast to some other decompositional approaches, LDG adheres to a
compositional approach to meaning and tries to define its relation to current syntactic theories.
In more recent publications (Stiebels 2002; Gamerschlag 2005), LDG has been
reformulated within an optimality theoretic framework. Lexical Decomposition Grammar
is criticized by Taylor (2000), in particular for its division between semantic and
conceptual knowledge. LDG, based on Two-level Semantics, accounts for the different
readings of a lexical item within conceptual structure, leaving lexical entries largely
monosemous.Taylor argues that lexical usage is to a large degree conventionalized and
that the particular readings a word docs or docs not have cannot be construed entirely
from conceptual knowledge. Bierwisch (2002) presents a number of arguments against
the removal of cause from SF decompositions. Further problems, emerging from
structural stipulations, are discussed in article 7 (Engelberg) Lexical decomposition,
section 3.5.
8. Natural Semantic Metalanguage
8.1. Origins and motivation
The theory of Natural Semantic Metalanguage (NSM) originated in the early seventies.
Its main proponents have been Anna Wierzbicka (1972; 1980; 1985; 1992; 1996) and Cliff
Goddard (1998; 2006; 2008a; Goddard & Wierzbicka 2002). NSM theory has been
developed as an attempt to construct a semantic metalanguage (i) that is expressive enough to
cover all the word meanings in natural languages, (ii) that allows noncircular reductive
paraphrases, (iii) that avoids metalinguistic elements that are not part of the natural
language it describes, (iv) that is not ethnocentric, and (v) that makes it possible to uncover
the universal properties of word meanings (cf. for an overview Goddard 2002a; Durst
2003). In order to achieve this, Wierzbicka suggested that the lexicon of a language can
be divided into a small set of indefinable words (semantic primes) and a large set of
words that can be defined in terms of these indefinables.
384
IV. Lexical semantics
8.2. Structure and location of decompositions
The term Natural Semantic Metalanguage is intended to reflect that the semantic primes
used as a metalanguage are actual words of the object language. The indefinables
constitute a finite set and, although they are language-specific, each language-specific set
"realizes, in its own way, the same universal and innate alphabet of human thought"
(Wicrzbicka 1992: 209). More precisely, this implies that the set of semantic primes of a
particular language and their combinatorial potential have the expressive power of a full
natural language and that the sets of semantic primes of all languages are isomorphic
to each other. The set of semantic primes consists of 60 or so elements including such
words as you, this, two, good, know, see, word, happen, die, after, near, if, very, kind of,
and like, each disambiguated by a canonical context (cf. Goddard 2008b). These primes
are claimed to be indefinable and indispensable (cf. Goddard & Wierzbicka 1994b;
Wierzbicka 1996; Goddard 2002b; Wierzbicka 2009). Meaning descriptions within NSM
theory look like the following (Wierzbicka 1992:133):
(26) a. (X is embarrassed)
b. X thinks something like this:
something happened to me now
because of this, people here arc thinking about mc
I don't want this
because of this, I would want to do something
I don't know what I can do
I don't want to be here now
because of this, X feels something bad
It is required for the relationship between the defining decomposition and the defined
term that they be identical in meaning.This is connected to substitutability; the definiens
and the definiendum are supposed to be replaceable by each other without change of
meaning (Wicrzbicka 1988:12).
8.3. Linguistic phenomena
More than any other decompositional theory, NSM theory resembles basic lexicographic
approaches to meaning, in particular, those traditions of English learner lexicography
in which definitions of word meanings are restricted to the non-circular use of a limited
"controlled" defining vocabulary (e.g., Summers 1995).Thus, it is not surprising that NSM
theory tackles word meanings in many semantic fields that have not been at the centre
of attention within other decompositional approaches, for example, pragmatically
complex domains like speech act verbs (Wierzbicka 1987). Other investigations focus on the
cultural differences reflected in words and their alleged equivalents in other languages,
for example, Wierzbicka's (1999) study on emotion words. NSM theory also claims to
be able to render the meaning of syntactic constructions and grammatical categories by
decompositions. An example is given in (27).
(27) a. f'first person plural exclusive']
b. I'm thinking of some people
17. Frameworks of lexical decomposition of verbs 385
I am one of these people
you are not one of these people (Goddard 1998:299)
The claim of NSM theory to be particularly apt as a means to detect subtle cross-linguistic
differences is reflected in Goddard & Wierzbicka (1994a), where studies on a fairly large
number of typologically and genetically diverse languages are presented.
8.4. Evaluation
While many of its critics acknowledge that NSM theory has provided many insights into
particular lexical phenomena, its basic theoretical assumptions have often been subject
to criticism. It has been called into question whether the emphasis on giving dictionary-
style explanations of word meanings is identical to uncovering the native speaker's
knowledge about word meaning. NSM theory has also been criticized for not putting
much effort into providing a foundation for the theory on basic semantics concepts
(cf. Riemer 2006: 352). The lack of a theory of truth, reference, and compositionality
within NSM theory raised severe doubts about whether it can adequately deal with
phenomena like quantification, anaphora, proper names, and presuppositions (Geurts 2003;
Matthewson 2003; Barker 2003). This criticism also affects the claim of the theory to be
able to cover the semantics of the entire lexicon of a language.
9. Lexical Relational Structures
9.1. Origins and motivation
With the decline of Generative Semantics in the 1970s, lexical approaches to
decomposition began to dominate the field. These approaches enriched our understanding of the
complexity of lexical meaning as well as the possibility of generalizations across verb
classes.Then in the late 1980s syntactic developments within the Principles & Parameter
framework suggested more complex structures within the VP. The assumption of VP-
internal subjects and, in particular, Larson's (1988) theory of VP-shells as layered VP-
internal structures suggested the possibility to align certain bits of verb-internal semantic
structure with structural positions in layered VPs. With these developments underway,
the time was ripe for new syntactic approaches to decomposition (cf. also the summary
in Levin & Rappaport Hovav 2005:131 ff).
9.2. Structure and location of decompositions
On the basis of data from binding, quantification, and conjunction with respect to double
object constructions, Larson (1988: 381) argues for the Single Argument Hypothesis,
according to which a head can only have one argument. This forces a layered structure
with multiple heads within VP. Fig. 17.15 exhibits the structure of Mary gave a box to
Tom within this VP-shell. The verb moves to the higher V node by head-movement.
The mapping of a verb's arguments onto the nodes within the VP-shell is determined
by a thcta hierarchy 'agent > theme > goal > obliques' such that the lowest role of a
verb is assigned to the lowest argument position, the next lowest role to the next lowest
argument position, and so on (Larson 1988:382). Thus, there is a weak correspondence
between structural positions and verb semantics in the sense that high argument positions
386
IV. Lexical semantics
are associated with a comparatively high thematic value. However, structural positions
within VP shells are not linked to any stable semantic interpretation.
assigned by Thematic Hierarchy
AGENT THEME GOAL <- give
Fig. 17.15: VP-shell and theta role assignment
Larsonian shells inspired research on the syntactic representation of argument
structure. Particularly influential was the approach pursued by Hale & Keyser (1993; 1997;
2002). They assume that argument structure is handled within a lexicon component
called 1-syntax, which is an integral part of syntax as it obeys syntactic principles. The
basic assumption is that argument structure, also called "lexical relational structure", is
defined in reference to two possible relations between a head and its arguments, namely,
the head-complement and the head-specifier relation (Hale & Keyser 1999: 454). Each
verb projects an unambiguous structure in 1-syntax. In Fig. 17.16, the lexical relational
structure of put as in put the books on the shelf is illustrated.
(on the shelf)
Fig. 17.16: Lexical relational structure of put and head movement of the verb
17. Frameworks of lexical decomposition of verbs 387
Primary evidence for this approach is taken from verbs that are regarded as denominal.
Locational verbs of this sort such as to shelve, to box, or to saddle receive a similar
representation as to put. The Lexical Relational Structure of shelve consists of Larsonian
VP-shells with the noun shelf as complement of the embedded prepositional head. From
there, the noun incorporates into an abstract V head by head movement (cf. Fig. 17.17).
(Hale &Keyser 1993:55ff)
V
V VP
NP V
Fig. 17.17: Lexical relational structure of shelve and incorporation of the noun
In a similar way, unergative verbs like sneeze or dance (28b), which are assumed to have
a structure parallel to expressions like make trouble or have puppies (28a), are derived
by incorporation of a noun into a V head (Hale & Kcyscr 1993:54f).
(28) a. [have v [puppies N] Np] v.
b. [sneezej v [t N] Np] v.
Hale & Kayser (1993:68) assume that "elementary semantic relations" are "associated"
with these syntactic structures:The agent occurs in a Spec position above VP, the theme
in a Spec position of a V that takes a PP/AP complement, and so forth. The argument
structure of shelve is thus related to semantic relations as exhibited in Fig. 17.18.
It is important to keep in mind that Hale and Keyser do not claim that argument
structures are derived from semantics. On the contrary, they assume that "certain
meanings can be assigned to certain structures" in the sense that they are fully determined by
1-syntactic configurations (Hale & Keyser 1993:68; 1999:463).
Fig. 17.18 also reflects two important points of Hale and Kcyscr's theory. Firstly, they
assume a central distinction between verb classes: Contrary to the early exploration of
VP structures as in Fig. 17.17, unergatives and transitives in contrast to unaccusatives
are assumed not to have a subject as part of their argument structure; their subjects
are assigned in s-syntax (Hale & Keyser 1993:76ff). Secondly, Hale and Kayser
emphasize that their approach explains why the number of theta roles is (allegedly) so small,
namely, because there is only a very restricted set of syntactic configurations with which
they can be associated.
388
IV. Lexical semantics
Fig. 17.18: Semantic relations associated with lexical relational structures
(after Hale & Keyserl993:76ff)
9.3. Linguistic phenomena
Hale and Keyser's approach aims to explain why certain argument structures are possible
while others are not. For example, it is argued that sentences like (29a) are ungrammat-
ical because incorporation of a subject argument violates the Empty Category Principle
(Hale & Keyser 1993:60).The ungrammaticality of (29b) is accounted for by the
assumption that unergatives as in (28) do not project a specifier that would allow a transitivity
alternation (Hale & Kcyscr 1999:455). (29c) is argued to be ungrammatical because the
Lexical Relational Structure would have to be parallel to she gave a church her money,
in which church occupies Spec,VP, the "subject" position of the inner VP. Incorporation
from this position violates the Empty Category Principle. By the same reasoning, she
flattened the metal is well-formed, incorporating/?^ from an AP in complement position
while (29d) is not since metal would have to incorporate from an inner subject position.
(29) a. *It cowed a calf, (with the meaning 'a cow calved' and it as expletive)
b. *An injection calved the cow early.
c. *She churched the money.
d. *Shc metalled flat.
This incorporation approach allows Hale and Keyser to explore the parallels in
syntactic behaviour between expressions like give a laugh and laugh which, besides being
17. Frameworks of lexical decomposition of verbs 389
near-synonymous, both fail to transitivize, as well as the differences between
expressions like make trouble and thicken soups where only the latter allows middles and
inchoatives.
9.4. Evaluation
Hale and Keyser's work has stimulated a growing body of research aiming at a syn-
tactification of notions of thematic roles, decompositional and aspectual structures.
Some prominent examples are Mateu (2001), Alexiadou & Anagnostopoulou (2004),
Ertcschik-Shir & Rapoport (2005; 2007), Zubizarrcta & Oh (2007), and Ramchand's
(2008) first phase syntax. Some of this work places a strong emphasis on aspectual
structure, for example, Ritter & Rosen (1998) and Travis (2000). As Travis (2000:181f) argues,
the new syntactic approaches to decompositions avoid many of the pitfalls of Generative
Semantics, which is certainly due to a better understanding of restrictive principles in
recent syntactic theories.
However, Hale and Keyser's work has also attracted heavy criticism from
proponents of Lexical Conceptual Structure (e.g., Culicover & Jackendoff 2005, Rap-
paport Hovav & Levin 2005), Two-level Semantics (e.g., Bierwisch 1997; Kiparsky
1997) as well as from anti-decompositionalist positions (e.g., Fodor & Lepore
1999). The analyses themselves raise many questions. For example, it remains
unexplained which principles exclude a lexical structure of a putative verb to church
(29c) along the lines of she gave the money to the church, that is, a structure
parallel to the one suggested for to shelve (cf. also Kiparsky 1997: 481). It has also been
observed that the position allegedly vacated by the noun in structures as in Fig.
17.18 can actually show lexical material as in Joe buttered the toast with rancid butter
(Culicover & Jackendoff 2005: 102). Many other analyses and assumptions have
also been under attack, among them assumptions about which verbs are denominal
and how their meanings come about (Kiparsky 1997: 485ff; Culicover & Jackendoff
2005:55) as well as predictions about possible transitivity alternations (Kiparsky 1997:
491). Furthermore, one can of course doubt that the number of different theta roles
is as small as Hale and Keyser assume. More empirically oriented approaches to verb
semantics come to dramatically different conclusions (cf. Kiparsky 1997: 478 or work
on Frame Semantics like Ruppcnhofcr ct al. 2006). Even some problems from
Generative Semantics reemerge, such as that expressions like put on a shelf and shelve are not
synonymous, the latter being more specific (Bierwisch 1997: 260). Overgeneralization
is not accounted for, either. The fact that there is a verb to shelve but no semanti-
cally corresponding verb to basket points to a location of decomposition in the lexicon
(Bierwisch 1997: 232f). Reacting to some of the criticism, Hale & Kcyscr (2005) later
modified some assumptions of their theory; for example, they abandoned the idea of
incorporation in favour of a locally operating selection mechanism.
10. Distributed Morphology
10.1. Origins and motivation
Hale & Keyser (1993) and much research inspired by them have attempted to reduce
the role of the lexicon in favour of syntactic representations. An even more radical anti-
390
IV. Lexical semantics
lexicalist approach is pursued by Distributed Morphology (DM), which started out in
Halle & Marantz (1993) and since then has been elaborated by a number of DM
proponents (Marantz 1997; Harley 2002; Harley & Noyer 1999; 2000; Embick 2004; Embick &
Noyer 2007) (cf. also article 81 (Harley) Semantics in Distributed Morphology).
10.2. Structure and location of decompositions
According to Distributed Morphology, syntax does not combine words but generates
structures by combining morphosyntactic features. Terminal nodes, so-called
"morphemes", are bundles of these morphosyntactic features. DM distinguishes f-nodes from
1-nodes. F-nodes correspond to what is traditionally known as functional, closed-class
categories; their insertion at spell-out is deterministic. L-nodes correspond to lexical,
open-class categories; their insertion is not deterministic. Vocabulary items are only
inserted at spell-out. These vocabulary items arc minimally specified in that they only
consist of a phonological string and some information where this string can be inserted
(cf. Fig. 17.19).
abstract syntactic representation
I spell-out
syntax
linguistic expression
I interpretation
conceptual interface
vocabulary
encyclopaedia
Fig. 17.19: The architecture of distributional morphology
(cf. Harley & Noyer 2000:352; Embick & Noyer 2007)
Neither a syntactic category nor any kind of argument structure representation is
included in vocabulary entries as can be seen in example (30a) from Harley & Noyer
(1999: 3). The distribution information in the vocabulary item replaces what is usually
done by theta-roles and selection. In addition to the Vocabulary, there is a component
called Encyclopaedia where vocabulary items are linked to those aspects of meaning that
are not completely predictable from morphosyntactic structure (30b).
(30) a. Vocabulary item: /dog/: [Root] [+count] [+animatc] ...
b. Encyclopaedia item: dog: four legs, canine, pet, sometimes bites etc... chases
balls, in environment "let sleeping s lie", refers to discourse entity who is
better left alone...
While the formal information in vocabulary items in part determines grammatical well-
formedness, the encyclopaedic information guides the appropriate use of expressions. For
example, the oddness of (31a) is attributed to encyclopaedic knowledge.The sentence is
pragmatically anomalous but interpretable: It could refer to some unusual telepathic
transportation event. (31b), on the other hand, is considered ungrammatical because put is
17. Frameworks of lexical decomposition of verbs
391
not properly licensed and, therefore, uninterpretable under any circumstances (Harley &
Noyer 2000:354).
(31) a. Chris thought the book to Mary,
b. *James put yesterday.
Part of speech is reflected in DM by the constellation in which a root morpheme occurs.
For example, a root is a noun if its nearest c-commanding f-node is a determiner and a
verb if its nearest c-commanding f-nodes are v, aspect, and tense. Not only are lexical
entries more reduced than in approaches based on Hale & Keyser (1993), there is also no
particular part of syntax corresponding to 1-syntax (cf. for this overview Harley & Noyer
1999,2000; Embick & Noyer 2007).
10.3. Linguistic phenomena
Distributed Morphology has been applied to all kinds of phenomena in the domain of
inflectional and derivational morphology. Some work has also been done with respect to
the argument structure of verbs and nominalizations. One of the main topics in this area
is the explanation of the range of possible argument structure alternations. A typical set
of data is given in (32) and (33) (taken from Harley & Noyer 2000:362).
(32) a. John grows tomatoes.
b. Tomatoes grow.
c. The insects destroyed the crop.
d. *The crops destroyed.
(33) a. the growth of the tomatoes
b. the tomatoes' growth
c. *John's growth of the tomatoes
d. the crop's destruction
e. the insects' destruction of the crop
It has to be explained why grow, but not destroy, has an intransitive variant and why
destruction, but not growth, allows the realization of the causer argument in Spec, DP
(for the following, cf. Harley & Noyer 2000: 356ff). Syntactic structures are based on
VP-shells. For each node in these structures, there is a set of possible items that can fill
this position (with LP corresponding approximately to VP):
a.
b.
c.
d.
e.
node
Spec,vP
v head
Spec,LP
L head
Comp,LP
possible filling
DP,0
happen/become, cause, be
DP,0
1-node
DP,0
Picking from this menu, a number of different syntactic configurations can be created:
392
IV. Lexical semantics
(35)
a.
b.
c.
d.
e.
f.
Spec
DP
0
DP
DP
0
0
j£
y
CAUSE
BECOME
CAUSE
CAUSE
BECOME
BE
Spec.LP
0
0
0
DP
0
DP
L
Comp.LP
DP
DP
DP
DP
DP
DP
(example)
grow (tr.)
grow (itr.)
destroy
give
arrive
know
The items filling the v head are the only ones conceived of as having selectional
properties: cause, but not become or be, selects an external argument.The grammaticality of the
configurations in (35) is also determined by the licensing environment specified in the
Vocabulary (cf. Fig. 17.20).
VOCABULARY
Phonology Licensing environment
ENCYCLOPAEDIA
Fig. 17.20: Vocabulary and encyclopaedic entries of verbs (after Harley & Noyer 2000:361)
Thus, the transitive and intransitive uses of the roots in (32a) through (32c) are reflected
in the syntactic structures in (36), where cause and become are realized as zero
morphemes (cf. article 81 (Harley) Semantics in Distributed Morphology) and the LP head is
assumed to denote a resulting state:
(36) a. [iP[DPJohn][,, cause [LP grown [Dp tomatoes]]]]
b. [v. become [Lp grown [Dp tomatoes]]]
c. [iP [DP the insects] [|r. cause [Lp destroyed [Dp the crop]]]]
The fact that destroy does not allow an intransitive variant is due to the fact that its
licensing environment requires embedding under cause while grow is underspecified
in this respect. The explanation for the nominalization data in (33) relies on the
assumption that Spec,DP is not as semantically loaded as Spec,vp. It is further assumed that by
encyclopaedic knowledge destroy always requires external causation while grow refers
inherently to an internally caused spontaneous activity, which is optionally facilitated by
some agent. Since cause is only implied with destroy but not with grow, only the causer
of destroy can be interpreted in a semantically underspecified Spec,DP position. The fact
that some verbs like explode behave partly like grow, in allowing the transitive-intransitive
alternation, and partly like destroy, in allowing the realization of the causer in nominal-
izations, is explained by the assumption that the events denoted by such roots can occur
spontaneously (internal causation) but can also be directly brought about by some
agent (external causation) (cf. also Marantz 1997). In summary, the phenomena in (32)
are traced back to syntactic regularities, those in (33) to encyclopaedic, that is, pragmatic
conditions.
17. Frameworks of lexical decomposition of verbs 393
10.4. Evaluation
While some other approaches to argument structure share a number of assumptions with
DM - for example, they also operate on category-neutral roots (e.g., Borer 2005; Arad
2002) - of course the radical theses of Distributed Morphology have also drawn some
criticism. It has been doubted that all the differences in the syntactic behaviour of verbs
can be accounted for with a syntax-free lexicon (cf.c.g., Ramchand 2008).
Cross-linguistic differences might also pose some problems. For example, it is
assumed that verbs allowing the unaccusative-transitive alternation are distinguished
on the basis of encyclopaedic semantic knowledge from those that do not (Embick
2004: 139). The causative variant of the showcase example grow in (32a) is
grammatically licensed and pragmatically acceptable because of encyclopaedic
knowledge. However, it is not clear why German wachsen 'grow' does not have a causative
variant.The alternation would be expected since wachsen does not seem to differ from
grow in its encyclopaedic properties. Moreover, many other verbs like to dry 'trocknen'
(37) demonstrate that German does not show any kind of structural aversion to
alternations of this sort.
(37) a. Der Salat trocknet / wachst.
'The lettuce dries / grows.'
b. Peter trocknet Salat / *wachst Salat.
'Peter dries lettuce / grows lettuce.'
The way the line is drawn between grammatical and pragmatic (un)acceptability also
poses some problems. If the use of put with only one argument is considered ungrammat-
ical, then how can similar uses of three-place verbs like German stellen 'put (in upright
position)' and geben 'give' be explained (38)?
(38) a. Er gibt.'He deals (in a card game).'
b. Sie stellt.'She plays a volleyball such that somebody can smash it.'
Since they are ruled out by grammar, encyclopaedic knowledge cannot save these
examples by assigning them an idiomatic meaning. Thus, it might turn out that sometimes
argument-structure flexibility is not as general as DM's encyclopaedia suggests, and
grammatical restrictions are not as strict as syntax and DM's vocabulary predict.
11. Outlook
The overview has shown that stances on lexical decomposition still differ widely, in
particular with respect to the questions of where to locate lexical decompositions, how to
interpret them, and how to justify them. It has to be noted that most work on lexical
decompositions has not been accompanied by extensive empirical research. With the rise
of new methods in the domain of corpus analysis, grammaticality judgements, and psy-
cholinguistics (cf. article 7 (Engelberg) Lexical decomposition, section 3.6), the empirical
basis for further decompositional theories will alter dramatically. It remains to be seen
how theories of the sort presented here will cope with the empirical turn in contemporary
linguistics.
394
IV. Lexical semantics
12. References
Alexiadou, Artemis & Elena Anagnostopoulou 2004. Voice morphology in the causative-inchoative
alternation: Evidence for a non-unified structural analysis of unaccustives. In: A. Alexiadou.
E. Anagnostopoulou & M. Everaert (eds.). The Unaccusativity Puzzle. Explorations of the
Syntax-Lexicon Interface. Oxford: Oxford University Press, 114-136.
Arad, Maya 2002. Universal features and language-particular morphemes. In: A. Alexiadou (ed.).
Theoretical Approaches to Universals. Amsterdam: Benjamins, 15-29.
Bach, Emmon 1968. Nouns and noun phrases. In: E. Bach & R.T. Harms (eds.). Universals in
Linguistic Theory. New York: Holt, Rinehart & Winston, 90-122.
Barker, Chris 2003. Paraphrase is not enough. Theoretical Linguistics 29.201-209.
Bartsch, Renate & Theo Vennemann 1972. Semantic Structures. A Study in the Relation between
Semantics and Syntax. Frankfurt/M.: Athenaum.
Bierwisch, Manfred 1982. Formal and lexical semantics. Linguistische Berichte 30,3-17.
Bierwisch, Manfred 1989. Event nominalizations: Proposals and problems. In: W. Motsch (ed.).
Wortstruktur und Satzstruktur. Berlin: Akademie Verlag. 1-73.
Bierwisch, Manfred 1997. Lexical information from a minimalist point of view. In: C. Wilder, H.-M.
Gartner & M. Bierwisch (eds.). The Role of Economy Principles in Linguistic Theory. Berlin:
Akademie Verlag, 227-266.
Bierwisch, Manfred 2002. A case for cause. In: I. Kaufmann & B. Stiebels (eds.). More than Words:
A Festschrift for Dieter Wunderlich. Berlin: Akademie Verlag. 327-353.
Bicrwisch, Manfred & Ewald Lang (eds.) 1989. Dimensional Adjectives. Grammatical Structure and
Conceptual Interpretation. Berlin: Springer.
Binnick, Robert I. 1968. On the nature of the 'lexical item'. In: B. J. Dai den, C.-J. N. Bailey &
A. Davison (eds.). Papers from the Fourth Regional Meeting of the Chicago Linguistic Society
(= CLS). Chicago. IL: Chicago Linguistic Society, 1-13.
Binnick. Robert J. 1972. Zur Entwicklung der generativen Semantik. In: W. Abraham & R. J.
Binnick (eds.). Generative Semantik. Frankfurt/M.: Athenaum, 1-48.
Borer, Hagit 2005. Structuring Sense, vol. 2: The Normal Course of Events. Oxford: Oxford
University Press.
Carter, Richard J. 1976. Some constraints on possible words. Semantikos 1,27-66.
Carter, Richard 1988. Arguing for semantic representations. In: B. Levin & C. Tenny (eds.). On
Linking: Papers by Richard Carter. Lexicon Project Working Papers 25. Cambridge, MA: Center
for Cognitive Science, MIT, 139-166.
Chomsky, Noam 1965. Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press.
Chomsky, Noam 1970. Some empirical issues in the theory of transformational grammar. In: P. S.
Peters (ed.). Goals of Linguistic Theory. Englcwood Cliffs, NJ: Prentice Hall, 63-130.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The
Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI Publications, 21-63.
Culicover, Peter W. & Ray Jackendoff 2005. Simpler Syntax. Oxford: Oxford University Press.
Davidson, Donald 1967. The logical form of action sentences. In: N. Reseller (ed.). The Logic of
Decision and Action. Pittsburgh, PA: University of Pittsburgh Press, 81-95.
Dowty, David R. 1972. Studies in the Logic of Verb Aspect and Time Reference in English. Ph.D.
dissertation. University of Texas, Austin,TX.
Dowty, David R. 1976. Montague Grammar and the lexical decomposition of causative verbs. In:
B. Partee (ed.). Montague Grammar. New York: Academic Press, 201-246.
Dowty. David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times
in Generative Semantics and in Montague's PTQ. Dordrecht: Rcidcl.
Dowty, David R. 1991.Thematic proto-roles and argument selection. Language 67'. 547-619.
Durst, Uwe 2003.The Natural Semantic Metalanguage approach to linguistic meaning. Theoretical
Linguistics 29,157-200.
17. Frameworks of lexical decomposition of verbs 395
Einbick, David 2004. Unaccusative syntax and verbal alternations. In: A. Alexiadou, E. Anagno-
stopoulou & M. Everaert (eds.). The Unaccusativity Puzzle. Explorations of the Syntax-Lexicon
Interface. Oxford: Oxford University Press, 137-158.
Einbick, David & Rolf Noyer 2007. Distributed Morphology and the syntax/morphology interface.
In: G. Ramchand & C. Reiss (eds.). The Oxford Handbook of Linguistic Interfaces. Oxford:
Oxford University Press, 289-324.
Engelberg, Stefan 2006. A theory of lexical event structures and its cognitive motivation. In:
D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 235-285.
Erteschik-Shir. Nomi & Tova Rapoport 2005. Path predicates. In: N. Erteschik-Shir & T. Rapoport
(eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford
University Press, 65-86.
Erteschik-Shir, Nomi &Tova Rapoport 2007. Projecting argument structure. The grammar of
hitting and breaking revisited. In: E. Reuland. T. Bhattacharya & G. Spathas (eds.). Argument
Structure. Amsterdam: Benjamins, 17-35.
Fillmore, Charles J. 1968a. Lexical entries for verbs. Foundations of Language 4,373-393.
Fillmore, Charles J. 1968b. The case for case. In: E. Bach & R.T. Harms (eds.). Universals in
Linguistic Theory. New York: Holt, Rinehart & Winston, 1-88.
Fodor, Jerry A. 1970. Three reasons for not deriving'kill' from 'cause to die'. Linguistic Inquiry 1,
429-438.
Fodor, Jerry A., Merrill F. Garrett, Edword C.T. Walker & Cornelia H. Parkes 1980. Against
definitions. Cognition 8,263-367.
Fodor, Jerry & A. & Ernie Lepore 1999. Impossible words? Linguistic Inquiry 30,445-453.
Gamerschlag,Thomas 2005. Komposition und Argumentstniktiir komplexer Verben. Eine lexikalische
Analyse von Verb-Verb-Komposita und Serialverbkonstniktionen. Berlin: Akademie Verlag.
Geurts, Bart 2003. Semantics as lexicography. Theoretical Linguistics 29.223-226.
Goddard, Cliff 1998. Semantic Analysis. A Practical Introduction. Oxford: Oxford University Press.
Goddard, Cliff 2002a. Lexical decomposition II: Conceptual axiology. In: A. D. Cruse et al.
(eds.). Lexikologie - Lexicology. Ein internationales Handbuch zur Natur und Struktur von
Wortern und Wortschatzen - An International Handbook on the Nature and Structure of Words
and Vocabularies, (HSK 21.1). Berlin: de Gruyter, 256-268.
Goddard. Cliff 2002b. The search for the shared semantic core of all languages. In: C. Goddard &
A. Wierzbicka (eds.). Meaning and Universal Grammar, vol. I: Theory and Empirical Finding.
Amsterdam: Benjamins, 5-40.
Goddard. Cliff 2006. Ethnopraginatics: A new paradigm. In: C. Goddard (cd.). Ethnopragmatics.
Understanding Discourse in Cultural Context. Berlin: de Guytcr, 1-30.
Goddard, Cliff 2008a. Natural Semantic Metalanguage: The state of the art. In: C. Goddard (ed.).
Cross-Linguistic Semantics. Amsterdam: Benjamins, 1-34.
Goddard. Cliff 2008b.Towards a systematic table of semantic elements. In: C. Goddard (cd.). Cross-
Linguistic Semantics. Amsterdam: Benjamins, 59-81.
Goddard, Cliff, & Anna Wierzbicka (eds.) 1994a. Semantic and Lexical Universals - Theory and
Empirical Findings. Amsterdam: Benjamins.
Goddard, Cliff & Anna Wierzbicka 1994b. Introducing lexical primitives. In: C. Goddard &
A. Wierzbicka (eds.). Semantic and Lexical Universals - Theory and Empirical Findings.
Amsterdam: Benjamins, 31-54.
Goddard. Cliff & Anna Wierzbicka 2002. Semantic primes and universal grammar. In: C. Goddard &
A. Wierzbicka (eds.). Meaning and Universal Grammar, vol. I: Theory and Empirical Findings.
Amsterdam: Benjamins, 41-85.
Grimshaw. Jane 1990. Argument Structure. Cambridge, MA:The MIT Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA.
Guerssel, Mohamed, Kenneth Hale, Mary Laughren, Beth Levin & Josie White Eagle 1985. A
cross-linguistic study of transitivity alternations. In: W. H. Eilfort, P. D. Kroeber & K. L. Peterson
396
IV. Lexical semantics
(eds.). Papers from the Parasession on Causatives and Agentivity at the 21st Regional Meeting
of the Chicago Linguistic Society (= CLS), April 1985. Chicago, IL: Chicago Linguistic Society,
48-63.
Hale, Ken & Samuel Jay Keyser 1987. A View from the Middle. Lexicon Project Working Papers 10.
Cambridge. MA: Center for Cognitive Science, MIT.
Hale. Ken & Samuel Jay Keyser 1993. On argument structure and the syntactic expression of lexical
relations. In: K. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Linguistics in
Honor ofSylvain Bromberger. Cambridge, MA:The MIT Press,53-109.
Hale, Ken & Samuel Jay Keyser 1997. On the complex nature of simple predicators. In: A. Alsina,
J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 29-65.
Hale, Ken & Samuel Jay Keyser 1999. A response to Fodor and Lepore, "Impossible words?"
Linguistic Inquiry 30,453-466.
Hale, Ken & Samuel Jay Keyser 2002. Prolegomenon to a Theory of Argument Structure.
Cambridge, MA:The MIT Press.
Hale, Ken & Samuel Jay Keyser 2005. Aspect and the syntax of argument structure. In:
N. Ertcschik-Shir &T. Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual
Interpretation. Oxford: Oxford University Press, 11-41.
Halle, Morris & Alec Marantz 1993. Distributed Morphology and the pieces of inflection. In:
K. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in Honor ofSylvain Bomberger.
Cambridge, MA:The MIT Press, 111-176.
Harley, Heidi 2002. Possession and the double object construction. In: P. Pica & J. Rooryck (eds.).
Linguistic Variation Yearbook, vol. 2. Amsterdam: Benjamins, 31-70.
Harley, Heidi & Rolf Noyer 1999. Distributed Morphology. GLOTInternational 4,3-9.
Harley, Heidi & Rolf Noyer 2000. Formal versus encyclopedic properties of vocabulary: Evidence
from nominalizations. In: B. Peelers (ed.). The Lexicon-Encyclopedia Interface. Amsterdam:
Elsevier. 349-375.
Immler, Manfred 1974. Generative Syntax - Generative Semantik. Miinchcn: Fink.
Jackendoff. Ray 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA:The MIT
Press.
Jackendoff. Ray 1976.Toward an explanatory semantic representation. Linguistic Inquiry 7,89-150.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge. MA:Thc MIT Press.
Jackendoff, Ray 1987. Relations in linguistic theory. Linguistic Inquiry 18,369-411.
Jackendoff. Ray 1990. Semantic Structures. Cambridge. MA:The MIT Press.
Jackendoff, Ray 2002. Foundations of Language. Brain, Meaning, Grammar, Evolution. Oxford:
Oxford University Press.
Joppen. Sandra & Dieter Wunderlich 1995. Argument linking in Basque. Lingua 97,123-169.
Joshi, Aravind 1974. Factorization of verbs. In: C. H. Heidi ich (ed.). Semantics and Communication.
Amsterdam: North-Holland, 251-283.
Kandiah.Thiru 1968.Transformational Grammar and the layering of structure in Tamil. Journal of
Linguistics 4,217-254.
Kaufmann, Ingrid 1995a. Konzeptuelle Grundlagen semantischer Dekompositionsstrukturen. Die
Kombinatorik lokaler Verben und pradikativer Komplemente. Tubingen: Niemeyer.
Kaufmann, Ingrid 1995b. O- and D-predicates: A semantic approach to the u n accusative-
unergative distinction. Journal of Semantics 12,377-427.
Kaufmann, Ingrid 1995c. What is an (im-)possible verb? Restrictions on semantic form and their
consequences for argument structure. Folia Linguistica 29,67-103.
Kiparsky, Paul 1997. Remarks on dcnominal verbs. In: A. Alsina. J. Bresnan & P. Sells (eds.).
Complex Predicates. Stanford. CA: CSLI Publications, 473-499.
Lakoff, George 1965. Irregularity in Syntax. Ph.D. dissertation. Indiana University. Bloomington.
IN. Reprinted: New York: Holt, Rinehart & Winston. 1970.
Lakoff, George & John Robert Ross 1972. A note on anaphoric islands and causatives. Linguistic
Inquiry 3, 121-125.
17. Frameworks of lexical decomposition of verbs 397
Larson, Richard K. 1988. On the double object construction. Linguistic Inquiry 19,335-391.
Levin, Beth 1985. Lexical semantics in review: An introduction. In: B. Levin (ed.). Lexical Semantics
in Review. Cambridge, MA:The MIT Press, 1-62.
Levin, Beth 1993. English Verb Classes and Alternations. A Preliminary Investigation. Chicago, IL:
The University of Chicago Press.
Levin, Beth 1995. Approaches to lexical semantic representation. In: D. E. Walker, A. Zampolli &
N. Calzolari (eds.). Automating the Lexicon: Research and Practice in a Multilingual
Environment. Oxford: Oxford University Press, 53-91.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge
University Press.
Lewis, David 1973. Causation. The Journal of Philosophy 70,556-567.
Maienborn, Claudia 2003. Event-internal modifiers: Semantic underspecification and conceptual
interpretation. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds.). Modifying Adjuncts.
Berlin: Mouton de Gruyter, 475-509.
Marantz, Alec 1997. No escape from syntax: Don't try morphological analysis in the privacy of your
own lexicon. In: A. Dimitriadis, L. Siegel, C Surek-Clark & A. Williams (eds.). Proceedings of the
21st Annual Penn Linguistics Colloquium. Philadelphia, PA: University of Pennsylvania Press,
201-225.
Mateu, Jaume 2001. Unselected objects. In: N. Dehe- & A. Wanner (cds.). Structural Aspects of
Semantically Complex Verbs. Frankfurt/M.: Lang, 83-104.
Matthewson, Lisa 2003. Is the meta-language really natural? Theoretical Linguistics 29,263-274.
McCawley, James D. 1968. Lexical insertion in a Transformational Grammar without deep structure.
In: B. J. Darden, C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of
the Chicago Linguistic Society. Chicago, IL: Chicago Linguistics Society, 71-80.
McCawley, James D. 1971. Prelexical syntax. In: R. J. O'Brien (ed.). Report of the 22nd Annual
Round Table Meeting on Linguistics and Language Studies. Washington, DC: Georgetown
University Press, 19-33.
McCawley, James D. 1994. Generative Semantics. In: R. E. Asher & J. M. Y. Simpson (ed.). The
Encyclopedia of Language and Linguistics. Oxford: Pergamon, 1398-1403.
Montague, Richard 1973. The proper treatment of quantification in ordinary English. In:
J. Hintikka, J. M. E. Moravcsik & P. Suppcs (cds.). Approaches to Natural Language. Proceedings
of the 1970 Stanford Workshop on Grammar and Semantics. Dordrecht: Reidel, 221-242.
Morgan, Jerry L. 1969. On arguing about semantics. Papers in Linguistics 1,49-70.
Postal, Paul M. 1971. On the surface verb 'remind'. In: C. J. Fillmore & D. T. Langendoen (eds.).
Studies in Linguistic Semantics. New York: Holt, Rinehart & Winston, 180-270.
Pustejovsky, James 1988. The geometry of events. In: C. Tenny (ed.). Studies in Generative
Approaches to Aspect. Cambridge, MA:The MIT Press, 19-39.
Pustejovsky, James 1991a. The syntax of event structure. Cognition 41,47-81.
Pustejovsky, James 1991b.The generative lexicon. Computational Linguistics 17,409-441.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA:The MIT Press.
Pustejovsky, James & Pierrette Bouillon 1995. Aspectual coercion and logical polysemy. Journal of
Semantics 12,133-162.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge-
Cambridge University Press.
Rappaport, Malka 1985. Review of Joshi's "Factorization of verbs". In: B. Levin (ed.). Lexical
Semantics in Review. Cambridge, MA:The MIT Press, 137-148.
Rappaport, Malka, Mary Laughrcn & Beth Levin 1987. Levels of Lexical Representation. Lexicon
Project Working Papers 20. Cambridge, MA: MIT.
Rappaport, Malka & Beth Levin 1988. What to do with 9-roles. In: W. Wilkins (ed.). Syntax and
Semantics 21: Thematic Relations. New York: Academic Press, 7-36.
Rappaport, Malka, Beth Levin & Mary Laughren 1993. Levels of lexical representation. In:
J. Pustejovsky (ed.). Semantics and the Lexicon. Dordrecht: Kluwer, 37-54.
398
IV. Lexical semantics
Rappaport Hovav, Malka & Beth Levin 1998. Building verb meanings. In: M. Butt & W. Geuder
(eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI
Publications, 97-134.
Rappaport Hovav, Malka & Beth Levin 2005. Change-of-state verbs: Implications for theories of
argument projection. In: N. Erteschik-Shir &T Rapoport (eds.). The Syntax of Aspect. Deriving
Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 274-286.
Reinhart,Tanya 2002.The theta system - an overview. Theoretical Linguistics 28,229-290.
Riemer, Nick 2006. Reductive paraphrase and meaning: A critique of Wierzbickian semantics.
Linguistics & Philosophy 29,347-379.
Ritter, Elizabeth & Sara Thomas Rosen 1998. Delimiting events in syntax. In: M. Butt & W. Geuder
(eds.). The Projection of Arguments: Lexical Compositional Factors. Stanford, CA: CSLI
Publications, 135-164.
Ross, John Robert 1972. Act. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language.
Dordrecht: Reidel,70-126.
Rozwadowska, Bozena 1988.Thematic restrictions on derived nominals. In: W. Wilkins (ed.). Syntax
and Semantics 21: Thematic Relations. New York: Academic Press, 147-165.
Ruppenhofer, Josef, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson &
Jan Scheffczyk 2006. Frame Net II: Extended Theory and Practice, http://framenet.icsi.berkeley.
edu/index.php?option=com_wrapper&Itemid= 126. December 30,2006.
Shibatani, Masayoshi 1976. The grammar of causative constructions: A conspectus. In: M. Shibatani
(ed.). The Grammar of Causative Constructions New York: Academic Press, 1-40.
Stiebels, Barbara 1996. Lexikalische Argumente und Adjunkte. Zum semantischen Beitrag von ver-
balen Prdfixen und Partikeln. Berlin: Akademie Verlag.
Stiebels,Barbara 1998. ComplexdenominalverbsinGermanandthemorphology-seinantics interface.
In: G. Booij & J. van Marie (eds.). Yearbook of Morphology 1997. Dordrecht: Kluwer, 265-302.
Stiebels, Barbara 2002. Typologie des Argumentlinkings. Okonomie und Expressivitat. Berlin:
Akademie Verlag.
Stiebels, Barbara 2006. From rags to riches. Nominal linking in contrast to verbal linking. In:
D. Wunderlich (ed.). Advances in the Theory of the Lexicon. Berlin: de Gruyter, 167-234.
Summers, Delia (Med.) 1995. Longman Dictionary of Contemporary English. Munchen:
Langcnschcidt-Longman.
Taylor, John R. 2000. The network model and the two-level model in comparison. In: B. Peeters
(ed.). The Lexicon-Encyclopedia Interface. Amsterdam: Elsevier, 115-141.
Tenny, Carol 1987. Grammaticalizing Aspect and Affectedness. Ph.D. dissertation. MIT, Cambridge,
MA.
Tenny, Carol 1988. The aspectual interface hypothesis: The connection between syntax and lexical
semantics. In: C. Tenny (ed.). Studies in Generative Approaches to Aspect. Cambridge, MA: The
MIT Press, 1-18.
Travis, Lisa 2000. Event structure in syntax. In: C. Tenny (ed.). Events as Grammatical Objects.
The Converging Perspectives of Lexical Semantics and Syntax. Stanford, CA: CSLI Publications,
145-185.
van Valin, Robert D. Jr. 1993. A synopsis of Role and Reference Grammar. In: R. D. van Valin
(ed.). Advances in Role and Reference Grammar. Amsterdam: Benjamins, 1-164.
Vendlei, Zeno 1957. Verbs and times. The Philosophical Review LXVI, 143-160.
van Voorst, Jan 1988. Event Structure. Amsterdam: Benjamins.
Wicrzbicka, Anna 1972. Semantic Primitives. Frankfurt/M.: Athcnaum.
Wicrzbicka. Anna 1980. Lingua Mentalis. The Semantics of Natural Language. Sydney: Academic Press.
Wierzbicka, Anna 1985. Lexicography and Conceptual Analysis. Ann Arbor, MI: Kaioma.
Wierzbicka, Anna 1987. English Speech Act Verbs. A Semantic Dictionary. Sydney: Academic Press.
Wierzbicka, Anna 1988. The Semantics of Grammar. Amsterdam: Benjamins.
Wierzbicka, Anna 1992. Semantics, Culture, and Cognition. Universal Human Concepts in Culture-
Specific Configurations. Oxford: Oxford University Press.
18. Thematic roles
399
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wierzbicka, Anna 1999. Emotions across Languages and Cultures. Cambridge: Cambridge
University Press.
Wierzbicka, Anna 2009. The theory of the mental lexicon. In: S. Kempgen et al. (eds.). Die slav-
ischen Sprachen - The Slavic Languages. Ein internationales Ilandbuch zu ihrer Geschichte, ihrer
Struktur und ihrer Erforschung -An International Handbook of their History, their Structure and
their Investigation. (HSK 32.1). Berlin: de Gruyter, 848-863.
von Wright, George Henrik 1963. Norm and Action. London: Routledge & Kegan Paul.
Wunderlich, Dieter 1991. How do prepositional phrases fit into compositional syntax and
semantics? Linguistics 25,283-331.
Wunderlich, Dieter 1994.Towards a lexicon-based theory of agreement. Theoretical Linguistics 20,
1-35.
Wunderlich, Dieter 1996. Models of lexical decomposition. In: E. Weigand & F. Hundsnurscher
(eds.). Lexical Structures and Language Use. Proceedings of the International Conference on
Lexicology and Lexical Semantics, Milnster, September 13-15,1994. Vol. 1: Plenary Lectures and
Session Papers. Tubingen: Nieineyer, 169-183.
Wunderlich, Dieter 1997a. cause and the structure of verbs. Linguistic Inquiry 28,27-68.
Wunderlich, Dieter 1997b. Argument extension by lexical adjunction. Journal of Semantics 14,
95-142.
Wunderlich, Dieter 2000. Predicate composition and argument extension as general options - a
study in the interface of semantic and conceptual structure. In: B. Stiebels & D. Wunderlich
(eds.). Lexicon in Focus. Berlin: Akadeinie Verlag, 247-270.
Wunderlich, Dieter 2006. Towards a structural typology of verb classes. In: D. Wunderlich (ed.).
Advances in the Theory of the Lexicon. Berlin: de Gruyter, 57-166.
Zubizarieta, Maria Luisa & Eunjeong Oh 2007. On the Syntactic Composition of Manner and
Motion. Cambridge, MA:Thc MIT Press.
Stefan Engelberg, Mannheim (Germany)
18. Thematic roles
1. Introduction
2. Historical and terminological remarks
3. The nature of thematic roles
4. Thematic role systems
5. Concluding remarks
6. References
Abstract
Thematic roles provide one way of relating situations to their participants. Thematic
roles have been widely invoked both within lexical semantics and in the syntax-semantics
interface, in accounts of a wide range of phenomena, most notably the mapping between
semantic and syntactic arguments (argument realization). This article addresses two sets
of issues. The first concerns the nature of thematic roles in semantic theories: what are
thematic roles, are they specific to individual predicates or more general, how do they
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 399-420
400
IV. Lexical semantics
figure in semantic representations, and what interactions do they have with event and object
individuation and with the semantics of plurality and aspect? The second concerns
properties of systems of thematic roles: what is the inventory of thematic roles, and what
relationships, such as an ordering of roles in a thematic hierarchy, or the consistency of roles across
semantic domains posited by the thematic relations hypothesis, exist among roles? Various
applications of thematic roles will be noted throughout these two sections, in some cases
briefly mentioning alternative accounts that do not rely on them. The conclusion notes
some skepticism about the necessity for thematic roles in linguistic theory.
1. Introduction
Thematic roles provide one way of relating situations to their participants. Somewhat
informally, we can paraphrase this by saying that participant x plays role R in situation
e. Still more informally, the linguistic expression denoting the participant is said to play
that role.
Thematic roles have been widely invoked both within lexical semantics and in the
syntax-semantics interface, in accounts of a wide range of phenomena, including the
mapping between semantic and syntactic arguments (argument realization), controller
choice of infinitival complements and anaphors, constraints on relativization, constraints
on morphological or syntactic phenomena such as passivization, determinants of telicity
and distributivity, patterns of idiom frequencies, and generalizations about the lexicon
and lexical acquisition. Automatic role labeling is now an active area of investigation in
computational linguistics as well (Marquez et al. 2008).
After a brief terminological discussion, this article addresses two sets of issues. The
first concerns the nature of thematic roles in semantic theories: what are thematic roles,
are they specific to individual predicates or more general, how do they figure in semantic
representations, and what interactions do they have with event and object individuation
and with the semantics of plurality and aspect? The second concerns properties of
systems of thematic roles: what is the inventory of thematic roles, and what relationships,
such as an ordering of roles in a thematic hierarchy, or the consistency of roles across
semantic domains posited by the thematic relations hypothesis, exist among roles?
Various applications of thematic roles will be noted throughout these two sections, in some
cases briefly mentioning alternative accounts that do not rely on them. The conclusion
notes some skepticism about the necessity for thematic roles in linguistic theory.
2. Historical and terminological remarks
Panini's karakas are frequently noted as forerunners of thematic roles in modern
linguistics. Grubcr (1965) and Fillmore (1968) arc widely credited with initiating the discourse
on thematic roles within generative grammar and research relating to thematic roles
has blossomed since the 1980s in conjunction with growing interest in the lexicon and
in semantics. A variety of terms appear in the literature to refer to essentially the same
notion of thematic role: thematic relation, theta-role, (deep) case role, and participant role.
A distinction between "broad", general roles and "narrow", predicate-specific roles is
worth noting as well. General roles (also termed absolute roles (Schein 2002) or thematic
role types (Dowty 1989)) apply to a wide range of predicates or events; they include
the well-known Agent, Patient, Theme, Goal, and so on. Predicate-specific roles (also
18. Thematic roles
401
termed relativized (Schein 2002), individual thematic roles (Dowty 1989), or "relation-
specific roles") apply only to a specific event, situation, or predicate type; such roles
as Devourer or Explainer arc examples. There need not be a sharp distinction between
broad and narrow roles; indeed, roles of various degrees of specificity, situated in a sub-
sumption hierarchy, can be posited. In this article the term thematic role will cover both
broad and narrow roles, though some authors restrict its use to broad roles.
3. The nature of thematic roles
Thematic roles have been formally defined as relations, functions, and sets of entailments
or properties. As Rappaport & Levin (1988:17) state: "Thcta-rolcs arc inherently
relational notions; they label relations of arguments to predicators and therefore have no
existence independent of predicators." Within model-theoretic semantics, an informal
definition like the one that begins this article needs to be made explicit in several respects.
First, what are the entities to be related? Are thematic roles present in the semantic
representations of verbs and other predicators (i.e., lexical items that take arguments),
or added compositionally? Are they best viewed as relations or as more complex
constructs such as sets of entailments, bundles of features, or merely epiphenomena defined
in terms of other linguistic representations that need not be reified at all?
3.1. Defining thematic roles in model-theoretic semantics
Attempts to characterize thematic roles explicitly within model-theoretic semantics begin
with works such as Chierchia (1984) and Carlson (1984). Both Chierchia and Carlson
treat thematic roles as relations between an event and a participant in it. Dowty (1989:
80) provides the following version of Chierchia's formulation. Events are regarded as
tuples of individuals, and thematic roles are therefore defined thus:
(1) A 9-rolc 9 is a partial function from the set of events into the set of individuals such
that for any event k, if 0(k) is defined, then 0(k) e k.
For example, given (2a), an event tuple in which Akill' is the intension of the verb kill
and x is the killer of y, the functions Agent and Patient yield the values in (2b) and (2c:)
(2) a. <Akiir,*,;y>
b. Agent((Aki\\',x,y))=x
c. Patient «AkiH\jc,y» = y
This shows how thematic roles can be defined within an ordered argument system for
representing event types. The roles are distinguished by the order of arguments of the
predicate, though there is no expectation that the same roles appear in the same order
for all predicates (some predicates may not have an Agent role at all, for example).
Another representation, extending Davidson's (1967) insights on the representation
of events, is the neo-Davidsonian one, in which thematic roles appear explicitly as
relations between events and participants; see article 34 (Maienborn) Event semantics.The
roles are labeled, but not ordered with respect to one another. In one form of such a
representation, the equivalent of (2) would be (3):
402
IV. Lexical semantics
(3) 3e [killing(e) & Agent(e,*) & Patient(e,.y)]
Here, thematic roles arc binary predicates, taking an eventuality (event or state) as first
argument and a participant in that eventuality as second argument. For a discussion of
other possibilities for the type of the first argument;see Bayer (1997:24-28). Furthermore,
as Bayer (1997: 5) notes, there are two options for indexing arguments in a neo-David-
sonian representation: lexical and compositional. In the former, used, e.g., in Landman
(2000), the lexical entry for a verb includes the thematic roles that connect the verb and
its arguments. This is also effectively the analysis implicit in structured lexical semantic
representations outside model-theoretic semantics, such as Jackendoff (1987,1990), Rap-
paport & Levin (1988), Foley & van Valin (1984), and van Valin (2004). But it is also
possible to pursue a compositional approach, in which the lexical entry contains only the
event type, with thematic roles linking the arguments through some other process, as does
Krifka (1992,1998), who integrates the thematic role assignments of the verb's arguments
through its subcategorization. Bayer (1997: 127-132) points out some difficulties with
Krifka's system, including coordination of VPs that assign different roles to a shared NP.
A position intermediate between the lexical and compositional views is advocated by
Kratzer (1996), who claims that what has been regarded as a verb's external argument is
in fact not an argument at all, although its remaining arguments are present in its lexical
entry. Thus the representation of kill in (3) would lack the clause Agent(e, x), this role
being assigned by a VoiceP (or "little v") above VP. This, Kratzer argues, accounts for
subject/object asymmetries in idiom frequencies and the lack of a true overt agent
argument in gerunds. Svenonius (2007) extends Kratzer's analysis to adpositions. However,
Wechsler (2005) casts doubt on Kratzer's prediction of idiom asymmetries and notes that
some mechanism must still select which role is external to the verb's lexical entry,
particularly in cases where there is more than one agentive participant, such as a commercial
transaction involving both a buyer and a seller.
Characterizing a thematic role as a partial function leaves open the question of how to
determine the function's domain and range; that is, what events is a role defined for, and
which participant does the role pick out? In practice, this problem of determining which
roles arc appropriate for a predicate plagues every system of broad thematic roles, but
it is less serious for roles defined on individual predicates. Dowty (1989:76) defines the
latter in terms of a set of entailments, as in (4):
(4) Given an /i-place predicate 8 and a particular argument xr the individual thematic
role (8, i) is the set of all properties a such that the entailment □[§(*,,... Jtf., ...-*„) ->
a(oc()] holds.
What counts as a "property" must be specified, of course. And as Bayer (1997:119-120)
points out, nothing in this kind of definition tells us which of these properties are important
for, say, argument realization. Formulating such cross-lexicon generalizations demands
properties or relations that are shared across predicates. Dowty defines cross-predicate
roles, or thematic role types, as he terms them, as "the intersection of all the individual
thematic roles" (Dowty 1989: 77). Therefore, as stated by Dowty (1991: 552): "From the
semantic point of view, the most general notion of thematic role (type) is A SET OF
ENTAILMENTS OF A GROUP OF PREDICATES WITH RESPECT TO ONE OF
THE ARGUMENTS OF EACH. (Thus a thematic role type is a kind of second-order
18. Thematic roles
403
property, a property of multiplace predicates indexed by their argument
positions.)."Thematic role types are numerous; however, he argues, linguists will generally be interested in
identifying a fairly small set of these that play some vital role in linguistic theory.
This definition of thematic role types imposes no restrictions on whether an
argument can bear multiple roles to a predicate, whether roles can share entailments in their
defining sets and thus overlap or subsume one another, whether two arguments of a
predicate can bear the same role (though this is ruled out by the functional requirement in
(1)), and whether every argument must be assigned a role. These constraints, if desired,
must be independently specified (Dowty 1989:78-79) As the discussion of thematic role
systems below indicates, various models answer these questions differently. Whether
suitable entailments for linguistically significant broad thematic roles can be found at all is a
further issue that leads Rappaport & Levin (1988,2005), Dowty (1991),Wechsler (1995),
Croft (1991,1998), and many others to doubt the utility of positing such roles at all.
Finally, note that definitions such as(l) and (4) above, or Dowty's thematic role types,
make no reference to morphological, lexical, or syntactic notions. Thus they are agnostic
as to which morphemes, words, or constituents have thematic roles associated with them;
that depends on the semantics assigned to these linguistic entities and whether roles
are defined only for event types or over a broader range of situation types. Verbs are
the prototypical bearers of thematic roles, but nominalizations are typically viewed
as role-bearing, and other nouns, adjectives, and adpositions (Gawron 1983, Wechsler
1995, Scvcnonius 2007), as prcdicators denoting event or situation types, will have roles
associated with them, too.
3.2. Thematic role uniqueness
Many researchers invoking thematic roles have explicitly or implicitly adopted a
criterion of thematic role uniqueness. Informally, this means that only one participant in a
situation bears a given role. Carlson (1984: 271) states this constraint at a lexical level:
"one of the more fundamental constraints is that of'thematic uniqueness' - that no verb
seems to be able to assign the same thematic role to two or more of its arguments." This
echoes the 9-criterion of Chomsky (1981), discussed below. Parsons (1990: 74) defines
thematic uniqueness with respect to events and their participants: "No event stands
in one of these relations to more than one thing." Note that successfully connecting
the lexical-level constraint and the event-level constraints requires the Davidsonian
assumption that there is a single event variable in the semantic representation of a
predicator, as Carlson (1998: 40) points out. In some models of lexical representation
(e.g., Jackendoffs lexical decomposition analyses and related work), this is not the case,
as various subevents are represented, each of which can have a set of roles associated
with it.
The motivations for role uniqueness are varied. One is that it simplifies accounts of
mapping from thematic roles to syntactic arguments of predicators (which are also
typically regarded as unique). It is implicit in hypotheses such as the Universal Alignment
Hypothesis (Rosen 1984) and the Uniformity of Theta Assignment Hypothesis (Baker
1988, 1997), described in greater detail in the section on argument realization below.
Role uniqueness also provides a tool to distinguish situations - if two different
individuals appear to bear the same role, then there must be two distinct situations involved
(Landman 2000:39). A further motivation,emphasized in the work of Krifka (1992,1998),
404
IV. Lexical semantics
is that role uniqueness and related conditions are crucial in accounting for the semantics
of events in which a participant is incrementally consumed, created, or traversed.
Role uniqueness does not apply straightforwardly to all types of situations. Krifka
(1998:209) points out that a simple definition of uniqueness - "it should not be the case
that one and the same event has different participants" - is "problematic for see and
touchr If one sees or touches an orange, for example, then one also typically sees or
touches the peel of the orange. But the orange and its peel are distinct; thus the seeing or
touching event appears to have multiple entities bearing the same role.
A corollary of role uniqueness is exhaustivity, the requirement that whatever bears a
given thematic role to a situation is the only thing bearing that role. Thus if some group
is designated as the Agent of an event, then there can be no larger group that is also the
Agent of that same event. Exhaustivity is equivalent to role uniqueness under the
condition that only one role may be assigned to an individual; as Schein (2002:272) notes,
however, if thematic roles are allowed to be "complex" - that is, multiple roles can be assigned
conjunctively to individuals, then exhaustivity is a weaker constraint than uniqueness.
3.3. How fine-grained should thematic roles be?
As noted in the introduction, thematic roles in the broad sense are frequently postulated
as crucial mechanisms in accounts of phenomena at the syntax-semantics interface, such
as argument realization, anaphoric binding, and controller choice. For thematic roles to
serve many of these uses, they must be sufficiently broad, in the sense noted above, that
they can be used in stating generalizations covering classes of lexical items or
constructions. To be useful in this sense, therefore, a thematic role should be definable over a
broad class of situation types. Bach (1989: 111) articulates this view: "Thematic roles
seem to represent generalizations that we make across different kinds of happenings
in the world about the participation of individuals in the eventualities that the various
sentences are about." Schein (2002: 265) notes that if "syntactic positions [of arguments
given their thematic roles] arc predictable, wc can explain the course of acquisition and
our understanding of novel verbs and of familiar verbs in novel contexts."
Within model-theoretic semantics, two types of critiques have been directed at the
plausibility of broad thematic roles. One is based on the difficulty of formulating
definitional criteria for such roles, arguing that after years of effort, no rigorous definitions
have emerged.The other critique examines the logical implications of positing such roles,
bringing out difficulties for semantic representations relying on broad thematic roles
given basic assumptions, such as role uniqueness.
It is clear that the Runner role of run and the Jogger role of jog share significant
entailments (legs in motion, body capable of moving along a path, and so on). Indeed, to
maintain that these two roles arc distinct merely because there arc two distinct verbs run and
jog in English seemingly determines a semantic issue on the basis of a language-particular
lexical accident. But no attempt to provide necessary, sufficient, and comprehensive
criteria for classifying most or all of the roles of individual predicates into a few broad roles
has yet to meet with consensus. This problem has been noted by numerous researchers;
see, for example, Dowty (1991: 553-555), Croft (1991: 155-158), Wechsler (1995: 9-11),
and Levin & Rappaport Hovav (2005:38-41) for discussion. Theme in particular is
notoriously defined in vague and differing ways, but it is not an isolated case; many partially
overlapping criteria for Agenthood have been proposed, including volitionality, causal
18. Thematic roles
405
involvement, and control or initiation of an event; see Kiparsky (1997:476) for a remark
on cross-linguistic variation in this regard and Wechsler (2005:187-193) for a proposal
on how to represent various kinds of agcntivc involvement.
Moreover, there are arguments against the possibility of broad roles even for
situations that seemingly semantically quite close, in cases where two predicators would
appear to denote the same situation type.The classic case of this, examined from various
perspectives by numerous authors, involves the verbs buy and sell. As Parsons (1990:84)
argues, given the two descriptions of a commercial transaction in (5):
(5) a. Kim bought a tricycle from Sheehan.
b. Sheehan sold a tricycle to Kim.
and the assumptions that "Kim is the Agent of the buying, and Sheehan is the Agent of
the selling", then to "insist that the buying and the selling are one and the same event,
differently described" entails that "Kim sold a tricycle to Kim, and Sheehan bought a
tricycle from Sheehan." Parsons concludes that the two sentences must therefore describe
different, though "intimately related", events; see article 29 (Gawron) Frame Semantics.
Landman (2000:31-33) and Schein (2002) make similar arguments; the import of which
is that one must either individuate fine-grained event types or distinguish thematic roles
in a fine-grained fashion.
Schein (2002) suggests, as a possible alternative to fine-grained event distinctions, a
ternary view of thematic roles, in which the third argument is essentially an index to the
predicate, as in (6), rather than the dyadic thematic roles in, e.g., (3):
(6) Agent(e,*,fo7/')
The Agent role of kill is thereby distinguished from the Agent of murder, throw, explain,
and so on.This strategy, applied to buy and sell, blocks the invalid inference above.
However, this would allow the valid inference from one sentence in (5) to the other only if
we separately ensure that the Agent of buy corresponds to the Recipient of sell, and vice
versa. Moreover, other problems remain even under this relativized view of thematic
roles, in particular concerning symmetric predicates, the involvement of parts of
individuals or of groups of individuals in events, whether they are to be assigned thematic
roles, and whether faulty inferences would result if they are.
Assuming role uniqueness (or exhaustivity), and drawing a distinction between an
individual car and the group of individuals constituting its parts, one is seemingly forced
to conclude that (7a) and (7b) describe different events (Carlson 1984, Schein 2002).The
issue is even plainer in (8), as the skin of an apple is not the same as the entire apple.
(7) a. I weighed the Volvo.
b. I weighed (all) the parts of the Volvo.
(8) a. Kim washed the apple.
b. Kim washed the skin of the apple.
Similar problems arise with symmetric predicators such as face or border, as illustrated in
(9) (based on Schein 2002) and (10).
406
IV. Lexical semantics
(9) a. The Carnegie Deli faces Carnegie Hall,
b. Carnegie Hall faces the Carnegie Deli.
(10) a. Rwanda borders Burundi,
b. Burundi borders Rwanda.
If two distinct roles are ascribed to the arguments in these sentences, and the mapping
of roles to syntactic positions is consistent, then the two sentences in each pair must
describe distinct situations.
Schein remarks that the strategy of employing fine-grained roles indexed to a predi-
cator fails for cases like (7) through (10), since the verb in each pair is the same. Noting
that fine-grained events seem to be required regardless of the availability of fine-grained
roles, Schein (2002) addresses these issues by introducing additional machinery into
semantic representations, including fine-grained scenes as perspectives on events, to
preserve absolute (broad) thematic roles. Each sentence in (9) or (10) involves a different
scene, even though the situation described by the two sentences is the same. "In short,
scenes are fine-grained, events are coarser, and sentences rely on (thematic) relations
to scenes to convey what they have to say about events." (Schein 2002: 279). Similarly,
the difficulties posed by (7) and (8) are dealt with through "a notion of resolution to
distinguish a scene fine-grained enough to resolve the Volvo's parts from one which only
resolves the whole Volvo." (Schein 2002:279).
3.4. Thematic roles and plurality
Researchers with an interest in the semantics of plurals have studied the interaction
of plurality and thematic roles. Although Carlson (1984: 275) claims that "it appears to
be necessary to countenance groups or sets as being able to play thematic roles", he
does not explore the issues in depth. Landman (2000: 167), contrasting collective and
distributive uses of plural subjects, argues that "distributive predication is not an instance
of thematic predication." Thus, in the distributive reading of a sentence like The boys
sing., the semantic properties of Agents hold only of the individual boys, so "no thematic
implication concerning the sum of the boys itself follows." For "on the distributive
interpretation, not a single property that you might want to single out as part of agenthood
is predicated of the denotation of the boys" (Landman 2000:169). Therefore, Landman
argues, "the subject the boys does not fill a thematic role" of sing, but rather a "non-
thematic role" that is derived from the role that sing assigns to an individual subject.
He then develops a theory of plural roles, derived from singular roles (which are
defined only on atomic events), as follows (Landman 2000:184), where E is the domain
of events (both singular and plural):
(11) If e is an event in E, and for every atomic part a of e, thematic role R is defined for a,
then plural role *R is defined for e, and maps e onto the sum of the /?-values of the
atomic parts of e.
Thus *R subsumes R (if e is itself atomic then R is defined for e). Although Landman
does not directly address the issue, his treatment of distributive readings and plural roles
seems to imply that a role-based theory of argument realization - indeed, any theory of
18. Thematic roles
407
argument realization based on entailments of singular arguments - must be modified to
extend to distributive plural readings.
3.5. Thematic roles and aspectual phenomena
Krifka (1992,1998) explores the ways in which entailments associated with participants
can account for aspectual phenomena. In particular, he aims to make precise the notions
of incremental theme and of the relationships of incremental participants to events
and subevents, formalizing properties similar to some of the proto-role properties of
Dowty (1991). Krifka (1998:211-212) defines some of these properties as follows, where
© denotes a mereological sum of objects or events, s denotes a part or subevent
relation, and 3x means that there exists a unique entity x of which the property following
x holds:
(12) a. A role 6 shows uniqueness of participants, UP(#), iff:
9{x, e) & 0{y, e) — x = y
b. A role 9 is cumulative, CUM(0), iff:
0{x, e) & 0{y, e') -► 9{x ®p y, e 0£ e')
c. A role 6 shows uniqueness of events, UE(0), iff:
0(x,e)&yzpx^> 3\e'[e' s£e & 0(y,e')]
d. A role 8 shows uniqueness of objects, UO(#), iff:
0{x,e) & e' *Fe^ 3\y[y <.px & 6{y,e')\
The first of these is a statement of role uniqueness, and the second is a weak property
that holds of a broad range of participant roles, which somewhat resembles Landman's
definition of plural roles in (11). The remaining two are ingredients of incremental
participant roles, including those borne by entities gradually consumed or created in an
event. Krifka's analysis treats thematic roles as the interface between aspectual
characteristics of events, such as tclicity, and the mereological structure of entities (parts and
plurals). In the case of event types with incremental roles, the role's properties establish
a homomorphism between parts of the participant and subevents of the event. Slightly
different properties are required for a parallel analysis of objects in motion and the paths
they traverse.
4. Thematic role systems
This section examines some proposed systems of thematic roles; that is, inventories of
roles and relationships among them, if any. One simple version of such a system is an
unorganized set of broad roles, such as Agent, Instrument, Experiencer, Theme, Patient,
etc., each assumed to be atomic, primitive, and independent of one another. In other
systems, roles may be treated as non-atomic, being defined either in terms of more basic
features, as derived from positional criteria within structured semantic representations,
or situated within a hierarchy of roles and subroles (for example, the Location role might
have Source and Goal subroles). Dependencies among roles are also posited; it is
reasonable to claim that predicates allowing an Instrument role should also then require an
Agent role, or that Source or Goal are meaningless without a Theme in motion from one
to the other. Another type of dependency among roles is a thematic hierarchy, a global
408
IV. Lexical semantics
ordering of roles in terms of their prominence, as reflected in, for instance, argument
realization or anaphoric binding phenomena.These applications of thematic roles at the
syntax-semantic interface are examined at the end of this section.
4.1. Lists of primitive thematic roles
Fillmore (1968) was one of the earliest in the generative tradition to present a system
of thematic roles (which he terms "deep cases"). This foreshadows the hypotheses of
later researchers about argument realization, such as the Universal Alignment
Hypothesis (Perlmutter & Postal 1984, Rosen 1984), and the Uniformity of Theta Assignment
Hypothesis (Baker 1988,1997), discussed in the section on argument realization below.
Fillmore's cases are intended as an account of argument realization at deep structure,
and are, at least implicitly, ranked. A version of role uniqueness is also assumed, and roles
are treated as atomic and independent of one another.
A simple use of thematic roles that employs a similar conception of them is to ensure
a one-to-one mapping from semantic roles of a predicate to its syntactic arguments.This
is exemplified by the 0-criterion, a statement of thematic unique-ness formulated by
Chomsky (1981:36), some version of which is assumed in most syntactic research in
Government and Binding, Principles and Parameters, early Lexical-Functional Grammar,
and related frameworks.
(13) Each argument bears one and only one 6-role, and each 8-role is assigned to one and
only one argument.
This use of thematic roles as "OK marks", in Carlson's (1984) words, makes no
commitments as to the semantic content of 9-roles, nor to the ultimate syntactic effects.
As Ladusaw & Dowty (1988:62) remark, "the 9-criterion and 9-roles are a principally
diacritic theory: what is crucial in their use in the core of GB is whether an argument
is assigned a 9-rolc or not, which limits possible structures and thereby constrains the
applications of rules." The 9-criterion as stated above is typically understood to apply
to coreference chains; thus "each chain is assigned a 9-role." and "the 9-criterion must
be reformulated in the obvious way in terms of chains and their members.." (Chomsky
1982:5-6).
Other researchers have continued along the lines suggested by Fillmore (1968),
furnishing each lexical item with a list of labeled thematic roles. These representations are
typically abstracted away from the issue of whether roles should be represented in a
Davidsonian fashion; an example would look like (14), where the verb cut is represented
as requiring the two roles Agent and Patient and allowing an optional Instrument:
(14) cut (Agent, Patient, (Instrument))
This is one version of the "9-grid" in the GB/P&P framework. Although the
representation in (14) uses broad roles, the granularity of roles is a independent issue. For an
extensive discussion of such models, see chapter 2 of Levin & Rappaport Hovav (2005). They
note several problems with thematic list approaches: difficulties in determining which
role to assign "symmetric" predicates like face and border with two arguments
seemingly bearing the same role, "the assumption that semantic roles are taken to be discrete
18. Thematic roles
409
and unanalyzable" Levin & Rappaport Hovav (2005:42) rather than exhibiting relations
and dependencies amongst themselves, and the failure to account for restrictions on the
range of possible case frames (Davis & Koenig 2000:59-60).
An additional feature of some thematic list representations is the designation of one
argument as the external argument. Belletti & Rizzi (1988: 344), for example,
annotate this external argument, if present, by italicizing it, as in (15a), as opposed to (15b)
(where lexically specified case determines argument realization, and there is no external
argument).
(15) a. temere ('fear') (Experiencer,Theme)
b. preocupare ('worry') (Experiencer,Theme)
This leaves open the question of how the external argument is to be selected; Belletti &
Rizzi address this issue only in passing, suggesting the possibility that a thematic
hierarchy is involved.
4.2. Thematic hierarchies
It is not an accident that thematic list representations, though ostensibly consisting of
unordered roles, typically list the Agent role first if it is present. Apart from the
intuitive sense of Agents being the most "prominent", the widespread use of thematic role
lists in argument realization leads naturally to a ranking of thematic roles in a thematic
hierarchy, in which prominence on the hierarchy corresponds to syntactic prominence,
whether configurationally or in terms of grammatical functions.
As Levin & Rappaport Hovav (2005: chapters 5 and 6) make clear, there are several
distinct bases on which a thematic hierarchy might be motivated, independently of its
usefulness in argument realization:prominence in lexical semantic representations (Jack-
endoff 1987,1990), event structure, especially causal structure (Croft 1991, Wunderlich
1997), and topicality or salience (Fillmore 1977). However, many authors motivate a
hierarchy primarily by argument realization, adopting some version of a correspondence
principle to ensure that the thematic roles defined for a predicate are mapped to
syntactic positions (or grammatical functions) in prominence order.
The canonical thematic hierarchy is a total ordering of all the thematic roles in a
theory's inventory (many hierarchies do not distinguish an ordering among Source,
Goal, and Location, however). While numerous variants have been proposed - Levin &
Rappaport Hovav (2005:162-163) list over a dozen - they agree on ranking Agent/Actor
topmost, Theme and/or Patient near the bottom, and Instrument between them. As with
any model invoking broad thematic roles, thematic hierarchy approaches face the
difficulty of defining the roles they use, and addressing the classification of roles of individual
verbs that do not fit well. A thematic hierarchy of fine-grained roles would face the twin
drawbacks of a large number of possible orderings and a lack of evidence for establishing
a relative ranking of many roles.
Some researchers emphasize that the sole function of the thematic hierarchy is to
order the roles; the role labels themselves are invisible to syntax and morphology, which
have access only to the ordered list of arguments. Thus Grimshaw (1990:10) states that
though she will "use thematic role labels to identify arguments ... the theory gives no
status to this information." Williams (1994) advocates a similar view of the visibility of
410
IV. Lexical semantics
roles at the syntactic level. Wunderlich (1997) develops an approach that is similar in
this regard, where depth of embedding in a lexical semantic structure determines the
prominence of semantic arguments.
It is worth noting that these kinds of rankings can be achieved by means other than
a thematic hierarchy, or even thematic roles altogether. Rappaport & Levin's (1988)
predicate argument structures consist of an ordered list of argument variables derived
from lexical conceptual structures like those in (17) below, but again no role
information is present. Fillmore (1977) provides a saliency hierarchy of criteria for ascertaining
the relative rank of two arguments; these include active elements outranking inactive
ones, causal arguments outranking noncausal ones, and changed arguments outranking
unchanged ones. Gawron (1983) also makes use of this system in his model of argument
realization. Wechsler (1995) similarly relies on entailments between pairs of participants,
such as one having a notion of the other, to determine a partial ordering amongst
arguments. Dowty's (1991) widely-cited system of comparing numbers of proto-agent and
proto-patient entailments is a related though distinct approach, discussed in greater
detail below. Finally, Grimshaw (1990) posits an aspectual hierarchy in addition to a
thematic hierarchy, on which causers outrank other elements. Li (1995) likewise argues for
a causation-based hierarchy independent of a thematic hierarchy, based on argument
realization in Mandarin Chinese resultative compounds. Primus (1999, 2006) presents
a system of two role hierarchies, corresponding to involvement and causal dependency.
"Morphosyntactic linking , i.e., case in the broader sense, corresponds primarily to the
degree and kind of involvement ...while structural linking responds to semantic
dependency" (Primus 2006: 54). These multiple-hierarchy systems bear an affinity to systems
that assign multiple roles to participants and to the structured semantic representations
of Jackendoff, discussed in the following section.
4.3. Multiple role assignment
Thematic role lists and hierarchies generally assume some principle of thematic
uniqueness. However, there arc various arguments for assigning multiple roles to a single
argument of a predicator, as well as cases where it is hard to distinguish the roles of two
arguments. Symmetric predicates such as those in (9) and (10) exemplify the latter
situation. As for the former, Jackendoff (1987:381-382) suggests buy, sell, and chase as verbs
with arguments bearing more than one role, and any language with morphologically
productive causative verbs furnishes examples such as 'cause to laugh', in which the laugher
exhibits both Agent and Patient characteristics. Williams (1994) argues that the subject
of a small clause construction like John arrived sad. is best analyzed as bearing two roles,
one from each predicate. Broadwell (1988: 123) offers another type of evidence for
multiple role assignment in Choctaw (a Muskogcan language of the southeastern U.S.).
Some verbs have supplctive forms for certain persons and numbers, and "1 [= subject]
agreement is tied to the 0-roles Agent, Effector, Experiencer, and Source/Goal. Since the
Choctaw verb 'arrive' triggers 1 agreement, its subject must bear one of these 0-roles.
But I have also argued that suppletion is tied to the Theme, and so the subject must bear
the role Theme." Similarly, one auxiliary is selected if the subject is a Theme, another
otherwise, and arrive in Choctaw selects the 7/ieme-subject auxiliary.
Cases like these have led many researchers to pursue representations in which roles
are derived, not primitive. Defining roles in terms of features, or positions in lexical
18. Thematic roles
411
representations based on semantic decomposition, can more readily accommodate those
predicators that appear to violate role uniqueness within a system of broad thematic
roles.
4.4. Structural and featural analyses of thematic roles
Many researchers, perhaps dissatisfied with the seemingly arbitrary set of descriptions
of the meanings of thematic roles such as Agent, Patient, Goal, and, notoriously, Theme,
have sought organizing principles that would characterize the range of broad thematic
roles. These efforts can be loosely divided into two, somewhat overlapping types. The
first might be called structural or relational, situating roles within structures typically
representing lexical entries, including properties of the event type they denote. The second
approach is featural, analyzing roles in terms of another set of more fundamental
features that provides some structure for the set of possible roles. Both approaches are
compatible with viewing thematic roles as sets of entailments. Under the structural approach,
the entailments are associated with positions in a lexical or event structure, while under
the featural approach, the entailments can be regarded as features. Both approaches
are also compatible with thematic hierarchies or other prominence schemes. However,
a notion of prominence within event structures can do the work of a thematic hierarchy
in argument selection within a structural approach, and prominence relations between
features can do the same within a featural approach.
While Fillmore's cases are presented as an unstructured list, Gruber (1965) and Jack-
endoff (1983) develop a model in which the relationships between entities in motion
and location situations provide an inventory of roles: Theme, Source, Goal, and
Location. Through analogy, these extend to a wide range of semantic domains, as phrased by
Jackendoff (1983:188):
(16) Thematic Relations Hypothesis
In any semantic field of [events] and [states], the principal event-, state-, path-, and
place-functions are a subset of those used for the analysis of spatial location and
motion. Fields differ in only three possible ways:
a. what sorts of entities may appear as theme;
b. what sorts of entities may appear as reference objects;
c. what kind of relation assumes the role played by location in the field of spatial
expressions.
This illustrates one form of lexical decomposition (see article 17 (Engelberg)
Frameworks of decomposition) allowing thematic roles to be defined in terms of their
positions within the structures representing the semantics of lexical items. Such an approach
is compatible with the entailment-based treatments of thematic roles discussed above,
but developing a model-theoretic interpretation of these structures that would
facilitate this has not been a priority for those advocating this kind of analysis. However,
lexical decomposition does reflect the internal complexity of natural language
predicators. For example, causative verbs are plausibly analyzed as denoting two situation
types standing in a causal relationship to one another, and transactional verbs such as
buy, sell, and rent as denoting two oppositely-directed transfers.The Lexical Conceptual
Structures of Rappaport & Levin (1988) (see article 19 (Levin & Rappaport Hovav)
412
IV. Lexical semantics
Lexical Conceptual Structure) illustrate decomposition into multiple subevents of the
two alternants of "spray/load" verbs; the LCS for the TTieme-object alternant is shown
in (17a) and that of the Location-ob')ccl alternant in (17b), in which the LCS of the
former is embedded as a substructure:
(17) a. [x cause [y to come to be at z]\
b. [x cause [z to come to be in STATE]
BY MEANS OF [x cause [y to come to be at z]]]
As noted above, in such representations, thematic roles are not primitive, but derived
notions. And because LCSs - or their counterparts in other models - can be embedded
as in (17b), there may be multiple occurrences of the same role type within the
representation of a single predicate. A causative or transactional verb can then be regarded
as having two Agents, one in each subevent. Furthermore, participants in more than one
subevent can accordingly be assigned more than one role; thus the buyer and seller in
a transaction are at once Agents and Recipients (or Goals). This leads to a view of
thematic roles that departs from the formulations of role uniqueness developed within an
analysis of predicators as denoting a unitary, undecomposed event, though uniqueness
can still be postulated for each subevent in a decompositional representation. It also can
capture some of the dependencies among roles; for example Jackendoff (1987:398—402)
analyzes the Instrument role in terms of conceptual structures representing intermediate
causation, and this role exists only in relation to others in the chain of causation.
Croft (1991,1998) has taken causation as the fundamental framework for defining
relationships between participants in events,with the roles more closely matching the Agent,
Patient, and Instrument of Fillmore (1968). The causal ordering is plainly correlated
with the ordering of roles found in thematic hierarchies and with the Actor-Undergoer
cline of Role and Reference Grammar (Foley & van Valin 1984, van Valin 2004), which
orders thematic roles according to positions in a structure intended to represent causal
and aspectual characteristics of situations. In Jackendoff (1987,1990) the causal and the
motion/location models are combined in representations of event structures, some quite
detailed and elaborate. Thematic roles within these systems are also derived notions,
defined in terms of positions within these event representations. As the systems become
more elaborate, one crucial issue is: how can we tell what the correct representation
should be? Both types of systems exploit metaphorical extensions of space, motion, and
causation to other, more abstract domains, and it is often unclear what the "correct"
application of the metaphors should be.The implication for thematic roles defined within
such systems is that their semantic foundations are not always sound.
Notable featural analyses of thematic roles include Ostler (1979), whose system uses
eight binary features that characterize 48 roles. His features arc further specifications
of the four generalized roles: Theme, Source, Goal, and Path from the Thematic Role
Hypothesis, and include some that resemble entailments (volitional) and some that
specify a semantic domain (positional, cognitive). Somers (1987) puts forward similar
systems, again based on four broad roles that appear in various semantic domains, and
Sowa (2000), takes this as a point of departure for a hierarchy of role types within a
knowledge representation system. Sowa decomposes the four types of Somers into two
pairs of roles characterized by features or entailments: Source ("present at the
beginning of the process") and Goal ("present at the end of the process"), and Determinant
18. Thematic roles
413
("determines the direction of the process") and Immanent ("present throughout the
process", but "does not actively control what happens"). Sowa argues that this system
allows for useful undcrspccification; in the dog broke the window, the dog might might
be involved volitionally or nonvolitionally as initiator, or used as an instrument by
some other initiator, but Source covers all of these possibilities. This illustrates another
characteristic of Sowa's system; he envisions an indefinitely large number of roles, of
varying degrees of specificity, but all are subtypes of one of these four. In this kind
of system, the set of features will also grow indefinitely, so that it is more naturally
viewed as a hierarchy of roles induced from the hierarchy of event types, at least below
the most general roles. Similar ideas have been pursued in Lehmann (1996) and the
Framenet project (http://framenet.icsi.berkeley.edu); see article 29 (Gawron) Frame
Semantics.
Rozwadowska (1988) pursues a somewhat different analysis, with three binary
features: ±sentient, ±cause, and ±change, intended to characterize broad thematic roles.This
system permits natural classes of roles to be defined, which Rozwadowska argues are
useful in describing restrictions on the interpretation of English and Polish specifiers of
nominalizations. For example, the distinction between *the movie's shock of the audience,
vs. the audience's shock at the movie is accounted for by a requirement that the specifier's
role not be Neutral; that is, not have negative values of all three features. Thus either an
Agent or an Experiencer NP (both marked as +change) can appear in specifier position,
but not a Neutral one.
Featural analyses of thematic roles are close in spirit to entailment-based frameworks
that dispense with reified roles altogether. If the features can be defined with sufficient
clarity and consistency, then a natural question to ask is whether the work assigned to
thematic roles can be accomplished simply by direct reference to these definitions of
participant properties.The most notable exponent of this approach is Dowty (1991), with
the two sets of proto-Agent and proto-Patient entailments. Wechsler (1995) is similar
in spirit but employs entailments as a means of partially ordering a predicator's
arguments. Davis & Koenig (2000) and Davis (2001) combine elements of these with a limited
amount of lexical decomposition. While these authors eschew the term "thematic role"
for their characterizations of a predicate's arguments, such definitions do conform to
Dowty's (1989) definition of thematic role types noted above, though they do not impose
any thematic uniqueness requirements.
4.5. Thematic roles and argument realization
Argument realization, the means by which semantic roles of predicates are realized as
syntactic arguments of verbs, nominalizations, or other predicators, has been a consistent
focus of linguists seeking to demonstrate the utility of thematic roles. This section
examines some approaches to argument realization, and the following one briefly notes other
syntactic and morphological phenomena that thematic roles have been claimed to play
a role in.
Levin & Rappaport Hovav (2005) provide an extensive discussion of diverse
approaches to argument realization, including those in which thematic roles play a crucial
part. Such accounts can involve a fixed,"absolute" rule, such as "map the Agent to the
subject (or external argument)" or relative rules, which establish a correspondence between
an ordering (possibly partial) among a predicate's semantic roles and an ordering among
414
IV. Lexical semantics
grammatical relations or configurationally-defined syntactic positions.The role ordering
may correspond to a global ordering of roles, such as a thematic hierarchy, or be derived
from relative depth of semantic roles or from relationships, such as entailments, holding
among sets of roles. One example is the Universal Alignment Hypothesis developed
within Relational Grammar; the version here is from Rosen (1984:40):
(18) There exists some set of universal principles on the basis of which, given the semantic
representation of a clause, one can predict which initial grammatical relation each
nominal bears.
Another widely employed principle of this type is the Uniformity of Thcta Assignment
Hypothesis (UTAH), in Baker (1988: 46), which assumes a structural conception of
thematic roles:
(19) Identical thematic relationships between items are represented by identical
structural relationships between those items at the level of D-structure.
Baker (1997) examines a relativized version of this principle, under which the ordering
of a predicator's roles, according to the thematic hierarchy, must correspond to their
relative depth in D-structure.
In Lexical Mapping Theory (Brcsnan & Kancrva 1989,Brcsnan & Moshi 1990,Alsina
1992), the correspondence is more complex, because grammatical relations are
decomposed into features, which in turn interface with the thematic hierarchy. A role's
realization is thus underspecified in some cases until default assignments fill in the remaining
feature. For example, an Agent receives an intrinsic classification of -O(bjective), and
the highest of a predicate's roles is assigned the feature -R(estricted) by default, with the
result that Agents arc by default realized as subjects.
These general principles typically run afoul of the complexities of argument
realization, including cross-linguistic and within-language variation among predicators and
diathesis alternations. A very brief mention of some of the difficulties follows. First, to avoid
an account that is essentially stipulative, the inventory of roles cannot be too large or
too specific; once again this leads to the problem of assigning roles to the large range of
verbs whose arguments appear to fit poorly in any of the roles. Second, cross-linguistic
variation in realization patterns poses problems for principles like (18) and (19) that
claim universally consistent mapping; some languages permit a wide range of ditransi-
tive constructions, for example, while others entirely lack them (Gerdts 1992). Third,
there are some cases where argument realization appears to involve information outside
what would normally be ascribed to lexical semantic representations;some semantically
similar verbs require different subcategorizations (wish for vs. desire, look at vs. watch,
appeal to vs.please) or display differing diathesis alternations (hide and dress permit an
intransitive alternant with reflexive meaning, while conceal and clothe are only
transitive) (Jackendoff 1987: 405-406, Davis 2001:171-173). Fourth, there are apparent cases
of the same roles being mapped differently, as in the Italian verbs temere and preocupare
in (15) above; these cases prove problematic particularly for a model with a thematic
hierarchy and a role list for each predicate.
These difficulties have been addressed in large degree through entailment-based
models described in the following section, and through models that employ more
18. Thematic roles
415
elaborate semantic decomposition that what is assumed by principles such as (18) and
(19) (Jackendoff 1987,1990, Alsina 1992, Croft 1991,1998).
4.6. Lexical entailments as alternatives to thematic roles
Dowty (1991) is an influential proposal for argument realization avoiding reified
thematic roles in semantic representations.The central idea is that subject and object
selection relies on a set of proto-role entailments,grouped into two sets,proto-agent properties
and proto-patient properties, as follows (Dowty 1991:572):
(20) Contributing properties for the Agent Proto-Role
a. volitional involvement in the event or state
b. sentience (and/or perception)
c. causing an event or change of state in another participant
d. movement (relative to the position of another participant)
e. exists independently of the event named by the verb)
(21) Contributing properties for the Patient Proto-Role
a. undergoes change of state
b. incremental theme
c. causally affected by another participant
d. stationary relative to the movement of another participant
e. does not exist independently of the event, or not at all)
Each of these properties may or may not be entailed of a participant in a given type
of event or state. Dowty's Argument Selection Principle, in (22), characterizes how
transitive verbs may realize their semantic arguments (Dowty 1991:576):
(22) In predicates with grammatical subject and object, the argument for which
the predicate entails the greatest number of Proto-Agent properties will be lexi-
calized as the subject of the predicate; the argument having the greatest number
of Proto-Patient properties will be lexicalized as the direct object.
This numerical comparison across semantic roles accounts well for the range of attested
transitive verbs in English, but fares less well with other kinds of data (Primus 1999,
Davis & Koenig 2000, Davis 2001, Levin & Rappaport Hovav 2005). In the Finnish
causative example in (23), for example (Davis 2001: 69), the subject argument bears fewer
proto-agent entailments than the direct object.
(23) Uutinen puhu-tt-i nais-i-a pitkaan.
news-item talk-CAUS-PAST woman-pl-PART long-ILL
'The news made the women talk for a long time.'
Causation appears to override the other entailments in (23) and similar cases. In
addition, (22) makes no prediction regarding predicators other than transitive verbs, but their
argument realization is not unconstrained; as with transitive verbs, a causally affecting
argument or an argument bearing a larger number of proto-agent entailments is realized
416
IV. Lexical semantics
as the subject of verbs such as English prevail (on), rely (on), hope (for), apply (to), and
many more.
Wechsler (1995) presents a model of argument realization that resembles Dowty's
in eschewing thematic roles and hierarchies in favor of a few relational entailments
amongst participants in a situation, such as whether one participant necessarily has a
notion of another, or is part of another. Davis & Koenig(2000) and Davis (2001) borrow
elements of Dowty's and Wechsler's work but posit reified "proto-role attributes" in
semantic representations reflecting the flow of causation in response to the difficulties
noted above.
4.7. Other applications of thematic roles in syntax and morphology
Thematic roles and thematic hierarchies have been invoked in accounts of various other
syntactic phenomena, and the following provides only a sample.
Some accounts of anaphoric binding make explicit reference to thematic roles, as
opposed to structural prominence in lexical semantic representations. Typically, such
accounts involve a condition that the antecedent of an anaphor must outrank it on the
thematic hierarchy. Wilkins (1988) is one example of this approach. Williams (1994)
advocates recasting the principles of binding theory in terms of role lists defined on
predicators (including many nouns and adjectives). This allows for "implicit",
syntactically unrealized arguments to participate in binding conditions. For example, the
contrast between admiration of him (admirer and admiree must be distinct) and admiration
of himself (admirer and admiree must be identical) suggests that even arguments that
are not necessarily syntactically realized play a crucial role in binding. Thematic role
labels, however, play no part in this system; rather, the arguments of a predicator are
simply in an ordered list, as in the argument structures of Rappaport & Levin (1988) and
Grimshaw (1990), though ordering on the list might be determined by a thematic
hierarchy. And Jackendoff (1987) suggests that indexing argument positions within seman-
tically decomposed lexical representations can address these binding facts, without
reference to thematic hierarchies.
Everaert & Anagnostopoulou (1997) argue that local anaphors in Modern Greek
display a dependence on the thematic hierarchy; as a Goal or Experiencer antecedent can
bind a Theme, for example, but not the re verse. This holds even when the lower thematic
role is realized as the subject, resulting in a subject anaphor.
Nishigauchi (1984) argues for thematic role-based effect in controller selection for
infinitival complements and purpose clauses, a view defended by Jones (1988). For
example, the controller of a purpose clause is said to be a Goal. Ladusaw & Dowty (1988)
counter that the data is better handled by verbal entailments and by general principles of
world knowledge about human action and responsibility.
Donohue (1996) presents data on relativization in Tukang Besi (an Austronesian
language of Indonesia) that suggest a distinct relativization strategy for Instruments,
regardless of their grammatical relation.
Mithun (1984) proposes an account of noun incorporation in which Patient is
preferred over other roles for incorporation, though in some languages arguments bearing
other roles (Instrument or Location) may incorporate as well. But alternatives based on
underlying syntactic structure (Baker 1988) and depth of embedding in lexical semantic
representations (Kiparsky 1997) have also been advanced. Evans (1997) examines
18. Thematic roles
417
noun-incorporation in Mayali (a non-Pama-Nyungan language of northern Australia)
and finds a thematic-role based account inadequate to deal with the range of
incorporated nominals. He instead suggests that constraints based on animacy and
prototypicality in the denoted event are crucial in selecting the incorporating argument.
5. Concluding remarks
Dowty (1989:108-109) contrasts two positions on the utility of (broad) thematic roles.
Those advocating that thematic roles are crucially involved in lexical, morphological,
and syntactic phenomena have consequently tried to define thematic roles and develop
thematic role systems. But, even 20 years later, the position Dowty states in (24) can also
be defended:
(24) Thematic roles per se have no priviledged [sic] status in the conditioning of syntactic
processes by lexical meaning, except insofar as certain semantic distinctions happen
to occur more frequently than others among natural languages.
Given the range of alternative accounts of argument realization, lexical acquisition, and
other phenomena for which the "traditional", broad thematic roles have sometimes been
considered necessary, and the additional devices required even in those approaches that
do employ them, it is unclear how much is gained by introducing them as reified elements
of linguistic theory. There do appear to be some niche cases of phenomena that depend
on such notions, some of which are noted above, and there are stronger motivations for
entailment-based approaches to argument realization, diathesis alternations, aspect, and
complement control. Such entailments can certainly be viewed as thematic roles, some
even as roles in a broad sense that apply to a large class of predicates. But the overall
picture is not one that lends support to the "traditional" notion of a small inventory of
broad roles, with each of a predicate's arguments uniquely assigned one of them.
Fine-grained roles serve a somewhat different function in model-theoretic semantics,
one not dependent on the properties of a thematic role system but on the use of roles in
individuating events and how they are related to their participants. But in this realm, too,
they do not come without costs; as Landman and Schein have argued, dyadic thematic
roles, coupled with principles of role uniqueness, lead both to unwelcome inferences that
can be blocked only with additional mechanisms and to requiring events that are
intuitively too fine-grained.These difficulties arise in connection with symmetric predicators,
transactional verbs, and other complex event types that may warrant a more elaborated
treatment than thematic roles defined on a unitary predicate can offer.
The author gratefully acknowledges detailed comments on this article from Cleo
Condoravdi and from the editors.
6. References
Alsina, Alex 1992. On the argument structure of causatives. Linguistic Inquiry 23,517-555.
Bach, Einmon 1989. Informal Lectures on Formal Semantics. Albany, NY: State University of New
York Press.
Baker, Mark C. 1988. Incorporation. A Theory of Grammatical Function Changing. Chicago, IL:The
University of Chicago Press.
418
IV. Lexical semantics
Baker, Mark C. 1997. Thematic roles and syntactic structure. In: L. Haegeinan (ed.). Elements of
Grammar. Dordrecht: Kluwer, 73-137.
Bayer, Samuel L. 1997. Confessions of a Lapsed Neo-Davidsonian. Events and Arguments in
Compositional Semantics. New York: Garland.
Belletti, Adriana & Luigi Rizzi 1988. Psych-verbs and 6-Theory. Natural Language and Linguistic
Theory 6,291-352.
Bresnan.Joan & Jonni Kanerva 1989. Locative inversion in Chichewa. A case study of factorization
in grammar. Linguistic Inquiry 20.1-50.
Bresnan.Joan & Lioba Moshi 1990. Object asymmetries in comparative Bantu syntax. Linguistic
Inquiry 21,147-185.
Broadwell, George A. 1988. Multiple 6-role assignment in Choctaw. In: W. Wilkins (ed.). Syntax and
Semantics 21: Thematic Relations. New York: Academic Press, 113-127.
Carlson, Greg 1984. On the role of thematic roles in linguistic theory. Linguistics 22, 259-
279.
Carlson, Greg 1998. Thematic roles and the individuation of events. In: S. Rothstein (ed.). Events
and Grammar. Dordrecht: Kluwer, 35-51.
Chierchia, Gennaro 1984. Topics in the Syntax and Semantics of Infinitives and Gerunds. Ph.D.
dissertation. University of Massachusetts, Amherst, MA.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1982. Some Concepts and Consequences of the Theory of Government and Binding.
Cambridge. MA:The MIT Press.
Croft, William 1991. Syntactic Categories and Grammatical Relations: The Cognitive Organization
of Information. Chicago, IL:The University of Chicago Press.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The
Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications,
21-63.
Davidson, Donald 1967. The logical form of action sentences. In: N. Reseller (ed.). The Logic of
decision and Action. Pittsburgh. PA: University of Pittsburgh Press, 81-95.
Davis, Anthony R. 2001. Linking by Types in the Hierarchical Lexicon. Stanford, CA: CSLI
Publications.
Davis, Anthony R. & Jean-Pierre Kocnig 2000. Linking as constraints on word classes in a
hierarchical lexicon. Language 76,56-91.
Donohue, Mark 1996. Relative clauses inTukang Besi. Grammatical functions and thematic roles.
Linguistic Analysis 26,159-173.
Dowty, David 1989. On the semantic content of the notion of 'thematic role'. In: G. Chierchia,
B. Partee & R. Turner (eds.). Properties, Types, and Meaning, vol. 2. Semantic Issues. Dordrecht:
Kluwer, 69-129.
Dowty, David 1991.Thematic proto-roles and argument selection. Language 67,547-619.
Everaert, Martin & Elena Anagnostopoulou 1997. Tlieinatic hierarchies and Binding Theory.
Evidence from Greek. In: F. Gorblin, D. Godard & J.-M. Marandin (eds.). Empirical Issues in
Formal Syntax and Semantics. Selected Papers from the Colloque de Syntaxe et de Semantique de
Paris (CSSP1995). Bern: Lang. 43-59.
Evans, Nick 1997. Role or cast. In: A. Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates.
Stanford. CA: CSLI Publications, 397^30.
Fillmore, Charles J. 1968. The case for case. In: E. Bach & R.T. Harms (eds.). Universals of Linguistic
Theory. New York: Holt, Rinchai t & Winston. 1-88.
Fillmore. Charles J. 1977.Topics in lexical semantics. In: R. W. Cole (ed.). Current Issues in Linguistic
Theory. Bloomington. IN: Indiana University Press, 76-138.
Foley. William A. & Robert D. van Valin, Jr. 1984. Functional Syntax and Universal Grammar.
Cambridge: Cambridge University Press.
Gawron, Jean Mark 1983. Lexical Representations and the Semantics of Complementation. Ph.D.
dissertation. University of California, Berkeley, CA.
18. Thematic roles
419
Gerdts, Donna B. 1992. Morphologically-mediated relational profiles. In: L. A. Buszard-Welcher,
L. Wee & W. Weigel (eds.). Proceedings of the I8th Annual Meeting of the Berkeley Linguistics
Society (=BLS). Berkeley, CA: Berkeley Linguistic Society. 322-337.
Grimshaw. Jane 1990. Argument Structure. Cambridge, MA:The MIT Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge. MA.
Jackendoff, Ray 1983. Semantics and Cognition. Cambridge, MA:The MIT Press.
Jackendoff, Ray 1987. The Status of thematic relations in linguistic theory. Linguistic Inquiry 18,
369^11.
Jackendoff, Ray 1990. Semantic Structures. Cambridge, MA: The MIT Press.
Jones, Charles 1988. Thematic relations in control. In: W. Wilkins (ed.). Syntax and Semantics 21:
Thematic Relations. New York: Academic Press, 75-89.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.).
Complex Predicates. Stanford, CA: CSLI Publications, 473-499.
Kratzer. Angclika 1996. Severing the external argument from its verb. In: J. Rooiyck & L. Zai ing
(eds.). Phrase Structure and the Lexicon. Dordrecht: Kluwcr, 109-137.
Krifka, Manfred 1992. Thematic relations as links between nominal reference and temporal
constitution. In: I. A. Sag & A. Szabolcsi (eds.). Lexical Matters. Stanford, CA: CSLI Publications,
29-53.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar.
Dordrecht: Kluwer, 197-235.
Ladusaw. William A. & David R. Dowty 1988.Towards a nongrammatical account of thematic roles.
In: W. Wilkins (ed.). Syntax and Semantics 21: Thematic Relations. New York: Academic Press,
61-73.
Landman, Fred 2000. Events and Plurality. The Jerusalem Lectures. Dordrecht: Kluwer.
Lehmann, Fritz 1996. Big posets of participatings and thematic roles. In: P. W. Eklund, G. Ellis &
G. Mann (eds.). Proceedings of the 4th International Conference on Conceptual Structures.
Knowledge Representation as Interlingua. Berlin: Springer, 50-74.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge
University Press.
Li, Yafei 1995. The thematic hierarchy and causativity. Natural Language and Linguistic Theory 13,
255-282.
Marquez, Llufs et al. 2008. Semantic role labeling. An introduction to the special issue.
Computational Linguistics 34,145-159.
Mithun, Marianne 1984.The evolution of noun incorporation. Language 60,847-894.
Nishigauchi,Taisuke 1984. Control and the thematic domain. Language 60,215-250.
Ostler, Nicholas 1979. Case Linking. A Theory of Case and Verb Diathesis Applied to Classical
Sanskrit. Ph.D. dissertation. MIT, Cambridge, MA.
Parsons, Terence 1990. Events in the Semantics of English. A Study in Subatomic Semantics.
Cambridge, MA:The MIT Press.
Pei (mutter, David M.& Paul Postal. 1984.The I-advancementexclusivencsslaw. In:D.M.Pcrhnutter
& C. Rosen (eds.). Studies in Relational Grammar, vol. 2. Chicago, IL:The University of Chicago
Press, 81-125.
Primus, Beatrice 1999. Cases and Thematic Roles. Ergative, Accusative and Active. Tubingen:
Niemeyer.
Primus, Beatrice 2006. Mismatches in semantic-role hierarchies and the dimensions of Role
Semantics. In: I. Bornkcsscl ct al. (eds.). Semantic Role Universals and Argument Linking. Theoretical,
Typological, and Psycholinguistic Perspectives. Berlin: Mouton dc Gruytcr, 53-88.
Rappaport, Malka & Beth Levin 1988. What to do with 6-roles. In: W. Wilkins (ed.). Syntax and
Semantics 21: Thematic Relations. New York: Academic Press, 7-36.
Rosen, Carol 1984.The interface between semantic roles and initial grammatical relations. In: D.M.
Perlmutter & C. Rosen, (eds.). Studies in Relational Grammar, vol. 2. Chicago, IL:The University
of Chicago Press, 38-77.
420
IV. Lexical semantics
Rozwadowska. Bozena 1988.Thematic restrictions on derived noininals. In: W. Wilkins (ed.). Syntax
and Semantics 21: Thematic Relations. New York: Academic Press, 147-165
Scliein. Barry 2002. Events and the semantic content of thematic relations. In: G. Preyer & G. Peter
(eds.). Logical Form and Language. Oxford: Oxford University Press, 263-344.
Somers, Harold L. 1987. Valency and Case in Computational Linguistics. Edinburgh: Edinburgh
University Press.
Sowa. John F. 2000. Knowledge Representation. Logical, Philosophical, and Computational
Foundations. Pacific Grove, CA: Brooks Cole Publishing Co.
Svenonius, Peter 2007. Adpositions, particles and the arguments they introduce. In: E. Reuland,T.
Bhattacharya & G. Spathas (eds.). Argument Structure. Amsterdam: Benjamins, 63-103.
van Valin. Robert D., Jr. 2004. Semantic macroroles in Role and Reference Grammar. In: R. Kailu-
weit & M. Hummel (eds.). Semantische Rollen.Tubingen: Narr, 62-82.
Wechsler, Stephen 1995. The Semantic Basis of Argument Structure. Stanford, CA: CSLI Publications.
Wechslei. Stephen 2005. What is right and wrong about little v. In: M. Vulchanova & T. A. A fail i
(eds.). Grammar and Beyond. Essays in Honour of Lars Hellan. Oslo: Novus Press, 179-195.
Wilkins, Wendy 1988. Thematic structure and rcflcxivization. In: W. Wilkins (ed.). Syntax and
Semantics 21: Thematic Relations. New York: Academic Press, 191-213.
Williams, Edwin 1994. Thematic Structure in Syntax. Cambridge. MA:The MIT Press.
Wunderlich. Dieter 1997. CAUSE and the structure of verbs. Linguistic Inquiry 28,27-68.
Anthony R. Davis, Washington, DC (USA)
19. Lexical Conceptual Structure
1. Introduction
2. The introduction of LCSs into linguistic theory
3. Components of LCSs
4. Choosing primitive predicates
5. Subeventual analysis
6. LCSs and syntax
7. Conclusion
8. References
Abstract
The term "Lexical Conceptual Structure" was introduced in the 1980s to refer to a
structured lexical representation of verb meaning designed to capture those meaning
components which determine grammatical behavior, particularly with respect to argument
realization. Although the term is no longer much used, representations of verb meaning
which share many of the properties of LCSs are still proposed in theories which
maintain many of the aims and assumptions associated with the original work on LCSs. As
LCSs and the representations that are their descendants take the form of predicate
decompositions, the article reviews criteria for positing the primitive predicates that make up
LCSs. Following an overview of the original work on LCS, the article traces the
developments in the representation of verb meaning that characterize the descendants of the
early LCSs. The more recent work exploits the distinction between root and event structure
implicit in even the earliest LCS in the determination of grammatical behavior. This work
Maienborn, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 42(M40
19. Lexical Conceptual Structure
421
also capitalizes on the assumption that predicate decompositions incorporate a subeven-
tual analysis which defines hierarchical relations among arguments, allowing argument
realization rules to be formulated in terms of the geometry of the decomposition.
1. Introduction
Lexical Conceptual Structure (LCS) is a term that was used in the 1980s and 1990s to
refer to a structured lexical representation of verb meaning. Although the term "LCS" is
no longer widely used, structured representations of verb meaning which share many of
the properties of LCSs are still often proposed, in theories which maintain many of the
aims and assumptions associated with those originally involving an LCS. These
descendants of the original LCSs go by various names, including lexical relational structures
(Hale & Keyser 1992,1993),event structures (Rappaport Hovav & Levin 1998a, Levin &
Rappaport Hovav 2005),semantic structures (Pinker 1989), L-syntax (Mateu 2001a,Travis
2000), 1-structure (Zubizarreta & Oh 2007) and first phase syntax (Ramchand 2008);
representations called Semantic Forms (Wunderlich 1997a, 1997b) and Semantic
Representations (van Valin 1993,2005, van Valin & LaPolla 1997) are also close in spirit to
LCSs. Here we provide an overview of work that uses a construct called LCS, and we then
trace the developments which have taken place in the representation of verb meaning in
descendants of this work. We stress, however, that we are not presenting a single coherent
or unified theory, but rather a synthetic perspective on a collection of related theories.
2. The introduction of LCSs into linguistic theory
In the early 1980s, the idea emerged that major facets of the syntax of a sentence are
projected from the lexical properties of the words in it (e.g. Chomsky 1981, Farmer
1984, Pcsctsky 1982,Stowcll 1981; sec Fillmore 1968 for an earlier proposal of this sort),
and over the course of that decade its consequences were explored. Much of this work
assumes that verbs are associated with predicate-argument structures (e.g. Bresnan 1982,
Grimshaw 1990), often called theta-grids (Stowell 1981, Williams 1981). The central
idea is that the syntactic structure that a verb appears in is projected from its predicate-
argument structure, which indicates the number of syntactic arguments a verb has, and
some information about how the arguments are projected onto syntax, for example, as
internal or external arguments (Marantz 1984, Williams 1981). One insight arising from
the closer scrutiny of the relationship between the lexical properties of verbs and the
syntactic environments in which they appear is that a great many verbs display a range
of what have been called argument - or diathesis - alternations, in which the same verb
appears with more than one set of morphosyntactic realization options for its arguments,
as in the causative and dative alternations, in (1) and (2), respectively.
(1) a. Pat dried the clothes,
b. The clothes dried.
(2) a. Pat sold the rare book to Terry,
b. Pat sold Terry the rare book.
Some argument alternations seem to involve two alternate realizations of the same set
of arguments (e.g. the dative alternation), while others seem to involve real changes in
422
IV. Lexical semantics
the meaning of the verb (e.g. the causative alternation) (Rappaport Hovav & Levin
1998b). Researchers who developed theories of LCS assumed that in addition to a verb's
argument structure, it is possible to isolate a small set of recurring meaning components
which determine the range of argument alternations a particular verb can participate
in. These meaning components are embodied in the primitive predicates of predicate
decompositions such as LCSs. Thus, LCSs are used both to represent systematic
alternations in a verb's meaning and to define the set of verbs which undergo alternate
mappings to syntax, as we now illustrate.
A case study which illustrates this line of investigation is presented by Guerssel
et al. (1985).Their study attempts to isolate those facets of meaning which determine a
verb's participation in several transitivity alternations in four languages: Berber, English,
Warlpiri, and Winnebago. Guerssel et al. compare the behavior of verbs corresponding to
English break (as a representative of the class of change of state verbs) and cut (as a
representative of the class of motion-contact-effect verbs) in several alternations, including
the causative and conative alternations in these languages (cf. Fillmore 1970).They
suggest that participation in the causative alternation is contingent on the LCS of a verb
containing a constituent of the form '[come to be STATE]' (represented via the
predicate BECOME or CHANGE in some other work), while participation in the conative
alternation requires an LCS with components of contact and effect.
The LCSs suggested for the intransitive and transitive uses of break, which together
make up the causative alternation, arc given in (3), and the LCSs for the transitive and
intransitive uses of cut, which together make up the conative alternation, illustrated
in (4), are presented in (5).
(3) a. break: y come to be BROKEN (Guerssel et al. 1985:54, ex. (19))
b. break: x cause (y come to be BROKEN) (Guerssel et al. 1985:55, ex. (21))
(4) a. I cut the rope around his wrists.
b. I cut at the rope around his wrists.
(5) a. cut: x produce CUT in y, by sharp edge coming into contact with y (Guerssel
etal. 1985:51,ex.(ll))
b. cut Conative LCS: x causes sharp edge to move along path toward y, in order
to produce CUT on y, by sharp edge coming into contact with y (Guerssel
etal. 1985:59, ex. (34))
We cite semantic representations in the forms given in the source, even though this leads
to inconsistencies in notation; where we formulate representations for the purposes of
this article, we adopt the representations used by Rappaport Hovav & Levin (1998a) and
subsequent work. Although the LCSs for cut in (5) include semantic notions not usually
encountered in predicate decompositions, central to them are the notions 'move' and
'produce', which have more common analogues: go and the combination of cause and
become, respectively.
The verb cut docs not have an intransitive noncausativc use, as in (6), since its LCS
docs not have an isolatable constituent of the form '[come to be in STATE]', while the
verb break lacks a conative variant, as in (7), because its LCS does not include a contact
19. Lexical Conceptual Structure 423
component. Finally, verbs like touch, whose meaning does not involve a change of state
and simply involves contact with no necessary effect, display neither alternation, as in (8).
(6) "The bread cut.
(7) *We broke at the box.
(8) a. We touched the wall.
b. "The wall touched.
c. *We touched at the wall.
For other studies along these lines, see Hale & Keyser (1987), Laughren (1988), and
Rappaport, Levin & Laughren (1988).
Clearly, the noncausative and causative uses of a verb satisfy different truth
conditions, as do the conative and nonconative uses of a verb. As we have just illustrated, LCSs
can capture these modulations in the meaning of a verb which, in turn, have an effect on
the way a verb's arguments are morphosyntactically realized. As we discuss in sections 5
and 6, subsequent work tries to derive a verb's argument realization properties in a
principled way from the structure of its LCS.
However, as mentioned above, verbs with certain LCSs may also simply allow more
than one syntactic realization of their arguments without any change in meaning.
Rappaport Hovav & Levin (2008) argue that this possibility is instantiated by the
English dative alternation as manifested by verbs that inherently lexicalize caused possession
such as give, rent, and ,se//.They propose that these verbs have a single LCS representing
the causation of possession, as in (9), but differ from each other with respect to the
specific type of possession involvcd.Thc verb give Icxicalizcs nothing more than caused
possession, while other verbs add further details about the event: it involves the exchange of
money for sell and is temporary and contractual for rent.
(9) Caused possession LCS: [ x cause [ y have z ] ]
According to Rappaport Hovav & Levin (2008), the dative alternation arises with these
verbs because the caused possession LCS has two syntactic realizations. (See Harley
2003 and Goldberg 1995 for an alternative view which takes the dative alternation to
be a consequence of attributing both caused motion and caused possession LCSs to all
alternating verbs; Rappaport Hovav & Levin only attribute both LCSs to verbs such as
send and throw, which are not inherently caused possession verbs.)
As these case studies illustrate, the predicate decompositions that fall under the rubric
"LCS" are primarily designed to capture those facets of meaning which determine
grammatical facets of behavior, including argument alternations. This motivation sets LCSs
apart from other predicate decompositions, which are primarily posited on the basis
of other forms of evidence, such as the ability to capture various entailment relations
between sets of sentences containing morphologically related words and the ability to
account for interactions between event types and various tense operators and temporal
adverbials;cf. article 17 (Engelberg) Frameworks of decomposition.To give one example,
it has been suggested that verbs which pass tests for telicity all have a state predicate
424
IV. Lexical semantics
in their predicate decomposition (Dowty 1979, Parsons 1990). Nevertheless, LCS
representations share many of the properties of other predicate decompositions used as
explications of lexical meaning, including those proposed by Dowty (1979), Jackendoff
(1976, 1983, 1990), and more recently in Role and Reference Grammar, especially, in
van Valin & LaPolla (1997), based in large part on the work of generative semanticists
such as Lakoff (1968,1970), McCawley (1968,1971), and Ross (1972).These similarities,
of course, raise the question of whether the same representation can be the basis for
capturing both kinds of generalizations.
LCSs, however, are not intended to provide an exhaustive representation of a verb's
meaning, as mentioned above. Positing an LCS presupposes that it is possible to
distinguish those facets of meaning that arc grammatically relevant from those which arc
not; this assumption is not uncontroversial, see, for example, the debate between Taylor
(1996) and Jackendoff (1996a). In addition, the methodology and aims of this form of
"componential" analysis of verb meaning differs in fundamental ways from the type of
componential analysis proposed by the structuralists (e.g. Nida 1975). For the
structuralists, meaning components were isolatable insofar as they were implicated in semantic
contrasts within a lexical field (e.g. "adult" to distinguish parent from child); the aim of a
componential analysis, then, was to provide a feature analysis of the words in a particular
semantic field that distinguishes every word in that field from every other. In contrast,
the goal of the work assuming LCS is not to provide an exhaustive semantic analysis, but
rather to isolate only those facets of meaning which recur in significant classes of verbs
and determine key facets of the linguistic behavior of verbs. This approach makes the
crucial assumption that verbs may be different in significant respects, while still having
almost identical LCSs; for example, freeze and melt denote "inverse" changes of state, yet
they would both share the LCS of change of state verbs.
Although all these works assume the value of positing predicate decompositions (thus
differing radically from the work of Fodor & Lcporc 1999), the nature of the predicate
decomposition and its place in grammar and syntactic structure varies quite radically
from theory to theory. Here we review the work which takes the structured lexical
representation to be a specifically linguistic representation and, thus, to be distinct from
a general conceptual structure which interfaces with other cognitive domains.
Furthermore, this work assumes that the information encoded in the LCSs is a small subset
of the information encoded in a fully articulated explication of lexical meaning. In this
respect, this work is different from the work of Jackendoff (1983, 1990), who assumes
that there is a single conceptual representation, used for linguistic and nonlinguistic
purposes; cf. article 30 (Jackendoff) Conceptual Semantics.
3. Components of LCSs
Since verbs individuate and name events, LCS-style representations are taken to specify
the limited inventory of basic event types made available by language for describing
happenings in the world. Thus, our use of the term "event" includes all situation types,
including states, similar to the notion of "eventuality" in some work on event
semantics (Bach 1986). For this reason, such representations are often currently referred to
as "event structures". In this section, we provide an overview of the representations of
the lexical meaning of verbs which are collectively called event structures and identify
the properties which are common to the various instantiations of these representations.
19. Lexical Conceptual Structure 425
In section 6, we review theories which differ in terms of how these representations are
related to syntactic structures.
All theories of event structure, either implicitly or explicitly, recognize a distinction
between the primitive predicates which define the range of event types available and a
component which represents what is idiosyncratic in a verb's meaning. For example, all
noncausative verbs of change of state have a predicate decomposition including a
predicate representing the notion of change of state, as in (10); however, these verbs differ from
one another with respect to an attribute of an entity whose value is specified as changing:
the attribute relevant to cool involves temperature, while that relevant to widen involves
a dimension. One way to represent these components of meaning is to allow the predicate
representing the change to take an argument which represents the attribute, and this
argument position can then be associated with distinct attributes. This idea is instantiated in
the representations for the three change of state verbs in (11) by indicating the attribute
relevant to each verb in capital italics placed within angle brackets.
(10) [ become [ y <RES-STATE> ] ]
(11) a. dry: [ become [ y <DRY> ] ]
b. widen: [ become [ y <WIDE> ] ]
c. dim: [ become [ y <DIM> ] ]
As this example shows, LCSs are constructed so that common substructures in the
representations of verb meanings can be taken to define grammatically relevant classes of
verbs, such as those associated with particular argument alternations. Thus, the
structure in (10), which is shared by all change of state verbs, can then be associated with
displaying the causative alternation. Being associated with this LCS substructure is a
necessary, but not sufficient, condition for participating in the causative alternation, since
only some change of state verbs alternate in English.The precise conditions for licensing
the alternation require further investigation, as does the question of why languages vary
somewhat in their alternating verbs; see Alexiadou, Anagnostopoulou & Schafer (2006),
Doron (2003), Haspclmath (1993), Koontz-Garbodcn (2007), and Levin & Rappaport
Hovav (2005) for discussion.
The idiosyncratic component of a verb's meaning has received several names,
including "constant", "root", and even "verb". We use the term "root" (Pesetsky 1995)
in the remainder of this article, although we stress that it should be kept distinct from
the notion of root used in morphology (e.g. Aronoff 1993). Roots may be integrated into
LCSs in two ways: a root may fill an argument position of a primitive predicate, as in the
change of state example (10), or it may serve as a modifier of a predicate, as with various
types of activity verbs, as in (12) and (13). (Modifier status is indicated by subscripting
the root to the predicate being modified.)
(12) Casey ran.
lXACT<*/.vJ
(13) Tracy wiped the table.
[x act<»,,,> y 1
426
IV. Lexical semantics
Although early work on the structured representation of verb meaning paid little
attention to the nature and contribution of the root (the exception being Grimshaw 2005),
more recent work has taken seriously the idea that the elements of meaning lexicalized
in the root determine the range of event structures that a root can be associated with (e.g.
Erteschik-Shir & Rapoport 2004,2005, Harley 2005, Ramchand 2008, Rappaport Hovav
2008, Rappaport Hovav & Levin 1998a, Zubizarreta & Oh 2007).
Thus, Rappaport Hovav & Levin (1998a) propose that roots are of different onto-
logical types, with the type determining the associated event structures.Two of the major
ontological types of roots are manner and result (Levin & Rappaport Hovav 1991,1995,
Rappaport Hovav & Levin 2010; see also Talmy 1975, 1985, 2000). These two types
of roots arc best introduced through an examination of verbs apparently in the same
semantic field which differ as to the nature of their root: the causative change of state
verb clean, for example, has a result root that specifies a state that often results from
some activity, as in (14), while the verb scrub has a manner root that specifies an activity,
as in (15); in this and many other instances, the activity is one conventionally carried out
to achieve a particular result. With scrub the result is "cleanness", which explains the
intuition of relatedness between the manner verb scrub and the result verb clean.
(14) [ [ x act<imvvaa(>] cause [become [ y <CLEAN> ] ] ]
(15)[xact_J
Result verbs specify the bringing about of a result state - a state that is the result of
some sort of activity; it is this state which is lexicalized in the root. Thus, the verbs clean
and empty describe two different result states that are often brought about by removing
material from a place; neither verb is specific about how the relevant result state comes
about. Result verbs denote externally caused eventualities in the sense of Levin &
Rappaport Hovav (1995).Thus, while a cave can be empty without having been emptied,
something usually becomes empty as a result of some causing event. Result verbs, then,
are associated with a causative change of state LCS;see also Hale & Keyser (2002) and
Koontz-Garboden (2007) and for slightly different views Alexiadou,Anagnostopoulou&
Schafer (2006) and Doron (2003).
A manner root is associated with an activity LCS;such roots describe actions, which
are identified by some sort of means, manner, or instrument.Thus, the manner verbs scrub
and wipe both describe actions that involve making contact with a surface, but differ in
the way the hand or some implement is moved against the surface and the degree of force
and intensity of this movement. Often such activities are characterized by the instrument
used in performing them and the verbs themselves take their names from the instruments.
Again among verbs describing making contact with a surface, there are the verbs rake
and shovel, which involve different instruments, designed for different purposes and, thus,
manipulated in somewhat different ways. Despite the differences in the way the
instruments arc used, linguistically all these verbs have a basic activity LCS. In fact, all
instrument verbs have this LCS even though there is apparent diversity among them: Thus,
the verb sponge might be used in the description of removing events (e.g. Tyler sponged
the stain off the fabric) and the verb whisk in the description of adding events (e.g.
Cameron whisked the sugar into the eggs), while the verbs rake and shovel might be used for
cither (e.g. Kelly shoveled the snow into the truck, Kelly shoveled the snow off the drive).
19. Lexical Conceptual Structure 427
According to Rappaport Hovav & Levin (1998b), this diversity has a unified source:
English allows the LCSs of all activity verbs to be "augmented" by the addition of a result
state, giving rise to causative LCSs, such as those involved in the description of adding and
removing events, via a process they call Template Augmentation.This process resembles
Wunderlich's (1997a,2000) notion of argument extension;cf. article 84 (Wunderlich)
Operations on argument structure; see also Rothstein (2003) and Ramchand (2008). Whether
an augmented instrument verb receives an adding or removing interpretation depends on
whether the instrument involved is typically used to add or remove stuff.
In recent work, Rappaport Hovav & Levin (2010) suggest an independent
characterization of manner and result roots by appealing to the notions of scalar and nonscalar
change—notions which have their origins in Dowty (1979,1991) and McClurc (1994), as
well as the considerable work on the role of scales in determining telicity (e.g. Beavers
2008, Borer 2005, Hay, Kennedy & Levin 1999, Jackendoff 1996b, Kennedy & Levin 2008,
Krifka 1998, Ramchand 1997,Tenny 1994). As dynamic verbs,manner and result verbs all
involve change, though crucially not the same type of change: result roots specify scalar
changes, while manner roots do not. Verbs denoting events of scalar change in one
argument lexically entail a scale: a set of degrees - points or intervals indicating measurement
values - ordered on a particular dimension representing an attribute of an argument
(e.g. height, temperature, cost) (Bartsch & Vennemann 1972, Kennedy 1999, 2001); the
degrees indicate the possible values of this attribute. A scalar change in an entity involves
a change in the value of the relevant attribute in a particular direction along the
associated scale. The change of state verb widen is associated with a scale of increasing values
on a dimension of width; and a widening event necessarily involves an entity showing an
increase in the value along this dimension. A nonscalar change is any change that cannot
be characterized in terms of a scale;such changes are typically complex, involving a
combination of many changes at once.They are characteristic of manner verbs. For example,
the verb sweep involves a specific movement of a broom against a surface that is repeated
an indefinite number of times. See Rappaport Hovav (2008) for extensive illustration of
the grammatical reflexes of the scalar/nonscalar change distinction.
As this section makes clear, roots indirectly influence argument realization as their
ontological type determines their association with a particular event structure. We leave
open the question of whether roots can more directly influence argument realization.
For example, the LCS proposed for cut in (5) includes elements of meaning which are
normally associated with the root since "contact" or a similar concept has not figured
among proposals for the set of primitive predicates constituting an LCS. Yet, this element
of meaning is implicated by Guerssel et al. in the conative alternation. (In contrast, the
notion "effect" more or less reduces to a change of state of some type.)
4. Choosing primitive predicates
LCSs share with other forms of predicate decomposition the properties that are said
to make such representations an improvement over lists of semantic roles, whether
Fillmore's (1968) cases or Gruber (1965/1976) and Jackendoff s (1972) thematic relations,
as structured representations of verb meaning. There is considerable discussion of the
problems with providing independent, necessary and sufficient definitions of semantic
roles (see e.g. Dowty 1991 and article 18 (Davis) Thematic roles), and one suggestion for
dealing with this problem is the suggestion first found in Jackendoff (1972) that semantic
428
IV. Lexical semantics
roles can be identified with particular open positions in predicate decompositions. For
example, the semantic role "agent" might be identified with the first argument of a
primitive predicate cause. There is a perception that the set of primitive predicates used in a
verb's LCS or event structure is better motivated than the set of semantic role labels for
its arguments, and for this reason predicate decompositions might appear to be
superior to a list of semantic role labels as a structured representation of a verb's meaning.
However, there is surprisingly little discussion of the explicit criteria for positing a
particular primitive predicate, although see the discussion in Carter (1978), Jackendoff (1983:
203-204), and Joshi (1974).
The primitive predicates which surface repeatedly in studies using LCSs or other
forms of predicate decomposition arc act or do, be, become or change, cause, and go,
although the predicates have, move, stay, and, more recently, result are also proposed.
Jackendoff (1990) posits a significantly greater number of predicates than in his previous
work, introducing the predicates configure, extend, exchange, form, inch(oative),
orient, and react. Article 32 (Hobbs) Word meaning and world knowledge discusses
how some of these predicates may be grounded in an axiomatic semantics.
Once predicates begin to proliferate, theories of predicate decomposition face many
of the well-known problems facing theories of semantic roles (cf. Dowty 1991). The
question is whether it is possible to identify a small, comprehensive, universal, and well-
motivated set of predicates accepted by all. It is worthwhile, therefore, to scrutinize the
motivation for proposing a predicate in the first place and to try to make explicit when
the introduction of a new predicate is justified.
In positing a set of predicates, researchers have tried to identify recurring elements of
verb meaning that figure in generalizations holding across the set of verbs within (and,
ultimately, across) languages. Often these generalizations involve common entailments
or common grammatical properties. Wilks (1987) sets out general desiderata for a set of
primitive predicates that arc implicit in other work. For instance, the set of predicates
should be finite in size and each predicate in the set should indeed be "primitive" in that
it should not be reducible to other predicates in the set, nor should it even be partially
definable in terms of another predicate. Thus, in positing a new predicate, it is important
to consider its effect on the overall set of predicates. Wilks also proposes that the set of
predicates should be able to exhaustively describe and distinguish the verbs of each
language, but LCS-style representations, by adopting the root-event structure distinction,
simply require that the set of primitives should be able to describe all the grammatically
relevant event types. It is the role of the root to distinguish between specific verbs of the
same event type, and there is a general, but implicit assumption that the roots themselves
cannot be reduced to a set of primitive elements. As Wilks (1987:760) concludes, the
ultimate justification for a set of primitive is in their "special organizing role in a language
system". We now briefly present several case studies chosen to illustrate the type of
reasoning used in positing a predicate.
One way of arriving at a set of primitive predicates is to adopt a hypothesis that
circumscribes the basic inventory of event types, while allowing for all events to be analyzed
in terms of these types. This approach is showcased in the work of Jackendoff (1972,
1976,1983,1987,1990), who develops ideas proposed by Grubcr (1965/1976). Jackendoff
adopts the localist hypothesis: motion and location events are basic and all other events
should be construed as such events.There is one basic type of motion event, represented
19. Lexical Conceptual Structure 429
by the primitive predicate go, which takes as arguments a theme and the path (e.g. The
cart went from the farm to the market).There are two basic types of locational events,
represented by the predicate be (for stative events) and stay (for non-stative events); these
predicates also take a theme and a location as arguments (e.g. The coat was/stayed in the
closet). In addition, Jackendoff introduces the predicates cause and let, which are used
to form complex events taking as arguments a causer and a motion or location event.
Events that are not obviously events of motion or location are construed in terms of
some abstract form of motion or location. For example, with events of possession, pos-
sessums can be taken as themes and possessors as locations in an abstract possessional
"field" or domain. The verb give is analyzed as describing a causative motion event in
the possessional field in which a posscssum is transferred from one possessor to another.
Physical and mental states and changes of state can be seen as involving an entity being
"located" in a state or "moving" from one state to a second state in an identificational
field; the verb break, for instance, describes an entity moving from a state of being whole
to a state of being broken. Generalizing, correspondences are set up between the
components of motion and location events and the components of other event domains or
"semantic fields" in Jackendoff's terms; this is what Jackendoff (1983:188) calls the
Thematic Relations Hypothesis;see article 30 (Jackendoff) Conceptual Semantics. In general
on this view, predicates are most strongly motivated when they figure prominently in
lexical organization and in cross-field generalizations.
This kind of cross-field organization can be illustrated in a number of ways. First,
many English verbs have uses based on the same predicate in more than one field
(Jackendoff 1983:203-204).Thus, the predicate stay receives support from the English
verb keep, whose meaning presumably involves the predicates cause and stay,
combined in a representation as in (16), because it shows uses involving the three fields
just introduced.
(16) [cause (x, (stay y,z))]
(17) a. Terry kept the bike in the shed. (Positional)
b. Terry kept the bike basket. (Possessional)
c. Terry kept the bike clean. (Identificational)
Without the notion of semantic field, there would be no reason to expect English to
have verbs which can be used to describe events which on the surface seem quite
different from each another, as in (17). Second, rules of inference hold of shared predicates
across fields (Jackendoff 1976). One example is that "if an event is caused, it takes place"
(Jackendoff 1976:110), so that the entailment The bike stayed in the shed can be derived
from (17a), the entailment The bike basket stayed with Terry can be derived from (17b),
and the entailment The bike stayed clean can be derived from (17c). This supports the
use of the predicate cause across fields. Finally, predicates are justified when they explain
cross-field generalizations in the use of prepositions.The use of the allative preposition to
is taken to support the analysis of give as a causative verb of motion.
These very considerations lead Carter (1978: 70-74) to argue against the primitive
predicate stay. Carter points out that few English words have meanings that include
the notion captured by stay, yet if stay were to number among the primitive predicates,
430
IV. Lexical semantics
such words would be expected to be quite prevalent. So, while the primitives cause and
become are motivated because languages often contain a multitude of lexical items
differentiated by just these predicates (e.g. the various uses of cool in The cook cooled the
cake, The cake cooled, The cake was cool), there is no minimal pair differentiated by the
existence of stay, for example, the verb cool cannot also mean 'stay cool'. Carter also
notes that if not is included in the set of predicates, then the predicate stay becomes
superfluous, as it could be replaced by not plus change, a predicate which is roughly an
analogue to Jackendoff's go. Carter further notes that as a result simpler statements of
certain inference rules and other generalizations might be possible.
However, the primitive predicates which serve best as the basis for cross-field
generalizations arc not necessarily the ones that emerge from efforts to account for argument
alternations—the efforts that lead to LCS-style representations and their descendants.
This point can be illustrated by examining another predicate whose existence has been
controversial, have. Jackendoff posits a possessional field that is modeled on the loca-
tional field: being possessed is taken to be similar to being at a location—existing at that
location (see also Lyons 1967,1968:391-395).This approach receives support since many
entailments involving location also apply to possession, such as the entailment described
for stay. Furthermore, in some languages, including Hindi-Urdu, the same verb is used in
basic locational and possessive sentences, suggesting that possession can be reduced to
location (though the facts arc often more complicated than they appear on the surface;
sec Harlcy 2003). Nevertheless, Pinker (1989: 189-190) and Tham (2004: 62-63, 74-85,
100-104) argue that an independent predicate have is necessary; see also Harley (2003).
Pinker points out that in terms of the expression of its arguments, it is belong and not
have which resembles locational predicates.The verb belong takes the possessum, which
would be analyzed as a theme (i.e. located entity), as its subject and the possessor, which
would be analyzed as a location, as an oblique. Its argument realization, then,
parallels that of a locational predicate; compare (18a) and (18b). In contrast, have takes the
possessor as subject and the possessum as object, as in (19), so its arguments show the
reverse syntactic prominence relations - a "marked" argument realization, which would
need an explanation, on the localist analysis.
(18) a. One of the books belongs to me.
b. One of the books is on the table.
(19) I have one of the books.
Pinker points out that an analysis which takes have to be a marked possessive predicate
is incompatible with the observations that it is a high-frequency verb, which is acquired
early and unproblematically by children.Tham (2004:100-104) further points out that it
is belong that is actually the "marked" verb from other perspectives: it imposes a referen-
tiality condition on its possessum and it is used in a restricted set of information structure
contexts - all restrictions that have does not share.Taking all these observations together,
Tham concludes that have shows the unmarked realization of arguments for possessive
predicates, while belong shows a marked realization of arguments. Thus, she argues that
the semantic prominence relations in unmarked possessive and locative sentences are
quite different and, therefore, warrant positing a predicate have.
19. Lexical Conceptual Structure 431_
5. Subeventual analysis
One way in which more recent work on event structure departs from earlier work on
LCSs is that it begins to use the structure of the semantic representation itself, rather
than reference to particular predicates in this representation, in formulating
generalizations about argument realization. In so doing, this work capitalizes on the assumption,
present in some form since the generative semantics era, that predicate decompositions
may have a subeventual analysis. Thus, it recognizes a distinction between two types of
event structures: simple event structures and complex event structures, which themselves
are constituted of simple event structures. The prototypical complex event structure is
a causative event structure, in which an entity or event causes another event, though
Ramchand (2008) takes some causative events to be constituted of three subevents, an
initiating event, a process, and a result.
Beginning with the work of the generative scmanticists, the positing of a complex
event structure was supported using evidence from scope ambiguities involving various
adverbial phrases (McCawley 1968, 1971, Morgan 1969, von Stechow 1995, 1996).
Specifically, a complex event structure may afford certain adverbials, such as again, more
scope-taking options than a simple event structure, and thus adverbials may show
interpretations in sentences denoting complex events that are unavailable in those denoting
simple events. Thus, (20) shows both so-called "restitutive" and "repetitive" readings,
while (21) has only a "repetitive" reading.
(20) Dale closed the door again.
Repetitive: the action of closing the door was performed before.
Restitutive: the door was previously in the state of being closed, but there is no
presupposition that someone had previously closed the door.
(21) John kicked the door again.
Repetitive: the action of kicking the door was performed before.
The availability of two readings in (20) and one reading in (21) is explained by
attributing a complex event structure to (20) and a simple event structure to (21).
A recent line of research argues that the architecture of event structure also
matters to argument realization, thus motivating complex event structures based on
argument realization considerations. This idea is proposed by Grimshaw & Vikner (1993),
who appeal to it to explain certain restrictions on the passivization of verbs of
creation (though the pattern of acceptability that they are trying to explain turns out to
have been mischaracterized; see Macfarland 1995). This idea is further exploited by
Rappaport Hovav & Levin (1998a) and Levin & Rappaport Hovav (1999) to explain a
variety of facts related to objecthood.
In this work, the notion of event complexity gains explanatory power via the
assumption that there must be an argument in the syntax for each subevent in an event structure
(Rappaport Hovav & Levin 1998a, 2001; sec also Grimshaw & Vikner 1993 and van
Hout 1996 for similar conditions). Given this assumption, a verb with a simple event
structure may be transitive or intransitive, while a verb with a complex event structure,
432
IV. Lexical semantics
say a causative verb, must necessarily be transitive. Rappaport Hovav & Levin (1998a)
attribute the necessary transitivity of break and melt, which contrasts with the "optional"
transitivity of sweep and wipe, to this constraint; the former, as causative verbs, have a
complex event structure.
(22) a. *Blair broke/melted,
b. Blair wiped and swept.
Levin & Rappaport Hovav (1999) use the same assumption to explain why a resultative
based on an unergative verb can only predicate its result state of the subject via a "fake"
reflexive object.
(23) My neighbor talked *(herself) hoarse.
Resultatives have a complex, causative event structure, so there must be a syntactically
realized argument representing the argument of the result state subevent; in (23) it is a
reflexive pronoun as it is the subject which ends up in the result state. Levin (1999) uses
the same idea to explain why agent-act-on-patient verbs are transitive across languages,
while other two-argument verbs vary in their transitivity: only the former are required
to be transitive. As in other work on LCSs and event structure, the use of subeventual
structure is motivated by argument realization considerations.
Pustejovsky (1995) and van Hout (1996) propose an alternative perspective on event
complexity: they take telic events, rather than causative events, to be complex events. Since
most causative events are telic events, the two views of event complexity assign complex
event structures in many of the same instances. A major difference is in the treatment of
telic uses of manner of motion verbs such as Terry ran to the library and telic uses of
consumption verbs such as Kerry ate the peach; these arc considered complex predicates on the
telicity approach, but not on the causative approach. See Levin (2000) for some discussion.
The subeventual analysis also defines hierarchical relations among arguments,
allowing rules of argument realization to be formulated in terms of the geometry of the
LCS.We now discuss advantages of such a formulation over direct reference to semantic
roles.
6. LCSs and syntax
LCSs, as predicate decompositions, include the embedding of constituents, giving rise
to a hierarchical structure. This hierarchical structure, which includes the subeventual
structure discussed in section 5, allows a notion of semantic prominence to be defined,
which mirrors the notion of syntactic prominence. For instance, Wundcrlich (1997a,
1997b, 2006) introduces a notion of a-command defined over predicate decompositions,
which is an analogue of syntactic c-command. By taking advantage of the hierarchical
structure of LCSs, it becomes possible to formulate argument realization rules in terms
of the geometry of LCSs and, more importantly, to posit natural constraints on the
nature of the mapping between LCS and syntax. As discussed at length in Levin &
Rappaport Hovav (2005: chapter 5), many researchers assume that the mapping between
lexical semantics and syntax obeys a constraint of prominence preservation: relations of
semantic prominence in a semantic representation are preserved in the corresponding
19. Lexical Conceptual Structure 433
syntactic representation, so that prominence in the syntax reflects prominence in
semantics (Bouchard 1995). This idea is implicit in the many studies that use a thematic
hierarchy - a ranking of semantic roles - to guide the semantics-syntax mapping and explain
various other facets of grammatical behavior; however, most work adopting a thematic
hierarchy does not provide independent motivation for the posited role ranking.
Predicate decompositions can provide some substance to the notion of a thematic hierarchy
by correlating the position of a role in the hierarchy with the position of the argument
bearing that role in a predicate decomposition (Baker 1997, Croft 1998, Kiparsky 1997,
Wunderlich 1997a, 1997b, 2000). (There are other ways to ground the thematic hierarchy;
see article 18 (Davis) Thematic roles.)
Researchers such as Bouchard (1995), Kiparsky (1997),and Wunderlich (1997a, 1997b)
assume that predicate decompositions constitute a lexical semantic representation, but
many other researchers now assume that predicate decompositions are syntactic
representations, built from syntactic primitives and constrained by principles of syntax. This
move obviates the need for prominence preservation in the semantics-syntax mapping
since the lexical semantic representation and the syntactic representation are one. The
idea that predicate decompositions motivated by semantic considerations are
remarkably similar to syntactic structures, and thus should be taken to be syntactic structures has
previously been made explicit in the generative semantics literature (e.g. Morgan 1969).
Hale & Keyser were the first to articulate this position in the context of current syntactic
theory; their LCS-stylc syntactic structures arc called "lexical relational structures" in
some of their work (1992,1993,1997).The proposal that predicate decompositions should
be syntactically instantiated has gained currency recently and is defended or assumed in a
range of work, including Erteschik-Shir & Rapoport (2004), Mateu (2001a, 2001b),Travis
(2000), and Zubizaretta & Oh (2007), who all build directly on Hale & Keyser's work, as
well as Alexiadou, Anagnostopoulou & Schafer (2006), Harley (2003,2005), Pylkkanen
(2008), and Ramchand (2008), who calls the "lexical" part of syntax "first phase syntax".
We now review the types of arguments adduced in support of this view.
Hale & Keyser (1993) claim that their approach explains why there are few semantic
role types (although this claim is not entirely uncontroversial; see Dowty 1991 and
Kiparsky 1997). For them, the argument structure of a verb is syntactically defined and
represented. Furthermore, individual lexical categories (V,in particular) are constrained
so as to project syntactic structures using just the syntactic notions of specifier and
complement. These syntactic structures arc associated with coarse-grained semantic notions,
often corresponding to the predicates typical in standard predicate decompositions; the
positions in these structures correspond to semantic roles, just as the argument
positions in a standard predicate decomposition are said to correspond to semantic roles; see
section 4. For example, the patient of a change of state verb is the specifier of a verbal
projection in which a verb takes an adjective complement (i.e. the N position in '[v N [v V
A]]'). Since the number of possible structural relations in syntax is limited, the number of
expressible semantic roles is also limited. Furthermore, Hale & Keyser suggest that the
nature of these syntactic representations of argument structure also provide insight into
Baker's (1988:46,1997) Uniformity of Theta Role Assignment, which requires that
identical semantic relationships between items be represented by identical structural
relations between those items at d-structure. On Hale & Keyser's approach, this principle
must follow since semantic roles are defined over a hierarchical syntactic structure, with
a particular role always having the same instantiation.These ideas are developed further
434
IV. Lexical semantics
in Ramchand (2008), who assumes a slightly more articulated lexical syntactic structure
than Hale & Keyser do.
Hale & Keyser provide support for their proposal that verb meanings have internal
syntactic structure from the syntax of denominal verbs.They observe that certain
impossible types of denominal verbs parallel impossible types of noun-incorporation. In
particular, they argue that just as there is no noun incorporation of either agents or recipients
in languages with noun-incorporation (Baker 1988: 453-454, fn. 13, 1996: 291-295), so
there is no productive denominal verb formation where the base noun is interpreted
as an agent or a recipient (e.g. */ churched the money, where church is understood as a
recipient; *// cowed a calf, where cowN is understood as the agent). However, denominal
verbs arc productively formed from nouns analyzed as patients (e.g. / buttered the pan)
and containers or locations (e.g. / bottled the wine). Hale & Keyser argue that the possible
denominal verb types follow on the assumption that these putative verbs are derived
from syntactic structures in which the base noun occupies a position which reflects its
semantic role. This point is illustrated using their representations for the verbs paint and
shelve given in (24) and (25), respectively, which are presented in the linearized form
given in Wechsler (2006: 651, ex. (17)-(18)) rather than in Hale & Keyser's (1993) tree
representations.
(24) a. We painted the house.
b. Wc [v VI [vp house I, V2 [pp Pwith paint]]]].
(25) a. We shelved the books.
b. We [, VI [vp books [, V2 [pp Pon shelf]]]].
The verbs paint and shelf are derived through the movement of the base noun - the
verb's root - into the empty VI position in the structures in (b), after first merging it with
the preposition Pwit|) or Pon and then with V2. Hale & Keyser argue that the movement
of the root is subject to a general constraint on the movement of heads (Baker 1988,
Travis 1984). Likewise, putative verbs such as bush or house, used in sentences such as */
bushed a trim (with the intended interpretation 'I gave the bush a trim') or * I housed a
coat of paint (with the intended interpretation 'I gave the house a coat of paint') are also
excluded by the same constraint.
Hale & Keyser's approach is sharply criticized by Kiparsky (1997). He points out that
the syntax alone will not prevent the derivation of sentences such as */ bushed some
fertilizer from a putative source corresponding to I put some fertilizer on the bush (cf. the
source for / bottled the wine or / corralled the Zior.se). The association of a particular root
with a specific lexical syntactic structure is governed by conceptual principles such as "If
an action refers to a thing, it involves a canonical use of the thing" (Kiparsky 1997:482).
Such principles ensure that bush will not be inserted into a structure as a location, since
unlike a bottle, its canonical use is not as a container or place.
Denominal verbs in English, although they do not involve any explicit verb-forming
morphology, have been said, then, to parallel constructions in other languages which do
involve overt morphological or syntactic derivation (e.g. noun incorporation), and this
parallel has been taken to support the syntactic derivation of these words. Comparable
arguments can be made in other areas of the English lexicon. For example, in most
languages of the world, manner of motion verbs cannot on their own license a directional
19. Lexical Conceptual Structure 435
phrase, though in English all manner of motion verbs can. Specifically, sentences parallel
to the English Tracy ambled into the room are derived through a variety of morphosyn-
tactic means in other languages, including the use of directional suffixes or applicative
morphemes on manner of motion verbs and the combination of manner and directed
motion verbs in compounds or serial verb constructions (Schaefer 1985, Slobin 2004,
Talmy 1991). Japanese, for example, must compound a manner of motion verb with a
directed motion verb in order to express manner of motion to a goal, as the contrast in
(26) shows.
(26) a. ?John-wa kishi-e oyoida.
John-Top shorc-to swam
'John swam to the shore.' (intended; Yoneyama 1986: l,ex.(lb))
b. John-wa kishi-c oyoidc-itta.
John-Top shore-to swimming-went
'John swam to the shore.' (Yoneyama 1986:2, ex. (3b))
The association of manner of motion roots with a direction motion event type is
accomplished in theories such as Rappaport Hovav & Levin (1998b, 2001) and Levin &
Rappaport Hovav (1999) by processes which augment the event structure
associated with the manner of motion verbs. But such theories never make explicit exactly
where the derivation of these extended structures takes place. Ramchand (2008) and
Zubizarreta & Oh (2007) argue that since these processes are productive and their
outputs are to a large extent compositional, they should be assigned to the
"generative computational" module of the grammar, namely, syntax. Finally, as Ramchand
explicitly argues, this syntacticized approach suggests that languages that might appear
quite different, are in fact, underlyingly quite similar, once lexical syntactic structures are
considered.
Finally, we comment on the relation between structured event representations and
the lexical entries of verbs. Recognizing that many roots in English can appear as words
belonging to several lexical categories and that verbs themselves can be associated
with various event structures, Borer (2005) articulates a radically nonlexical position:
she proposes that roots are category neutral. That is, there is no association specified in
the lexicon between roots and the event structures they appear with. Erteschik-Shir &
Rapoport (2004, 2005), Levin & Rappaport Hovav (2005), Ramchand (2008), and
Rappaport Hovav & Levin (2005) all point out that even in English the flexibility of this
association is still limited by the semantics of the root. Ramchand includes in the lexical
entries of verbs the parts of the event structure that a verbal root can be associated with,
while Levin & Rappaport Hovav make use of "canonical realization rules", which pair
roots with event structures based on their ontological types, as discussed in section 3.
7. Conclusion
LCSs are a form of predicate decomposition intended to capture those facets of
verb meaning which determine grammatical behavior, particularly in the realm of
argument realization. Research on LCSs and the structured representations that are their
descendants has contributed to our understanding of the nature of verb meaning and the
436
IV. Lexical semantics
relation between verb syntax and semantics. This research has shown the importance of
semantic representations that distinguish between root and event structure, as well as the
importance of the architecture of the event structure to the determination of
grammatical behavior. Furthermore, such developments have led some researchers to propose
that representations of verb meaning should be syntactically instantiated.
This research was supported by Israel Science Foundation Grants 806-03 and 379-07
to Rappaport Hovav. We thank Scott Grimm, Marie-Catherine de Marneffe, and Tanya
Nikitina for comments on an earlier draft.
8. References
Alexiadou, Artemis, Elena Anagnostopoulou & Florian Schafcr 2006. The properties of anti-
causatives crosslinguistically. In: M. Frascarelli (ed.). Phases of Interpretation. Berlin: Mouton
deGruyter, 187-211.
Aronoff, Mark 1993. Morphology by Itself. Stems and Inflectional Classes. Cambridge, MA: The
MIT Press.
Bach, Einmon 1986.The algebra of events. Linguistics & Philosophy 9,5-16.
Baker, Mark C. 1988. Incorporation. A Theory of Grammatical Function Changing. Chicago, IL:The
University of Chicago Press.
Baker, Mark C. 1996. The Polysynthesis Parameter. New York: Oxford University Press.
Baker, Mark C. 1997. Tliematic roles and syntactic structure. In: L. Haegeman (ed.). Elements of
Grammar. Handbook of Generative Syntax. Dordrecht: Kluwer, 73-137.
Bartsch, Renate & Theo Veiineinanii 1972. The grammar of relative adjectives and comparison.
Linguistische Berichte 20.19-32.
Beavers, John 2008. Scalar complexity and the structure of events. In: J. Dolling & T Heyde-
Zybatow (eds.). Event Structures in Linguistic Form and Interpretation. Berlin: de Gruyter,
245-265.
Borer, Hagit 2005. Structuring Sense, vol. 2: The Normal Course of Events. Oxford: Oxford
University Press.
Bouchard, Denis 1995. The Semantics of Syntax. A Minimalist Approach to Grammar. Chicago, IL:
The University of Chicago Press.
Brcsnan, Joan 1982.The passive in lexical theory. In: J. Brcsnan (ed.). The Mental Representation of
Grammatical Relations. Cambridge, MA:The MIT Press, 3-86.
Carter, Richard J. 1978. Arguing for semantic representations. Recherches Linguistiques de Vin-
cennes 5-6,61-92.
Chomsky, Noam 1981. Lectures on Government and Binding. Dordrecht: Foris.
Croft, William 1998. Event structure in argument linking. In: M. Butt & W. Geuder (eds.). The
Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI Publications,
21-63.
Doron, Edit 2003. Agency and voice. The Semantics of the Semitic templates. Natural Language
Semantics 11,1-67.
Dowty. David R. 1979. Word Meaning and Montague Grammar. The Semantics of Verbs and Times
in Generative Semantics and in Montagues PTQ. Dordrecht: Reidel.
Dowty, David R. 1991.Tliematic proto-roles and argument selection. Language 67'. 547-619.
Erteschik-Shir, Nomi & Tova R. Rapoport 2004. Bare aspect. A theory of syntactic projection. In:
J. Gu<Sron & J. Lecarme (eds.). The Syntax of Time. Cambridge, MA:Tlie MIT Press, 217-234.
Erteschik-Shir. Nomi & Tova R. Rapoport 2005. Path predicates. In: N. Erteschik-Shir & T
Rapoport (eds.). The Syntax of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford:
Oxford University Press, 65-86.
19. Lexical Conceptual Structure 437
Farmer, Ann K. 1984. Modularity in Syntax. A Study of Japanese and English. Cambridge, M A: The
MIT Press.
Fillmore, Charles J. 1968. The case for case. In: E. Bach & R. T. Harms (eds.). Universals in
Linguistic Theory. New York: Holt, Rinehart & Winston. 1-88.
Fillmore. Charles J. 1970.The grammar of hitting and breaking. In: R. A. Jacobs & P. S. Rosenbaum
(eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 120-133.
Fodor, Jerry & Ernest Lepore 1999. Impossible words? Linguistic Inquiry 30,445-453.
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument
Structure. Chicago, IL:The University of Chicago Press.
Grimshaw, Jane 1990. Argument Structure. Cambridge, MA:The MIT Press.
Grimshaw, Jane 2005. Words and Structure. Stanford, CA: CSLI Publications.
Grimshaw, Jane & Sten Vikner 1993. Obligatory adjuncts and the structure of events. In: E. Reuland
& W. Abraham (eds.). Knowledge and Language, vol. 2: Lexical and Conceptual Structure.
Dordrecht: Kluwer, 143-155.
Gruber, Jeffrey S. 1965/1976. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge,
MA. Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam: North-
Holland, 1976,1-210.
Guerssel, Mohamed et al. 1985. A cross-linguistic study of transitivity alternations. In: W. H. Eilfort,
P. D. Kroeber & K. L. Peterson (eds.). Papers from the Parasession on Causatives and Agentivity.
Chicago, IL: Chicago Linguistic Society, 48-63.
Hale, Kenneth L. & Samuel J. Keyser 1987. A View from the Middle. Lexicon Project Working
Papers 10. Cambridge, MA: Center for Cognitive Science, MIT.
Hale, Kenneth L. & Samuel J. Keyser 1992. The syntactic character of thematic structure. In: I. M.
Roca (ed.). Thematic Structure. Its Role in Grammar. Berlin: Foris, 107-143.
Hale, Kenneth L. & Samuel J. Keyser 1993. On argument structure and the lexical expression of
syntactic relations. In: K. L. Hale & S. J. Keyser (eds.). The View from Building 20. Essays in
Linguistics in Honor ofSylvain Bromberger. Cambridge, MA:Thc MIT Press, 53-109.
Hale, Kenneth L. & Samuel J. Keyser 1997. On the complex nature of simple predicators. In: A.
Alsina, J. Bresnan & P. Sells (eds.). Complex Predicates. Stanford, CA: CSLI Publications, 29-65.
Hale, Kenneth L. & Samuel J. Keyser 2002. Prolegomenon to a Theory of Argument Structure.
Cambridge, MA:Thc MIT Press.
Harley, Heidi 2003. Possession and the double object construction. In: P. Pica & J. Rooryck (eds.).
Linguistic Variation Yearbook, vol. 2. Amsterdam: Benjamins, 31-70.
Harley, Heidi 2005. How do verbs get their names? Denominal verbs, manner incorporation and
the ontology of verb roots in English. In: N. Erteschik-Shir & T Rapoport (eds.). The Syntax
of Aspect. Deriving Thematic and Aspectual Interpretation. Oxford: Oxford University Press,
42-64.
Haspelmath, Martin 1993. More on the typology of inchoative/causative verb alternations. In:
B. Connie & M. Polinsky (eds.). Causatives and Transitivity. Amsterdam: Benjamins, 87-120.
Hay, Jennifer, Christopher Kennedy & Beth Levin 1999. Scalar structure underlies telicity
in 'degree achievements'. In: T. Matthews & D. Strolovich (eds.). Proceedings of Semantics and
Linguistic Theory (=SALT) IX. Ithaca, NY: Cornell University. 127-144.
van Hout, Angeliek 1996. Event Semantics of Verb Frame Alternations. A Case Study of Dutch and
Its Acquisition. Doctoral dissertation. Katholieke Universiteit Brabant.Tilburg.
Jackeiidoff, Ray S. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA:The MIT
Press.
Jackeiidoff, Ray S. 1976. Toward an explanatory semantic representation. Linguistic Inquiry 7,
89-150.
Jackeiidoff, Ray S. 1983. Semantics and Cognition. Cambridge, MA:The MIT Press.
Jackendoff, Ray S. 1987. The status of thematic relations in linguistic theory. Linguistic Inquiry 18,
369^11.
Jackeiidoff, Ray S. 1990. Semantic Structures. Cambridge, MA:The MIT Press.
438
IV. Lexical semantics
Jackendoff, Ray S. 1996a. Conceptual semantics and cognitive linguistics. Cognitive Linguistics 7,
93-129.
Jackendoff, Ray S. 1996b.The proper treatment of measuring out, telicity, and perhaps even
quantification in English. Natural Language and Linguistic Theory 14,305-354.
Joshi, Aravind 1974. Factorization of verbs. In: C. H. Heidi ich (ed.). Semantics and Communication.
Amsterdam: North-Holland, 251-283.
Kennedy, Christopher 1999. Projecting the Adjective. The Syntax and Semantics of Gradability and
Comparison. New York: Garland.
Kennedy, Christopher 2001. Polar opposition and the ontology of 'degrees'. Linguistics &
Philosophy 24,33-70.
Kennedy, Christopher & Beth Levin 2008. Measure of change. The adjectival core of degree
achievements. In: L. McNally & C. Kennedy (eds.). Adjectives and Adverbs. Syntax, Semantics,
and Discourse. Oxford: Oxford University Press, 156-182.
Kiparsky, Paul 1997. Remarks on denominal verbs. In: A. Alsina, J. Bresnan & P. Sells (eds.).
Complex Predicates. Stanford, CA: CSLI Publications, 473^99.
Koontz-Garbodcn, Andrew 2007. States, Changes of State, and the Monotonicity Hypothesis. Ph.D.
dissertation. Stanford University, Stanford, CA.
Krifka, Manfred 1998. The origins of telicity. In: S. Rothstein (ed.). Events and Grammar.
Dordrecht: Kluwer, 197-235.
Lakoff, George 1968. Some verbs of change and causation. In: S. Kuno (ed.). Mathematical
Linguistics and Automatic Translation. Report NSF-20. Cambridge, MA: Aiken Computation
Laboratory, Harvard University.
Lakoff, George 1970. Irregularity in Syntax. New York: Holt, Rinehart & Winston.
Laughren, Mary 1988. Towards a lexical representation of Warlpiri verbs. In: W. Wilkins (ed.).
Syntax and Semantics 21: Thematic Relations. New York: Academic Press, 215-242.
Levin, Beth 1999. Objecthood. An event structure perspective. In: S. Billings, J. Boyle & A. Griffith
(eds.). Proceedings of the Chicago Linguistic Society (=CLS) 35, Part I: Papers from the Main
Session. Chicago, IL: Chicago Linguistic Society, 223-247.
Levin, Beth 2000. Aspect, lexical semantic representation, and argument expression. In: L. J.
Conathan et al. (eds.). Proceedings of the 26th Annual Meeting of the Berkeley Linguistics
Society (=BLS). General Session and Parasession on Aspect. Berkeley, CA: Berkeley Linguistic
Society, 413-429.
Levin, Beth & Malka Rappaport Hovav 1991. Wiping the slate clean. A lexical semantic
exploration. Cognition 41,123-151.
Levin, Beth & Malka Rappaport Hovav 1995. Unaccusativity. At the Syntax-Lexical Semantics
Interface. Cambridge, MA:The MIT Press.
Levin, Beth & Malka Rappaport Hovav 1999. Two structures for compositionally derived events.
In:T. Matthews & D. Strolovich (eds.). Proceedings of Semantics and Linguistic Theory (=SALT)
IX. Ithaca. NY: Cornell University. 199-223.
Levin, Beth & Malka Rappaport Hovav 2005. Argument Realization. Cambridge: Cambridge
University Press.
Lyons, John 1967. A note on possessive, existential and locative sentences. Foundations of Language
3,390-396.
Lyons, John 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Macfarland.Talke 1995. Cognate Objects and the Argument/Adjunct Distinction in English. Ph.D.
dissertation. Northwestern University, Evanston, IL.
Mai antz, Alec P. 1984. On the Nature of Grammatical Relations. Cambridge, MA:Thc MIT Press.
Mateu, Jaume 2001a. Small clause results revisited. In: N. Zhang (ed.). Syntax of Predication. ZAS
Papers in Linguistics 26. Berlin: Zentrum fur Allgeineine Sprachwissenschaft.
Mateu, Jauine 2001b. Unselected objects. In: N. Dehe' & A. Wanner (eds.). Structural Aspects of
Semantically Complex Verbs. Frankfurt/M.: Lang, 83-104.
19. Lexical Conceptual Structure 439
McCawley, James D. 1968. Lexical insertion in a transformational grammar without deep structure.
In: B. J. Dai den. C.-J. N. Bailey & A. Davison (eds.). Papers from the Fourth Regional Meeting of
the Chicago Linguistic Society (=CLS). Chicago, IL: Chicago Linguistic Society, 71-80.
McCawley, James D. 1971. Prelexical syntax. In: R. J. O'Brien (ed.). Report of the 22nd Annual
Roundtable Meeting on Linguistics and Language Studies. Washington, DC: Georgetown
University Press, 19-33.
McClure, William T. 1994. Syntactic Projections of the Semantics of Aspect. Ph.D. dissertation.
Cornell University, Ithaca, NY.
Morgan, Jerry L. 1969. On arguing about semantics. Papers in Linguistics 1,49-70.
Nida. Eugene A. 1975. Componential Analysis of Meaning. An Introduction to Semantic Structures.
The Hague: Mouton.
Parsons, Terence 1990. Events in the Semantics of English. Cambridge, MA: The MIT Press.
Pesetsky, David M. 1982. Paths and Categories. Ph.D. dissertation. MIT. Cambridge, MA.
Pesetsky, David M. 1995. Zero Syntax. Experiencers and Cascades. Cambridge, MA:Thc MIT Press.
Pinker, Steven 1989. Learnability and Cognition. The Acquisition of Argument Structure.
Cambridge, MA:The MIT Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA:The MIT Press.
Pylkkanen, Liina 2008. Introducing Arguments. Cambridge, MA:The MIT Press.
Ramchand, Gillian C. 1997. Aspect and Predication. The Semantics of Argument Structure. Oxford:
Clarendon Press.
Ramchand, Gillian C. 2008. Verb Meaning and the Lexicon. A First-Phase Syntax. Cambridge:
Cambridge University Press.
Rappaport Hovav, Malka 2008. Lexicalized meaning and the internal temporal structure of events.
In: S. Rothstein (ed.). Theoretical and Crosslinguistic Approaches to the Semantics of Aspect.
Amsterdam: Benjamins. 13-42.
Rappaport Hovav. Malka & Beth Levin 1998a. Morphology and lexical semantics. In: A. Spencer &
A. Zwicky (eds.). The Handbook of Morphology. Oxford: Blackwell, 248-271.
Rappaport Hovav, Malka & Beth Levin 1998b. Building verb meanings. In: M. Butt & W. Geuder
(eds.). The Projection of Arguments. Lexical and Compositional Factors. Stanford, CA: CSLI
Publications, 97-134.
Rappaport Hovav, Malka & Beth Levin 2001. An event structure account of English rcsultativcs.
Language 11,766-797.
Rappaport Hovav, Malka & Beth Levin 2005. Change of state verbs. Implications for theories of
argument projection. In: N. Erteschik-Shir & T Rapoport (eds.). The Syntax of Aspect. Deriving
Thematic and Aspectual Interpretation. Oxford: Oxford University Press, 276-286.
Rappaport Hovav, Malka & Beth Levin 2008. The English dative alternation. The case for verb
sensitivity. Journal of Linguistics 44,129-167.
Rappaport Hovav, Malka & Beth Levin 2010. Reflections on manner/result complementarity. In:
E. Doron, M. Rappaport Hovav & I. Sichel (eds.). Syntax, Lexical Semantics, and Event
Structure. Oxford: Oxford University Press 5,21-38.
Rappaport, Malka, Beth Levin & Mary Laughren 1988. Niveaux de representation lexicale. Lex-
ique 7,13-32. English translation in: J. Pustejovsky (ed.). Semantics and the Lexicon. Dordrecht:
Kluwer, 1993,37-54.
Ross, John Robert 1972. Act. In: D. Davidson & G. Harman (eds.). Semantics of Natural Language.
Dordrecht: Reide 1,70-126.
Rothstein, Susan 2003. Structuring Events. A Study in the Semantics of Aspect. Oxford: Blackwell.
Schacfcr, Ronald P. (1985) Motion in Tswana and its characteristic lexicalization. Studies in African
Linguistics 16,57-87.
Slobin, Dan I. 2004. The many ways to search for a frog. Linguistic typology and the expression
of motion events. In: S. Stromqvist & L. Veiiioeven (eds.). Relating Events in Narrative, vol. 2:
Typological and Contextual Perspectives. Mahwah, NJ: Erlbaum, 219-257.
440
IV. Lexical semantics
von Stechow, Arnim 1995. Lexical decomposition in syntax. In: U. Egli et al. (eds.). Lexical
Knowledge in the Organization of Language. Amsterdam: Benjamins, 81-118.
von Stechow, Arnim 1996. The different readings of w/eder. A structural account. Journal of
Semantics 13,87-138.
Stowell,Timothy 1981. Origins of Phrase Structure. Ph.D. dissertation. MIT, Cambridge, MA.
Taliny, Leonard 1975. Semantics and syntax of motion. In: J. P. Kimball (ed.). Syntax and Semantics
4. New York: Academic Press, 181-238.
Taliny, Leonard 1985. Lexicalization patterns. Semantic structure in lexical forms. In: T. Shopen
(ed.). Language Typology and Syntactic Description, vol. 3: Grammatical Categories and the
Lexicon. Cambridge: Cambridge University Press, 57-149.
Talmy, Leonard 1991. Path to realization. A typology of event conflation. In: L. A. Sutton, C. Johnson
& R. Shields (eds.). Proceedings of the I7th Annual Meeting of the Berkeley Linguistics Society
(=BLS). General Session and Parasession. Berkeley, CA: Berkeley Linguistic Society, 480-519.
Talmy, Leonard 2000. Toward a Cognitive Semantics, vol. 2: Typology and Process in Concept
Structuring. Cambridge, MA:The MIT Press.
Taylor, John R. 1996. On running and jogging. Cognitive Linguistics 7,21-34.
Tenny, Carol L. 1994. Aspectual Roles and the Syntax-Semantics Interface. Dordrecht: Kluwer.
Tham, Shiao Wei 2004. Representing Possessive Predication. Semantic Dimensions and Pragmatic
Bases. Ph.D. dissertation. Stanford University, Stanford, CA.
Travis, Lisa D. 1984. Parameters and Effects of Word Order Variation. Ph.D. dissertation. MIT,
Cambridge, MA.
Travis, Lisa D. 2000. The 1-syntax/s-syntax boundary. Evidence from Austronesian. In: I. Paul, V.
Phillips & L. Travis (eds.). Formal Issues In Austronesian Linguistics. Dordrecht: Kluwer,
167-194.
van Valin, Robert D. 1993. A synopsis of role and reference grammar. In: R. D. van Valin (ed.).
Advances in Role and Reference Grammar. Amsterdam: Benjamins, 1-164.
van Valin, Robert D. 2005. Exploring the Syntax-Semantics Interface. Cambridge: Cambridge
University Press.
van Valin, Robert D. & Randy J. LaPolla 1997. Syntax. Structure, Meaning, and Function.
Cambridge: Cambridge University Press.
Wcchslcr, Stephen 2006. Thematic structure. In: K. Brown (ed.). Encyclopedia of Language and
Linguistics. 2nd edn. Amsterdam: Elsevier, 645-653.1st edn. 1994.
Wilks, Yorick 1987. Primitives. In: S. C. Shapiro (ed.). Encyclopedia of Artificial Intelligence. New
York: Wiley, 759-761.
Williams, Edwin 1981. Argument structure and morphology. The Linguistic Review 1,81-114.
Wunderlich, Dieter 1997a. Argument extension by lexical adjunction. Journal of Semantics 14,
95-142.
Wunderlich, Dieter 1997b. cause and the structure of verbs. Linguistic Inquiry 28,27-68.
Wunderlich, Dieter 2000. Predicate composition and argument extension as general options. A
study in the interface of semantic and conceptual structure. In: B. Stiebels & D. Wunderlich
(eds.). Lexicon in Focus. Berlin: Akademie Verlag, 247-270.
Wunderlich, Dieter 2006. Argument hierarchy and other factors determining argument
realization. In: I. Bornkessel et al. (eds.). Semantic Role Universals and Argument Linking. Theoretical,
Typological, and Psycholinguistic Perspectives. Berlin: Mouton de Gruyter, 15-52.
Yoneyaina, Mitsuaki 1986. Motion verbs in conceptual semantics. Bulletin of the Faculty of
Humanities 22. Tokyo: Scikci University, 1-15.
Zubizarrcta, Maria Luisa & Eunjcong Oh 2007. On the Syntactic Composition of Manner and
Motion. Cambridge, MA:The MIT Press.
Beth Levin, Stanford, CA (USA)
Malka Rappaport Hovav, Jerusalem (Israel)
20. Idioms and collocations
441
20. Idioms and collocations
1. Introduction: Collocation, collocations, and idioms
2. Lexical properties of idioms
3. Compositionality
4. How frozen arc idioms?
5. Syntactic flexibility
6. Morphosyntax
7. Idioms as constructions
8. Diachronic changes
9. Idioms in the lexicon
10. Idioms in the mental lexicon
11. Summary and conclusion
12. References
Abstract
Idioms constitute a subclass of multi-word units that exhibit strong collocational
preferences and whose meanings are at least partially non-compositional. Drawing on English
and German corpus data, we discuss a number of lexical, syntactic, morphological and
semantic properties of Verb Phrase idioms that distinguish them from freely composed
phrases. The classic view of idioms as "long words" admits of little or no variation of a
canonical form. Fixedness is thought to reflect semantic non-compositionality: the
nonavailability of semantic interpretation for some or all idiom constituents and the
impossibility to parse syntactically ill-formed idioms block regular grammatical operations.
However, corpus data testify to a wide range of discourse-sensitive flexibility and
variation, weakening the categorical distinction between idioms and freely composed phrases.
We cite data indicating that idioms are subject to the same diachronic developments
as simple lexemes. Finally, we give a brief overview of psycholinguistic research into
the processing of idioms and attempts to determine their representation in the mental
lexicon.
1. Introduction: Collocation, collocations and idioms
Words in text and speech do not co-occur freely but follow rules and patterns. We draw
an initial three-fold distinction for recurrent lexical co-occurrences: collocation,
patterns of words found in close neighborhood, and collocations and idioms, multi-word
units with lexical status that are often distinguished in terms of their fixedness and
semantic non-compositionality. The remainder of the paper will focus on idioms, in
particular verb phrase idioms. We describe their syntactic, morphosyntactic and lexical
properties with respect to frozenness and variation. The data examined here show a
sliding scale of fixedness and a concomitant range of semantic compositionality. We
next consider the diachronic behavior of idioms, which does not seem to differ from
that of simple lexemes. Finally, several theories concerning the mental representation
of idioms are discussed.
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 441-456
442
IV. Lexical semantics
1.1. Collocation: A statistical property of language
Collocation, the co-occurrence patterns of words observed in spoken and written
language, is constrained by syntactic (grammatical), semantic, and lexical properties of
words. At each level, linguists have attempted to formulate rules and constraints for
co-occurrence.
The syntactic constraints on lexical co-occurrcncc arc specific to a word but
constrained by its syntactic class membership. For example, the adjective prone, unlike
hungry and tired, subcategorizes for a Prepositional Phrase headed by to; moreover,
hungry and tired, but not prone, can occur pre-nominally. Firth (1957) called the syntactic
constraints on a word's selection of neighboring words "colligation".
Firth moreover recognized that words display collocational properties beyond those
imposed by syntax. He coined the term "collocation" for the "habitual or customary
places" of a word. Firth's statement that "you can recognize a word by the company it
keeps" has been given scientific validation by corpus studies establishing lexical profiles
for words, which reflect their regular attraction to other words. Co-occurrence properties
of individual lexical items can be expressed quantitatively (Halliday 1966, Sinclair 1991,
Stubbs 2001, Krenn 2000, Evert 2005, inter alia).
Church & Hanks (1990), and Church et al. (1991) proposed Mutual Information as
a measure of the tendency of word forms to select or deselect one another in a context.
Mutual Information compares the probability of two lexemes' co-occurrence with the
probability that they occur alone and that their co-occurrence is merely chance.
Mutual Information is a powerful tool for measuring the degree of fixedness, and thus
lexical status, of word groups. Highly fixed expressions that have been identified
statistically are candidates for inclusion in the lexicon, regardless of their semantic
transparency. Fixed expressions are also likely to be stored as units in speakers' mental lexicons,
and retrieved as units rather than composed anew each time. Psycholinguists are aware
of people's tendency to chunk (Miller 1956).
Statistical analyses can also quantify the flexibility of fixed expressions, which are
rarely complclcty frozen (sec sections 3 and 4). Fazly & Stevenson (2006) propose a
method for the automatic discovery of the set of syntactic variations that VP idioms can
undergo and that should be included in their lexical representation. Fazly & Stevenson
further incorporate this information into statistical measures that effectively predict the
idiomaticity level of a given expression. Others have measured the distributional
similarity between the expression and its constituents (McCarthy, Keller & Carroll 2003,
Baldwin et al. 2003).
The collocational properties of a language are perhaps best revealed in the errors
committed by learners and non-native speakers. A fluent speaker may have sufficient
command of the language to find the words that express the intended concept.
However, only considerable competence allows him to select that word from among several
near-synonyms that is favored by its neighbors; the subtlety of lexical preference appears
unmotivated and arbitrary.
1.2. Collocations: Phrasal lexical items
We distinguish collocation, the linguistic and statistically measurable phenomenon of
co-occurrcncc, from collocations, specific instances of collocation that have lexical status.
20. Idioms and collocations
443
Collocations like beach house, eat up, and blood-red are mappings of word forms and
word meanings, much like simple lexemes such as house and eat and red. Collocations are
"prc-fabricatcd" combinations of simple lexemes.
Jackendoff (1997) collected multi-word units (MWUs) from the popular U.S.
television show "Wheel of Fortune" and showed the high frequency of collocations and
phrases, both in terms of types and tokens. He estimates that MWUs constitute at least
half the entire lexicon; Cowie (1998) reports the percentage of verb phrase idioms and
collocations in news stories and feature articles to be around forty percent. Such figures
make it clear that MWUs constitute a significant part of the lexicon and collocating is a
pervasive aspect of linguistic behavior.
Collocations include a wide range of MWUs. A syntax-based typology (e.g., Moon
1998) distinguishes sentential proverbs {the early bird gets the worm), routine formulae
{thanks a lot), and adjective phrases expressing similes {sharp as a whistle). Frames or
constructions (Fillmore, Kay & O'Connor 1988) consist of closed-class words with slots
for open-class words, such as what's X doing y? and the X-er the Y-er, these frames carry
meaning independent of the particular lexemes that fill the open slots. A large class of
collocations are support (or "light") verb constructions, where a noun collocates with a
specific verbal head such as make a decision and take a photograph (Starrer 2007).
Under a semantically-based classification, phrasal lexemes form a continuous scale of
fixedness and semantic transparency. Following Moon (1998), we say that collocations
are decomposable multi-word units that often follow the paradigmatic patterns of free
language. For example, a number of German verb phrase collocations show the caus-
ative/inchoative/stative alternation pattern, e.g., in Rage bringen/kommen/sein (make/
become/be enraged).
1.3. Idioms
Idioms like hit the ceiling, lose one's head, and when the cows come home, constitute
another class of MWUs with lexical status. They differ from the kinds of collocations
discussed in the previous section in several respects.
To begin with, the lexical make-up of idioms is usually unpredictable and often highly
idiosyncratic, violating the usual rules of selectional restrictions. Examples are English
rain cats and dogs and talk turkey, and German unter dem Pantoffel stehen (lit., 'stand
under the slipper', be dominated). Some idioms have a possible literal (non-idiomatic)
interpretation, such as drag one's feet and not move a finger.
Certain idioms are syntactically ill-formed, such as English trip the light fantastic and
German an jemandem einen Narren gefressen haben (lit.'have eaten a fool on/at some-
body'.be infatuated with somebody),or Bauklotzestaunen (lit.'be astonished toy blocks',
be very astonished), which violate the verb's subcategorization properties. Others, like by
and large, do not constitute syntactic categories.
Perhaps the most characteristic feature ascribed to idioms is their semantic non-
compositionality. Because the meaning of idioms is not made up by the meanings of their
constituents (or only to a limited extent), their meanings are considered largely opaque.
A common argument says that if the components of idioms are semantically non-
transparent to speakers, they are not available for the kinds of grammatical operations
found in free, literal, language. As a result, idioms are often considered fixed, "frozen"
expressions, or "long words" (Swinncy & Cutler 1979, Bobrow & Bell 1973).
444
IV. Lexical semantics
We will consider each of these properties attributed to idioms, focusing largely on
verb phrase (VP) idioms with a verbal head that select for at least one noun phrase or
prepositional phrase complement. Such idioms have generated much discussion, focused
mostly on putative constraints on their syntactic frozenness. VP idioms were also found
to be the most frequent type of MWUs in the British Hector Corpus (Moon 1998).
2. Lexical properties of idioms
Idioms are perhaps most recognizable by their lexical make-up. Selectional restrictions
are frequently violated, and the idiom constituents tend not to be semantically related to
words in the surrounding context, thus signaling the need for a non-literal interpretation.
But there are many syntactically and semantically well-formed strings that are polyse-
mous between an idiomatic and a literal, compositional meaning, such as play first fiddle
and fall off the wagon. Here, context determines which meaning is likely the intended
one. Of course, the ambiguity between literal and idiomatic reading is related to the
plausibility of the literal reading. Thus, pull yourself up by your bootstraps, give/lend
somebody a hand, and German mil der Kirche urns Dorf fahren (lit.'drive with the church
around the village', deal with an issue in an overly complicated or laborious manner)
denote highly implausible events in their literal readings.
2.1. Polarity
Many idioms are negative polarity items: not have a leg to stand on, not give somebody
the time of day, not lift a finger, be neither fish nor fowl, no love lost, not give a damn/
hoot/shit. Without the negation, the idiomatic reading is lost; in specific contexts, it may
be preserved in the presence of a marked stress pattern. Corpus data show that many
idioms do not require a fixed negation component but can occur in other negative
environments (questions, conditionals, etc.), just like their non-idiomatic counterparts (Sohn
2006, Sailer & Richter 2002, Stantcheva 2006 for German).
Idioms as negative polarity items occur crosslinguistically, and negative polarity may
be one universal hallmark of idiomatic language.
2.2. Idiom-specific lexemes
Many idioms contain lexemes that do not occur outside their idiomatic contexts.
Examples are English thinking cap (in put on one's thinking cap), humble pie (in eat humble
pie), and German Hungertuch {am Hungertuch nagen; lit.'gnaw at the hunger-cloth', be
destitute). Similarly, the noun Schlafittchen in the idiom am Schlafittchen packen, (lit.
'grab someone by the wing', confront or catch somebody), is an obsolete word referring
to "wing" and its meaning is opaque to contemporary speakers.
Other idiom components have meaning outside their idiomatic context but may occur
rarely in free language {gift horse in don't look a gift horse in the mouth), cf. article 80
(Olsen) Semantics of compounds.
2.3. Non-referential pronouns
Another lexical peculiarity of many English idioms is the obligatory presence of a non-
referential it: have it coming, give it a shot, lose it, have it in for somebody, etc. These
20. Idioms and collocations
445
pronouns are a fixed component of the idioms and no number or gender variation can
occur without loss of the idiomatic reading.They do not carry meaning, though they may
have referred at an earlier stage before the idiom became fixed. Substituting a context-
appropriate noun does not seem felicitous, either: have the punishment coming, give the
problem a shot etc. seem odd at best. (A somewhat similar phenomenon are a number
of German idioms containing eins, einen (lit.'one'): sich eins lachen, einen draufmachen,
'have a good laugh/have a night on the town.')
3. Compositionality
Idioms are often referred to as non-compositional, i.e., their meaning is not composed
of the meanings of their constituents, as is the case in freely composed language. In fact,
idioms vary with respect to the degree of non-compositionality. One way to measure the
extent to which an idiom violates the compositional norms of the language is to examine
and measure statistically the collocational properties of its constituents.The verb fressen
(eat like an animal) co-occurs with nouns that fill the roles of the eating and the eaten
entities; among the latter Narr ("fool") will stand out as not belonging into the class of
entities included in the verb's selectional preferences; a thesaurus or WordNet (Fellbaum
1998) could firm up this intuition.
In highly opaque idioms, none of the constituents can be mapped onto a referent, as
is the case with and jemandem einen Narren gefressen haben and the much-cited kick the
bucket. The literal equivalents of these verbs do not select for arguments that the idiom
constituents could be mapped to in a one-to-one fashion so as to allow a "translation".
An idiom's lexeme becomes obsolete outside its idiomatic use, with the result that
meaning becomes opaque. For example, Fittiche in the idiom unter die Fittiche nehmen
seems not to be interpreted as "wings" by some contemporary speakers, as attested by
corpus examples like unter der/seiner Fittiche, where the morphology shows that the
speaker analyzed the noun as a feminine singular rather than the plural and possibly it
assigned a new meaning.
Lexical substitutions and adjectival modifications show that speakers often assign a
meaning to one or more constituents of an idiom (see section Lexical variation), though
such variations from the "canonical forms" tend to be idiosyncratic and are often specific
to the discourse in which they are found.
Many idioms are partially decomposable; they may contain one or more lexemes with
a conventional meaning or a metaphor.
3.1. Metaphors
Many idioms containt metaphors, and the notion of metaphor is closely linked to idioms;
cf. article 26 (Tyler & Takahashi) Metaphors and metonymies. We use the term here to
denote a single lexeme (most often a noun or noun phrase) with a figurative meaning.
For example, jungle is often used as a metaphor for a complex, messy, competitive, and
perhaps dangerous situation; this use of jungle derives from its literal meaning,"impen-
ctrablc forest" and preserves some of the salient properties or the original concept
(Glucksberg 2001). A metaphor like jungle may become conventionalized and be readily
interpreted in the appropriate contexts as referring to a competitive situation. Thus, the
expression it's a jungle out here is readily understandable due to the interpretability of
446
IV. Lexical semantics
the noun. Idioms that contain conventional or transparent metaphors are thus at least
partly compositional.
Many idioms arguably contain metaphors, though their use is bound to a particular
idioms. For example, nettle in grasp the nettle and bull in take the bull by the horns can be
readily interpreted as referring to specific entities in the context where the idioms are used.
But such metaphors do not work outside the idioms; this assignment involves numerous
nettles/bulls would be difficult to interpret, despite the fact that nettle and bull seem like
appropriate expressions referring to a difficult problem or a challenge, respectively, and
could thus be considered "motivated".
Arguments have been made for a cognitive, culturally-embedded basis of
metaphors and idioms containing metaphors that shapes their semantic structure and
makes motivation possible (Lakoff & Johnson 1980, Lakoff 1987, Gibbs & Steen
1999, Dobrovol'skij 2004, inter alia). Indeed, certain conceptual metaphors are cross-
linguistically highly prevalent and productive. For example, the "time is money"
metaphor is reflected in the many possession verbs (have, give, take, buy, cost) that take time
as an argument.
However, Burger (2007) points out that not all idioms are metaphorical and that
not all metaphorical idioms can be traced back to general conceptual metaphors. For
example, in the idiom pull strings, strings does not encode an established metaphor that is
readily understandable outside the idiom and that is exploited in related idioms. Indeed,
a broad claim to the universality and cultural independence of metaphor seems
untenable. The English idiom pull strings structurally and lexically resembles the German
idiom die Drdhte Ziehen (lit.:'pull the wires'), but the German idiom does not mean "to
use one's influence to one's favor", but rather "to mastermind".
Testing the conceptual metaphor hypothesis, Keysar & Bly (1995) asked speakers
guess the meanings of unfamiliar idioms involving such salient concepts as "high" and
"up", which many cognitivists claim to be universally associated with increased quality
and quantity. Subjects could not guess the meanings of many of the phrases, casting
doubt on the power and absolute universality of conceptual metaphors.
4. How frozen are idioms?
A core property associated with idioms is lexical, morphosyntactic, and syntactic
fixedness, which is said to be a direct consequence of semantic opacity. Many grammatical
frameworks, beginning with early Transformational-Generative Grammar, explicitly
classify idioms as "long words" that are exempt from regular and productive rules, much
like strong verbs.
Lexicographic resources implicitly reinforce the view of idioms as fixed strings by
listing idioms in morphologically unmarked citation forms, or lemmata, usually based
of the lexicographer's intuition. And experimental psycholinguists studying idiom
processing commonly use only such citation forms.
The few much-cited classic examples, including kick the bucket and trip the light
fantastic, served well to exemplify the properties ascribed in a wholesale fashion to all idioms.
But soon linguists began to note that not all idioms were alike. Fraser (1970)
distinguished several degrees of syntactic frozenness among idioms. Where a mapping
can be made from the idiom's constituents to those of its literal counterpart, syntactic
20. Idioms and collocations
447
flexibility is possible; simply put, the degree of semantic opacity is reflected in the degree
of syntactic frozenness.
Echoing Weinreich (1969),Fraser states that there is no idiom that is completely free.
In particular, he rules out focusing operations like clefting and topicalization, as these are
conditional on the focused constituent bearing a meaning. Citing constructed data, Cruse
(2004) also categorically rules out cleft constructions for idioms.
Syntactic and lexical frozenness is commonly ascribed to the semantic non-
compositionality of idioms. Indeed, it seems reasonable to assume that semantically
unanalyzable strings cannot be subject to regular grammatical processes as syntactic
variations typically serve to (de-)focus or modify particular constituents; when these
carry no meanings, such operations arc unmotivated.
But corpus studies show that a "standard", fixed form cannot be determined for many
idioms, at least not on a quantitative basis. Speakers use many idioms in ways that deviate
from that given in dictionary in more than one way;some idioms exhibit the same degree
of freedom as freely composed strings.
We rely in this article on corpus examples to illustrate idiom variations. The English
data come from Moon's (1998) extensive analysis of the Hector Corpus. The German
data are based on the in-depth analysis of 1,000 German idioms in a one billion word
corpus of German (Fellbaum 2006,2007b). We are not aware of similar large-scale corpus
analyses in other languages and therefore the data in this article will be somewhat biased
towards English and German.
5. Syntactic flexibility
As a wider range of idioms were considered, linguists realized that syntactic flexibility
was not an exception.The influential paper by Nunberg, Sag & Wasow (1994) argued for
the systematic relation between semantic compositionality and flexibility.
Both Abeille (1995), who based her analysis on a number of French idioms, and
Dobrovol'skij (1999) situate idioms on a continuum of flexibility that interacts with
semantic transparency. But while it seems true that constituents that are assigned a
meaning by speakers are open to grammatical operations as well as modification and
even lexical substitution,semantic transparency is not a requirement for variation.
We consider as an example the German idiom kein Blatt vor den Mund nehmen (lit.
'take no leaf/sheet in front of one's mouth', be outspoken, speak one's mind). Blatt (leaf,
sheet) has no obvious referent. Yet numerous attested corpus examples are found where
Blatt is passivized, topicalized, relativized and pronominalized; moreover, this idiom
need not always appear as a Negative Polarity Item:
(1) Bci BMW wird kein Blatt vor den Mund genommen.
(2) Ein Blatt habe er nie vor den Mund genommen.
(3) Das Blatt, das Eva vor ihr erregendes Geheimnis gehalten, ich nahme es nicht
einmal vor den Mund.
(4) Ein Regierungssprecher ist ein Mann, der sich 100 Blatter vor den Mund nimmt.
448
IV. Lexical semantics
Even "cran-morphemes", i.e., words that do not usually occur outside the idiom, can
behave like free lexemes. For example, Fettnapfchen rarely occurs outside the idiom ins
Fettnapfchen treten (lit. 'step into the little grease pot', commit a social gaffe), and no
meaning can be attached to it by a contemporary speaker that would map to the meaning
of a constituent in the literal equivalent. Yet the corpus shows numerous uses of this
idiom where the noun is relativied, topicalized, quantified, and modified.
(5) Das Fettnapfchen, in das die Frau ihres jiingsten Sohnes gestiegen ist, ist aber auch
riesig.
(6) Ins Fettnapfchen trete ich bestimmt mal und das ist gut so.
(7) Silvio Berlusconi: Ein Mann, viele Fettnapfchen.
(8) Immer trat der New Yorker ins bereitstehende Fettnapfchen.
Syntactic variation is probably due to speakers' ad-hoc assignment of meanings of idiom
components. Adjectival modification and lexical variations in particular indicate that
the semantic interpretation of inherently opaque constituents arc dependent on the
particular context in which the idiom is embedded.
For more examples and discussion see Moon (1998) and Fellbaum (2006,2007b).
6. Morphosyntax
The apparent fixedness of many idioms extends beyond constituent order and lexical
selection to their morphosyntactic make-up.
6.1. Modality, tense, aspect
The idiomatic reading of many strings requires a specific modality, tense, or aspect. Thus
horses wouldn 't get (NP) to (VP) is at best odd without the conditional: horses don't get
(NP) to (VP). Similarly, the meaning of / couldn't care less is not preserved in / could
care less. Interestingly, the negation in the idiom is often omitted even when the speaker
does not intend a change of polarity, as in / could care less; perhaps speakers are uneasy
about the double negation in such cases but do not decompose the idiom to realize that a
single negation changes the meaning. Another English idiom requiring a specific modal
is will not/won't hear of it.
The German idioms in den Tuschkasten gefallen sein (lit. 'have fallen into the paint
box', be overly made up) and nicht auf den Kopf gefallen sein (lit. 'not have fallen on
one's head', be rather intelligent) denote states and cannot be interpreted as idioms in
any tense other than the perfect (Fellbaum 2007a).
6.2. Determiner and number
The noun in many VP idioms is preceded by a definite determiner even though it lacks
definiteness and, frequently, reference: kick the bucket, fall off the wagon, buy the farm.
The determiner here may be a relic from the time when transparent phrases became
lexicalized as idioms.
20. Idioms and collocations
449
A regular and frequent grammatical process in German is the contraction of the
determiner with a preceding preposition.Thus, the idiom jemanden hinters Licht fiihren
(lit. 'lead someone behind the light', deceive someone), occurs most frequently with
hinter (behind) and das (the) contracted to hinters.
Eisenberg (1999, inter alia) assert that the figurative reading of many German idioms
precludes a change in the noun's determiner. This hypothesis seems appealing: if the
noun phrase is semantically opaque, the choice of determiner does not follow the usual
rules of grammar and is therefore arbitrary. As speakers cannot interpret the noun, the
determiner is not subject to grammatical processes. Decontraction in cases like hinters
Licht fiihren is thus ruled out by Eisenberg, as is contraction in cases where the idiom's
citation form found in dictionaries shows the preposition separated from the determiner.
However, Firenze (2007) cites numerous corpus examples of idioms with
decontraction. In some cases, the decontraction is conditioned by the insertion of lexical material
such as adjectives between the preposition and the determiner or number variation of
the noun; in other cases, contracted and decontracted forms alternate freely.
Such data show that the NP is subject to the regular grammatical processes of free
language, even when its meaning is not transparent: Licht in the idiom has no referent,
like wool in the corresponding English idiom pull the wool over someone's eyes.
Corpus data also show number variations for the nouns. For example, in den Bock
zum Gartner machen (lit. 'make the buck the gardener', put an incompetent person in
charge), both nouns occur predominantly in the singular. But wc find examples with
plural nouns, Bocke zu Gdrtnern machen, where the speaker refers to several specific
incompetent persons.
6.3. Lexical variation
Variation is frequently attested for the nominal, verbal, and adjectival components of
VP idioms. In most cases, this variation is context-dependent, and shows conscious
playfulness on the part of the speaker or writer.
Fellbaum & Stathi (2006) discuss three principal cases of lexical variation:
paradigmatic variation, adjectival modification, and compounding. In each case, the variation
plays on the literal interpretation of the varied component. Paradigmatic substitution has
occurred in the case where verb in jemandem die Leviten lesen (lit.,'read the Levitus to
someone', read someone the riot act) has been replaced by quaken (croak like a frog) and
briillen (scream). An adjective has been added to the noun in an economics text in die
marktwirtschaftlichen Leviten lesen, 'read the market-economical riot act.' An example
for compounding is er nimmt kein Notenblatt vor den Mund where the idiom kein Blatt
vor den Mund nehmen (lit. 'take no leaf/sheet in front of one's mouth', meaning be
outspoken) occurs in the context of musical performance and Blatt becomes Notenblatt,
'sheet of music'
Besides morphological and syntactic variation, corpus data show that speakers vary
the lexical form of many idioms. Moon's (1998) observation that lexical variation is often
humorous is confirmed by the German corpus data, as is Moon's finding that such
variation is found most frequently in journalism. (See also Kjellmer 1991 for examples of
playful lexical variations in collocations, such as not exactly my cup of tequila.)
Ad-hoc lexical variation is dependent on the particular discourse and must play on the
literal reading of the substituted constituent rather than on a metaphoric one.That is,spill
450
IV. Lexical semantics
the secret would be hard to interpret, whereas rock the submarine would be interpretable
in the appropriate context, as submarine invokes the paradigmatically related boat.
Moon (1998) discusses another kind of lexical variation, called idiom schemas,
exemplified by the group of idioms hit the deck/sack/hay. Idiom schemas correspond to Nun-
berg, Sag & Wasow's (1994) idiom families, such as don't give a hoot/damn/shit. The
variation is limited, and each variant is lexicalized rather than ad-hoc. But in other cases,
lexical variation is highly productive. An example is the German idiom hier tanzt der Bar
(lit. 'here dances the bear', this is where the action is), which has spawned a family of
new expressions with the same meaning, including hier steppt der Bar (lit.'the bear does
a step dance here') and even hier rappt der Hummer (lit.'the lobster raps here') and hier
boxt der Papst (lit.,'the Pope is boxing here'). There are clear constraints on the lexical
variations in terms of semantic sets, unlike in cases like don't give a hoot/damn/shit; this
may account for the productivity of the "bear" idiom.
6.4. Semantic reanalysis
Motivation is undoubtedly a strong factor in the variability of idioms; if speakers
can assign meaning to a component, even if only in a specific context, the component is
available for modification and syntactic operations.
Corpus examples with lexical and syntactic variations suggest that speakers attribute
meaning to idiom components that are opaque to contemporary speakers and remo-
tivatc them. Gchwcilcr (2007) discusses the German idiom in die Rohre schauen (lit.
'look into the pipe', go empty-handed), which originates in the language of hunters and
referred to dogs peering into foxholes. The meaning of noun here is opaque to
contemporary speakers, but the idiom has aquired more recent, additional senses where the
noun is re-interpreted.
7. Idioms as constructions
Fillmore, Kay & O'Connor (1988) discuss the idiomaticity of syntact constructions like
the X-erthe Y-er, which carry meaning independent of the lexical items that fill the slots;
cf. article 86 (Kay & Michaelis) Constructional meaning.
Among VP idioms with a more regular phrase structure, some require the suppression
of an argument for the idiomatic reading, resulting in a violation of the verb's subcatego-
rization properties. For example, werfen ('throw') in the German idiom das Handtuch
werfen, lit.'throw the towel' does not co-occur with a Location (Goal) argument, which
is required in the verb's non-idiomatic use.
Other idioms require the presence of an argument that is optional in the literal
language; this is the case for many ditransitivc German VP idioms, including jemandem ein
Bein stellen, lit., 'place a leg for someone', 'trip someone up' and jemandem eine Szene
machen, lit. 'make a scene for someone', cause a scene that embarrasses someone in
German. Here, the additional indirect object (indicated by the placeholder someone) is
most often an entity that is negatively affected by the event, a Maleficiary. Whereas in
free language use, ditransitive constructions often denote the transfer of a Theme from
a Source to a Goal or Recipient, this is rarely the case for ditransitive idioms (Moon
1998,Fellbaum 2007c). Instead,such idioms exemplify Green's (1974) "symbolic action",
events intended by the Agent to have a specific effect on a Beneficiary, or, more often, a
20. Idioms and collocations
451
Maleficiary. Ditransitives that do not denote the transfer of an entity are highly restricted
in the literal language (Green 1974); it is interesting that so many idioms denoting an
event with a negatively affected entity are expressed with a ditransitive construction.
Structurally defined classes of idioms can be accounted for under Goldberg's (1995)
Construction Grammar, where syntactic configurations carry meaning independent of
lexical material; however, Goldberg (1995) and Goldberg & Jackendoff (2004) consider
syntactically well-formed and lexically productive structures such as resultatives rather
than idioms. The identification of classes of grammatically marked idioms points to
idiom-specific syntactic configurations or frames that carry meaning.
8. Diachronic changes
The origins of specific idioms is a subject of much speculation and folk etyomology.
Among the idioms with an indisputable history are those found in the Bible (e.g., throw
pearls before the swine, fall from grace, give up the ghost from the King James Bible).
These tend to be found in many of the languages into which the Bible was translated.
Many other idioms originate in specific domains: pull one's punches, go the distance
(boxing), have an ace up your sleeve, let the chips fall where they may (gambling), and fall
on one's sword, bite the bullet (warfare).
Idioms are subject to the same diachronic processes that have been observed for
lexemes with literal interpretation. Longitudinal corpus studies by Gehweiler, Hoser &
Kramer (2007) shows how German VP idioms undergo extension, merging, and semantic
splitting and may develop new, homonymic readings.
Idioms may also change their usage over time. Thus, the German idiom unter dem
Pantoffel stehen (lit.,'stand underneath somebody's slipper', be under someone's thumb)
used to refer to domestic situations where a husband is dominated by his wife. But corpus
data from the past few decades show that this idiom is extended to not only to female
spouses but also to other social relations, such as employees in a workplace dominated
by a boss (Fellbaum 2005).
Idioms change their phrase structure over time. Kwasniak (2006) examines cases
where a sentential idiom turns into a VP idiom when a fixed component becomes a
"free" constituent that is no longer part of the idiom.
9. Idioms in the lexicon
As form-meaning pairs, idioms belong in the lexicon. A dual nature - semantic
simplicity but structural complexity - is often ascribed to them. But the question arises as to
why natural languages show complex encoding for concepts whose semantics are as
straightforward as those of simple lexemes.
It has often been observed that many idioms express concepts already covered by
simple lexemes but with added connotational nuances or restrictions to specific contexts
and social situations; classic examples arc the many idioms meaning to "die", ranging
from disrespectful to euphemistic. But many - perhaps most - idioms, including cut
one's teeth on, live from hand to mouth, have eyes for, lend a hand, lose one's head are
neutral with respect to register. Similar idioms are found across languages, for example
idioms expressing a range of judgments of physical appearance, mental ability, and social
aptitude.
452
IV. Lexical semantics
Subtle register differences alone do not seem to warrant the structural and lexical
extravagance of idioms, the constraints on their use, and the added burden on language
acquisition and processing.
An examination of how English VP idioms fit into the structure of the lexicon reveals
that many lack non-idiomatic synonyms and express meanings not covered by simple
lexemes, arguably filling "lexical gaps". Moreover, many idioms appear not to fit the
regular lexicalization patterns of English. A typology of idioms based on semantic
criteria is suggested in (Fellbaum 2002, 2007a). It includes idioms expressing negations of
events or states (miss the bus, fall through the cracks, go begging) and idioms expressing
several events linked by a Boolean operator (fish or cut bair, have one's cake and eat
it). Such structurally and semantically complex idioms can be found across languages.
One function of idioms may be to encode pre-packaged complex messages that cannot
be expressed by simple words and whose salience makes them candidates for lexical
encoding.
10. Idioms in the mental lexicon
How are idioms represented in the mental lexicon and speakers' grammar? On the one
hand, they are more or less fixed MWUs - long words - that speakers produce and
recognize as such, which suggests that they are represented exactly like simple lexemes.
On the other hand, attested idiom use shows a wide range of syntactic, morphological,
and especially lexical variation, indicating that speakers access the internal structure of
idioms and subject them to all the grammatical processes found in literal language. To
determine whether idioms are represented and processed as unanalyzable units or as
of potentially (and often partially) decomposable strings, psycholinguistic experiments
have investigated both the production and the comprehension of idioms. The materials
used in virtually all experiments are constructed and not based on corpus examples, yet
the findings and the hypothesis based on the results are fully compatible with naturally
occurring data.
Comprehension time studies show that familiar idioms like kick the bucket are
processed faster in their idiomatic meaning ('die') that in a literal one (kicking a pail). Glucks-
berg (2001) asserts that the literal meaning may be inhibited by the figurative meaning of
a string, though both may be accessed. However, the timing experiments argue against a
processing model where the idiomatic reading kicks in only after a literal one has failed.
Cutting & Bock (1997), based on a number of experiments involving idiom
production, propose a "hybrid" theory of idiom representation. They argue for the existence
of a lexical concept node for each idiom; at the same time, idioms are syntactically
and semantically analyzed during production, independent of the idioms' degree of
compositionality.
This "hybrid account" is also supported by Sprenger, Levelt & Kempen (2006), who
show that idioms can be primed with lexemes that are semantically related to
constituents of the idioms. Sprenger, Levelt & Kempen propose the notion of a "superlemma"
as a conceptual unit whose lexemes are bound both to their idiomatic use and their use
in the free language.
The Superlemma theory is compatible with the Configuration Hypothesis that Cac-
ciari & Tabossi (1988) formulated on the basis of idiom comprehension experiments.The
Configuration Theory maintains that speakers activate the literal meanings of words in a
20. Idioms and collocations
453
phrase and recognize the idiomatic meaning of a polysemous string only when they
recognize an idiom-specific configuration of lexems or encounter a "key" lexeme. One such
key in many idioms may be the definite article (kick the bucket, fall off the wagon, buy
the farm) which suggests that a referent for the noun has been previously introduced
into the discourse; when no matching antecedent can be found, another
interpretation of the string must be attempted (Fellbaum 1993). The keyhypothesis is
compatible with Cutting & Bock's proposal concerning idioms' representation in the mental
lexicon.
Kuiper (2004) analyzed a collection of slips of the tongue for idioms, comprising 1,000
errors. He proposes a taxonomy of sources for the errors from all levels of grammar. Kui-
pcr's analysis of the data shows that idioms arc not simply stored as frozen long words,
consistent with the superlemma theory of idiom representation.
10.1. Idioms in natural language processing
If one inputs an idiom into a machine translation engine (such as Babelfish or Google
translate), it does not - in many cases - return a corresponding idiom or an adequate
non-idiomatic translation in the target language.This is one indication that the
recognition and processing especially of non-compositional idioms are still a challenge. One
reason is that lexical resources that many NLP applications rely on do not include many
idioms and fixed collocations. When idioms arc listed in computational lexicons, it is
often in a fixed form; idioms exhibiting morphosyntactic flexibility and lexical variations
make automatic recognition very challenging. A more promising approach than lexical
look-up is to search for the co-occurrence of the components of an idiom within a
specific window, regardless of syntactic configuration and morphological categories. Lexical
variation can be accounted for by searching for words that are semantically similar, as
reflected in a thesaurus.
Another difficulty for the automatic processing of idioms is polysemy. Many idioms
are "plausible" and have a literal reading (keep the ball rolling, make a dent in, not move
a finger). To distinguish the literal and the idiomatic readings, a system would have to
perform a semantic analysis of the wider context, a task similar to that performed by
human when disambiguating between literal and idiomatic meanings.
11. Summary and conclusion
A prevailing view in linguistics represents idioms as "long words", largely non-
compositional multi-word units with little or no room for deviation from a canonical
form; any morphosyntactic flexibility is often thought to be directly related to semantic
transparency. Corpus investigations show, first, that idioms are subject to far more
variation than the traditional view would allow, and, second, that speakers use idioms in
creative ways even in the absence of full semantic interpretation. The boundary between
compositional and non-compositional strings appears to be soft, as speakers assign
ad-hoc, discourse-specific meanings to idiom constituents that are opaque outside of
certain contexts.
Psycholinguistic experiments, too, indicate that idiomatic and non-idiomatic language
is not strictly separated in our mental lexicon and grammar.
454
IV. Lexical semantics
Perhaps the most important function of many idioms, which may account for their
universality and ubiquity, is that they provide convenient, pre-fabricated, conventionalized
encodings of often complex messages.
12. References
Abeille\ Anne 1995.The flexibility of French idioms. A representation with lexicalized tree adjoining
grammar. In: M. Everaert et al. (eds.). Idioms. Structural and Psychological Perspectives.
Hillsdale, NJ: Erlbaum, 15^2.
Baldwin,Timothy, Colin Bannard.TakaakiTanaka & Dominic Widdows 2003. An empirical model
of multiword expression decomposability. In: Proceedings of the ACL-03 Workshop on
Multiword Expressions. Analysis, Acquisition and Treatment. Stroudsburg, PA: ACL, 89-96.
Bobrow, Daniel & Susan Bell 1973. On catching on to idiomatic expressions. Memory & Cognition
1,343-346.
Burger, Harald 2007. Phraseologie. Eine Einfiihrung am Beispiel des Deutschen. 3rd edn. Berlin:
Erich Schmidt. 1st edn. 1998.
Cacciari, Cristina & Patrizia Tabossi 1988. The comprehension of idioms. Journal of Memory and
Language 27,668-683.
Church, Kenneth W. & Patrick Hanks 1990. Word association norms, mutual information, and
lexicography. Computational Linguistics 16,22-29.
Church, Kenneth W., William Gale, Patrick Hanks & Donald Hindle 1991. Using statistics in lexical
analysis. In: U. Zernik (ed.). Lexical Acquisition. Exploiting On-Line Resources to Build a
Lexicon. Hillsdale, NJ: Erlbaum, 115-164.
Cowie, Anthony P. 1998. Phraseology. Theory, Analysis, and Applications. Oxford: Oxford
University Press.
Cruse, D. Alan 2004. Meaning in Language. An Introduction to Semantics and Pragmatics. 2nd edn.
Oxford: Oxford University Press. 1st edn. 2000.
Cutting, J. Cooper & Kathryn Bock 1997.That's the way the cookie bounces. Syntactic and semantic
components of experimentally elicited idiom blends. Memory & Cognition 25,57-71.
Dobrovol'skij, Dmitrij 1999. Habcn transformationcllc Dcfcktc dcr Idiomstruktur scmantischc
Ursachen? In: N. Fernandez Bravo, I. Behr & C. Rozier (eds.). Phraseme und typisierende Rede.
Tubingen: Stauffenburg, 25-37.
Dobrovol'skij, Dmitrij 2004. Lexical semantics and combinatorial profile. A corpus-based approach.
In: G. Williams & S. Vesser (eds.). Euralex II. Lorient: Universitd de Bretagne Sud, 787-796.
Eisenberg, Peter 1999. Grundriss der deutschen Grammatik, Vol. 2: Der Satz. Stuttgart: Metzler.
Evert, Stefan 2005. The Statistics of Word Cooccurrences. Word Pairs and Collocations. Doctoral
dissertation. University of Stuttgart.
Fazly, Afsaneh & Suzanne Stevenson 2006. Automatically constructing a lexicon of verb phrase
idiomatic combinations. In: Proceedings of the European Chapter of the Association for
Computational Linguistics (EACL) 11. Stroudsburg, PA: ACL, 337-344.
Fellbaum, Christiane 1993. The determiner in English idioms. In: C. Cacciari & P. Tabossi (eds.).
Idioms. Processing, Structure, and Interpretation. Hillsdale, NJ: Erlbaum, 271-295.
Fellbaum, Christiane 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The MIT
Press.
Fellbaum, Christianc 2002. VP idioms in the lexicon. Topics for research using a very large corpus.
In: S. Buscmann (cd.). Proceedings of Konferenz zur Verarbeitang natiirlicher Sprache (= KON-
VENS) 6. Kaiserslautern: DFKI, 7-11.
Fellbaum, Christiane 2005. Unter dein Pantoffel stehen. Circular der Berlin-Brandenburgischen
Akademie der Wissenschaften 31,18.
Fellbaum, Christiane (cd.) 2006. Corpus-Based Studies of German Idioms and Light Verbs. Special
issue of the Journal of Lexicography 19.
20. Idioms and collocations
455
Fellbauin. Christiane 2007a. The ontological loneliness of idioms. In: A. Schalley & D. Zaefferer
(eds.). Ontolinguistics. How Ontological Status Shapes the Linguistic Coding of Concepts. Berlin:
Mouton de Gruyter, 419-434.
Fellbauin, Christiane (ed.) 2007b. Idioms and Collocations. Corpus-Based Linguistic and
Lexicographic Studies. London: Continuum.
Fellbauin, Christiane 2007c. Argument selection and alternations in VP idioms. In: Ch. Fellbauin
(ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London:
Continuum. 188-202.
Fellbaum, Christiane & Ekaterini Stathi 2006. Idiome in der Grammatik und im Kontext. Wer
briillt hier die Leviten? In: K. Proost & E. Winkler (eds.). Von Intentionalitdt zur Bedeutung
konventionalisierter Zeichen. Festschrift fur Gisela Harras zum 65. Geburtstag. Tubingen: Narr,
125-146.
Fillmore. Charles, Paul Kay & Mary O'Connor 1988. Regularity and idiomaticity in grammatical
constructions. Language 64,501-538.
Fircnzc, Anna 2007. 'You fool her' doesn't mean (that) 'you conduct her behind the light'.
(Dis)agglutination of the determiner in German idioms. In: Ch. Fellbaum (ed.). Idioms
and Collocations. Corpus-Based Linguistic and Lexicographic Studies. London: Continuum,
152-163.
Firth, John R. 1957. A Synopsis of Linguistic Theory 1930-1955. In: Studies in Linguistic Analysis.
Oxford: Blackwell, 1-32.
Fraser, Bruce 1970. Idioms within a transformational grammar. Foundations of Language 6.
22-42.
Gehweiler, Elke 2007. How do homonymic idioms arise? In: M. Nenonen & S. Niemi (eds.).
Proceedings of Collocations and Idioms 1. Joensuu: Joensuu University Press.
Gehweiler, Elke, Iris Hoser & Undine Kramer 2007. Types of changes in idioms. Some surprising
results of corpus research. In: Ch. Fellbaum (ed.). Idioms and Collocations. Corpus-Based
Linguistic and Lexicographic Studies. London: Continuum, 109-137.
Gibbs, Ray & Gerard J. Steen (eds.) 1999. Metaphor in Cognitive Linguistics. Amsterdam: Benjamins.
Glucksberg, Sain 2001. Understanding Figurative Language. From metaphors to Idioms. New York:
Oxford University Press.
Goldberg, Adclc 1995. Constructions. A Construction Grammar Approach to Argument Structure.
Chicago, IL:The University of Chicago Press.
Goldberg, Adele & Ray Jackendoff 2004. The English resultative as a family of constructions.
Language 80,532-568.
Green, Georgia M. 1974. Semantics and Syntactic Regularity. Blooinington, IN: Indiana University
Press.
Halliday, Michael 1966. Lexis as a linguistic level. In: C. E. Bazell et al. (eds.). In Memory ofJ.R.
Firth. London: Longman, 148-162.
Jackendoff, Ray 1997.Twistin' the night away. Language 73,534-559.
Keysar, Boaz & Bridget Bly 1995. Intuitions of the transparency of idioms. Can one keep a secret
by spilling the beans? Journal of Memory and Language 34,89-109.
Kjelhner, Goran 1991. A mint of phrases. In: K. Aijmer & B. Altenberg (eds.). English Corpus
Linguistics. Studies in Honor of Jan Svartvik. London: Longman, 111-127.
Krenn, Brigitte 2000. The Usual Suspects. Data-Oriented Models for Identification and
Representation of Lexical Collocations. Doctoral dissertation. University of Saarbriicken.
Kuiper. Konrad 2004. Slipping on superlemmas. In: A. Hacki-Buhofcr & H. Burger (eds.).
Phraseology in Motion, vol. I. Baltmannswcilcr: Schneider, 371-379.
Kwasniak, Renata 2006. Wer hat nun den Salat? Now who's got the mess? Reflections on
phraseological derivation. From sentential to verb phrase idiom. International Journal of Lexicography
19,459^178.
Lakoff, George 1987. Women, Fire, and Dangerous Things. What Categories Reveal about the Mind.
Chicago, IL:The University of Chicago Press.
456
IV. Lexical semantics
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of
Chicago Press.
McCarthy, Diana, Bill Keller & John Carroll 2003. Detecting a continuum of compositionality in
phrasal verbs. In: Proceedings of the ACL-03 Workshop on Multiword Expressions. Analysis,
Acquisition and Treatment. Stroudsburg, PA: ACL, 73-80.
Miller, George A. 1956.The magical number seven, plus or minus two. Some limits on our capacity
for processing information. Psychological Review 63,81-97.
Moon, Rosamund 1998. Fixed Expressions and Idioms in English. A Corpus-Based Approach.
Oxford: Clarendon Press.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow 1994. Idioms. Language 70,491-538.
Sailer, Manfred & Frank Richter 2002. Not for love or money. Collocations! In: G. Jager et al.
(eds.). Proceedings of Formal Grammar 7. Stanford, CA: CSLI Publications, 149-160.
Sinclair, John 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sohn, Jan-Philipp 2006. On idiom parts and their contexts. Linguistik Online 27,11-28.
Sprenger, Simone A., William J. M. Levelt & Gerard A. M. Kern pen 2006. Lexical access during the
production of idiomatic phrases. Journal of Memory and Language 54,161-184.
Stantcheva, Diana 2006. The many faces of negation. German VP idioms with a negative
component. International Journal of Lexicography 19,397-418.
Storrer, Angelika 2007. Corpus-based investigations on German support verb constructions. In: Ch.
Fellbauin (ed.). Idioms and Collocations. Corpus-Based Linguistic and Lexicographic Studies.
London: Continuum, 164-187.
Stubbs, Michael 2001. Words and Phrases. Corpus Studies of Lexical Semantics. Oxford: Blackwell.
Swinney, Douglas & Anne Cutler 1979.The access and processing of idiomatic expressions. Journal
of Verbal Learning and Verbal Behavior 18,523-534.
Weinreich. Uriel 1969. Problems in the analysis of idioms. In: J. Puhvel (ed.). Substance and
Structure of Language. Berkeley, CA: University of California Press, 23-81.
Christiane Fellbaum, Princeton, NJ (USA)
21. Sense relations
1. Introduction
2. The basic sense relations
3. Sense relations and word meaning
4. Conclusion
5. References
Abstract
This article explores the definition and interpretation of the traditional paradigmatic sense
relations such as hyponymy, synonymy, meronymy, antonymy, and syntagmatic relations
such as selectional restrictions. A descriptive and critical overview of the relations is
provided in section 1 and in section 2 the relation between sense relations and different theories
of word meaning is briefly reviewed. The discussion covers early to mid twentieth century
structuralist approaches to lexical meaning, with its concomitant view of the lexicon as being
structured into semantic fields, leading to more recent work on decompositional approaches
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 456-479
21. Sense relations
457
to word meaning. The latter are contrasted with atomic views of lexical meaning and the
capturing of semantic relations through the use of meaning postulates.
1. Introduction
Naive discussions of meaning in natural languages almost invariably centre around the
meanings of content words, rather than the meanings of grammatical words or phrases
and sentences, as is normal in academic approaches to the semantics of natural languages.
Indeed, at first sight, it might seem to be impossible to construct a theory of meaning
of sentences without first uncovering the complexity of meaning relations that hold
between the words of a language that make them up. So, it might be argued, to know the
meaning of the sentence Matthew rears horses we need also to at least know the meaning
of Matthew rears animals or Matthew breeds horses, since horses are a kind of animal and
rearing tends to imply breeding. It is in this context that the notion of sense relations, the
meaning relations between words (and expressions) of a language, could be seen as
fundamental to the success of the semantic enterprise. Indeed, the study of sense relations
has a long tradition in the western grammatical and philosophical traditions, going back
at least to Aristotle with discussions of relevant phenomena appearing throughout the
medieval and later literature. However, the systematisation and taxonomic classification
of the system of sense relations was only taken up in the structuralist movements of the
twentieth century,particularly in Europe following the swift developments in structuralist
linguistics after de Saussure.This movement towards systematic analyses of word sense
was then taken up in the latter part of that century and the early part of the twenty-first
century in formal modelling of the sense relations and, in particular, the development
of computational models of these for the purposes of natural language processing.
The notion of 'sense' in this context may be variously interpreted, but is usually
interpreted in contrast to the notion of reference (or, cquivalcntly, denotation or extension).
The latter expresses the idea that one aspect of word meaning is the relation between
words and the things that they can be used properly to talk about. Thus, the reference/
denotation of cat is the set of all cats (that are, have been and will be); that of run (on
one theoretical approach), the set of all past, present and future events of running (or,
on another view, the set of all things that ever have, are or will engage in the activity we
conventionally refer to in English as running). Sense, on the other hand, abstracts away
from the things themselves to the property that allows us to pick them out.The sense of
cat is thus the property that allows us to identify on any occasion an object of which it can
truthfully be said that is a cat - 'catness' (however that might be construed, cognitively in
terms of some notion of concept, see for instance Jackendoff 2002 or model-theoretically
in terms of denotations at different indices, see Montague 1973). Sense relations are
thus relations between the properties that words express, rather than between the things
they can be used to talk about (although, as becomes clear very quickly, it is often very
difficult to separate the two notions).
Whether or not the study of sense relations can provide a solid basis for the
development of semantic theories (and there are good reasons for assuming they cannot, see for
example Kilgarriff 1997), nevertheless the elaboration and discussion of such meaning
relations can shed light on the nature of the problems we confront in providing such
theories, not least in helping to illuminate features of meaning that are truly amenable to
semantic analysis and those that remain mysterious.
458
IV. Lexical semantics
2. The basic sense relations
There are two basic types of sense relation. The most commonly presented in
introductory texts arc the paradigmatic relations that hold between words of the same general
category or type and that are characterised in terms of contrast and hierarchy. Typically,
a paradigmatic relation holds between words (or word-forms) when there is choice
between them. So given the string John bought a, it is possible to substitute any noun
that denotes something that can be bought: suit, T-shirt, cauliflower, vegetable, house,...
Between some of these words there is more to the choice between them than just the fact
they are nouns denoting commodities. So, for example, if John bought a suit is true then
it follows that John bought a pair of trousers is also true by virtue of the fact that pairs
of trousers are parts of suits and if John bought a cauliflower is true then John bought a
vegetable is also true, this time by virtue of the fact that cauliflowers are vegetables.
The second type of sense relations arc syntagmatic which hold between words
according to their ability to co-occur meaningfully with each other in sentences.
Typically syntagmatic sense relations hold between words of different syntactic categories or
(possibly) semantic types such as verbs and nouns or adverbs and prepositional phrases.
In general, the closer the syntactic relation between two words such as between a head
word and its semantic arguments or between a modifier and a head, the more likely it is
that one word will impose conditions on the semantic properties the other is required
to show. For example, in the discussion in the previous paragraph, the things that one
can (non-metaphorically) buy are limited to concrete objects that are typically
acceptable commodities in the relevant culture: in a culture without slavery adding boy to the
string would be highly marked. As we shall see below, there is a sense in which these two
dimensions, of paradigm and syntagm, cannot be kept entirely apart, but it is useful to
begin the discussion as if they do not share interdependencies.
Of the paradigmatic sense relations there arc three basic ones that can be defined
between lexemes, involving sense inclusion, sense exclusion and identity of sense. Within
these three groups, a number of different types of relation can be identified and, in
addition, to these other sorts of sense relations, such as part-whole, have been identified and
discussed in the literature. As with most taxonomic endeavours, researchers may be
'lumpers', preferring as few primary distinctions as possible, and 'splitters' who consider
possibly small differences in classificatory properties as sufficient to identify a different
class. With respect to sense relations, the problem of when to define an additional
distinction within the taxonomy gives rise to questions about the relationship between
knowledge of the world and knowledge of a word: where does one end and the other begin (see
article 32 (Hobbs) Word meaning and world knowledge). In this article, I shall deal with
only those relations that are sufficiently robust as to have become standard within lexical
semantics: antonymy, hyponymy, synonymy and meronymy. In general, finer points of
detail will be ignored and the discussion will be confined to the primary, and generally
accepted, sense relations, beginning with hyponymy.
2.1. Hyponymy
Hyponymy involves specific instantiations of a more general concept such as holds
between horse and animal or vermilion and red or buy and get. In each case, one word
provides a more specific type of concept than is displayed by the other.The more specific
21. Sense relations
459
word is called a hyponym and the more general word is the superordinate which may also
be referred to as a hyperonym or hypernym, although the latter is disprcferred as in non-
rhotic dialects of English it is homophonic with hyponym. Where the words being
classified according to this relation are nouns, one can test for hyponymy by replacing X and Y
in the frame'X is a kind of Y' and seeing if the result makes sense. So we have'(A) horse
is a kind of animal' but not '(An) animal is a kind of horse' and so on. A very precise
definition of the relation is not entirely straightforward, however. One obvious approach is to
have recourse to class inclusion,so that the set of things denoted by a hyponym is a subset
of the set of things denoted by the superordinate. So the class of buying events is a subset
of the class of getting events.This works fine for words that describe concrete entities or
events, but becomes metaphysically more challenging when abstract words like thought
emotion, belief, understand, think etc. are considered. More importantly there are words
that may be said to have sense but no denotation such as phoenix, hobbit, light sabre and
so on. As such expressions do not pick out anything in the real word they can be said to
denote only the empty set and yet, obviously, there are properties that such entities would
possess if they existed that would enable us to tell them apart. A better definition of
hyponymy therefore is to forego the obvious and intuitive reliance on class membership and
define the relation in terms of sense inclusion rather than class inclusion: the sense of the
superordinate being included in the sense of the hyponym. So a daffodil has the sense of
flower included in it and more besides. If we replace'sense', as something we are trying to
define, with the (perhaps) more neutral term 'property', then we have:
(1) Hyponymy: X is a hyponym of Y if it is the case that if anything is such that it has
the property expressed by X then it also has the property expressed by Y.
Notice that this characterisation in terms of a universally quantified implication statement,
does not require there to be actually be anything that has a particular property, merely
that if such a thing existed it would have that property. So unicorn may still be considered
a hyponym of animal, because if such things did exist, they would partake of 'animalness'.
Furthermore, in general if X and Y are hyponyms of Z they are called co-hyponyms,
where two words can be defined as co-hyponyms just in case they share the same
superordinate term and one is not a hyponym of the other. Co-hyponyms arc generally
incompatible in sense, unless they are synonymous (see below section 2.3). For example, horse,
cat, bird, sheep, etc. are all co-hyponyms of mammal and all mutually incompatible with
each other: *That sheep is a horse. Hyponymy is a transitive relation so that if X is a
hyponym of Y and Y is a hyponym of Z then X is a hyponym of Z: foal is a hyponym of horse,
horse is a hyponym of animal, and so foal is a hyponym of animal. Note that because of
this transitivity,/oa/ is treated as a co-hyponym of, and so incompatible with, not only filly
and stallion, but also sheep, lamb and bull, etc.This sort of property indicates how
hyponymy imposes partial hierarchical structure on a vocabulary. Such hierarchies may define
a taxonomy of (say) natural kinds as in Fig. 21.1. Complete hierarchies are not common,
however. Often in trying to define semantic fields of this sort, the researcher discovers
that there may be gaps in the system where some expected superordinate term is missing.
For example, in the lexical field defined by move we have hyponyms like swim, fly, roll
and then a whole group of verbs involving movement using legs such as run, walk, hop,
jump, skip, crawl, etc. There is, however, no word in English to express the concept that
classifies the latter group together. Such gaps abound in any attempt to construct a fully
460
IV. Lexical semantics
hierarchical lexicon based on hyponymy. Some of these gaps may be explicable through
socio-cultural norms (for example, gaps in kinship terms in all languages), but many are
simply random: languages do not require all hierarchical terms to be lexicalised.That is
not to say, however, that languages cannot express such apparently superordinate
concepts. As above, we can provide the concept required as superordinate by modifying its
apparent superordinate to give move using legs. Indeed, Lyons (1977) argues that
hyponymy can in general be defined in terms of predicate modification of a superordinate.
Thus,swim is move through fluid, mare is female horse, lamb is immature sheep and so on.
This move pushes a paradigmatic relation onto some prior syntagmatic basis:
sheep cow eagle blackbird shark pike
mare stallion foal filly ewe lamb calf bull
Fig. 21.1: Hyponyms of animal
Hyponymy is a paradigmatic relation of sense which rests upon the encapsulation in the
hyponym of some syntagmatic modification of the sense of the superordinate relation.
Lyons (1977: 294)
Such a definition does not work completely. For example, it makes no sense at the level
of natural kinds (is horse to be defined as equine animal!) and there arc other apparent
syntagmatic definitions that are problematic in that precise definitions are not obvious
(saunter is exactly what kind of walk!). Such considerations, of course, reflect the
vagueness of the concepts that words express out of context and so we might expect any such
absolute definition of hyponymy to fail.
Hyponymy strictly speaking is definable only between words of the same (syntactic)
category, but some groups of apparent co-hyponyms seem to be related to a word of
some other category. This seems particularly true of predicate-denoting expressions
like adjectives which often seem to relate to (abstract) nouns as superordinates rather
than some other adjective. For example, round, square, tetrahedral, etc. all seem to be
'quasi-hyponyms' of the noun shape and hot, warm, cool, cold relate to temperature.
Finally, the hierarchies induced by hyponymy may be cross-cutting. So the animal
field also relates to fields involving maturity (adult, young) or sex (male, female) and
perhaps other domains. This entails that certain words may be hyponyms of more than one
superordinate, depending on different dimensions of relatedness. As we shall see below,
such multiple dependencies have given rise to a number of theoretical approaches to
word meaning that try to account directly for sense relations in terms of primitive sense
components or inheritance of properties in some hierarchical arrangement of conceptual
or other properties.
2.2. Synonymy
Synonymy between two words involves sameness of sense and two words may be defined
as synonyms if they are mutually hyponymous. For example, sofa and settee are both
21. Sense relations
461
hyponyms of furniture and both mutually entailing since if Bill is sitting on a settee is true,
then it is true that Bill is sitting on a sofa, and vice versa. This way of viewing synonymy
defines it as occurring between two words just in case they arc mutually intcrsubstitut-
able in any sentence without changing the meaning (or truth conditions) of those
sentences. So violin and fiddle both denote the same sort of musical instrument so that Joan
plays the violin and Joan plays the fiddle both have the same truth conditions (are both
true or false in all the same circumstances). Synonyms are beloved of lexicographers
and thesauri contain lists of putative synonyms. However, true or absolute synonyms are
very rarely attested and there is a significant influence of context on the acceptability
of apparent synonyms. Take an example from Roget's thesaurus at random, 726, for
combatant in which some of the synonyms presented are:
(2) disputant, controversialist, litigant, belligerent; competitor, rival; fighter, assailant,
agressor, champion; swashbuckler, duellist, bully, fighting-man, boxer, gladiator,...
Even putting to one side the dated expressions, it would be difficult to construct a single
context in which all these words could be substituted for each other without altering the
meaning or giving rise to pragmatic awkwardness. Context is the key for acceptability of
synonym construal. Even the clear synonymy of fiddle = violin mentioned above shows
differences in acceptability in different contexts: if Joan typically plays violin in a
symphony orchestra or as soloist in classical concerti, then someone might object that she
does not play the fiddle, where that term may be taken to imply the playing of more
popular styles of music. Even the sofa = settee example might be argued to show some
differences, for example in terms of the register of the word or possibly in terms of
possible differences in the objects the words pick out. It would appear, in fact, that humans
typically don't entertain full synonymy and when presented with particular synonyms in
context will try and provide explanations of (possibly imaginary) differences. Such an
effect would be explicable in terms of some pragmatic theory such as Relevance Theory
(Sperber & Wilson 1986/1995) in which the use of different expressions in the same
contexts is expected to give rise to different inferential effects.
A more general approach to synonymy allows there to be degrees of synonymy
where this may be considered to involve degrees of semantic overlap and it is this
sort of synonymy that is typically assumed by lexicographers in the construction of
dictionaries. Kill and murder arc strongly but not absolutely synonymous, differing
perhaps in terms of intcntionality of the killer/murderer and also the sorts of objects
such expressions may take (one may kill a cockroach but does not thereby murder it).
Of course, there are conditions on the degree of semantic similarity that we consider
to be definitional of synonymy. In the first place, it should in general be the case that
the denial of one synonym implicitly denies the other. Mary is not truthful seems
correctly to implicitly deny the truth of Mary is honest and Joan didn't hit the dog implies
that she didn't beat it. Such implications may only go one way and that is often the
case with near synonyms. The second condition on the amount of semantic overlap
that induces near synonymy is that the terms should not be contrastive.Thus, labrador
and corgi have a large amount of semantic overlap in that they express breeds of dog,
but there is an inherent contrast between these terms and so they cannot in general
be intersubstitutable in any context and maintain the truth of the sentence. Near-
synonyms arc often used to explain a word already used: John was dismissed, sacked
in fact. But if the terms contrast in meaning in some way then the resulting expression
462
IV. Lexical semantics
is usually nonsensical: #John bought a corgi, a labrador, in fact, where # indicates
pragmatic markedness.
The felicity of the use of particular words is strongly dependent on context. The
reasons for this have to do with the ways in which synonyms may differ. At the very
least two synonymous terms may differ in style or register. So, for example, baby and
neonate both refer to newborn humans, but while the neonate was born three weeks
premature means the same as the baby was born three weeks premature, What a beautiful
neonate! is distinctly peculiar. Some synonyms differ in terms of stylistic markedness.
Conceal and hide, for example, are not always felicitously intersubstitutable. John hid
the silver in the garden and John concealed the silver in the garden seem strongly
synonymous, but John concealed Mary's gloves in the cupboard does not have the same
air of normality about it as John hid Mary's gloves in the cupboard. Other aspects of
stylistic variation involve expressiveness or slang. So while gob and mouth mean the
same, the former is appropriately used in very informal contexts only or for its shock
value. Swear words in general often have acceptable counterparts and euphemism and
dysphemism thus provides a fertile ground for synonyms: lavatory = bathroom = toilet
= bog = crapper, elc; fuck = screw = sleep with, and so on. Less obvious differences in
expressiveness come about through the use of synonyms that indicate familiarity with
the object being referred to such as the variants of kinship terms: mother = mum =
mummy = ma. Regional and dialectal variations of a language may also give rise to
synonyms that may or may not co-occur in the language at large: valley = dale = glen or
autumn = fall. Sociolinguistic variation thus plays a very large part in the existence of
near synonymy in a language.
2.3. Antonymy
The third primary paradigmatic sense relation involves oppositeness in meaning, often
called antonymy (although Lyons 1977 restricts the use of this term to gradable opposites)
and is defined informally in terms of contrast, such that if 'A is X' then 'A is not Y'. So,
standardly, if John is tall is true then John is not small is also true. Unlike hyponymy, there
are a number of ways in which the senses of words contrast. The basic distinction is
typically made between gradable and ungradable opposites.Typically expressed by adjectives
in English and other Western European languages, gradable antonyms form instances of
contraries and implicitly or explicitly invoke a field over which the grading takes place, i.e.
a standard of comparison. Assuming, for example, that John is human, then human size
provides the scale against which John is tall is measured. In this way, John's being tall for
a human does not mean that he is tall when compared to buildings. Note that the implicit
scale has to be the same scale invoked for any antonym: John is tall contrasts with John
is not small for a human, not John is not small for a building. Other examples of gradable
antonyms are easy to identify: cold/hot,good/bad, old/young and so on.
(3) Gradable antonymy: Gradable antonyms form instances of contraries and
implicitly or explicitly invoke a field over which the grading takes place, i.e. a standard of
comparison.
Non-gradable antonyms are, as the name suggests, absolutes and divide up the domain of
discourse into discrete classes. Hence, not only docs the positive of one antonym imply
21. Sense relations
463
the negation of the other, but the negation of one implies the positive of the other. Such
non-gradable antonyms are also called complementaries and include pairs such as male/
female, man/woman, dead/alive.
(4) Complementaries (or binary antonyms) are all non-gradable and the sense of one
entails the negation of the other and the negation of one sense entails the positive
sense of the other.
Notice that there is, in fact, a syntagmatic restriction that is crucial to this definition. It
has to be the case that the property expressed by some word is meaningfully predicable
of the object to which it is applied. So, while that person is male implies that that person
is not female and that person is not female implies that that person is male, that rock
is not female does not imply that that rock is male. Rocks are things that do not have
sexual distinctions. Notice further that binary antonyms are quite easily coerced into
being gradable, in which case the complementarity of the concepts disappears. So we can
say of someone that they are not very alive without committing ourselves to the belief
that they are very dead.
Amongst complementaries, some pairs of antonyms may be classed as privative in that
one member expresses a positive property that the other negates. These include pairs
such as animate/inanimate. Others are termed equipollent when both properties express
a positive concept such as male/female. Such a distinction is not always easy to make: is
the relation dead/alive equipollent or privative?
Some relational antonyms differ in the perspective from which a relation is viewed:
in other words, according to the order of their arguments. Pairs such as husband/wife,
parent/child are of this sort and are called converses.
(5) Converses, involve relational terms where the argument positions involved with
one lexeme are reversed with another and vice versa.
So if Mary is John's wife then John is Mary's husband. In general, the normal antonymic
relation stands, provided that the relation expressed is strictly asymmetric: Mary is
Hilary's child implies Mary is not Hilary's parent. Converses may involve triadic predicates as
well, such as buy/sell, although it is only the agent and the goal that are reversed in such
cases. If Mary buys a horse from Bill then it must be the case that Bill sells a horse to Mary.
Note that the relations between the two converses here are not parallel: the agent subject
of buy is related to the goal object of sell whereas the agent subject of sell is related to
the source object of buy.This may indicate that the converse relation, in this case at least,
resides in the actual situations described by sentences containing these verbs rather than
necessarily inherently being part of the meanings of the verbs themselves.
So far, we have seen antonyms that are involved in binary contrasts such as die/live,
good/bad and so on, and the relation of antonymy is typically said only to refer to such
binary contrasts. But contrast of sense is not per se restricted to binary contrasts. For
example, co-hyponyms all have the basic oppositeness property of exclusion. So, if
something is a cow, it is not a sheep, dog, lion or any other type of animal and equally for all
other co-hyponyms that arc not synonyms. It is this exclusion of sense that makes corgi
and labrador non-synonymous despite the large semantic overlap in their semantic
properties (as breeds of dogs). Lyons (1977) calls such a non-binary relation 'incompatibility'.
464
IV. Lexical semantics
Some contrastive gradable antonyms form scales where there is an increase (decrease)
of some characteristic property from one extreme of the scale to another. With respect to
the property heat or temperature we have the scale [freezing, cold, cool, lukewarm, warm,
hot, boiling].These adjectives may be considered to be quasi-hyponyms of the noun heat
and all partake of the oppositeness relation. Interestingly (at least for this scale), related
points on the scale act like (gradable) antonyms: freezing/boiling, cold/hot, cool/warm.
There are other types of incompatible relations such as ranks (e.g. [private (soldier),
corporal, sergeant, staff sergeant, warrant officer, lieutenant, major, ...}) and cycles (e.g.
[monday, tuesday, Wednesday, thursday, friday, Saturday, Sunday)).
Finally on this topic, it is necessary again to point out the context-sensitivity of ant-
onymy. Although within the colour domain red has no obvious antonym, in particular
contexts it does. So with respect to wine, the antonym of red is white and in the context of
traffic signals, its opposite is green. Without a context, the obvious antonym to dry is wet,
but again within the context of wine, its antonym is sweet, in the context of skin it is soft
and for food moist (examples taken from Murphy 2003). This contextual dependence is
problematic for the definition of antonymy just over words (rather than concepts), unless
it is assumed that the lexicon is massively homonymous (see below). (See also article 22
(Lobner) Dual oppositions.)
2.4. Part-whole relations
The final widely recognised paradigmatic sense relation is that involving 'part-of relations
or meronymies.
(6) Meronymy: If X is part-of Y or Y has X then X is a meronym of Y and Y is a
holonym of X.
Thus, toe is a meronym of foot and foot is a meronym of leg, which in turn is a meronym
of body. Notice that there is some similarities between meronymy and hyponymy in
that a (normal) hand includes fingers and finger somehow includes the idea of hand,
but, of course, they are not the same things and so do not always take part in the same
entailment relations as hyponyms and superordinates. So while Mary hurt her finger
(sort of) entails Mary hurt her hand, just as Mary hurt her lamb entails Mary hurt an
animal, Mary saw her finger does not entail Mary saw her hand, unlike Mary saw her
lamb does entail Mary saw an animal. Hence, although meronymy is like hyponymy
in that the part-whole relations define hierarchical distinctions in the vocabulary, it is
crucially different in that meronyms and holonyms define different types of object that
may not share any semantic properties at all: a finger is not a kind of hand, but it does
share properties with hands such as being covered in skin and being made of flesh and
bone; but a wheel shares very little with one of its holonyms car, beyond being a
manufactured object. Indeed, appropriate entailment relations between sentences containing
meronyms and their corresponding holonyms are not easily stated and, while the
definition given above is a reasonable approximation, it is not unproblematic. Cruse (1986)
attempts to restrict the meronymy relation just to those connections between words
that allow both the 'X is part of Y' and 'Y has X' paraphrases. He points out that the
'has a' relation does not always involve a 'part of one, at least between the two words: a
wife has a husband but not #a husband is part of a wife. On the other hand, the reverse
21. Sense relations
465
may also not hold: stress is part of the job does not mean that the job has stress, at least
not in the sense of possession.
However, even if one accepts that both paraphrases must hold of a mcronymic pair,
there remain certain problems. As an example, the pair of sentences a husband is part
of a marriage and a marriage has a husband seems to be reasonably acceptable, it is not
obvious that marriage is strictly a holonym of husband. It may be, therefore, that it is
necessary to restrict the relation to words that denote things of the same general type:
concrete or abstract, which will induce different 'part of relations depending on the way
some word is construed. So, a chapter is part of a book = Books have chapters if book is
taken to be the abstract construal of structure, but not if it is taken to be the concrete
object. Furthermore, it might be necessary to invoke notions like 'discreteness' in order
to constrain the relation. For examp\e,flesh is part of a hand and hands have flesh, but are
these words thereby in a meronymic relationship? Flesh is a substance and so not
individuated and if meronymy requires parts and wholes to be discretely identifiable, then
the relationship would not hold of these terms. Again we come to the problem of world-
knowledge, which tells us that fingers are prototypically parts of hands, versus word-
knowledge: is it the case that the meaning of'finger' necessarily contains the information
that it forms part of a hand and thus that some aspect of the meaning of'hand' is
contained in the meaning of'finger'? If that were the case, how do we account for the lack
of any such inference in extensions of the word to cover (e.g.) emerging shoots of plants
(cf. finger of asparagus)! (See article 32 (Hobbs) Word meaning and world knowledge.)
This short paragraph does not do justice to the extensive discussions of meronymy, but
it should be clear that it is by far the most problematic of the paradigmatic relations to pin
down, a situation that has led some scholars to reject its existence as a different type of sense
relation altogether. (See further Croft & Cruse 2004:159-163, Murphy 2003:216-235.)
2.5. Syntagmatic relations
Syntagmatic relations between words appear to be less amenable to the sort of
taxonomies associated with paradigmatic relations. However, there is no doubt that some words
'go naturally' with each other, beyond what may be determined by general syntactic
rules of combination. At one extreme, there are fixed idioms where the words must
combine to yield a specific meaning. Hence we have idiomatic expressions in English
meaning 'die' such as kick the bucket (a reference to death by hanging) or pass away or
pass over (to the other side) (references to religious beliefs) and so on.There are certain
words also that only have a very limited ability to appear with others such as is the
case with addled which can only apply to eggs or brains. Other collocational possibilities
may be much freer, although none are constrained solely by syntactic category. For
example, hit is a typical transitive verb in English that takes a noun phrase as object.
However, it further constrains which noun phrases it acceptably collocates with by
requiring the thing denoted by that noun phrase to have concrete substance. Beyond that,
collocational properties are fairly free;see article 20 (Fellbaum) Idioms and collocations:
(7) The plane hit water/a building/#the idea.
That words do have semantic collocational properties can be seen by examining strings
of words that arc 'grammatical' (however defined) but make no sense. An extreme
466
IV. Lexical semantics
example of this is Chomsky (1965)'s ubiquitous colorless green ideas sleep furiously in
which the syntactic combination of the words is licit (as an example of a subject noun
phrase containing modifiers and a verb phrase containing an intransitive verb and an
adverb) but no information is expressed because of the semantic anomalies that result
from this particular combination.There are various sources of anomaly. In the first place,
there may be problems resulting from what Cruse (2000) calls collocational preferences.
Where such preferences are violated, various degrees of anomaly can arise, ranging from
marginally odd through to the incomprehensible. For example, the sentence my pansies
have passed away is peculiar because pass away is typically predicated of a human (or
pet) not flowers. However, the synonymous sentence, my pansies have died, involves no
such peculiarity since the verb die is prcdicablc of anything which is capable of life such
as plants and animals. A worse clash of meaning thus derives from the collocation of an
inanimate subject and any verb or idiom meaning 'die'. My bed has died is thus worse
than my pansies have passed a way, although notice that metaphorical interpretations can
be (and often are) attributed to such strings. For example, one could interpret my bed
has died as indicating that the bed has collapsed or is otherwise broken and such
metaphorical extensions are common. Compare the use of die collocated with words such as
computer, car, phone, etc. Notice further in this context that the antonym of die is not live
but go or run. It is only when too many clashes occur that metaphorical interpretation
breaks down and no information at all can be derived from a string of words. Consider
again colorless green ideas sleep furiously. Parts of this sentence arc less anomalous than
the whole and we can assign (by whatever means) some interpretation to them:
(8) a. Green ideas: 'environmentally friendly ideas' or 'young, untried ideas' (both via
the characteristic property of young plant shoots);
b. Colorless ideas: 'uninteresting ideas' (via lacklustre, dull);
c. Colorless green ideas: 'uninteresting ideas about the environment' or
'uninteresting untried ideas' (via associated negative connotations of things without
colour);
d. Ideas sleep: 'ideas are not currently active' (via inactivity associated with
sleeping);
e. Green parrots sleep furiously: 'parrots determinedly asleep (?)' or 'parrots
restlessly asleep'.
But in putting all the words together, the effort involved in resolving all the
contradictions just gets beyond any possible effect on the context by the information content of the
final proposition. The more contradictions that need to be resolved in processing some
sentence the greater the amount of computation required to infer a non-contradictory
proposition from it and the less information the inferred proposition will convey. A
sentence may be said to be truly anomalous if there is no relevant proposition that can be
deduced from it by pragmatic means.
A second type of clash involving collocational preferences is induced when words are
combined into phrases but add no new information to the string. Cruse calls this
pleonasm and exemplifies it with examples such as John kicked the ball with his foot (Cruse
2000:223). Since kicking involves contact with something by a foot the string final
prepositional phrase adds nothing new and the sentence is odd (even though context may
allow the apparent tautology to be acceptable). Similar oddities arise with collocations
21. Sense relations
467
such as female mother or human author. Note that pleonasm does not always give rise to
feelings of oddity. For example, pregnant female does not seems as peculiar as #female
mother, even although the concept of 'female' is included in 'pregnant'. This
observation is indicative of the fact that certain elements in strings of words have a privileged
status. For example, it appears that pleonastic anomaly is worse in situations in which a
semantic head, a noun or verb that determines the semantic properties of its satellites,
which does not always coincide with what may be identified as the syntactic head of a
construction, appears with a modifier (or sometimes complement, but not always) whose
meaning is contained within that of the head: bovine mammal is better than
^mammalian cow (what other sort of cow could there be?). Pleonastic anomaly can usually be
obviated by substituting a hyponym for one expression or a superordinate of the other
since this will give rise to informativity with respect to the combination of words: female
parent, He struck the ball with his foot and so on.
Collocational preferences are often discussed with respect to the constraints imposed
by verbs or nouns on their arguments and sometimes these constraints have been
incorporated into syntactic theories. In Chomsky (1965), for example, the subcategorisation of
verbs was defined not just in terms of the syntactic categories of their argument but also
their semantic selectional properties. A verb like kick imposes a restriction on its direct
object that it is concrete (in non-metaphorical uses) and on its subject that it is something
with legs like an animal, whereas verbs like think require abstract objects and human
subjects. Subsequent research into the semantic properties of arguments led to the pos-
tulation of participant (or case or thematic) roles which verbs 'assign' to their arguments
with the effect that certain roles constrained the semantic preferences of the verb to
certain sorts of subjects and objects. A verb like/ear, therefore, assigns to its subject the role
of experiencer, thus limiting acceptable collocations to things that are able to fear such as
humans and other animals. Some roles such as experiencer, agent, recipient, etc. arc more
tightly constrained by certain semantic properties, such as animacy, volition, and mobility
than others such as theme, patient (Dowty 1991).
We have seen that paradigmatic relations cannot always be separated from concepts
of syntagmatic relatedness. So Lyons' attempts to define hyponymy in terms of the
modification of a superordinate while the basic relation of antonymy holds only if the word
being modified denotes something that can be appropriately predicated of the relevant
properties. Given that meanings are constructed in natural languages by putting words
together, it would be unsurprising if syntagmatic relations are, in some sense, primary
and that paradigmatic relations are principally determined by collocational properties
between words. Indeed the primacy of syntagmatic relations is supported by psycholin-
guistic and acquisition studies. It is reported in Murphy (2003), for example, that in word
association experiments, children under 7 tend to provide responses that reflect
collocational patterns rather than paradigmatic ones. So to a word like black the response of
young children is more likely to give rise to responses such as bird or board rather than
the antonym white. Older children and adults, on the other hand, tend to give
paradigmatic responses. There is, furthermore, some evidence that knowledge of paradigmatic
relations is associated with metalinguistic awareness: that is, awareness of the properties
of, and interactions between, those words.
The primacy of syntagmatic relations over paradigmatic relations seems further to
be borne out by corpus and computational studies of collocation and lexis. For example,
there are many approaches to the automatic disambiguation of homonyms, identical
468
IV. Lexical semantics
word forms that have different meanings (see below), that rely on syntagmatic
context to determine which sense is the most likely on a particular occasion of the use of
some word. Such studies also rely on corpus work from which collocational
probabilities between expressions are calculated. In this regard, it is interesting also to consider
experimental research in computational linguistics which attempts to induce automatic
recognition of synonyms and hyponyms in texts. Erk (2009) reports research which uses
vector spaces to define representations of words meanings where such spaces are defined
in terms of collocations between words in corpora. Without going into any detail here,
she reports that (near) synonyms can be identified in this manner to a high degree of
accuracy, but that hyponymic relations cannot be identified easily without such
information being directly encoded, but that the use of vector spaces allows such encoding to be
done and to yield good results. Although this is not her point, Erk's results are interesting
with respect to the possible relation between syntagmatic and paradigmatic sense
relations. Synonymy may be defined not just semantically as involving sameness of sense, but
syntactically as allowing substitution in the all the same linguistic contexts (an
idealisation, of course, given the rarity of full synonymy, but probabilistic techniques may be
used to get a definition of degree of similarity between the contexts in which two words
can appear). Hence, we might expect that defining vector spaces in terms of collocational
possibilities in texts will yield a high degree of comparability between synonyms. But
hyponymy cannot be defined easily in syntactic terms, as hyponyms and their superordi-
nates will not necessarily collocate in the same way (for example, four-legged animal is
more likely than four-legged dog while Collie dog is fine but #Collie animal is marginal at
best).Thus, just taking collocation into account, even in a sophisticated manner, will fail
to identify words in hyponymous relations.This implies that paradigmatic sense relations
are 'higher order' or'metalexical' relations that do not emerge directly from syntagmatic
ones.
I leave this matter to one side from now on, because, despite the strong possibility
that syntagmatic relations are cognitively primary, it remains the case that the study of
paradigmatic relations remains the focus of studies of lexical semantics.
2.6. Homonymy and polysemy
Although not sense relations of the same sort as those reviewed above in that they do
not structure the lexicon in systematic ways, homonymy and polysemy have
nevertheless an important place in considerations of word meaning and have played an
important part in the development of theories of lexical semantics since the last two decades
of the twentieth century. Homonymy involves formal identity between words with
distinct meanings (i.e. interpretations with distinct extensions and senses) which Weinreich
(1963) calls "contrastive ambiguity". Such formal identity may involve the way a word
is spoken (homophony 'same sound') such as bank, line, taxi, can, lead (noun)Zled (verb)
and/or orthography bank, line, putting, in which case it is referred to as homography
'same writing'. It is often the case that the term homonymy is reserved only for those
words that are both homophones and homographs, but equally often the term is used for
cither relation. Homonymy may be full or partial: in the former case, every form of the
lexeme is identical for both senses such as holds for the noun punch (the drink or the
action) or it may be partial in which only some forms of the lexeme are identical for both
senses, such as between the verb punch and its corresponding noun. Homonymy leads to
21. Sense relations
469
the sort of ambiguity that is easily resolved in discourse context, whether locally through
syntactic disambiguation (9a), the context provided within a sentence (9b) or from the
topic of conversation (9c).
(9) a. His illness isn't terminal.
b. My terminal keeps cutting out on me.
c. I've just been through Heathrow Airport.The new terminal is rubbish.
In general, there is very little to say about homonymy. It is random and generally
only tolerated when the meanings of the homonyms are sufficiently semantically
differentiated as to be easily disambiguated.
Of more interest is polysemy in which a word has a range of meanings in different local
contexts but in which the meaning differences are taken to be related in some way. While
homonymy may be said to involve true ambiguity, polysemy involves some notion of
vagueness or underspecification with respect to the meanings a polyseme has in different
contexts (see article 23 (Kennedy) Ambiguity and vagueness). The classic example of a
polysemous word is mouth which can denote the mouth of a human or animal or various
other types of opening, such as bottle, and more remotely of river. Unlike homonymy no
notion of contrast in sense is involved and polysemes are considered to have an
apparently unique basic meaning that is modified in context.The word bank is both a homonym
and a polyseme in its meaning of 'financial establishment' between its interpretation as
the institution (The bank raised its interest rates yesterday ) and its physical
manifestation (The bank is next to the school). One of the things that differentiates polysemy from
homonymy is that the different senses of polysemes are not 'suppressed' in context (as
with homonyms) but one aspect of sense is foregrounded or highlighted. Other senses are
available in the discourse and can be picked up by other words in the discourse:
(10) a. Mary tried to jump through the window (aperture), but it was closed (aperture/
physical object) and she broke it (physical object),
b. *Mary walked along the bank of the river. It had just put up interest rates yet again.
Polysemy may involve a number of different properties: change of syntactic category
(11); variation in valency (12); and subcategorisation properties (13).
(11) a. Rambo picked up the hammer (noun).
b. Rambo hammered (verb) the nail into the tree.
(12) a. The candle melted.
b. The heat melted the candle.
(13) a. Rambo forgot that he had buried the cat (clausal complement - factive
interpretation)
b. Rambo forgot to bury the cat (infinitival complement - non-factive interpretation)
Some polysemy may be hidden and extensive, as often with gradable adjectives where
the adjective often picks out some typical property associated with the head noun that it
modifies which may vary considerably from noun to noun, as with the adjective good in
470
IV. Lexical semantics
(15), where the interpretation varies considerably according to the semantics of the head
noun and contrasting strongly with other adjectives like big, as illustrated in (14). Note
that one might characterise the meaning of this adjective in terms of an undcrspccificd
semantics such as that given in (15e).
(14) big car/big computer/big nose: 'big for N'
(15) a. good meal'/tasty, enjoyable, pleasant'
b. good knife:'sharp, easy to handle'
c. good car: 'reliable, comfortable, fast'
d. good typist:'accurate, quick, reliable'
e. good N: 'positive evaluation of some property associated with N'
There are also many common alternations in polysemy that one may refer to as
constructional (or logical) polysemy since they are regular and result from the semantic
properties of what is denoted.
(16) a. Figure/Ground: window, door, room
b. Count/Mass: lamb, beer
c. Container/Contained: bottle, glass
d. Product/Producer: book, Kleenex
e. Plant/Food: apple, spinach
f. Process/Result: examination, merger
Such alternations depend to a large degree on the perspective that is taken with respect
to the objects denoted. So a window may be viewed in terms of an aperture (e.g. when
it is open) or in terms of what it is made of (glass, plastic in some sort of frame) while
other nouns can be viewed in terms of their physical or functional characteristics, and
so on.
Polysemy is not exceptional but rather the norm for word interpretation in context.
Arguably every content word is polysemous and may have its meaning extended in
context, systematically or unsystematically.The sense extensions of mouth, for example, are
clear examples of unsystematic metaphorical uses of the word, unsystematic because the
metaphor cannot be extended to just any sort of opening: ?#mouth of a flask, timouth of
a motorway, ?#mouth of a stream, #mouth of a pothole. Of course, any of these
collocations could become accepted, but it tends to be the case that until a particular collocation
has become commonplace, the phrase will be interpreted as involving real metaphor
rather than the use of a polysemous word. Unsystematic polysemy, therefore, may have
a diachronic dimension with true (but not extreme) metaphorical uses becoming
interpreted as polysemy once established within a language. It is also possible for diachron-
ically homonymous terms to be interpreted as polysemes at some later stage. This has
happened with the word ear where the two senses (of a head and of corn) derive from
different words in Old English (pare andear, respectively).
More systematic types of metaphorical extension have been noted above, but may
also result from metonymy: the use of a word in a non-literal way, often based on a
partwhole or 'connected to' relationships. This may happen with respect to names of
composers or authors where the use of the name may refer to the person or to what they
21. Sense relations
471
have produced. (17) may be interpreted as Mary liking the man or the music (and indeed
listening to or playing the latter).
(17) Mary likes Beethoven.
Ad hoc types of metonymy may simply extend the concept of some word to some aspect
of a situation that is loosely related to, but contextually determined by, what the word
actually means. John has new wheels may be variously interpreted as John having a
new car or, if he was known to be paraplegic, as him having a new wheelchair. A more
extreme, but classic, example is one like (18) in which the actual meaning of the food
lasagna is extended to the person who ordered it. (Such examples arc also known as 'ham
sandwich' cases after the examples found in Nunberg 1995).
(18) The lasagna is getting impatient.
Obviously context is paramount here. In a classroom or on a farm, the example would
be unlikely to make any sense, whereas in a restaurant where the situation necessarily
involves a relation between customers and food, a metonymic relation can be easily
constructed. Some metonymic creations may become established within a linguistic
community and thus become less context-dependent. For example, the word suit(s) may refer
not only to the garment of clothing but also to people who wear them and thus the word
gets associated with types of people who do jobs that involve the wearing of suits, such
as business people.
3. Sense relations and word meaning
As indicated in the discussion above, the benefit of studying sense relations appears to
be that it gives us an insight into word meaning generally. For this reason, such relations
have often provided the basis for different theories of lexical semantics.
3.1. Lexical fields and componential analysis
One of the earliest modern attempts to provide a theory of word meaning using sense
relations is associated with European structuralists, developing out of the work of de
Saussure in the first part of the twentieth century. Often associated with theories of
componential analysis (see below), lexical field theory gave rise to a number of vying
approaches to lexical meaning, but which all share the hypothesis that the meanings
(or senses) of words derive from their relations to other words within some thematic/
conceptual domain defining a semantic or lexical field. In particular, it is assumed that
hierarchical and contrastive relations between words sharing a conceptual domain is
sufficient to define the meaning of those words. Early theorists such as Trier (1934) or Porzig
(1934) were especially interested in the way such fields develop over time with words
shifting with respect to the part of a conceptual field that they cover as other words come
into or leave that space. For Trier, the essential properties of a lexical field are that:
(i) the meaning of an individual word is dependent upon the meaning of all the other
words in the same conceptual domain;
472
IV. Lexical semantics
(ii) a lexical field has no gaps so that the field covers some connected conceptual space
(or reflects some coherent aspect of the world);
(iii) if a word undergoes a change in meaning, then the whole structure of the lexical
field also changes.
One of the obvious weaknesses of such an approach is that the identification of a
conceptual domain cannot be identified independently of the meaning of the expressions
themselves and so appears somewhat circular. Indeed, such research presented little
more than descriptions of diachronic semantic changes as there was little or no
predictive power in determining what changes are and are not possible within lexical fields, nor
what lexical gaps arc tolerated and what not. Indeed, it seems reasonable to suppose that
no such theory could exist, given the randomness that the lexical development of conten-
tive words displays and so there is no reason to suppose that sense relations play any part
in determining such change. (See Ullman 1957,Geckeler 1971,Coseriu & Geckeler 1981,
for detailed discussions of field theories at different periods of the twentieth century.)
Although it is clear that 'systems of interrelated senses' (Lyons 1977: 252) exist
within languages, it is not clear that they can usefully form the basis for explicating word
meaning.The most serious criticism of lexical field theory as more than a descriptive tool
is that it has unfortunate implications for how humans could ever know the meaning
of a word: if a word's meaning is determined by its relation to other words in its lexical
field, then to know that meaning someone has to know all the words associated with that
lexical field. For example, to know the meaning of tulip, it would not be enough to know
that it is a hyponym of (plant) bulb and a co-hyponym of daffodil, crocus, anemone, lily,
dahlia but also to trillium, erythronium, bulbinella, disia, brunsvigia and so on. But only
a botanist specialising in bulbous plants is likely to know anything like the complete list
of names, and even then this is unlikely. Of course, one might say that individuals might
have a shallower or deeper knowledge of some lexical field, but the problem persists, if
one is trying to characterise the nature of word meaning within a language rather than
within individuals. And it means that the structure of a lexical field and thus the meaning
of a word will necessarily change with any and every apparent change in knowledge. But
it is far from clear that the meaning of tulip would be affected if, for example, botanists
decided that that a disia is not bulbous but rhizomatous, and thus does not after all form
part of the particular lexical field of plants that have bulbs as storage organs. It is obvious
that someone can be said to know the meaning of tulip independently of whether they
have any knowledge of any other bulb or even flowering plant.
Field theory came to be associated in the nineteen-sixties with another theory of word
meaning in which sense relations played a central part.This is the theory of componen-
tial analysis which was adapted by Katz & Fodor (1963) for linguistic meaning from
similar approaches in anthropology. In this theory, the meaning of a word is decomposed
into semantic components, often conceived as features of some sort. Such features are
taken to be cognitively real semantic primitives which combine to define the meanings
of words in a way that automatically predicts their paradigmatic sense relations with
other words. For example, one might decompose the two meanings of dog as consisting
of the primitive features [CANINE] and [CANINE, MALE, ADULT]. Since the latter
contains the semantic structure of the former, it is directly determined to be a hyponym.
Assuming that bitch has the componential analysis [CANINE, FEMALE, ADULT], the
hyponym meaning of dog is easily identified as an antonym of bitch as they differ one
21. Sense relations
473
just one semantic feature. So the theory provides a direct way of accounting for sense
relations: synonymy involves identity of features; hyponymy involves extension of
features; and antonymy involves difference in one feature. Although actively pursued in
the nineteen sixties and seventies, the approach fell out of favour in mainstream
linguistics for a number of reasons. From an ideological perspective, the theory became
associated with the Generative Semantics movement which attempted to derive surface
syntax from deep semantic meaning components. When this movement was discredited,
the logically distinct semantic theory of componential analysis was mainly rejected too.
More significantly, however, the theory came in for heavy criticism. In the first place,
there is the problem of how primitives are to be identified, particularly if the
assumption is that the set of primitives is universal. Although more recent work has attempted
to give this aspect of decompositional theories as a whole a more empirically motivated
foundation (Wierzbicka 1996), nevertheless there appears to be some randomness to the
choice of primitives and the way they are said to operate within particular languages.
Additionally, the theory has problems with things like natural kinds: what distinguishes
[CANINE] from the meaning of dog or [EQUINE] from horse! And does each animal
(or plant) species have to be distinguished in this way? If so, then the theory achieves
very little beyond adding information about sex and age to the basic concepts described
by dog and horse. Using Latinate terms to indicate sense components for natural
kinds simply obscures the fact that the central meanings of these expressions are not
decomposable. An associated problem is that features were often treated as binary so
that, for example, puppy might be analysed as [CANINE,-ADULT]. Likewise, instead
of MALE/FEMALE one might have ±MALE or [-ALIVE] for dead. The problem
here is obvious: how does one choose a non-arbitrary property as the unmarked one?
±FEMALE and ±DEAD are just as valid as primitive features, as the reverse, reflecting
the fact that male/female and dead/alive are equipollent antonyms (see section 2.3).
Furthermore, restriction to binary values excludes the inclusion of relational concepts that
are necessary for any analysis of meaning in general. Overall, then, while componential
analysis does provide a means of predicting sense relations, it does so at the expense of a
considerable amount of arbitrariness.
3.2. Lexical decomposition
Although the structuralist concept of lexical fields is one that did not develop in the way
that its proponents might have expected, nevertheless it reinforced the view that words
are semantically related and that this relatedness can be identified and used to
structure a vocabulary. It is this concept of a structured lexicon that persists in mainstream
linguistics. In the same way, lexical decompositional analyses have developed in rather
different ways than were envisaged at the time that componcntial semantic analysis was
developed. See article 17 (Engelberg) Frameworks of decomposition.
Decomposition of lexical meaning appears in Montague (1973), one of the earliest
attempts to provide a formal analysis of a fragment of a natural language as one of two
different mechanisms for specifying the interpretations of words. Certain expressions
with a logical interpretation, like be and necessarily, are decomposed, not into
cognitive primitives, but into complex logical expressions, reflecting their truth-conditional
content. For example, necessarily receives the logical translation "kp\upx\, where the
abstracted variable, p, ranges over propositions and the decomposition has the effect
474
IV. Lexical semantics
of equating the semantics of the adverb with that of the logical necessity operator, □.
Montague restricted decomposition of this sort to those grammatical expressions whose
truth conditional meaning can be given a purely logical characterisation. In a detailed
analysis of word meaning within Montague Semantics, however, Dowty (1979) argued
that certain entailments associated with content expressions are constant in the same
way as those associated with grammatical expressions and he extended the decomposi-
tional approach to analyse such words in order to capture such apparently independent
entailments.
Dowty's exposition is concerned primarily with inferences from verbs (and more
complex predicates) that involve tense and modality. By adopting three operators
DO, BECOME and CAUSE, he is able to decompose the meanings of a range of
different types of verbs, including activities, accomplishments, inchoatives and causatives,
to account for the entailments that can be drawn from sentences containing them. For
example, he provides decomposition rules for de-adjectival inchoative and causative
verbs in English that modify the predicative interpretation of base adjectives in English.
Dowty uses the propositional operator BECOME for inchoative interpretations of (e.g.)
cool: Xx [BECOME cool'(x)] where cool is the semantic representation of the meaning
of the predicative adjective and the semantics of BECOME ensures that the resulting
predicate is true of some individual just in case it is now cool but just previously was not
cool.The causative interpretation of the verb involves the CAUSE operator in addition:
Xy "kx [x CAUSE BECOME cooV{y)\ which guarantees the entailment between (e.g.)
Mary cooled the wine and Mary caused the wine to become cool. Dowty also gives more
complex (and less obviously logical) decompositions for other content expressions, such
as kill which may be interpreted as x causes y to become not alive.
The quasi-logical decompositions suggested by Dowty have been taken up in
theories such as Role and Reference Grammar (van Valin & LaPolla 1997) but primarily
for accounting for syntagmatic relations such as argument realisation, rather than for
accounting for paradigmatic sense relations.The same is not quite true for other decom-
positional theories of semantics such as that put forward in the Generative Lexicon
Theory of Pustejovsky (1995). Pustejovsky presents a theory designed specifically to
account for structured polysemous relations such as those given in (11-13) utilising a
complex internal structure for word meanings that goes a long way further than that put
forward by Katz & Fodor (1963). In particular, words are associated with a number of
different 'structures' that may be more or less complex. These include: argument
structure which gives the number and semantic type of logical arguments; event structure
specifying the type of event of the lexeme; and lexical inheritance structure, essentially
hyponymous relations (which can be of a more general type showing the hierarchical
structure of the lexicon). The most powerful and controversial structure proposed is the
qualia structure. Qualia is a Latin term meaning 'of whatever sort' and is used for the
Greek aitiai 'blame', 'responsibility' or'cause' to link the current theory with Aristotle's
modes of explanation. Essentially the qualia structure gives semantic properties of a
number of different sorts concerning the basic sense properties, prototypicality
properties and encyclopaedic information of certain sorts. This provides an explicit model
for how meaning shifts and polyvalcncy phenomena interact. The qualia structure
provides the structural template over which semantic combinatorial devices, such as co-
composition, type coercion and subselection, may apply to alter the meaning of a lexical
item. Pustejovsky (1991:417) defines qualia structure as:
21. Sense relations
475
— The relation between [a word's denotation] and its constituent parts
— That which distinguishes it within a larger domain (its physical characteristics)
— Its purpose and function
— Whatever brings it about
Such information is used in interpreting sentences such as those in (19)
(19) a. Bill uses the train to get to work,
b. This car uses diesel fuel.
The verb use is scmantically undcrspccificd and the factors that allow us to determine
which sense is appropriate for any instance of the verb are the qualia structures for each
phrase in the construction and a rich mode of composition, which is able to take
advantage of this information. For example, in (19a) it is the function of trains to take people to
places, so use here may be interpreted as 'catches', 'takes' or 'rides on'. Analogously, cars
contain engines and engines require fuel to work, so the verb in (19b) can be interpreted
as 'runs on', 'requires', etc. Using these mechanisms Pustejovsky provides analyses of
complex lexical phenomena, including coercion, polysemy and both paradigmatic and
syntagmatic sense relations.
Without going into detail, Pustejovskys fundamental hypothesis is that the lexicon is
generative and compositional, with complex meanings deriving from less complex ones
in structured ways, so that the lexical representations of words should contain only as
much information as they need to express a basic concept that allows as wide a range of
combinatorial properties as possible. Additionally, lexical information is hierarchically
structured with rules specifying how phrasal representations can be built up from lexical
ones as words are combined. A view which contrasts strongly with that discussed in the
next section. Sec article 17 (Engclbcrg) Frameworks of decomposition.
3.3. Meaning postulates and semantic atomism
In addition to decomposition for logico-grammatical word meaning, Montague (1973)
also adopted a second approach to accounting for the meaning of basic expressions, one
that relates the denotations of words (analogously, concepts) to each other via logical
postulates. Meaning Postulates were introduced in Carnap (1956) and consist of
universally quantified conditional or bi-conditional statements in the logical metalanguage
which constrain the denotations of the constant that appears in the antecedent. For
example, Montague provides an example that relates the denotations of the verb seek
and the phrase try to find. (20) states (simplified from Montague) that for every instance
of x seeking y there is an instance of x trying find y:
(20) □VxVy[seek'(.*,>') <-> try-to (*,A find'(x,y)]
Note that the semantics of seek, on this approach, does not contain the content of try
to find, as in the decompositional approach. The necessity operator, □, ensures that the
relation holds in all admissible models, i.e. in all states-of-affairs that we can talk about
using the object language. This raises the bi-conditional statement to the status of a
logical truth (an axiom) which ensures that on every occasion in which it is true to say of
476
IV. Lexical semantics
someone that she is seeking something then it is also true to say that she is trying to
find that something (and vice versa). Meaning postulates provide a powerful tool for
encoding detailed information about non-logical entailments associated with particular
lexemes (or their translation counterparts). Note that within formal, model-theoretic,
semantics such postulates act, not as constraints on the meaning of words, but their
denotations. In other words, they reflect world knowledge, how situations are, not how word
meanings relate to each other. While it is possible to use meaning postulates to capture
word meanings within model-theoretic semantics, this requires a full intensional logic
and the postulation of'impossible worlds', to allow fine-grained differentiations between
worlds in which certain postulates do not hold. (See Cann 1993 for an attempt at this
and a critique of the notion of impossible worlds in Fox & Lappin 2005, cf. also article 33
(Zimmermann) Model-theoretic semantics).
A theory that utilises meaning postulates treats the meaning of words as atomic with
their semantic relations specified directly. So, although traditional sense relations, both
paradigmatic and syntagmatic,can easily be reconstructed in the system (see Cann 1993
for an attempt at this) they do not follow from the semantics of the words themselves.
For advocates of this theory, this is taken as an advantage. In the first place, it allows
for conditional, as opposed to bi-conditional, relations, as necessary in a decomposi-
tional approach. So while we might want to say that an act of killing involves an act
of causing something to die, the reverse may not hold. If kill is decomposed as x CAUSE
{BECOME{-*alive\y))), then this fact cannot be captured. A second advantage of
atomicity is that even if a word's meaning can be decomposed to a large extent, there is
nevertheless often a 'residue of meaning' which cannot be decomposed into other
elements. This is exactly what the feature CANINE is in the simple componential analysis
given above: it is the core meaning of dog/bitch that cannot be further decomposed.
In decomposition, therefore, one needs both some form of atomic concept and the
decomposed elements whereas in atomic approaches word meanings arc individual
concepts (or denotations), not further decomposed. What relations they have with the
meanings of other words is a matter of the world (or of experience of the world) not of
the meanings of the words themselves. Fodor & Lepore (1998) argue extensively against
decompositionality, in particular against Pustejovsky's notion of the generative lexicon,
in a way similar to the criticism made against field theories above. They argue that while
it might be that a dog (necessarily) denotes an animal, knowing that dogs are animals is
not necessary for knowing what dog means. Given the non-necessity of knowing these
inferences for knowing the meaning of the word means that they (including interlexical
relations) should not be imposed on lexical entries, because these relations are not part
of the linguistic meaning.
Criticisms can be made of atomicity and the use of meaning postulates (see
Pustejovsky 1998 for a rebuttal of Fodor & Lepore's views). In particular, since meaning
postulates are capable of defining any type of semantic relation, traditional sense
relations form just arbitrary and unpredictable parts of the postulate system, impossible to
generalise over. Nevertheless it is possible to define theories in which words have atomic
meanings, but the paradigmatic sense relations are used to organise the lexicon. Such a
one is WordNct developed by George A. Miller (1995) to provide a lexical database of
English organised by grouping words together that are cognitive synonyms (a synset),
each of which expresses a distinct concept with different concepts associated with a word
being found in different synsets (much like a thesaurus). These synsets then are related
21. Sense relations
477
to each other by lexical and conceptual properties, including the basic paradigmatic sense
relations. Although it remains true that the sense relations are stated independently of
the semantics of the words themselves, nonetheless it is possible to claim that using them
as an organisational principle of the lexicon provides them with a primitive status with
respect to human cognitive abilities. WordNet was set up to reflect the apparent way
that humans process expressions in a language and so using the sense relations as an
organisational principle is tantamount to claiming that they are the basis for the
organisation of the human lexicon, even if the grouping of specific words into synsets and the
relations defined between them is not determined directly by the meanings of the words
themselves. (See article 110 (Frank & Pado) Semantics in computational lexicons).
A more damning criticism of the atomic approach is that context-dependent
polysemy is impossible because each meaning (whether treated as a concept or a denotation)
is in principle independent of every other meaning. A consequence of this, as Pustejovsky
points out, is that every polysemous interpretation of a word has to be listed separately
and the interpretation of a word in context is a matter of selecting the right concept/
denotation a priori. It cannot be computed from aspects of the meaning of a word with
those of other words with which it appears. For example, the meanings of gradable
adjectives such as good in (15) will need different concepts associated with each collocation
that are in principle independent of each other. Given that new collocations between
words are made all the time and under the assumption that the number of slightly new
senses that result arc potentially infinite in number, this is a problem for storage given
the finite resources of the human brain. A further consequence is that, without some
means of computing new senses, the independent concepts cannot be learned and so
must be innate. While Fodor (1998) has suggested the possibility of the consequence
being true, this is an extremely controversial and unpopular hypothesis that is not likely
to help our understanding of the nature of word meaning.
4. Conclusion
In the above discussion, I have not been able to more than scratch the surface of the
debates over the sense relations and their place in theories of word meaning. I have not
discussed the important contributions of decompositionalists such as Jackendoff, or the
problem of analyticity (Quine 1960), or the current debate between contextualists and
semantic minimalists (Cappelen & Lepore 2005, Wedgwood 2007). Neither have I gone
into any detail about the variations and extensions of sense relations themselves, such
as is often found in Cognitive Linguistics (e.g. Croft & Cruse 2004). And much more
besides. Are there any conclusions that we can currently draw? Clearly, sense relations
are good descriptive devices helping with the compilation of dictionaries and thesauri, as
well as the development of large scale databases of words for use in various applications
beyond the confines of linguistics, psychology and philosophy. It would, however, appear
that the relation between sense relations and word meaning itself remains problematic.
Given the overriding context dependence of the latter, it is possible that pragmatics will
provide explanations of observed phenomena better than explicitly semantic approaches
(see for example, Blutner 2002, Murphy 2003, Wilson & Carston 2006). Furthermore, the
evidence from psycholinguistic and developmental studies, as well as the collocational
sensitivity of sense, indicates that syntagmatic relations may be cognitivcly primary and
that paradigmatic relations may be learned, either explicitly or through experience as
22. Dual oppositions in lexical meaning 479
Quine. Willard van Orman 1960. Word and Object. Cambridge, MA:Tlie MIT Press.
Sperber. Dan & Deirdre Wilson 1986/1995. Relevance. Communication and Cognition. Oxford:
Blackwell. 2nd edn. (with postface) 1995.
Trier, Jost 1934. Das sprachliche Feld. Eine Auseinandersetzung. Neue Jahrbucher fur Wbsenschaft
und Jugendbildung 10, 428-449. Reprinted in: L. Antal (ed.). Aspekte der Semantik. Zu ihrer
Theorie und Geschichte 1662-1970. Frankfurt/M.: Athenaum, 1972,77-104.
Ullman, Stephen 1957. The Principles of Semantics. 2nd edn. Glasgow: Jackson. 1st edn. 1951.
van Valin, Robert D. & Randy J. LaPolla 1997. Syntax. Structure, Meaning, and Function.
Cambridge: Cambridge University Press.
Wedgwood. Daniel 2007. Shared assumptions. Semantic minimalism and relevance theory. Journal
of Linguistics 43,647-681.
Weinreich. Uriel 1963. On the semantic structure of language. In: J. H. Greenberg (ed.). Universals
of Language. Cambridge, MA:The MIT Press, 142-216.
Wicrzbicka. Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wilson, Deirdre & Robyn Carston 2006. Metaphor, relevance and the 'emergent property' issue.
Mind & Language 21.404-433.
Ronnie Cann, Edinburgh (United Kingdom)
22. Dual oppositions in lexical meaning
1. Preliminaries
2. Duality
3. Examples of duality groups
4. Semantic aspects of duality groups
5. Phase quantification
6. Conclusion
7. References
Abstract
Starting from well-known examples, a notion of duality is presented that overcomes the
shortcomings of the traditional definition in terms of internal and external negation. Rather
duality is defined as a logical relation in terms of equivalence and contradiction. Based on
the definition, the notion of duality groups, or squares, is introduced along with examples
from quantification, modality, aspectual modification and scalar predication (adjectives).
The groups exhibit remarkable asymmetries as to the lexicalization of their four potential
members. The lexical gaps become coherent if the members of duality groups are
consistently assigned to four types, corresponding e.g. to some, all, no, and not all. Among
these types, the first two are usually lexicalized, the third is only rarely and the fourth
almost never. Using the example of the German schon ("already") group, scalar adjectives
and standard quantifiers, the notion of phase quantification is introduced as a general
pattern of second-order predication which subsumes quantifiers as well as aspectual
particles and scalar adjectives. Four interrelated types of phase quantifiers form a duality
group. According to elementary monotonicity criteria the four types rank on a scale of
markedness that accounts for the lexical distribution within the duality groups.
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 479-506
480
IV. Lexical semantics
1. Preliminaries
Duality of lexical expressions is a fundamental logical relation. However, unlike others
such as antonymy it enjoys much less attention. Duality relates all and some, must
and can, possible and necessary, already and still, become and stay. Implicitly, it is even
involved in ordinary antonymy such as between big and small. Traditionally duality is
defined in terms of inner and outer negation: two operators are dual iff the outer
negation of the one is equivalent to the inner negation of the other; alternatively two
operators are dual iff one is equivalent to the simultaneous inner and outer negation of the
other. For example, some is equivalent to not all not. Duality, in fact, is a misnomer.
Given the possibility of inner and outer negation, there are always four cases involved
with duality: a given operator, its outer negation, its inner negation and its dual, i.e. inner
plus outer negation. Gottschalk (1953) therefore proposed to replace the term duality by
quaternality.
In this article, a couple of representative examples are introduced before we proceed
to a formal definition of the relation of duality. The general definition of duality is not as
trivial as it might appear at first sight. Inner and outer negations are not always available
for dual operators at the syntactic level whence it is necessary to base the definition on
a semantic notion of negation. Following the formal definition of duality, a closer look
is taken at a variety of complete duality groups of four, their general structure and their
relationship to the so-called Square of Oppositions of Aristotle's.
Duality groups of four exhibit striking asymmetries: out of the four possible cases, two
are almost always lexicalized, while the third is occasionally and the fourth almost never.
(If the latter two are not lexicalized they are expressed by using explicit negation with
one of the other two cases.) Criteria will be offered for assigning the four members of a
group to four types defined in terms of monotonicity and "tolerance".
A general conceptual format is described that allows the analysis of dual
operators as instances of the general pattern of "phase quantification". This is a pattern
of second-order predication; a phase quantifier predicates about a given first-order
predication that there is, or is not, a transition on some scale between the predication
being false and being true, i.e. a switch in truth-value. Four possibilities arise out of
this setting: (i) there is a transition from false to true, (ii) there is no transition from
false to true, (iii) there is a transition from true to false; (iv) there is no transition from
true to false. These four possibilities of phase quantification form a duality group of
four. It can be argued that all known duality groups semantically are instances of this
general scheme.
1.1. First examples
1.1.1. Examples from logic
Probably the best-known cases of duality are the quantifiers in standard predicate logic,
3 and V. The quantifiers are attached a variable and combined with a sentence (formula,
proposition), to yield a quantified sentence.
(1) a. VxP for every xP
b. 3x P for at least one x P
22. Dual oppositions in lexical meaning 481_
Duality of the two quantifiers is stated in the logical equivalences in (2):
(2) a.
b.
c.
d.
3xP
VxP
-,3xP
-,VxP
■
■
■
■
-,Vx-,P
-,3x-,P
Vx-,P
3x-,P
Duality can be paraphrased in terms of EXTERNAL NEGATION and INTERNAL
NEGATION (cf. article 63 (Herburger) Negation). External negation is the negation of
the whole statement, as on the left formula in (2c,d) and on the right formula in (2a,b).
Internal negation concerns the part of the formula following the quantificr,i.c.the "scope"
of the quantifier (cf. article 62 (Szabolcsi) Scope and binding). For example, according to
(2c) the external negation of existential quantification is logically equivalent to the internal
negation of universal quantification, and vice versa in (2d). If both sides in (2c) and (2d)
are negated and double negation is eliminated, one obtains (2a) and (2b), respectively. In
fact the four equivalences in (2) are mutually equivalent: they all state that universal and
existential quantification are duals.
Dual operators are not necessarily operators on sentences. It is only required that
at least one of their operands can undergo negation ("internal negation"), and that the
result of combining the operator with its operand(s) can be negated, too ("external
negation").
Another case of duality is constituted by conjunction a and disjunction v; duality of
the two connectives is expressed by De Morgan's Laws, for example:
(3) -,(A a B) ■ (-.A v-,B)
These dual operators arc two-place connectives, operating on two sentences, and internal
negation is applied to both operands.
1.1.2. First examples from natural language
The duality relationship between 3 and V is analogously found with their natural
language equivalents some and every. Note that sentences with some NPs as subject are
properly negated by replacing some with no (cf. Lobner 2000: §1 for the proper negation
of English sentences):
(4) a. some tomatoes are green = not every tomato is not green
b. every tomato is green = no tomato is not green
c. no tomato is green = every tomato is not green
d. not every tomato is green = some tomatoes are not green
The operand of the quantificational subject NP is its 'nuclear' scope, the VP.
Modal verbs are another field where duality relations are of central importance.
Modal verbs combine with infinitives. A duality equivalence for epistemic must and can
is stated in (5):
(5) he must have lied = he cannot have told the truth
482
IV. Lexical semantics
Aspectual particles such as already and still are among the most thoroughly studied
cases of dual operators. Their duality can be demonstrated by pairs of questions and
negative answers as in (6). Let us assume that on and off arc logically complementary, i.e.
equivalent to the negations of each other:
(6) a. Is the light already on? - No, the light is still off.
b. Is the light still on? - No, the light is already off.
1.2. Towards a general notion of duality
The relationship of duality is based on logical equivalence. Duality therefore constitutes
a logical relation. In Model-theoretic semantics (cf. article 33 (Zimmermann) Model-
theoretic semantics), meaning is equated with truth conditions; therefore logical relations
are considered sense relations (cf. article 21 (Cann) Sense relations). However, in richer
accounts of meaning that assume a conceptual basis for meanings, logical equivalence
does not necessarily amount to equal meanings (cf. Lobner 2002, 2003: §§4.6, 10.5). It
could therefore not be inferred from equivalences such as in (4) to (6) that the meanings
of the pairs of dual expressions match in a particular way. All one can say is that their
meanings are such that they result in these equivalences.
In addition, expressions which exhibit duality relations are rather abstract in meaning
and, as a rule, can all be used in various different constructions and meanings. In general,
the duality relationship only obtains when the two expressions are used in particular
constructions and/or in particular meanings. For example, the dual of German schon
"already" is noch "still" in cases like the one in (6), but in other uses the dual of schon is
erst (temporal "only");/ioc/i on the other hand has uses where it does not possess a dual
altogether (cf. Lobner 1989 for a detailed discussion).
The duality relation crucially involves external negation of the whole complex of
operator with operands, and internal operand negation. The duality relationship may
concern one operand (out of possibly more) as in the case of the quantifiers, modal verbs
or aspectual particles, or more than one (witness conjunction and disjunction). In order
to permit internal negation, the operands have to be of sentence type or else of some
type of predicate expressions. For the need of external negation, the result of combining
dual operators with their operands must itself be eligible for negation.
A first definition of duality, in accordance with semantic tradition, would be:
(7) Let q and q' be two operators that fulfil the following conditions:
a. they can be applied to the same domain of operands
b. the operands can be negated (internal negation)
c. the results of applying the operators to appropriate operands can be negated
(external negation)
Then q and q' are DUALS iff external negation of one is equivalent to internal
negation of the other.
This definition, however, is in need of modification. First, "negation" must not be
taken in a syntactic sense as it usually is. If it were, English already and still would not
be candidates for duality, as they allow neither external nor internal syntactic negation.
22. Dual oppositions in lexical meaning 483
This is shown in Lobner (1999:89f) for internal negation;as to external negation,already
and still can only be negated by replacing them with not yet and no more/not anymore,
respectively. The term 'negation' in (7) has therefore to be replaced by a proper logical
notion.
A second inadequacy is hidden in the apparently harmless condition (a): dual
operators need not be defined for the same domain of operands. For example, already and
still have different domains: already presupposes that the state expressed did not obtain
before, while still presupposes that it may not obtain later. Therefore, (8a) and (8b) are
semantically odd if we assume that one cannot be not young before being young, or not
old after being old:
(8) a. She's already young.
b. She's still old.
These inadequacies of the traditional definition will be taken care of below.
1.3. Predicates, equivalence, and negation
1.3.1. Predicates and predicate expressions
For proper semantic considerations, it is very important to carefully distinguish between
the levels of expression and of meaning, respectively. Unfortunately there is a
terminological tradition that conflates these two levels when talking of "predicates", "arguments",
"operators", "operands", "quantifiers" etc.: these terms are very often used both for
certain types of expressions and for their meanings. In order to avoid this type of confusion,
the following terminological distinctions will be observed in this article: A "predicate" is
a meaning; what a meaning is depends on semantic theory (cf. article 1 (Maicnborn, von
Heusinger & Portner) Meaning in linguistics). In a model-theoretic approach, a predicate
would be a function that assigns truth values to one or more arguments; in a cognitive
approach, a predicate can be considered a concept that assigns truth values to arguments.
For example, the meaning of has lied would be a predicate (function or concept) which
in a given context (or possible world) assigns the truth value true to everyone who has
lied and false to those who told the truth. Expressions, lexical or complex, with predicate
meanings will be called PREDICATE EXPRESSIONS. ARGUMENTS which a
predicate is applied to are neither expressions nor meanings; they are objects in the world (or
universe of discourse); such objects may or may not be denoted by linguistic expressions;
if they arc, let us call these expressions ARGUMENT TERMS. (For a more
comprehensive discussion of these distinctions see Lobner 2002, 2003: §6.2) Sometimes, arguments
of predicate expressions are not explicitly specified by means of an argument term. For
example, sentences are usually considered as predicating about a time argument, the time
of reference (cf. article 57 (Ogihara) Tense), but very often, the time of reference is not
specified by an explicit expression such as yesterday.The terms OPERATOR, OPERAND
and QUANTIFIER will all be used for certain types of expressions.
If the traditional definition of duality given in (7) is inadequate, it is basically because
it attempts to define duality at the level of expressions. Rather it has to be defined at
the level of meanings because it is a logical relation and logical relations between
expressions originate from their meanings.
484
IV. Lexical semantics
1.3.2. The operands of dual operators
The first prerequisite for an adequate definition of duality is a precise semantic
characterization of the operands of dual operators ("d-operators", for short). Since the
operands must be eligible for negation, their meanings have to be predicates, i.e. something
that assigns a truth-value to arguments. Negation ultimately operates on truth-values;
its effect on a predicate is the conversion of the truth value assigned. Predicate
expressions range from lexical expressions such as verbs, nouns and adjectives to complex
expressions like VPs, NPs, APs or whole sentences. In (4), the dual operators are the
subject NPs some tomatoes and every tomato; their operands are the VPs is/are green and
their respective negations is/are not green; they express predications about the tomatoes
referred to. In (5), the operands of the dual modal verbs are the infinitives have lied and
have told the truth; they express predications about the referent of the subject NP he; in
(6), the operands of already and still are the remainders of the sentence: the light is on/off;
in this case, these sentences are taken as predicates about the reference time implicitly
referred to.
Predicates are never universally applicable, but only in a specific DOMAIN of cases.
For a predicate P, its domain D(P) is the set of those tuples of arguments the predicate
assigns a truth value to. The notion of domain carries over to predicate expressions: their
domain is the domain of the predicate that constitutes their meaning.
In the following, PRESUPPOSITIONS (cf. article 91 (Beaver & Geurts)
Presupposition) of a sentence or other predicate expression are understood as conditions that
simply restrict the domain of the predicate that is its meaning. For example, the sentence
Is the light already on in (6) presupposes (among other conditions) (pi) that there is
a uniquely determined referent of the NP the light (cf. article 41 (Hcim) Definiteness
and indefiniteness) and (p2) that this light was not on before. The predication expressed
by the sentence about the time of reference is thus restricted to those times when (pi)
and (p2) are fulfilled, i.e. those times where there is a unique light which was not on
before. In general, a predicate expression p will yield a truth-value for a given tuple
of arguments if and only if the presuppositions of p are fulfilled. This classical
Frcgcan view of presuppositions is adequate here, as wc arc dealing with the logical
level exclusively. It follows from this notion of presupposition that predicate
expressions which are defined for the same domain of arguments necessarily carry identical
presuppositions. In particular that is the case if two predicate expressions are logically
equivalent:
Definition 1: logical equivalence
Let p and p' be predicate expressions with identical domains, p and p' are LOGICALLY
EQUIVALENT - p ■ p' - iff for every argument tuple in their common domain, p and p'
yield identical truth values.
1.3.3. The negation relation
The crucial relation of logical contradiction can be defined analogously. It will be called
'neg\ the 'neg(ation) relation'. Expressions in this relationship, too, have identical
presuppositions.
22. Dual oppositions in lexical meaning
485
Definition 2: negation relation
Let p and p' be predicate expressions with identical domains, p and p' are NEG-
OPPOSITES, or NEGATIVES, of each other - p neg p' - iff for every argument tuple out
of their common domain, p and p' yield opposite truth values.
Note that this is a semantic definition of negation, as it is defined in terms of predicates,
i.e. at the level of meaning. The tests of duality require the construction of pairs of neg-
opposites, or NEG-PAIRS for short. This means to come up with means of negation
at the level of expressions, i.e. with lexical or grammatical means of converting the
meanings of predicate expressions.
For sentences, an obvious way of constructing a negative is the application (or de-
application) of grammatical negation ('g-negation', in the following). In English, g-
negation takes up different forms depending on the structure of the sentence (cf. Lobner
2000: §1 for more detail). The normal form is g-negation by VP negation, with do aux-
iliarization in the case of non-auxiliary verbs. If the VP is within the scope of a higher-
order operator such as a focus particle or a quantifying expression, either the higher-order
operator is subject to g-negation or it is substituted by its neg-opposite. Such higher-order
operators include many instances of duality. English all, every, always, everywhere, can and
others can be directly negated, while some,sometimes,somewhere, must, always,still etc. are
replaced for g-negation by no, never, nowhere, need not, not yet and no more, respectively.
For the construction of neg-pairs of predicate expressions other than sentences,
sometimes g-negation can be used, e.g. for VPs. In other cases, lexical inversion may be
available, i.e. the replacement of a predicate expression by a lexical neg-opposite such as on/
off, to leave/to stay, member/non-member. Lexical inversion is not a systematic means of
constructing neg-pairs because it is contingent on what the lexicon provides. But it is a
valuable instrument for duality tests.
1.4. Second-order predicates and subnegation
1.4.1. D-operators
D-operators must be eligible to negation and therefore predicate expressions
themselves; as wc saw, at least one of their arguments, their opcrand(s) must, again, be a
predicate expression. For example, the auxiliary must in (5) is a predicate expression that
takes another predicate expression have lied as its operand. In this sense, the d-operators
are second-order predicate expressions, i.e. predicate expressions that predicate about
predicates. D-operators may have additional predicate or non-predicate arguments. For
an operator q and its operand p let 'q(p)' denote the morpho-syntactic combination
of q and p, whatever its form. In terms of the types of Formal Semantics, the simplest
types of d-operators would be (t,t) (sentential operators) and ((a,t),t) (quantifiers); a
frequent type, represented by focus particles, is (((a,t),a),t) (cf. article 33 (Zimmermann)
Model-theoretic semantics for logical types).
1.4.2. Negation and subnegation
When the definition of the neg-relation is applied to d-operators, it captures external
negation. The case of internal negation is taken care of by the 'subneg(ation) relation'.
486
IV. Lexical semantics
Two operators are subneg-opposites if, loosely speaking, they yield the same truth values
for neg-opposite operands.
Definition 3: subnegation opposites
Let q and q' be operators with a predicate type argument. Let the predicate domains of q
and q' be such that q yields a truth value for a predicate expression p iff q' yields a truth
value for the neg-opposites of p.
q and q' are SUBNEG(ATION) OPPOSITES, or SUBNEGATIVES, of each other - q
SUBNEG q' - iff
> for any predicate expressions p and p' eligible as operands of q and q', respectively:
ifp neg p'then q(p)sq'(p')-
(For operators with more than one predicate argument, such as conjunction and
disjunction, the definition would have to be modified in an obvious way.)
If two d-operators are subnegatives, their domains need not be identical. The definition
only requires that if q is defined for p, any subnegative q' is defined for the negatives of p.
If the predicate domain of q contains negatives for every predicate it contains, then q and
q' have the same domain. Such arc the domains of the logical quantifiers, but that docs not
hold for pairs of operators with different presuppositions, e.g. already and still (cf. §5.1).
An example of subnegatives is the pair always/never:
(9) Max always is late = Max never is on time
To be late and to be on time are neg-opposites, here in the scope of always and never,
respectively. The two quantificational adverbials have the same domain, whence the
domain condition in Definition 3 is fulfilled.
2. Duality
2.1. General definition
Definition 3 above paths the way for the proper definition of duality:
Definition 4: dual opposites
Let q and q' be operators with a predicate type argument. Let the predicate domains of q
and q' be such that q yields a truth value for a predicate expression p iff q' yields a truth
value for the neg-opposites of p.
q and q' are DUAL (OPPOSITE)S of each other - q DUAL q' - iff:
> for any predicate expressions p and p' eligible as operands of q and q', respectively:
ifp neg p'then q(p) neg q'(p')-
The problem with condition (a) in the traditional definition in (7) is taken care of by the
domain condition here; and any mention of grammatical negation is replaced by relating
488 IV. Lexical semantics
Due to the equivalences in (10) and (11), with any d-operator q, the operations N, S and
D yield a group of exactly four cases: q, Nq, Sq, Dq - provided Nq, Sq and Dq each differ
from q (see §2.3 for reduced groups of two members). Each of the four cases may be
expressible in different, logically equivalent ways. Thus what is called a "case" here, is
basically a set of logically equivalent expressions. According to (10) and (11), any further
application of the operations N, S and D just yields one of these four cases: the group is
closed under these operations. Within such a group, no element is logically privileged:
instead of defining the group in terms of q and its correlates Nq, Sq and Dq, we might,
for example, just as well start from Sq and take its correlates NSq = Dq, SSq = q, and
DSq = Nq.
Definition 5: duality group
A DUALITY GROUP is a
group of
subneg and dual that contains at least a
up to
pair c
four
f dual
aperators in
operators.
the mutual relations
neg.
Duality groups can be graphically represented as a square of the structure depicted in
Fig. 22.1.
dual
q < ► Dq
Nq < ► Sq
dual
Fig. 22.1: Duality square
Although the underlying groups of four operators related by neg, subneg and dual are
perfectly symmetrical, duality groups in natural, and even in formal, languages are almost
always deficient, in that not all operators are lexicalized. This issue will be taken up in
§§3,4 and 5 below.
2.3. Reduced duality groups and self-duality
For the sake of completeness, we will briefly mention cases of reduced (not deficient)
duality groups. The duality square may collapse into a constellation of two, or even one
case, if the operations N, D, or S are of no effect on the truth conditions, i.e. if Nq, Dq or
Sq arc equivalent to q itself. Neutralization of N contradicts the Law of Contradiction: if
q = Nq, and q were true for any predicate operand p, it would at the same time be false.
Therefore, the domain of q must be empty. This might occur if q is an expression with
contradictory presuppositions, whence the operator is never put to work. For such an
"idle" operator, negatives, subnegatives and duals are necessarily idle, too. Hence the
whole duality group collapses into one case.
490
IV. Lexical semantics
(17) / don't want you to leave = I want you not to leave
The question as to which verbs are candidates for the neg-raising phenomenon is, as far
as I know, not finally settled (see Horn (1978) for a comprehensive discussion, also Horn
(1989: §5.2)).The fact that NR verbs are self-dual allows, however, a certain general
characterization. The condition of self-duality has the consequence that if v is true for p, it is
false for Np, and if v is false for p, it is true for Np.Thus NR verbs express propositional
attitudes that in their domain "make up their mind" for any proposition as to whether
the attitude holds for the proposition or its negation. For example, NR want applies only
to such propositions the subject has either a positive or a negative preference for. The
claim of self-duality is tantamount to the presupposition that this holds for any possible
operand. Thereby the domains of NR verbs are restricted to such pairs of p and Np to
which the attitude either positively or negatively obtains.
Among the modal verbs, those meaning "want to" like German wollen do not
partake in the duality groups listed below. I propose that these modal verbs are NR verbs,
whence their duality groups collapse to a group of two [q/Dq, Nq/Sq], e.g. [wollen,
wollen neg].
THE GENERIC OPERATOR. In mainstream accounts of characterizing (i-generic)
sentences (cf. Krifka et al. 1995, also article 47 (Carlson) Genericity) it is commonly
assumed that their meanings involve a covert genericity operator GEN. For example,
men are dumb would be analyzed as (18a):
(18) a. GEN[x](x is a man;x is dumb]
According to this analysis, the meaning of the negative sentence men are not dumb would
yield the GEN analysis (18b), with the negation within the scope of GEN:
(18) b. GEN[x](x is a man;-i x isdumb]
The sentence men are not dumb is the regular grammatical negation of men are dumb,
whence it should also be analysed as (18c), i.e. as external negation w.r.t. to GEN:
(18) c. -iGEN[x](x is a man;x isdumb]
It follows immediately that GEN is self-dual.This fact has not been recognized in the
literature on generics. In fact, it invalidates all available accounts of the semantics of GEN
which all agree in analyzing GEN as some variant of universal quantification. Universal
quantification, however, is not self-dual, as internal and external negation clearly yield
different results (see Lobner 2000: §4.2) for elaborate discussion.
HOMOGENEOUS QUANTIFICATION. Ordinary restricted quantification may
happen to be self-dual if the predicate quantified yields the same truth-value for all
elements of the domain of quantification, i.e. if it is either true of all cases or false of all
cases. (As a special case, this condition obtains if the domain of quantification contains
just one clement u;both 3 and V arc then equivalent to self-dual Iu.) The "homogeneous
quantifier" 3V (cf. Lobner 1989:179, and Lobner 1990:27ff for more discussion) is a two-
place operator which takes one formula for the restriction and a second formula for the
predication quantified. It will be used in §5.2.
494
IV. Lexical semantics
Tab. 22.2: Duality Groups of Deontic Expressions
type of expression
modal verbs
(deontic modality)
modal verbs (2)
adjectives
German causative
deontic verbs
imperative
causative
verbs of deontic and
causal modality
q
may/can
may
possible
ermoglichen
"render
possible"
imperative of
permission
causative of
permission
accept
allow
let/admit
dual
must
shall
necessary
erzwingen
"force"
imperative of
request
causative of
causation
demand
request
let/make/force
negative
(must neg)
(may neg)
(shall neg)
impossible
verhindern
"prevent"
(neg imperative)
(neg causative)
refuse
forbid
prevent
subnegative
(need neg)
(need neg)
unnecessary
eriibrigen
"render
unnecessary"
-
-
(demand neg)
(request neg)
(force neg)
The imperative form has two uses, the prototypical one of request, or command, and a
permissive use, corresponding to the first cell of the respective duality group.The negation
of the imperative, however, only has the neg-reading of prohibition. The case of
permitting not to do something cannot be expressed by a simple imperative and negation.
Similarly, grammatical causative forms such as in Japanese tend to have a weak (permissive)
reading and a strong reading (of causation), while their negation inevitably expresses
prevention, i.e. causing not to. The same holds for English let and German lassen.
3.2.3. Epistemic modality
In the groups of modal verbs in epistemic use, g-negation of may yields a subnegative,
unlike the neg-reading of the same form in the deontic group. Can, however, yields a neg-
reading with negation.Thus, the duality groups exhibit remarkable inconsistencies such
as the near-equivalence of may and can along with a clear difference of their respective
g-negations, or the equivalence of the g-negations of non-equivalent may and must.
Tab. 22.3: Duality Groups of Epistemic Expressions
type of expression
modal verbs (1)
modal verbs (2)
epistemic adjectives
adverbs
verbs of altitude
adjectives for logical
properties
verbs for logical
relations
q
can
may
possible
possibly
hold possible
satisfiable
be compatible
with
dual
must
must
certain
certainly
believe
tautological
entail
negative
can neg
can neg
impossible
in no case
exclude
contradictory
unsatisfiable
exclude
subnegative
need neg
may neg
questionable
-
doubt
(neg
tautological)
(neg entail)
22. Dual oppositions in lexical meaning 495
Logical necessity and possibility can be considered a variant of epistemic modality. Here
the correspondence to quantification is obvious, as these properties and relations are
defined in terms of existential and universal quantification over models (or worlds, or
contexts).
3.2.4. Aspectual operators
Aspectual operators deal with transitions in time between opposite phases, or equiv-
alently, with beginning, ending and continuation. Duality groups are defined by verbs
such as begin, become and by focus particles such as already and still. The German
particle schon will be analyzed in §5.1 in more detail, as a paradigm case of'phase
quantification'. The particles of the nurnoch group have no immediate equivalents in English.
See Lobner (1999: §5.3) for semantic explanations.
3.2.5. More focus particles, conjunctions
More focus particles such as only, even, also are candidates for duality relations. A duality
account of German nur ("only", "just") is proposed in Lobner (1990: §9). In some of
the uses of only analyzed there, it functions as the dual of auch ("also"). An analysis of
auch, however, is not offered there. Konig (1991a, 1991b) proposed to consider causal
because and concessive although duals, due to the intuition that "although p, not q1 means
something like 'not {because p, q)\ However, Iten (2005) argues convincingly against
that view.
3.2.6. Scalar adjectives
Lobner (1990: §8) offers a detailed account of scalar adjectives which analyzes them
as dual operators on an implicit predication of markedness. The analysis will be briefly
sketched in §5.1. According to this view, pairs of antonyms, together with their negations,
form duality groups such as [long, short, (neg long), (neg short)].
Independently of this analysis, certain functions of scalar adjectives exhibit logical
relations that are similar to the duality relations. Consider the logical relationships
between positive with enough and too with positive, as well as between equative and
comparative:
(24) a. x is not too short = x is long enough
x is too long = x is not short enough
b. x is not as long as y = x is shorter than y
x is as long as y = x is not shorter than y
The negation of too with positive is equivalent to the positive of the antonym with
enough, and the negation of the equative is equivalent to the comparative of the
antonym. Antonymy essentially means reversal of the underlying common scale. Thus the
equivalences in (24) represent instances of a "duality" relation that is based on negation
and scale reversal, another self-inverse operation, instead of being based on negation
and subnegation. In view of such data, the notion of duality might be generalized to
analogous logical relations based on two self-inverse operations.
496
IV. Lexical semantics
3.3. Asymmetries within duality groups
The duality groups considered here exhibit remarkable asymmetries. Let us refer to the
first element of a duality group as type 1, its dual as type 2, its negative as type 3 and its
subnegative as type 4. The first thing to observe is the fact that there are always lexical
or grammatical means of directly expressing type 1 and type 2, sometimes type 3 and
almost never type 4.This tendency has long been observed for the quantifier groups (sec
Horn 1972 for an early account, Dohmann 1974a,b for cross-linguistic data, and Horn
to appear for a comprehensive recent survey). In addition to the lexical gaps, there is a
considerable bias of negation towards type 3, even if the negated operand is type 2.Types
3 and 4 are not only less frequently lexicalized; if they are, the respective expressions are
often derived from type 1 and 2, if historically, by negative affixes, cf. n-either,n-ever,n-or,
im-possible.The converse never occurs.Thus, type 4 appears to be heavily marked: on a
scale of markedness we obtain 1,2 < 3 < 4.
A closer comparison of type 1 and type 2 shows that type 2 is marked vs. type 1, too.
As for nominal existential quantification, languages are somehow at pain when it comes
to an expression of neutral existential quantification. This might at a first glance appear
to indicate markedness of existential vs. universal quantification. However, nominal
existential quantification can be considered practically "built into" mere predication under
the most frequent mode of particular (non-generic) predication: particular predication
entails reference, which in turn entails existence. Thus, existential quantification is so
unmarked that it is even the default case not in need of overt expression. A second
point of preference compared to universal quantification is the degree of elaboration of
existential quantification by more specific operators such as numerals and other quantity
specifications.
For the modal types, this argumentation does not apply. What distinguishes type 1 from
type 2 in these cases, is the fact that irregular negation, i.e. subneg readings of g-negation
only occurs with type 2 operators; in this respect, epistemic may constitutes an exception.
Tab. 22.4: Duality Groups of Aspectual Expressions
type of expression
verbs of beginning etc.
German aspectual
particles with stative
operands (1)
German aspectual
particles with focus on
a specification of time
German aspectual
particles with stative
operands(2)
German aspectual
particles with scalar stative
operands(3)
q
begin
become
schon
"already"
schon
"already"
endlich
"finally"
nurnoch
dual
continue
stay
noch
"still"
erst
"only"
noch immer
"STILL"
noch
negative
end
(become neg)
(noch neg)
"neg yet"
(noch neg)
"neg yet"
noch immer neg
"STILL NEG"
(neg nur noch)
subnegative
(neg begin)
(neg stay)
(neg mehr)
"neg more"
(neg erst)
"neg still"
(endlich neg mehr)
"finally neg more"
(neg mehr)
22. Dual oppositions in lexical meaning
The aspectual groups will be discussed later. So much may, however, be stated. In
all languages there are very many verbs that incorporate type 1 'become' or 'begin' as
opposed to very few verbs that would incorporate type 2 'continue' or 'stay' or type 3
'stop', and apparently no verbs incorporating 'not begin'.
These observations that result in a scale of markedness of type 1 < type 2 < type 3 <
type 4 are tendencies. In individual groups, type 2 may be unmarked vs. type 1.
Conjunction is unmarked vs. disjunction, and so is the command use of the imperative opposed
to the permission use.
Of course, the tendencies are contingent on which element is chosen as the first of the
group.They emerge only if the members of the duality groups are assigned the types they
arc. Given the perfect internal symmetry of duality groups, the type assignment might
seem arbitrary - unless it can be motivated independently. What is needed, therefore, is
independent criteria for the assignment of the four types. These will be introduced in
§4.2 and §5.3.
4. Semantic aspects of duality groups
4.1. Duality and the Square of Opposition
The quantificational and modal duality groups (Tab. 22.1,22.2,22.3) can be arranged in a
second type of logical constellation, the ancient Square of Opposition (SqO), established
by the logical relations of entailment, contradiction (i.e. neg-opposition), contrariety and
subcontrariety. The four relations are essentially entailment relations (cf. Lobner 2002,
2003: §4 for an introduction).They are defined for arbitrary, not necessarily second-order,
predicates. The relation of contradictoriness is just neg.
Definition 7: entailment relations for predicate expressions
Let p and p' be arbitrary predicate expressions with the same domain D.
(i) p ENTAILS p' iff for every a in D, if p is true for a, then p' is true for a.
(ii) p and p' are CONTRARIES iff p entails Np'.
(iii) p and p' are SUBCONTRARIES iff Np entails p'.
The traditional SqO has four vertices, A, I, E, O corresponding to V, 3, -i 3 and -iV, or
type 1,2,3,4, respectively, although in a different arrangement than in Fig. 22.2:
subcontraries
Fig. 22.2: Square of Oppositions
O
498
IV. Lexical semantics
The duality square and the SqO depict different, in fact independent logical relations.
Unlike the duality square, the SqO is asymmetric in the vertical dimension: the
entailment relations are unilateral, and the relation between A and E is different from the one
between I and O.The relations in the SqO are basically second-order relations (between
first-order predicates), while the duality relations dual and subneg are third-order
relations (between second-order predicates).
The entailment relations can be established between arbitrary first-order
predicate expressions; the duality relations would then simply be unavailable due to the
lack of predicate-type arguments. For example, any pair of contrary first-order predicate
expressions such as frog and dog together with their negations span a SqO with, say,
A = dog, E = frog, I = N frog, O = N dog. There arc also SqO's of second-order
operators which are not duality groups. For example, let A be more than one and E no with
their respective negations O not more than one/at most one and I some/at least one.
The SqO relations obtain, but A and I, i.e. more than one and some are not duals: the
dual of more than one is not more than one not, i.e. at most one not. This is clearly not
equivalent with some.
On the other hand, there are duality groups which do not constitute SqO's, for
example the schon group. An SqO arrangement of this group would have schon and
noch nicht, and noch and nicht mehr, as diagonally opposite vertices. However, in this
group duals have different presuppositions, in fact this holds for all horizontal or
vertical pairs of vertices. Therefore, since the respective relations of entailment, contrariety
and subcontrariety require identical presuppositions, none of these obtains. In Lobner
(1990:210) an artificial example is constructed which shows that the duality relations do
not entail the SqO relations even if all four operators share the same domain.There may
be universal constraints for natural language which exclude such cases.
4.2. Criteria for the distinction among the four types
4.2.1. Monotonicity: types i and 2 vs. types 3 and 4
The most salient difference among the four types is that between types 1 and 2 and types
3 and 4: types 1 and 2 are positive, types 3 and 4 negative.The difference can be captured
by the criterion of monotonicity. Barwisc & Cooper (1981) first introduced this property
for quantifiers; see also article 43 (Keenan) Quantifiers.
Definition 8: monotonicity
a. An operator q is UPWARD MONOTONE - MONT- if for any operands p and
p' if p entails p' then q(p) entails q(p')-
b. An operator q is DOWNWARD MONOTONE - MONi - if for any operands p
and p' if p entails p' then q(p') entails q(p).
All type 1 and type 2 operators of the groups listed are mont while the type 3 and
type 4 operators are moni. Negation inverses entailment: if p entails p' then Np'
entails Np. Hence, both N and S inverse the direction of monotonicity. To see this,
consider (25):
22. Dual oppositions in lexical meaning 499
(25) have a coke entails have a drink, therefore
a. every is mont: every student had a coke entails every student had a drink
b. /joismoni: no student had a drink entails no student had a coke
Downward monotonicity is a general trait of semantically negative expressions
(cf. Lobner 2000: §1.3). It is generally marked vs. upward monotonicity. Moni
operators, including most prominently g-negation itself, license negative polarity items (cf.
article 64 (Giannakidou) Polarity items) and are thus specially restricted. Negative
utterances in general are heavily marked pragmatically since they require special
context conditions (Givon 1975).
4.2.2. Tolerance: types i and 4 vs. types 2 and 3
Intuitively, types 1 and 4 are weak as opposed to the strong types 2 and 3. The weak
types make weaker claims. For example, one positive or negative case is enough to verify
existential or negated universal quantification, respectively, whereas for the verification
of universal and negated existential quantification the whole domain of quantification
has to be checked.This distinction can be captured by the property of (in)consistency, or
(in)tolerance:
Definition 9: tolerance and intolerance
a.
b.
An operator q is TOLERANT iff
for some neg-pair p and Np of operands, q is true for both p
An operator q is INTOLERANT iff it is not tolerant.
and
Np.
Intolerant operators arc "strong", tolerant ones "weak".
(26) a. intolerant every: every light is off excludes every light is on
b. tolerant some: some lights are on is compatible with some lights are off
An operator q is intolerant iff for all operands p, if q(p) is true then q(Np) is false, i.e.
Nq(Np) is true. Hence an operator is intolerant iff it entails its dual. Unless q is self-
dual, it is different from its dual, hence only one of two different duals can entail its
dual.Therefore, of two different dual operators one is intolerant and the other tolerant,
or both are tolerant. In the quantificational and modal groups, type 2 generally entails
type 1, whence type 2 is intolerant and type 1 tolerant. Since negation inverses
entailment, type 3 entails type 4 if type 2 entails type 1. Thus in these groups, type 4 is tolerant,
and type 3 intolerant. The criterion of (in)tolerance cannot be applied to the aspectual
groups in Tab. 22.4. This gap will be taken care of in §5.3.
As a result, for those groups that enter the SqO (the quantificational and modal
groups), the four types can be distinguished as follows.
Tab. 22.5: Type Distinctions in Duality Squares and the Square of Opposition
type 1 /1 type 2 / A type 3 / E type 4 / O
mont mont monl intolerant
tolerant intolerant inoni tolerant
500
IV. Lexical semantics
Horn (to appear: 14) directly connects (in)tolerance to the asymmetries of g-negation
w.r.t. types 3 and 4. He states that 'intolerant q may lexically incorporate S, but tends not
to lexicalize N; conversely, tolerant q may lexically incorporate N, but bars lexicalization
of S' (the quotations are modified as to fit the terminology and notation used here). Since
intolerant operators are of type 2 and tolerant ones of type 1, both incorporations of
S or N lead to type 3.
The logical relations in the SqO can all be derived from the fact that V entails 3.They
are logically equivalent to V being intolerant which, in turn, is equivalent to each of the
(in)tolerance values of the other three quantifiers. The monotonicity properties cannot
be derived from the SqO relations. They are inherent to the semantics of universal and
existential quantification.
4.3. Explanations of the asymmetries
There is considerable discussion as to the reasons for the gaps observed in the SqO
groups. Horn (1972) and Horn (to appear) suggest that type 4 is a conversational implica-
ture of type 1 and hence in no need of extra lexicalization: some implicates not all, since if
it were in fact all, one would have chosen to say so.This being so, type 4 is not altogether
superfluous; some contexts genuinely require the expression of type 4; for example, only
not all, not some, can be used with normal intonation as a refusal of all.
Lobner (1990: §5.7) proposes a speculative explanation in terms of possible differences
in cognitive cost. The argument is as follows. Assume that the relative unmarkedness of
type 1 indicates that type 1 is cognitively basic and that the other types are cognitively
implemented as type 1 plus some cognitive equivalents of N, S, or D. If the duality group
is built up as [q, Dq, Nq, DNq], this would explain the scale of markedness, if one assumes
that application of D is less expensive than application of N, and simultaneous
application of both is naturally even more expensive than either. Since we are accustomed to
think of D as composed of S and N, this might appear implausible; however, an analysis
is offered (see §5.2), where, indeed, D is simple (essentially presupposition negation) and
S the combined effect of N and D.
Jaspers (2005) discusses the asymmetries within the SqO, in particular the missing
lexicalization of type 4 / O, in more breadth and depth than any account before. He, too,
takes type 1 as basic,"pivotal" in his terminology, and types 2 and types 3 as derived from
type 1 by two different elementary relations, one of them N.The fourth type, he argues,
does not exist at all (although it can be expressed compositionally). His explanation is
therefore basically congruent with the one in Lobner (1990), although the argument is
based not on speculations about cognitive effort, but on a reflection on the character of
human logic. For details of a comparison between the two approaches see Jaspers (2005:
§2.2.5.2 and §4).
5. Phase quantification
In Lobner (1987,1989,1990) a theory of "phase quantification" was developed which was
first designed to provide uniform analyses of the various schon groups in a way that
captures the duality relationships. The theory later turned out to be also applicable to "only"
(German nur), scalar adjectives and, in fact, universal and existential quantification in
502
IV. Lexical semantics
Tab. 22.6: Presuppositions and Assertions of the schon Group
operator
schon (tt,p)
noch(tf,p)
noch nicht(tf, p)
nicht mehr(te
,P)
relation to schon
dual
neg
subneg
presupposition:
previous
not-p
P
not-p
P
state
assertion:
state at te
P
P
not-p
not-p
Types 2,3, and 4 can be directly analyzed as generated from type 1 by application of N
and D, where D is just negation of the presupposition.
Mittwoch (1993) and van der Auwera (1993) have questioned the presuppositions
of the schon group. Their criticism, however, is essentially due to a confusion of types
of uses (each use potentially comes with different presuppositions), or of
presuppositions and conversational implicatures. Lobner (1999) offers an elaborate discussion, and
refutation, of these arguments.
Scalar adjectives. Scalar adjectives frequently come in pairs of logically contrary
antonyms such as long/short, old/young, expensive/cheap. They relate to a scale that is
based on some ordering.They encode a dimension for their argument such as size, length,
age, price, and rank its degree on the respective scale. Pairs of antonyms consist of a
positive element +A and a negative element -A (see Bierwisch 1989 for an elaborate
general discussion). +A states for its argument that it occupies a high degree on the
scale, a degree that is marked against lower degrees. -A states that the degree is low,
i.e. marked against higher degrees. The respective negations predicate an unmarked
degree on the scale as opposed to marked higher degrees (neg +A), or marked lower
degrees (neg -A).The criteria of marked degrees are context dependent and need not be
discussed here.
Similar to the meanings of the schon group, + A can be seen as predicating of an
argument t that on the given scale it is placed above a critical point where unmarkedness
changes into markedness, and analogously for the other three cases. In the diagrams
in Fig. 22.4, unmarkedness is denoted by 0 and markedness by 1, as (un)markedness
coincides with the truth value that +A and -A assign to their argument.
+A(t) -Aft) not+A(t) not-Aft)
0 1 10
Fig. 22.4: Phase diagrams for scalar adjectives
i.
Other uses of scalar adjectives such as comparative, equative, positive with degree
specification, enough or too can be analyzed analogously (Lobner 1990: §8).
Existential and universal quantification. In order to determine the truth value of
quantification restricted to some domain b and about a predicate expressed by p, the
elements of b have to be checked in some arbitrary order as to whether p is true or
506
IV. Lexical semantics
Huddleston. Rodney & Geoffrey K. Pullum 2002. The Cambridge Grammar of the English
Language. Cambridge: Cambridge University Press.
Iten, Corinne 2005. Linguistic Meaning, Truth Conditions and Relevance. The Case of Concessives.
Houndmills: Palgrave Macmillan.
Jaspers, Dany 2005. Operators in the Lexicon: On the Negative Logic of Natural Language. Ph.D.
dissertation. University of Leiden. Utrecht: Landelijke Onderzoekschool Taalwetenschap (=LOT).
Konig, Ekkehard 1991a. Concessive relations as the dual of causal relations. In: D. Zaefferer (ed.).
Semantic Universals and Universal Semantics. Dordrecht: Foris, 190-209.
Konig, Ekkehard 1991b. The Meaning of Focus Particles in German. London: Routledge.
Krifka, Manfred, Francis J. Pelletier. Gregory N. Carlson. Alice ter Meulen, Godehard Link &
Gennaro Chicrchia 1995. Gencricity: An introduction. In: G. N. Carlson & F. J. Pelletier (eds.).
The Generic Book. Chicago, IL:The University of Chicago Press, 1-124.
Lobner, Sebastian 1987. Quantification as a major module of natural language semantics. In: J. A. G.
Grocncndijk, M. J. B. Stokhof & D. de Jongh (eds.). Studies in Discourse Representation Theory
and the Theory of Generalized Quantifiers. Dordrecht: Foris, 53-85.
Lobner, Sebastian 1989. German schon - erst - noch: An integrated analysis. Linguistics &
Philosophy 12,167-212.
Lobner, Sebastian 1990. Wahr neben Falsch. Duale Operatoren als die Quantoren natilrlicher
Sprache. Tubingen: Niemeyer.
Lobner, Sebastian 1999. Why German schon and noch are still duals: A reply to van der Auwera.
Linguistics & Philosophy 22.45-107.
Lobner, Sebastian 2000. Polarity in natural language: Predication, quantification and negation in
particular and characterizing sentences. Linguistics & Philosophy 23,213-308.
Lobner, Sebastian 2002. Understanding Semantics. London: Hodder-Arnold.
Lobner, Sebastian 2003. Semantik. Eine Einfiihrung. (German version of Lobner 2002). Berlin:
dc Gruytcr.
Mittwoch, Anita 1993. The relationship between schon/already and noch/still: A reply to Lobner.
Natural Language Semantics 2,71-82.
Palmer, Frank R. 2001. Mood and Modality. Cambridge: Cambridge University Press.
Schliickcr. Barbara 2008. Warum nicht bleiben nicht werden ist: Ein Pladoycr gegcn die Dualitat
von werden und bleiben. Linguistische Berichte 215,345-371.
Sebastian Lobner, Diisseldorf (Germany)
V. Ambiguity and vagueness
23. Ambiguity and vagueness: An overview
1. Interpretive uncertainty
2. Ambiguity
3. Vagueness
4. Conclusion
5. References
Abstract
Ambiguity and vagueness are two varieties of interpretive uncertainty which are often
discussed together, but are distinct both in their essential features and in their significance for
semantic theory and the philosophy of language. Ambiguity involves uncertainty about
mappings between levels of representation with different structural characteristics, while
vagueness involves uncertainty about the actual meanings of particular terms. This article
examines ambiguity and vagueness in turn, providing a detailed picture of their empirical
characteristics and the diagnostics for identifying them, and explaining their significance
for theories of meaning. Although this article continues the tradition of discussing
ambiguity and vagueness together, one of its goals is to emphasize the ways that these
phenomena are distinct in their empirical properties, in the factors that give rise to them, and
in the analytical tools that can be brought to bear on them.
1. Interpretive uncertainty
Most linguistic utterances display "interpretive uncertainty", in the sense that the
mapping from an utterance to a meaning (Grice's 1957 "meaning^") appears (at least from
the external perspective of a hearer or addressee) to be one-to-many rather than one-to-
one. Whether the relation between utterances and meanings really is one-to-many is an
open question, which has both semantic and philosophical significance, as I will outline
below. What is clear, though, is that particular strings of phonemes, letters or manual
signs used to make utterances are, more often than not, capable of conveying distinct
meanings.
As a first step, it is important to identify which kinds of interpretive uncertainty
(viewed as empirical phenomena) arc of theoretical interest. Consider for example
(la-b), which manifest several different kinds of uncertainty.
(1) a. Sterling's cousin is funny,
b. Julian's brother is heavy.
One kind concerns the kinds of individuals that the English noun phrases Sterling's
cousin and Julian's brother can be used to pick out: the former is compatible with
Sterling's cousin being male or female and the latter is compatible with Julian's brother
Maienbom, von Heusingerand Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 507-535
508 V. Ambiguity and vagueness
being older or younger than him. However, this sort of uncertainty merely reflects the
fact that cousin and brother are indeterminate with respect to sex and age, respectively,
both terms have conditions of application that specify certain kinds of familial
relationships, and brother, unlike cousin, also imposes conditions on the sex of the individuals it
applies to, but beyond these constraints these terms do not discriminate between objects
as a matter of meaning. (Indeterminacy is also sometimes referred to as "generality"; see
Zwicky & Sadock 1975,1984;Gillon 1990,2004.)
That this is so can be seen from the fact that distinctions of this sort do not affect
judgments of truth or falsity. For example, assuming that the antecedents of the
conditionals in (2a-b) specify the minimal difference between the actual world and the counter-
factual worlds under consideration, the fact that (2a) is false and (2b) is true shows that
a change in sex, unlike a change in familial relationships, does not affect the truth of the
application of cousin.
(2) Lily is Sterling's cousin....
a. ...but if she were a boy, she wouldn't be Sterling's cousin anymore.
b. ...but if her mother weren't Sterling's father's sister, she wouldn't be Sterling's
cousin anymore.
Likewise, the hypothetical change in age in (3a) doesn't affect the truth of the application
of brother, though the change in sex in (3b) now docs make a difference.
(3) Sterling is Julian's brother...
a. ...but if their ages were reversed, he wouldn't be Julian's brother anymore.
b. ...but if he were a girl, he wouldn't be Julian's brother anymore.
Indeterminacy reflects the fact that the meaning of a word or phrase typically docs not
involve an exhaustive specification of the features of whatever falls under its extension;
some features are left open, resulting in the sort of flexibility of application we see above.
This is not to say that such features couldn't be specified: many languages contain cousin
terms that do make distinctions based on sex (either through grammatical gender, as in
Italian cugino 'cousin/mKc' vs. cugina 'cousin ', or lexically, as in Norwegian fetter 'male
cousin' vs. kusine 'female cousin'), and some contain male sibling terms that specify age
(such as Mandarin gege 'older brother' vs. didi 'younger brother'). Whether a particular
distinction is indeterminate or not is thus somewhat arbitrary and language specific, and
while it might be interesting to determine if there are cultural or historical explanations
for the presence/absence of such distinctions in particular languages, the existence of
indeterminacy in any single language is typically not a fact of particular significance for
its semantic analysis.
A second type of uncertainty manifested by (la) and (lb) is of much greater
importance for semantic analysis, however, as it involves variability in the truth or satisfaction
conditions that a particular bit of an utterance introduces into the meaning calculation.
This kind of uncertainty is ambiguity, which manifests itself as variation in truth
conditions: one and the same utterance token can be judged true of one situation and false of
another, or the other way around, depending on how it is interpreted. In (la) and (lb),
we see ambiguity in the different ways of understanding the contributions of funny and
heavy to the truth conditions, (la) can be construed either as a claim that Sterling's cousin
23. Ambiguity and vagueness: An overview 509
has an ability to make people laugh ("funny ha-ha") or that she tends to display odd or
unusual behavior ("funny strange"). Similarly, (lb) can be used to convey the
information that Julian's brother has a relatively high degree of weight, or that he is somehow
serious, ponderous, or full of gravitas.That these pairs of interpretations involve distinct
truth conditions is shown by the fact that we can use the same term (or, more accurately,
the same bits of phonology) to say something that is true and something that is false of
the same state of affairs, as in (4) (Zwicky & Sadock 1975).
(4) Sterling's cousin used to make people laugh with everything she did, though she
was never in any way strange or unusual. She was funny without being funny. Lately,
however, she has started behaving oddly, and has lost much of her sense of humor.
Now she's funny but not very funny.
Both examples also manifest an ambiguity in the nature of the relation that holds
between the genitive-marked nominal in the possessive construction and the denotation
of the whole possessive DP;cf. article 45 (Barker) Possessives and relational nouns. While
the most salient relations are the familial ones expressed by the respective head nouns
(cousin o/and brother of), it is possible to understand these sentences as establishing
different relations. For example, if Julian is one of several tutors working with a family
of underachieving brothers, we could use (lb) as a way of saying something about the
brother who has been assigned to Julian, without in any way implying that Julian himself
stands in the brother of relation to anyone. (He could be an only child.)
Even after we settle on a particular set of conditions of application for the ambiguous
terms in (la) and (lb) (e.g. that we mean "funny ha-ha" by funny or are using heavy to
describe an object's weight), a third type of uncertainty remains about precisely what
properties these terms ascribe to the objects to which they are applied, and possibly
about whether these terms can even be applied in the first place.This is vagueness, and is
of still greater significance for semantic theory, as it raises fundamental questions about
the nature of meaning, about deduction and reasoning, and about knowledge of language.
Consider, an utterance of (lb) in a context in which we know that heavy is being
used to characterize Julian's brother's weight. If we take the person who utters this
sentence to be speaking truthfully, we may conclude that Julian's brother's weight is above
some threshold. However, any conclusions about how heavy he is will depend on a range
of other contextual factors, such as his age, his height, information about the
individuals under discussion, the goals of the discussion, the interests of the discourse
participants, and so forth, and even then will be rough at best. For example, if we know that
Julian's brother is a 4-year old, and that we're talking about the children in his preschool
class, we can conclude from an utterance of (lb) that his weight is somewhere above
some threshold, but it would be extremely odd to follow up such an utterance by saying
something like (5).
(5) Well, since that means he is at least 17.5 kg, we need to make sure that he is one of
the carriers in the piggy-back race, rather than one of the riders.
(5) is odd because it presumes a specific cut-off point separating the heavy things
from the non-heavy things (17.5 kg), but the kind of uncertainty involved in vagueness is
precisely uncertainty about where the cut off is.
510 V. Ambiguity and vagueness
This can be further illustrated by a new context. Imagine that we are in a situation
in which the relevant contextual factors are clear: Julian's brother is a 4-year old, we're
talking about the children in his class, and we want to decide who should be the anchor
on the tug-of-war team. In addition, we also know that Julian's brother weighs exactly
15.2 kg. Even with these details spelled out - in particular, even with our knowledge of
Julian's brother's actual weight - we might still be uncertain as to whether (lb) is true:
Julian's brother is a borderline case for truthful application of the predicate.
Borderline cases and uncertainty about the boundaries of a vague predicate's
extension raise significant challenges for semantic theory. If we don't (and possibly
can't) know exactly how much weight is required to put an object in the extension of
heavy (in a particular context of use), even when wc arc aware of all of the potentially
relevant facts, can we truly say that we know the meaning of the term? Do we have
only incomplete knowledge of its meaning? If our language contained only a few
predicates like heavy, we might be able to set them aside as interesting but ultimately
insignificant puzzles. Vagueness is pervasive in natural language, however, showing
up in all grammatical categories and across lexical fields, so understanding the principles
underlying this type of uncertainty is of fundamental importance for semantic theory.
In the rest of this article, I will take a closer look at ambiguity and vagueness in turn,
providing a more detailed picture of their empirical characteristics and the diagnostics
for identifying them, and explaining their significance for theories of meaning. Although
this article follows in a long tradition of discussing ambiguity and vagueness together, a
goal of the article is to make it clear that these phenomena are distinct in their empirical
properties, in the factors that give rise to them, and in the analytical tools that can be
brought to bear on them. However, both present important challenges for semantics and
philosophy of language, and in particular, for a compositional, truth conditional theory
of meaning.
2. Ambiguity
2.1. Varieties of ambiguity
Ambiguity is associated with utterance chunks corresponding to all levels of linguistic
analysis, from phonemes to discourses, and is characterized by the association of a single
orthographic or phonological string with more than one meaning. Ambiguity can have
significant consequences, for example if the wording of a legal document is such that it
allows for interpretations that support distinct judgments. But it can also be employed
for humorous effect, as in the following examples from the 1980s British comedy series
A Bit of Fry and Laurie (created by Stephen Fry and Hugh Laurie).
(6) FRY: You have a daughter, I believe.
LAURIE: Yeah, Henrietta.
FRY: Did he? I'm sorry to hear that.That must'vc hurt.
(6) illustrates a case of phonological ambiguity, playing on the British comedians'
pronunciations of the name Henrietta and the sentence Henry ate her. (7) makes use of the
lexical ambiguity between the name Nancy and the British slang term nancy, which
means weak or effeminate when used as an adjective.
512 V. Ambiguity and vagueness
test of contradiction, which involves determining whether the same string of words or
phonemes (modulo the addition/subtraction of negation) can be used to simultaneously
affirm and deny a particular state of affairs. We saw this test illustrated with funny in (4)
above; the fact that (11) is read as a true contradiction shows that a merely indeterminate
term like cousin does not allow for this.
(11) Lily's mother is Sterling's father's sister. Since Lily is a girl, she is Sterling's cousin
but not his cousin.
A second important test involves identity of sense anaphora, such as ellipsis,
anaphoric too, pronominalization and so forth. As pointed out by Lakoff (1970), such
relations impose a parallelism constraint that requires an anaphoric term and its
antecedent to have the same meaning (cf. article 70 (Reich) Ellipsis, for a discussion of
parallelism in ellipsis), which has the effect of reducing the interpretation space of an
ambiguous expression. For example, consider (12), which can have either of the truth-
conditionally distinct interpretations paraphrased in (12a) and (12b), depending on
whether the subject the fish is associated with the agent or theme argument of the verb
eat.
(12) The fish is ready to eat.
a. The fish is ready to eat a meal.
b. The fish is ready to be eaten.
When (12) is the antecedent in an identity of sense anaphora construction, as in (13a-b),
the resulting structure remains two-ways ambiguous; it does not become four-ways
ambiguous: if the fish is ready to do some eating, then the chicken is too; if the fish is
ready to be eaten, then the chicken is too.
(13) a. The fish is ready to cat, and the chicken is ready to cat too.
b. The fish is ready to eat, but the chicken isn't.
That is, these sentences do not have understandings in which the fish is an agent and the
chicken is a theme, or vice-versa.
This fact can be exploited to demonstrate ambiguity by constructing test examples in
such a way that one of the expressions in the identity of sense relation is compatible only
with one interpretation of the ambiguous term or structure. For example, since potatoes
are not plausible agents of an eating event, the second conjunct of (14a) disambiguates
the first conjunct towards the interpretation in (14b). In contrast, (14c) is somewhat odd,
because children are typically taken to be agents of eating events, rather than themes,
but the context of a meal promotes the theme-rather than agent-based interpretation of
the first conjunct, creating a conflict.This conflict is even stronger in (14c), which strongly
implies either agentive potatoes or partially-cooked children, giving the whole sentence
a certain dark humour.
(14) a. The fish is ready to eat, but the potatoes are not.
b. ?The fish is ready to eat, but the children are not.
c. ??The potatoes are ready to eat, but the children are not.
23. Ambiguity and vagueness: An overview 513
Part of the humorous effect of (14c) is based on the fact that a sensible interpretation
of the sentence necessitates a violation of the default parallelism of sense imposed by
ellipsis, a parallism that would not be required if there weren't two senses to begin with.
The use of such violations for rhetorical effect is referred to as syllepsis, or sometimes
by the more general term zeugma.
2.3. Ambiguity and semantic theory
Ambiguity has played a central role in the development of semantic theory by
providing crucial data for both building and evaluating theories of lexical representation and
semantic composition. Cases of ambiguity are often "analytical choice points" which can
lead to very different conclusions depending on how the initial ambiguity is evaluated. For
example, whether scope ambiguities arc taken to be structural (reflecting different Logical
Forms), lexical (reflecting optional senses, possibly derived via type-shifting), or
compositional (reflecting indeterminacy in the application of composition rules) has consequences
for the overall architecture of a theory of the syntax-semantics interface, as noted above.
Consider also the ambiguity of adjectival modification structures such as (15) (first
discussed in detail by Bolinger 1967; see also Siegel 1976; McConnell-Ginet 1982; Cinque
1993; Larson 1998 and article 52 (Demonte) Adjectives), which is ambiguous between the
"intersective" reading paraphrased in (15a) and the "nonintersective" one paraphrased in
(15b).
(15) Olga is a beautiful dancer.
a. Olga is a dancer who is beautiful.
b. Olga is a dancer who dances beautifully.
Siegel (1976) takes this to be a case of lexical ambiguity (in the adjective beautiful),
and builds a theory of adjective meaning on top of this assumption. In contrast, Larson
(1998) argues that the adjectives themselves are unambiguous, and shows how the
different interpretations can be accommodated by hypothesizing that nouns (like verbs)
introduce a Davidsonian event variable, and that the adjective can take either the noun's
individual variable or its event variable as an argument. (The first option derives the
interpretation in (15a); the second derives the one in (15b).)
In addition to playing an important methodological role in semantic theory,
ambiguity has also been presented as a challenge for foundational assumptions of semantic
theory. For example, Parsons (1973) develops an argument which aims to show that the
existence of ambiguity in natural language provides a challenge for the hypothesis that
sentence meaning involves truth conditions (see Saka 2007 for a more recent version
of this argument). The challenge goes like this. Assume that a sentence 5 contains an
ambiguous term whose different senses give rise to distinct truth conditions p and q (as
in the case of (15)),such that (I6a-b) hold.
(16) a. 5 is true if and only if p
b. 5 is true if and only if q
c. p if and only if q
But (16a-b) mutually entail (16c), which is obviously incorrect, so one of our
assumptions must be wrong; according to Parsons, the problematic assumption is the one that
518 V. Ambiguity and vagueness
of interpretive uncertainty completely by denying that fixing the reference of pronouns
is part of semantics.
In each of the cases discussed above, the key to responding to Parsons' challenge was
to demonstrate that standard (or at least reasonable) assumptions about lexico-syntactic
representation and composition support the view that observed variability in truth-
conditions has a representational basis: in each case, the mappings from representations
to meanings are one-to-one, but the mappings from representations (and meanings)
to pronunciations are sometimes many-to-one. But what if standard assumptions fail
to support such a view? In such a case, Parsons' challenge would reemerge, unless a
representational distinction can be demonstrated.
One of the strongest cases of this sort comes from the work of Charles Travis, who
discusses a particular form of truth conditional variability associated with color terms (see
e.g.,Travis 1985,1994,1997).The following passage illustrates the phenomenon:
A story. Pia's Japanese maple is full of russet leaves. Believing that green is the colour of
leaves, she paints them. Returning, she reports. That's better.The leaves are green now.' She
speaks truth. A botanist friend then phones, seeking green leaves for a study of green-leaf
chemistry. 'The leaves (on my tree) are green,' Pia says. 'You can have those.' But now Pia
speaks falsehood. (Travis 1997:89)
This scenario appears to show that distinct utterances of the words in (27), said in order
to describe the same scenario (the relation between the leaves and a particular color),
can be associated with distinct truth values.
(27) The leaves are green.
Following the line of reasoning initially advanced by Parsons, Travis concludes from
this example that sentence meaning is not truth conditional; that instead, the semantic
value of a sentence at most imposes some necessary conditions under which it may be
true (as well as conditions under which it may be used), but those conditions need not
be sufficient, and the content of the sentence does not define a function from contexts
to truth.
Travis' skeptical conclusion is challenged by Kennedy & McNally (2010), however,
who ask us to consider a modified version of the story of Pia and her leaves. Now she has
a pile of painted leaves of varying shades of green (pile A) as well as a pile of naturally
green leaves, also of varying shades (pile B). Pia's artist friend walks in and asks if she
can have some green leaves for a project. Pia invites her to sort through the piles and
take whichever leaves she wants. In sorting through the piles, the artist might utter any
of the sentences in (28) in reference either to leaves from pile A or to leaves from pile B,
as appropriate based on the way that they manifest green: the particular combination of
hue, saturation, and brightness, extent of color, and so forth.
(28) a. These leaves are green.
b. These leaves are greener than those.
c. These leaves aren't as green as those.
d. These leaves are less green than those.
520 V. Ambiguity and vagueness
3. Vagueness
3.1. The challenge of vagueness
It is generally accepted that the locus of vagueness in sentences like (29) is the predicate
headed by the gradable adjective expensive: this sentence is vague because, intuitively,
what it means to count as expensive is unclear.
(29) The coffee in Rome is expensive.
Sentences like (29) have three distinguishing characteristics, which have been the focus
of much work on vagueness in semantics and the philosophy of language. The first is
contextual variability in truth conditions: (29) could be judged true if asserted as part of
a conversation about the cost of living in Rome vs. Naples (In Rome, even the coffee is
expensive!), for example, but false in a discussion of the cost of living in Chicago vs. Rome
(The rents are high in Rome, but at least the coffee is not expensive!). This kind of
variability is of course not restricted to vague predicates - for example, the relational noun
citizen introduces variability because it has an implicit argument (citizen ofx) but it is not
vague - though all vague predicates appear to display it.
The second feature of vagueness is the existence of borderline cases. For any context,
in addition to the sets of objects that a predicate like is expensive is clearly true of and
clearly false of, there is typically a third set of objects for which it is difficult or impossible
to make these judgments. Just as it is easy to imagine contexts in which (29) is clearly true
and contexts in which it is clearly false, it is also easy to imagine a context in which such a
decision cannot be so easily made. Consider, for example, a visit to a coffee shop to buy a
pound of coffee. The Mud Blend at $1.50/pound is clearly not expensive, and the Organic
Kona at $20/pound is clearly expensive, but what about the Swell Start Blend at $9.25/
pound? A natural response is "I'm not sure"; this is the essence of being a borderline case.
Finally, vague predicates give rise to the Sorites Paradox, illustrated in (30).
(30) The Sorites Paradox
PI. A $5 cup of coffee is expensive (for a cup of coffee).
P2. Any cup of coffee that costs 1 cent less than an expensive one is expensive (for
a cup of coffee).
C. Therefore, any free cup of coffee is expensive.
The structure of the argument appears to be valid, and the premises appear to be true,
but the conclusion is without a doubt false. Evidently, the problem lies somewhere in the
inductive second premise; what is hard is figuring out exactly what goes wrong. And even
if we do, we also need to explain both why it is so hard to detect the flaw and why we arc
so willing to accept it as valid in the first place.
These points are made forcefully by Fara (2000), who succinctly characterizes the
challenges faced by any explanatorily adequate account of vagueness in the form of the
following three questions:
(31) a. The Semantic Question
If the inductive premise of a Sorites argument is false, then is its classical
negation - the sharp boundaries claim that there is an adjacent pair in a sorites
23. Ambiguity and vagueness: An overview
523
variability, borderline cases, and the Sorites Paradox as a guide, we can find vague
terms in all grammatical classes: nouns (like heap, which gives the Sorites Paradox its
name), verbs (such as like, or more significantly, know), determiners (such as many and
few), prepositions (such as near) and even locative adverbials, as in the Fry and Laurie
dialogue in (33).
(33) FRY: There are six million people out there....
LAURIE: Really? What do they want?
Here the humor of Laurie's response comes from the vagueness of out there: whether
it extends to cover a broad region beyond the location of utterance (Fry's intention)
or whether it picks out a more local region (Laurie's understanding). Vagueness is thus
pervasive, and its implications for the analysis of linguistic meaning extend to all parts
of the lexicon.
3.2. Approaches to vagueness
It is impossible to summarize all analysis of vagueness in the literature, so I will focus
here on an overview of four major approaches, based on supervaluations, epistemic
uncertainty, contextualism, and interest relativity. For a larger survey of approaches, see
Williamson (1994); Keefe & Smith (1997); and Fara & Williamson (2002).
3.2.1 Supervaluations
Let's return to one of the fundamental properties of vague predicates: the existence of
borderline cases. (34a) is clearly true; (34b) is clearly false; (34c) is (at least potentially)
borderline.
(34) a. Mercury is close to the sun.
b. Pluto is close to the sun.
c. The earth is close to the sun.
However, even if we are uncertain about the truth of (34c), we seem to have clear
intuitions about (35a-b): the first is a logical truth, and the second is a contradiction.
(35) a. The earth is or isn't close to the sun.
b. The earth is and isn't close to the sun.
This is not particularly surprising, as these sentences are instances of (36a-b):
(36) a. pw -7?
b. p/\ ->p
As noted by Fine (1975), these judgments show that logical relations (such as the Law
of the Excluded Middle in (36a)) can hold between sentences which do not themselves
have clear truth values in a particular context of utterance. Fine accepts the position
that sentences involving borderline cases, such as (34c), can fail to have truth values
524 V. Ambiguity and vagueness
in particular contexts (this is what it means to be a borderline case), and accounts for
judgments like (35a) by basing truth valuations for propositions built out of logical
connectives not on facts about specific contexts of evaluation, but rather on a space of
interpretations in which all "gaps" in truth values have been filled. In other words, the truth
of examples like (35a-b) is based on supervaluations (van Fraassen 1968,1969) rather
than simple valuations.
The crucial components of Fine's theory are stated in (37)-(39); a similar set of
proposals (and a more comprehensive linguistic analysis) can be found in Kamp (1975) and
especially Klein (1980).
(37) Specification space
A partially ordered set of points corresponding to different ways of specifying the
predicates in the language; at each point, every proposition is assigned true, false
or nothing according to an "intuitive" valuation. This valuation must obey certain
crucial constraints, such as Fine's Penumbral Connections which ensure e.g., that if x
is taller than y, it can never be the case that x is tall is false while y is tall is true (cf.
the Consistency Postulate of Klein 1980).
(38) Completability
Any point can be extended to a point at which every proposition is assigned a truth
value, subject to the following constraints:
a. fidelity:Truth values at complete points are 1 or 0.
b. stability: Definite truth values are preserved under extension.
(39) Supertruth
A proposition is supcrtruc (or supcrfalsc) at a partial specification iff it is true
(false) at all complete extensions.
According to this approach, the reason that we have clear intuitions about (35a) and
(35b) is because for any ways of making things more precise, we're always going to end
up with a situation where the former holds for any proposition and the latter fails to hold,
regardless of whether the proposition has a truth value at the beginning. In particular,
given (38), (35a) is supertrue and (35b) is superfalse.
This theory provides an answer to Fara's Semantic Question about vagueness.
According to this theory, any complete and admissible specification will entail a sharp
boundary between the things that a vague predicate is true and false of. This renders
inductive statements like the second premise of (40) superfalse, even when the argument
as a whole is evaluated relative to a valuation that does not assign truth values to some
propositions of the form x is heavy, i.e., one that allows for borderline cases.
(40) a. A 100 kilogram stone is heavy.
b. Any stone that weighs 1 gram less than a heavy one is heavy.
c. #A 1 gram stone is heavy.
The supervaluationist account thus gives up bivalence (the position that all propositions
are either true or false relative to particular assignments of semantic values), but still
manages to retain important generalizations from classical logic (such as the Law of the
526 V. Ambiguity and vagueness
eliminate borderline cases and identify where the second premise of the Sorites Paradox
fails?
Williamson's response comes in several parts. First, he assumes that meaning
supervenes on use; as such, a difference in meaning entails a difference in use, but not vice-
versa. Second, he points out that the meanings of some terms may be stabilized by natural
divisions (cf. Putnam's 1975 distinction between H20 and XYZ), while the meanings of
others (the vague ones) cannot be so stabilized: a slight shift in our disposition to say
that the earth is close to the sun would slightly shift the meaning of close to the sun.
The boundary is sharp, but not fixed. But this in turn means that an object around the
borderline of a vague predicate P could easily have been (or not been) P had the facts
(in particular, the linguistic facts) been slightly different - different in ways that arc too
complex for us to fully catalogue, let alone compute. Given this instability, we can never
really know about a borderline case whether it is or is not P.
This last point leads to the principle in (41), which is another way of saying that vague
knowledge requires a margin for error.
(41) The Margin for Error Principle
For a given way of measuring differences in measurements relevant to the application of
property P, there will be a small but non-zero constant c such that if jc and y differ in those
measurements by less than c and x is known to be P. then y is known to be P.
The upshot of this reasoning is that it is impossible to know whether x is P is true or false
when x and y differ by less than c. That's why we fail to reject the second premise of the
Sorites, and also why "big" changes make a difference. (If we replace 1 cent with 1 dollar
in (30), or 1 gram with 10 kilograms in (40), the paradox disappears.)
There are a number of challenges to this account, most of which focus on the central
hypothesis that we can be ignorant about core aspects of meaning and still somehow
manage to have knowledge of meaning at all. Williamson (1992) lists these challenges
and provides responses to them; here I focus on the question of whether this theory
is adequate as an account of vagueness. According to Fara (2000), it is not, because
although it addresses the Semantic and Epistemological Questions, it does not address
the Psychological one. In particular, there is no account of why we don't have the
following reaction to the inductive premise: "That's false! I don't know where the shift from
P to ->P is, so there are cases that I'm not willing to make a decision about, but I know it's
in there somewhere, so the premise must be false."
Williamson (1997) suggests the answer has to do with the relation between
imagination and experience.The argument runs as follows:
(42) i. It is impossible to gain information through imagination that cannot be gained
through experience,
ii. It is impossible to recognize the experience of the boundary transition in a
sorites sequence because the transition lacks a distinctive appearance,
iii. Therefore, it is impossible to imagine the transition.
This failure of imagination then makes it impossible to reject the inductive premise, since
doing so precisely requires imagination of the crucial boundary transition. However,
528 V. Ambiguity and vagueness
length. The epistemic account of vagueness provides no account of this fact (nor does
an unaugmented supervaluationist account, since it too needs to allow for contextual
shifting of standards of comparison). If the impossibility of crisp judgments in examples
like these involving definite descriptions and our judgments about the second premise
of the Sorites Paradox are instances of the same basic phenomenon, then the failure
of the epistemic account of vagueness to explain the former raises questions about its
applicability to the latter.
3.2.3. Contextualism and interest relativity
Raffman (1996) observes three facts about vague predicates and Sorites sequences.
First, as we have already discussed, vague predicates have context dependent extensions.
Second, when presented with a sorites sequence based on a vague predicate P, a
competent speaker will at some point stop (or start, depending on direction) judging P to
be true of objects in the sequence. Third, even if we fix the (external) context, the shift
can vary from speaker to speaker and from run to run. This is also a part of competence
with P.
These observations lead Raffman (1994, 1996) to a different perspective on the
problem of vagueness: reconciling tolerance (insensitivity to marginal changes) with
categorization (the difference between being red and orange, tall and not tall, etc.). She
frames the question in the following way: how can we simultaneously explain the fact
that a competent speaker seems to be able to apply incompatible predicates (e.g., red vs.
orange; tall vs. not tall) to marginally different (adjacent) items in the sequence and the
fact that people are unwilling to reject the inductive premise of the Paradox?
Her answer involves recognizing the fact that evaluation of a sorites sequence
triggers a context shift, which in turn triggers a shift in the extension of the predicate in
such a way as to ensure that incompatible predicates are never applied to adjacent pairs,
and to make (what looks like) a sequence of inductive premises all true. (For similar
approaches, sec Kamp 1981; Bosch 1983; Soamcs 1999.) This gives the illusion of validity,
but since there is an extension shift, the predicate at the end of the series is not the same
as the one at the beginning, so the argument is invalid.
There are three pieces to her account. The first comes from work in cognitive
psychology, which distinguishes between two kinds of judgments involved in a sorites
sequence. The first is categorization, which involves judgments of similarity to a
prototype/standard; the second is discrimination, which involves judgments of
sameness/difference between pairs. Singular judgments about items involve categorization, and it is
relative to such judgments that a cutoff point is established. Discrimination, on the other
hand, doesn't care where cutoff points fall, but it imposes a different kind of constraint:
adjacent pairs must be categorized in the same way (Tvcrsky & Kahncman 1974).
At first glance, it appears that the categorization/discrimination distinction just
restates the problem of vagueness in different terms: if for any run of a sorites sequence,
a competent speaker will at some point make a category shift, how do we reconcile such
shifts with the fact that we resist discrimination between adjacent pairs? Note that the
problem is not the fact that a speaker might eventually say of an object o in a sorites
sequence based on P that it is not P, even if she judged o/+1 to be P, because this is a
singular judgment about o^The problem is that given the pair (op o/+1 >, the speaker will
refuse to treat them differently. This is what underlies judgments about the inductive
23. Ambiguity and vagueness: An overview 529
premise of the Sorites Paradox and possibly the crisp judgment effects discussed above
as well, though this is less clear (see below).
The second part of Raffman's proposal is designed to address this problem, by
positing that a category shift necessarily involves a change in perspective such that the new
category instantaneously absorbs the preceding objects in the sequence. Such backwards
spread is the result of entering a new psychological state, a Gestalt shift that triggers a
move from one 'category anchor' or prototype to another, e.g., from the influence of the
red anchor to the influence of the orange one. This gives rise to the apparent
boundlessness of vague predicates: a shift in category triggers a shift in the border away from the
edge, giving the impression that it never was there in the first place.
And in fact, as far as the semantics is concerned, when it comes to making judgments
about pairs of objects, it never is. This is the third part of the analysis, which makes crucial
appeal to the context dependence of vague predicates. Raffman proposes that the meaning
of a vague predicate P is determined by two contextual factors. The external context
includes discourse factors that fix domain, comparison class, dimension, etc. of P. The
internal context includes the properties of an individual's psychological state that
determine dispositions to make judgments of P relative to some external context. Crucially, a
category shift causes a change in internal context e.g., (from a state in which the red anchor
dominates to one in which the orange anchor does), which in turn results in a change in
the extension of the predicate in the way described above, resulting in backwards spread.
Taken together, these assumptions provide answers to each of Fara's questions. The
answer to the Semantic Question is clearly positive, since the commitment to category
shifts involves a commitment to the position that a vague predicate can be true of one
member of an adjacent pair in a sorites sequence oj and false of o/+[. The reason we
cannot say which (ojf o/+1) has this property, however, is that the act of judging o; to be
not P (or P) causes a shift in contextual meaning of P to P\ which, given backwards
spread, treats oj and o.+/ the same. This answer to the Epistcmological Question also
underlies the answer to the Psychological Question: even though the inductive premise
of the Sorites Paradox is false for any fixed meaning of a vague predicate, we think that
it is true because it is possible to construct a sequence of true statements that look like
(instantiations of) the inductive premise, but which in fact do not represent valid
reasoning because they involve different contextual valuations of the vague predicate. For
example, if we are considering a sequence of 100 color patches \pv pv ... pm\ ranging
from 'pure' red to 'pure' orange, such that a category shift occurs upon encountering
patch p4V the successive conditional statements in (46a) and (46c) work out to be true
thanks to backwards spread (because their subconstituents are both true and both
false in their contexts of evaluation, respectively), even though red means something
different in each case.
(46) a. If p45 is red, then p^ is red.
PwP** \\red\\c
TRUE-> TRUE ^ TRUE
b. shift at p41: change from context c to context c'
c. If p^ is red, then p41 is red.
p*>p*7* Wred\Y
FALSE -> FALSE |=TRUE
23. Ambiguity and vagueness: An overview 531_
relativity, will cause the extension of the predicate to shift in a way that ensures that they
are evaluated in the same way. In Fara's words:"the boundary between the possessors and
the lackers in a sorites series is not sharp in the sense that we can never bring it into focus;
any attempt to bring it into focus causes it to shift somewhere else." (Fara 2000:75-76)
One of the reasons that the contextualist and interest relative analyses provide
compelling answers to the Psychological Question is that they are inherently psychological:
the former in the role that psychological state plays in fixing context sensitive
denotations; the latter in the role played by interest relativity. Moreover, in providing an
explanation for judgments about pairs of objects, these analyses can support an explanation of
the 'crisp judgment' effects discussed in the previous section, provided they can be linked
to a semantics that appropriately distinguishes the positive and comparative forms
of a scalar predicate. However, the very aspects of these analyses that are central to
their successes also raise fundamental problems that question their ultimate status as
comprehensive accounts of vagueness, according to Stanley (2003).
Focusing first on the contextualist analysis, Stanley claims that it makes incorrect
predictions about versions of the Sorites that involve sequential conditionals and ellipsis.
Stanley takes the contextualist to be committed to a view in which vague predicates are
a type of indexical expression. Indexicals, he observes, have the property of remaining
invariant under ellipsis: (48b), for example, cannot be used to convey the information
expressed by (48a).
(48) a. Kim voted for that^ candidate because Lee voted for thatfl candidate.
b. Kim voted for that^ candidate because Lee did vote for thatfl candidate.
Given this, Stanley argues that if the contextualist account of vagueness entails that
vague predicates are indexicals, then our judgments about sequences of conditionals
like (46) (keeping the context the same) should change when the predicates are elided.
Specifically, since ellipsis requires indexical identity, it must be the case that the elided
occurrences of red in (49) be assigned the same valuation as their antecedents, i.e. that
[[red]Y=[[red]y.
(49) a. If p45 is red, then p^ is red too.
PwP** \\red\\c
TRUE-> TRUE |= TRUE
b. shift at p47: change from context c to context c'
c. If p^ is red, then p41 is red too.
P«,^[[red]r\p47e[[red]r
TRUE -> FALSE |= FALSE
But this is either in conflict with backwards spread, in which case ellipsis should be
impossible, or it entails that (49c) should be judged false while (49a) is judged true. Neither of
these predictions arc borne out: the judgments about (49) arc in all relevant respects
identical to those about (46). Raffman (2005) responds to this criticism by rejecting the
view that the contextualist account necessarily treats vague predicates as indexicals, and
suggests that the kind of 'shiftability' of vague predicates under ellipsis that is
necessary to make the account work is analogous to what we see with comparison classes in
examples like (50a), which has the meaning paraphrased in (50b) (Klein 1980).
532 V. Ambiguity and vagueness
(50) a. That elephant is large and that flea is too.
b. That elephant is large for an elephant and that flea is large for a flea.
This is probably not the best analogy, however: accounts of comparison class shift
in examples like (50a) rely crucially on the presence of a binding relation between
the subject and a component of the meaning of the predicate (see e.g., Ludlow 1989;
Kennedy 2007), subsuming such cases under a general analysis of 'sloppy identity' in
ellipsis. If a binding relation of this sort were at work in (49), the prediction would
be that the predicate in the consequent of (49a) and the antecedent of (49b) should
be valued in exactly the same way (since the subjects are the same), which, all else
being equal, would result in exactly the problematic judgments about the truth and
falsity of the two conditionals that Stanley discusses. In the absence of an alternative
contextualist account of how vague predicates should behave in ellipsis, then, Stanley's
objection remains in force.
This objection does not present a problem for Fara's analysis, which assumes that
vague predicates have fixed denotations. However, the crucial hypothesis that these
denotations are interest relative comes with its own problems, according to Stanley. In
particular, he argues that this position leads to the implication that the meaning of a
vague predicate is always relativized to some agent, namely the entity relative to whom
significance is assessed. But this implication is inconsistent with the fact that we can have
beliefs about the truth or falsity of a sentence like (51) without having beliefs about any
agent relative to whom Mt. Everest's height is supposed to be significant.
(51) Mt. Everest is tall.
Moreover, the truth of the proposition conveyed by an utterance of (51) by a particular
individual can remain constant even in hypothetical worlds in which that individual
doesn't exist, something that would seem to be impossible if the truth of proposition
has something to do with the utterer's interests. Fara (2008) rejects Stanley's criticism on
the grounds that it presumes that an "agent of interest" is a constituent of the
proposition expressed by (51), something that is not the case given her particular assumptions
about the compositional semantics of the vague scalar predicates she focuses on, which
builds on the decompositional syntax and semantics of Kennedy (1999, 2007) (see also
Bartsch & Venneman 1972). However, to the extent that her analysis creates entailments
about the existence of individuals with the relevant interests, it is not clear that Stanley's
criticisms can be so easily set aside.
4. Conclusion
Ambiguity and vagueness are two forms of interpretive uncertainty, and as such, are
often discussed in tandem.They are fundamentally different in their essential features,
however, and in their significance for semantic theory and the philosophy of language.
Ambiguity is essentially a "mapping problem", and while there are significant analytical
questions about how (and at what level) to best capture different varieties of ambiguity,
the phenomenon per se does not represent a significant challenge to current conceptions
of semantics. Vagueness, on the other hand, raises deeper questions about knowledge
23. Ambiguity and vagueness: An overview 533
of meaning. The major approaches to vagueness that I have outlined here, all of which
come from work in philosophy of language, provide different answers to these
questions, but none is without its own set of challenges. Given this, as well as the fact that this
phenomenon has seen relatively little in the way of close analysis by linguists, vagueness
has the potential to be an important and rich domain of research for semantic theory.
5. References
Barker. Chris 2002.The dynamics of vagueness. Linguistics & Philosophy 25.1-36.
Bartsch, Renate & Theo Vennemann 1972. The grammar of relative adjectives and comparison.
Linguistische Berichte 20,19-32.
Bierwisch, Manfred 1989. The semantics of gradation. In: M. Bierwisch & E. Lang (eds.).
Dimensional Adjectives. Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 71-261.
Bogustawski, Andrzej 1975. Measures are measures. In defence of the diversity of comparatives and
positives. Linguistische Berichte 36,1- 9.
Bolinger, Dwight 1967. Adjectives in English. Attribution and predication. Lingua 18,1-34.
Bosch, Peter 1983. 'Vagueness' is context-dependence. A solution to the Sorites Paradox. In:
T Ballmer & M. Pinkal (eds.). Approaching Vagueness. Amsterdam: North-Holland, 189-210.
Cinque, Guglielmo 1993. On the evidence for partial N movement in the Romance DP. University
of Venice Working Papers in Linguistics 3(2), 21-40.
Cresswell, Max J. 1976. The semantics of degree. In: B. Partee (ed.). Montague Grammar. New York:
Academic Press, 261-292.
Fara, Delia Graff 2000. Shifting sands. An interest-relative theory of vagueness. Philosophical
Topics 28,45-81. Originally published under the name "Delia Graff.
Fara. Delia Graff 2008. Profiling interest relativity. Analysis 68,326-335.
Fara. Delia Graff & Timothy Williamson (eds.) 2002. Vagueness. Burlington, VT: Ashgate.
Originally published under the name "Delia Graff.
Fine, Kit 1975. Vagueness, truth and logic. Synthese 30,265-300.
van Fraassen. Bas C. 1968. Presupposition, implication, and self-reference. Journal of Philosophy
65,136-152.
van Fraassen, Bas C. 1969. Presuppositions, supervaluations, and free logic. In: K. Lambert (ed.).
The Logical Way of Doing Things. New Haven, CT: Yale University Press, 69-91.
Gillon, Brendan S. 1990. Ambiguity, generality, and indeterminacy. Tests and definitions. Synthese
85,391^116.
Gillon, Brendan S. 2004. Ambiguity, indeterminacy, dcixis, and vagueness. Evidence and theory.
In: S. Davis & B. Gillon (eds.). Semantics. A Reader. Oxford: Oxford University Press, 157-187.
Gi ice, H. Paul 1957. Meaning. The Philosophical Review 66,377-388.
Heim, Irene 2000. Degree operators and scope. In: B. Jackson &T Matthews (eds.). Proceedings of
Semantics and Linguistic Theory (= SALT) X. Ithaca, NY: Cornell University, 40-64.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Jacobson, Pauline 1999.Towards a variable-free semantics. Linguistics & Philosophy 22,117-184.
Jacobson, Pauline 2000. Paycheck pronouns, Bach-Peters sentences, and variable-free semantics.
Natural Language Semantics 8,77-155.
Kamp, Hans 1975. Two theories of adjectives. In: E. Keenan (ed.). Formal Semantics of Natural
Language. Cambridge: Cambridge University Press, 123-155.
Kamp, Hans 1981. The paradox of the heap. In: U. Monnich (ed.). Aspects of Philosophical Logic.
Dordrecht: Reidel, 225-277.
Kamp, Hans & Barbara H. Partee 1995. Prototype theory and compositionality. Cognition 57,
129-191.
Keefe, Rosanna & Peter Smith (eds.) 1997. Vagueness. A Reader. Cambridge, MA: The MIT
Press.
24. Semantic underspecification 535
Stanley, Jason 2003. Context, interest relativity, and the Sorites. Analysis 63,269-280.
von Stechow, A mini 1984. Comparing semantic theories of comparison. Journal of Semantics 3,
1-77.
Travis, Charles 1985. On what is strictly speaking true. Canadian Journal of Philosophy 15,
187-229.
Travis, Charles 1994. On constraints of generality. Proceedings of the Aristotelian Society 94,
165-188.
Travis, Charles 1997. Pragmatics. In: B. Hale & C. Wright (eds.). A Companion to the Philosophy of
Language. Oxford: Blackwell, 87-107.
Tversky, Amos & Daniel Kahneinan 1974. Judgment under uncertainty. Heuristics and biases.
Science 185(4157), 1124-1131.
Ward, Gregory 2004. Equatives and deferred reference. Language 80,262-289.
Ward, Gregory, Richard Sproat & Gail McKoon 1991. A pragmatic analysis of so-called anaphoric
islands. Language 67,439-474.
Wheeler, Samuel 1972. Attributives and their modifiers. Nous 6,310-334.
Williamson, Timothy 1992. Vagueness and ignorance. Proceedings of the Aristotelian Society.
Supplementary Volume 66,145-162.
Williamson,Timothy 1994. Vagueness. London: Routledge.
Williamson,Timothy 1997. Imagination, stipulation and vagueness. Philosophical Issues 8,215-228.
Zwicky, Arnold & Jerrold Sadock 1975. Ambiguity tests and how to fail them. In: J. P. Kimball (ed.).
Syntax and Semantics. 4. New York: Academic Press, 1-36.
Zwicky, Arnold & Jerrold Sadock 1984. A reply to Martin on ambiguity. Journal of Semantics 3,
249-256.
Christopher Kennedy, Chicago, IL (USA)
24. Semantic underspecification
1. Introduction
2. The domain of semantic underspecification
3. Approaches to semantic underspecification
4. Motivation
5. Semantic underspecification and the syntax-semantics interface
6. Further processing of underspecified representations
7. References
Abstract
This article reviews semantic underspecification, which has emerged over the last three
decades as a technique to capture several readings of an ambiguous expression in one
single representation by deliberately omitting the differences between the readings in
the semantic descriptions. After classifying the kinds of ambiguity to which
underspecification can be applied, important properties of underspecification formalisms
will be discussed that can be used to distinguish subgroups of these formalisms. The
remainder of the article then presents various motivations for the use of
underspecification, and expounds the derivation and further processing of underspecified semantic
representations.
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 535-574
536
V. Ambiguity and vagueness
1. Introduction
Underspecification is defined as the deliberate omission of information from linguistic
descriptions to capture several alternative realisations of a linguistic phenomenon in one
single representation.
Underspecification emerged in phonology (see Steriade 1995 or Harris 2007 for
an overview), where it was used e.g. for values of features that need not be specified
because they can be predicted independently, e.g., by redundancy rules or by
phonological processes. The price for this simplification, however, are additional layers or stages
in phonological processes/representations, which resurfaces in most approaches that use
underspecification in semantics.
In the 1980's, underspecification was adopted in semantics. Here the relevant linguistic
phenomenon is meaning, thus, underspecified representations are intended to capture
whole sets of different meanings in one representation. Since this applies only to sets
of meanings that correspond to the readings of a linguistic expression, semantic
underspecification emerges as a technique for the treatment of ambiguity. (Strictly speaking,
underspecification could be applied to semantic indefiniteness in general, which also
encompasses vagueness, see Pinkal 1995 and article 23 (Kennedy) Ambiguity and
vagueness. But since underspecification focusses almost exclusively on ambiguity, vagueness
will be neglected.)
While underspecification is not restricted to expressions with systematically
related sets of readings (as opposed to homonyms), it is in practice applied to such
expressions only. The bulk of the work in semantic underspecification focusses on
scope ambiguity.
In natural language processing, underspecification is endorsed to keep semantic
representations of ambiguous expressions tractable and to avoid unnecessary
disambiguation steps; a completely new use of underspecification emerged in hybrid processing,
where it provides a common format for the results of deep and shallow processing.
Underspecification is used also in syntax and discourse analysis to obtain compact
representations whenever several structures can be assigned to a specific sentence
(Marcus, Hindle & Fleck 1983; Rambow, Weir & Shanker 2001; Muskens 2001) or
discourse, respectively (Asher & Fernando 1999; Duchier & Gardent 2001; Schilder 2002;
Egg & Redeker 2008; Regneri, Egg & Roller 2008).
This article gives an overview over underspecification techniques in semantics. First
the range of phenomena in semantics to which underspecification (formalisms) can be
applied is sketched in section 2. Section 3 outlines important properties of
underspecification formalisms which distinguish different subgroups of these formalisms. Various
motivations for using underspecification in semantics are next outlined in section 4.
The remaining two sections focus on the derivation of underspecified semantic
representations by a suitable syntax-semantics interface (section 5) and on the further
processing of these representations (section 6).
2. The domains of semantic underspecification
Before introducing semantic underspecification in greater detail, ambiguous expressions
that are in principle amenable to a treatment in terms of semantic underspecification
will be classified according to two criteria. These criteria compare the readings of these
538 V. Ambiguity and vagueness
(4) Every researcher of a company saw most samples.
(5) [Every man]( read a book he(. liked.
(6) Every linguist attended a conference, and every computer scientist did, too.
The subject in (4) exhibits nested quantification, where one quantifier-introducing DP
comprises another one. (4) is challenging because the number of its readings is less than
the number of the possible permutations of its quantifiers (3! = 6). The scope ordering
that is ruled out in any case is V > most7 > 3 (Hobbs & Shieber 1987). (See section 3.1. for
further discussion of this example.)
In (5), the anaphoric dependency of a book he liked on every man restricts the
quantifier scope ambiguity in that the DP with the anaphor must be in the scope of its
antecedent (Reyle 1993).
In (6), quantifier scope is ambiguous, but must be the same in both sentences (i.e., if
every linguist outscopes a conference, every computer scientist does, too), which yields two
readings. In a third reading, a conference receives scope over everything else, i.e., both
linguists and computer scientists attending the same conference (Hirschbiihler 1982;
Crouch 1995; Dalrymple, Shieber & Pereira 1991; Shieber, Pereira & Dalrymple 1996;
Egg, Roller & Nichrcn 2001).
Other scope-bearing items can also enter into scope ambiguity, e.g., negation and
modal expressions, as in (7) and (8):
(7) Everyone didn't come. (V > -i or -i > V)
(8) A unicorn seems to be in the garden. (3 > seem or seem > 3)
Such cases can also be captured by underspecification. For instance, Minimal Recursion
Semantics (Copestake et al. 2005) describes them by underspecifying the scope of the
quantifiers and fixing the other scope-bearing items scopally.
But cases of scope ambiguity without quantifiers show that underspecifying quantifier
scope only is not general enough. For instance, cases of 'neg raising' (Sailer 2006) like in
(9) have a reading denying that John believes that Peter will come, and one attributing to
John the belief that Peter will not come:
(9) John doesn't think Peter will come.
Sailer analyses these cases as a scope ambiguity between the matrix verb and the
negation (whose syntactic position is invariably in the matrix clause).
Other such examples involve coordinated DPsJike in (10), (11),or (12) (Hurum 1988;
Babko-Malaya 2004; Chaves 2005b):
(10) I want to marry Peggy or Sue.
(11) Every man and every woman solved a puzzle.
(12) Every lawyer and his secretary met.
540 V. Ambiguity and vagueness
However, the definition of semantically and syntactically homogeneous ambiguity
includes also cases where the readings consist of the same building blocks, but differ in
that some of the readings exhibit more than one instance of specific building blocks.
This kind of semantically and syntactically homogeneous ambiguity shows up in the
ellipsis in (17). Its two readings 'John wanted to greet everyone that Bill greeted' and
'John wanted to greet everyone that Bill wanted to greet' differ in that there is one
instance of the semantic contribution of the matrix verb want in the first reading, but two
instances in the second reading (Sag 1976):
(17) John wanted to greet everyone that Bill did.
This is due to the fact that the pro-form did is interpreted in terms of a suitable preceding
VP (its antecedent), and that there are two such suitable VPs in (17), viz., wanted to
greet everyone that Bill did and greet everyone that Bill did. ((17) is a case of antecedent-
contained deletion, see Shieber, Pereira & Dalrymple 1996 and Egg, Roller & Niehren
2001 for underspecified accounts of this phenomenon;see also article 70 (Reich) Ellipsis.)
Another example of this kind of semantically and syntactically homogeneous
ambiguity is the Afrikaans example (18) (Sailer 2004). Both the inflected form of the matrix
verb wou 'wanted' and the auxiliary het in the subordinate clause introduce a past tense
operator. But these examples have three readings:
(18) Jan wou gebel het.
Jan want. PAST called have
'Jan wanted to call/Jan wants to have called/Jan wanted to have called.'
The readings can be analysed (in the order given in (18)) as (19a-c):That is, the readings
comprise one or two instances of the past tense operator:
(19) a. PAST(wanf (j; (call'(j))))
b. want'(j; PAST(call'(j)))
c. PASTtwairfCj; PAST(call'(j))))
Finally, the criterion 'syntactically and semantically homogeneous' as defined in this
subsection will be compared to similar classes of ambiguity from the literature. Syntactic and
semantic homogeneity is sometimes referred to as structural ambiguity. But this term is
itself ambiguous in that it is sometimes used in the broader sense of 'semantically
homogeneous' (i.e., syntactically homogeneous or not). But then it would also encompass the
group of semantically but not syntactically homogeneous ambiguities discussed in the
next subsection.
The group of semantically and syntactically homogeneous ambiguities coincides by
and large with Bunt's (2007) 'structural semantic ambiguity' class, excepting ambiguous
compounds like math problem and the collective/distributive ambiguity of quantifiers,
both of which are syntactically but not semantically homogeneous: Different readings of
a compound each instantiate an unspecific semantic relation between the components
in a unique way. Similarly, distributive and quantitative readings of a quantifier arc
distinguished in the semantics by the presence or absence of a distributive or collective
operator, e.g., Link's (1983) distributive D-operator (see article 46 (Lasersohn) Mass
nouns and plurals).
546 V. Ambiguity and vagueness
The description (28) can then be read as follows: The fragment at the top consists
of a hole only, i.e., we do not yet know what the described representations look like.
However, since the relation R relates this hole and the right and the left fragment, they
are both part of these representations - only the order is open. Finally, the holes in both
the right and the left fragment are related to the bottom fragment in terms of /?, i.e., the
bottom fragment is in the scope of either quantifier. The only semantic representations
compatible with this description are (27a-b).
To derive the described readings from such a constraint (its solutions), the relation
R between holes and fragments is monotonically strengthened until all the holes are
related to a fragment, and all the fragments except the one at the top are identified with
a hole (this is called 'plugging' in Bos 2004).
In (28), one can strengthen R by adding the pair consisting of the hole in the left
fragment and the right fragment. Here the relation between the hole in the universal
fragment and the bottom fragment in (28) is omitted.
(29) p
\/x. woman'(jc) -» □
3y. man'(_y)A □
love'(oc, y)
Identifying the hole-fragment pairs in R in (29) then yields (27a), one of the solutions of
(28). The other solution (27b) can be derived by first adding to R the pair consisting of
the hole in the right fragment and the left fragment.
Underspecification formalisms that implement scope in this way comprise Under-
specified Discourse Representation Theory (UDRT; Reyle 1993; Reyle 1996; Frank &
Reyle 1995), Minimal Recursion Semantics (MRS; Copestake et al. 2005), the Constraint
Language for Lambda Structures (CLLS; Egg, Koller & Niehren 2001), the language of
Dominance Constraints (DQsubsumed by CLLS;Althauset al.2001),Hole Semantics (HS;
Bos 1996; Bos 2004; Kallmcycr & Romero 2008), and Logical Description Grammar
(Muskens2001).
Koller, Niehren & Thater (2003) show that expressions of HS can be translated
into expressions of DC and vice versa; Fuchss et al. (2004) describe how to translate
MRS expressions into DC expressions. Player (2004) claims that this is due to the
fact that UDRT, MRS, CLLS, and HS are the same 'modulo cosmetic differences',
however, his comparison does not pertain to CLLS but to the language of dominance
constraints.
Scope relations like the one between a quantifying DP and the verb it is an argument
of can also be expressed in terms of suitable variables. This is implemented e.g. in the
Undcrspccicd Semantic Description Language (USDL; Pinkal 1996, Niclircn, Pinkal &
Ruhrberg 1997; Egg & Kohlhase 1997 present a dynamic version of this language). In
USDL, the constraints for (26) are expressed by the equations in (30):
(30) a. X0 = C,(every_woman@L^ (C2(love@*2@*,)))
b. X0 = C3(a_man@Lr (C4(love@jc2@jc,)))
24. Semantic underspecification 547
Here 'every_woman' and 'a_man' stand for the the two quantifiers in the
semantics of (26), '@' denotes explicit functional application in the metalanguage, and 'Lr',
X-abstraction over x.
These equations can now be solved by an algorithm like the one in Huet (1975). For
instance, for the V3-reading of (26), the variables would be resolved as in (31a-c).This
yields (31d), whose right hand side corresponds to (27a):
(31) a. C, = C4 = X£/>
b. C2 = XP.a_man@Lr (P)
c. C3 = XP.every_woman@Lr (/>)
d. X0 = every_woman@Lj (a_man@L^ (love@ac2@ac,))
Another way to express such scope relations is used in the version of the Quasi-Logical
Form (QLF) in Alshawi & Crouch (1992), viz., list-valued metavariables in semantic
representations whose specification indicates quantifier scope. Consider e.g. the (simplified)
representation for (26) in (32a), which comprises an underspecified scoping list (the
variable _s before the colon). Here the meanings of every woman and a man are represented
as complex terms; such terms comprise term indices (+m and +w) and the restrictions of
the quantifiers (man and woman, respectively). Specifying this underspecified reading to
the reading with wide scope for the universal quantifier then consists in instantiating the
variable _s to the list [+w,+m] in (32b), which corresponds to (27a):
(32) a. _s:love(term(+w,...,woman,...),term(+m,...,man,...))
b. [+w,+m]: love (term (+w,...,woman,...), term (+m,...,man,...))
Even though QLF representations seem to differ radically from the ones that use
dominance constraints, Lev (2005) shows how to translate them into expressions of Hole
Semantics, which is based on dominance relations.
Finally, I will show how Glue Language Semantics (GLS; Dalrymple et al. 1997;
Crouch & van Genabith 1999; Dalrymple 2001) handles scope ambiguity. Each
lexical item introduces so-called meaning constructors that relate syntactic constituents
(I abstract away from details of the interface here) and semantic representations.
For instance, for the proper name John, the constructor is 'DP ~ John", which states
that the DP John has the meaning John' ('-' relates syntactic constituents and their
meanings).
Such statements can be arguments of linear logic connectives like the conjunction ®
and the implication — , e.g., the meaning constructor for love:
(33) V* Y.DPsiibj ~X® DPobj ~Y~S~ \ovc\X, Y)
In prose: Whenever the subject interpretation in a sentence S is X and the object
interpretation is V, then the 5 meaning is love'(A', y).That is these constructors specify how
the meanings of smaller constituents determine the meaning of a larger constituent.
The implication — is resource-sensitive: 'A — B" can be paraphrased as'use a resource
A to derive (or produce) fl'.The resource is 'consumed' in this process, i.e., no longer
available for further derivations.Thus, from A and A — B one can deduce B, but no longer
548 V. Ambiguity and vagueness
A. For (33), this means that after deriving the 5 meaning the two DP interpretations
are no longer available for further processes of semantic construction (consumed).
The syntax-semantics interface collects these meaning constructors during the
construction of the syntactic structure of an expression. For ambiguous expressions such
as (26), the resulting collection is an underspecified representation of its different
readings. Representations for the readings of the expression can then be derived from this
collection by linear-logic deduction.
In the following, the presentation is simplified in that DP-internal semantic
construction is omitted and only the DP constructors are given:
(34) a. V//, P(Vx.DP ~x^H~, (jc)) - H ~ every/(woman', P)
b. VG, R.{Vy.DP ~y^G -,(>>)) - G ~ a'(man',/?)
The semantics of every woman in (34a) can be paraphrased as follows: Look for a
resource of the kind 'use a resource that a DP semantics is jc, to build the truth-valued
(subscript f of ~) meaning P(x) of another constituent //'.Then consume this resource
and assume that the semantics of H is every'(woman',P); here every' abbreviates the
usual interpretation of every.The representation for a man works analogously.
With these constructors for the verb and its arguments, the semantic representation
of (26) in GLS is (35d), the conjunction of the constructors of the verb and its arguments.
Note that semantic construction has identified the DPs that are mentioned in the three
constructors:
(35) a. V//, P(Vx.DPsiibj ~x^H~, P(x)) - H ~ every'( woman', P)
b. VG, R.(Vy.DPobj ~y^G~t R(y)) - G ~ *'(mm',R)
c. VX, Y DPslibj ~X® DPobj ~Y~S~ love'(*, Y)
d. (35a)®(35b)®(35c)
From such conjunctions of constructors, fully specified readings can be derived. For (26),
the scope ambiguity is modelled in GLS in that two different semantic representations
for the sentence can be derived from (35d).
Either derivation starts with choosing one of the two possible specifications of the
verb meaning in (35c), which determine the order in which the argument interpretations
are consumed:
(36) a. V*.DPsubj ~ X- (VKDPobj ~ Y - 5 ~ love'(*, Y))
b. VY. DPobj ~Y~ (V*. DPsubj ~ X - S ~ love'(*, Y))
I will now illustrate the derivation of the reading of V3-reading of (26). The next step
uses the general derivation rule (37) and the instantiations in (38):
(37) from A - B and B - C one can deduct A - C
(38) G - 5, Y - y, and R - Xy.love'(X,y))
From specification (36a) and the object semantics (35b) we then obtain (39a), this goes
then together with the subject semantics (35a) to yield (39b), a notational variant of (27a):
24. Semantic underspecification 549
(39) a. V*.DPsubj ~ X - 5 ~ a'(man', X>>.love'(*,y))
b. every'(woman', k.a'(man', Xy.love'(.*, _y))
The derivation for the other reading of (26) chooses the other specification (36b) of the
verb meaning and works analogously.
3.1.2. A more involved example
After this expository account of the way that the simple ambiguity of (26) is captured
in various underspecification formalisms, reconsider the more involved nested
quantification in (40) [= (4)], whose constraint is given in (41).
(40) Every researcher of a company saw most samples
(41) a
3y. company' (y) a □ Vac.(researcher' (x)a a)->a most' (sample',Xz □)
of (x, y) see' (x, z)
As expounded in section 2.1, not all scope relations of the quantifiers are possible in (40).
I assume that (40) has exactly five readings, there is no reading with the scope ordering
V>most' > 3.(41) is a suitable underspecified representation of (40) in that it has exactly
its five readings as solutions.
As a first step of disambiguation, we can order the existential and the universal
fragment. Giving the former narrow scope yields (42):
(42)
Vx. (researcher' (r) a □) ->□ most' (sample', Xz □)
By.company' (y )a □
of (x, y) see' (x, z)
But once the existential fragment is outscopcd by the universal fragment, it can no longer
interact scopally with the most - and the see-fragment, because it is part of the
restriction of the universal quantifier. That is, there are two readings to be derived from (42),
with the mosf-fragment scoping below or above the universal fragment. This rules out a
reading in which most scopes below the universal, but above the existential quantifier:
(43) a. Vac.(researcher'(ac) a By.compan/^) a of (x,y)) -> most'(sample', Xz.see'(x, z))
b. most/(sample/, \zVx.(researcher'(x) a 3y.company'(>') a of '(x,y)) -> see'(x,z))
The second way of fixing the scope of the existential w.r.t. the universal quantifier in (41)
gives us (44):
24. Semantic underspecification 551_
(47) love'«forall x woman'(oc)>, (exists y man'^)))
(47) renders the semantics of DPs as terms, i.e., scope-bearing expressions whose scope
has not been determined yet. Terms are triples of a quantifier, a bound variable, and a
restriction.
The set of fully specified representations encompassed by such a representation is
then determined by a resolution algorithm. The algorithm 'discharges' terms, i.e.,
integrates them into the rest of the representation, which determines their scope. (Formally,
a term is replaced by its variable, then quantifier, variable, and restriction arc prefixed
to the resulting expression.) For instance, to obtain the representation (27a) for the
'V3'-reading of (26) the existential term is discharged first, which yields (48):
(48) 3y.mm'(y) a love'((forall x woman'(*)), y)
Discharging the remaining term then yields (27a); to derive (27b) from (47), one would
discharge the universal term first. Such an approach is adopted e.g. in the Core Language
Engine version of Moran (1988) and Alshawi (1992).
Hobbs & Shieber (1987) show that a rather involved resolution algorithm is needed to
prevent overgeneration in more complicated cases of scope ambiguity, in particular, for
nested quantification. Initial representations for nested quantification comprise nested
terms, e.g., the representation (49) for (40):
(49) see'((forall x researcher'^) a ot'(x, (exists y company'^)))), (most z sample'(z)))
Here the restriction on the resolution is that the inner term may never be discharged
before the outer one, which in the case of (40) rules out the unwanted 6th possible
permutation of the quantifiers. Otherwise, this permutation could be generated by
discharging the terms in the order '3, most', V. Such resolution algorithms lend
themselves to a straightforward integration of preference rules such as 'each outscopes other
determiners', see section 6.4.
Other ways of handling nested quantification by restricting resolution
algorithms for underspecified representations have been discussed in the literature. First,
one could block vacuous binding (even though vacuous binding does not make
formulae ill-formed), i.e., requesting an appropriate bound variable in the scope of every
quantifier. In Hobbs & Shieber's (1987) terms, the step from (51), the initial
representation for (50), to (52) is blocked, because the discharged quantifier fails to bind an
occurence of a variable y in its scope (the only occurrence of y in its scope is inside a
term, hence not accessible for binding). Thus, the unwanted solution (53) cannot be
generated:
(50) Every researcher of a company came
(51) come'((forall x researcher'(or) a ot'(x, (exists y company/(_y)»»
(52) 3y.company'(y) a come'((forall x researcher'(or) a ot'(x,y)))
(53) \/x.(researcher'(x) a ot'(x,y)) -» 3y. company'^) a come'(jc)
554 V. Ambiguity and vagueness
In prose: The whole structure (represented by the label /T) consists of a set of labelled
DRS fragments (for the semantic contributions of DP, negation, and VP, respectively)
that arc ordered in a way indicated by a relation ORD.
For an undcrspecificd representation of the two readings of (61), the scope relations
between /, and l2 are left open in ORD:
(63) ORD = </T > /,, /T > lv scope^) > ly scope(l2) > I J.
Here '>' means 'has scope over', and scope maps a DRS fragment onto the empty DRS
box it contains. Fully specified representations for the readings of (61) can then also be
expressed in terms of (62). In these cases, ORD comprises in addition to the items in
(63) a relation to determine the scope between universal quantifier and negation, e.g.,
scope{lx) > l2 for the reading with wide scope of the universal quantifier.
Another instance of such a 'monostratal' underspecification formalism is the (revised)
Quasi-Logical Form (QLF) of Alshawi & Crouch (1992), which uses list-valued
metavariables in semantic representations whose specification indicates quantifier scope. See
the representation for (26) in (32a).
Kempson & Cormack (1981) also assume a single level of semantic representation
(higher-order predicate logic) for quantifier scope ambiguities.
3.4. Compositionality
Another distinction between underspecification formalisms centres upon the notion of
resource: In most underspecification formalisms, the elements of a constraint show up at
least once in all its solutions, in fact, exactly once, except in special cases like ellipses. For
instance, in UDRT constraints and their solutions share the same set of DRS fragments,
in CLLS (Egg, Kollcr & Nichrcn 2001) the relation between constraints and their
solutions is defined as an assignment function from node variables (in constraints) to nodes
(in the solutions), and in Glue Language Semantics this resource-sensitivity is explicitly
encoded in the representations (expressions of linear logic).
Due to this resource-sensitivity, every solution of an underspecified semantic
representation of a linguistic expression preserves the semantic contributions of the parts of
the expression. If different parts happen to introduce identical semantic material, then
each instance must show up in each solution. For instance, any solution to a constraint
for (64a) must comprise two universal quantifiers.The contributions of the two DPs may
not be conflated in solutions, which rules out that (64a) and (64b) could share a reading
'for every person x:x likes x\
(64) a. Everyone likes everyone
b. Everyone likes himself
While this strategy seems natural in that the difference between (64a) and (64b) need
not be stipulated by additional mechanisms, there are cases where different instances of
the same semantic material seem to merge in the solutions.
Reconsider e.g. the case of Afrikaans past tense marking (65) [= (18)] in Sailer
(2004). This example has two tense markers and three readings. Sailer points out that
24. Semantic underspecification 555
the two instances of the past tense marker seem to merge in the first and the second
reading of (65):
(65) Jan wou gebel het
Jan want.PAST called have
'Jan wanted to call/Jan wants to have called/Jan wanted to have called'
A direct formalisation of this intuition is possible if one relates fragments in terms of
subexpressionhood, as in the underspecified analyses in the LRS framework (Richter &
Sailer 2006; Kallmeyer & Richter 2006). If constraints introduce identical fragments as
subexpressions of a larger fragment, these fragments can but need not coincide in the
solutions of the constraints.
For the readings of (18), the constraint (simplified) is (66a):
(66) a. <[PAST(y)]^, [PAST(0]c, [wanfC^ [call'(j)],, P<a, e<S, 0<S,i<y,i<£
b. PAST(wanf(j,A (call'(j))))
In prose: The two PAST- and the wa/if-fragments are subexpressions of (relation ' < ')
the whole expression (as represented by the variables a or <5), while the ca//-fragment is
a subexpression of the arguments of the PAST operators and the intcnsionaliscd second
argument of want. This constraint describes all three semantic representations in (19);
e.g., to derive (66b) [= (19a)], the following equations are needed: a=S=f}=£,y=£ =
0, and J] = /.The crucial equation here is fi = e, which equates two fragments. (Additional
machinery is needed to block unwanted readings, see Sailer 2004.)
This approach is more powerful than resource-sensitive formalisms.The price one has
to pay for this additional power is the need to control explicitly whether identical
material may coincide or not, e.g., for the analyses of negative concord in Richter & Sailer
(2006). (See also article 6 (Pagin & Westerstahl) Compositionality.)
3.5. Expressivity and compactness
The standard approach to evaluate an underspecification formalisms is to apply it to
challenging ambiguous examples, e.g., (67) [= (40)], and to check whether there is an
expression of the formalism that expresses all and only the attested readings of the
example:
(67) Every researcher of a company saw most samples
However, what if these readings are contextually restricted, or, if the sentence has only
four readings, as claimed by Kallmeyer & Romero (2008) and others, lacking the reading
(45b) with the scope ordering 3 > most7 > V?
Underspecification approaches that model scope in terms of partial order between
fragments of semantic representations run into problems already with the second of
these possibilities: Any constraint set that encompasses the four readings in which most7
has highest or lowest scope also covers the fifth reading (45b) (Ebert 2005). This means
V. Ambiguity and vagueness
that such underspecification formalisms are not expressive in the sense of Konig &
Reyle (1999) or Ebert (2005), since they cannot represent any subset of readings of an
ambiguous expression.
The formalisms are of different expressivity, e.g., approaches that model quantifier
scope by lists (such as Alshawi 1992) are less expressive than those that use dominance
relations, or scope lists together with an explicit ordering of list elements as in Fox &
Lappin's (2005) Property Theory with Curry Typing.
Fully expressive is the approach of Koller, Regneri & Thater (2008), which uses
Regular Tree Grammars for scope underspecification. Rules of these grammars expand
nonterminals into tree fragments. For instance, the rule 5 -» f (A, B) expands 5 into a
tree whose mother is labelled by /, and whose children arc the subtrees to be derived by
expanding the nonterminals A and B.
Koller, Regneri & Thater (2008) show that dominance constraints can be
translated into RTGs, e.g., the constraint (68) [= (41)] for the semantics of (40) is translated
into (69).
(68) 3>>.company' (y) a □ Vi (researcher' (x) a □) -> □ most' (sample', Xz □)
of (x,y)
see' (x,z)
(69)
{1-5}
{1-5}
{1-5}
{2-5}
{2-5}
Ml
->
->
->
->
->
->
3comp([2-5})
Vres({l-2},{4-5})
most([\^\)
Vres({2},{4-5})
most ([2-4])
V({l-2},{4})
Ml
{1-2}
12-4}
{4-5}
{2}
{4}
->
->
->
->
->
->
comp{{\\, [2-
comp([2})
Vres({2},{3})
most({4})
of
see
In (69), the fragments of (68) are addressed by numbers, 1,3, and 5 are the fragments for
the existential, universal, and most-DP, respectively, and 2 and 4 are the fragments for
o/and see. All nonterminals correspond to parts of constraints; they are abbreviated as
sequences of fragments. For instance, {2-5} corresponds to the whole constraint except
the existential fragment.
Rules of the RTG specify on the right hand side the root of the partial constraint
introduced on the left hand side, for instance, the first rule expresses wide scope of a
company over the whole sentence.The RTG (69) yields the same five solutions as (68).
In (69), the reading 3 > most7 > V can be excluded easily, by removing the
production rule {2-5} -» mo.sf({2-4}):This leaves only one expansion rule for {2-5}. Since {2-5}
emerges only as child of 3comp with widest scope, only "ires can be the root of the tree
below widest-scope Bcomp.This shows that RTGs are more expressive than dominance
constraints. (In more involved cases, restricting the set of solutions can be less simple,
however, which means that RTGs can get larger if specific solutions are to be excluded.)
This last observation points to another property of underspecification formalisms that
is interdependent with expressivity, viz., compactness : A (sometimes tacit) assumption is
24. Semantic underspecification 559
of the sentence constituent. This gives the desired flexibility to derive scopally different
semantic representations like in (27) from uniform syntactic structures like (2).
The approach of Woods (1967,1978) works similarly: Semantic contributions of DPs
are collected separately from the main semantic representation; they can be integrated
into this main representation immediately or later.
Another approach of this kind is Steedman (2007). Here non-universal quantifiers
and their scope with respect to universal quantifiers are modelled in terms of Skolem
functions. (See Kallmeyer & Romero 2008 for further discussion of this strategy.) These
functions can have arguments for variables bound by universal quantifiers to express the
fact that they are outscoped by these quantifiers. Consider e.g. the two readings of (26)
in Skolem notation:
(70) a. \/x.v/oman'(x) -» man'^A:,) a love'(oc, sk{) ('one man for all women')
b. Vac.woman'(ac) -» mm'(sk2(x)) a lo\e'(x,sk2(x)) ('a possibly different man per
woman')
For the derivation of the different readings of a scopally underspecified expression,
Steedman uses underspecified Skolem functions, which can be specified at any point
in the derivation w.r.t. their environment, viz., the /i-tuple of variables bound by
universal quantifiers so far. For (26), the semantics of a man would be represented by
XQ.Q(skolem''(man')), where skolem' is a function from properties P and environments
E to generalised skolem terms like /(£), where P holds of /(£).
The term A(2£?(skolem'(man')) can be specified at different steps in the derivation,
with different results: Immediately after the DP has been formed specification returns a
Skolem constant like skx in (70a), since the environment is still empty. After combining
the semantics of the DPs and the verb, the environment is the 1-tuple with the variable x
bound by the universal quantifier of the subject DP, and specification at that point yields
a skolem term like sk2(x).
This sketch of competing approaches to the syntax-semantics interface shows that
the functionality of this interface (or, an attempt to uphold it in spite of semantically
and syntactically homogeneous ambiguous expressions) can be a motivation for
underspecification: Functionality is preserved for such an expression directly in that there is
a function from its syntactic structure to its underspecified semantic representation that
encompasses all its readings.
4.2. Ambiguity and negation
Semantic underspecification also helps avoiding problems with disjunctive representations
of the meaning of ambiguous expressions that show up under negation: Negating an
ambiguous expression is intuitively interpreted as the disjunction of the negated
expressions, i.e., one of the readings of the expressions is denied. However, if the meaning of
the expression itself is modelled as the disjunction of its readings, the negated expression
emerges as the negation of the disjunctions, which is equivalent to the conjunction of the
negated readings, i.e., every reading of the expression is denied, which runs counter to
intuitions.
For instance, for (26) such a semantic representation can be abbreviated as (71), which
turns into (72) after negation:
24. Semantic underspecification 561
What is more, a complete disambiguation may be not even necessary. In these cases,
postponing ambiguity resolution, and resolving ambiguity only on demand makes NLP
systems more efficient.For instance,scope ambiguities arc often irrelevant for translation,
therefore it would be useless to identify the intended reading of such an expression: Its
translation into the target language would be ambiguous in the same way again.
Therefore e.g. the Verbmobil project (machine translation of spontaneous spoken dialogue;
Wahlster 2000) used a scopally underspecified semantic representation (Schiehlen 2000).
The analyses of concrete NLP systems show clearly that combinatorial explosion
is a problem for NLP that suggests the use of underspecification (pace Player 2004).
The large number of readings that are attributed to linguistic expressions are due to the
fact that, first, the number of scope-bearing constituents per expression is
underestimated (there are many more such constituents besides DPs, e.g., negation, modal verbs,
quantifying adverbials like three times or again), and, second and much worse, spurious
ambiguities come in during syntactic and semantic processing of the expressions.
For instance, Koller, Rcgneri & Thater (2008) found that 5% of the representations
in the Rondane Treebank (underspecified MRS representations of sentences from the
domain of Norwegian tourist information, distributed as part of the English Resource
Grammar, Copestake & Flickinger 2000) have more than 650 000 solutions, record
holder is the rather innocuous looking (74) with about 4.5 x 1012 scope readings:
(74) Myrdal is the mountain terminus of the Flam rail line (or Flamsbana) which makes
its way down the lovely Flam Valley (Flamsdalen) to its sea-level terminus at
Flam.
The median number of scope readings per sentence is 56 (Koller, Regneri & Thater
2008), so, short of applying specific measures to eliminate spurious ambiguities (see
section 6.2), combinatorial explosion definitely is a problem for semantic analysis
in NLP
In recent years, underspecification has turned out very useful for NLP in another
way, viz., in that underspecified semantics provides an interface bridging the gap
between deep and shallow processing. To combine the advantages of both kinds of
processing (accuracy vs. robustness and speed), both can be combined in NLP
applications {hybrid processing). The results of deep and shallow syntactic processing can
straightforwardly be integrated on the semantic level (instead of combining the results
of deep and shallow syntactic analyses). An example for an architecture for hybrid
processing is the 'Heart of Gold' developed in the project 'DeepThought' (Callmeier
etal.2004).
Since shallow syntactic analyses provide only a part of the information to be gained
from deep analysis, the semantic information derivable from the results of a shallow
parse (e.g., by a part-of-speech tagger or an NP chunker) can only be a part of the one
derived from the results of a deep parse. Underspecification formalisms can be used to
model this kind of partial information as well.
For instance, deep and shallow processing may yield different results with respect to
argument linking: NP chunkers (as opposed to systems of deep processing) do not relate
verbs and their syntactic arguments, e.g., experiencer and patient in (75). Any semantic
analysis based on such a chunker will thus fail to identify individuals in NP and verb
semantics as in (76):
562
V. Ambiguity and vagueness
(75) Max saw Mary
(76) named(ac,, Max), see(ac2,*3), named(ac4, Mary)
Semantic representations of different depths must be compatible in order to combine
results from parallel deep and shallow processing or to transform shallow into deep
semantic analyses by adding further pieces of information. Thus, the semantic
representation formalism must be capable of separating the semantic information from different
sources appropriately. For instance, information on argument linking should be listed
separately, thus, a full semantic analysis of (75) should look like (77) rather than (78).
Robust MRS (Copcs-takc 2003) is an underspecification formalism that was designed to
fulfill this demand:
(77) named(ac1,Max),see(ac2,ac3),named(ac4,Mary),*, = x2,x3 = x4
(78) named(acI,Max),see(acI,ac4), named(ac4, Mary)
4.4. Semantic construction
Finally, underspecification formalisms turn out to be interesting from the perspective of
semantic construction in general, independently of the issue of ambiguity. This interest
is based on two properties of these formalisms, viz., their portability and their flexibility.
First, underspecification formalisms do not presuppose a specific syntactic
analysis (which would do a certain amount of preprocessing for the mapping from syntax
to semantics, like the mapping from surface structure to Logical Form in Generative
Grammar). Therefore the syntax-semantics interface can be defined in a very
transparent fashion, which makes the formalisms very portable in that they can be coupled
with different syntactic formalisms. Tab. 24.1 lists some of the realised combinations of
syntactic and semantic formalisms.
Second, the flexibility of the interfaces that are needed to derive underspecified
representations of ambiguous expressions is also available for unambiguous cases that pose a
challenge for any syntax-semantics interface. For instance, semantic construction for the
modification of modifiers and indefinite pronouns like everyone is a problem, because
the types of functor (semantics of the modifier) and argument (semantics of the modified
expression) do not fit: For instance, in (79) the PP semantics is a function from properties
to properties, the semantics of the pronoun as well as the one of the whole modification
structure are sets of properties.
Tab. 24.1: Realised couplings of underspecification formalisms and syntax formalisms
HPSG LFG (L)TAG
MRS Copestakc ct al. (2005) Oepen et al. (2004) Kallineyer & Joshi (1999)
GLS Asudch& Crouch Dalrymple (2001) Frank & van Genabith
(2001) (2001)
UDRT Frank & Reyle (1995) van Genabith & Crouch Cimiano & Rcyle (2005)
(1999)
HS Chaves (2002) Kallineyer & Joshi (2003)
564 V. Ambiguity and vagueness
the main DP fragment. The top fragments (holes that determine the scope of the DP,
because the top fragment of a constituent always dominates all its other fragments)
of DP, determiner, and NP are identical. ('SSI' indicates that it is a rule of the
syntax-semantics interface.)
The main fragment of a VP (of a sentence) emerges by applying the main verb (VP)
fragment to the main fragment of the object (subject) DP.The top fragments of the verb
(VP) and its DP argument are identical to the one of the emerging VP (S):
(SSI)
(81) [vp V DP] => |VPl: |V](|DP]); IVPj = |V J = |DPJ
(SSI)
(82) [s DP VP] => |S]: |VP](|DP]); |S J = |DP J = |VP J
We assume that for all lexical entries, the main fragments are identical to the standard
semantic representation (e.g., for every, we get [[Det]]: \Q\PVx.Q{x) -» P(x)), and the
top fragments arc holes dominating the main fragments. If in unary projections like
the one of man from N toN and NP main and top fragments are merely inherited, the
semantics of a man emerges as (83):
By. man' (y) a p
[DP]: y
The crucial point is the decision to let the bound variable be the main fragment in the
DP semantics.The intermediate DP fragment between top and main fragment is ignored
in further processes of semantic construction. Combining (83) with the semantics of the
verb yields (84):
(84)[VPJ: D
3y. man'(_y) a n
flVP]: love'OO
Finally, the semantics of every woman, which is derived in analogy to (83), is combined
with (84) through rule (82). According to this rule, the two top fragments are identified
and the two main fragments are combined by functional application into the main S
fragment, but the two intermediate fragments, which comprise the two quantifiers, are
not addressed at all, and hence remain dangling in between. The result is the desired
dominance diamond:
566 V. Ambiguity and vagueness
Hurum's (1988) algorithm, the CLE resolution algorithm (Moran 1988;Alshawi 1992),
and Chaves' (2003) extension of Hole Semantics detect specific cases of equivalent
solutions (e.g., when one existential quantifier immediately dominates another one) and
block all but one of them.The blocking is only effective once the solutions are
enumerated. In contrast, Roller & Thater (2006) and Roller, Regneri & Thater (2008) present
algorithms to reduce spurious ambiguities that map underspecified representations on
(more restricted) underspecified representations.
6.3. Reasoning with underspecified representations
Sometimes fully specified information can be deduced from an underspecified semantic
representation. For instance, if Amdlic is a woman, then (26) allows us to conclude
that she loves a man, because this conclusion is valid no matter which reading of (26)
is chosen. For UDRT (Konig & Reyle 1999; Reyle 1992; Reyle 1993; Reyle 1996) and
APL (Jaspars & van Eijck 1996), there are calculi for such reasoning with underspecified
representations. Van Deemter (1996) discusses different kinds of consequence relations
for this reasoning.
6.4. Integration of preferences
In many cases of scope ambiguity, the readings are not on a par in that some are more
preferred than others. Consider e.g. a slight variation of (26), here the 3V-reading is
preferred over the V3-reading:
(86) A woman loves every man
One could integrate these preferences into underspecified representations of scopally
ambiguous expressions to narrow down the number of its readings or to order the
generation of solutions (Alshawi 1992).
6.4.1. Kinds of preferences
The preferences discussed in the literature can roughly be divided into three groups.
The first group are based on syntactic structure, starting with Johnson-Laird (1969) and
Lakoff's (1971) claim that surface linear order or precedence introduces a preference
for wide scope of the preceding scope-bearing item (but see e.g.Villalta 2003 for
experimental counterevidence). Precedence can be interpreted in terms of a syntactic
configuration such as c-command (e.g., VanLehn 1978), since in a right-branching binary
phrase-marker preceding constituents c-command the following ones.
However, these preferences are not universally valid: Kurtzman & MacDonald (1993)
report a clear preference for wide scope of the embedded DP in the case of nested
quantification as in (87). Here the indefinite article precedes (and c-commands) the embedded
DP, but the V3-reading is nevertheless preferred:
(87) I met a researcher from every university
Hurum (1988) and VanLehn (1978) make the preference of scope-bearing items to
take scope outside the constituent they are directly embedded in also dependent on the
24. Semantic underspecification 567
category of that constituent (e.g., much stronger for items inside PPs than items inside
infinite clauses).
The scope preference algorithm of Gamback & Bos (1998) gives scope-bearing
nonheads (complements and adjuncts) in binary-branching syntactic structures immediate
scope over the respective head.
The second group of preferences focus on grammatical functions and thematic roles.
Functional hierarchies have been proposed that indicate preference to take wide scope
in loup (1975) (88a) and VanLehn (1978) (88b):
(88) a. topic > deep and surface subject > deep subject or surface subject > indirect
object > prepositional object > direct object
b. preposed PP, topic NP > subject > (complement in) sentential or adverbial PP >
(complement in) verb phrase PP > object
While loup combines thematic and functional properties in her hierarchy (by including the
notion of'deep subject'), Pafel (2005) distinguishes grammatical functions (only subject
and sentenctial adverb) and thematic roles (strong and weak patienthood) explicitly.
There is a certain amount of overlap between structural preferences and the
functional hierarchies, at least in a language like English: Here DPs higher on the functional
hierarchy also tend to c-command DPs lower on the hierarchy, because they are more
likely to surface as subjects.
The third group of preferences addresses the quantifiers (or, the determiners
expressing them) themselves. loup (1975) and VanLehn (1978) introduce a hierarchy of
determiners:
(89) each > every > all > most > many > several > some (plural) > a few
CLE incorporates such preference rules, too (Moran 1988; Alshawi 1992), e.g., the rule
that each outscopes other determiners, and that negation is outscoped by some and
outscopes every.
Some of these preferences can be put down to a more general preference for logically
weaker interpretations, in particular, the tendency of universal quantifiers to outscope
existential ones (recall that the VB-reading of sentences like (26) is weaker than the
3V-rcading; VanLehn 1978; Moran 1988; Alshawi 1992). Similarly, scope of the negation
above every and below some returns existential statements, which are weaker than the
(unpreferred) alternative scopings (universal statements) in that they do not make a
claim about the whole domain.
Pafel (2005) lists further properties, among them focus and discourse binding (whether
a DP refers to an already established set of entities, as e.g. in few of the books as opposed
to few books).
6.4.2. Interaction of preferences
It has been argued that the whole range of quantifier scope effects can only be accounted
for in terms of an interaction of different principles.
Fodor (1982) and Hurum (1988) assume an interaction between linear precedence
and the determiner hierarchy, which is corroborated by experimental results of Filik,
24. Semantic underspecification 569
Asher, Nicholas & Alex Lascarides 2003. Logics of Conversation. Cambridge: Cambridge
University Press.
Asudeh, Ash & Richard Crouch 2002. Glue semantics for HPSG. In: F. van Eynde, L. Hellan & D.
Beermann (eds.). Proceedings of the 8th International Conference on Head-Driven Phrase
Structure Grammar. Stanford, CA: CSLI Publications, 1-19.
Babko-Malaya, Olga 2004. LTAG semantics of NP-coordination. In: Proceedings of the 7th
International Workshop on Tree Adjoining Grammars and Related Formalisms. Vancouver, 111-117.
Bierwisch. Manfred 1983. Semantische und konzeptionelle Representation lexikalischer Einheiten.
In: R. Ruzi£ka & W. Motsch (eds.). Untersuchungen zur Semantik. Berlin: Akademie Verlag,
61-99.
Bierwisch, Manfred 1988. On the grammar of local prepositions. In: M. Bierwisch, W. Motsch & I.
Zimmermann (eds.). Syntax, Semantik und das Lexikon. Berlin: Akademie Verlag, 1-65.
Bierwisch, Manfred & Ewald Lang (eds.) 1987. Grammatische und konzeptuelle Aspekte von
Dimensionsadjektiven. Berlin: Akademie Verlag.
Blackburn, Patrick & Johan Bos 2005. Representation and Inference for Natural Language. A First
Course in Computational Semantics. Stanford, CA: CSLI Publications.
Bos, Johan 1996. Predicate logic unplugged. In: P. Dekker & M. Stokhof (eds.). Proceedings of the
10th Amsterdam Colloquium. Amsterdam: ILLC. 133-142.
Bos, Johan 2004. Computational semantics in discourse. Underspecification, resolution, and
inference. Journal of Logic, Language and Information 13,139-157.
Brants,Thorsten. Wojciech Skut & Hans Uszkoreit 2003. Syntactic annotation of a German
newspaper corpus. In: A. Abeille' (ed.). Treebanks. Building and Using Parsed Corpora. Dordrecht:
Kluwer, 73-88.
Bronnenberg, Wim, Harry Bunt, Jan Landsbergen, Remko Scha, Wijnand Schoeninakers & Eric
van Utteren 1979. The question answering system PHLIQA1. In: L. Bole (ed.). Natural
Language Question Answering Systems. London: Macmillan, 217-305.
Bunt, Harry 2007. Underspecification in semantic representations. Which technique for what
purpose? In: H. Bunt & R. Muskens (eds.). Computing Meaning, vol. 3. Amsterdam: Springer,
55-85.
Callmeier, Ulrich, Andreas Eisele, Ulrich Schafer & Melanie Siegel 2004. The DeepThought core
architecture framework. In: M.T. Lino ct al. (eds.). Proceedings of the 4th International
Conference on Language Resources and Evaluation (= LREC). Lisbon: European Language Resources
Association, 1205-1208.
Chaves, Rui 2002. Principle-based DRTU for HSPG. A case study. In: Proceedings of the 1st
International Workshop on Scalable Natural Language Understanding (= ScaNaLu). Heidelberg: EML.
Chaves, Rui 2003. Non-Redundant scope disambiguation in undeispecified semantics. In: B. ten
Cate (ed.). Proceedings of the 8th EESLLI Student Session. Vienna, 47-58.
Chaves, Rui 2005a. DRT and underspecification of plural ambiguities. In: Proceedings of the 6th
International Workshop in Computational Semantics. Tilburg University, 78-89.
Chaves, Rui 2005b. Underspecification and NP coordination in constraint-based grammar. In: F.
Richter & M. Sailer (eds.). Proceedings of the ESSLLI'05 Workshop on Empirical Challenges
and Analytical Alternatives to Strict Compositionality. Edinburgh: Heriot-Watt University, 38-58.
Cimiano, Philipp & Uwe Reyle 2005. Talking about trees, scope and concepts. In: H. Bunt,
J. Geertzen & E.Thijse (eds.). Proceedings of the 6th International Workshop in Computational
Semantics. Tilburg:Tilburg University, 90-102.
Cooper, Robin 1983. Quantification and Syntactic Theory. Dordrecht: Rcidcl.
Copcstakc, Ann & Dan Flickingcr 2000. An open-source grammar development environment
and broad-coverage English grammar using HPSG. In: Proceedings of the 2nd International
Conference on Language, Resources and Evaluation (= LREC). Athens: European Language
Resources Association, 591-600.
Copestake, Ann, Dan Flickinger, Carl Pollard & Ivan Sag 2005. Minimal recursion semantics. An
introduction. Research on Language and Computation 3,281-332.
570 V. Ambiguity and vagueness
Crouch, Richard 1995. Ellipsis and quantification. A substitutional approach. In: Proceedings of
the 7th Conference of the European Chapter of the Association for Computational Linguistics
(= EACL). Dublin: Association for Computational Linguistics, 229-236.
Crouch, Richard & Josef van Genabith 1999. Context change, underspecification and the structure
of Glue Language derivations. In: M. Dairymple (ed.). Semantics and Syntax in Lexical
Functional Grammar. The Resource Logic Approach. Cambridge, MA:The MIT Press, 117-189.
Dairy mple, Mary 2001. Lexical Functional Grammar. New York: Academic Press.
Dalrymple, Mary, John Lamping, Fernando Pereira & Vijay Saraswat 1997. Quantifiers, anaphora,
and intensionality. Journal of Logic, Language and Information 6,219-273.
Dalrymple, Mary, Stuart Shieber & Fernando Pereira 1991. Ellipsis and higherorder unification.
Linguistics & Philosophy 14,399-452.
van Deemter, Kees 1996.Towards a logic of ambiguous expressions. In: K. van Deemter & S. Peters
(eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications, 203-237.
Dolling, Johannes 1995. Ontological domains, semantic sorts and systematic ambiguity.
International Journal of Human-Computer Studies 43,785-807.
Duchicr, Dcnys & Claire Gardcnt 2001. Tree descriptions, constraints and incrcinentality. In: H.
Bunt, R. Muskens & E.Thijsse (eds.). Computing Meaning, vol. 2. Dordrecht: Kluwer, 205-227.
Ebert, Christian 2005. Formal Investigations of Underspecified Representations. Ph.D. dissertation.
King's College, London.
Egg, Markus 2004. Mismatches at the syntax-semantics interface. In: S. Miiller (ed.). Proceedings
of the 11th International Conference on Head-Driven Phrase Structure Grammar. Stanford, CA:
CSLI Publications, 119-139.
Egg, Markus 2005. Flexible Semantics for Reinterpretation Phenomena. Stanford, CA: CSLI
Publications.
Egg, Markus 2006. Anti-Ikonizitat an der Syntax-Semantik-Schnittstelle. Zeitschrift fur Sprachwis-
senschaft 25,1-38.
Egg, Markus 2007. Against opacity. Research on Language and Computation 5,435-455.
Egg, Markus & Michael Kohlhase 1997. Dynamic control of quantifier scope. In: P. Dekker, M.
Stokhof & Y. Venema (eds.). Proceedings of the 11th Amsterdam Colloquium. Amsterdam:
ILLC, 109-114.
Egg, Markus, Alexander Roller & Joachim Nichrcn 2001. The constraint language for lambda-
structures. Journal of Logic, Language and Information 10,457-485.
Egg, Markus & Gisela Redeker 2008. Underspecified discourse representation. In: A. Benz &
P. Kuhnlein (eds.). Constraints in Discourse. Amsterdam: Benjamins, 117-138.
van Eijck, Jan & Manfred Pinkal 1996. What do underspecified expressions mean? In: R. Cooper
et al. (eds.). Building the Framework. FraCaS Deliverable D 15, section 10.1. Edinburgh:
University of Edinburgh. 264-266.
Filik, Ruth, Kevin Patei son & Simon Liversedge 2004. Processing doubly quantified sentences.
Evidence from eye movements. Psychonomic Bulletin and Review 11,953-959.
Fodor, Janet 1982. The mental representation of quantifiers. In: S. Peters & E. Saarinen (eds.).
Processes, Beliefs, and Questions. Dordrecht: Reidel, 129-164.
Fox, Chris & Shalom Lappin 2005. Underspecified interpretations in a Curry-typed representation
language. Journal of Logic and Computation 15,131-143.
Frank, Annette & Josef van Genabith 2001. GlueTag. Linear logic based semantics for LTAG - and
what it teaches us about LFG and LTAG. In: M. Butt &T Holloway King (eds.). Proceedings of
the LFG '01 Conference. Stanford, CA: CSLI Publications, 104-126.
Frank, Annette & Uwe Reyle 1995. Principle-based semantics for HPSG. In: Proceedings of the
7th Conference of the European Chapter of the Association for Computational Linguistics
(= EACL). Dublin: Association for Computational Linguistics, 9-16.
Fuchss, Ruth, Alexander Roller, Joachim Niehren & Stefan Thater 2004. Minimal recursion
semantics as dominance constraints. Translation, evaluation, and analysis. In: Proceedings of the 42nd
Annual Meeting of the Association for Computational Linguistics (= ACL). Barcelona:
Association for Computational Linguistics, 247-254.
24. Semantic underspecification 571
Gamback.Bjorn & Johan Bos 1998. Semantic-head based resolutions of scopal ambiguities. In:
Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual
Meeting of the Association for Computational Linguistics (= ACL). Montreal: Association for
Computational Linguistics, 433-437.
Gawron, Jean Mark & Stanley Peters 1990. Anaphora and Quantification in Situation Semantics.
Menlo Park, CA: CSLI Publications.
van Genabith, Josef & Richard Crouch 1999. Dynamic and underspecified semantics for LFG. In:
M. Dalrymple (ed.). Semantics and Syntax in Lexical Functional Grammar. The Resource Logic
Approach. Cambridge, MA:The MIT Press, 209-260.
Harris, John 2007. Representation. In: P. de Lacy (ed.). The Cambridge Handbook of Phonology.
Cambridge: Cambridge University Press, 119-137.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Blackwell.
Hendriks, Herman 1993. Studied Flexibility. Categories and Types in Syntax and Semantics.
Amsterdam: ILLC.
Hirschbiihler, Paul 1982. VP deletion and across the board quantifier scope. In: J. Pustejovsky &
P. Sells (eds.). Proceedings of the North Eastern Linguistics Society (= NELS) 12. Amherst, MA:
GLSA. 132-139.
Hobbs, Jerry & Stuart Shieber 1987. An algorithm for generating quantifier scoping. Computational
Linguistics 13,47-63.
Hobbs, Jerry, Mark Stickel, Douglas Appelt & Paul Martin 1993. Interpretation as abduction.
Artificial Intelligence 63,69-142.
Hodges, Wilfrid 2001. Formal features of compositionality. Journal of Logic, Language and
Information 10,7-28.
Hoeksema.Jack 1985. Categorial Morphology. New York: Garland.
Huet, Gerard P. 1975. A unification algorithm for typed X-calculus. Theoretical Computer Science
1,27-57.
Hurum, Sven 1988. Handling scope ambiguities in English. In: Proceedings of the 2nd Conference
on Applied Natural Language Processing (= AN LP). Morristown, NJ: Association for
Computational Linguistics, 58-65.
Ioup, Georgette 1975. Some universals for quantifier scope. In: J. Kimball (ed.). Syntax and
Semantics 4. New York: Academic Press, 37-58.
Jaspars, Jan & Jan van Eijck 1996. Underspecification and reasoning. In: R. Cooper et al. (eds.).
Building the Framework. FraCaS Deliverable D 15, section 10.2, 266-287.
Johnson-Laird, Philip 1969. On understanding logically complex sentences. Quarterly Journal of
Experimental Psychology 21,1-13.
Kallmeyer, Laura & Aravind K. Joshi 1999. Factoring predicate argument and scope semantics.
Underspecified semantics with LTAG. In: P. Dekker (ed.). Proceedings of the 12th Amsterdam
Colloquium. Amsterdam: ILLC, 169-174.
Kallmeyer, Laura & Aravind K. Joshi 2003. Factoring predicate argument and scope semantics.
Underspecified semantics with LTAG. Research on Language and Computation 1,3-58.
Kallmeyer, Laura & Frank Richter 2006. Constraint-based computational semantics. A comparison
between LTAG and LRS. In: Proceedings of the 8th International Workshop on Tree-Adjoining
Grammar and Related Formalisms. Sydney: Association for Computational Linguistics,
109-114.
Kallmeyer, Laura & Maribel Romero 2008. Scope and situation binding in LTAG using semantic
unification. Research on Language and Computation 6,3-52.
Keller, William 1988. Nested Cooper storage. The proper treatment of quantification in ordinary
noun phrases. In: U. Reyle & Ch. Rohrer (eds.). Natural Language Parsing and Linguistic
Theories. Dordrecht: Reidel, 432^47.
Kempson, Ruth & Annabel Cormack 1981. Ambiguity and quantification. Linguistics &
Philosophy 4,259-309.
Roller, Alexander, Joachim Niehren & Stefan Thater 2003. Bridging the gap between
underspecification formalisms. Hole semantics as dominance constraints. In: Proceedings of the
572 V. Ambiguity and vagueness
11th Conference of the European Chapter of the Association for Computational Linguistics
(= EACL). Budapest: Association for Computational Linguistics, 195-202.
Roller, Alexander, Michaela Regneri & StefanThater 2008. Regular tree grammars as a formalism for
scope underspecification. In: Proceedings of the 46th Annual Meeting of the Association for
Computational Linguistics (= ACL). Columbus, OH: Association for Computational Linguistics, 218-226.
Roller, Alexander & Stefan Thater 2005. The evolution of dominance constraint solvers. In:
Proceedings of the ACL Workshop on Software. Ann Arbor, MI: Association for Computational
Linguistics, 65-76.
Roller, Alexander & Stefan Tliater 2006. An improved redundancy elimination algorithm for
underspecified descriptions. In: Proceedings of the 21st International Conference on
Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics
(=ACL). Sydney: Association for Computational Linguistics, 409-416.
Ronig, Esther & Uwe Reyle 1999. A general reasoning scheme for underspecified representations.
In: H.J. Ohlbach & U. Reyle (eds.). Logic, Language and Reasoning. Essays in Honour of Dov
Gabbay. Dordrecht: Rluwer. 1-28.
Rratzcr, Angclika 1998. Scope or pscudoscope? Arc there wide-scope indefinites? In: S. Rothstein
(ed.). Events and Grammar. Dordrecht: Rluwer. 163-196.
Kurtzman. Howard & Maryellen MacDonald 1993. Resolution of quantifier scope ambiguities.
Cognition 48,243-279.
Lakoff, George 1971. On generative semantics. In: D. Steinberg & L. Jakobovits (eds.). Semantics.
An Interdisciplinary Reader in Philosophy, Linguistics and Psychology. Cambridge: Cambridge
University Press, 232-296.
Larson, Richard 1998. Events and modification in nominals. In: D. Strolovitch & A. Lawson (eds.).
Proceedings of Semantics and Linguistic Theory (= SALT) VIII. Ithaca, NY: CLC Publications,
145-168.
Larson, Richard & Sungeon Cho 2003. Temporal adjectives and the structure of possessive DPs.
Natural Language Semantics 11,217-247.
Lev, Iddo 2005. Decoupling scope resolution from semantic composition. In: H. Bunt, J. Geertzen
& E.Thijse (eds.). Proceedings of the 6th International Workshop on Computational Semantics.
Tilburg:Tilburg University, 139-150.
Link, Godchard 1983. The logical analysis of plurals and mass terms. A lattice-theoretic approach.
In: R. Bauerle, Ch. Schwarze & A. von Stechow (eds.). Meaning, Use, and the Interpretation of
Language. Berlin: Mouton de Gruyter, 303-323.
Marcus, Mitchell, Donald Hindle & Margaret Fleck 1983. D-theory. Talking about talking about
trees. In: Proceedings of the 21st Annual Meeting of the Association for Computational
Linguistics (= ACL). Cambridge, MA: Association for Computational Linguistics, 129-136.
Montague, Richard 1974. Formal Philosophy. Selected Papers of Richard Montague. Edited and
with an introduction by Richmond H.Thomason. New Haven, CT: Yale University.
Moran, Douglas 1988. Quantifier scoping in the SRI core language engine. In: Proceedings of the
26th Annual Meeting of the Association for Computational Linguistics (= ACL). Buffalo, NY:
Association for Computational Linguistics, 33-40.
Muskens, Reinhard 2001. Talking about trees and truth-conditions. Journal of Logic, Language and
Information 10,417-455.
Nerbonne, John 1993. A feature-based syntax/semantics interface. Annals of Mathematics and
Artificial Intelligence 8.107-132.
Niehrcn, Joachim, Manfred Pinkal & Peter Ruhrbcrg 1997. A uniform approach to
underspecification and parallelism. In: Proceedings of the 35th Annual Meeting of the Association for
Computational Linguistics (= ACL). Madrid: Association for Computational Linguistics, 410-417.
Oepen, Stephan, Helge Dyvik, Jan Tore L0nning, Erik Velldal, Dorothee Beermann, John
Carroll, Dan Flickinger, Lars Hellan, Janni Bonde Johannessen, Paul Meurer, Torbj0rn Nor-
dgard & Victoria Rosin 2004. Som a kapp-ete med trollet? Towards MRS-based Norwegian-
English machine translation. In: Proceedings of the 10th International Conference on Theoretical
and Methodological Issues in Machine Translation (= TMI). Baltimore, MD, 11-20.
24. Semantic underspecification 573
Pafel, Jiirgen 2005. Quantifier Scope in German. Amsterdam: Benjamins.
Park, Jong 1995. Quantifier scope and constituency. In: Proceedings of the 33rd Annual Meeting of
the Association for Computational Linguistics (= ACL). Cambridge, MA: Association for
Computational Linguistics, 205-212.
Pinkal, Manfred 1995. Logic and Lexicon: The Semantics of the Indefinite. Dordrecht: Kluwer.
Pinkal, Manfred 1996. Radical underspecification. In: P. Dekker & M. Stokhof (eds.). Proceedings
of the 10th Amsterdam Colloquium. Amsterdam: ILLC, 587-606.
Pinkal, Manfred 1999. On semantic underspecification. In: H. Bunt & R. Muskens (eds.).
Computing Meaning, vol. 1. Dordrecht: Kluwer, 33-55.
Player, Nickie 2004. Logics of Ambiguity. Ph.D. dissertation. University of Manchester.
Poesio, Massimo 1996. Semantic ambiguity and perceived ambiguity. In: K. van Dee inter & S. Peters
(eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI Publications. 159-201.
Poesio, Massimo, Patrick Sturt, Ron Artslein & Ruth Filik 2006. Underspecification and anaphora.
Theoretical issues and preliminary evidence. Discourse Processes 42,157-175.
Pulman, Stephen G. 1997. Aspectual shift as type coercion. Transactions of the Philological Society
95,279-317.
Rambow, Owen, David Weir & Vijay K. Shanker 2001. D-tree substitution grammars.
Computational Linguistics 27,89-121.
Rapp, Irene & Arniin von Stechow 1999. Fast 'almost' and the visibility parameter for functional
adverbs. Journal of Semantics 16,149-204.
Regneri, Michaela, Markus Egg & Alexander Roller 2008. Efficient processing of underspecified
discourse representations. In: Proceedings of the 46th Annual Meeting of the Association for
Computational Linguistics (= ACL), Short Papers. Columbus, OH: Association for
Computational Linguistics, 245-248.
Reyle, Uwe 1992. Dealing with ambiguities by underspecification. A first order calculus for
unscoped representations. In: M. Stokhof & P. Dekker (eds.). Proceedings of the 8th Amsterdam
Colloquium. Amsterdam: ILLC, 493-512.
Reyle, Uwe 1996. Co-indexing labeled DRSs to represent and reason with ambiguities. In: K. van
Deemter & S. Peters (eds.). Semantic Ambiguity and Underspecification. Stanford, CA: CSLI
Publications, 239-268.
Richtcr, Frank & Manfred Sailer 1996. Syntax fur cine untcrspczifizicrtc Scmantik. PP-Anbindung
in einem deutschen HPSG-Fragment. In: S. Mehl, A. Mertens & M. Schulz (eds.). Pr'dpositional-
semantik und PP-Anbindung. Duisburg: Institut fur Informatik, 39-47.
Richter, Frank & Manfred Sailer 2006. Modeling typological markedness in semantics.The case of
negative concord. In: S. Miiller (ed.). Proceedings of the HPSG 06 Conference. Stanford, CA:
CSLI Publications, 305-325.
Robaldo, Livio 2007. Dependency Tree Semantics. Ph.D. disseration. University of Torino.
Sag, Ivan 1976. Deletion and Logical Form. Ph.D. dissertation. MIT, Cambridge, MA.
Sailer. Manfred 2000. Combinatorial Semantics and Idiomatic Expressions in Head-Driven Phrase
Structure Grammar. Doctoral dissertation. University of Tubingen.
Sailer, Manfred 2004. Past tense marking in Afrikaans. In: C. Meier & M. Weisgerber (eds.).
Proceedings of Sinn und Bedeutung 8. Konstanz: University of Konstanz, 233-248.
Sailer, Manfred 2006. Don't believe in underspecified semantics. In: O. Bonami & P. Cabredo
Hoflierr (eds.). Empirical Issues in Formal Syntax and Semantics, vol. 6. Paris: CSSP, 375-403.
Schiehlen, Michael 2000. Semantic construction. In: W. Wahlster (ed.). Verbmobil. Foundations of
Speech-to-Speech Translation. Berlin: Springer, 200-216.
Schildcr, Frank 2002. Robust discourse parsing via discourse markers, topicality and position.
Natural Language Engineering 8,235-255.
Schubert, Lenhart & Francis Pelletier 1982. From English to logic. Context-free computation of
'conventional' logical translation. American Journal of Computational Linguistics 8,26-44.
Shieber, Stuart, Fernando Pereira & Mary Dalrymple 1996. Interaction of scope and ellipsis.
Linguistics & Philosophy 19,527-552.
Steedman, Mark 2000. The Syntactic Process. Cambridge, MA:The MIT Press.
574 V. Ambiguity and vagueness
Steedman, Mark 2007. Surface-Compositional Scope-Alternation without Existential Quantifiers.
Draft 5.2., September 2007. Ms. Edinburgh, ilcc/University of Edinburgh, http://www.iccs.
informatics.ed.ac.uk/~steedman/papers.html, August 25,2009.
Steriade.Donka 1995. Underspecification and markedness. In: J. Goldsmith (ed.). The Handbook of
Phonological Theory. Oxford: Blackwell, 114-174.
de Swart, Henriette 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16,
347-385.
VanLehn, Kurt 1978. Determining the Scope of English Quantifiers. MA thesis. MIT, Cambridge,
MA. MITTechnical Report AI-TR-483.
Villalta, Elisabeth 2003. The role of context in the resolution of quantifier scope ambiguities. Journal
of Semantics 20,115-162.
Wahlster, Wolfgang (ed.) 2000. Verbmobil. Foundations of Speech-to-Speech Translation. Berlin:
Springer.
Westerstahl, Dag 1998. On mathematical proofs of the vacuity of compositionality. Linguistics &
Philosophy 21,635-643.
Woods, William 1967. Semantics for a Question-Answering System. Ph.D. dissertation. Harvard
University, Cambridge, MA.
Woods, William 1978. Semantics and quantification in natural language question answering. In: M.
Yovits (ed.). Advances in Computers, vol. 17. New York: Academic Press, 2-87.
Markus Egg, Berlin (Germany)
25. Mismatches and coercion
1. Enriched type theories
2. Type coercion
3. Aspect shift and coercion
4. Cross-linguistic implications
5. Conclusion
6. References
Abstract
The principle of compositionality of meaning is the foundation of semantic theory. With
function application as the main rule of combination, compositionality requires complex
expressions to be interpreted in terms of function-argument structure. Type theory lends
itself to an insightful representation of many combinations of a functor and its argument.
But not all well-formed linguistic combinations are accounted for within classical type
theory as adopted by Montague Grammar. Conflicts between the requirements of a functor
and the properties of its arguments are described as type mismatches. Three solutions to
type mismatches have been proposed within enriched type theories, namely type raising,
type shifting, and type coercion. This article focusses on instances of type coercion. We
provide examples, propose lexical and contextual restrictions on type coercion, and discuss
the status of coercion as a semantic enrichment operation. The paper includes a special
application of the notion of coercion in the domain of tense and aspect. Aspectual coercion
affects the relationship between predicative and grammatical aspect. We treat examples
from English and Romance languages in a cross-linguistic perspective.
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 574-597
25. Mismatches and coercion
575
1. Enriched type theories
1.1. Type mismatches
According to the principle of compositionality of meaning, the meaning of a complex
whole is a function of the meaning of its composing parts and the way these parts are put
together. With function application as the main rule for combining linguistic expressions,
compositionality requires that complex expressions be interpreted in terms of function-
argument structure. Functors do not combine with just any argument. They look for
arguments with particular syntactic and semantic properties, A type mismatch arises
when the properties of the argument do not match the requirements of the functor. Such
a type mismatch can lead to ungrammaticalities. Consider the ill-formed combination eat
laugh, where the transitive verb eat wants to take an object of category NP, and does not
combine with the intransitive verb laugh of category VP, In type-theoretical terms, eat is
an expression of type (e, (e,t)), which requires an expression of type e as its argument.
Given that laugh is of type (e,t), there is a mismatch between the two expressions, so they
do not combine by function application, and the combination eat laugh is ungrammatical.
In this article wc do not focus on syntactic incompatibilities, but on semantic type
mismatches and ways to resolve conflicts between the requirements of a functor and
the properties of its argument. Sections 1,2 and 1.3 briefly refer to type raising and type
shifting as operations defined within an enriched type theoretical framework. Section 2
addresses type coercion, as introduced by Pustejovsky (1995), Section 3 treats aspectual
coercion within a semantic theory of tense and aspect.
1.2. Type raising
As we saw with the ill-formed combination eat laugh, a type mismatch can lead to
ungrammaticalities. Not all type mismatches have such drastic consequences. Within
extended versions of type theory, mechanisms of type-raising have been formulated, that
deal with certain type mismatches (cf, Hendriks 1993), Consider the VPs kiss Bill and
kiss every boy, which are both well-formed expressions of type (e,t).The transitive verb
kiss is an expression of type (e,(e,t)). In the VP kiss Bill, the transitive verb directly
combines with the object of type e. However, the object every boy is not of type e, but of type
((e,t),t),The mismatch is resolved by raising the type of the argument of the transitive
verb. If wc assume that a transitive verb like kiss can take cither proper names or
generalized quantifiers as its argument, we interpret kiss as an expression of type (e, (e,t)), or
type (((e,t),t), (e,t)). Argument raising allows the transitive verb kiss to combine with the
object every boy by function application.
1.3. Type shifting
Type raising affects the domain in which the expression is interpreted, but maintains
function application as the rule of combination, so it preserves compositionality. For
other type mismatches, rules of type-shifting have been proposed, in particular in the
interpretation of indefinites (Partee 1987), In standard Montague Grammar, NPs have
a denotation in the domain of type e (e.g. proper names) and/or type ((e,t),t)
(generalized quantifiers). In certain environments, the indefinite seems to behave more like a
576 V, Ambiguity and vagueness
predicate (type e,t) than like an argument. We find the predicative use of indefinites in
predicative constructions {Sue is a lawyer, She considers him a fool), in measurement
constructions {This baby weighs seven pounds), with certain 'light' verbs {This house has
many windows), and in existential constructions {There is a book on the table, (see article
69 (McNally) Existential sentences). An interpretation of indefinites as denoting
properties makes it possible to treat predicative and existential constructions as functors taking
arguments of type (e,t) (see article 85 (de Hoop) Type shifting, and references therein).
The type-shifting approach raises problems for the role of indefinites in the object
position of regular transitive verbs, though. Assuming that we interpret the verb eat as an
expression of type (c,(c,t)), and the indefinite an apple as an expression of type (e,t),
wc would not know how to combine the two in eat an apple. Argument raising docs not
help in this case, for a functor of type (((e,t),t), (e,t)) does not combine with an argument
of type (e,t) either. Two solutions present themselves. If we assign transitive verbs two
types, the type (e,(e,t)) and the type ((e,t),(e,t)), we accept a lexical ambiguity of all verbs
(Van Geenhoven 1998). If widespread lexical ambiguities are not viewed as an attractive
solution, the alternative is to develop combinatory rules other than function application.
File Change Semantics and Discourse Representation theory (Heim 1982, Kamp 1981,
Kamp & Reyle 1993) follow this route, but not in a strictly type-theoretical setting. The
indefinite introduces a variable, that is existentially closed at the discourse level if it is
not embedded under a quantificational binder. De Swart (2001) shows that we can
reinterpret the DRT rule of existential closure in typc-thcorctical terms. The price wc pay
is a fairly complex system of closure rules in order to account for monotone increasing,
decreasing and non-monotone quantifiers. These complexities favor a restriction of the
property type denotation of indefinites to special contexts where specific construction
rules can be motivated by the particularities of the construction at hand.
2. Type coercion
In Section 1, we saw that enriched type theories have formulated rules of type raising
and type shifting to account for well-formed natural language examples that present
mismatches in stricter type theories. A third way of resolving type mismatches in natural
language has been defined in terms of type coercion, and this route constitutes the focus
of this article,
2.1. Type coercion as a strategy to resolve type mismatches
The use of coercion to resolve type mismatches goes back to the work on polysemy by
James Pustejovsky (Pustejovsky 1995), Pustejosvky points out that the regular rules of
type theoretic combinatorics are unable to describe sentences such as (1):
(1) a. Mary finished the novel in January 2007, A week later, she started a new book,
b. John wants a car until next week.
Verbs like start and finish in (la) do not directly take expressions of type c or ((c,t),t),
such as the novel or a new book, for they operate on processes or actions rather than on
objects, as we see in start reading,finish writing. As Dowty (1979) points out, the temporal
adverbial until next week modifies a hidden or understood predicate in (lb), as in John
25. Mismatches and coercion
577
wants to have a car until next week. The type mismatch in (1) does not lead to ungram-
maticalities, but triggers a reinterpretation of the combination of the object of the verb.
The hearer fills in the missing process denoting expression, and intcrprcts^/iw/i the novel
as 'finish reading the novel' or 'finish writing the novel', and wants a car as 'wants to have
a car'. This process of reinterpretation of the argument, triggered by the type mismatch
between a functor (here: the verb finish or start) and its argument (here the object the
novel or a new book) is dubbed type coercion by Pustejovsky, Type coercion is a semantic
operation that converts an argument to the type expected by the function, where it would
otherwise result in a type error. Formally, function application with coercion is defined as
in (2) (Pustejovsky 1995: 111), with a the argument and p the functor:
(2) Function application with coercion. If a is of type c, p of type (a,b), then:
(i) if type c = a, then p(a) is of type b,
(ii) if there is a o e Za such that o(a) results in an expression of type a, then p(o(a))
is of type b.
(iii)otherwise a type error is produced.
If the argument is of the right type to be the input of the functor, function application
applies as usual (i). If the argument can be reinterpreted in such a way that it satisfies
the input requirements of the function, function application with coercion applies (ii). La
represents the set of well-defined reinterpretations of a, leading to the type a for o(a)
that is required as the input of the functor p. If such a reinterpretation is not available,
the sentence is ungrammatical (iii).
Type coercion is an extremely powerful process. If we assume that any type mismatch
between an operator and its argument can be repaired by the hearer simply filling in the
understood meaning, we would run the risk of inserting hidden meaningful expressions
anywhere, thus admitting random combinations of expressions, and loosing any ground
for explaining ungrammaticalities. So the process of type coercion must be severely
restricted,
2.2. Restrictions on type coercion
There are three important restrictions Pustejovsky (1995) imposes upon type coercion in
order to avoid ovcrgcncralization. First, coercion docs not occur freely, but must always
be triggered as part of the process of function application (clause (ii) of definition 2),
In (1), the operators triggering type coercion are the verbs finish, start and want.
Outside of the context of such a trigger, the nominal expressions the novel, a new book and
a car do not get reinterpreted as processes involving the objects described by these noun
phrases (clause (i) of definition 2), Second, type coercion always affects the argument,
never the functor. In example (1), this means that the interpretation of finish, start and
want is as usual, and only the interpretation of the nominal expressions the novel, a new
book and a car is enriched by type coercion. Third, the interpretation process involved
in type coercion requires a set of well-defined reinterpretations for any expression a
(Za in clause ii of definition 2), Pustejovsky (1995) emphasizes the role of the lexicon
in the reinterpretation process. He develops a generative lexicon, in which not only the
referential interpretation of lexical items is spelled out, but an extended
interpretation is developed in terms of argument structure, extended event structure and qualia
578 V. Ambiguity and vagueness
structure (involving roles). In (1), the relevant qualia of the lexical items novel and book
include the fact that these nouns denote artifacts which are produced in particular ways
(agcntivc role: process of writing), and which arc used for particular purposes (tclic role:
process of reading). The hearer uses these extended lexical functions to fill in the
understood meaning in examples like (1). The qualia structure is claimed to be part of our
linguistic knowledge, so Pustejovsky maintains a strict distinction between lexical and
world knowledge. See article 32 (Hobbs) Word meaning and world knowledge for more
on the relation between word meaning and world knowledge.
Type coercion with verbs like begin, start, finish, is quite productive in everyday
language use. Google provided many examples involving home construction as in (3a,b),
building or producing things more in general (3c-f), and, by extension, organizing (3g, h):
(3) a. When I became convalescent, father urged Mr, X to begin the house.
b. We will begin the roof on the pavilion on December 1.
c. Today was another day full of hard work as our team stays late hours to finish
the car,
d. (about a bird) She uses moss, bits of vegetation, spider webs, bark flakes, and
pine needles to finish the cup-shaped nest,
e. Susan's dress is not at any picture, but the lady in the shop gave very good
instructions how to finish the dress,
f. Once I finish an album, I'll hit the road and tour, I got a good band together and
1 really feel pleased.
g. The second step was to begin the House Church, (context: a ministry page)
h. This is not the most helpful way to begin a student-centered classroom.
Examples with food preparation (4a) and consumption (4b) arc also easy to find, even
though we sometimes need some context to arrive at the desired interpretation (4c),
(4) a. An insomniac, she used to get up at four am and start the soup for dinner, while
rising dough for breakfast, and steaming rice for lunch,
b. The portions were absolutely rcdiculusly huge, even my boyfriend the human
garbage disposal couldn't finish his plate.
c. A pirate Jeremy is able to drink up a rum barrel in 10 days, A pirate Amelie needs
2 weeks for that. How much time do they need together to finish the barrel?
Finally, we find a wide range of other constructions, involving exercices, tables, exams,
etc, to complete (5a), technical equipment or computer programs to run (5b), lectures to
teach (by lecturers) (5c), school programs to attend (by students) (5d),etc.
(5) a. Read the instructions carefully before you begin the exam.
b. Press play to begin the video,
c. Begin the lesson by asking what students have seen in the sky.
d. These students should begin the ancient Greek or Latin sequence now if they
have not already done so.
These reinterpretations are clearly driven by Pustejovsky's qualia structure, and are thus
firmly embedded in a generative lexicon. But type coercion can also apply in creative
25. Mismatches and coercion
579
language use, where the context rather than the lexicon drives the reinterpretation
process. If the context indicates that Mary is a goat that eats anything she is fed, we
could end up with an interpretation of (la) in which Mary finished eating the novel, and
started eating a new book a week later (cf, Lascarides & Copestake 1998). One might be
skeptical about the force of contextual pressure. Indeed, a quick google search revealed
no naturally occurring examples of 'begin/finish the book' (as in: a goat beginning/
finishing to eat the book). However, there are other examples that suggest a strongly
context-dependent application of type coercion:
(6) a. Is it too late to start geranium plants from cuttings?
(context: a Q&A page on geraniums, meaning: begin growing geranium plants)
b, (After briefly looking at spin as an angular momentum property, we will begin
discussing rotational spectroscopy.) We finish the particle on a sphere, and begin
to apply those ideas to rotational spectography,
(context: chemistry class schedule; meaning: finish discussing).
The numbers of such examples of creative language use are fairly low, but in the right
context, their meaning is easy to grasp. All in all, type coercion is a quite productive
process, and the limits on this process reside in a combination of lexical and contextual
information.
2.3. Putting type coercion to work
The contexts in (7) offer another prominent example of type coercion discussed by
Pustejovsky (1995):
(7) a. We will need a fast boat to get back in time,
b, John put on a long album during dinner.
Although fast is an adjective modifying the noun boat in (7a), it semantically operates
on the action involving the boat, rather than the boat itself. Similarly, the adjective long
in (7b) has an interpretation as an event modifier, so it selects the telic role of playing
the record,
Pustejovsky's notion of type coercion has been put to good use in other contexts
as well. One application involves the interpretation of associative ('bridging') definite
descriptions (Clark 1975):
(8) Sara brought an interesting book home from the library. A photo of the author was
on the cover.
The author and the cover are new definites, for their referents have not been introduced
in previous discourse. The process of accommodation is facilitated by an extended
argument structure, which is based on the qualia structure of the noun (Bos, Buitelaar
& Mineur 1995), and the rhetorical structure of the discourse (Asher & Lascarides
1998). Accordingly, the most coherent interpretation of (8) is to interpret the author and
the cover as referring to the author and the cover of the book introduced in the first
sentence.
580
V, Ambiguity and vagueness
Kluck (2007) applies type coercion to metonymic reinterpretations as in (9), merging
Pustejovsky's generative lexicon with insights concerning conceptual blending:
(9) a toy truck, a stone lion
The metonymic interpretation is triggered by a conflict between the semantics of the
noun and that of the modifying adjective. In all these examples, we find a shift to an
'image' of the object, rather than an instance of the object itself.
The metonymic type coercion in (9) comes close to instances of polysemy like those
in (10) that Nunberg (1995) has labelled 'predicate transfer',
(10) The ham sandwich from table three wants to pay.
This sentence is naturally used in a restaurant setting, where one waiter informs another
one that the client seated at table three, who had a ham sandwich for lunch is ready to
pay. Nunberg analyzes this as a mapping from one property onto a new one that is
functionally related to it; in this case the mapping from the ham sandwich to the person who
had the ham sandwich for lunch. Note that the process of predicate transfer in (10) is
not driven by a type mismatch, but by other conflicts in meaning (often driven by
selection restrictions on the predicate). This supports the view that reinterpretation must be
contextually triggered in general.
Asher & Pustejovsky (2005) use Rhetorical Discourse Representation Theory to
describe the influence of context more precisely (see article 37 (Kamp & Reyle)
Discourse Representation Theory for more on discourse theories).They point out that in the
context of a fairy tale, we have no difficulty ascribing human-like properties to goats that
allow us to interpret examples like (11):
(11) The goat hated the film but enjoyed the book.
In a more realistic setting, we might interpret (11) as the goat eating the book, and
rejecting the film (cf. section 2.2 above), but in a fictional interpretation, goats
can become talking, thinking and reading agents, thereby assuming characteristics
that their sortal typing would not normally allow. Thus discourse may override lexical
preferences.
Type coercion, metonymic type coercion and predicate transfer can be included in the
set of mechanisms that Nunberg (1995) subsumes under the label 'transfer of meaning',
and that come into play in creative language use as well as systematic polysemy. Function
application with coercion is a versatile notion, which is applied in a wide range of cases
where mismatches between a functor and its argument are repaired by
reinterpretation of the argument. Coercion restores compositionality in the sense that
reinterpretation involves inserting hidden meaningful material that enriches the semantics of the
argument, and enables function application. Lexical theories such as those developed
by Pustejovsky (1995), Nunberg (1995), Asher & Pustejovsky (2005), highlighting not
only denotational and selective properties,but also functional connections and discourse
characteristics arc best suited to capture the systematic relations of polysemy wc find in
these cases.
25. Mismatches and coercion
583
The aspectual operator applies (recursively if necessary) on the eventuality description
provided by the predicate-argument structure, so in (13b), the perfect maps the ongoing
process of writing a book onto its consequent state. Aspectual restrictions and
implications for a cross-linguistic semantics of the perfect are discussed in Schmitt (2001)
and de Swart (2003, 2007). Article 49 (Portner) Perfect and progressive offers a general
treatment of the semantics and pragmatics of the perfect.
Tense operates on the output of the aspectual operator (if present), so in (13a), the
state of the ongoing process of eating is located in the past, and no assertion about
the event reaching its inherent endpoint in the real world is made. The structure in
(13a) gives rise to the semantic representation in DRT format in Fig, 25,1.
s, t, n
t<n
SOt
s: PROG
e: Julie eat a fish |
Fig. 25.1: Julie was eating a fish
According to Fig. 25.1, there is a state of Julie eating a fish in progress, that is located
at some time t preceding the speech time n ('now'). The intensional semantics of the
Progressive (Dowty 1979, Landman 1992, and others) is hidden in the truth conditions of
the Prog operator, and is not our concern in this article (see article 49 (Portner) Perfect
and progressive for more discussion).
3.2. Aspectual mismatches and coercion
It is well known that the English Progressive does not take just any eventuality description
as its argument.The Progressive describes processes and events in their development, and
the combination with a state verb frequently leads to ungrammaticalities (14a,b).
(14) a. *?Bill is being sick/in the garden.
b. *?Julie is having blue eyes.
c. I'm feeding him a line, and he is believing every word.
d. Bill is being obnoxious.
However, the Progressive felicitously combines with a state verb in (14c) (from Michaelis
2003) and (14d). A process of aspectual reinterpretation takes place in such examples,
and the verb is conceived as dynamic, and describing an active involvement of the agent.
The dynamic reinterpretation arises out of the conflict between the aspectual
requirements of the Progressive, and the aspectual features of the state description. De Swart
(1998) and Zucchi (1998) use the notion of aspectual coercion to explain the meaning
effects arising from such mismatches. In an extended use of the structure in (12),
aspectual reinterpretation triggers the insertion of a hidden coercion operator C^, mapping a
stativc description onto a dynamic one:
(15) a. He is believing every word.
[Pres [Prog [ Csd [he believe every word]]]
584 V. Ambiguity and vagueness
b. Bill is being obnoxious.
[ Pres [ Prog [ C^, [ Bill be obnoxious ]]]]
The hidden coercion operator is inserted in the slot for grammatical aspect. In (15), Csd
adapts the properties of the state description to the needs of the Progressive. More
precisely, the operator Csd reinterprets the state description as a dynamic description, which
has the aspectual features that allow it to be an argument of the Progressive operator.
In this way, the insertion of a hidden coercion operator resolves the aspectual mismatch
between the eventuality description and the aspectual operator in contexts like (15).
(15b) gives rise to the semantic representation in DRT in Fig. 25.2:
s.t, n
t = n
tOs
s: PROG
d:CSd
s': Bill be obnoxious
Fig. 25.2: Bill is being obnoxious
The dynamic variable d, obtained by coercion of the state s' is the input for the
Progressive operator. The output of the progressive is again a state, but the state of an event or
process being in progress is more dynamic than the underlying lexical state. The DRT
representation contains the coercion operator C^,, with the subscripts indicating the
aspectual characterization of the input and output description.The actual interpretation
of the coercion operator is pushed into the truth conditions of C, because it is partly
dependent on lexical and contextual information (compare also Section 2).The relevant
interpretation of Csd in Fig. 25.2 is dynamic:
(16) Dynamic is a function from sets of state eventualities onto sets of dynamic
eventualities in such a way that the state is presented as a process or event that the agent
is actively involved in.
Note that (14a) and (14b) cannot easily be reinterpreted along the lines of (16), because
of lack of agentivity in the predicate.
Following Moens & Steedman (1988), we assume that all mappings between sets of
eventualities that are not labelled by overt aspectual operators are free.This implies that all
semantically possible mappings that the language allows between sets of eventualities, but
that arc not the denotation of an overt grammatical aspect operator like the Progressive,
the Perfect, etc. are possible values of the hidden coercion operator. Of course, the value of
the coercion operator is constrained by the aspectual characteristics of the input eventuality
description and the aspectual requirements of the aspectual operator. Thus, the coercion
operator in Fig. 25.1 is restricted to being a mapping from sets of states to sets of events.
3.3. Extension to aspectual adverbials
Grammatical operators like the Progressive or the Perfect are not the only
expressions overtly residing in the Aspect slot in the structure proposed in (12). Consider the
25. Mismatches and coercion
585
sentences in (17)—(19), which illustrate that in adverbials combine with event descriptions,
and for and until adverbials with processes or states:
(17) in adverbials
a. Jennifer drew a spider in five minutes. (event)
b. #Jennifer ran in five minutes. (process)
c. #Jennifer was sick in five minutes. (state)
(18) for adverbials
a. Jennifer was sick for two weeks. (state)
b. Jennifer ran for five minutes. (process)
c. #Jennifer drew a spider for five minutes. (event)
(19) until adverbials
a. Jennifer was sick until today. (state)
b. Jennifer slept until midnight. (process)
c. #Jcnnifcr drew a spider until midnight. (event)
The aspectual restrictions on in and for adverbials have been noted since Vendler (1957),
see article 48 (Filip) Aspectual class and Aktionsart and references therein. For until
adverbials, see de Swart (1996), and references therein. A for adverbial measures the
temporal duration of a state or a process, whereas an in adverbial measures the time it
takes for an event to culminate. Until imposes a right boundary upon a state or a
process. Their different semantics is responsible for the aspectual requirements in, for and
until adverbials impose upon the eventuality description with which they combine, as
illustrated in (17)—(19). We focus here on for and in adverbials. Both have been treated
as measurement expressions that yield a quantized eventuality (Krifka 1989). Thus
to run denotes a set of processes, but to run for five minutes denotes a set of quantized
eventualities. Aspectual adverbials can then be treated as operators mapping one set
of eventualities onto another (cf. Moens & Steedman 1988). For adverbials map a set
of states or processes onto a set of events; in adverbials a set of events onto a set of
events.
The mapping approach provides a natural semantics of sentences (17a) and (18a, b).
However, the mismatches in (17b, c) and (18c) are marked as pragmatically infelicitous
(#), rather than ungrammatical (*).This raises the question whether the aspectual
restrictions on in and for adverbials are strictly semantic, or rather loosely pragmatic, in which
case an underspecification account might be equally successful (cf. Dolling 2003 for a
proposal, and article 24 (Egg) Semantic underspecification for a general discussion on the
topic). We can maintain the semantic treatment of aspectual adverbials as two different
kinds of measurement expressions, if we treat the well-formed examples that combine in
with states/processes or for with event descriptions as instances of aspectual coercion, as
illustrated in (20) and (21):
(20) a. Sally read a book for two hours. (process reading of event)
[Past [ for two hours [ Ceh [ Sally read a book ]]]]
b. Jim hit a golf ball into the lake for an hour. (frequentative reading of event)
25. Mismatches and coercion
587
over eventualities such that the frequentative/habitual state describes a stable
property over time of an agent.
Note that habitual readings are constrained to iterable events, which imposes constraints
upon the predicate-argument structure, as analyzed in de Swart (2006).
The advantage of a coercion analysis over an underspecification approach in the
treatment of the aspectual sensitivity of in and for adverbials is that we get a
systematic explanation of the meaning effects that arise in the unusual combinations
exemplified in (17b, c), (18c), (20) and (21), but not in (17a) and (18a, b). Effects of aspectual
coercion have been observed in semantic processing (Pinango et al.2006 and references
therein). Brcnnan & Pylkkancn (2008) hypothesize that the online effects of aspectual
coercion are milder than in type coercion, because it is easier to shift into another kind
of event than to shift an object into an event. Nevertheless, they find an effect with mag-
netoencephalography, which they take to support the view that aspectual coercion is
part of compositional semantics, rather than a context-driven pragmatic enrichment of
an underspecified semantic representation, as proposed by Dolling (2003) (cf. also the
discussion in section 2.4 above).
3.4. Aspectually sensitive tenses
Aspectual mismatches can arise at all levels in the structure proposed in (12). Dc Swart
(1998) takes the English simple past tense to be aspectually transparent, in the sense that
it lets the aspectual characteristics of the eventuality description shine through.Thus, both
he was sick and he ate an apple get an episodic reading, and the sentences locate a state
and an event in the past respectively. Not all English tenses are aspectually transparent.
The habitual interpretation of he drinks or she washes the car (Schmitt 2001, Michaelis
2003) arises out of the interaction of the process/event description and the present tense,
but cannot be taken to be the meaning of either, for he drank and she washed the car have
an episodic besides a habitual interpretation, and he is sick has just an episodic
interpretation. We can explain the special meaning effects if we treat the English simple present
as an aspectually restricted tense. It is a present tense in the sense that it describes the
eventuality as overlapping in time with the speech time. It is aspectually restricted in the
sense that it exclusively operates on states (Schmitt 2001, Michaelis 2003). This
restriction is the result of two features of the simple present, namely its characterization as an
imperfective tense, which requires a homogenous eventuality description as its input (cf.
also the French Imparfait in section 4.4 below), and the fact that it posits an overlap of the
eventuality with the speech time, which requires an evaluation in terms of instants, rather
than intervals. States are the only eventualities that combine these two properties. As a
result, the interpretation of (23a) is standard, and the sentence gets an episodic
interpretation, but the aspectual mismatch in (23b) and (23c) triggers the insertion of a coercion
operator Cds, reinterpreting the dynamic description as a stative one (see Fig. 25.4).
(23) a. Bill is sick.
[ Present [ Bill be sick ]]
b. Bill drinks.
[ Present [ Cds [ Bill drink ]]]
V. Ambiguity and vagueness
Julie washes the car.
[ Present [ Cds [ Julie wash the car ]]]
s,t, n
sOt
tOn
s:Cds
e: Julie wash the car
Fig. 25.4: Julie washes the car
In (23b) and (23c), the value of Cds is a habitual interpretation. A habitual reading is stative,
because it describes a regularly recurring action as a stable property over time (Smith
2005). The contrast between (23a) and (23b, c) confirms the general feature of coercion,
namely that reinterpretation only arises if coercion is forced by an aspectual mismatch. If
there is no conflict between the aspectual requirements of the functor, and the input, no
reinterpretation takes place, and we get a straightforward episodic interpretation (23a).
Examples (23b) and (23c) illustrate that the ^interpretations that are available through
coercion are limited to those that are not grammaticalized by the language (Moens 1987).
The overt Progressive blocks the progressive interpretation of sentences like (23b, c) as
the result of a diachronic development in English (Bybee, Perkins & Pagliucca 1994:144).
3.5. Aspectual coercion and rhetorical structure
At first sight, the fact that an episodic as well as a habitual interpretation is available for
he drank and she washed the car might be a counterexample to the general claim that
aspectual coercion is not freely available:
(24) Bill drank.
a. [Past [ Bill drink ]]
b. [Past [Cds [ Bill drink]]]
(episodic interpretation)
(habitual intepretation)
(25) Julie washed the car.
a. [Past [ Julie washed the car ]]
b. [Past [ Cds [ Julie washed the car ]
(episodic interpretation)
(habitual interpretation)
Given that the episodic interpretation spelled out in (24a) and (25a) does not show an
aspectual mismatch, we might not expect to find habitual interpretations involving a
stative reinterpretation of the dynamic description, as represented in (24b) and (25b),
but we do. Such aspectual reinterpretations are not triggered by a sentence internal
aspectual conflict, but by rhetorical requirements of the discourse. Habitual
interpretations can be triggered if the sentence is presented as part of a list of stable properties of
Bill over time, as in (26):
(26) a. In high school, Julie made a little pocket money by helping out the neighbours.
She washed the car, and mowed the lawn.
25. Mismatches and coercion
589
b. In college, Bill led a wild life. He drank, smoked, and played in a rock band,
rather than going to class.
Aspectual coercion may be governed by discourse requirements as in (26), but even then,
the shifted interpretation is not free, but must be triggered by the larger context, and
more specifically the rhetorical structure of the discourse. An intermediate case is
provided by examples in which an adverb in the sentence indicates the rhetorical function of
the eventuality description, and triggers aspectual coercion, as in (27):
(27) Suddenly, Jennifer knew the answer. (inchoative reading of state)
[Past [ C^ [ Jennifer know the answer ]]]
Suddenly only applies to happenings, but know the answer is a state description, so
(27) illustrates an aspectual mismatch.The insertion of a hidden coercion operator solves
the conflict,so we read the sentence in such a way that the adverb suddenly brings out the
inception of the state as the relevant happening in the discourse.
3.6. Key features of aspectual coercion
In this section, we defined aspectual mapping operations with coercion in DRT. Formally,
the proposal can be summed up as follows:
(28) Aspectual mappings with coercion in DRT
(i) When a predicate-argument structure S, is modified by an aspectual operator
Asp denoting a mapping from a set of eventualities of aspectual class a to a set
of eventualities of aspectual class a\ introduce in the universe of discourse of
the DRS a new dr a' and introduce in the set of conditions a new condition a':
Asp Kr
(ii) If S, denotes an eventuality description of aspectual class a, introduce in
the universe of discourse of K, a new discourse referent a\ and introduce
in the set of conditions of K, a new condition a': y, where y is the denotation
ofS,.
(iii)If S( denotes an eventuality description of aspectual class a" (a" *a), and there
is a coercion operator Ca„a such that Ca„a(y) denotes a set of eventualities of
aspectual class a, introduce in the universe of K, a new discourse referent a\ and
introduce in the set of conditions of K, a new condition o':Ca..a(y),whereyis the
denotation of Sr
(iv) If S, denotes an eventuality description of aspectual class a" (a'Va), and there
is no coercion operator Ca„a such that Ca..a(y) denotes a set of eventualities of
aspectual class a, modification of S, by the aspectal operator Asp issemantically
ill-formed.
According to this definition, aspectual coercion only arises when aspectual features
of an argument do not meet the requirements of the functor (typically an adverbial
expression, an aspectual operator, or a tense operator) (compare clauses ii and iii).
As a result, we never find coercion operators that map a set of states onto a set of
states, or a set of events onto a set of events, but we always find mappings from one
25. Mismatches and coercion
593
s, n, t
t<n
hOt
h:Ceh
e: he do his homework
Fig. 25.6: II faisait ses devoirs
out of the conflict between the aspectual requirements of the tense operator, and the
input eventuality description.The outcome of the coercion process is that state/process
descriptions reported in the Passe Simple arc event denoting, and fit into a sequence of
ordered events, whereas event descriptions reported in the Imparfait are state denoting,
and do not move the reference time forward.
(38) a. lis s' aperqurent de 1'ennemi. Jacques eut grand peur.
They refl noticed.ps of the ennemi. Jacques had.ps great fear.'
'They noticed the ennemi. Jacques became very fearful.'
b. II se noyait quand 1'agent le sauva
He refl drowned.imp when the officer him saved.ps
en le retirant de 1'eau.
by him withdraw, part from the water.
'He was drowning when the officer saved him by dragging him out of the water.'
The examples in (38) are to be compared to those in (30) and (31), where no coercion
takes place, but the discourse structures induced by the Passe Simple and the Imparfait
arc similar. We conclude that the coercion approach maintains Kamp & Rohrcr's (1983)
insights about the temporal structure of narrative discourse in French, while offering a
compositional analysis of the internal structure of the sentence.
Versions of a treatment of the perfective/imperfective contrast in terms of aspec-
tually sensitive tense operators have been proposed for other Romance languages, cf.
Schmitt (2001). If we embed the analysis of the perfective and imperfective past tenses
in the hierarchical structure proposed in (12), we predict that they always take wide
scope over overt aspectual operators, including aspectual adverbials, negation, and quan-
tificational adverbs (iterative adverbs like twice and frequentative adverbs like often).
If these expressions are interpreted as mappings between sets of eventualities, we
predict that the distribution of perfective/ imperfective tenses is sensitive to the output of
the aspectual operator. De Swart (1998) argues that eventuality descriptions modified
by in and for adverbials are quantized (cf. section 3.3 above), and are thus of the right
aspectual class to serve as the input to the Passe Simple in French. She shows that
coercion effects arise in the combination of these adverbials with the Imparfait. De Swart &
Molendijk (1999) extend the analysis to negation, iteration and frequency in French.
Lcnci & Bcrtinctto (2000) treat the pcrfcctivc/impcrfcctivc past tense distribution in
sentences expressing iteration and frequency in Italian, and Perez-Leroux et al. (2007)
show that similar meaning effects arise in Spanish. Coercion effects also play a role in
first and second language acquisition, cf. Montrul & Slabakova (2002), Perez-Leroux
et al. (2007), and references therein for aspectuality.
594
V. Ambiguity and vagueness
5. Conclusion
This article outlined the role that type coercion and aspectual rcintcrpretation play in
the discussion on type mismatches in the semantic literature.This docs not mean that the
coercion approach has not been under attack, though.Three issues stand out.
The question whether coercion is a semantic enrichment mechanism or a pragmatic
inference based on an underspecified semantics has led to an interesting debate in the
semantic processing literature. The question is not entirely resolved at this point, but
there are strong indications that type coercion and aspectual coercion lead to delays in
online processing, supporting a semantic treatment.
As far as aspectual coercion is concerned, the question whether predicative aspect
and grammatical aspect are to be analyzed with the same ontological notions (as in the
mapping approach discussed here), or require two different sets of tools (as in Smith
1991/1997) is part of an ongoing discussion in the literature. It has been argued that
the lexical component of the mapping approach is not fine-grained enough to account
for more subtle differences in predicative aspect (Caudal 2005). In English, there is a
class of verbs (including read, wash, combe, ...) that easily allow the process reading,
whereas others (eat, build,...) do not (Kratzer 2004). Event decomposition approaches
(as in Kempchinsky & Slabakova 2005) support the view that some aspectual operators
apply at a lower level than the full predicate-argument structure, which affects the view
on predicative aspect developed by Verkuyl (1972/1993).The contrast between English-
type languages, where aspectual operators apply at the VP level, and Slavic languages,
where aspectual prefixes are interpreted at the V-level is highly relevant here (Filip &
Rothstein 2006).
The sensitivity of coercion operators to lexical and contextual operators implies that
the semantic representations developed in this article are not complete without a
mechanism that resolves the interpretation of the coercion operator in the context of the
sentence and the discourse. So far, we only have a sketch of how this would have to run
(Pulman 1997, de Swart 1998, Hamm & van Lambalgen 2005, Asher & Pustejovsky 2005).
There is also a more fundamental problem. Most operators we discussed in the aspectual
section are sensitive to the stative/dynamic distinction, or the homogeneous/quantized
distinction.This implies that the relation established by coercion is not strictly functional,
for it leaves us with several possible outputs (Verkuyl 1999).The same issue arises with
type coercion. However, spelling out the different readings semantically would bring us
back to a treatment in terms of ambiguities or underspecification, and we would lose our
grip on the systematicity of the operator-argument relation.
Notwithstanding the differences in framework, and various problems with the
implementation of the notion of coercion, the idea that reinterpretation effects are real, and
require a compositional analysis seems to have found its way into the literature.
6. References
dc Almeida, Roberto 2004. The effect of context on the processing of type-shifting verbs. Brain &
Language 90,249-261.
Asher, Nicholas & Alex Lascai ides 1998. Briding. Journal of Semantics 15,83-113.
Asher, Nicholas & James Pustejovsky 2005. Word Meaning and Coinnionsense Metaphysics. Ms.
Austin,TX, University ofTexas/Waltham/Boston, MA, Brandeis University.
598 V. Ambiguity and vagueness
non-literal language and a number of major innovations in metaphor theory. Metaphor
and metonymy are now understood to be ubiquitous aspects of language, not simply fringe
elements. The study of metaphor and metonymy have provided a major source of evidence
for Cognitive Linguists to argue against the position that a sharp divide exists between
semantics and pragmatics. Mounting empirical data from psychology, especially work by
Gibbs, has led many to question the sharp boundary between literal and metaphorical
meaning. While distinct perspectives remain among contemporary metaphor theorists,
very recent developments in relevance theory and cognitive semantics also show intriguing
areas of con vergen ce.
1. Introduction
It is an exaggeration to say that metaphor and metonymy are the tropes that launched
1,000 theories. Still, they are long recognized uses of language that have stimulated
considerable debate among language scholars and most recently served as a major
impetus for Cognitive Linguistics, a theory of language which challenges the modularity
hypothesis and postulates such as a strict divide between semantics and pragmatics
(cf. article 88 (Jaszczolt) Semantics and pragmatics). From Aristotle to the 1930's, the
study of metaphor and metonymy was primarily seen as part of the tradition of rhetoric.
Metaphor in particular was seen as a highly specialized, non-literal, 'deviant' use of
language, used primarily for literary or poetic effect. Although scholars developed a number
of variations on the theme, the basic analysis that metaphor involved using one term to
stand for another, remained largely unchanged. Beginning with Richards (1936) and his
insight that the two elements in a metaphor (the tenor and vehicle) are both active
contributors to the interpretation, the debate has shifted away from understanding metaphor
as simply substituting one term for another or setting up a simple comparison (although
this definition is still found in reference books such as the Encyclopedia Britannica).
In the main, contemporary metaphor scholars now believe that the interpretations and
inferences generated by metaphors are too complex to be explained in terms of simple
comparisons. (However, see discussion of resemblance metaphors, below.)
Richards' insights stirred debate among philosophers such as Black and Grice.
Subsequently, linguists (for example, Grady, Johnson, Lakoff, Reddy, Sperber & Wilson,
and Giora) and psychologists (for example, Coulson, Gibbs, and Keysar & Bly) have
joined in the discussion.The result has been a sharp increase in interest in metaphor and
metonymy. Debate has revolved around issues such as the challenges such non-literal
language poses for truth conditional semantics and pragmatics, for determining the
division between literal and non-literal language, and the insights metaphor and metonymy
potentially offer to our understanding of the connections between language and
cognition.The upshot has been a number of major innovations which include, but are not
limited to, a deeper understanding of the relations between the terms in linguistic metaphor,
that is, tenor (or target) and vehicle (or source); exploration of the type of knowledge,
'dictionary' versus 'encyclopedic', necessary to account for the interpreting metaphor;
and development of a typology of metaphors, that is, primary, complex, and resemblance
metaphor.
Given the recent diversity of scholarly activity, universally accepted definitions of
either metaphor or metonymy are difficult to come by. All metaphor scholars recognize
certain examples of metaphor, such as 'Ted is the lion of the Senate', and agree that such
26. Metaphors and metonymies 599
examples of metaphor represent, in some sense, non-literal use of language. Scholars
holding a more traditional, truth-conditional perspective might offer a definition along
the lines of 'use of a word to create a novel class inclusion relationship'. In contrast,
conceptual metaphor theorists hold that metaphor is "a pattern of conceptual association"
(Grady 2007) resulting in "understanding and experiencing one kind of thing in terms
of another" (Lakoff & Johnson 1980: 5). A key difference between perspectives is that
the truth-conditional semanticists represent metaphor as a largely word-level
phenomenon, while cognitive semanticists view metaphor as reflective of cognitive patterns
of organization and processing more generally. Importantly, even though exemplars of
metaphor such as 'Ted is a lion' are readily agreed upon, the prototypicality of such
a metaphor is controversial. More traditional approaches tend to treat 'Ted is a lion'
as a central example of metaphor; in contrast, conceptual metaphor theory sees it as
one of three types of metaphor (a resemblance metaphor) and not necessarily the most
prototypical.
According to Evans & Green (2006) and Grady (2008), resemblance metaphors
represent the less prototypical type of metaphors and reflect a comparison involving cultural
norms and stereotypes. A subset of resemblance metaphors, termed image metaphors,
are discussed by Lakoff & Turner (1989) as more clearly involving physical comparisons,
as in 'My wife's waist is an hourglass'. For cognitive semanticists, uses such as 'heavy' in
Stewart's brother is heavy, meaning Stewart's brother is a serious, deep thinker, represent
the most prevalent, basic type of metaphor, termed primary metaphor. In contrast, a
truth conditional scmanticist (cf. articles 23 (Kennedy) Ambiguity and vagueness and 24
(Egg) Semantic underspecification) would likely classify this use of 'heavy' as a case of
literal language use (sometimes referred to as a dead metaphor) involving polysemy and
ambiguity.
Not surprisingly, analogous divisions are found in the analyses of metonymy. All
semanticists recognize a statement such as 'Brad is just another pretty face' or 'The ham
sandwich is ready for the check' as central examples of metonymy. Moreover, there
is general agreement that a metonymic relationship involves referring to one entity
in terms of a closely associated or conceptually contiguous entity. Nevertheless, as
with analyses of metaphor, differences exists in terms of whether metonymy is
considered solely a linguistic phenomenon in which one linguistic entity refers to another
or whether it is reflective of more general cognition. Cognitive semanticists argue
"metonymy is a cognitive process in which one conceptual entity [...] provides mental
access to another conceptual entity within the same [cognitive] domain [...]" (Kovesces
&Radden 1998:39).
The chapter is organized as follows: Section 2 will present an overview of the
traditional approaches to metaphor which include Aristotle's classic representation as well as
Richards', Black's and Davidson's contributions. Section 3 addresses pragmatic accounts
of metaphor with particular attention to recent discussions within Relevance theory.
Section 4 presents psycholinguistic approaches, particularly Gentner's 'metaphor-as-career'
account and Glucksberg's metaphor -as- category account and the more recent "dual
reference" hypothesis. Section 5 focuses on cognitive and conceptual accounts. It
provides an overview of Lakoff & Johnson's (1980) original work on conceptual metaphor,
later refinements of that work, especially development of the theory of experiential
correlation and Grady's typology of metaphors, and conceptual blending theory. Section 6
addresses metonymy.
26. Metaphors and metonymies 601^
on metaphor, and hence no systematic patterns of metaphor. Like Black, and in keeping
with the long standing tradition, Davidson believed that metaphor is a lexical
phenomenon. Gibbs (1994: 220-221) points out that Davidson's "metaphors-without-meaning"
view reflects the centuries-old belief in the importance of an ideal scientific-type language
and that Davidson's denial of meaning or cognitive content in metaphor is motivated by
a theory-internal need to restrict the study of meaning to truth conditional sentences.
3. Pragmatic accounts of metaphor
Since their inception in the 1960's most pragmatic theories have preserved or reinforced
this sharp line of demarcation between semantic meanings and pragmatic interpretations
in their models of metaphor understanding.
In his theory of conversational implicature, Grice (1969/1989, 1975) argued that the
non-literal meaning of utterances is derived inferentially from a set of conversational
maxims (e.g. be truthful and be relevant). Given the underlying assumption that talk
participants are rational and cooperative, an apparent violation of any of those maxims
triggers a search for an appropriate conversational implicature that the speaker intended
to convey in context (cf. articles 92 (Simons) Implicature and 94 (Potts) Conventional
implicature). In this conception of non-literal language understanding, known as the
"standard pragmatic model", metaphor comprehension involves a scries of inferential
steps - analyze the literal meaning of an utterance first, then reject it due to its literal
falsity, and then derive an alternative interpretation that fits the context. In Robert is a
bulldozer, for instance, the literal interpretation is patently false (i.e. violates the maxim
of quality) and this recognition urges the listener to infer that the speaker must have
meant something non-literal (e.g., "persistent and does not let obstacles stand in his way
in getting things done"). In his speech act theory, Searle (1979) also espoused this
standard model and posited an even more elaborate set of principles and steps to be followed
in deriving non-literal meanings.
One significant implication of the standard pragmatic view is that metaphors, whose
interpretation is indirect and requires conversational implicaturcs, should take
additional time to comprehend over that needed to interpret equivalent literal speech, which
is interpreted directly (Gibbs 1994, 2006b). The results of numerous psycholinguistic
experiments, however, have shown this result to be highly questionable. In the majority
of experiments, listeners/readers were found to take no longer to understand the
figurative interpretations of metaphor, metonymy, sarcasm, idioms, proverbs, and indirect
speech acts than to understand equivalent literal expressions, particularly if these are
seen in realistic linguistic and social contexts (see Gibbs 1994,2002 for reviews; see also
the discussion below of psycholinguistic analyses of metaphor understanding for findings
on processing time for conventional versus novel figurative expressions).
Within their increasingly influential framework of relevance theory, Sperber & Wilson
(1995; Wilson & Sperber 2002a, 2002b) present an alternative model of figurative
language understanding that does not presuppose an initial literal interpretation and its
rejection. Instead, metaphors are processed similarly to literal speech in that
interpretive hypotheses are considered in their order of accessibility with the process stopping
once the expectation of "optimal relevance" has been fulfilled. A communicative input
is optimally relevant when it connects with available contextual assumptions to yield
maximal cognitive effects (such as strengthening, revising, and negating such contextual
26. Metaphors and metonymies 603
In the case of Caroline is a princess, the contextually derived ad hoc concept princess*
is narrowed, in a particular context, from the full extension of the encoded concept
princess to cover only a certain type of princess ("spoiled" and "indulged"), excluding other
kinds (e.g. "well-bred", "well-educated", "altruistic", etc.). It is also at the same time a
case of broadening in that the ad hoc concept covers a wider category of people who
behave in self-centered, spoiled manners, but are non-royalty, and whose prototypical
members are coddled girls and young women.
The relevance theoretic account of figurative language understanding thus pivots
on the "literal-loose-metaphorical continuum", whereby no criterion can be found for
distinguishing literal, loose, and metaphorical utterances, at least in terms of inferential
steps taken to arrive at an interpretation that satisfies the expectation of optimal
relevance. What is crucial here is the claim that the fully inferential process of language
understanding unfolds equally for all types of utterances and that the linguistically
encoded content of an utterance (hence, the "decoded" content of the utterance for the
addressee) merely serves as a starting point for inferring the communicator's intended
meaning, even when it is a "literal" interpretation. In other words, strictly literal
interpretations that involve neither broadening nor narrowing of lexical concepts are derived
through the same process of mutually adjusting explicit content (i.e. explicature, as
distinct from the decoded content) with implicit content (i.e. implicature), as in loose and
metaphorical interpretations of utterances (Sperber & Wilson 2008:93). As seen above,
this constitutes a radical departure from the standard pragmatic model of language
understanding, which gives precedence to the literal over the non-literal in the online
construction of the speaker's intended meaning. Relevance theory assumes, instead, that
literal interpretations are neither default interpretations to be considered first, nor are
they equivalent to the linguistically encoded (and decoded) content of given utterances.
The linguistically denoted meaning recovered by decoding is therefore "just one of the
inputs to an inferential process which yields an interpretation of the speaker's meaning"
(Wilson & Sperber 2002b: 600) and has "no useful theoretical role" in the study of verbal
communication (Wilson & Sperber 2002b: 583). Under this genuinely inferential model,
"interpretive hypotheses about explicit content and implicatures are developed partly
in parallel rather than in sequence and stabilize when they arc mutually adjusted so as
to jointly confirm the hearer's expectations of optimal relevance" (Sperber & Wilson
2008: 96; see also Wilson & Sperber 2002b). As an illustration of how the inferential
process may proceed in literal and figurative use of language, let us look at two examples
from Sperber & Wilson (2008).
(1) Peter: For Billy's birthday, it would be nice to have some kind of show.
Mary: Archie is a magician. Let's ask him. (Sperber & Wilson 2008:90)
In this verbal exchange, the word magician can be assumed to be ambiguous in its
linguistic denotation, between (a) someone with supernatural powers who performs magic
(magician,), and (b) someone who does magic tricks to amuse an audience (Magician,).
In the minimal context given, where two people are talking about a show for a child's
birthday party, the second encoded sense is likely to be activated first, under the
assumption that Mary's utterance will achieve optimal relevance by addressing Peter's
suggestion that they have a show for Billy's birthday party. Since magicians in the sense
of Magician2 put on magic shows that children enjoy, this assumption, combined with
608 V. Ambiguity and vagueness
Frame Semantics, and 86 (Kay & Michaelis) Constructional meaning for broader
perspectives on language and cognition within the framework of Cognitive Linguistics).
Moving away from the deviance tradition, Lakoff & Johnson focused on the ubiquity of,
often highly conventionalized, metaphor in everyday speech and what this reveals about
general human cognition. Rather than placing metaphor on the periphery of language,
Lakoff & Johnson argue that it is central to human thought processes and hence central
to language. They defined metaphor as thinking about a concept (the target) from one
knowledge domain in terms of another domain (the source). In particular, they
articulated the notion of embodied meaning, i.e., that human conceptualization is crucially
structured by bodily experience with the physical-spatial world; much of metaphor is a
reflection of this structuring by which concepts from abstract, internal or less familiar
domains are structured by concepts from the physical/spatial domains.This structuring is
held to be asymmetrical.The argument is that many of these abstract or internal domains
have not developed language in their own terms, for instance, English has few domain
specific temporal terms and generally represents time in terms of the spatial domain.
Thus language from the familiar, accessible, intersubjective domains of the social-
spatial-physical is recruited to communicate about the abstract and internal.
This overarching vision is illustrated in Lakoff & Johnson's well known analysis of the
many non-spatial, metaphoric meanings of up and down. They note that up and down
demonstrate a wide range of meanings in everyday English, a few of which are illustrated
below:
(3) a. The price of gas is up/down amount
b. Jane is further up/down in the corporation than most men. power, status
c. Scooter is feeling up/down these days. mood
Their argument is that all these uses of up and down reflect a coherent pattern of human
bodily experience with verticality.The importance of up and down to human thinking is
centrally tied to the particularities of the human body; humans have a unique physical
asymmetry stemming from the fact that we walk on our hind legs and that our most
important organs of perception arc located in our heads, as opposed to our feet.
Moreover, the up/down pattern cannot be attributed simply to these two words undergoing
semantic extension, as the conceptual metaphor is found with most words expressing a
vertical orientation:
(4) a. The price of gas is rising/declining.
b. She's in high/low spirits today.
c. She's over/under most men in the corporation.
This analysis explains a consistent asymmetry found in such metaphors; language does
not express concepts from social/physical/spatial domains in terms of concepts from
abstract or internal domains. For example, concepts such as physical verticality are not
expressed in terms of emotions or amount; a sentence such as He seems particularly
confident cannot be interpreted to mean something like He seems particularly tall, whereas
He's standing tall now does have the interpretation of a feeling of confidence and well
being For Conceptual Metaphor Theorists, the asymmetry of mappings represents a key,
universal constraint on the vast majority metaphors.
26. Metaphors and metonymies 609
In contrast to other approaches, conceptual metaphor theory tends not to consider
metaphoric use of words and phrases on a case-by-case, word-by-word basis, but rather
points out broader patterns of use. Thus, a distinction is made between conceptual
metaphors, which represent underlying conceptual structure, and metaphoric expressions,
which are understood as linguistic reflections of the underlying conceptual structure
organized by systematic cross-domain mappings. The correspondences are understood
to be mappings across domains of conceptual structure. Cognitive semanticists hold that
words do not refer directly to entities or events in the world, but rather they act as access
points to conceptual structure.They also advocate an encyclopedic view of word meaning
and argue that a lexical item serves as an access point to richly patterned memory
(Langackcr 1987,1991,1999,2008; also sec the discussion of Conceptual Blending below).
Lakoff & Johnson argue that such coherent patterns of bodily experience are
represented in memory in what they term 'image schemas', which are multi-modal, multi-
faceted, systematically patterned conceptualizations. Some of the best explored image
schemas include BALANCE, RESISTANCE, BEGINNING-PATH-END POINT, and
MOMENTUM (Gibbs 2006a). Conceptual metaphors draw on these image-schemas and
can form entrenched, multiple mappings from a source domain to a target domain. Her
problems are weighing her down is a metaphorical expression of the underlying
conceptual metaphor difficulties are physical burdens, which draws on the image schematic
structure representing physical resistance to gravity and momentum.The metaphor maps
concepts from the external domain of carrying physical burdens onto the internal domain
of mental states. Such underlying conceptual metaphors are believed to be directly
accessible without necessarily involving any separate processing of a literal meaning
and subsequent inferential processes to determine the metaphorical meaning (Gibbs
1994, 2002). Hence, Conceptual Metaphor Theory questions a sharp division between
semantics and pragmatics, rejecting the traditional stance that metaphoric expressions
arc first processed as literal language and only when no appropriate interpretation can
be found, are then processed using pragmatic principles.The same underlying conceptual
metaphor is found in numerous metaphorical expressions, such as He was crushed by his
wife's illness and Her heart felt heavy with sorrow. Recognition of the centrality of
mappings is a hallmark of Conceptual Metaphor Theory; Grady (2007: 190), in fact, argues
that the construct of cross-domain mapping is "the most fundamental notion of CMT".
Additionally, since what is being mapped across domains is cognitive models, the theory
holds that the mappings allow systematic access to inferential structure of the source
domain, as well as images and lexicon (Coulson 2006). So, the mapping from the domain
of physical weight to the domain of mental states includes the information that the effect
of the carrying the burden increases and decreases with its size or weight, as expressed
in the domain of emotional state in phrases such as Facing sure financial ruin, George felt
the weight of the world settling on him, and Teaching one class is a light work load, and
Everyone's spirits lightened as more volunteers joined the organization and the number of
all night work sessions were reduced. Conceptual metaphor theorists argue that while the
conceptual metaphor is entrenched, the linguistic expression is often novel.
Coulson (2006) further notes that viewing metaphorical language as a
manifestation of the underlying conceptual system offers an explanation for why wc consistently
find limited but systematic mappings between the source domain and target domain
(often termed 'partial projection'). The systematicity stems from the access to higher
order inferential structures within the conceptual domains which allows mappings across
26. Metaphors and metonymies 61J_
physical structure. Moreover, the analysis suggests that while primary metaphors, such as
ORGANIZATION IS COMPLEX PHYSICAL STRUCTURE and FUNCTIONING/PERSISTING IS REMAINING
erect/whole will be found in most languages, the precise cross-domain mappings or
complex metaphors which occur will differ from language to language.
Grady points out that humans regularly observe or experience the co-occurrence of
two separable phenomena, such that the two phenomena become closely associated in
memory. Grady identifies scores of experiential correlations, of which more is up is
probably the most frequently cited. The argument is that humans frequently observe liquids
being poured into containers or objects added to piles of one sort or another.These
seemingly simple acts actually involve two discrete aspects which tend to go unrecognized-
thc addition of some substance and an increase in vertical elevation. Notice that it is
possible to differentiate these two phenomena; for instance, one could pour the liquid on
the floor, thus adding more liquid to a puddle, without causing an increase in elevation.
However, humans are likely to use containers and piles because they offer important
affordances involving control and organization which humans find particularly useful.
The correlation between increases in amount and increases in elevation are ubiquitous
and important for humans and thus form a well entrenched experiential correlation or
mapping between two domains. Another important experiential correlation Grady
identifies is affection is warmth. He argues that many of the most fundamental experiences
during the first few months of life involve the infant experiencing the warmth of a
caretaker's body when being held, nursed and comforted. Thus, from the first moments of
life, the infant begins to form an experiential correlation between the domain of
temperature and the domain of affection. This experiential correlation (or primary
metaphor) is reflected in expressions such as warm smile and warm welcome. Conversely,
lack of warmth is associated with discomfort and physical distance from the caretaker, as
reflected in the expressions cold stare or frigid reception.
The theory of experiential correlation has begun to gain support from a number
of related fields. Psycholinguistic evidence comes from work by Zhong & Leonardelli
(2008) who investigated the affection is warmth metaphor and found that subjects
report experiencing a sense of physical coldness when interpreting phrases such as 'icy
glare'. From the area of child language learning, Chris Johnson (1999) has argued that
caretaker speech often reflects experiential correlations such as knowing is seeing and
that children may have no basis for distinguishing between literal and metaphorical
uses of a word like see. If caretakers regularly use see in contexts where it can mean
either the literal 'perceive visually', as in "Let's see what is in the bag" or the
metaphorical iearn; find out', as in "Let's see what this tastes like", children may develop
a sense of the word which conflates literal and metaphorical meanings. Only at some
later stage of development, they come to understand there are two separate uses and
go through a process Johnson terms deconflation. Thus, Johnson argues that the
conceptual mappings between experiential correlates are reinforced in the child's earliest
exposure to language.
Experiential correlation and the theory of primary metaphors was also a major impetus
for the "Neural Theory of Language" (Lakoff & Johnson 1999; Narayanan 1999).This is
a computational theory which has successfully modeled inferences arising from mcta-
phoric interpretation of language. Experiential correlations are represented in terms of
computational 'neural nets'; correlation mappings are treated as neural circuits linking
representations of source and target concepts. Consistent with a usage-based, frequency
612 V. Ambiguity and vagueness
driven model of language, the neural net model assumes that neural circuits are
automatically established when a perceptual and a nonperceptual concept are repeatedly
co-activated.
The theory of experiential correlation moves our understanding of metaphor past
explanations based only on comparison or similarity or ad hoc categories. It addresses
the conundrum of how it is that much metaphoric expression, such as 'warm smile',
involves two domains that have no immediately recognizable similarities.
Grady does not claim that all metaphor is based on experiential correlation. In
addition to primary metaphors and their related complex metaphors, he posits another type,
resemblance metaphor, to account for expressions such as Achilles is a lion. These are
considered one-shot metaphors that tend to have limited extensions.They involve
mappings based on conventionalized stereotypes, here the western European stereotype that
lions are fierce, physically powerful and fearless. Key to this account is Grady's
adherence to the tenets of lexical meaning being encyclopedic in nature (thus lion acts as an
access point to the conventional stereotypes) and mappings across conceptual domains.
Clearly, resemblance metaphors with their direct link to conventional stereotypes are
expected to be culturally-specific, in contrast to primary metaphors which are predicted
to be more universal. Grady (1999b) has analyzed over 15 languages, representing a wide
range of historically unrelated languages, for the occurrence of several primary
metaphors such as big is important. He found near universal expression of these metaphors
in the languages examined.
The work by Grady and his colleagues addresses many of the criticisms of Lakoff &
Johnson (1980). By setting up a typology of metaphors, conceptual metaphor theory can
account for the occurrence of both universal and language specific metaphors. Within a
specific language, the notion of complex metaphors, based on multiple primary
metaphors, offers an explanation for systematic, recurring metaphorical patterns, including
a compelling account of partial projection from source to target. The construct of
experiential correlation also provides a methodology for distinguishing between the three
proposed types of metaphor. Primary metaphors and their related complex metaphors
are strictly asymmetrical in their mappings and ultimately grounded in ubiquitous,
fundamental human experiences with the world. In contrast, resemblance metaphors are
not strictly asymmetrical. So we find not only Achilles is a lion but also The lion is the
king of beasts. Moreover, metaphor theorists have not been able to find a basic bodily
experience that would link lions (or any fierce animals) with people. Finally, identifying
the three categories allows an coherent account of embodied metaphor without having
to entirely abandon the insight that certain metaphors, such as 'My wife's waist is an
hourglass' or 'Achilles is a lion', are based on some sort of comparison.
Lakoff & Johnson (1980) also hypothesized that certain idioms, such as 'spill the
beans' were transparent (or made sense) to native speakers because they are linked to
underlying conceptual structures, such as the CONTAINER image schema. Keysar &
Bly (1995,1999) question the claim that such transparent idioms can offer insights into
cognition.They argue that the native speakers' perception of the relative transparency of
an idiom's meaning is primarily a function of what native speakers believe the meaning
of the idiom to be.This is essentially a claim for a language based, rather than
conceptually based meaning of idioms and metaphors. Keysar & Bly carried out a set of
experiments in which they presented subjects with attested English idioms which could be
analyzed as transparent, but which had fallen out of use. Examples included 'the goose
614 V. Ambiguity and vagueness
This framework holds that words do not refer directly to entities in the world, but rather
they act as prompts to access conceptual structure and to set up short term, local
conceptual structures called mental spaces. Mental spaces can be understood as'temporary
containers for relevant, partial information about a particular domain' "(Coulson 2006: 35).
The key architecture of the theory is the conceptual integration network which is 'an array
of mental spaces in which the processes of conceptual blending unfold' " (Coulson 2006:
35). A conceptual integration network is made up of two or more input spaces structured
by information from distinct domains, a generic space that contains 'skeletal' conceptual
structure that is common to both input spaces, and a blended space. Mappings occur across
the input spaces as well as selectively projecting to the blended space.
One of the most often cited examples of a conceptual blend is That surgeon is a
butcher, which is interpreted to mean that the surgeon is incompetent and careless.
Neither the surgeon space nor the butcher space contains the notion of incompetence. The
key issue is how to account for this emergent meaning. Both spaces contain common
structure such as an agent, an entity acted upon (human versus animal), goals (healing
the patient versus severing flesh), means for achieving the goals (precise incisions
followed by repair of the wound versus slashing flesh, hacking bones), etc. Analogy
mappings project across the two input spaces linking the shared structure. The notion of
incompetence arises in the blended space from the conflict between the goal of healing
the patient, which projects from the surgeon space, and the means for achieving the goal,
slashing flesh and hacking, which projects from the butcher space. Swcctscr (1999) has
also analyzed noun-noun compounds such as land yacht (large, showy, luxury car) and
couch potato (person who spends a great deal of time sitting and watching TV) as blends.
Such noun-noun compounds are particularly interesting as the entities being referred to
are not yachts or potatoes of any kind. Conceptual blending theory subsumes conceptual
metaphor, with its commitment to embodied meaning. Its adherents also argue that the
processes occurring in non-literal language and a variety of other phenomenon, such as
conditionals and diachronic meaning extension, are essentially the same as those needed
to explain the interpretation of literal language.
Over the past decade, there has been a growing convergence between cognitive
semanticists and relevance theorists in the area of metaphor analysis (see Gibbs &
Tendahl 2006). For instance, as we saw above, the latest versions of relevance theory (e.g.,
Wilson & Sperber 2002a, 2002b; Wilson & Carston 2006; Sperber & Wilson 2008) now
posit entries for lexical items which include direct access to 'encyclopedic assumptions'
which are exploited in the formation and processing of metaphors and metonymies, a
position which is quite similar to Langacker's (1987) analysis of words being prompts to
encyclopedic knowledge.
Wilson & Carston (2006:425) also accept the analysis that physical descriptions such
as 'hard, rigid, inflexible, cold', etc apply to human psychological properties through
inferential routes such as 'broadening' to create superordinate concepts (hard*, rigid*,
cold*, etc) which have both physical and psychological instances. They further argue
that ad hoc, on the fly categories can be constructed through the same process of
broadening. For instance, they posit that the process of broadening provides an inferential
route to the interpretation of Robert is a bulldozer, where the encoded concept
bulldozer has the logical feature, i.e. encoded meaning, machine of a certain kind and the
following encyclopedic assumptions:
26. Metaphors and metonymies 61_5
(5) a. large, powerful, crushing, dangerous to bystanders, etc.
b. looks like this (xxx); moves like this (yyy), etc.
c. goes straight ahead regardless of obstacles
d. pushes aside obstructions; destroys everything in its path
e. hard to stop or resist from outside; drowns out human voices, etc.
They continue, "Some of these encyclopedic features also apply straightforwardly to
humans. Others, (powerful, goes straight ahead ...) have both a basic, physical sense
and a further psychological sense, which is frequently encountered and therefore often
lexicalized" (Wilson & Carston 2006:425).
However, the "inferential" theory of metaphor skirts the issue of how the very process
of creating "ad hoc" concepts (such as 'hard to stop') and extended lexicalized concepts
(such as cold* or rigid*) may be motivated in the first place. The qualities and
consequences of physical coldness manifested by ice are not the same as the affectations of
lack of caring or standoffishness which are evoked in the interpretation of an expression
such as Sally is a block of ice. The physical actions and entities involved in a scenario of a
bulldozer moving straight ahead crushing physical obstacles such as mounds of dirt and
buildings is manifestly different than the actions of a person and the entities involved in
a situation in which a person ignores the wishes of others and acts in an aggressive and
obstinate manner. As noted above, it remains unclear how the notions of narrowing and
broadening are 'straight forward inferences' in cases like Robert is a bulldozer, where the
key psychological properties such as "aggressive" and "obstinate" cannot be drawn from
the lexical entry of "bulldozer" or its encyclopedic entry because such properties simply
are not part of our understanding of earth moving machines.
Another gap in the relevance theory approach, and indeed all approaches that treat
metaphor on a word-by-word basis, is the failure to recognize that most metaphors are
part of very productive, systematic patterns of conceptualization based on embodied
experience. For instance, Wilson & Carston argue that block of ice has a particular set
of encyclopedic assumptions, such as solid, hard, cold, rigid, inflexible, and unpleasant to
touch or interact with, associated with it. This treatment fails to systematically account
for the fact that the words icy, frigid, frosty, chilly, even cool, are all used to express the
same emotional state. In contrast, conceptual metaphor theory accounts for these
pervasive patterns through the construct of experiential correlation and embodied meaning.
Similarly, the account of Robert is a bulldozer outlined above fails to recognize the
underlying image schemas relating to force dynamics, such as momentum and movement along
a path, that give rise to any number of conceptually related metaphoric expressions to
describe a person as obstinate, powerful and aggressive. Robert could have also been
described as behaving like a run away freight train or as bowling over or mowing down
his opponents to the same effect, even though the dictionary entries (and their related
encyclopedic entries) for freight trains, bowling balls, and scythes or lawn movers would
have very little overlap with that of a bulldozer.
6. Metonymy
Over the centuries, metonymy has received less attention than metaphor. Indeed,
Aristotle made no clear distinction between the two. Traditionally, metonymy was defined
VI. Cognitively oriented approaches
to semantics
27. Cognitive Semantics: An overview
1. Introduction
2. The semantics of grammar
3. Schematic structure
4. Conceptual organization
5. Interactions among semantic structures
6. Conclusion
7. References
Abstract
The linguistic representation of conceptual structure is the central concern of the two-to-
three decades old field that has come to be known as "cognitive linguistics". Its approach
is concerned with the patterns in which and processes by which conceptual content is
organized in language. It addresses the linguistic structuring of such basic conceptual categories
as space and time, scenes and events, entities and processes, motion and location, and force
and causation. To these it adds the basic ideational and affective categories attributed to
cognitive agents, such as attention and perspective, volition and intention, and expectation
and affect. It addresses the semantic structure of morphological and lexical forms, as well
as of syntactic patterns. And it addresses the interrelationships of conceptual structures,
such as those in metaphoric mapping, those within a semantic frame, those between text
and context, and those in the grouping of conceptual categories into large structuring
systems. Overall, its aim is to ascertain the global integrated system of conceptual structuring
in language.
1. Introduction
The linguistic representation of conceptual structure is the central concern of the two-
to-three decades old field that has come to be known generally as "cognitive linguistics"
through such defining works as Fauconnier (1985), Fauconnier & Turner (2002), Fillmore
(1975,1976), Lakoff (1987,1992), Langacker (1987,1991), and Talmy (2000a, 2000b), as
well as through edited collections like Geeraerts & Cuyckens (2007). This field can first
be characterized by contrasting its "conceptual" approach with two other approaches,
the "formal" and the "psychological". Particular research traditions have largely based
themselves within one of these approaches, while aiming - with greater or lesser success
- to address the concerns of the other two approaches.
The formal approach focuses on the overt structural patterns exhibited by linguistic
forms, largely abstracted away from or regarded as autonomous from any associated
meaning. This approach thus includes the study of syntactic, morphological, and
morphemic structure. The tradition of generative grammar has been centered in the formal
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 622-642
27. Cognitive Semantics: An overview 623
approach. But its relations to the other two approaches have remained limited. It has all
along referred to the importance of relating its grammatical component to a semantic
component, and there has indeed been much good work on aspects of meaning, but this
enterprise has generally not addressed the overall conceptual organization of language. The
formal semantics that has been adopted within the generative tradition (e.g., Lappin 1997)
has largely included only enough about meaning to correlate with the formal categories
and operations that the main body of the tradition has focused on. And the reach of
generative linguistics to psychology has largely considered only the kinds of cognitive structure
and processing that might be needed to account for its formal categories and operations.
The psychological approach regards language from the perspective of general cognitive
systems such as perception, memory, attention, and reasoning. Centered in this approach,
the field of psychology has also addressed the other two approaches. Its conceptual
concerns (see e.g., Neely 1991) have in particular included semantic memory, the associativity of
concepts, the structure of categories, inference generation, and contextual knowledge. But
it has insufficiently considered systematic conceptual structuring - the global integrated
system of schematic structures with which language organizes conceptual content.
By contrast, the conceptual approach of cognitive linguistics is concerned with the
patterns in which and processes by which conceptual content is organized in language.
It has thus addressed the linguistic structuring of such basic conceptual categories as
space and time, scenes and events, entities and processes, motion and location, and force
and causation.To these it adds the basic ideational and affective categories attributed to
cognitive agents, such as attention and perspective, volition and intention, and
expectation and affect. It addresses the semantic structure of morphological and lexical forms,
as well as of syntactic patterns. And it addresses the interrelationships of conceptual
structures, such as those in metaphoric mapping, those within a semantic frame, those
between text and context, and those in the grouping of conceptual categories into large
structuring systems. Overall, the aim of cognitive linguistics is to ascertain the global
integrated system of conceptual structuring in language.
Cognitive linguistics, further, addresses the concerns of the other two approaches to
language. First, it examines the formal properties of language from its conceptual
perspective. Thus, it aims to account for grammatical structure in terms of the functions
this serves in the representation of conceptual structure. Second, as one of its most
distinguishing characteristics, cognitive linguistics aims to relate its findings to the
cognitive structures that concern the psychological approach. It aims both to help account
for the behavior of conceptual phenomena within language in terms of those
psychological structures, and at the same time, to help work out some of the properties of those
structures themselves on the basis of its detailed understanding of how language realizes
them. It is this trajectory toward unification with the psychological that motivates the
term "cognitive" within the name of this linguistic tradition. In the long term, its aim is to
integrate the linguistic and the psychological perspectives on cognitive organization in a
unified understanding of human conceptual structure.
With its focus on the conceptual, cognitive linguistics regards "meaning" or
"semantics" simply as conceptual content as it is organized by language. Thus, general
conception as experienced by individuals - i.e., thought - includes linguistic meaning within
its greater compass. And while linguistic meaning - whether that expressible by an
individual language or by language in general - apparently involves a selection from or
constraints upon general conception, it is nevertheless qualitatively of a piece with it.
624 VI. Cognitively oriented approaches to semantics
Cognitive linguistics is as ready as other linguistic approaches to represent an aspect
of language abstractively with a symbolic formula or schematic diagram, provided that
that aspect is judged both to consist of discrete components in crisp relationships and to
be clearly understood. But most cognitive linguists share the sensibility that such formal
representations poorly accord with the gradients, partial overlaps, interactions that lead
to mutual modification, processes of fleshing out, inbuilt forms of vagueness, and the like
that they observe in semantics. They instead aim to set forth such phenomena through
descriptive means that provide precision and rigor without formalisms.They further find
that formal accounts present their representations of language organization with
premature exhaustiveness and mistakenly uniform certainty. We might propose developing
a field of "thcoryology" that taxonomizes types of theories, according to their defining
properties. In such a field, a formal theory of language, at any given phase of its grasp of
language phenomena, would be of the type that requires encompassive and perfected
mechanisms to account for those phenomena. But cognitive linguistics rests on a type of
theory that, at its foundation, builds in gradients for the stage of development to which
any given aspect of language under analysis has been brought, and for the certainty with
which the analysis is held.
While cognitive linguists largely share the approach to language outlined here, they
may differ in more limited respects. For example, Ronald Langacker generally stresses
the contribution of every morpheme and construction in a sentence to the unified
meaning of the sentence, while George Lakoff and I sec certain interactions among the
elements as overriding such contributions. And Lakoff stresses a theory of embodiment
that Langacker and I break into subtheories and in part challenge (see section 5.1).
Terminologically, "cognitive linguistics" refers to the field as a whole. Within that
field, "cognitive grammar" is largely associated with Langacker's work, while "cognitive
semantics" is largely associated with my own work (the main focus below), though it is
sometimes used more cxtcndcdly.
Externally, cognitive linguistics is perhaps closest to functional linguistics (e.g., Givon
1989). Discourse is central to the latter while more peripheral to the former, but both
approach language with similar sensibilities. Jackendoffs approach is comparable in
spirit to cognitive linguistics, as seen in article 30 (Jackendoff) Conceptual Semantics.
Thus, he assumes a mentalist, rather than a cognition-avoiding logical, basis for meaning.
And he critiques the approaches of Fodor, Wierzbicka, and Levin and Rappaport (see
article 19 (Levin & Rappaport Hovav) Lexical Conceptual Structure) much as docs
cognitive linguistics. But his reliance on an algebraic features-based formalism to
represent meaning differs from the cognitive-linguistic view of its inadequacy in handling the
semantic gradience and modulation cited above. And his privileging of spatial structure
in semantics is at variance with the significance that cognitive linguistics sees in such
further domains as temporal structure, force-dynamic/causal structure, cognitive state
(including purpose, expectation, affect, familiarity Hovav), and reality status (including
factual, counterfactual, conditional, potential), as well as domains that he himself cites,
like social relations.
2. The semantics of grammar
To turn to the specific contents of Cognitive Semantics, then, this outline opens with the
semantics of grammar because it is the key to conceptual structuring in language. A
27. Cognitive Semantics: An overview 625
universal design feature of languages is that their meaning-bearing forms are divided
into two different subsystems, the open-class, or lexical, and the closed-class, or
grammatical (see Talmy 2000a: ch. 1). Open classes have many members and can readily add many
more.They commonly include (the roots of) nouns, verbs, and adjectives. Closed classes
have relatively few members and are difficult to augment. They include bound forms -
inflections, derivations, and clitics and such free forms as prepositions, conjunctions, and
determiners. In addition to such overt closed classes, a language can have certain implicit
closed classes such as word order patterns, a set of lexical categories (e.g., nounhood,
verbhood, etc. per se), a set of grammatical relations (e.g., subject status, direct object
status, etc.), and grammatical constructions.
2.1. Semantic constraint on grammar
Within this formal distinction, the crucial semantic finding is that the meanings that open-
class forms can express are virtually unrestricted, whereas those of closed-class forms are
highly constrained. This constraint applies both to the conceptual categories they can
refer to and to the particular member notions within any such category. For example,
many languages around the world have closed-class forms in construction with a noun
that indicate the number of the noun's referent, but no languages have closed-class forms
indicating its color. And even closed-class forms referring to number can indicate such
notions as singular, dual, plural, paucal, and the like, but never such notions as even, odd,
a dozen, or countable. By contrast, open-class forms can refer to all such notions, as the
very words just used demonstrate.
The total set of conceptual categories with their member notions that closed-class
forms can ever refer to thus constitutes an approximately closed inventory. Individual
languages draw in different patterns from this universally available inventory for their
particular set of grammatically expressed meanings.The inventory is graduated, progressing
from categories and notions that may well appear universally in all languages (a
candidate is "polarity" with its members 'positive' and 'negative'), through ones appearing in
many but not all languages (a candidate is "number"), down to ones appearing in just a
few languages (an example is "rate" with its members 'fast' and 'slow').
2.2. Topological principle for grammar
The next issue is what determines the conceptual categories and member notions
included in the inventory as against those excluded from it. No single global principle
is evident, but several semantic constraints with broad scope have been found. One
of these, the "topology principle", applies to the meanings - or "schemas" - of closed-
class forms referring to space, time or certain other domains. This principle largely
excludes Euclidean properties such as absolutes of distance, size, shape, or angle from
such schemas. Instead, these schemas exhibit such topological properties as "magnitude
neutrality" and "shape neutrality".
To illustrate magnitude neutrality, the spatial schema of the English preposition
across prototypically represents motion along a path from one edge of a bounded plane
perpendicularly to its opposite. But this schema is abstracted away from magnitude so
the preposition can be used equally well in The ant crawled across my palm, and in The
bus drove across the country. Likewise in time, the temporal schema of the past tense
27. Cognitive Semantics: An overview 627
example), each with its own distinctive schematic structure. Such categories also largely
share certain structural properties, such as the capacity for converting from one category
member to another, the multiple nesting of such conversions, and a structural parallelism
between objects in space and events in time. At a second level, these categories join in
extensive "schematic systems" that structure major sectors of conception. Four of these
schematic systems are outlined next.
3.1. Configurational structure
One schematic system,"configurational structure",comprehends all the respects in which
closed-class schemas represent structure for space or time or other conceptual domains
often in virtually geometric patterns (seeTalmy 2000a: ch. 3,2003,2006; Herskovits 1986;
article 98 (Pederson) The expression of space; article 107 (Landau) Space in semantics
and cognition). It thus includes much that is within the schemas represented by spatial
prepositions, by temporal conjunctions, and by tense and aspect markers, as well as by
markers that otherwise interact with open-class forms with respect to object and event
structure. This last type of schema is seen in the categories of "plexity" and "state of
boundedness", treated next.
3.1.1. Plexity
The conceptual category of plexity pertains to a quantity's state of articulation into
equivalent elements. Its two main member notions arc "uniplcxity" and "multiplcxity".
The novel term "plexity" was chosen to capture an underappreciated generalization
present across the traditional categories of "number" for objects in space and "aspect"
for events in time. In turn, uniplexity thus covers both the singular and the semelfactive,
while multiplexity covers both plural and iterative. If an open-class form is intrinsically
lexicalized for a uniplex referent, a closed-class form in construction with it can trigger a
cognitive operation of "multiplexing" that copies its original solo referent onto various
points of space or time.Thus, in English, the noun bird and the verb (to) sigh intrinsically
have a uniplex referent. But this can be multiplexed by adding -s to the noun, as in birds,
or by adding keep -ing to the verb, as in keep sighing. (True, English keep is open-class,
but parallel forms in other languages arc closed-class iterative forms.)
An operation of "unit excerpting" can perform the reverse conversion from
multiplexity to uniplexity on an intrinsically multiplex open-class form. In English, this
operation is performed only by grammatical complexes, as in going from furniture to (a) piece
of furniture or from breathe to take a breath. But other languages have simplex forms.
Thus, Yiddish goes from groz 'grass' to (a) grezl '(a) blade of grass'. And Russian goes
from dixat' 'sneeze a multiplex number of times' to cixnut' 'sneeze once'.
3.1.2. State of boundedness
A second conceptual category is "state of boundedness", with two main member notions,
"unboundedness" and "boundedness". An unbounded quantity is conceptualized as able
to continue on indefinitely without intrinsic finiteness. A bounded quantity is
conceptualized as an individuated unit entity with a boundary around it. As with plexity, these new
terms are intended to capture the commonality across the space and time domains, and
630 VI. Cognitively oriented approaches to semantics
viewing, or family of viewings, points ahead to a subsequent wife or wives following
John's marriage with the woman at the punchbowl. Finally, a perspective point of the
speaker at the present moment of speech is established by the past tense of the main
verb was. It is this perspective point at which the speaker's cumulative knowledge of
the reported sequence of events is stored as memory and, in turn, which functions as
the origin of a retrospective direction of viewing over the earlier sequence. The earlier
perspective points are here nested within the scope of the viewing from the current
perspective point.
3.3. Distribution of attention
A third schematic system, "distribution of attention", directs a listener's attention
differentially over the structured scene from the established perspective point (see Talmy
2000a: ch. 4,2007). Grammatical and other devices set up regions with different degrees
of salience, arrange these regions into different patterns, and map these patterns in
one or another way over the components of the structured scene. Several patterns are
outlined here.
3.3.1. Focal attention
One attentional arrangement is a center-surround pattern with the center foregrounded
as the focus and with the surround backgrounded. The grammatical relation of subject
status can direct focal attention to the referent of the subject nominal, and alternative
selections of subject can place the center of the pattern over different referents, even
ones within the same event.Thus, focal attention can be mapped either onto the seller in
a commercial transaction, with lesser attention on the remainder, as in The clerk sold the
vase to the customer, or onto the buyer, with lesser attention on the new remainder, as in
The customer bought the vase from the clerk (see Talmy 2000a: ch. 1).
For another realization of this pattern, Fillmore's (1976) term "frame" and Langacker's
(1987) term "base" refer to a structured set of coentailed concepts in the attentional
background. Their respective terms "highlighting" and "profiling" then refer to the
foregrounding of the portion of the set that a morpheme refers to directly. A Husserl (1970)
example can illustrate. The nouns husband and wife both presuppose the conception of
a married couple in the background of attention, while each focuses attention on one or
the other member of such a pair in the foreground.
3.3.2. Level of synthesis
In expressions referring to the same scene, different grammatical forms can direct greater
attention to either of two main "levels of synthesis" or of granularity, the Gestalt level or
the componential level. Thus, the head status of pyramid in the sentence The pyramid of
bricks came crashing down, raises its salience over that of bricks with its dependent status.
More attention is at the Gestalt level of the whole pyramid, conceptually tracking its
overall movement. But the dependency relations are reversed in the sentence The bricks
in the pyramid came crashing down. Here, more attention is at the componential level of
the constituent bricks, tracking their multiple movements (see Talmy 2000a: ch. 1).
27. Cognitive Semantics: An overview 631_
3.3.3. Window of attention
A third pattern is the "window of attention". Here, one or more (discontinuous) portions
of a referent scene are foregrounded in attention (or "windowed") by the basic device
of their explicit mention, while the remainder of the scene is backgrounded in attention
(or "gapped") by their omission from mention. To illustrate, the sentence The pen kept
rolling off the uneven table conveys the conception of an iterating cycle in which a pen
progresses through the phases of lying on a table, falling down, lying on the ground, and
being placed back on the table. But the overt linguistic material refers only to the
departure phase of the pen's cyclic path. Accordingly, only this portion of the total referent is
foregrounded in attention, while the remainder of the cycle is relatively backgrounded.
With enough context, the alternative sentence / kept placing the pen back on the uneven
table could refer to the same cycle. But here, the presence of overt material referring
to the return phase of that cycle foregrounds that phase in attention, while now the
departure phase is backgrounded (see Talmy 2000a: ch. 4).
3.3.4. Attentional nesting
Nesting was shown for configuration and for perspective, and it can also be seen in the
schematic system of attention. It appears in the second of the following two sentences:
The customer bought a vase. /The customer was sold a vase. In the second sentence, focal
attention is first directed to the seller by the lexical choice of sell but is then redirected to
the buyer by the passive voice. If this redirection of attention were total, then the second
sentence would be scmantically indistinguishable from the first sentence, but in fact it is
not. Rather, the redirection of attention is only partial: it leaves intact the foregrounding
of the seller's active intentional role, but it shifts the main focus onto the buyer as target.
Altogether, then, it can be said that attention on the seller is hierarchically embedded
within a more dominant attention on the buyer.
4. Conceptual organization
In addition to schematic systems, language has many other forms of extensive and
integrated conceptual organization, such as the three presented next. Although Figure/
Ground organization and factive/fictive organization could be respectively
comprehended under the attentional and the configurational schematic systems, and force
dynamic organization has elsewhere been treated as a fourth schematic system, these are
all extensive enough and cut across enough distinctions to be presented here as separate
bodies of conceptual organization.
4.1. Figure/ground organization
In representing many spatial, temporal, equational, and other situations, language is so
organized as to single out two portions of the situation, the "Figure" and the "Ground",
and to relate the former to the latter (see Talmy 2000a: ch. 5). In particular, the Figure is
a conceptually movable entity; its location or path is conceived as a variable whose
particular value is at issue.The Ground is a reference entity with a stationary setting relative
to a reference frame; the Figure's variable is characterized with respect to it.
27. Cognitive Semantics: An overview 633
4.2.2. Emanation paths
Another category of Active motion, "emanation paths", involves the Active
conceptualization of an intangible line emerging from a source object, passing in a straight
line through space, and terminating on a target object, where factively nothing is in
motion. In one subtype,"demonstrative paths", a directed line emerges from the pointed
front of a source object. This is seen in The arrow points toward / past / away from the
town.
In the "radiation paths" subtype, a beam of radiation emanates from a radiant object
and terminates on an irradiated object. This is seen in Light shone from the sun into the
cave. It might be claimed that photons do factively emanate from a radiant object, so that
Active motion need not be invoked. However, we do not see photons, so any
representation of motion is cognitively imputed. In any case, in a related subtype, "shadow paths",
none will claim the existence of "shadowons", and yet once again Active motion is seen
in a sentence like The pole threw its shadow on the wall.
Finally, a "sensory path" is represented as moving from the experiencer to the
experienced object in a sentence like / looked into /past / away from the tunnel. Such an
emanating "line of sight" can also be represented as moving laterally. Both these forms
of Activity - Arst lateral, then axial - are represented in / slowly looked down into the
well.
One question for this Active category, though, is what determines the direction of
the intangible emanation. Logically, since motion is imagined, it should be possible to
conceptualize a reversed path. Attempts at representing such reversed paths appear in
sentences like *Light shone from my hand onto the sun, or *The shadow jumped from
the wall onto the pole, or */ looked from that distant mountain into my eyes. But such
formulations do not exist in any language that represents such events Actively. Rather,
an "active-determinative" principle appears to govern the direction of emanation. Of the
two objects, the more active or determinative one is conceptualized as the source. Thus,
relative to my hand, the sun is brighter, hence, more active, and must be treated as the
source of radiative emanation. My agency in looking is more active than the inanimate
perceived object, so I am treated as the source of sensory emanation. And the pole is
more determinative - I can move the pole and the shadow will also move, but I cannot
perform the opposite operation of moving the shadow and getting the pole to move - so
the pole is treated as the source of shadow emanation.
4.3. Force dynamics
Language has an extensive conceptual system of "force dynamics" for representing
the patterns in which one entity, the "Agonist", has force exerted on it by another
entity, the "Antagonist" (see Talmy 2000a:ch. 7). It covers such concepts as an Agonist's
natural tendency toward action or rest, an Antagonist's opposition to such a tendency,
the Agonist's resistance to this opposition, and the Antagonist's overcoming of such
resistance. It includes the concepts of causing and letting, helping and hindering, and
blockage and the removal of blockage. It generalizes over the causative concepts of
traditional linguistics, placing them naturally within a matrix of Aner distinctions. It
also cuts across conceptual domains, from the physical, to the psychological, to the
social, as illustrated next.
634 VI. Cognitively oriented approaches to semantics
4.3.1. The physical domain
A contrast between two sentences can illustrate the physical domain. The sentence The
ball rolled along the green represents motion in a force-dynamically neutral way. But The
ball kept rolling along the green adds force dynamics to the otherwise same spatial
movement. In fact, it has readings for two different force dynamic patterns. Interpreted under
the "extended causing of motion" pattern, the ball as Agonist has a natural tendency
toward rest but is being overcome by a stronger Antagonist such as the wind.
Alternatively, interpreted under one of the "despite" patterns, the ball as Agonist has a natural
tendency toward motion and is overcoming a weaker Antagonist such as stiff grass - that
is, it moves along despite opposition from the grass.
4.3.2. The psychological domain
An individual's psyche can be conceptualized and linguistically represented as a "divided
self" in which two different components are in force dynamic opposition.To illustrate, the
sentence I didn't respond is force dynamically neutral. But the sentence I refrained from
responding, though it still represents a lack of response, now adds in the force dynamic
pattern "extended causing of rest". Specifically, a more central part of mc, the Agonist,
has a tendency toward responding, while a more peripheral part of me, the Antagonist,
opposes this tendency, is stronger, and so blocks a response.The two opposing parts are
explicitly represented in the corresponding sentence I held myself back from responding.
4.3.3. The social domain
Much as the closed-class category of prepositions is largely associated with a specific
semantic category, that of paths or sites in relation to a Ground object, so the closed-class
category of modals is largely associated with the semantic category of force dynamics -
in particular, with its social application. Here, certain interpersonal interactions,
mediated solely through communication, can be metaphorically represented in terms of force
or pressure exerted by one individual or group on another.
For example, must, as in You must go to school, represents one of the "causing" force
dynamic patterns between individuals. It sets the subject up as an Agonist whose desire -
taken as a kind of tendency - is to do the opposite of the predicate's referent. And it sets
up an implicit Antagonist - for example, I, your mother, people at large - that exerts
psychological pressure on the Agonist toward performance of the undesired action.
The modal may, as in You may go to the playground, instead represents a "letting"
force dynamic pattern. Here, the subject as Agonist has a desire or tendency toward the
stated action that could have been blocked by an implicit stronger Antagonist, but this
potential blockage is withheld.
5. Interactions among semantic structures
The preceding discussion has mostly dealt with conceptual structures each in its own
terms. But a major aspect of language organization is that conceptual structures, from
small to large, can also interact with each other in accordance with certain principles.
Such interactions are grouped together below under four extensive categories.
27. Cognitive Semantics: An overview 635
5.1. Conceptual imposition
A widespread view about the contents and structures of cognition is that they ultimately
derive from real properties of external phenomena, through processes of perception and
abstraction, in what John Searle has called the "world-to-mind direction of fit". While
acknowledging such processes, cognitive linguistics calls attention instead to intrinsic
content and structure in cognition - presumably mainly of innate origin - and to how
extensive they are. Such native cognitive properties certainly apply to the general
functioning of cognition. But they also apply in many forms of "conceptual imposition" -
the imputation of certain contents and structures to our conceptions and perceptions of
the world in a "mind-to-world direction of fit". Several realizations of such conceptual
imposition are outlined next.
5.1.1. The imputation of content or structure
An initial non-linguistic example of autochthonous cognition is "affect". Emotions such
as anger or affection are experienced either as such or as applied to outside entities.
But it is difficult to see how such feelings could arise from a process of abstraction from
the external world. Linguistic examples of course abound. Fictive motion offers some
immediately striking ones, such as the "shadow path" in The pole threw its shadow onto
the wall. As described in 4.2.2, the literal wording here depicts a movement from pole to
wall that is not overtly perceived as occurring "out there". That is, at least the language-
related portion of our cognition imposes the conceptualization of motion onto what
would be perceived as static.
Actually, though, virtually all the semantic structures described so far are forms of
conceptual imposition. Thus, in the scene represented by the sentence The post office
is near the bank, based on the discussion in 4.1, it could hardly be claimed that the post
office is inherently Figure-like and the bank Ground-like, beyond our cognitive
imputation of those roles to those objects. And houses dispersed over a valley, as described in
3.2.2, could scarcely possess an associated moving or stationary perspective point from
which they arc viewed, apart from the linguistic forms that ascribe such perspective to
the represented scene.
5.1.2. Alternatives of conceptualization
A consequence of the fact that a particular structural or contentful conception can
be imputed to a phenomenon is that a range of alternative conceptions could also be
imputed to it. These are alternatives of what my work has termed the
"conceptualization" and Langackcr's has termed the "construal" of a phenomenon. For example, as
seen in 4.2.1, the fictive motion that could be imputed to a fence along one coextension
path, as in The fence goes from the plateau down into the valley could also be imputed to
it along the reverse path, as in The fence goes from the valley up onto the plateau.
Or consider the deictics this and that, which establish a conceptual boundary in space
and depict an indicated object as being respectively either on the speaker's side of the
boundary or on the side opposite the speaker. Then, referring to the exact same bicycle
standing, say, some 8 feet away, a speaker could opt to say either This bike is in my way,
or That bike is in my way.The speaker can thus impose alternatives of conceptualization
636 VI. Cognitively oriented approaches to semantics
on the scene, imputing a conceptual boundary either between himself and the bike or on
the other side of the bike.
5.1.3. Embodiment
The notion of "embodiment" extends the idea of conceptual imposition. It assumes that
such imposed concepts are largely based on experiences humans have of their bodies
interacting with environments or on psychological or neural structure. It proposes that
such experiences are imputed to, or form the basis of, our understanding of most
phenomena (Lakoff & Johnson 1999). In my view, though, the linguistic literature has largely
applied the blanket term "embodiment" to a range of insufficiently distinguished ideas
that differ in their validity. Four such distinct ideas of embodiment in current use are
outlined here.
First, in what might be called the "bulk encounter" idea of embodiment, phenomena
are grouped and categorized in terms of the way in which our bodies - with their
particular shape and mesoscopic size - can interact with them. But this idea is either incorrect
or limited. For example, many languages have closed-class representation for a linear
configuration, as English does with the preposition along. This preposition applies to a
Ground object schematizable as linear. But due to magnitude neutrality (see 1.2), this
schema can be applied to objects of quite different sizes, as in The ant climbed up along
the matchstick, and The squirrel climbed up along the tree trunk. Yet, although the along
schema can group a matchstick and a tree trunk together, we bodily interact with those
objects in quite different ways. Accordingly, the bulk encounter idea of embodiment does
not account for this and perhaps much else in the structure of linguistically represented
conception.
Second, in what could be called the "neural infrastructure" idea of embodiment,
it is the organization and operation of our neural structure that determines how we
conceptualize phenomena. Thus, the linear schematization just cited might arise from
ncurally based processes of visual perception that function to abstract out just such one-
dimensional contours. Comparably, the concept evoked on hearing a word such as bicycle
or coffee might arise from the reactivation of the visual, motor, and olfactory areas that
were previously active during interaction with those objects, in the manner of Damasio's
"convergence zones".The problem with this idea of embodiment is that, although
generally correct, it is simply subsumed by psychology and needs no separate statement of its
own.
Third, in what could be called the "concreteness as basic" idea of embodiment, the
view is that experience with the tangible world is developmentally earlier and provides
the basis for later conceptions of intangible phenomena, much as in Piagetian theory.
A commonly cited example is concepts of time based on those of space. Another is the
conception of purpose based on that of destination, that is, one's destination in a physical
journey in what Lakoff (1992) terms the "event structure metaphor". While something
of this directional bias is evident in metaphoric mapping (see below), it is not clear that
it correctly characterizes cognitive organization. On the contrary, we may well have an
innate cognitive system dedicated to temporal processing - perhaps already evident very
early - that includes perception of and control over duration; starting, continuing, and
stopping; interrupting and resuming; repeating; waiting; and speeding up and slowing
down. We may likewise have an innate cognitive system for intention or purpose. In any
27. Cognitive Semantics: An overview 637
case,"purpose" cannot be derived from "destination". After all, the concept that a person
moving from point X to point Y has Y as a "destination" already includes a component of
purpose. When such a component is lacking, we do not say that a person has point Y as
her destination but rather that her motion simply "stops" at that point. Accordingly, the
notion of purpose present in the concept of "destination" could not derive from
perceptions of concrete motion patterns, but might originate in an innate cognitive system for
the enactment and conception of intention or purpose.
In a fourth and final type here, what can be called the "anti-objectivism" idea of
embodiment faults the view that there exists an autonomous truth, uniform and
pervasive, in such realms as logic and mathematics that the human mind taps into for its
understandings and activities in those realms. Rather, wc deal with such realms by
imputing or mapping onto them various of our conceptual schemas, motor programs,
or other cognitive structures. On this view, we do much of our thinking and reasoning in
terms of such experientially derived structures. For example, our sense of the meaning
of the word angle is not derived from some independent ideal mathematical realm, but
is rather built up from our experience, e.g., from perceptions of a static forking branch,
from moving two sticks axially until their ends touch, or from rotating one stick while
its end touches that of another.This view of how we think may be largely correct. But if
applied too broadly, it might obscure the possible existence of an actual cognitive system
for objectivity and reason. Such a system might have the capacity to check for coherence
across concepts, for global consistency across conceptual and cognitive domains, and for
consistency across inferences and reasoning, - whether or not the assessed components
themselves arose through otherwise embodied processes - and it might be the source of
the very conception of an objective domain.
5.2. Cognitive recruitment
I propose the term " recruitment" for a pervasive cognitive process in which a cognitive
configuration with a certain original function or conceptual content gets used to
perform another function or to represent some other concept.That is, the basic function or
concept is appropriated or co-opted in the service of manifesting another one.
Such recruitment would certainly cover all tropes, including Activity and metaphor.
Thus, in a coextension path example of Active motion like The fence goes from the plateau
to the valley (see 4.2.1), the morphemes go, from, and to originally and basically refer
to motion, but this reference is conscripted in the service of representing a stationary
configuration.
And metaphor can also be understood in terms of recruitment. In cognitive linguistics,
metaphor has been mainly studied not for its salient poetic form familiar from
literature but - under the term "conceptual metaphor" - for its largely unconscious pervasive
structuring of everyday expression (see e.g., Lakoff 1992; article 26 (Tyler & Takahashi)
Metaphors and metonymies). In the basic analysis, certain structural elements of a
conceptual "source domain" are mapped onto the content of a conceptual "target domain".
But in our present terms, it can also be said that the conceptual structures and morphemic
meanings original to the source domain are recruited for use as structures and meanings
within the target domain. The directionality of the mapping - based on the "concrete
as basic" view of embodiment (see 5.1.3) - is typically from a more concrete domain
grounded in bodily experience to a more abstract domain. Thus, the more palpable
638 VI. Cognitively oriented approaches to semantics
domain of space is systematically mapped onto the more abstract domain of time in such
everyday expressions as Christmas is ahead /near /almost here / upon us /past.
Recruitment can be seen as well in the appropriation of one type of construction
to serve as another type. For example, the English question construction with certain
modals can serve as a request, as in Could you pass me the salt?. Fictivity terminology
could be extended to label the host construction here as a Active question, and the
parasitic construction as a factive request. Or it could be said that a request construction has
recruited a question construction.
Finally, to illustrate functional recruitment, repair mechanisms in discourse, in their
basic function, comprise a variety of devices that a speaker uses to remedy hitches that
arise in the production of an utterance.Talmy (2000b: ch.6) cites a recorded example of a
young woman rejecting a suitor in which she uses an inordinate density of repair
mechanisms, including false starts, interruptions, corrections, and repetitions. But it is evident
that these originally corrective devices have been co-opted to perform a different
function: to manifest embarrassed concern for the addressee's sensitive feelings. And, built in
turn upon that function is the further function of the speaker's signaling to the addressee
that she did have his feelings in mind.
5.3. Semantic conflict resolution
A conflict or incompatibility often exists between the references of two constituents
in a sentence, or between the reference of a constituent and the context or one's
general knowledge (see Talmy 2000b: ch. 5). The treatment of such semantic conflict thus
complements treatments of semantic "unification" in which the referents of constituents
integrate unproblematically. A hearer of a conflict generally applies one out of a set of
resolutions to it. These include shifts, blends, juxtapositions, and juggling. Of these, the
first two are characterized next.
5.3.1. Shifts
In the type of resolution Talmy (1977) termed a "shift" - now largely called "coercion"
after Pustejovsky (1993) - the reference of one of the two conflicting forms changes so
as to accord with the reference of the other form. A shift can involve the cancellation,
stretching, or replacement of a semantic feature. Each of these three types of shifts is
illustrated next.
The across schema cited in 2.2 can illustrate component cancellation. This schema
prototypically involves a horizontal path on a bounded plane from one edge
perpendicularly to its opposite. But the path's termination on the distal edge can be canceled,
as in a sentence like The shopping cart rolled across the boulevard and was hit by an
oncoming car. Here, the English preposition is not blocked from usage, or replaced by
some preposition referring to partial planar traversal, but continues on with one of its
semantic components missing. In fact, the preposition can continue in usage even with
both of the path's edge contacts canceled, as seen in The tumbleweed rolled across the
desert for an hour.
The across schema can also illustrate component stretching. A prototypical constraint
on this schema, not mentioned earlier, is that the main axis of the plane, which is
perpendicular to the path, may be longer than the path or of the same length, but cannot be
27. Cognitive Semantics: An overview 639
shorter. Accordingly, I can swim "across" a square swimming pool from one edge to the
other, or "across" a canal from one bank to the other, but if my path parallels a canal's
banks, I am not swimming "across" the canal but "along" it. But what if I am at an oblong
pool and swim from one of the narrow edges to its opposite? In referring to this
situation, the acceptability of the sentence / swam across the pool is great where the pool is
only slightly longer than a square shape, and decreases as its relative length increases.
The across schema thus permits the path length within the relative-axis constraint to be
stretched moderately but not too far.
Finally, component replacement can be seen in a sentence like She is somewhat
pregnant. Here, the gradient specification of somewhat conflicts with the basic all-or-none
specification of pregnant. A hearer might resolve this conflict through the mechanism of
juxtaposition, to yield the "incongruity effect" of humor. If not, though, the hearer can
shift pregnant into accord with somewhat by replacing its 'all-or-none' component with
that of 'gradience'.Then the overall meaning of pregnant shifts as well from involving the
presence or absence of a fetus to involving the length of gestation.
5.3.2. Blends
An incompatibility between two sets of specifications in a sentence can also be resolved
as a "blend", in which a hearer generates an often imaginative conceptual hybrid that
accommodates both of the original conceptual inputs in some novel relation to each
other. Talmy (1977) distinguished two types of blends, superimposition and introjection,
and illustrated the former with the sentence My sister wafted through the party.The
conflict here is between waft suggesting something like a leaf moving gently in an
irregular pattern through the air, and the rest of the sentence suggesting a person (moving)
through a group of other people. In myself, this sentence evokes the blended
conceptualization of my sister wandering aimlessly through the party, somewhat unconscious of the
events around her, and of the party somehow suffused with a slight rushing sound of air.
Fauconnier & Turner (2002) have greatly elaborated on this process, also terming it
a "blend" or a "conceptual integration". In their terms, two separate mental spaces (see
below) can map elements of their content and structure into a third mental space that
constitutes a blend of the two inputs, with potentially novel structure.Thus, in referring to a
modern catamaran reenacting a century-old voyage by an early clipper, a speaker can say
At this point, the catamaran is barely maintaining a 4 day lead over the clipper.The speaker
here conceptually superimposes the two treks and generates the apparency of a race.
5.4. Semantic interrelations
In the preceding three subsections, semantic structures have in effect "acted on" each
other to yield a novel conceptual derivative. But semantic elements and structures can
also simply relate to each other in particular patterns. Four such patterns are outlined next.
5.4.1. Within one sense of a morpheme
Several bodies of research within cognitive linguistics address the structured relations
among the semantic components of the meaning of a morpheme in one of its polysemous
senses. Two of these are "frame semantics" and "prototype theory", outlined next.
640 VI. Cognitively oriented approaches to semantics
Fillmore's (e.g., 1976) Frame Semantics (see article 29 (Gawron) Frame Semantics)
shows that the meaning of a morpheme does not simply consist of a central concept
- the main concern of a speaker in using the morpheme - but extends out indefinitely
with ever further conceptual associations that bear particular relations to each other
and to the central concept. In fact, several different morphemes can share roughly the
same extended frame while foregrounding different portions in the center. Thus, such
"commercial frame" verbs as sell, buy, spend, charge, and cost all share in their frames a
seller, a buyer, money, and goods, as well as the transfer of money from the buyer to the
seller and, in return for that, the transfer of the goods from the seller to the buyer. Each
of these concepts in turn rests on a further conceptual infrastructure. For example, the
'money' concept rests on notions of governmental minting and socially agreed value,
while the 'in return for' concept rests on notions of reciprocity and equity.
In Lakoff's (1987) prototype theory (see article 28 (Taylor) Prototype theory), a
morpheme's meaning can generally be viewed as a category whose members differ in
privilege, whose properties can vary in number and strength, and whose boundary can vary in
scope. In its most prototypical usage, then, the morpheme refers to the most privileged
category member, assigns the fullest set of its properties at their greatest strength to that
member, and tightens its boundary to enclose the smallest scope. For an example from
Fillmore (1976), the meaning of breakfast - in its most prototypical usage - consists of
eating certain foods, namely, eggs, bacon, toast, coffee, orange juice, and the like, at a
certain time of day, namely, in the morning. But the meaning can be extended to less
prototypical values, for example, either to different foods - The Joneses eat liver and onions
for breakfast - or to different times of the day - Breakfast is served all day.
5.4.2. Across different senses of a morpheme
Brugman (1981) was the first to show that for a polysemous morpheme, one sense can
function as the prototype to which the other senses are progressively linked by
conceptual increments within a "radial category". Thus, for the preposition over, the prototype
sense may be 'horizontal motion above an object' as in The bird flew over the hill. But
linked to this by "endpoint focus" is the sense in Sam lives over the hill.
5.4.3. Relations from within a morpheme to across a sentence
The "Motion typology" of Talmy (2000b: ch. 1) proposes a universal semantic
framework for an event of motion or location. This consists of four components in the main
event proper - the moving or stationary "Figure", its state of "Motion" (moving or being
located), its "Path" (path or site), and the "Ground" that serves as its reference point -
plus an outside "Co-event" typically of Manner or of Cause. Languages differ typologi-
cally as to which of these components they characteristically include within the verb of
a sentence, and which they locate elsewhere in the sentence. And these two sets of
allocations are correlated.
Thus, a "verb-framed" language like Spanish characteristically places the components
of Motion and Path together in the verb, and so has an extensive scries of "path verbs"
with meanings like 'enter', 'exit', 'ascend', 'descend', 'cross', 'pass', and 'return'. In
correlation with this lexicalization pattern for the verb, the language has a ready colloquial
construction for representing the Co-event - typically a gerund form that can appear
27. Cognitive Semantics: An overview 641_
right after the path verb. For example,'I ran into the cave' might be expressed as Entre
corriendo a la cueva - literally,"I entered running to the cave".
By contrast, a "satellite-framed" language like English characteristically places the
components of Motion and Co-event together in the verb, and so has a series of "Manner
verbs" like run, limp, scuttle and speed. In correlation with this lexicalization pattern for
the verb, the language also has an extensive series of "path satellites" - e.g., in, out, up,
down, past, across and back - as well as a partially overlapping set of path prepositions,
together with the syntactic construction for their inclusion after the verb. The English
sentence corresponding to the preceding Spanish one is thus: I ran into the cave.
This correlation within a language between a verb's lexicalization pattern and the
rest of the syntax in a motion sentence can be put into relief by noting minimally
occurring patterns (see Slobin 1996).Thus, Spanish does not have a path satellite category or
an extensive set of path prepositions, and in fact can largely not use the prepositions it
does have to represent a path. For instance, it could not do so for the cave example. For
its part, English does not have a colloquial gerund construction for use with its few path
verbs (which in any case are mostly borrowed from Romance languages, where they are
native). Thus, a sentence like / entered the cave running is fully awkward.
5.4.4. Across a sentence
Fauconnier (1985) shows how different portions of a sentence can set up distinct "mental
spaces" with particular relations to each other. Each such space is a relatively self-
contained conceptual domain with its component elements in a particular arrangement;
two spaces can share many of the same elements; and a mapping can be established
between corresponding elements.The mapping is directional, going from a "base" space -
a conceptual domain generally factual for the speaker - to a "subordinate" space that
can be counterfactual, representational, at a different time, etc. Thus, in Max thinks
Harry's name is Joe, the speaker's base space includes 'Max' and 'Harry' as elements; the
word thinks sets up a subordinate space for a portion of Max's belief system; and this
contains an element 'Joe' that corresponds to 'Harry'.
6. Conclusion
In this survey, the field of cognitive linguistics in general and of Cognitive Semantics in
particular is seen to have as its central concern the representation of conceptual structure
in language.The field addresses properties of conceptual structure both local and global,
both autonomous and interactive, and both typological and universal. And it relates these
linguistic properties to more general properties of cognition. While much has already
been done in this relatively young linguistic tradition, it remains quite dynamic and is
extending its explorations in a number of new directions.
7. References
Brugman. Claudia 1981. The Story of'Over'. MA thesis University of California, Berkeley, CA.
Fauconnier, Gilles 1985. Mental Spaces. Aspects of Meaning Construction in Natural Language.
Cambridge, MA:The MIT Press.
28. Prototype theory 643
28. Prototype theory
1. Introduction
2. Prototype effects
3. Prototypes and the basic level
4. The cultural context of categories
5. Prototypes and categories
6. Objections to prototypes
7. Words and the world
8. Prototypes and polysemy
9. References
Abstract
According to a long-established theory, categories are defined in terms of a set of features.
Entities belong in the category if, and only if, they exhibit each of the defining features. The
theory is problematic for a number of reasons. Many of the categories which are lexical-
ized in language are incompatible with this kind of definition, in that category members
do not necessarily share the set of defining features. Moreover, the theory is unable to
account for prototype effects, that is, speakers' judgements that some entities are 'better'
examples of a category than others. These findings led to the development of prototype
theory, whereby a category is structured around its good examples. This article reviews the
relevant empirical findings and discusses a number of different ways in which prototype
categories can be theorized, with particular reference to the functional basis of categories
and their role in broader conceptual structures. The article concludes with a discussion
of how the notion of prototype category has been extended to handle polysemy, where
the various senses of a word can be structured around, and can be derived from, a more
central, prototypical sense.
1. Introduction
In everyday discourse, the term 'prototype' refers to an engineer's model which, after
testing and possible improvement, may then go into mass production.
In linguistics and in cognitive science more generally, the term has acquired a
specialized sense,although the idea of a basic unit,from which other examples can be derived,may
still be discerned.The term, namely,refers to the best,most typical,or most central member
of category. Things belong in the category in virtue of their sharing of commonalities
with the prototype. Prototype theory refers to this view on the nature of categories.
This article examines the role of prototypes in semantics, especially in lexical
semantics. To the extent that words can be said to be names of categories, prototype theory
becomes a theory of word meaning.
Prototype theory contrasts with the so-called classical, or Aristotelian theory of
categorization (Lakoff 1982, 1987; Taylor 2003a). According to the classical theory, a
category is defined in terms of a set of properties, or features, and an entity is a member of
the category if it exhibits each of the features. Each of the features is necessary, jointly
they are sufficient.The classical theory captures the 'essence' of a category in contrast to
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK33.1), deGruyter, 643-664
644 VI. Cognitively oriented approaches to semantics
the 'accidental' properties of category members.The theory entails that categories have
clear-cut boundaries and that all members, in their status as category members, have
equal status within the category.
An example which is often cited to illustrate the classical theory is the category
'bachelor', defined in terms of the features [+human], [+adult], [+male], and [-married]. Any
entity which exhibits each of the four defining features is, by definition, a member of
the category and thus can bear the designation bachelor. Something which exhibits only
three or fewer of the features is not a member. In their status as bachelors, all members of
the category are equal (though they may, of course, differ with respect to non-essential,
accidental features, such as their height, wealth, and such like).
The classical theory is attractive for a number of reasons. First, the theory neatly
accounts for the relation of entailment. If X is a bachelor, then necessarily X is
unmarried, because 'unmarried' is a property already contained in the definition of bachelor.
Second, the theory explains why some expressions are contradictions. In married
bachelor, or This bachelor is married, the property designated by married conflicts with a
definitional feature of bachelor. Third, the theory accounts for the distinction between
analytic and synthetic statements. This bachelor is a man is synthetic; it is necessarily
true in virtue of the definitions of the words. This man is a bachelor is analytic; its truth is
contingent on the facts of the matter.The theory also goes some way towards explaining
concept combination. A rich bachelor refers to an entity which exhibits the features of
bachelor plus the features of rich.
In spite of these obvious attractions, there are many problems associated with the
classical theory. First, the theory is unable to account for the well-documented prototype
effects, to be discussed in section 2. Another problem is that the theory says nothing
about the function of categories. Organisms categorize in order to manage the myriad
impressions of the environment. If something looks like an X, then it may well be an X,
and wc should behave towards it as we would towards other Xs. The classical theory is
unable to accommodate this everyday kind of inferencing. According to the classical
theory, the only way to ascertain whether something is an X is to check it out for each
of the defining features of X. Having done that, there is nothing further to say about the
matter. Categorization of the entity serves no useful purpose to the organism. A related
matter is that the theory makes no predictions as to which categories are likely to be
lexicalized in human languages. In principle, any random set of features can define a
category. But many of these possible categories - for example, a category defined by the
features [is red], [was manufactured before 1980], [weighs 8 kg] - although perfectly well-
formed in terms of the theory, are not likely to be named in the lexicon of any human
language.
A further set of problem arises in regard to the features. In terms of the theory, the
features are more basic than the categories which they define. In many cases, however,
the priority of the features might be questioned. [Having feathers] may well be a
necessary feature of'bird'. But do we comprehend the category 'bird' on the basis of a prior
understanding of what it means for a creature to have feathers, in contrast, say, to having
fur, hair, scales, or spines? Probably not. Rather, we understand 'feathers' in consequence
of our prior acquaintance with birds. Another point to bear in mind is that each feature
will itself define a category, namely, the category of entities exhibiting that feature. The
feature [+adult] - one component of the definition of bachelor - singles out the category
of adults. We will need to provide a classical definition of the category 'adult', whose
28. Prototype theory 649
level thus turns out to be the most informative and efficient of the taxonomic levels. It is
the level which packs the most information, both in terms of what something is, also with
respect to what it is not. It is not surprising, therefore, that basic level terms tend to be of
high frequency, they are short, and are learned early in first language acquisition.
The findings also make possible a more sophisticated understanding of prototypes. As
noted, basic level categories tend to be contrastive. It is possible, then, to view the
prototype as a category member which exhibits the maximum number of features which are
typical of the category and which are not shared by members of neighbouring categories.
Rosch et al. (1976) in this connection speak of the cue validity of features. A feature
has high cue validity to the extent that presence of the feature is a (fairly) reliable
predictor of category membership. For example, [having a liver] has almost zero cue validity
with respect to the category'bird'. Although all birds do have a liver, so also do countless
other kinds of creature. On the other hand, [having feathers] has very high cue validity,
since being feathered is a distinctive characteristic of most, if not all birds, and birds
alone. [Being able to fly] has a somewhat lower, but still quite high cue validity, since
there are some other creatures (butterflies, wasps, bats) which also fly. [Not being able to
fly] would have very low cue validity, not only because it applies to only a few kinds of
birds, but because there are countless other kinds of things which do not fly. It is for these
reasons that being able to fly, and having feathers, feature in the bird prototype.
Other researchers (e.g. Murphy 2002: 215) have drawn attention to the converse of
cue validity, namely, category validity. If cue validity can be defined as the probability of
membership in category C, given feature f, i.e. P(C|f), category validity can be defined
as the probability that an entity will exhibit feature f, given its membership in C, i.e.
P(f | C).The interaction of cue and category validity offers an interesting perspective on
inferencing strategies, mentioned in section 1 in connection with the classical theory. (See
also article 106 (Kelter & Kaup) Conceptual knowledge, categorization and meaning.) A
person observes that entity c exhibits feature f. If the feature has high cue validity with
respect to category C, the person may infer, with some degree of confidence, that e is a
member of C. One may then make inferences about e, based on category validity. For
example, an entity with feathers is highly likely to be a bird. If it is a bird, it is likely to be
able to fly, and much more besides.
The hypothetical category mentioned in section 1 - the category defined by the
features [is red], [was manufactured before 1980], and [weighs 8 kg] - exhibits extremely low
cue and category validity, and this could be one reason why such a category would never
be lexicalized in a human language.The fact that something is red scarcely predicts
membership in the category,since there are countless other red things in the universe; likewise
for the two other features. The only way to assign membership in the category would
be to check off each of the three features. Having done this, one can make no
predictions about further properties of the entity.To all intents and purposes, the hypothetical
category would be quite useless.
A functional account of prototypes, outlined above, may be contrasted with an account
in terms of frequency of occurrence. In response to the question 'Where does prototypi-
cality come from?' (Geeraerts 1988), many people are inclined to say that prototypes (or
prototypical instances) arc encountered more frequently than more marginal examples
and that that is what makes them prototypical. Although frequency of occurrence
certainly may be a factor (our prototypical vehicles are now somewhat different from those
of 100 years ago, in consequence of changing methods of transportation) it cannot be the
652 VI. Cognitively oriented approaches to semantics
confidential and not at all public property.The status of information as public or private
property can be expected to vary according to circumstances and cultural conventions.
5. Prototypes and categories
There arc several ways of understanding the notion of prototype and of the relation
between a prototype and a category (Taylor 2008). Some of these have been hinted at in
the preceding discussion. In this section we examine them in more detail.
5.1. Categories are defined with respect to a 'best example'
On this approach, a category is understood and mentally represented simply in terms of
a good example. One understands 'red' in terms of a mental image of a good red, other
hues being assimilated to the category in virtue of their similarity to the prototype.
Some of Rosch's statements may be taken in support of this view. For example, Heider
(1971: 455) surmises that "much actual learning of semantic reference, particularly in
perceptual domains, may occur through generalization from focal exemplars".
Elsewhere she writes of "conceiving of each category in terms of its clear cases rather than its
boundaries" (Rosch 1978:35-36).
An immediate problem arises with this approach. Any colour can be said to be similar
to red in some respect (if only in virtue of its being a colour) and is therefore eligible to
be described as red 'to some degree'. In order to avoid this manifestly false prediction we
might suppose that the outer limits of category membership will be set by the existence
of neighbouring, contrasting categories. As a colour becomes more distant from focal red
and approaches focal orange, there comes a point at which it will no longer be possible
to categorize it as red, not even to a small degree.The colour is, quite simply, not red.
Observe that this account presupposes a structuralist view of lexical semantics,
whereby word meanings divide up conceptual space in a mosaic-like manner, such that
the denotational range of one term is restricted by the presence of neighbouring terms
(Lyons 1977: 260). It predicts (correctly in the case of colours, or at least, basic level
colours) that membership will be graded, in that an entity may be judged to be a member
of a category only to a certain degree depending on its distance from the prototype.The
category, as a consequence, will have fuzzy boundaries, and degree of membership in one
category will inversely correlate with degree of membership in a neighbouring category.
The 'redder' a shade of orange, the less it is orange and the more it is red.
There are a small number of categories for which the above account may well be valid,
including the household receptacles studied by Labov: as a vessel morphs from a
prototypical cup into a prototypical bowl, categorization as cup gradually decreases, offset by
increased categorization as bowl.The account may also be valid for scalar concepts such
as hot, warm, cool, and cold, where the four terms exhaustively divide up the temperature
dimension.
But for a good many categories it will not be possible to maintain that they arc
understood simply in terms of a prototype, with their boundaries set by neighbouring terms.
In the first place, the mosaic metaphor of word meanings may not apply.This is the case
with near synonyms, that is, words which arguably have distinct prototypes, but whose
usage ranges overlap and which are not obviously contrastive.Take the pair high and tall
(Taylor 2003b). Tall applies prototypically to humans (tall man),high to inanimates (high
28. Prototype theory 653
mountain). Yet the words do not mutually circumscribe each other at their boundaries.
Many entities can be described equally well as tall or high; use of one term does not
exclude use of the other. It would be bizarre to say of a mountain that it is high but not
tall, or vice versa.
The approach is also problematic in the case of categories which cannot be reduced to
values on a continuously varying dimension or on set of such dimensions. Consider
natural kind terms such as bird, mammal, and reptile, or gold, silver, and platinum. Natural
kinds are presumed to have a characteristic 'essence', be it genetic, molecular, or
whatever. (This said, the category of natural kind terms may not be clear-cut; see Keil 1989.
As a rule of thumb, we can say that natural kinds are the kinds of things which scientists
study. We can imagine scientists studying the nature of platinum, but not the nature of
furniture.) While natural kind categories may well show goodness-of-example effects, they
tend to have very precise boundaries. Birds, as we know them, do not morph gradually
into mammals (egg-laying monotremes like the platypus notwithstanding), neither can
we conceive of a metal which is half-way between gold and silver. And, indeed, it would
be absurd to claim that knowledge of the bird prototype (e.g. a small songbird, such as a
robin) is all there is to the bird concept, to claim, in other words, that the meaning of bird
is 'robin', and that creatures are called birds simply on the basis of their similarity to the
prototype. While a duck may be similar to a robin in many respects, we cannot appeal to
the similarity as evidence that ducks should be called robins. In the case of categories like
'bird', the prototype is clearly insufficient as a category representation. We need to know
what kinds of things are likely to be members of the category, how far we can generalize
from the prototype, and where (if only approximately) the boundaries lie. We need an
understanding of the category which somehow encompasses all its members.
5.2. The prototype as a set of weighted attributes
In subsequent work Rosch came to a more sophisticated understanding of prototype,
proposing that
categories tend to become defined in terms of prototypes or prototypical instances that
contain the attributes most representative of items inside and least representative of items
outside the category.
(Rosch 1978:30; italics added)
A category now comes to be understood as a set of attributes which are differentially
weighted according to their cue validity, that is, their importance in diagnosing category
membership, and an entity belongs in the category if the cumulative weightings of its
attributes achieve a certain threshold level. On this approach, category members need
not share the same attributes, nor is an attribute necessarily shared by all category
members. Rather, the category hangs together in virtue of a 'family resemblance' (Rosch &
Mcrvis 1975), in which attributes 'criss-cross', like the threads of a rope (Wittgenstein
1978: 32). The more similar an instance to all other category members (this being a
measure of its family resemblance), the more prototypical it is of the category.
A major advantage of the weighted attribute view is that it makes possible a "summary
representation" of a category, which, like the classical theory, "somehow encompass[es]
an entire concept" (Murphy 2002: 49). As a matter of fact, a classical category would
654 VI. Cognitively oriented approaches to semantics
turn out to be a limiting case, where each of the features has an equal and maximal
weighting, and without the presence of each of the features the threshold value would
not be attained.
The weighted attribute view raises the interesting possibility that the prototype may
not correspond to any actual category member; it is more in the nature of an idealized
abstraction. Confirmation comes from work with artificial categories (patterns of dots
which deviate to varying degrees from a pre-established prototype), where subjects have
been able to identify the prototype of a category they have learned, even though they
had not been previously exposed to it (Posner & Keele 1968).
5.3. Categories as exemplars
A radical alternative to feature-based approaches construes a category simply as a
collection of instances. Knowledge of a category consists in a memory store of encountered
exemplars (Smith & Medin 1981). Categorization of a new instance occurs in virtue of
similarities to one or more of the stored exemplars, a prototypical example being one
which exhibits the highest degree of similarity with the greatest number of instances.
There are several variants of the exemplar view of categories. The exemplars might be
individual instances encountered on specific occasions; especially for superordinate
categories, on the other hand, the exemplars might be the basic categories which instantiate
them (Storms, de Boeck & Ruts 2000). In its purest form, the exemplar theory denies
that people make generalizations over category exemplars. Mixed representations might
also be envisaged, however, whereby instances which closely resemble each other might
coalesce into a generic image which preserves what is common to the instances and
filters out the idiosyncratic details (Ross & Makin 1999).
On the face of it, the exemplar view,even in its mixed form, looks rather implausible.The
idea that we retain specific memories of previously encountered instances would surely
make intolerable demands on human memory. Several factors, however, suggest that
wc should not dismiss the exemplar theory out of hand, and indeed Storms, dc Boeck &
Ruts (2000) report that the exemplar theory outperforms the summary representation
theory, at least with respect to membership in superordinate categories. First, computer
simulations have shown that exemplar models are able to account for a surprising range
of experimental findings on human categorization, including, importantly, prototype
effects (Hintzman 1986). Second, there is evidence that human memory is indeed rich in
episodic detail (Schacter 1987). Even such apparently irrelevant aspects of encountered
language, such as the position on a page of a piece of text (Rothkopf 1971), or the voice
with which a word is spoken (Goldinger 1996), may be retained over substantial periods
of time. Moreover, humans are exquisitely sensitive to the frequency with which events,
including linguistic events, have occurred (Ellis 2002). Bybcc (2001) has argued that
frequency should be recognized as a major determinant of linguistic performance,
acceptability judgements, and language change.
A focus on exemplars would tie in with the trend towards usage-based models of
grammar (Langacker 2000,Tomasello 2003). It is axiomatic, in a usage-based model, that
linguistic knowledge is acquired on the basis of encounters with actual usage events. While
generalizations may be made over encountered events, the particularities of the events
need not thereby be erased from memory (Langacker 1987:29). Indeed, it is now widely
recognized that a great deal of linguistic knowledge must reside in rather particular facts
28. Prototype theory 657
of natural kind terms. As mentioned earlier, Labov queried the idea that the set of things
called cups might possess a defining essence, distinct from the recognition features which
allow a person to categorize something as a cup. In this case, the stereotype turns out to
be nothing other than the prototype.
Not to be forgotten also is the fact that speakers may operate with more than one
understanding of a category.The case of'adult' was already mentioned, where an 'expert'
bureaucratic definition might co-exist with a looser, multi-dimensional, and inherently
fuzzy understanding of what constitutes an adult.
6.3. Prototypes save: An excuse for lazy lexicographers?
Wierzbicka (1990) maintained that appeal to prototypes is simply an excuse for lazy
semanticists to avoid having to formulate rigorous word definitions. Underlying
Wicrzbicka's position is the view that words arc indeed amenable to definitions which
are able to predict their full usage range. She offers sample definitions of the loci classici
of the prototype literature, including 'game','lie', and 'bird'.
However, as Geeraerts (1997: 13-16) has aptly remarked, Wierzbicka's definitions
often sneak in prototype effects by the back door, as it were. For example, Wierzbicka
(1990: 361-362) claims that ability to fly is part of the bird-concept, in spite of the fact
that some birds are flightless. The discrepancy is expressed in terms of how a person
would imagine a bird, namely, as a creature able to move in the air, with the proviso that
'some creatures of this kind cannot move in the air'. Far from discrediting prototype
structure, Wierzbicka's definition simply incorporates them.
7. Words and the world
Rosch's work addressed the relation between words and the things in the world to which
the words can refer.
The relation can be studied from two perspectives (Taylor 2007). We can ask, for this
word, what are the things which it can be used to refer to? This is the semasiological, or
referring perspective. Alternatively we can ask, for this thing, what are the words that
we can use to refer to it? This is the onomasiological, or naming perspective (Blank
2003).The two perspectives roughly correspond to the way in which dictionaries and
thesauri are organized. A dictionary lists words and gives their meanings. A thesaurus lists
concepts and gives words which can refer to them.
The two perspectives underlie much research in colour terminology. Consider, for
example, the data elicitation techniques employed by MacLaury (1995). Three
procedures were involved. The first requires subjects to name a series of colour chips
presented in random sequence.This procedure elicits the basic colour terms of the language.
Next, for each of the colour terms proffered on the naming task, subjects are asked to
identify its focal reference on a colour chart.This procedure elicits the prototypes of the
colour terms.Third, for each colour term, subjects map the term on the colour chart,
indicating which colours could be named by the word.This procedure shows the referential
range of the term.
MacLaury's research, therefore, combined an onomasiological perspective (from
world to word, i.e. "What do you call this?") with a semasiological perspective (from
28. Prototype theory 659
over the hill, the trajector traces a random path which 'covers' the hill. In The board is
over the hole, the board completely obscures the hole. In this usage, the verticality of the
trajector vis-a-vis the landmark is no longer obligatory: the board could be positioned
vertically against the hole.
The examples give the flavour of what came to be known, for obvious reasons, as a
radial category. The various senses radiate out from the central, prototypical sense, like
spokes in a wheel.
This approach to polysemy has proved extremely attractive to many researchers, not
least because it lends itself to the visual display of central and derived senses. For a
particularly well worked-out example, see Fillmore & Atkins' (2000) account of English
crawl in comparison to French ramper. The approach has been seen as a convenient way
to handle the fact that the various senses of a word may not share a common definitional
core. Just as the various things we call 'furniture' may not exhibit a set of necessary and
sufficient features, so also the various senses of a word may resist a definition in terms of
a invariant semantic core.
The approach has also been taken up with respect to the semantics of constructions
(Goldberg 1995, 2006). Take, for example, the ditransitive [V NP, NPJ construction in
English. Its presumed prototype, illustrated by give the dog a bone, involves the transfer
of one entity, NP2, to another, NP,, such that NP, ends up having NPr But in throw the
dog a bone there is only the intention that NP, should have NP2, there is no entailment
that NP, does end up having NP2. More distant from the prototypical sense are examples
such as deny someone access, where the intention is that NP2 should be withheld from
NP,.
In applying the notion of a prototype category to cases of polysemy (whether lexical
or constructional), we must be aware of the differences between the two phenomena. On
the one hand,wc can use the word fruit to refer, firstly, to apples and oranges, but also to
olives. Although the word can refer to different kinds of things, the word presumably has
a single sense and designates a single category of objects (albeit, a prototypically
structured category). But when we use the word to refer to the outcome of a person's efforts,
as in the fruit of my labours or The project bore fruit, we are using the word in a different
sense. The outcome of a person's efforts cannot be regarded as just another marginal
example of fruit, akin to an olive or a coconut. Rather, the metaphorical sense has to be
regarded as an extension from the botanical sense. Even so, to speak of the two senses
as forming a category, and to claim that one of the senses is the prototype, is to use the
terms 'category' and 'prototype' also in an extended sense.
In the case of fruit, it is reasonably clear which of the senses is to be taken as basic
and which are extensions therefrom. But in other cases a decision may not be so easy. As
noted above, for Lakoff and Brugman the central sense of over was movement 'above
and across' (The plane flew over the city). For Tyler & Evans (2001), on the other hand,
the 'protoscene' of the preposition is exemplified by The bee is hovering over the flower,
which lacks the notion of movement 'across'.
The question now arises, on what basis is the central sense identified as such? Whereas
Rosch substantiated the prototype notion by a variety of experimental techniques,
linguists applying the prototype model to polysemous items appeal (implicitly or explicitly)
to a variety of principles, which may sometimes be in conflict. One is descriptive elegance,
whereby the prototype is identified as that sense to which the others can most
reasonably, or most economically, be related. However, as the example of over demonstrates,
660 VI. Cognitively oriented approaches to semantics
different linguists are liable to come up with different proposals as to what is the central
sense. Another principle appeals to the organic growth of the polysemous category, with
a historically older sense being taken as more central than senses which have developed
later. Relevant here are certain assumptions concerning metaphorical extension (see
article 26 (Tyler & Takahashi) Metaphors and metonymies).Thus, Lakoff (1987:416-417)
claims that the spatial sense of long (as in a long stick) is 'more central' than the temporal
sense (a long time), on the basis of what is supposed to be a very general conceptual
metaphor which maps spatial notions onto non-spatial domains.
A controversial question concerns the psychological reality of radial categories.
Experimental evidence, such as it is, would suggest that radial categories might actually
have very little psychological reality for speakers of the language (Sandra & Rice: 1995).
One might, for example, suppose that the radial structure represents the outcome of the
acquisition process. Data from acquisition studies, however, do not always corroborate
the radial analysis. Amongst the earliest uses of over which are acquired by children
are uses such as fall over, over here, and all over (i.e. 'finished') (Hallan 2001). These
would probably be regarded as marginal senses on just about any radial analysis. It is
also legitimate to ask, what it would mean, in terms of a speaker's linguistic knowledge,
for a particular sense of a word to be 'marginal' or 'non-prototypical'. Both the temporal
and the spatial uses of long are frequent and both have to be mastered by any competent
speaker of the language.
In the case of a prototype category (as studied by Rosch) wc arc dealing with a single
sense (with its prototype structure) of a word. In the case of a radial category (as
proposed by Lakoff) we are dealing with several senses (each of which will also no doubt
have a prototype structure). The distinction is based on whether we are dealing with a
single sense of a word or multiple (related) senses. However, the allocation of the various
uses of a word to a single sense or to two different senses can be fraught with difficulty,
and various tests for diagnosing the matter can give conflicting results (Gccracrts 1993,
Tuggy 1993). For example, do paint a portrait,paint the kitchen, and paint white stripes on
the road exemplify a single sense of paint, or two (or perhaps even three) closely related
senses? There are arguments for each of these positions. Recently, a number of scholars
have queried whether it is legitimate in principle to try to identify the senses of a word
(Allwood 2003, Zlatev 2003).
Perhaps the most reasonable conclusion to draw from the above is that knowing a
word involves learning a set (possibly, a very large and open-ended set) of established
uses and usage patterns (Taylor 2006). Such an account would be reminiscent of the
exemplar theory of categorization, in that a speaker retains memories, not of category
members, but of word uses. Whether, or how, a speaker of the language perceives these
uses to be related may not have all that much bearing on the speaker's proficiency in
the language. The notion of prototype in the Roschean sense might not therefore be all
that relevant. The notion of prototype, and extensions therefrom, might, however, be
important in the case of novel, or creative uses. In this connection, Langacker (1987:381)
speaks of'local' prototypes. Langacker (1987: 57) characterizes a language as an
inventory of conventionalized symbolic resources. Often, the conceptualization that a speaker
wishes to symbolize on a particular occasion will not correspond exactly with any of the
available resources. Inevitably, some extension of an existing resource will be indicated.
The existing resource constitutes the local prototype and the actual usage is an extension
from it. If the extension is used on future occasions, it may become entrenched and will
28. Prototype theory 663
Langacker, Ronald 1987. Foundations of Cognitive Grammar, vol. 1. Theoretical Prerequisites.
Stanford, CA: Stanford University Press.
Langacker, Ronald 2000. A dynamic usage-based model. In: M. Barlow & S. Kemmer (eds). Usage-
Based Models of Language. Stanford. CA: CSLI Publications, 1-63.
Lewandowska-Tomaszczyk, Barbara 2007. Polysemy, prototypes, and radial categories In: D.
Geeraerts & H. Cuyckens (eds.). The Oxford Handbook of Cognitive Linguistics. Oxford:
Oxford University Press, 139-169.
Lyons, John 1977. Semantics. Cambridge: Cambridge University Press.
MacLaury, Robert 1991. Prototypes revisited. Annual Review of Anthropology 20,55-74.
MacLaury, Robert 1995. Vantage theory. In: J.Taylor & R. MacLaury (eds.). Language and the
Cognitive Construal of the World. Berlin: Mouton de Gruyter, 231-276.
Majid. Asifa, Melissa Bowei man, Miriam van Staden & James S. Bostei 2007. The semantic
categories of cutting and breaking events A crosslinguistic perspective. Cognitive Linguistics 18.
special issue, 133-152.
Moon, Rosamund 1998. Fixed Expressions and Idioms in English. A Corpus-Based Approach.
Oxford: Clarendon Press.
Murphy. Gregory 2002. The Big Book of Concepts. Cambridge. MA: The MIT Press.
Murphy. Gregory & Douglas Medin 1985. The role of theories in conceptual coherence.
Psychological Review 92.289-316.
Osherson, Daniel & Edward E. Smith 1981. On the adequacy of prototype theory as a theory of
concepts Cognition 9,35-58.
Posner. Michael & Steven Keele 1968. On the genesis of abstract ideas Journal of Experimental
Psychology 11,353-363.
Pulman, Stephen G. 1983. Word Meaning and Belief. London: Croom Helm.
Putnam, Hilary 1975. Philosophical Papers, vol. 2. Mind, Language and Reality. Cambridge:
Cambridge University Press
Rosch, Eleanor 1975. Cognitive representations of semantic categories Journal of Experimental
Psychology. General 104,192-233.
Rosch, Eleanor 1977. Human categorization. In: N. Warren (ed.). Studies in Cross-Cultural
Psychology, vol. I. London: Academic Press, 3-49.
Rosch, Eleanor 1978. Principles of categorization. In: E. Rosch & B. Lloyd (eds). Cognition and
Categorization. Hillsdale, NJ: Erlbaum, 27-48. Reprinted in: B. Aarts et al. (eds). Fuzzy Grammar.
A Reader. Oxford: Oxford University Press 2004,91-108.
Rosch, Eleanor & Carolyn B. Mervis 1975. Family resemblances Studies in the internal structure of
categories Cognitive Psychology 1,573-605.
Rosch, Eleanor, Carolyn Mervis Wayne Grey, David Johnson & Penny Boyes-Braem 1976. Basic
objects in natural categories Cognitive Psychology 8,382-439.
Ross Brian H. & Valerie S. Makin 1999. Prototype versus exemplar models In: R. J. Sternberg (ed.).
The Nature of Cognition. Cambridge, MA: The MIT Press, 205-241.
Rothkopf. Ernst Z. 1971. Incidental memory for location of information in text. Journal of Verbal
Learning and Verbal Behavior 10,608-613.
Sandra. Dominiek & Sally Rice 1995. Network analysis of prepositional meaning. Mirroring whose
mind - the linguist's or the language user's? Cognitive Linguistics 6,89-130.
Schacter, Daniel 1987. Implicit memory. History and current status Journal of Experimental
Psychology. Learning, Memory, and Cognition 13,501-518.
Smith. Edward E. & Douglas L. Mcdin 1981. Categories and Concepts. Cambridge, MA: Harvard
University Press
Storms, Gert, Paul de Boeck & Wim Ruts 2000. Prototype and exemplar-based information in
natural language categories Journal of Memory and Language 42,51-73.
Sweetser, Eve 1987. The definition of lie. An examination of the folk models underlying a semantic
prototype. In: D. Holland & N. Quinn (eds). Cultural Models in Language and Thought.
Cambridge: Cambridge University Press, 43-66.
29. Frame Semantics
665
frames in providing a principled account of the openness and richness of word-meanings,
distinguishing a frame-based account from classical approaches, such as accounts based
on conceptual primitives, lexical fields, and connotation, and showing how they can play a
role in the account of how word meaning interacts with syntactic valence.
For there exists a great chasm between those, on the one side, who relate everything to a
single central vision, one system more or less coherent or articulate, in terms of which they
understand, think and feel - a single, universal, organizing principle in terms of which alone
all that they are and say has significance - and, on the other side, those who pursue many
ends, often unrelated and even contradictory, connected, if at all, only in some de facto way,
for some psychological or physiological cause, related by no moral or aesthetic principle.
Berlin (1957: 1).cited by Minsky (1975)
1. Introduction
Two properties of word meanings contribute mightily to the difficulty of providing a
systematic account.
One is the openness of word meanings.The variety of word meanings is the variety of
human experience. Consider defining words such as Tuesday, barber, alimony, seminal,
amputate, and brittle. One needs to make reference to diverse practices, processes, and
objects in the social and physical world: repeatable calendar events, grooming and hair,
marriage and divorce, discourse about concepts and theories, and events of breaking.
Before this seemingly endless diversity, semanticists have in the past stopped short,
excluding it from the semantic enterprise, and attempting to draw a line between a small
linguistically significant set of primitive concepts and the openness of the lexicon.
The other problem is the closely related problem of the richness of word meanings.
Words are hard to define, not so much because they invoke fine content specific
distinctions, but because they invoke vast amounts of background information.The concept of
buying presupposes the complex social fact of a commercial transaction.The concept of
alimony presupposes the complex social fact of divorce, which in turn presupposes the
complex social fact of marriage. Richness, too, has inspired semanticists simply to stop, to
draw a line, saying exact definitions of concepts do not matter for theoretical purposes.
This boundary-drawing strategy, providing a response if not an answer to the
problems of richness and openness, deserves some comment. As linguistic semanticists, the
story goes, our job is to account for systematic, structurally significant properties of
meaning.This includes:
(1) a. the kinds of syntactic constructions lexical meanings are compatible with,
i. the kinds of participants that become subjects and objects
ii. regular semantic patterns of oblique markings and valence alternations
b. Regular patterns of inference licensed by category, syntactic construction or
closed class lexical item.
The idea is to carve off that part of semantics necessary for knowing and using the
syntactic patterns of the language. To do this sort of work, we do not need to pay
attention to every conceptually possible distinction. Instead we need a small set of semantic
668 VI. Cognitively oriented approaches to semantics
depends on the concept of marriage. The dependency is definitional. Unless you define
what a marriage is, you can't define what a divorce is. Unless you define what a divorce
is, you can't define what alimony is. Thus there is a very real sense in which the
dependencies we are describing move us toward simpler concepts. Notice, however, that the
dependency is leading in a different direction than an analysis that decomposes
meanings into a small set of primitives like cause and become. Instead of leading to concepts
of increasing generality and abstractness, we are being led to define the situations or
circumstances which provide the necessary background for the concepts we are describing.
The concepts of marriage and divorce are equally specific, but the institution of marriage
provides the necessary background for the institution of divorce.
Or consider the complex subject of Tuesdays (Fillmore 1985). Wc live in a world of
cyclic events. Seasons come and go and then return.This leads to a cyclic calendar which
divides time up into repeating intervals, which are divided up further. Years are divided
into months, which are divided into weeks, which are divided into days, which have cyclic
names. Each week has a Sunday, a Monday, a Tuesday, and so on. Defining Tuesday
entails defining the notion of a cyclic calendar. Knowing the word Tuesday may not entail
knowing the word Sunday, but it does entail understanding at least the concept of a week
and a day and their relation, and that each week has exactly one Tuesday.
We thus have words and background concepts. We will call the background concept
the frame. Now the idea of a frame begins to have some lexical semantic bite with the
observation that a single concept may provide the background for a set of words. Thus
the concept of marriage provides the background for words/suffixes/phrases such as
bride, groom, marriage, wedding, divorce, -in-law, elope,fiancee, best man, maid-of-honor,
honeymoon, husband, and wife, as well as a variety of basic kinship terms omitted here
for reasons of space. The concept of calendar cycle provides the frame for lexical
items such as week, month, year, season, Sunday, ..., Saturday, January, ..., December,
day, night, morning, and afternoon. Notice that a concept once defined may provide
the background frame for further concepts. Thus, divorce itself provides the
background frame for lexical items such as alimony, divorce, divorce court, divorce attorney,
ex-husband, and ex-wife.
In sum, a frame may organize a vocabulary domain:
Borrowing from the language of gestalt psychology wc could say that the assumed
background of knowledge and practices - the complex frame behind this vocabulary domain
- stands as a common ground to the figure representable by any of the individual words
[Words belonging to a frame] are lexical representatives of some single coherent schemati-
zation of experience or knowledge.
Fillmore (1985: 223)
Now a premise of Frame Semantics is that the relation between lexical items and frames
is open ended. Thus one way in which the openness of the lexicon manifests itself is in
building concepts in unpredictable ways against the backdrop of other concepts. The
concept of marriage seems to be universal or near-universal in human culture.The
concept of alimony is not. No doubt concepts sometimes pop into the lexicon along with
their defining frames (perhaps satellite is an example), but the usual case is to try to build
them up out of some existing frame (Thus horseless carriage leading to car is the more
usual model).
670 VI. Cognitively oriented approaches to semantics
2.2. Basic tools
We have thus far focused on the role of frames in a theory of word meanings. Note that
nothing in particular hangs on the notion word. Frames may also have a conventional
connection to a simple syntactic constructions or idiom; give someone the slip probably
belongs to the same frame as elude. Or they may be tied to more complex constructions
such as the Comparative Correlative (cf. article 86 (Kay & Michaclis) Constructional
meaning).
(7) The more I drink the better you look.
This construction has two "slots" requiring properties of quantity or degree. The same
issues of background situation and profiled participants arise whether the linguistic
exponent is a word or construction.The term sign, used in exactly the same sense as it is
used by construction grammarians, will serve here as well.
As a theory of the conventional association of schematized situations and linguistic
exponents, then, Frame Semantics makes the assumption that there is always some
background knowledge relative to which linguistic elements do some profiling, and relative to
which they are defined.Two ideas are central:
1. a background concept
2. a set of signs including all the words and constructions that utilize this conceptual
background.
Two other important frame theoretic concepts are frame elements and profiling.
Thus far in introducing frames I have emphasized what might be called the
modularity of knowledge. Our knowledge of the the world can usefully be divided up into
concrete chunks. Equally important to the Fillmorian conception of frames is the integrating
function of frames. That is, frames provide us with the means to integrate with other
frames in context to produce coherent wholes. For this function, the crucial concept is the
notion of a frame element (Fillmore & Baker 2000). A frame element is simply a regular
participant, feature, or attribute of the kind of situation described by a frame.Thus, frame
elements of the wedding frame will include the husband, wife, wedding ceremony,
wedding date, best man and maid of honor, for example. Frame elements need not be
obligatory; one may have a wedding without a best man; but they need to be regular recurring
features.
Thus, frames have slots, replaceable elements. This means that frames can be linked
to to other frames by sharing participants or even by being participants in other frames.
They can be components of an interpretation.
In Frame Semantics, all word meanings are relativized to frames. But different
words select different aspects of the background to profile (we use the terminology in
Langacker 1984). Sometimes aspects profiled by different words are mutually exclusive
parts of the circumstances, such as the husband and wife in the marriage frame, but
sometimes word meanings differ not in what they profile, but in how they profile it. In such
cases, I will say words differ in perspective (Fillmore 1977a). I will use Fillmore's much-
discussed commercial event example (Fillmore 1976) to illustrate:
29. Frame Semantics
671
(8) a. John sold the book to Mary for $100.
b. Mary bought the book from John for $100.
c. Mary paid John $100 for the book.
Verbs like buy, sell, pay have as background the concept of a commercial transaction, an
event in which a buyer gives money to a seller in exchange for some goods. Now because
the transaction is an exchange it can be thought of as containing what Fillmore calls two
subscenes: a goods Jransfer, in which the goods is transferred from the seller to the buyer,
and a money Jransfer, in which the money is transferred from the buyer to the seller.
Here it is natural to say that English has as a valence realization option for transfers of
possession one in which the object being transferred from one possessor to another is
realized as direct object. Thus verbs profiling the money transfer will make the money
the direct object (pay and collect) and verbs profiling the goods transfer will make the
goods the direct object (buy and sell). Then the difference between these verb pairs can
be chalked up to what is profiled.
But what about the difference between buy and selll By hypothesis, both verbs profile
a goods transfer, but in one case the buyer is subject and in another the seller is. Perhaps
this is just an arbitrary choice. This is in some sense what the thematic role theory of
Dowty (1991) says: Since (8a) and (8b) are mutually entailing, there can be no semantic
account of the choice of subject.
In Frame Semantics, however, we may attempt to describe the facts as follows: in the
case of buy the buyer is viewed as (perspectivalized as) agent, in the case of sell, the seller
is. There are two advantages to this description. First, it allows us to preserve a
principle assumed by a number of linguists, that cross-linguistically agents must be subjects.
Second, it allows us to interpret certain adjuncts that enter into special relations with
agents: instrumentals, benefactives, and purpose clauses.
(9) a. John bought the book from Mary with/for his last pay check. [Both with and for
allow the reading on which the pay check provides the funds for the purchase.]
b. Mary sold the book to John ?with/for his last paycheck. [Only for allows the
reading on which the pay check provides the funds.]
c. John bought the house from Sue for Mary, [allows reading on which Mary is
ultimate owner, disallows the reading on which Mary is seller and Sue is seller's
agent.]
d. Sue sold the house to John for Mary, [allows reading on which Mary is seller and
Sue is seller's agent; disallows reading on which Mary is ultimate owner.]
e. John bought the house from Sue to evade taxes/as a tax dodge, [tax benefit is
John's]
f. Sue sold the house to John to evade taxes/as a tax dodge, [tax benefit is Sue's]
But what does it mean to say that a verb takes a perspective which "views" a particular
participant as an agent? The facts are, after all, that both the buyer and the seller are
agents; they have all the entailment properties that characterize what we typically call
agents; and this, Dowty's theory of thematic roles tells us, is why verbs like buy and sell can
co-exist. I will have more to say on this point in section 4; for the moment I will confine
myself to the following general observation on what Frame Semantics allows: What is
profiled and what is left out is not determined by the entailment facts of its frame. Complex
672 VI. Cognitively oriented approaches to semantics
variations are possible. For example, as Fillmore observes, the commercial transaction
frame is associated with verbs that have no natural way of realizing the seller:
(10) John spent $100 on that book.
Nothing in the valence marking of the verb spend suggests that what is being profiled
here is a possession transfer; neither the double object construction, nor from nor to is
possible for marking a core commercial transaction participant. Rather the pattern
seems to be the one available for what one might call resource consumption verbs like
waste, lose, use (up), and blow. In this profiling, there is no room for a seller. Given that
such variation in what is profiled is allowed, the idea that the agenthood of a participant
might be part of what's included or left out does not seem so far-fetched. As I will argue
in section 4, the inclusion of events into the semantics can help us make semantic sense
of what abstractions like this might mean.
These considerations argue that there can be more than one frame backgrounding
a single word meaning; for example, concepts of commercial event, possession transfer,
and agentivity simultaneously define buy. A somewhat different but related issue is the
issue of event structure.There is strong evidence cross-linguistically at least in the form
of productive word-formation processes that some verbs - for example, causatives -
represent complex events that can only be expressed through a combination of two frames
with a very specific semantics. So it appears that a word meaning can simultaneously
invoke a configuration of frames, with particulars of the configuration sometimes spelled
out morphologically.
The idea that any word meaning exploits a background is of use in the account of
polysemy. Different senses will in general involve relativization to different frames. As a
very simple example, consider the use of spend in the following sentence:
(11) John spent 10 minutes fixing his watch.
How are we to describe the relationship of the use of spend in this example, which
basically describes a watch fixing event, with that in (10), which describes a commercial
transaction? One way is to say that one sense involves the commercial transaction,
and another involves a frame we might call action duration which relates actions to
their duration, a frame that would also be invoked by durativc uses of for. A
counterproposal is that there is one sense here, which involves an actor using up a resource. But
such a proposal runs up against the problem that spend really has rather odd disjunctive
selection restrictions:
(12) John spent 30 packs of cigarettes that afternoon.
Sentence (12) is odd except perhaps in a context (such as a prison or boarding school)
where cigarette packs have become a fungible medium of exchange; what it cannot mean
is that John simply used up the cigarettes (by smoking them, for example). The point
is that a single general resource consumption meaning ought to freely allow resources
other than time and money, so a single resource consumption sense does not correctly
describe the readings available for (12); however, a sense invoking a commercial
transaction frame constrained to very specific circumstances docs. Note also, that the fact
29. Frame Semantics
673
that 30 packs of cigarettes can be the money participant in the right context is
naturally accommodated.The right constraint on the money participant is not that it be cash
(for which Visa and Mastercard can be thankful), but that it be a fungible medium of
exchange.
Summarizing:
1. Frames are motivated primarily by issues of understanding and converge with
various schema-like conceptions advanced by cognitive psychologists, AI researchers,
and cognitive linguists. They are experientially coherent backgrounds with variable
components that allow us to organize families of concepts.
2. The concept of frames has far reaching consequences when applied to lexical
semantics, because a single frame can provide the organizing background for a set of words.
Thus frames can provide an organizing principle for a rich open lexicon. FrameNet is
an embodiment of these ideas.
3. In proposing an account of lexical semantics rich enough for a theory of
understanding, Frame Semantics converges with other lexical semantic research which has
been bringing to bear a richer set of concepts on problems of the syntax semantics
interface.
Having sketched the basic idea, I want in the next two sections to briefly contrast the
notion frame with two other ideas that have played a major role in semantics, the idea of
a relation, as incorporated via set theory and predicate logic into semantics, and the idea
of a lexical field.
3. Related conceptions
In this section I compare the idea of frames with two other concepts of major importance
in theories of lexical semantics, relations and lexical fields. The comparison offers the
opportunity to develop some other key ideas of Frame Semantics, including profiling
and saliency.
3.1. Frames versus relations: profiling and saliency
Words (most verbs, some nouns, arguably all degreeable adjectives) describe relations
in the world. Love and hate are relations between animate experiencers and objects.
The verb believe describes a relation between an animate experiencer and a
proposition. These are commonplace views among philsophers of language, semanticists,
and syntacticians, and they have provided the basis for much fruitful work. Where do
frames fit in?
For Fillmore, frames describe the factual basis for relations. In this sense they are
"pre-"relational.To illustrate, Fillmore (1985) cites Mill's (1847) discussion of the words
father and son. Although there is a single history of events which establishes both the
father- and the son- relation, the words father and son pick out different entities in the
world. In Mill's terminology, the words denote different things, but connote a single thing,
the shared history.This history, which Mill calls the fundamentum relationis (the
foundation of the relation), determines that the two relations bear a fixed structural relation to
VI. Cognitively oriented approaches to semantics
each other. It is the idea of a determinate structure for a set of relations that Fillmore
likens to the idea of a frame.
Thus, a frame defines not a single relation but, minimally, a structured set of relations.
This conception allows for a natural description not just of pairs of words like father
and son, but also of single words which do not in fact settle on a particular relation.
Consider the verb risk, discussed in Fillmore & Atkins (1998), which seems to allow a range
of participants into a single grammatical "slot". For example,
(13) Joan risked
a. censure.
b. her car.
c. a trip down the advanced ski slope.
The risk frame has at least 3 distinct participants, (a) the bad thing that may happen,
(b) the valued thing that may be lost, and (c) the activity that may cause the bad thing
to happen. All can be realized in the direct object position, as (13) shows. Since there
are three distinct relations here, a theory that identifies lexical meanings with relations
needs to say there are 3 meanings as well. Frame Semantics would describe this as one
frame allowing 3 distinct profilings. It is the structure of the frame together with the
profiling options the language makes available which makes the 3 alternatives possible.
Other verbs with a similar indeterminacy of participant are copy, collide, and mix:
(14) a. Sue copied her costume (from a film poster).
b. Sue copied the film poster.
c. The truck and the car collided.
d. The truck collided with the car.
e. John mixed the soup.
f. John mixed the paste into the soup.
g. John mixed the paste and the flour.
In each of these cases the natural Frame Semantics account would be to say the frame
remains constant while the profilings or perspective changes. Thus, under a Frame
Semantics approach, verbal valence alternations are to be expected, and the possibility
of such alternations provides motivation for the idea of a background frame with a range
of participants and a range of profiling options.
Now on a theory in which senses are relations, all the verbs in (14) must have
different senses.This is, for example, because the arguments in (14a) and (14b) fill different
roles. Frame Semantics allows another option. We can say the same verb sense is used in
both cases.The differences in interpretation arise because of differences in profiling and
perspectivalization.
3.2. Frames versus lexical fields
Because frames define lexical sets, it is useful to contrast the concept of frames with an
earlier body of lexical semantic work which takes as central the identification of
lexical sets. This work develops the idea of lexical fields (Weisgerber 1962; Coseriu 1967;
Trier 1971; Geckeler 1971; Lehrer & Kittay 1992). Lexical fields define sets of lexical
items in mutually defining relations, in other words, lexical semantic paradigms. The
676 VI. Cognitively oriented approaches to semantics
that preoccupied lexical field theorists, a specialization of height that includes just the
polar adjectives, a specialization of temperature that includes just the set cold, cool,
warm, hot, and so on. This level of specificity in fact roughly describes the granularity
of FramcNct.
3.3. Minskian frames
As described in Fillmore (1982), the term frame is borrowed from Marvin Minsky. It will
be useful before tackling the question of how profiling and perspectivalization work to
take a closer look at this precursor.
In Minsky's original frames paper (Minsky 1975), frames were put forth as a solution
to the problem of scene interpretation in vision. Minsky's proposal was in reaction to
those who, like the Gcstalt theorists (Koffka 1963), viewed scene perception as a single
holistic process governed by principles similar to those at work in electric fields. Minsky
thought scenes were assembled in independent chunks, constituent by constituent, in a
series of steps involving interpretation and integration.To describe this process, a model
factoring the visual field into a number of discrete chunks, each with its own model of
change with its own discrete phases, was needed.
A frame was thus a dynamic model of some specific kind of object with specific
participants and parameters. The model had built-in expectations about ways in which the
object could change, cither in time or as a viewer's perspective on it changed, formalized
as operations mapping old frame states to new frame states. A frame also included a
set of participants whose status changed under these operations; those moving into
certain distinguished slots are foregrounded. Thus, for example, in the simplified version of
Minsky's cube frame, shown before and after a rotation in Figs. 29.1 and 29.2, a frame
state encodes a particular view of a cube and the participants are cube faces. One
possible operation is a rotation of the cube, defined to place new faces in certain view-slots,
and move old faces out and possibly out of view. The faces that end up in view arc the
foregrounded participants of the resulting frame state. Thus the cube frame offers the
tools for representing particular views or perspectives on a cube, together with the
operations that may connect them in time.
Fig. 29.1: View of cube together with simplified cube frame representing that view. Links marked
"fg" lead to foregrounded slots: slots marked "invis" are backgrounded. Faces D and C are out
of view.
29. Frame Semantics
679
The connection between an argument frame like agent patient and simple
circumstance frames can be illustrated through the example of the possession transfer
frame (related to verbs like give, get, take, receive, acquire, bequeath, loan, and so on).
Represented as an AVM, this is:
(19)
POSSESSION TRANSFER
donor animate
possession entity
recipient animate
Now both give and acquire will be defined in terms of the possession transfer frame, but
give and acquire differ in that with give the donor becomes subject and with acquire the
recipient does. (Compare the difference between buy and sell discussed in section 2.2.)
We will account for this difference by saying that give and acquire have different
mappings from the agent patient frame to their shared circumstance frame (possession
transfer).This works as follows.
We define the relation between a circumstance and argument frame via a perspectiv-
alizing function. Here are the axioms for what we will call the acquisition function, on
which the recipient is agent:
(20) a. acquisition : possession_transfer -» agent_patient
b. agent o acquisition = recipient
c. patient o acquisition = possession
d. source o acquisition = donor
The first line defines acquisition as a mapping from the sort possession_transfer to the
sort agent_patient, that is as a mapping from possession transfer eventualities to agent
patient eventualities. The mapping is total; that is, each possession transfer is
guaranteed to have an agent patient eventuality associated with it. In the second line, the
symbol o stands for function composition; the composition of the agent function with the
acquisition function (written agent o acquisition) is the same function (extensionally) as
the recipient relation.Thus the filler of the recipient role in a possession transfer must be
the same as the filler of the agent role in the associated agent patient eventuality. And so
on, for the other axioms. Summing up AVM style:
(21)
POSSESSION TRANSFER
donor \\j
recipient [2]
possession [3]
AGENT
PATIENT
agent [2]
source [y
patient [3]
I will call the mapping that makes the donor agent donation.
VI. Cognitively oriented approaches to semantics
(22)
POSSESSION TRANSFER
donor
recipient
possession
AGENT
PATIENT
agent [T]
goal ||
patient \3]
With the acquisition and donation mappings defined, the predicates give and acquire
can be defined as compositions with donation and acquisition:
give = POSSESSION transfer o donation'*
acquire = POSSESSION transfer o acquisition'*
donation'* is an inverse of donation, a function from agent patient eventualities to
possession transfers defined only for those agent patient events related to possession
transfers. Composing this with the possession transfer predicate makes give a
predicate true of those agent patient events related to possession transfers, whose agents are
donors and whose patients are possessions. The treatment of acquire is parallel but uses
the acquisition mappings. For more extensive discussion, sec Gawron (2008).
Summarizing:
a. an argument frame agent patient, with direct consequences for syntactic valence
(agents become subject, patients direct object, and so on).
b. a circumstance frame possession transfer, which captures the circumstances of
possession transfer.
c. perspectivalizing functions acquisition and donation which map participants in the
circumstances to argument structure.
This is the basic picture of perspectivalization. The picture becomes more interesting
with a richer example.
In the discussion that follows, I assume a commercial transaction frame with at least
the following frame elements:
(23)
COMMERCIAL
TRANSACTION
buyer animate
seller animate
money fungible
goods entity
This is a declaration that various functions from event sorts to truth values and entity
sorts exist, a rather austere model for the sort of rich backgrounding function we have
assumed for frames. Wc will sec how this model is enriched below.
Our picture of profiling and perspectivalization can be extended to the more complex
cases of commercial transaction predicates with one more composition. For example, we
may define buy' as follows:
29. Frame Semantics
683
consumption
Buy
resource
consumption
Fig. 29.3: Left: Lexical network for commercial transaction. Right: Same network with the perspec-
tivalization chosen by buy in the boxed area.
do ... cause become ... in Dowty's system attributes a fixed structure to causatives of
inchoatives. As these examples show, a frame may have subscene roles which carve up
its constituents in incompatible ways. Now this may seem peculiar. Shouldn't the roles of
a frame define a fixed relation between disjoint entities? I submit that the answer is no.
The roles associated with each sort of event are regularities that help us classify an event
as of that sort. But such functions are not guaranteed to carve up each event into non-
overlapping, hierarchically structured parts. Sometimes distinct roles may select
overlapping constituents of events, particularly when independent individuation criteria are not
decisive, as when the constituents are collectives, or shapeless globs of stuff, or abstract
things such as events or event types. Thus we get the cases discussed above like collide,
mix, and risk, where different ways of profiling the frames give us distinct, incompatible
sets of roles. We may choose to view the colliders as a single collective entity (X and
Y collided), or as two (X collided with Y). We may choose to separate a figure from a
ground in the mixing event (14f),or lump them together (mix X and Y), or just view the
mixed substance as one (14f). Finally, risks involve an action (13c) and a potential bad
consequence (13a), and for a restricted set of cases in which that bad consequence is a
loss, a lost thing (13b).
What of relations? Formally, in this frame-based picture, we have replaced relations
with event predicates, each of which is defined through some composed set of mappings
to a set of events that will be defined only for some fixed set of roles. Clearly, for every
lexical predicate, there is a corresponding relation, namely one defined for exactly the
same set of roles as the predicate. Thus in the end the description of the kind of lexical
semantic entity which interfaces with the combinatorial semantics is not very different.
29. Frame Semantics
685
(27) a. The glass was hurled against the wall and broke.
b. The cockroach was hurled against the wall and died.
Thus the defaults at play in determining matters of "valence" differ from those in
discourse. We can at least describe the contrasts in (26) - not explain it - by saying
movement is an optional component of the breaking frame through which the denotation of
the verb break is defined, and not a component of the killing frame; or in terms of the
formal picture of section 4: Within the conventional lexical network linking frames in
English there is a partial function from breaking events to movement subscenes; there is
no such function for killing events.
In contrast Fillmore's (1985:232) discussed in section 2.1:
(28) We never open our presents until morning.
The point of this example was that it evoked Christmas without containing a single word
specific to Christmas. How might an automatic interpretation system simulate what is going
on for human understanders? Presumably by a kind of application of Occam's razor.There
is one and only one frame that explains both the presence of presents and the custom of
waiting until morning, and that is the Christmas frame.Thus the assumption that gets us the
most narrative bang for the buck is Christmas. In this case the frame has to be evoked by
dynamically assembling pieces of information activated in this piece of discourse.
These two examples show that frames will function differently in a theory of discourse
understanding than they will in a theory of sign-meanings in at least two ways. They will
require a different notion of default, and they will need to resort to different inferencing
strategies, such as inference to the most economical explanation.
7. Conclusion
The logical notion of a relation, which preserves certain aspects of the linearization
syntax forces on us, has at times appeared to offer an attractive account of what we grasp
when we grasp sign meanings. But the data we have been looking at in this brief
excursion into Frame Semantics has pointed another way. Lexical senses seem to be tied to
the same kind schemata that organize our perceptions and interpretations of the social
and physical world. In these schemata participants are neither linearized nor uniquely
individuated, and the mapping into the linearized regime of syntax is constrained but
underdetermined. We see words with options in what their exact participants are and
how they are realized. Frames offer a model that is both specific enough and flexible
enough to accommodate these facts, while offering the promise of a firm grounding for
lexicographic description and an account of text understanding.
8. References
Ashcr, Nicholas & Alex Lascaridcs 1995. Lexical disambiguation in a discourse context. Journal of
Semantics 12,69-108.
Baker, Collin & Charles J. Fillmore 2001. Frame Semantics for text understanding. In: Proceedings
of WordNet and Other Lexical Resources Workshop. Pittsburgh, PA: North American
Association of Computational Linguistics, 3-4.
29. Frame Semantics
687
Fillmore, Charles J. 1985. Frames and the semantics of understanding. Quaderni di Semantica 6,
222-254.
Fillmore, Charles J. & B.T.S Atkins 1994. Starting where the dictionaries stop. The challenge for
computational lexicography. In: B.T.S Atkins & A. Zampolli (eds.). Computational Approaches
to the Lexicon. Oxford: Clarendon Press.
Fillmore. Charles J. & B.T.S Atkins 1998. FraineNet and lexicographic relevance. In: Proceedings
of the First International Conference on Language Resources and Evaluation. Granada:
International Committee on Computational Linguistics, 28-30.
Fillmore. Charles J. & Collin Baker 2000. FrameNet, http://www.icsi.berkeley.edu/~framenet,
October 13,2008.
Gawron, Jean Mark 1983. Lexical Representations and the Semantics of Complementation. New
York: Garland.
Gawron, Jean Mark 2008. Circumstances and Perspective. The Logic of Argument Structure, http://
iepositoiiescdlib.oig/ucsdling/sdlp3/l. October 13,2008.
Geckeler, Horst 1971. Strukturelle Semantik und Wortfeldtheorie. Miinclieii: Fink.
Hobbs, Jerry R., Mark Stickcl, Douglas Appelt & Paul Martin 1993. Interpretation as abduction.
Artificial Intelligence 63,69-142.
Jackendoff, Ray S. 1990. Semantic Structures. Cambridge, MA:The MIT Press.
Koffka, Kurt 1963. Principles ofGestalt Psychology. New York: Harcourt, Brace, and World.
Lakoff, George 1972. Linguistics and natural logic. In: D. Davidson & G. Harm an (eds.). Semantics
of Natural Language. Dordrecht: Reide 1,545-665.
Lakoff, George 1983. Category. An essay in cognitive linguistics In: Linguistic Society of Korea
(ed.). Linguistics in the Morning Calm. Seoul: Hanshin, 139-194.
Lakoff, George & Mark Johnson 1980. Metaphors We Live By. Chicago, IL: The University of
Chicago Press.
Langacker, Ronald 1984. Active zones In: C. Brugman et al. (eds). Proceedings of the Tenth Annual
Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society, 172-188.
Lehrer, Adrienne & Eva Kittay (eds.) 1992. Frames, Fields, and Contrasts. New Essays in Semantic
and Lexical Organization. Hillsdale. NJ: Lawrence Erlbaum.
Levin,Beth 1993. English Verb Classes and Alternations. Chicago, I L:The University of Chicago Press.
Mill, John Stuart 1847. A System of Logic. New York: Harper and Brothers
Minsky, Marvin 1975.A framework for representing knowledge. In: P.Winston (ed.). The Psychology
of Computer Vision. New York: McGraw-Hill, 211-277. http://web.media.init.edu/~minsky/
papers/Frames/franie&html, October 13,2008.
Parsons, Terence 1990. Events in the Semantics of English. Cambridge, MA:The MIT Press.
Pustejovsky, James 1995. The Generative Lexicon. Cambridge, MA:The MIT Press.
Rounds, William C. 1997. Feature logics In: J. van Benthem & A. ter Meulen (eds.). Handbook of
Logic and Language. Amsterdam: Elsevier, 475-533.
Rumelhart, Donald 1980. Schemata. The building blocks of cognition. In: R.J. Spiro, B.C. Bruce
& W.F. Brewer (eds.). Theoretical Issues in Reading Comprehension. Hillsdale, NJ: Lawrence
Erlbaum, 33-58.
Schank, Roger C. & R.P. Abelson 1977. Scripts, Plans, Goals, and Understanding. Hillsdale, NJ:
Lawrence Erlbaum.
Smolka, Gert 1992. Feature constraint logics for unification grammars Journal of Logic
Programming 12,51-87.
Trier, Jost 1971. Aufs'dlze und Vortrage zur Wortfeldtheorie.T\\c Hague: Mouton.
Wcisgcrbcr, J. Leo 1962. Gnindzilge der inhaltsbezogenen Grammatik. Diisscldorf: Schwann.
Jean-Mark Gawron, San Diego, CA (USA)
688 VI. Cognitively oriented approaches to semantics
30. Conceptual Semantics
1. Overall framework
2. Major features of Conceptual Structure
3. Compositionality
4. References
Abstract
Conceptual Semantics takes the meanings of words and sentences to be structures in the
minds of language users, and it takes phrases to refer not to the world per se, but rather
to the world as conceptualized by language users. It therefore takes seriously constraints
on a theory of meaning coming from the cognitive structure of human concepts, from the
need to learn words, and from the connection between meaning, perception, action, and
nonlinguistic thought. The theory treats meanings, like phonological structures, as
articulated into substructures or tiers: a division into an algebraic Conceptual Structure and
a geometric/topological Spatial Structure; a division of the former into Propositional
Structure and Information Structure; and possibly a division of Propositional Structure
into a descriptive tier and a referential tier. All of these structures contribute to word,
phrase, and sentence meanings. The ontology of Conceptual Semantics is richer than in
most approaches, including not only individuals and events but also locations,
trajectories, manners, distances, and other basic categories. Word meanings are decomposed
into functions and features, but some of the features and connectives among them do not
lend themselves to standard definitions in terms of necessary and sufficient conditions.
Phrase and sentence meanings are compositional, but not in the strict Fregean sense:
many aspects of meaning are conveyed through coercion, ellipsis, and constructional
meaning.
1. Overall framework
Conceptual Semantics is a formal approach to natural language meaning developed by
Jackendoff (1983,1987,1990,2002,2007) and Pinker (1989,2007); Pustejovsky (1995) has
also been influential in its development.
The approach can be characterized at two somewhat independent levels. The first
is the overall framework for the theory of meaning, and how this framework is
integrated into linguistics, philosophy of language, and cognitive science (section l).Thc
second is the formal machinery that has been developed to achieve the goals of this
framework (sections 2 and 3). These two are somewhat independent: the general
framework might be realized in terms of other formal approaches, and many aspects
of the formal machinery can be deployed within other frameworks for studying
meaning.
The fundamental goal of Conceptual Semantics is to describe how humans express
their understanding of the world by means of linguistic utterances. From this goal flow
two theoretical commitments. First, linguistic meaning is to be described in mentalistic/
psychological terms - and eventually in neuroscientific terms. The theory of meaning,
like the theories of generative syntax and phonology, is taken to be about what is going
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 688-709
30. Conceptual Semantics 693
sentence meanings. Besides differences in the phenomena that it focuses on, Cognitive
Grammar differs from Conceptual Semantics in four respects. First, the style of its
formalization is different and arguably less rigorous. Second, it tends to connect to nonlin-
guistic phenomena through theories of embodied cognition rather than through more
standard cognitive neuroscience. Third (and related), many practitioners of Cognitive
Grammar take a rather empiricist (or Lockean) view of learning, whereas Conceptual
Semantics admits a considerable structured innate (or Kantian) basis to concept
formation. Fourth, Conceptual Semantics is committed to syntax having a certain degree of
independence from semantics, whereas Cognitive Grammar seeks to explain all aspects
of syntactic form in terms of the meaning(s) expressed (see article 86 (Kay & Michaelis)
Constructional meaning).
1.3. Conceptual Structure and Spatial Structure; interfaces
with syntax and phonology
The central hypothesis of Conceptual Semantics is that there is a level of mental
representation, Conceptual Structure, which instantiates sentence meanings and serves as the
formal basis for inference and for connection with world knowledge and perception.The
overall architecture of the mind in which this embedded is shown in Fig. 30.1.
Vision
u ition conceptual „ „ Spatial *C^*Haptic PercePtion
Phonology «-► Syntax «-► J «-► strHuctureJ^
Motor ar *^ °jiu UB °"l"'lure^*> Proprioception
instructions *«-. ..--' ^ Action repesentations
to vocal tract
LANGUAGE COGNITION PERCEPTION/ACTION
Fig. 30.1: Architecture of the mind
Each of the levels in Fig. 30.1 is a generative system with its own primitives and
principles of combination. The arrows indicate interfaces among representations: sets of
principles that provide systematic mappings from one level to the other. On the left-hand
side are the familiar linguistic levels and their interfaces to hearing and speaking. On
the right-hand side are nonlinguistic connections to the world through visual, haptic,
and proprioceptive perception, and through the formulation of action. (One could add
general-purpose audition, smell, and taste as well.)
In the middle lies cognition, here instantiated as the levels of Conceptual
Structure and Spatial Structure. Spatial Structure is hypothesized (Jackendoff 1987, 1996a;
Landau & Jackendoff 1993) as a geometric/topological encoding of 3-dimensional object
shape, spatial layout, motion, and possibly force. It is not a strictly visual
representation, because these features of the conceptualized physical world can also be derived
by touch (the haptic sense), by proprioception (the spatial position of one's own body),
and to some degree by auditory localization. Spatial Structure is the medium in which
these disparate perceptual modalities are integrated. It also serves as input for
formulating one's own physical actions. It is moreover the form in which memory for the
shape of familiar objects and object categories is stored. This level must be generative,
since one encounters, remembers, and acts in relation to an indefinitely large number of
694 VI. Cognitively oriented approaches to semantics
objects and spatial configurations in the course of life (see article 107 (Landau) Space in
semantics and cognition).
However, it is impossible to encode all aspects of cognition in gcomctric/topological
terms. A memory for a shape must also encode the standard type/token distinction, i.e.
whether this is the shape of a particular object or of an object category. Moreover, the
taxonomy of categories is not necessarily a taxonomy of shapes. For instance, forks and
chairs are not at all similar in shape, but both are artifacts; there is no general
characterization of artifacts in terms of shape or the spatial character of the actions one
performs with them. Likewise, the distinction between familiar and unfamiliar objects
and actions cannot be characterized in terms of shape. Finally, social relations such as
kinship, alliance, enmity, dominance, possession, and reciprocation cannot be formulated
in geometric terms. All of these aspects of cognition instead lend themselves to an
algebraic encoding in terms of features (binary or multi-valued, e.g. TYPE vs.TOKEN) and
functions of one or more arguments (e.g. x INSTANCE-OF y, x SUBCATEGORY-OF
y, x KIN-OF y).This system of algebraic features and functions constitutes Conceptual
Structure. Note that at least some of the distinctions just listed, in particular the social
relations, are also made by nonhuman primates. Thus a description of primate nonlin-
guistic cognition requires some form of Conceptual Structure, though doubtless far less
rich than in the human case.
Conceptual Structure too has its limitations. Attempts to formalize meaning and
reasoning have always had to confront the impossibility of coding perceptual characteristics
such as color, shape, texture, and manner of motion in purely algebraic terms.The
architecture in Fig. 30.1 proposes to overcome this difficulty by sharing the work of encoding
meaning between the geometric format of Spatial Structure and the algebraic format
of Conceptual Structure: spatial concepts have complementary representations in both
domains. (In a sense, this is a more sophisticated version of Paivio's 1971 dual-coding
hypothesis and the view of mental imagery espoused by Kosslyn 1980.)
Turning now to the interfaces between meaning and language, a standard
assumption in both standard logic and mainstream generative grammar is that semantics
interfaces exclusively with syntax, and in fact that the function of the syntax-semantics
interface is to determine meaning in one-to-one fashion from syntactic structure (this is
made explicit in article 81 (Harley) Semantics in Distributed Morphology).The
assumption, rarely made explicit (but going back at least to Descartes), is that combinatorial
thought is possible only through the use of combinatorial language.This comports with
the view, common well into the 20th century, that animals are incapable of thought.
Modern cognitive ethology (Hauser 2000; Cheney & Seyfarth 2007) decisively refutes
this view, and with it the assumption that syntax is the source of combinatorial thought.
Conceptual Semantics is rooted instead in the intuition that language is combinatorial
because it evolved to express a pre-existing combinatorial faculty of thought (condition
C6).
Under this approach, it is quite natural to expect Conceptual Structure to be far richer
than syntactic structure - as indeed it is. Culicover & Jackendoff (2005) argue that the
increasing complexity and abstraction of the structures posited by mainstream
generative syntax up to and including the Minimalist Program (Chomsky 1995) have been
motivated above all by the desire to encode all semantic relations overtly or covertly in
syntactic structure. Ultimately the attempt fails because semantic relations are too rich
and multidimensional to be encoded in terms of purely syntactic mechanisms.
30. Conceptual Semantics 695
In Conceptual Semantics, a word is regarded as a part of the language/thought
interface: it is a longterm memory association of a piece of phonological structure (e.g./kaet/),
some syntactic features (singular count noun), and a piece of Conceptual Structure
(FELINE ANIMAL, PET, etc.). If it is a word for a concept involving physical space, it
also includes a piece of Spatial Structure (what cats look like). Other interface principles
establish correspondences between semantic argument structure (e.g. what characters
an action involves) and syntactic argument structure (e.g. transitivity), between scope of
quantification in semantics and position of quantifiers in syntax, and between topic and
focus in semantics (information structure, see article 71 (Hinterwimmer) Information
structure) and affixation and/or position in syntax. In order to deal with situations where
topic and focus arc coded only in terms of stress and intonation (e.g. The dog CHASED
the mailman), the theory offers the possibility of a further interface that establishes a
correspondence directly between semantics and phonology, bypassing syntax altogether.
If it proves necessary to posit an additional level of "linguistic semantic structure"
that is devoted specifically to features relevant for grammatical expression (as posited
in article 31 (Lang & Maienborn) Two-level Semantics, article 19 (Levin & Rappaport
Hovav) Lexical Conceptual Structure and article 81 (Harley) Semantics in Distributed
Morphology), such a level would be inserted between Conceptual Structure and syntax,
with interfaces to both. Of course, in order for language to be understood and to play a
role in inference, it is still necessary for words to bridge all the way from phonology to
Conceptual Structure and Spatial Structure - they cannot stop at the putative level of
linguistic semantic structure. Hence the addition of such an extra component would not
at all change the content of Conceptual Structure, which is necessary to drive inference
and the connection to perception. (Jackendoff 2002, section 9.7, argues that in fact such
an extra component is unnecessary.)
2. Major features of Conceptual Structure
This section sketches some of the important features of Conceptual Structure
(henceforth CS). There is no space here to spell out formal details; the reader is referred to
Jackendoff (1983; 2002, chapters 11-12).
2.1. Tiers in CS
A major advance in phonological theory was the realization that phonological structure
is not a single formal object, but rather a collection of tiers, each with its own formal
organization, which divide up the work of phonology into a number of independent but
correlated domains. These include at least segmental and syllabic structure; the
amalgamation of syllables into larger domains such as feet, phonological words, and into-
national phrases; the metrical grid that assigns stress; and the structure of intonation
contours correlated with prosodic domains.
A parallel innovation is proposed within Conceptual Semantics. The first division,
already described, is into Spatial Structure and Conceptual Structure. Within CS, the
clearest division is into Propositional Structure, a function-argument encoding of who
did what to whom, how, where, and when (arguments and modifiers), versus
Information Structure, the encoding of Topic, Focus, and Common Ground. These two aspects of
meaning arc orthogonal, in that virtually any constituent of a clause, with any thematic
696 VI. Cognitively oriented approaches to semantics
or modifying role, can function as Topic or Focus or part of Common Ground.
Languages typically use different grammatical machinery for expressing these two aspects of
meaning. For instance, roles in Propositional Structure are typically expressed (morpho-)
syntactically in terms of position and/or case with respect to a head. Roles in Information
Structure are typically expressed by special focusing constructions, by special focusing
affixes, by special topic and focus positions that override propositional roles, and above
all by stress and intonation - which is never used to mark propositional roles. Both the
syntactic and the semantic phenomena suggest that Propositional and Information
Structure are orthogonal but linked organizations of the semantic material in a sentence.
More controversially, Jackendoff (2002) proposes segregating Propositional Structure
into two tiers, a descriptive tier and a referential tier. The former expresses the
hierarchical arrangement of functions, arguments, and modifiers. The latter expresses the
sentence's referential commitments to each of the characters and events, and the binding
relations among them; it is a dependency graph along the lines of Discourse
Representation Theory (see article 37 (Kamp & Reyle) Discourse Representation Theory). The
idea behind this extra tier is that such issues as anaphora, quantification, specificity, and
referential opacity are in many respects orthogonal to who is performing the action and
who the action is being performed on.The canonical grammatical structures of language
typically mirror the latter rather closely: the relative embedding of syntactic constituents
reflects the relative embedding of arguments and modifiers. On the other hand, scope of
quantification, specificity, and opacity arc not at all canonically expressed in the surface
of natural languages; this is why theories of quantification typically invoke something
like "quantifier raising" to relate surface position to scope. The result is a semantic
structure in which the referential commitments are on the outside of the expression, and the
thematic structure remains deeply embedded inside, its arguments bound to quantifiers
outside. Dividing the expressive work into descriptive and referential tiers helps clarify
the resulting notational logjam.
The division into descriptive and referential tiers also permits an insightful account of
two kinds of anaphora. Standard definite anaphora, as in (la), is anaphoric on the
referential tier and indicates coreference. Oie-anaphora, as in (lb), however, is anaphoric on
the descriptive tier and indicates a different individual with the same description.
(1) a. Bill saw a balloon and I saw it too.
b. Bill saw a balloon and I saw one too.
2.2. Ontological categories and aspectual features
Reference is typically discussed in terms of NPs that refer to objects. Conceptual Semantics
takes the position that there is a far wider range of ontological types to which reference can
be made (m-referencc, of course). The deictic that is used in (2a) to (m-)refer to an object
that the hearer is invited to locate in (his or her conceptualization of) the visual environment.
Similarly, the underlined deictics in (2b-g) are used to refer to other sorts of entities.
(2) a. Would you pick that [pointing] up, please? [reference to object]
b. Would you put your hat there [pointing], please? [reference to location]
c. They went that away [pointing]*. [reference to direction]
d. Can you do this [demonstrating]'! [reference to action]
30. Conceptual Semantics
e. That [pointing] had better never happen in MY house! [reference to event]
f. The fish that got away was this [demonstrating] long. [reference to distance]
g. You may start ... right ... now [clapping]\ [reference to time]
This enriched ontology leads to a proliferation of referential expressions in the semantic
structure of sentences. For instance, John went to Boston refers not only to John and
Boston, but also to the event of John going to Boston and to the trajectory 'to Boston',
which terminates at Boston. The event corresponds to the Davidsonian event variable
(see article 34 (Maienborn) Event semantics) and the events of Situation Semantics (see
articles 35 (Ginzburg) Situation Semantics and NL ontology and 36 (Ginzburg) Situation
Semantics. Event and Situation Scmanticists have taken this innovation to be a major
advance in the ontology over classical formal semantics. Yet the expressions in (2) clearly
show a far more differentiated ontology; events are just a small part of the full story. At
the same time, it should be recalled that the 'existence' of trajectories, distances, and so
forth is a matter not of how the world is, but of how speakers conceptualize the world.
Incidentally, trajectories such as to Boston are often thought to intrinsically involve
motion. A more careful analysis suggests that this is not the case. Motion along the
trajectory is a product of composing the motion verb went with to Boston. But the very same
trajectory is referred to in the road leads to Boston - a stative sentence expressing the
extent of the road - and in the sign points to Boston - a stative sentence expressing the
orientation of the sign. The difference among these examples comes from the semantics
of the verb and the subject, not from the prepositional phrase.
The semantics of expressions of location and trajectory - including their crossling-
uistic differences and relationships to Spatial Structure - has become a major
preoccupation in areas of semantics related to Conceptual Semantics (e.g. Bloom et al. 1996,Talmy
2000, Levinson 2003, van der Zee & Slack 2003; see article 98 (Pederson) The expression
of space; article 107 (Landau) Space in semantics and cognition).
Orthogonal to the ontological category features are aspectual features. It has long
been known (Declerck 1979, Hinrichs 1985, Bach 1986, and many others) that the
distinction between objects and substances (expressed by count and mass NPs respectively)
parallels the distinction between events and processes (telic and atelic sentences) (see
article 46 (Lasersohn) Mass nouns and plurals; article 48 (Filip) Aspectual class and
Aktionsart). Conceptual Semantics expresses this parallelism (Jackendoff 1991) through
a feature [±boundcd]. Another feature, [±intcrnal structure], deals with aggregation,
including plurality. (3) shows how these features apply to materials and situations. (The
situations in (3) are expressed as NPs but could just as easily be sentences.)
(3) Material Situation
[+boundcd,-internal object (a dog) single telic event (a sneeze)
structure]
[-bounded,-internal substance (dirt) atelic process (sleeping)
structure]
[+boundcd, +intcrnal group (a herd) multiple telic event (some
structure] sneezes)
[-bounded,+internal aggregate (dogs) iterated events (sneezing
structure] repeatedly)
698 VI. Cognitively oriented approaches to semantics
Trajectories or paths also partake of this feature system: a path such as into the forest,
with an inherent endpoint, is [+bounded]; along the road, with no inherent endpoint, is
[-bounded].
Because these features cut across ontological categories, they can be used to
calculate the telicity and iterativity of a sentence based on the contributions of all its parts
(Jackendoff 1996b). For instance, (4a) is telic because its subject and path are bounded,
and its verb is a motion verb. (4b) is atelic because its path is unbounded and its verb is
a motion verb. (4c) is atelic and iterative because its subject is an unbounded aggregate
of individuals and its verb is a motion verb. (4d) is stative, hence atelic, because the verb
is stative - even though its path is bounded. (Jackendoff 1996b shows formally how these
results follow.)
(4) a. John walked into the forest.
b. John walked along the road.
c. People walked into the forest.
d. The road leads into the forest.
2.3. Feature analysis in word meanings
Within Conceptual Semantics, word meanings are regarded as composite, but not
necessarily built up in a fashion that lends itself to definitions in terms of other words. This
subsection and the next three lay out five sorts of evidence for this view, and five different
innovations that therefore must be introduced into lexical decomposition.
The first case is when a particular semantic feature spans a number of semantic fields.
Conceptual Semantics grew out of the fundamental observations of Gruber (1965), who
showed that the notions of location, change, and causation extend over the semantic
fields of space, possession, and predication. For example, the sentences in (5) express
change in three different semantic fields, in each case using the verb go and expressing
the endpoint of change as the object of to.
(5) a. John went to New York. [space]
b. The inheritance went to John. [possession]
c. The light went from green to red. [predication]
Depending on the language, sometimes these fields share vocabulary and sometimes
they don't. Nevertheless, the semantic generalizations ring true crosslinguistically. The
best way to capture this crosscutting is by analyzing motion, change of possession, and
change of predication in terms of a common primitive function GO (alternating with BE
and STAY) plus a "field feature" that localizes it to a particular semantic field (space vs.
possession vs. predication). Neither the function nor the field feature is lexicalized by
itself: GO is not on its own the meaning of go. Rather, these two elements are like
features in phonology, where for example voiced is not on its own a phonological segment
but when combined with other features serves to distinguish one segment from another.
Thus these meaning components cannot be expressed as word-like primes.
An extension of this approach involves force-dynamic predicates (Talmy 1988,
Jackendoff 1990: chapter 7), where for instance force, entail, be obligated, and the various
senses of must share a feature, and permit, be consistent with, have a right, and the various
30. Conceptual Semantics 699
senses of may share another value of the same feature. At the same time, these predicates
differ in whether they are in the semantic field of physical force, social constraint, logical
relation, or prediction.
Another such case was mentioned in section 2.2: the strong semantic parallel between
the mass-count distinction in material substances and the process-event distinction in
situations. Despite the parallel, only a few words cut across these domains. One happens
to be the word end, which can be applied to speeches, to periods of time, and to tables of
certain shapes (e.g. long ones but not circular ones). On the Conceptual Semantics
analysis (Jackendoff 1991), end encodes a boundary of an entity that can be idealized as one-
dimensional, whatever its ontological type. And because only certain table shapes can
be construed as elaborations of a one-dimensional skeleton, only such tables have ends.
(Note that an approach to end in terms of metaphor only restates the problem. Why do
these metaphors exist? Answer: Because conceptualization has this feature structure.)
The upshot of cases like these is that word meanings cannot be expressed in terms of
word-like definitions, because the primitive features are not on their own expressible as
words.
2.4. Spatial structure in word meanings
One of the motivations for concluding that linguistic meaning must be segregated from
"world knowledge" (as inTwo-lcvcl Semantics and Lexical Conceptual Structure) is that
there are many words with parallel grammatical behavior but clearly different semantics.
For instance, verbs of manner of locomotion such as jog, sprint, amble, strut, and swagger
have identical grammatical behavior but clearly differ in meaning. Yet there is no evident
way to decompose them into believable algebraic features. These actions differ in how
they look and how they feel. Similarly, a definition of chair in terms of "[+has-a-seat]"
and "[+has-a-back]" is obviously artificial. Rather, our knowledge of the shape of chairs
seems to have to do with what they look like and what it is like to sit in them - where
sitting is ultimately understood in terms of performing the action. Likewise, our
knowledge of dog at some level involves knowing that dogs bark. But to encode this purely in
terms of a feature like "[+barks]" misses the point. It is what barking sounds like that is
important - and of course this sound also must be involved in the meaning of the verb
bark.
In each of these cases, what is needed to specify the word meaning is not an algebraic
feature structure, but whatever cognitive structures encode categories of shapes, actions,
and sounds. Among these structures are Spatial Structures of the sort discussed in section
1.3, which encode conceptualizations of shape, color, texture, decomposition into parts,
and physical motion. As suggested there, it is not that these structures alone constitute
the word meanings in question. Rather, it is the combination of Conceptual Structure
with these structures that fills out the meanings.
Jackendoff (1996a) hypothesizes that these more perceptual elements of meaning do
not interface directly with syntactic structure; that is, only Conceptual Structure makes
a difference in syntactic behavior. For example, the differences in manner of motion
among the verbs mentioned above arc coded in Spatial Structure and therefore make no
difference in their grammatical behavior. If correct, this would account for the fact that
such factors are not usually considered part of "linguistic semantics", even though they
play a crucial role in understanding. Furthermore, since these factors are not encoded in
VI. Cognitively oriented approaches to semantics
a format amenable to linguistic expression, they cannot be decomposed into definitions
composed of words. The best one can do by way of definition is ostension, relying on the
hearer to pick out the relevant factors of the environment.
2.5. Centrality conditions and preference rules
It is well known that many words do not have a sharply delimited denotation. An ancient
case is bald: the central case is total absence of hair, but there is no particular amount
of hair that serves as dividing point between bald and non-bald. Another case is color
terms: for instance, there are focal values of red and orange and a smooth transition of
hues between them; but there is no sharp dividing line, one side of which is definitely
red and the other side is definitely orange. To reinforce a point made in section 1.1, it
is not our ignorance of the true facts about baldness and redness that leads to this
conclusion. Rather, there simply is no fact of the matter. When judgments of such
categories are tested experimentally, the intermediate cases lead to slower, more variable, and
more context-dependent judgments. The character of these judgments has to do more
with the conceptualization of the categories in question than with the nature of the real
world (see article 28 (Taylor) Prototype theory, article 106 (Kelter & Kaup) Conceptual
knowledge, categorization and meaning).
In Conceptual Semantics, such words involve centrality conditions.They are coded in
terms of a focal or central case (such as completely bald or focally red), which serves as
prototype. Cases that deviate from the prototype (as in baldness) - or for which another
candidate prototype competes (as in color words) - result in the observed slowness and
variability of judgments. Such behavior is in fact what would be expected from a neural
implementation -sharp categorical behavior is actually much harder to explain in neural
terms.
A more complex case that results in noncategorical judgments involves so-called
cluster concepts. The satisfaction conditions for such concepts are combined by a non-
Boolean connective (let's call it "smor") for which there is no English word. If a concept
C is characterized by [condition A smor condition B], then stereotypical instances of
C satisfy both condition A and condition B, and more marginal cases satisfy either A
or B. For instance, the verb climb stcrcotypically involves (A) moving upward by (B)
clambering along a vertically aligned surface, as in (6a). However, (6b) violates condition
A while observing condition B, and (6c,d) are the opposite. (6e,f), which violate both
conditions, are unacceptable. This shows that neither condition is necessary, yet either is
sufficient.
(6) a. The bear climbed the tree. [upward clambering]
b. The bear climbed down the tree/ [clambering only]
across the cliff.
c. The airplane climbed to 30,000 feet, [upward only]
d. The snake climbed the tree. [upward only]
e. *The airplane climbed down to [neither upward nor clambering]
10,000 feet.
f. The snake climbed down the tree. [neither upward nor clambering]
30. Conceptual Semantics 701_
The connective between the conditions is not simple logical disjunction, because if we
hear simply The bear climbed, we assume it was going upward by clambering. That is,
both conditions are default conditions, and either is violable (Fillmore 1982). Jackendoff
(1983) calls conditions linked by this connective preference rules.
This connective is involved in the analysis of Wittgenstein's (1953) famous example
game, in the verb see (Jackendoff 1983: chapter 8), in the preposition in (Jackendoff
2002: chapter 11), and countless other cases. It is also pervasive elsewhere in
cognition, for example gestalt principles of perceptual grouping (Wertheimer 1923,
Jackendoff 1983: chapter 8) and even music and phonetic perception (Lerdahl & Jackendoff
1983). Because this connective is not lexicalized, word meanings involving it cannot be
expressed as standard definitions.
2.6. Dot-objects
An important aspect of Conceptual Semantics stemming from the work of Pustejovsky
(1995) is the notion of dot-objects - entities that subsist simultaneously in multiple
semantic domains. A clear example is a book, a physical object that has a size and weight,
but that also is a bearer of information. Like other cluster concepts, either aspect of this
concept can be absent: a blank notebook bears no information, and the book whose
plot I am currently developing in my head is not (yet) a physical object. But a
stereotypical book partakes of both domains. The information component can be linked to
other instantiations besides books, such as speech, thoughts in people's heads, computer
chips, and so on. Pustejovsky notatcs the semantic category of objects like books with a
dot between the two domains: [PHYSICAL OBJECT • INFORMATION], hence the
nomenclature "dot-object."
Note that this treatment of book is different from considering the word polysemous.
It accounts for the fact that properties from both domains can be applied to the same
object at once: The book that fell off the shelf [physical] discusses the war [information].
Corresponding to this sort of dot-object there arc dot-actions. Reading is at once a
physical activity - moving one's glance over a page - and an informational one - taking in
the information encoded on the page. Writing is creating physical marks that instantiate
information, as opposed to, say, scribbling, which need not instantiate information.
Implied in this analysis is that spoken language also is conceptualized as a dot-object:
sounds dotted with information (or meaning). The same information can be conveyed
by different sounds (e.g. by speaking in a different language), and the same sounds can
convey different information (e.g. different readings of an ambiguous sentence, or
different pragmatic construals of the same sentence in different contexts). Then speaking
involves emitting sounds dotted with information; by constrast, groaning is pure sound
emission.
Another clear case of a dot-object is a university, which consists at once of a
collection of buildings and an academic organization: Walden College covers 25 acres of hillside
and specializes in teaching children of the rich. Still another such domain (pointed out
by Searle 1995) is actions in a game. For example, hitting a ball to a certain location is a
physical action whose significance in terms of the game may be a home run, which adds
runs, or a foul ball, which adds strikes. For such a case, the physical domain is "dotted"
with a special "game domain," in terms of which one carries out the calculation of points
or the like to determine who wins in the end.
702 VI. Cognitively oriented approaches to semantics
Symbolic uses of objects, say in religious or patriotic contexts, can also be analyzed in
terms of dot-objects and dot-actions with significance in the symbolized domain.
Similarly with money, where coins, bills, checks, and so on - and the exchange thereof - are
both physical objects and monetary values (Searle calls the latter institutional facts, in
contrast with physical brute facts).
Perhaps the most far-reaching application of dot-objects is to the domain of persons
(Jackendoff 2007). On one hand, a person is a physical object that occupies a position
in space, has weight, can fall, has blue eyes, and so forth. On the other hand, a person
has a personal identity in terms of which social roles are understood: one's kinship or
clan relations, one's social and contractual obligations, one's moral responsibility, and
so forth. The distinction between these two domains is recognized crossculturally as the
difference between body on one hand and soul or spirit on the other. Cultures are full of
beliefs about spirits such as ghosts and gods, with personal identity and social significance
but no bodies. We quite readily conceptualize attaching personal identity to different
bodies, as in beliefs in life after death and reincarnation, films like Freaky Friday (in
which mother and daughter involuntarily exchange bodies), and Gregor Samsa's
metamorphosis into a giant cockroach. A different sort of such dissociation is Capgras
Syndrome (McKay, Langdon & Coltheart 2005), in which a stroke victim claims his wife has
been replaced by an impostor who looks exactly the same. Thus persons, like books and
universities, are dot-objects.
Social actions partake of this duality between physical and social/personal as well. For
example, shaking hands is a physical action whose social significance is to express mutual
respect between persons.The same social significance can be attached to other actions, say
to bowing, high-fiving, or a man kissing a lady's hand. And the same physical action can
have different social significance; for instance hissing is evidently considered an expression
of approval in some cultures, rather than an expression of disapproval as in ours.
Note that this social/personal domain is not the same as Theory of Mind, although
they overlap a great deal. On one hand, we attribute intentions and goals not just to
persons but also to animals, who do not have social roles (with the possible exception of
pets, who are treated as "honorary" persons). On the other hand, social characteristics
such as one's clan and one's rights and obligations are not a consequence of what one
believes or intends: they are just bare social facts. The consequence is that we likely
conceptualize people in three domains "dotted" together: the physical domain, the personal/
social domain, and the domain of sentient/animate entities.
A formal consequence of this approach is that the meaning of an expression
containing dot-objects and dot-actions is best treated in terms of two or more linked "planes"
of meaning operating in parallel. Some inferences are carried out on the physical plane,
others on the associated informational, symbolic, or social plane. Particularly through
the importance of social predicates to our thought and action, such a formal treatment is
fundamental to understanding human conceptualization and linguistic meaning.
3. Compositionality
A central idealization behind most theories of semantics, including those of mainstream
generative grammar and much of formal logic, is classical Fregean compositionality,
which can be stated roughly as (7) (see article 6 (Pagin & Westerstahl) Compositionality).
VI. Cognitively oriented approaches to semantics
(8a), taking place at a time and in a brain location consistent with semantic rather than
syntactic processing.
Another such case is reference transfer (Nunberg 1979), in which an NP is used to
refer to something related such as 'picture of NP','statue of NP','actor portraying NP'
and so on:
(10) a. There's Chomsky up on the top shelf, [statue of or book by Chomsky]
next to Plato.
b. [One waitress to another:]
The ham sandwich in the corner wants [person who ordered sandwich]
some more coffee.
c. I'm parked out back. I got smashed up [my car]
on the way here.
Jackendoff (1992) (also Culicover & Jackendoff 2005) shows that these shifts cannot be
disregarded as "merely pragmatic," for two reasons. First, a theory that is responsible for
how speakers understand sentences must account for these interpretations. Second,some
of these types of reference transfer have interactions with anaphoric binding, which is
taken to be a hallmark of grammar. Suppose Richard Nixon went to see the opera Nixon
in China. It might have happened that ...
(11) Nixon was horrified to watch himself sing a foolish aria to Chou En-lai.
Here Nixon stands for the real person and himself stands for the portrayed Nixon on
stage. However, such a connection is not always possible:
(12) *Aftcr singing his aria to Chou En-lai, Nixon was horrified to sec himself get up and
leave the opera house.
(11) and (12) are syntactically identical in the relevant respects. Yet the computation of
anaphora is sensitive to which NP's reference has been shifted.This shows that reference
transfer must be part of semantic composition. Jackendoff (1992) demonstrates that the
meaning of reference transfer cannot be built into syntactic structure in order to derive
it by Fregean composition.
Another sort of challenge to Fregean composition comes from constructional meaning,
where ordinary syntax is paired with nonstandard semantic composition (see article 86
(Kay & Michaclis) Constructional meaning). Examples appear in (13): the verb is
syntactically the head of the VP, but it does not select its complements. Rather, the verb
functions semantically as a means or manner expression.
(13) a. Bill belched his way out of the ['Bill went out of the restaurant
restaurant. belching']
b. Laura laughed the afternoon away. ['Laura spent the afternoon
laughing']
c. The car squealed around the corner. ['The car went around the corner
squealing']
30. Conceptual Semantics 707
the interfaces with syntax and phonology. Thus the empirical phenomena studied within
Conceptual Semantics provide arguments for the theory's overall worldview, one that is
consistent with the constraints of the mentalistic framework.
4. References
Bach, Eminon 1986. The algebra of events Linguistics & Philosophy 9,5-16.
Bloom. Paul 2000. How Children Learn the Meanings of Words. Cambridge, MA:The MIT Press.
Bloom, Paul et al. (eds.) 1996. Language and Space. Cambridge, MA:The MIT Press.
Carnap, Rudolf 1939. Foundations of Logic and Mathematics. Chicago, IL: The University of
Chicago Press.
Cheney, Dorothy L. & Robert M. Seyfarth 2007. Baboon Metaphysics. The Evolution of a Social
Mind. Chicago, IL:The University of Chicago Press.
Chomsky, Noam 1995. The Minimalist Program. Cambridge, MA: The MIT Press.
Culicover, Peter W. & Ray Jackeiidoff 2005. Simpler Syntax. Oxford: Oxford University Press.
Declerck. Renaat 1979. Aspect and the bounded/unbounded (telic/atelic) distinction. Linguistics
17,761-794.
Fauconnier, Gilles 1985. Mental Spaces. Aspects of Meaning Construction in Natural Language.
Cambridge, MA:The MIT Press.
Fellbaum, Christiane (ed.) 1998. WordNet. An Electronic Lexical Database. Cambridge, MA: The
MIT Press.
Fillmore, Charles 1982.Towards a descriptive framework for dcixis. In: R. Jarvclla & W. Klein (eds.).
Speech, Place, and Action. New York: Wiley, 31-59.
Fodor, Jerry A. 1975. The Language of Thought. Cambridge, MA: Harvard University Press.
Fodor, Jerry A. 1998. Concepts. Where Cognitive Science Went Wrong. Oxford: Oxford University
Press.
Frege, Gottlob 1892/1952. Uber Sinn und Bedeutung. Zeitschrift fur Philosophic und philosophische
Kritik 100,25-50. English translation in: P. Geach & M. Black (eds.). Translations from the
Philosophical Writings of Gottlob Frege. Oxford: Blackwell. 1952,56-78.
Giv6n,Talmy 1995. Functionalism and Grammar. Amsterdam: Benjamins.
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument
Structure. Chicago, IL:The University of Chicago Press.
Gruber, Jeffrey S. 1965. Studies in Lexical Relations. Ph.D. dissertation. MIT, Cambridge, MA.
Reprinted in: J. S. Gruber. Lexical Structures in Syntax and Semantics. Amsterdam:
North-Holland, 1976,1-210.
Hauser, Marc D. 2000. Wild Minds. What Animals Really Think. New York: Henry Holt.
Hinrichs, Erhard W. 1985. A Compositional Semantics for Aktionsarten and NP Reference in
English. Ph.D. dissertation. Ohio State University, Columbus, OH.
Jackeiidoff, Ray 1983. Semantics and Cognition. Cambridge, MA:The MIT Press.
Jackeiidoff, Ray 1987. Consciousness and the Computational Mind. Cambridge, M A:The MIT Press.
Jackeiidoff, Ray 1990. Semantic Structures. Cambridge, MA. The MIT Press.
Jackeiidoff, Ray 1991. Parts and boundaries Cognition 41, 9-45. Reprinted in: R. Jackeiidoff,
Meaning and the Lexicon. The Parallel Architecture 1975-2000. Oxford: Oxford University Press,
138-173.
Jackeiidoff, Ray 1992. Mine. Tussaud meets the binding theory. Natural Language and Linguistic
Theory 10,1-31.
Jackendoff, Ray 1996a. The architecture of the linguistic-spatial interface. In: P. Bloom et al. (eds).
Language and Space. Cambridge, M A:The MIT Press, 1-30. Reprinted in: R. Jackeiidoff, Meaning
and the Lexicon. The Parallel Architecture 1975-2010. Oxford: Oxford University Press, 112-134.
Jackeiidoff, Ray 1996b. The proper treatment of measuring out, telicity, and possibly even
quantification in English. Natural Language and Linguistic Theory 14, 305-354. Reprinted in:
31.Two-level Semantics: Semantic Form and Conceptual Structure 709
Stainton, Robert J. 2006. Words and Thoughts. Subsentences, Ellipsis, and the Philosophy of
Language. Oxford: Oxford University Press.
Talmy, Leonard 1978. The relation of grammar to cognition. A synopsis In: D. Waltz (ed.).
Theoretical Issues in Natural Language Processing. New York: ACM, 14-24. Revised and enlarged
version in: L. Talmy. Toward a Cognitive Semantics, vol. I. Concept Structuring Systems.
Cambridge, MA:The MIT Press, 2000,21-96.
Talmy, Leonard 1988. Force-dynamics in language and cognition. Cognitive Science 12,49-100.
Talmy, Leonard 2000. Toward a Cognitive Semantics. Cambridge, MA:The MIT Press.
Tarski, Alfred 1956. The concept of truth in formalized languages In: A. Tarski. Logic, Semantics,
Metamathematics.Translated by J. H. Woodger. 2nd edn.ed. J. Corcoran. London: Oxford
University Press, 152-197.
van der Zee & Jon Slack (eds.) 2003. Representing Direction in Language and Space. Oxford:
Oxford University Press.
Verkuyl, Henk 1993. A Theory of Aspectuality. The Interaction between Temporal and A temporal
Structure. Cambridge: Cambridge University Press.
Wcrthcimcr, Max 1923. Laws of organization in perceptual forms Reprinted in: W. D. Ellis (cd.). A
Source Book ofGestalt Psychology. London: Routledge & Kegan Paul, 1938,71-88.
Wierzbicka, Anna 1996. Semantics. Primes and Universals. Oxford: Oxford University Press.
Wittgenstein, Ludwig 1953. Philosophical Investigations. Translated by G.E.M. Ascombe. Oxford:
Blackwell.
Ray Jac kendoff Med ford, MA (USA)
31. Two-level Semantics: Semantic Form and
Conceptual Structure
1. Introduction
2. Polysemy problems
3. Compositionality and beyond: Semantic underspecification and coercion
4. More on SF variables and their instantiation at the CS level
5. Summary and outlook
6. References
Abstract
Semantic research of the last decades has been shaped by an increasing interest in
conceptually, that is, in emphasizing the conceptual nature of the meanings conveyed by
natural language expressions. Among the multifaceted approaches emerging from this
tendency, the article focuses on discussing a framework that has become known as »Two-
level Semantics«. The central idea it pursues is to assume and justify two basically
distinct, but closely interacting, levels of representation that spell out the meaning of
linguistic expressions: Semantic Form (SF) and Conceptual Structure (CS). The distinction
of SF vs. CS representations is substantiated by its role in accounting for related parallel
distinctions including 'lexical vs. contextually specified meaning', 'grammar-based vs.
concept-based restrictions', 'storage vs. processing' etc. The SF vs. CS distinction is discussed
on the basis of semantic problems regarding polysemy, underspecification, coercion, and
inferences.
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK33.1), deGruyter, 709-740
31.Two-level Semantics: Semantic Form and Conceptual Structure 1
distinction of SF vs. CS but also in providing criteria for deciding what requirements the
representations at issue have to meet.
The methodologically most relevant conclusion drawn by Kelter & Kaup (article
108) reads as follows: "researchers should acknowledge the fact that concepts and
word meaning are different knowledge structures." The claim in (5) suggests that if it
is the SF of lexical items that is stored in long-term memory, the entries should be
confined to representing what may be called "context-free lexical meanings", whereas CS
representations compiled and processed in working memory should take charge of
what may be called "contextually specified (parts of) utterance meanings". The
difference between the two types of meaning representations indicates the virtual semantic
undcrspccification of the former and the possible semantic enrichment of the lattcr.Thcrc
is a series of recent experimental studies designed and carried out along these lines
which - in combination with evidence from corpus data, linguistic diagnostics etc. -
are highly relevant for the theoretical issues raised by SF vs. CS representations.
Experiments reported by Stolterfoht, Gese & Maienborn (2010) and Kaup, Liidtke &
Maienborn (2010) succeeded in providing processing evidence that supports the
distinction of, e.g., primary adjectives vs. adjectivized participles vs. verbal participles,
that is, evidence for packaging categories relevant to SF representations. In addition,
these studies reveal the processing costs of contextualizing semantically underspecified
items, a result that supports the view that contextualizing the interpretation of a given
linguistic expression e is realized by building up an enriched CS representation on the
basis of SF(e).
1.3. SF vs. CS - an illustration from everyday life
To round off the picture outlined so far, we illustrate the features listed in (2)-(5) in favor
of the SF vs. CS distinction by an example wc arc well acquainted with, viz. the
representations involved in handling numbers, number symbols, and numerals in everyday
life. Note that each of the semiotic objects in (6)-(8) below represents in some way the
numerical concept »18«. However, how numerical concepts between »10« and »20« are
stored, activated, and operated on in our memory is poorly understood as yet, so the
details regarding the claim in (5) must be left open. Suffice it to agree that »18« stands
for the concept we make use of, say, in trying to mentally add up the sum to be paid
for our purchases in the shopping trolley. With this proviso in mind, wc now look at the
representations of the concept »18« in (6)-(8) to find out their interrelations.
(6) a. 1 1 1 HI
(7) a. XVIII a". IIXX (rarely occurring alternative)
b. 18
(8) a. eighteen, achtzehn (8+10) English, German
b. dix-huit,shiba (10 + 8) French, Mandarin
c. okto-kai-deka ((8)-and-(10)) Greek
d. diezyocho ((10)-and-(8)) Spanish
e. vosem-na-dcat' ((8)-on-(10)) Russian
714
VI. Cognitively oriented approaches to semantics
f. duo-de-viginti ((2)-of-(20)) Latin
g. ocht-deec (8 + (2 x 5)) Irish
h. deu-naw (2 x 9) Welsh
(6) shows two iconic non-verbal representations of a quantity whose correlation with the
concept »18« and/or with the numerals in (8) rests on the ability to count and the
availability of numerals.The tallying systems in (6) are simple but inefficient for doing
arithmetic and hence hardly suitable to serve as semantic representations of the numerals in
(8).
(7) shows two symbolic non-verbal representations of »18«, generated by distinct
writing systems for numbers. The Roman number symbols are partially iconic in that
they encode addition by iterating up to three special symbols for one, ten, hundred, or
thousand, partially symbolic due to placing the symbol of a small number in front of the
symbol of a larger number, thus indicating subtraction, cf. (7a, a'). The lack of a symbol
for null prevented the creation of a positional system, the lack of means to indicate
multiplication or division impeded calculation. Both were obstacles to progress in
mathematics. Thus, Roman number symbols may roughly render the lexical meaning of (8a-f)
but not those of (8g-h) and all other variants involving multiplication or division.
The Indo-Arabic system of number symbols exemplified by 18 in (7b) is a positional
system without labels based on exponents of ten (10°, 101,102,... , 10"). As a
representational system for numbers it is recursive and potentially infinite in yielding unambiguous
and well-distinguished chains of symbols as output. Thus, knowing the system implies
knowing that 18 * 81 or that 17 and 19 are the direct predecessor and successor of 18,
respectively, even if we do not have pertinent number words at our disposal to name
them. Moreover, it is this representation of numbers that we use when we do arithmetic
with paper and pencil or by pressing the keys of an electronic calculator. Enriched with
auxiliary symbols for arithmetical operations and for marking their scope of application,
as well as furnished with conventions for writing equations etc., this notational system is
a well-defined means to reduce the use of mathematical expressions to representations
of their Conceptual Structures, that is, to the CS representations they denote,
independent of any natural language in which these expressions may be read aloud or dictated.
Let's call this enriched system of Indo-Arabic number symbols the "CS system of
mathematical expressions". Now, what about the SF representations of numerals?
Though all number words in (8) denote the concept »18«, it is obvious that their
lexical meanings differ in the way they are composed, cf. the second column in (8). As
regards their combinatorial category, the number words in (8) are neither
determinative nor copulative compounds, nor are they conjoined phrases. They are perhaps best
categorized as juxtapositions with or without connectives, cf. (8c-f) and (8a-b, g-h),
respectively.The unique feature of complex number words is that the relations between
their numeral constituents are nothing but encoded fundamental arithmetic operations
(addition and multiplication are preferred; division and subtraction are less frequent).
Thus, the second column in (8) shows SF representations of the number words in the
first column couched in terms of the CS system of mathematical expressions.The latter
are construable as functor-argument structures with arithmetic operators ('+','-', 'x'
etc.) as functors, quantity constants for digits as arguments, and parentheses (...) as
boundaries marking lexical building blocks. Now let's see what all this tells us about the
distinctions in (2)-(5) above.
31.Two-level Semantics: Semantic Form and Conceptual Structure 715
The subset - set relation SF c CS mentioned in connection with (2) also holds for
the number symbols in (7b). The CS system of mathematical expressions is capable of
representing all partitions of 18 that draw on fundamental arithmetic operations. Based
on this, the CS system at stake covers the internal structures of complex number words,
cf. (8), as well as those of equations at the sentence level like 18 = 3 x 6; 18 = 2 x 9; 18 =
72:4 etc.
By contrast, the subset of SF representations for numerals is restricted in two respects.
First, not all admissible partitions of a complex number like 18 are designated as SF of
a complex numeral lexicalized to denote »18«.The grammar of number words in L is
interspersed with (certain types of) L-specific packing strategies, cf. Hurford (1975),
Greenberg (1978). Second, since the ideal relationship between systems of number
symbols and systems of numerals is a one-to-one correspondence, the non-ambiguity
required of the output of numeral systems practically forbids creation or use of
synonymous number names (except for the distinct numerals used for e.g. 1995 when speaking
of years or of prices in €).
There is still another conclusion to be drawn from (6)-(8) in connection with (2). The
CS system of mathematical expressions is a purposeful artifact created and developed to
solve the "homogenization problem" raised by CS representations for the well-defined
field of numbers and arithmetic operations on them. First, the mental operations of
counting, adding, multiplying etc., which the system is designed to represent, have been
abstracted from practical actions, viz- from lining up things, bundling up things, bundling
up bundles of things etc. Second, the CS representations of mathematical expressions
provided by the system are unambiguous, complete (that is, fully specified and containing
neither gaps nor variables to be instantiated by elements from outside the system), and
independent of the particular languages in which they may be verbalized.
The lexicon-based packaging and contents of the components of SF representations
claimed in (3) and (4) are also corroborated by (6)-(8). The first point to note is the L-
specific ways in which (i) numerals are categorized in morpho-syntactic terms and (ii) their
lexical meanings are composed. The second point is this: Complex numerals differ from
regular (determinative or copulative) compounds in that the relations between their
constituents arc construed as encodings of fundamental arithmetical operations, cf. (8a-h).
This unique feature of the subgrammar of number words also yields a strong argument wrt.
the "justification problem" posed by the assumption of lexicon-based SF representations.
The claims in (3) and (4) concerning the non-linguistic nature of CS representations
are supported by the fact that e.g. 18 is an intermodally valid representation of the
concept »18« as it covers both the perception-based iconic representations of »18« in (6) and
the lexicon-based linguistic expressions denoting »18« in (8).Thus, the unique advantage
of the CS system of mathematical expressions is founded on the representational inter-
modality and the conceptual homogeneity it has achieved in the history of mathematical
thinking. No other science is more dependent on the representations of its subject than
mathematics.
Revealing as this illustration may be, the insights it yields cannot simply be extended
to the lexicon and grammar of a natural language L beyond the subgrammar of number
words. The correlations between systems of number names and their SF representations
in terms of the CS system of mathematical expressions form a special case which results
from the creation of a non-linguistic semiotic artifact, viz- a system to represent number
concepts under controlled laboratory conditions. The meanings, the combinatorial
720 VI. Cognitively oriented approaches to semantics
Whereas the institution and the building readings are somewhat compatible, the
building and the personnel readings are not; as regards the (type of) reading of the
antecedent, anaphoric pronouns arc less tolerant than relative pronouns or repeated
DPs.The data in (17) show that the conceptual specifications of SF (school) differ in ways
that are poorly understood as yet;cf. Asher (2007) for some discussion. The semantics of
institution nouns, for a while the signature tune of Two-level Semantics, elicited a certain
amount of discussion and criticism, cf. Herzog & Rollinger (1991), Bierwisch & Bosch
(1995).The problems expounded in these volumes are still unsolved but they sharpened
our view of the intricacies of the SF vs. CS distinction.
2.2. Locative prepositions
In many languages the core inventory of adpositions encode spatial relations to localize
some x (called theme, figure or located object) wrt. the place occupied by some y (called
relatum, ground or reference object), where x and y may pairwise range over objects,
substances, and events. Regarding the conceptual basis of these relations, locative
prepositions in English and related languages are usually subdivided into topological (in, at, on),
directional (into, onto), dimensional (above, under, behind), and path-defining (along,
around) prepositions. The semantic problems posed by these lexical items can be best
illustrated with in, which supposedly draws on spatial containment, pure and simple, and
which is therefore taken to be the prime example of a topological preposition.
To illustrate how SF (in) is integrated into a lexical entry with information on
Phonetic Form (PF), Grammatical Features (GF), Argument Structure (AS) etc., we take
German in as a telling example: It renders English in vs. into with distinct cases which
in turn correspond to the values of the feature [± Dir(cctional)] subcatcgorizing the
internal argument y, and to further syntactic distinctions. The entry in (18) is taken from
Bierwisch (1988:37), examples are added in (19). The interdependence of the values for
the case feature [± Obl(ique)] and for the category feature [± Dir] is indicated by means
of the meta-variable a e {+, -} and by the conventions (i) - a inverts the value of a and
(ii) (aW) means that W is present if a = + and absent if a = -.
SF
[(„fin) [loc x] c [loc y]]
(19) a. Die StraBe/Fahrt fiihrt in die Stadt.[+Dir,-Obi] = Ace, "x is a path ending
in y"
The street/journey leads into the city.
/in/; [-V-N, + Dir]; \y \x [(fin) [loc x] c [loc y]]
I
l-Obl]
(18) Lexical entry of the German preposition in:
PF GF AS
/in/; [-V,-N,aDir]; \y \x
I
[- a Obi]
722 VI. Cognitively oriented approaches to semantics
general schema for the SF of locative prepositions thereby abandoning the problematic
functor c and revising the functor loc:
(22) Xy Xx (loc (x, PREP*(y))),
where loc localizes the object x in the region determined by the preposition p and
prep* is a variable ranging overp-based regions.
Bierwisch (1996:69) replaces SF (in) in (18) with Xy Xx [x [loc [int y]]] commenting "jc
loc p identifies the condition that the location of x be (improperly) included in p" and
"int y identifies a location determined by the boundaries of y, that is, the interior of y".
Although this proposal avoids some of the problems with the functor c, the puzzling
effect of "(improperly) included" remains and so does the definition of int y as yielding
"the interior of x".
Herweg (1989) advocates an abstract SF (in) which draws on proper spatial inclusion
such that the examples in (21) are semantically marked due to violating the
"Presupposition of Argument Homogeneity". The resulting truth value gap triggers certain function-
based accommodations at the CS level that account for the interpretations of (21a-d).
Hottenroth (1991), in a detailed analysis of French dans, rejects the idea that SF
(dans) might draw on imprecise region-creating constants like int>\ Instead, SF (dans)
should encode the conditions on the relatum in prototypical uses of dans. The standard
reference region of dans is a three-dimensional empty closed container (bottle, bag, box
etc.). If the relatum of dans does not meet one or more of these characteristics, the
reference region is conceptually adapted by means of certain processing principles (laws of
closure, mental demarcation of unbounded y, conceptual switching from 3D to 2D etc.).
In view of data like those in (21), Carstcnscn (2001) proposes to do away with the
region account altogether and to replace it with a perception-based account of
prepositions that draws on the conceptual representation of changes of focused spatial attention.
To sum up, the brief survey of developments in the semantic analysis of
prepositions may also be taken as proof of the heuristic productivity emanating from the SF vs.
CS distinction. Among polyscmous verbs, the verb to open has gained much attention,
cf. Bierwisch (article 16). Based on a French-German comparison, Schwarze &
Schepping (1995) discuss what type of polysemy is to be accounted for at which of
the two levels. Functional categories (determiners, complementizers, connectives etc.),
whose lexical meanings lack any support in perception and are hence purely operative,
have seldom been analyzed in terms of the SF vs. CS distinction so far; but cf. Lang
(2004) for an analysis that accounts for the abstract meanings of and, but etc. and their
contextual specification by inferences drawn from the structural context, the discourse
context, and/or from world knowledge. Clearly, the 'poorer' the lexical meaning of such a
synsemantic lexical item, the more will its semantic contribution need to be enriched by
means of contextualization.
3. Compositionality and beyond: Semantic underspecification
and coercion
Two-level Semantics was first mainly concerned with polysemy problems of the kind
illustrated in the previous section. Emphasis was laid on developing an adequate theory
31.Two-level Semantics: Semantic Form and Conceptual Structure 723
of lexical semantics that would be able to deal properly and on systematic grounds with
the distinction of word knowledge and world knowledge. A major tenet of Two-level
Semantics as a lexicon-based theory of natural language meaning is that the internal
decompositional structure of lexical items determines their external combinatorial
properties, that is, their external syntactic behavior. This is why compositionality issues are of
eminent interest to Two-level Semantics; cf. (la).
There is wide agreement among semanticists that, given the combinatorial nature
of linguistic meaning, some version of the principle of compositionality - as
formulated, e.g., in (23) - must certainly hold. But in view of the complexity and richness of
natural language meaning, there is also consensus that compositional semantics is
faced with a scries of challenges and problems; sec article 6 (Pagin & Wcstcrstahl)
Compositionality.
(23) Principle of compositionality:
The meaning of a complex expression is a function of the meanings of its parts and
the way they are syntactically combined.
Rather than weakening the principle of compositionality or abandoning it altogether,
Two-level Semantics seeks to cope with the compositionality challenge by confining
compositionality to the level of Semantic Form. That is, SF is understood as comprising
exactly those parts of natural language meaning that are (i) context-independent and
(ii) compositional, in the sense that they are built in parallel with syntactic structure.
This leaves space to integrate non-compositional aspects of meaning constitution at the
level of Conceptual Structure. In particular, the mapping of SF-representations onto CS-
representations may include non-local contextual information and thereby qualify as
non-compositional. Of course, the operations at the CS level as well as the SF - CS
mapping operations are also combinatorial and can therefore be said to be compositional in
a broader sense. Yet their combinatorics is not bound to mirror the syntactic structure
of the given linguistic expression and thus does not qualify as compositional in a strict
sense. This substantiates the assumption of two distinct levels of meaning representation
as discussed in §l.Thus,Two-lcvcl Semantics' account of the richness and flexibility of
natural language meaning constitution consists in assuming a division of labor between
a rather abstract, context-independent and strictly compositionally determined SF and a
contextually enriched CS that also includes non-compositionally derived meaning
components. Various solutions have been proposed for implementing this general view of
the SF vs. CS distinction. These differ mainly in (a) the syntactic fine-tuning of the
compositional operations and the abstractness of the corresponding SF-representations, and
in (b) the way of handling non-compositional meaning aspects in terms of, e.g., coercion
operations.These issues will be discussed in turn.
3.1. Combinatory meaning variation
Assumptions concerning the spell-out of the specific mechanisms of compositionality
are generally guided by parsimony. That is, the fewer semantic operations warranting
compositionality are postulated, the better. On this view, it would be attractive to
have a single semantic operation, presumably functional application, figuring as the
semantic counterpart to syntactic binary branching. An illustration is given in (24):
724 VI. Cognitively oriented approaches to semantics
Given the lexical entries for the locative preposition in and the proper noun Berlin
in (24a) and (24b) respectively, functional application of the preposition to its internal
argument yields (24c) as the compositional result corresponding to the semantics of
the PP.
(24) a. in: Xy Xx (loc (x, iN*(y)))
b. Berlin: berlin
c. [pp in [Dp Berlin]]: Xy Xx (loc (x, iN*(y))) (berlin)
= Xx (loc (x, iN*(berlin)))
Functional application is suitable for syntactic head-complement relationships as it
reveals a correspondence between the syntactic head-non-head relationship and the
semantic functor-argument relationship. In (24c), for instance, the preposition in is
both the syntactic head of the PP and the semantic functor, which takes the DP as its
argument. Syntactic adjuncts, on the other hand, cannot be properly accounted for by
functional application as they lack a comparable syntax-semantics correspondence. In
syntactic head-adjunct configurations the semantic functor, if any, is not the syntactic
head but the non-head; for an overview of the different solutions that have been put
forth to cope with this syntax-semantics imbalance see article 54 (Maienborn & Schafer)
Adverbs and adverbials. Different scholars working in different formal frameworks have
suggested remarkably convergent solutions, according to which the relevant semantic
operation applying to syntactic head-adjunct configurations is predicate conjunction.
This might be formulated, for instance, in terms of a modification template MOD as
given in (25);cf.,e.g. Higginbotham's (1985) notion of 9-identification,Yi\exv/'\scWs (1997)
adjunction schema, Wunderlich's (1997b) argument sharing, or the composition rule of
predicate modification in Hcim & Kratzcr (1998).
(25) Modification template MOD:
MOD: XQXPXx(P(x)&Q(x))
The template MOD takes a modifier and an expression to be modified (= modifyee) and
turns it into a conjunction of predicates. More specifically, an (intersective) modifier adds
a predicate that is linked up to the referential argument of the expression to be modified.
In (26) and (27) illustrations are given for nominal modification and verbal modification,
respectively. In (26), the semantic contribution of the modifier is added as an additional
predicate of the noun's referential argument. In (27), the modifier provides an additional
predicate of the verb's eventuality argument.
(26) a. house: Xz (house (z))
b. [pp in Berlin]: Xu (loc (u, iN*(berlin)))
c- [Np[NPhouse][ppinBerlin]]:
XQ XP Xx (P(x) & Q(x)) (Xz (house (z))) (Xu (loc (u, iN*(bcrlin))))
= Xx (house (x) & loc (x, iN*(berlin)))
(27) a. sleep: Xz Xe (sleep (e) & agent (e,z))
b. [pp in Berlin]: Xu (loc (u, iN*(berlin)))
c- [vplvp sleep] [pp in Berlin]]:
31.Two-level Semantics: Semantic Form and Conceptual Structure 725
XQ XPXx (P(x) & Q(x))(XzXe (sleep (e) & agent(e,z)))(Xu (loc (u,iN*(berlin))))
= Xz Xe (sleep (e) & agent (e, z) & loc (e, iN*(berlin)))
The semantic template MOD thus provides the compositional semantic counterpart to
syntactic head-adjunct configurations. There are good reasons to assume that, besides
functional application, some version of MOD is required when it comes to spelling out
the basic mechanisms of compositionality.
The template MOD in (25) captures a very fundamental insight about the
compositional contribution of intersective modifiers. Nevertheless, scholars working within
the Two-level Semantics paradigm have emphasized that a modification analysis along
the lines of MOD fails to cover the whole range of intersective modification; cf., e.g.,
Maienborn (2001, 2003) for locative adverbials, Dolling (2003) for adverbial modifiers
in general, Bucking (2009, 2010) for nominal modifiers. Modifiers appear to be more
flexible in choosing their compositional target, both in the verbal domain and in the
nominal domain. Besides supplying an additional predicate of the modifyee's referential
argument, as in (26) and (27), modifiers may also relate less directly to their host
argument. Some illustrations are given in (28)-(30). (For the sake of simplicity the data are
presented in English.)
(28) a. The cook prepared the chicken in a Marihuana sauce, (cf. Maienborn 2003)
b. The bank robbers escaped on bicycles.
c. Paul tickled Maria on her neck.
(29) a. Anna dressed Max's hair unobtrusively. (cf. Dolling 2003:530)
b. Ede reached the summit in two days. (cf. Dolling 2003:516)
(30) a. the fast processing of the data (cf. Bucking 2009:94)
b. the preparation of the chicken in a pepper sauce (cf. Bucking 2009:102)
c. Georg's querying of the men (cf. Bucking 2010:51)
The locative modifiers in (28) differ from the general MOD pattern as illustrated in (27)
in that they do not locate the whole event but only one of its integral parts. For instance,
in (28b) it's not the escape that is located on bicycles but - according to the preferred
reading - the agent of this event, viz. the bank robbers. In the case of (28c), the linguistic
structure does not even tell us what is located on Maria's neck. It could be Paul's hand
but also, e.g., a feather he used for tickling Maria. Maienborn (2001, 2003) calls these
modifiers "event internal modifiers" and sets them apart from "event external modifiers"
such as in (27), which serve to holistically locate the verb's eventuality argument.
Similar observations are made by Dolling (2003) wrt. cases like (29). Sentence (29a)
is ambiguous. It might be interpreted as expressing that Anna performed the event of
dressing Max's hair in an unobtrusive manner. This is what the application of MOD
would result in. But (29a) has another reading, according to which it is not the event of
hair dressing that is unobtrusive but Max's resulting hair-style. Once again, the modifier's
contribution does not apply directly to the verb's eventuality argument but to some
referent related to it. The same holds true for (29b), where the temporal adverbial cannot
relate to the punctual event of Ede reaching the summit but only to its preparatory
phase.
31.Two-level Semantics: Semantic Form and Conceptual Structure 727
(34) Paul tickled Maria on her neck.
SF: Be (tickle (e) & agent (e, paul) & patient (e, maria) & R (e, v) & loc (v,
ON*(maria's neck))
CS: Bex (tickle (e) & agent (e,paul) & patient (e, maria) & instr (e, x) & feather
(x) & loc (x,ON*(maria's neck))
This conceptual spell-out provides a plausible utterance meaning for sentence (34). It
goes beyond the compositionally determined meaning by exploiting our conceptual
knowledge that tickling is performed with some instrument which needs to have spatial
contact to the object being tickled. Consequently, the SF-parameter R can be identified
as the instrument relation, and the parameter v may be instantiated, e.g., by a feather.
Although not manifest at the linguistic surface, such conceptually inferred units arc
plausible potential instantiations of the compositionally introduced SF-parameter v. (Dolling
and Maienborn use abduction as a formal means of deriving a contextually specified CS
from a semantically underspecified SF;cf. Hobbs et al. (1993). We will come back to the
SF-CS mapping in §4.)
Different proposals have been developed for implementing the notion of a more
liberal and flexible combinatorics, such as MOD', into the compositional machinery.
Maienborn (2001,2003) argues that MOD' is only licensed in particular structural
environments: Event-internal modifiers have a base adjunction site in close proximity to the
verb, whereas event-external adjuncts adjoin at VP-level. These distinct structural
positions provide the key to a compositional account. Maienborn thus formulates a more
fine-tuned syntax-semantics interface condition that subsumes MOD and MOD' under
a single compositional rule MOD*.
(35) Modification template MOD*:
MOD*: XQ XP Xx (P(x) & R (x, v) & Q(v))
Condition on the application of MOD*: If MOD* is applied in a structural
environment of categorial type X, then R = part-of, otherwise (i.e. in an XP-environment)
R is the identity function.
If MOD* is applied in an XP-environment, then R is instantiated as identity, i.e. v is
identified with the referential argument of the modified expression, thus yielding the
standard variant MOD. If applied in an X-environment, R is instantiated as the part-
of relation, which pairs entities with their integral constituents. Thus, in Maienborn's
account the observed meaning variability is traced back to a grammatically constrained
semantic indeterminacy that is characteristic of modification.
Dolling (2003) takes a different track by assuming that the SF-parameter R is not rooted
in modification but is of a more general nature. Specifically, he suggests that R is introduced
compositionally whenever a one-place predicate enters the composition. By this move, the
SF of a complex expression is systematically extended by a series of SF-parameters, which
guarantee that the application of any one-place predicate to its argument is systematically
shifted to the conceptual level. On Dolling's account, the SF of a complex linguistic
expression is maximally abstract and underspecified, with SF-parameters delineating possible
(though not necessarily actual) sites of meaning variation.
Differences aside, the studies of Dolling, Maienborn and other scholars working
in the Two-level Semantics paradigm emphasize that potential sources for semantic
728 VI. Cognitively oriented approaches to semantics
indeterminacy are not only to be found in the lexicon but may also emerge in the course
of composition, and they strive to model this combinatory meaning variation in terms of
a rigid account of lexical and compositional semantics.
A key role in linking linguistic and extra-linguistic knowledge is taken by so-called
SF-parameters. These are free variables that are installed under well-defined conditions
at SF and are designed to be instantiated at the level of CS. SF-parameters are a means
of triggering and controlling the conceptual enrichment of a grammatically determined
meaning representation. They delineate precisely those gaps within the Semantic Form
that call for conceptual specification and they impose sortal restrictions on possible
conceptual fillers. SF-parameters can thus be seen as well-defined windows through which
compositional semantics allows linguistic expressions to access and constrain conceptual
structures.
3.2. Non-compositional meaning adjustments
Conceptual specification of a compositionally determined, underspecified, abstract
meaning skeleton, as illustrated in the previous section, is the core notion that
characterizes the Two-level Semantics perspective on the semantics-pragmatics interface. Its focus
is on the conceptual exploitation of a linguistic expression's regular meaning potential. A
second focus typically pursued within Two-level Semantics concerns the possibilities of a
conceptual solution of combinatory conflicts arising in the course of composition.These
are combinatory adjustment operations by which a strictly speaking ill-formed linguistic
expression gets an admissible yet irregular interpretation. In the literature such non-
compositional rescue operations are generally discussed under the label of "coercion".
An example is given in (36).
(36) The alarm clock stood intentionally on the tabic.
The sentence in (36) does not offer a regular integration for the subject-oriented
adverbial intentionally, i.e, the subject NP the alarm clock does not fulfill the adverbial's selec-
tional restriction for an intentional subject. Hence, a compositional clash results, and the
sentence is ungrammatical. Nevertheless, although deviant, there seems to be a way to
rescue the sentence so that it becomes acceptable and interpretable. In the case of (36),
a possible repair strategy would be to introduce an actor who is responsible for the fact
that the alarm clock stands on the table. This move would provide a suitable anchor
for the adverbial's semantic contribution. Thus, we understand (36) as saying that
someone put the alarm clock on the table on purpose. That is, in case of a
combinatorial clash, there seems to be a certain leeway for non-compositional adjustments of the
compositionally derived meaning. The defective part is "coerced" into the right format.
Coercion phenomena are a topic of intensive research in current semantics. Up to
now the primary focus has been on the widely ramified notion of aspectual coercion
(e.g. Mocns & Stccdman 1988; Pulman 1997; dc Swart 1998; Dolling 2003, 2010; Egg
2005) and on cases of so-called "complement coercion" as in Peter began the book (e.g.
Pustejovsky 1995; Egg 2003; Asher 2007); see article 25 (de Swart) Mismatches and
coercion for an overview. The framework of Two-level Semantics is particularly suited
to investigate these borderline cases at the semantics-pragmatics interface because of its
31.Two-level Semantics: Semantic Form and Conceptual Structure 731_
(41) Be [[quant max DEF.pole'] = [Norm . + c]]
This much on SF variables coming with dimension terms and on their instantiation in
structural contexts that arc provided by the morphosyntax of the sentence at issue. After
a brief look at the elements instantiating the metavariable dim, we will discuss a type of
SF variable that is rooted in the lexical field structure of DA terms.
Conceived as a basic module of cognition, dimension assignment to spatial objects
involves entities and operations at three levels.The perceptual level provides the sensory
input from vision and other senses; the conceptual level serves as a filter system reducing
perceptual distinctions to the level that our everyday knowledge of space needs, and
the semantic level accounts for the ways in which conceptually approved features are
encoded in categorized lexemes and arranged in lexical fields.
DA basically draws on Dimension Assignment Parameters (DAP) that are provided
by two frames of reference, which determine the dimensional designation of spatial
objects:
(42) a. The Inherent Proportion Schema (IPS) yields proportion-based gcstalt
features by identifying the object's extents as MAximal, MiNimal, and across axis,
respectively,
b. The Primary Perceptual Space (PPS) yields contextually determined position
features of spatial objects by identifying the object's extents as aligned with the
vERTical axis, with the OBserver axis, and/or with an across axis in between.
The DAP in small caps listed in (42) occur in two representational formats that
reflect the SF vs. CS distinction. In SF representations, the DAP figure as functor
constants of category N/N in the SF of L-particular dimension terms that instantiate
{dim} within the general schema in (40). In CS representations, elements of the DAP
inventory figure as conceptual features in so-called Object Schemata (cf. 4.2 below)
that contain the conceptually defining as well as the contextually specified spatial
features of the object at issue.
Lang (2001) shows that the lexical field of spatial dimension terms in a
language L is determined by the share it has in IPS and PPS, respectively. While
reference to the vertical is ubiquitous, the lexical coverage of DA terms amounts to
the following typology: proportion-based languages (Mandarin, Russian) adhere to
IPS, observer-based ones (Korean, Japanese) adhere to PPS, and mixed-type ones
(English, German) draw on an overlap between IPS and PPS. The semantic effects
of this typology are inter alia reflected by the respective across terms: In P-based
and in O-bascd languages, they arc lexically distinct and rcfcrcntially unambiguous, in
mixed-type languages like English they lack both of these properties.
Note the referential ambiguity of the English across term wide in (44.1) and
its contextualized interpretations in (44.2 - 4) when referring to a board sized 100 x
30 x 3 cm in the spatial settings I—III shown in (43):
VI. Cognitively oriented approaches to semantics
the pertinent DA terms for a and b have been added in (47). The respective extent
chosen as d' to anchor across in the OS at issue and/or to interpret wide in (44.2-4)
is in boldface.
(47) I II III
< a b c>
max 0 min
max across min
a = long,b = wide
< a b c>
max 0 min
across vert min
a = wide, b = high
< a b c>
max 0 min
across obs min
a = widc,b = deep
The OS in (47) as CS representations of (43) and (44) capture all semantic aspects
of DA discussed so far but they deserve some further remarks. First, (47-1) results
from primary identification a la (46a) indicated by matching entries in the 2nd and
3rd row, while (47-11 and III) arc instances of contextual specification as defined in
(46b). Second, the typological characteristics of a P/O-mixcd-typc language are met
as d' for wide may be taken from IPS as in (47 I) or from PPS as in (47 II and III).
Third, the rows of an OS, which contain the defining spatial properties and possibly
also some contextual specifications, can be taken as a heuristic cue for designing
the SF representations of object names that lexically reflect the varying degree of
integration into spatial contexts we observe in (43-44), e.g. board (freely movable)
< notice-board (hanging) < windowsill (bottom part of a window) - in this respect
OS may be seen as an attempt to capture what Bicrwisch (article 16 in this volume)
calls "dossiers". Fourth, Lang, Carstcnsen & Simmons (1991) presents a Prolog
system of DA using OS enriched by sidedness features, and Lang (2001) proposes a
detailed catalogue of types of spatial objects with their OS accounting for primary
entries and for contextually induced orientation or perspectivization. Fifth, despite
their close interaction by means of the operations in (46), DAP as elements of SF
representations and OS entries as CS elements arc subject to different constraints,
which is another reason to keep them distinct. The entries in an OS arc subject to
conditions of conceptual compatibility that inter alia define the set of admissible
complex OS entries listed as vertically arranged pairs in (48):
(48) max max max 0 0 0
across vert obs across vert obs
An important generalization is that (48) holds independently of the way in which
the complex entry happens to come about. So, the combination of max and vert in
the same column may result from primary identification in the 2nd row,cf. The pole
is 2m tall, where the SF of tall contains max & vdrt x as a conjunction of DAP, or
from contextual specification, cf. The pole is 2m high, where vert is added in the 3rd
row. The semantic structure of DA terms is therefore constrained by compatibility
conditions at the CS level but within this scope it is cross-linguistically open to
different lexicalization patterns and to variation of what is covered by the SF of single
DA terms.
VI. Cognitively oriented approaches to semantics
(50b), respectively. For the whole range of entailments and SF postulates based on
DA terms sec Bicrwisch (1989).
(50) a. The board is short -> The board is not long,
b. The board is not long enough <-» The board is too short
(51) a. 3c [[quant max B] = [N-c]] => ~ [Be [[3c [[quantmax B] = [N + c]]]
b. ~ [3C [[ QUANT MAX B] = [K + C ]]] <=> 3c [[ QUANT MAX B] = [K - C ]]
4.3.2. Contextually induced dimensional designation
Valid inferences like those in (52) are accounted for, and invalid ones like those in (53)
are avoided, by drawing on the information provided by, or else lacking in, contextually
specified OS.
(52) a. The board is lm wide and 0.3 m high -» The board is lm long and 0.3m wide
b. The pole is 2m tall/2m high —» The pole is 2m long
(53) a. The wall is wide and high enough -& The wall is long and wide enough
b. The tower is 10 m tall/high -/-» *The tower is 10 m long.
The valid inferences result from the operation of de-specification, which is simply the
reverse of the operation of contextual specification defined in (46b):
(54) De-specification:
a. For any OS for x with a vertical entry < p, q >, there is an OS' with < p, p >.
b. For any OS for x with a vertical entry < 0, q >, there is an OS' with < 0,
across >.
The inferences in (53a, b) are ruled out as invalid because the OS under review do
not contain the type of entries needed for (54) to apply.
4.3.3. Commensurability of object extents
Note that the DA terms long, wide and/or thick arc not hyponyms to big despite the fact
that big may refer to the [v + c] of one, two or all three extents of a 3D object, depending
on the OS of the objects at issue. When objects differing in dimensionality are compared
by using the DA term big, the dimensions it covers are determined by the common share
of the OS involved, cf. (55):
(55) a. My car is too big for the parking space. (too long and/or too wide)
b. My car is too big for the garage door. (too wide and/or too high)
So it is above all the two mapping operations between SF and CS representations as
defined in (46a, b) and exemplified by DAP and OS that account for the whole range of
seemingly complicated facts about DA to spatial objects.
31.Two-level Semantics: Semantic Form and Conceptual Structure 737
5. Summary and outlook
In this article we have reported on some pros and cons related to distinguishing SF and
CS representations and illustrated them by data and facts from a selection of semantic
phenomena. Now we briefly outline the state of the art in more general terms and take a
look at the desiderata that define the agenda for future research.
The current situation can be summarized in three statements: (i) the SF vs. CS
distinction brings in clear-cut advantages as shown by the examples in §§ 2-4; (ii) we still
lack reliable heuristic strategies for identifying the appropriate SF of a lexical item;
(iii) it is difficult to define the scope of variation a given SF can cover at CS level.
What we urgently need is independent evidence for the basic assumption underlying
the distinction: SF representations and CS representations differ in nature as they are
subject to completely different principles of organization. By correlating the SF vs. CS
distinction with distinctions relevant to other levels of linguistic structure formation,
cf. (2)-(5) in section 1, the article has taken some steps in that direction. One of them
is to clarify the differences between SF and CS that derive from their linguistic vs.
non-linguistic origin; cf. (4).
The linguistic basis of the SF-representations of DA terms, for instance, is manifested
(i) in the DAP constants' interrelation by postulates underlying lexical relations, (ii) in
participating in certain lexicalization patterns (e.g. proportion-based vs. observer-based),
(iii) in being subject to the uniqueness constraint in (49), which is indicative of the semio-
ticity of the system it applies to, whereas the Conceptual System CS is not a semiotic one.
Pursuing this line of research, phenomena specific to natural languages like idiosyncra-
cies, designation gaps, collocations, connotations, folk etymologies etc. should be
scrutinized for their possible impact on establishing SF as a linguistically determined level of
representation.
The non-linguistic basis of CS-representations, e.g. OS involved in DA, is manifested
(i) in the fact that OS entries are exclusively subject to perception-based compatibility
conditions;(ii) in their function to integrate input from the spatial environment regardless
of the channel it comes in; (iii) in their property to allow for valid inferences to be drawn
on entries that are induced as contextual specifications. To deepen our understanding
of CS-representations, presumptions like the following deserve to be investigated on a
broader spectrum and in more detail: (i) CS representations may be underspecified in
certain respects, cf. the role of '0' in OS, but as they are not semiotic entities they are not
ambiguous; (ii) the compatibility conditions defining admissible OS suggest that the
following relation may hold wrt. the well-formedness of representations: sortal restrictions
c sclcctional restrictions; (iii) CS representations have to be contingent since
contradictory entries cause the system of inferences to break down; contradictions at SF level
trigger accommodation activities.
As the agenda above suggests, a better understanding of the interplay of linguistic
and non-linguistic aspects of meaning constitution along the lines developed here is
particularly to be expected from interdisciplinary research combining methods and insights
from linguistics, psycholinguistics, neurolinguistics and cognitive psychology.
6. References
Aslier, Nicholas 2007. A Web of Words: Lexical Meaning in Context. Ms. Austin, TX, University of
Texas
738 VI. Cognitively oriented approaches to semantics
Bierwiscli, Manfred 1983. Semantische und konzeptuelle Interpretation lexikalischer Einheiten. In:
R. Ruzi£ka & W. Motsch (eds). Untersuchungen zur Semantik. Berlin: Akademie Verlag, 61-99.
Bierwiscli, Manfred 1988. On the grammar of local prepositions In: M. Bierwiscli, W. Motsch &
I. Ziinmerinann (eds.). Syntax, Semantik und Lexikon. Berlin: Akademie Verlag, 1-65.
Bierwiscli, Manfred 1989. The semantics of gradation. In: M. Bierwiscli & E. Lang (eds.).
Dimensional Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer,71-261.
Bierwiscli, Manfred 1996. How much space gets into language? In: P. Bloom et al. (eds). Language
and Space. Cambridge, MA:The MIT Press, 31-76.
Bierwiscli, Manfred 1997. Lexical information from a minimalist point of view. In: Ch. Wilder, H.-M.
Gartner & M. Bierwiscli (eds.). The Role of Economy Principles in Linguistic Theory. Berlin:
Akademie Verlag. 227-266.
Bierwiscli, Manfred 2005. The event structure of cause and become. In: C. Maienborn & A.
Wollstein (eds.). Event Arguments: Foundations and Applications.Tiib'mgeiv. Niemeyer, 11-44.
Bierwiscli, Manfred 2007. Semantic Form as interface. In: A. Spath (cd.). Interfaces and Interface
Conditions. Berlin: dc Gruytcr, 1-32.
Bierwiscli, Manfred 2010. become and its presuppositions In: R. Bauerle, U Reyle & T E.
Ziniinerinann (eds.). Presupposition and Discourse. Bingley: Emerald, 189-234.
Bierwiscli, Manfred & Peter Bosch (eds.) 1995. Semantic and Conceptual Knowledge (Arbeitspa-
piere des SFB 340 Nr. 71). Heidelberg: IBM Deutschland.
Bierwiscli, Manfred & Ewald Lang (eds) 1989a. Dimensional Adjectives: Grammatical Structure
and Conceptual Interpretation. Berlin: Springer.
Bierwiscli, Manfred & Ewald Lang 1989b. Somewhat longer - much deeper - further and further.
Epilogue to the Dimensional Adjective Project. In: M. Bierwiscli & E. Lang (eds). Dimensional
Adjectives: Grammatical Structure and Conceptual Interpretation. Berlin: Springer, 471-514.
Bierwisch, Manfred & Rob Schreuder 1992. From concepts to lexical items Cognition 42,23-60.
Blutncr, Rcinhard 1995. Systcmatischc Polyscmic: Ansatzc zur Erzcugung und Bcschrankung von
Interprctationsvariantcn. In: M. Bierwisch & P. Bosch (eds). Semantic and Conceptual
Knowledge (Arbeitspapiere des SFB 340 Nr. 71). Heidelberg: IBM Deutschland, 33-67.
Blutner, Reinhard 1998. Lexical pragmatics Journal of Semantics 15,115-162.
Blutncr, Rcinhard 2004. Pragmatics and the lexicon. In: L. R. Horn & G. Ward (eds). The
Handbook of Pragmatics. Maiden, MA: Blackwcll, 488-514.
Bucking, Sebastian 2009. Modifying event noininals: Syntactic surface meets semantic
transparency. In: A. Riester & T Solstad (eds). Proceedings of Sinn und Bedeutung (= SuB) 13. Stuttgart:
University of Stuttgart, 93-107.
Bucking, Sebastian 2010. Zur Interpretation adnominaler Genitive bei nominalisierten Infinitiven
im Deutschen. Zeitschrift flir Sprachwissenschaft 29,39-77.
Carstensen. Kai-Uwe 2001. Sprache, Raum und Aufmerksamkeit. Tubingen: Niemeyer.
Dolling, Johannes 1994. Sortale Selektioiisbeschrankungen und systematische Bedeutungsvai iabil-
itat. In: M. Schwarz (ed.). Kognitive Semantik - Cognitive Semantics. Tubingen: Narr. 41-60.
Dolling, Johannes 2001. Systematische Bedeutungsvariationen: Semantische Form und kontextuelle
Interpretation (Linguistische Arbeitsberichte 78). Leipzig: University of Leipzig.
Dolling, Johannes 2003. Flexibility in adverbal modification: Reinterpretation as contextual
enrichment. In: E. Lang, C. Maienborn & C. Fabricius-Hansen (eds). Modifying Adjuncts. Berlin: de
Gruyter, 511-552.
Dolling, Johannes 2005a. Semantische Form und pragmatische Anreicherung: Situationsausdrucke
in dcr AuBcrungsintcrprctation. Zeitschrift fur Sprachwissenschaft 24.159-225.
Dolling, Johannes 2005b. Copula sentences and entailment relations Theoretical Linguistics 31,
317-329.
Dolling, Johannes 2010. Aspectual coercion and eventuality structure.To appear in: K. Robering &
V Engerer (eds). Verbal Semantics.
Egg, Markus 2003. Beginning novels and finishing hamburgers Remarks on the semantics of 'to
begin'. Journal of Semantics 20,163-191.
742 VI. Cognitively oriented approaches to semantics
role in language. We can be in a room, in a social group, in the midst of an activity, in
trouble, and in politics. We can move a chair from the desk to the table, move money
from one bank account to another, move a discussion from religion to politics, and move
an audience to tears. A fundamental distinction between tangibles and intangibles rules
out the possibility of understanding the sense of'in" or "move" common to all these uses.
Our effort, by contrast, has sought to exploit the insights of linguists such as Gruber
(1965), the generative semanticists, Johnson (1987), Lakoff (1987), Jackendoff (see article
30 (Jackendoff) Conceptual Semantics), and Talmy (see article 27 (Talmy) Cognitive
Semantics: An overview). Johnson, Lakoff, Talmy and others have used the term "image
schemas" to refer to a conceptual framework that includes topological relations but
excludes, for example, Euclidean notions of magnitude and shape. We have been
developing core theories that formalize something like the image schemas, and we have been
using these to define or characterize words. Among the theories we have developed are
theories of composite entities, or things made of other things, the figure-ground relation,
scalar notions, change of state, and causality.The idea behind these abstract core theories
is that they capture a wide range of phenomena that share certain features.The theory of
composite entities, for example, is intended to accommodate natural physical objects like
vole an os, artifacts like automobiles, complex events and processes like concerts and
photosynthesis, and complex informational objects like mathematical proofs. The theory of
scales captures commonalities shared by distance, time, numbers, money, and degrees of
risk, severity, and happiness. The most common words in English (and other languages)
can be defined or characterized in terms of these abstract core theories. Specific kinds of
composite entities and scales, for example, are then defined as instances of these abstract
concepts, and we thereby gain access to the rich vocabulary the abstract theories provide.
We can illustrate the link between word meaning and core theories with the rather
complex verb "range". A core theory of scales provides axioms involving predicates
such as scale, <, subscale, top, bottom, and at. Then we arc able to define "range" by the
following axiom:
(V x, y, z)range(x, y,z) =
(3 s, s{, u{, u2)scale(s) a subscale(srs) a bottom(y, s{)
a top(z,sx) a ux e x Aat(u{,y) a u2e x Aat(u2,z)
a (V u e x)(3 v e s,)af(w,v)
That is,Jt ranges from y to z if and only if there is a scale s with a subscale s{ whose bottom
island whose top isz,such that some member m, of x is at y, some member w2of jc is at z,
and every member u of x is at some point v in ^.Then by choosing different scales and
instantiating the at relation in different ways, we can get such uses as
The buffalo ranged from northern Texas to southern Saskatchewan.
The students" SAT scores range from 1100 to 1550.
The hepatitis cases range from moderate to severe.
His behavior ranges from sullen to vicious
Many things can be conceptualized as scales, and when this is done, a large vocabulary,
including the word "range", becomes available.
744 VI. Cognitively oriented approaches to semantics
the use of a word in context.This is far from a solved problem, but recasting meaning and
interpretation in this way gives us a formal, computational way of approaching the
problem.
Defeasibility in the logic gives us an approach to prototypes (see article 28 (Taylor)
Prototype theory; Rosch 1975). Categories correspond to predicates and are
characterized by a set of possibly defeasible inferences, expressed as axioms, among which are
their traditional defining features. For example, bachelors are unmarried and birds fly.
(V x)bachelor(x) z> unmarried(x)
(V x)bird{x) a etc{(x) z>fly(x)
where etc^x) indicates the defeasibility of the axiom. Each instance of a category has
a subset of the defeasible inferences that hold in its particular case. The more
prototypical, the more inferences. In the case of the penguin, which is not a prototypical
bird, the defeasible inference about flying is defeated. In this view, the basic level
category is the predicate with the richest set of associated axioms. For example, there is
more gain in useful knowledge from learning an animal is a dog than from learning a
dog is a boxer.
Similarly, defeasible inference lends itself to a treatment of novel metaphor. In
metaphor, some properties arc transferred from a source to a target, and some arc not. When
we say Pat is a pig, we draw inferences about manner and quantity of eating from "pig",
but not about four-leggedness or species membership.The latter inferences are defeated
by the other things we know. Hobbs (1992) develops this idea.
Taking abstract core theories as basic may seem to run counter to a central tenet of
cognitive linguistics, namely, that our understanding of many abstract domains is founded
on spatial metaphor. It is certainly true that the field of spatial relationships, along with
social relationships, is one of the domains babies have to figure out first. But I think that
to say we figure out space first and then transfer that knowledge to other domains is to
seriously underestimate the difficulty of figuring out space. There are many ways one
could conceptualize space, e.g., via Euclidean geometry. But in fact it is the topological
concepts which predominate in a baby's spatial understanding. A one-year-old baby
fascinated by "in" might put a necklace into a trash can and a Cheerio into a shoe, despite
their very different sizes and shapes. In spatial metaphor it is generally the topological
properties that get transferred from the source to the target. In taking the abstract core
theories as basic, we are isolating precisely the topological properties of space that are
most likely to be the basis for understanding metaphorical domains.
If one were inclined to make innateness arguments, one position would be that we
are born with a instinctive ability to operate in spatial environments. We begin to use
this immediately when we are born, and when we encounter abstract domains, we tap
into its rich models. The alternative, more in line with our development here, is that we
are born with at least a predisposition towards instinctive abstract patterns - composite
entities, scales, change, and so on - which we first apply in making sense of our spatial
environment, and then apply to other, more abstract domains as we encounter them.This
has the advantage over the first position that it is specific about exactly what properties
of space might be in our innate repertoire. For example, the scalar notions of "closer"
and "farther" are in it; exact measures of distance are not. A nicely paradoxical coda for
32. Word meaning and world knowledge 745
summing up this position is that we understand space by means of a spatial metaphor. I
take Talmy's critique of the "concreteness as basic" idea as making a similar point (see
article 27 (Talmy) Cognitive Semantics: An overview).
Many of the preceding articles have proposed frameworks for linking words to an
underlying conceptual structure.These can all be viewed as initial forays into the problem
of connecting lexical meaning with world knowledge. The content of this work survives
translation among the various frameworks that have been used for examining it, and
survives recasting it as a problem of explicitly encoding world knowledge, specifically, a
theory of image schemas explicating such concepts as composite entities, figure-ground,
scales, change of state, causality, aggregation, and granularity shifts—an abstract theory
that can be instantiated in many different, more specialized domains. The core theories
we are developing are not so much theories about particular aspects of the world, but
rather abstract frameworks that are useful in making sense of a number of different
kinds of phenomena. Levin and Rappaport Hovav (see article 19 (Levin & Rappaport
Hovav) Lexical Conceptual Structure) say, "All theories of event structure, either
implicitly or explicitly, recognize a distinction between the primitive predicates which define
the range of event types available and a component which represents what is
idiosyncratic in a verb's meaning." The abstract theories presented here are an explication of
the former of these.
This work can be seen as an attempt at a kind of deep lexical semantics. Not only are
the words "decomposed" into what were once called primitives, but also the primitives
are explicated in axiomatic theories, enabling one to reason deeply about the concepts
conveyed by the text.
2. Core abstract theories
2.1. Composite entities
Composite entities are things made of other things. A composite entity is characterized
by a set of components, a set of properties of these components, and a set of relations
among the components and between the components and the whole. The concept of
composite entity captures the minimal complexity something must have in order for
it to have structure. It is hard to imagine something that cannot be conceptualized as a
composite entity. For this reason, a vocabulary for talking about composite entities will
be broadly applicable.
The elements of a composite entity can themselves be viewed as composite entities,
and this gives us a very common example of shifting granularities. It allows us to
distinguish between the structure and the function of an entity. The function of an entity as a
component of a larger composite entity is its relations to the other elements of the larger
composite entity, its environment, while the entity itself is viewed as indecomposable.
The structure of the entity is revealed when we decompose it and view it as a composite
entity itself. We look at it at a finer granularity.
An important question any time we can view an entity both functionally and
structurally is how the functions of the entity arc implemented in its structure. Wc need to spell
out the structure-function articulations.
For example, a librarian might view a book as an indecomposable entity and be
interested in its location in the library, its relationship to other books, to the bookshelves, and
746 VI. Cognitively oriented approaches to semantics
to the people who check the book out. This is a functional view of the book with respect
to the library. We can also view it structurally by inquiring as to its parts, its content,
its binding, and so on. In spelling out the structure-function articulations, we might say
something about how its content, its size, and the material used in its cover determines
its proper location in the library.
A composite entity can serve as the ground against which some ex\ema\ figure can be
located or can move (see 27 Cognitive semantics: An overview). A primitive predicate at
expresses this relation. In
at(x, y, s)
s is a composite entity,}' is one of its elements, and x is an external entity.The relation
says that the figure x is at a point y in the composite entity s, which is the ground.
The at relation plays primarily two roles in the knowledge base. First, it is involved
in the "decompositions" of many lexical items. We saw this above in the definition of
"range".There is a very rich vocabulary of terms for talking about the figure-ground
relation. This means that whenever a relation in some domain can be viewed as an instance
of the figure-ground relation, we acquire at a stroke a rich vocabulary for talking about
that domain.
This gives rise to the second role the at predicate plays in the knowledge base. A great
many specific domains have relations that arc stipulated to be instances of the at relation.
There are a large number of axioms of the form
(V x, y, s)r(x, y, s) => at(x, y, s)
It is in this way that many of the metaphorical usages that pervade natural language
discourse are accommodated. Once we characterize some piece of the world as a composite
entity, and some relation as an at relation, we have acquired the whole locational way of
talking about it. Once this is enriched with a theory of time and change, we can import
the whole vocabulary of motion. For example, in computer science, a data structure can
be viewed as a composite entity, and we can stipulate that if a pointer points to a node
in a data structure, then the pointer is at that node. We have then acquired a spatial
metaphor, and we can subsequently talk about, for example, the pointer moving around
the data structure. Space, of course, is itself a composite entity and can be talked about
using a locational vocabulary.
Other examples of at relations are
A person at an object in a system of objects:
John is at his desk.
An object at a location in a coordinate system:
The post office is at the corner of 34th Street and Eighth Avenue.
A person's salary at a particular point on the money scale:
John's salary reached $75,000 this year.
An event at a point on the time line:
The meeting is at three o'clock.
VI. Cognitively oriented approaches to semantics
1. a broad sense, including have a son, having a condition hold and having a college degree
2. having a feature or property, i.e., the property holding of the entity
3. a sentient being having a feeling or internal property
4. a person owning a possession
7. have a person related in some way: have an assistant
9. have left: have three more chapters to write
12. have a disease: have influenza
17. have a score in a game: have three touchdowns
The supersense can be characterized by the axiom
have-s\(x,y) z> relatedTo(x,y)
In these axioms, supersenses are indexed with s, WordNet senses with w, and FrameNet
senses with/. Unindcxcd predicates arc from core theories.
The individual senses are then specializations of the supersense where more domain-
specific predicates are explicated in more specialized domains. For example, sense 4
relates to the supersense as follows:
have-w4(x, y) = possess(x, y)
have-w4(x, y) z> have-sl (x, y)
where the predicate possess would be explicated in a commonsense theory of economics,
relating it to the priveleged use of the object. Similarly, have-w3(x, y) links with the
supersense but has the restrictions that x is sentient and that the "relatedTo" property is
the predicate-argument relation between the feeling and its subject.
The second supersense of "have" is "come to be in a relation to".Thisisour changeTo
predicate. Thus, the definition of this supersense is
have-s2(x, y) = changeTo(e) a have-sl \e, x, y)
The WordNet senses this covers are as follows:
10. be confronted with: we have a fine mess
11. experience: the stocks had a fast run-up
14. receive something offered: have this present
15. come into possession of: he had a gift from her
16. undergo, e.g.. an injury: he had his arm broken in the fight
18. have a baby
In these senses the new relation is initiated but the subject does not necessarily play a
causal or agentive role. The particular change involved is specialized in the WordNet
senses to a confronting, a receiving, a giving birth, and so on.
The third supersense of "have" is "cause to come to be in a relation to". The axiom
defining this is
have-s3(x,y) = cause(x, e) a have-s2'(e,x, y)
The WordNet senses this covers are
32. Word meaning and world knowledge 755
remainder-w\(x, e) = remain-w3(x, e)
That is,Jt is the remainder after process e if and only if x remains after e.Thc other three
senses result from specialization of the removal process to arithmetic division, arithmetic
subtraction, and the purposeful cutting of a piece of cloth. The noun "remains" refers to
what remains (w2>) after a process of consumption or degradation.
3.4. The nature of word senses
The most common words in a language are typically the most polysemous. They often
have a central meaning indicating their general topological structure. Each new sense
introduces inferences that cannot be reliably determined just from a core meaning plus
contextual factors. They tend to build up along what Brugman (1981), Lakoff (1987) and
others have called a radial category structure (see article 28 (Taylor) Prototype theory).
Sense 2 may be a slight modification of sense 1, and senses 3 and 4 different slight
modifications of sense 2. It is easy to describe the links that take us from one sense to an
adjacent one in the framework presented here. Each sense corresponds to a predicate which
is characterized by one or more axioms involving that predicate. A move to an adjacent
sense happens when incremental changes are made to the axioms. As we have seen in
the examples of this section, the changes are generally additions to the antecedents or
consequents of the axioms.The principal kinds of additions are embedding in change and
cause, as we saw in the supersenses of "have"; the instantiation of general predicates like
relatedTo and at to more specific predicates in particular domains, as we saw in all three
cases; and the addition of domain-specific constraints on arguments, as in restricting y to
be clothes in remove-fl.
A good account of the lexical semantics of a word should not just catalog various word
senses. It should detail the radial category structure of the word senses, and for each link,
it should say what incremental addition or modification resulted in the new sense. Note
that radial categories provide us with a logical structure for the lexicon, and also no
doubt a historical one, but not a developmental one. Children often learn word senses
independently and only later if ever realize the relation among the senses. See article 28
Prototype theory for further discussion of issues with respect to radial categories.
4. Distinguishing lexical and world knowledge
It is perhaps natural to ask whether a principled boundary can be drawn between
linguistic knowledge and knowledge of the world. To make this issue more concrete,
consider the following seven statements:
(1) If a string w{ is a noun phrase and a string w2 is a verb phrase, then the concatenation
wyw2 is a clause.
(2) The transitive verb "moves" corresponds to the predication move2(x,y), providing a
string describing x occurs as its subject and a string describing y occurs as its direct
object.
(3) If an entity x moves (in sense move2) an entity y, then x causes a change of state or
location ofy.
32. Word meaning and world knowledge 759
knowledge and interpretation processes, as with the study of lexical conceptual stuctures,
as a strategic decision to identify a fruitful and tractable research problem.
Pustejovsky's work is an attempt to specify the knowledge that is required for
interpreting at least the majority of nonstandard uses of words. Kilgarriff (2001) tests this
hypothesis by examining the uses of nine particular words in a 20-million word corpus.
41 of 2276 instances were judged to be nonstandard since they did not correspond to any
of the entries for the word in a standard dictionary. Of these, only two nonstandard uses
were derivable from Pustejovsky's qualia structures. The others required deeper com-
monsense knowledge or previous acquaintance with collocations. Kilgarriffs conclusion
is that "Any theory that relies on a distinction between general and lexical knowledge
will founder." (Kilgariff 2001:325)
Some researchers in natural language processing have argued that lexical
knowledge should be distinguished from other knowledge because it results in more efficient
computation or more efficient comprehension and production. One example concerns
hyperonymy relations, such as that car(x) implies vehicle{x). It is true that some kinds
of inferences lend themselves more to efficient computation than others, and inferences
involving only monadic predicates are one example. But where this is true, it is a result
not of their content but of structural properties of the inferences, and these cut across the
lexical-world distinction. Any efficiency realized in inferring vehicle{x) can be realized in
inferring expensive(x) as well.
All of statements (1 )-(7) arc facts about the world, because sentences and their
structure and words and their roles in sentences are things in the world, as much as barges,
planes, and New Jersey. There is certainly knowledge we have that is knowledge about
words, including how to pronounce and spell words, predicate-argument realization
patterns, alternation rules, subcategorization patterns, grammatical gender, and so on. But
words are part of the world, and one might ask why this sort of knowledge should have
any special cognitive status. Is it any different in principle from the kind of knowledge
one has about friendship, cars, or the properties of materials? In all these cases, we have
entities, properties of entities, and relations among them. Lexical knowledge is just
ordinary knowledge where the entities in question are words.There are no representational
reasons for treating linguistic knowledge as special, providing we are willing to treat the
entities in our subject matter as first-class individuals in our logic (cf. Hobbs 1985).There
are no procedural reasons for treating linguistic knowledge as special, since parsing,
argument realization, lexical decomposition, the coercion of metonymies, and so on can
all be implemented straightforwardly as inference.The argument that parsing and lexical
decomposition, for example, can be done efficiently on present-day computers, whereas
commonsense reasoning cannot, does not seem to apply to the human brain; psycho-
linguistic studies show that the influence of world knowledge kicks in as early as syntactic
and lexical knowledge, and yields the necessary results just as quickly.
We are led to the conclusion that any drawing of lines is for the strategic purpose
of identifying a coherent, tractable and fruitful area of research. Statements (l)-(6)
are examples from six such areas. Once we have identified and explicated such areas,
the next question is what connections or articulations there are among them;
Pustejovsky's research and work on lexical conceptual structures arc good examples of people
addressing this question.
However, all of this does not mean that linguistic insights can be ignored. The world
can be conceptualized in many ways. Some of them lend themselves to a deep treatment
760 VI. Cognitively oriented approaches to semantics
of lexical semantics, and some of them impede it. Put the other way around, looking
closely at language leads us to a particular conceptualization of the world that has
proved broadly useful in everyday life. It provides us with topological relations rather
than with the precision of Euclidean 3-space. It focuses on changes of state rather than
on correspondences with an a priori time line. A defeasible notion of causality is central
in it. It provides means for aggregation and shifting granularities. It encompasses those
properties of space that are typically transferred to new target domains when what looks
like a spatial metaphor is invoked.
More specific domains can then be seen as instantiations of these abstract theories.
Indeed, Euclidean 3-space itself is such a specialization. Language provides us with a rich
vocabulary for talking about the abstract domains. The core meanings of many of the
most common words in language can be defined or characterized in these core theories.
When the core theory is instantiated in a specific domain, the vocabulary associated with
the abstract domain is also instantiated, giving us a rich vocabulary for talking about
and thinking about the specific domain. Conversely, when we encounter general words
in the contexts of specific domains, understanding how the specific domains instantiate
the abstract domains allows us to determine the specific meanings of the general words
in their current context.
We understand language so well because we know so much. Therefore, we will not
have a good account of how language works until we have a good account of what we
know about the world and how we use that knowledge. In this article I have sketched a
formalization of one very abstract way of conceptualizing the world, one that arises from
an investigation of lexical semantics and is closely related to the lexical decompositions
and image schemas that have been argued for by other lexical semanticists. It enables us
to capture formally the core meanings of many of the most common words in English
and other languages, and it links smoothly with more precise theories of specific domains.
I have profited from discussions of this work with Gully Burns, Peter Clark, Tim Clausner,
Christiane Fellbaum, and Rutu Mulkar-Mehta. This work was performed in part under the
IARPA (DTO) AQUAINTprogram, contract N61339-06-C-0160.
5. References
Baker, Colin F., Charles J. Fillmore & Beau Cronin 2003. The structure of the Framcnct database.
International Journal of Lexicography 16,281-296.
Bloomfield, Leonard 1933. Language. New York: Holt, Rinehart & Winston.
Brugman, Claudia 1981. The Story of'Over'. MA thesis University of California, Berkeley, CA.
Cycorp 2008. http://www.cyc.com/. December 9,2010.
Davis, Ernest 1990. Representations of Commonsense Knowledge. San Mateo, CA: Morgan
Kaufmann.
Fodor, Jerry A. 1980. Methodological solipsism considered as a research strategy in cognitive
science. Behavioral and Brain Sciences 3,63-109.
Fodor, Jerry A. 1983. The Modularity of Mind. An Essay on Faculty Psychology. Cambridge, MA:
The MIT Press.
Ginsberg. Matthew L. (ed.) 1987. Readings in Nonmonotonic Reasoning. San Mateo, CA: Morgan
Kaufmann.
VII. Theories of sentence semantics
33. Model-theoretic semantics
1. Truth-conditional semantics
2. Possible worlds semantics
3. Model-theoretic semantics
4. References
Abstract
Model-theoretic semantics is a special form of truth-conditional semantics. According
to it, the truth-values of sentences depend on certain abstract objects called models.
Understood in this way, models are mathematical structures that provide the interpretations
of the (non-logical) lexical expressions of a language and determine the truth-values
of its (declarative) sentences. Originally designed for the semantic analysis of
mathematical logic, model-theoretic semantics has become a standard tool in linguistic semantics,
mostly through the impact of Richard Montagues seminal work on the analogy between
formal and natural languages. As such, it is frequently (and loosely) identified with
possible worlds semantics, which rests on an identification of sentence meanings with
regions in Logical Space, the class of all possible worlds. In fact, the two approaches
have much in common and are not always easy to keep apart. In a sense, (i) model-
theoretic semantics can be thought of as a restricted form of possible worlds semantics,
where models represent possible worlds; in another sense, (ii) model-theoretic
semantics can be seen as a wild generalization of possible worlds semantics, treating Logical
Space as variable rather than given. Consequently, the present introductory exposition of
model-theoretic semantics also covers possible worlds semantics - hopefully helping to
disentangle the relationship between the two approaches. It starts with a general
discussion of truth-conditional semantics (section I), the main purpose of which is to provide
some motivation and background. Model-theoretic semantics is then approached from
the possible worlds point of view (section 2), highlighting the similarities between the
two approaches that give rise to perspective (i). The final section 3 turns to model theory
as a providing a mathematical reconstruction of possible worlds semantics, ultimately
arriving at the more abstract perspective (ii).
1. Truth-conditional semantics
The starting point for the truth-conditional approach to semantics is the tight
connection between meaning and truth. From a prc-thcorctic point of view, linguistic meaning
may appear a multi-faceted blend of phenomena, partly subjective and private, partly
social and inter-subjective, mostly vague, slippery, and apparently hard to define in
precise terms. Nevertheless, there are a few unshakable, yet substantial (non-tautological)
insights into meaning. Among these is the contention that differences in truth-value
necessitate differences in meaning:
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 762-802
33. Model-theoretic semantics
763
Most Certain Principle [MCP] cf. Cresswell (1982:69)
If 5, and S2 are declarative sentences such that, under given circumstances, 5, is
true whereas S2 is not, then 5, and S2 differ in meaning.
The MCP certainly has the flavour of an a priori truth. If it is one, an adequate analysis
of meaning ought to take care of it by making a conceptual connection with truth; in
fact, this connection might lie at the heart of any adequate theoretical reconstruction of
meaning. If so, meaning is primarily sentence meaning; for sentences are the bearers of
truth, i.e. the only expressions that can be true (or false).
We will say that the truth value of a (declarative) sentence S is 1 [under given
circumstances] if S is true [under these circumstances]; and that its truth value [...] is 0
otherwise.Then the lesson to be learnt from the MCP is that, given arbitrary circumstances, the
meaning of a sentence S determines its truth value under these circumstances: nothing,
i.e. no sentence 5', could have that meaning without having the same truth value.
Whatever the meanings of (declarative) sentences may be, then, they must somehow
determine their truth conditions, i.e. the circumstances under which the sentences are true.
According to truth-conditional semantics, anything beyond this most marked trait of
sentence meanings ought to be left out of consideration (or to pragmatics), thus paving
the way for the following'minimalist' semantic axiom:
Basic Principle of Truth-Conditional Semantics [BPJ cf. Wittgenstein (1922:4.431)
Any two declarative sentences agree in meaning just in case they agree in their
truth conditions.
Since by (our) definition, truth conditions arc circumstances of truth, the BP implies
that sentences that differ in meaning, also differ in truth value under at least some
circumstances. Truth-conditional semantics owes much of its flavour, as well as some
of its problems, to this immediate consequence of the BP, to which we will return in
section 2.
According to the BP, the meanings of sentences covary with their truth conditions.
It is important to realize (and easy to overlook) that a statement of covariance is not
an equation. The - common - identification of sentence meanings with truth conditions
requires further justification. Some motivation for it may be sought in functionalist
considerations, according to which truth conditions suffice to explain the extralinguistic roles
attributed to sentence meaning in social interaction and cognitive processes.Though we
cannot go into these matters here, we will later see examples of covarying entities - the
material models introduced in Section 2.3 - that are less likely to be identified with
sentence meanings than truth conditions in the sense envisaged here. Bearing this in mind,
we will still read the BP as suggesting that the meanings of sentences may as well be
taken to be their truth conditions.
In its most radical form, truth-conditional semantics seeks to derive the meanings of
non-sentential expressions from sentence meanings, as the contributions these
expressions make to the truth conditions of the sentences in which they occur; cf. Frege (1884:
71). Without any further specification and constraints, this characterisation of non-
sentential meaning is bound to be incomplete. To begin with, the contribution an
expression makes to the truth conditions of a sentence in which it occurs, obviously depends on
764
VII. Theories of sentence semantics
the position in which it occurs: the contribution of an expression A occupying position p
in a sentence S to the meaning of S must be understood as the contribution of A as
occurring in p; for the sake of dcfinitcncss, positions p can be identified with (certain)
partial, injective functions on the domain of expressions, inserting occupants into hosts,
so that an (occupant) expression A occurs in position p of a (host) expression B just in
case p{A) = B. Positions in sentences are structural positions that need not coincide with
positions in surface strings; in particular, the contributions made by the parts of
structurally ambiguous (surface) sentences depend on the latters' readings, which have different
structures, hosting their parts in different structural positions.
Given their sensitivity to syntactic structure, the truth-conditional contributions
of its parts can hardly be determined by looking at one sentence in isolation.
Otherwise minimal pairs like the following, taken from Abbott & Hauser (1995: 6, fn. 10),
would create problems when it comes to the truth-conditional contributions of their
parts:
(1) Someone is buying a car.
(2) Someone is selling a car.
(3) and (4) have the same truth conditions and the same overall structure, in which
the underlined verbs occupy the same position. Since the two sentences are otherwise
identical, it would seem that the two alternatives make the same contribution. If so, this
contribution cannot be their meaning - after all, the two underlined verb forms are not
synonymous and generally do differ in their truth-conditional contributions, as in:
(3) Mary is buying a car.
(4) Mary is selling a car.
Hence, lest the two verbs come out as synonymous, their meanings can only be identified
with their truth-conditional contributions if the latter are construed globally, taking into
account all positions in which the verbs may occur. Generalising from this example, it
emerges that sameness of meaning among any non-sentential expressions A and B (of
the same category) requires sameness of truth-conditional contribution throughout the
structural positions of all sentences. Since A and B occupy the same structural position
in two sentences SA and SB if SB derives from SA by putting B in place of A, this condition
on synonymy may be cast in terms of a:
Substitution Principle [Subst]
If two non-sentential expressions of the same category have the same meaning,
either may replace the other in all positions within any sentence without thereby
affecting the truth conditions of that sentence.
The italicized restriction is made so as to not rule out the possibility that two
expressions make the same truth-conditional contribution without being grammatically
equivalent. As argued in Gazdar et al. (1985: 32), likely and probable may be cases in
point - given the grammatical contrast between likely to leave vs. *probable to leave.
33. Model-theoretic semantics
765
Taken together, the BP and Subst form the backbone of truth-conditional
semantics: the meanings of sentences coincide with their truth conditions; and synonymy
among non-scntcntial expression is restricted by the synonymies among the sentences
in which they occur. Subst leaves open what kinds of entities the meanings of non-
sentential expressions are. Keeping the 'minimalist' spirit behind the passage from
the MCP to the BP, it is tempting to identify them by strengthening Subst to a
biconditional, thereby arriving at a synonymy criterion, according to which two (non-
sentential) expressions have the same meaning if they may always replace each other
within a sentence without affecting the truth conditions of that sentence. Ontologi-
cally speaking, this synonymy criterion has meaning supervene on truth conditions; cf.
Kupffer (2007). However, we will depart from radical truth-conditional semantics and
not take it for granted that this criterion always applies; possible counterexamples will
be addressed in due course.
The above considerations also support a more general version of Subst according to
which various synonymous (non-sentential) expressions may be replaced simultaneously.
As a result, the meaning of any complex expression can be construed as only depending
of the meanings of its immediate parts - thus giving rise to a famous principle, which we
note here for future reference:
Principle of Compositionality [Compo]
The meaning of a complex expression functionally depends on the meanings of its
immediate parts and the way in which they are combined.
Like Subst, Compo does not say what kinds of objects non-sentential meanings are.
In fact, a permutation argument has it that they arc hopelessly undcrdctcrmincd: if a
truth-conditional account specified some objects x andy as the meanings of some (non-
sentential) expressions A and B, then an alternative account could be given that would
also be in line with Subst, but according to which x and y have changed their places. We
will take a closer look at this argument in section 3.
Even though Compo and Subst arc not committed to a particular concept of non-
sentential meaning, some choices might be more natural than others. Indeed, Subst falls
out immediately if truth-conditional contributions and thus meanings (of non-sentential
expressions) are constructed by abstraction, as classes of expressions that make the same
truth-conditional contributions in all positions in which they occur: if the meaning of an
expressions A is identified with the set of all expressions with which A can be replaced
without any change in truth conditions, then Subst trivially holds. However, there is a
serious problem with this strategy: if (non-sentential) meanings were classes of inter-
substitutable expressions, they would be essentially language-dependent. But if
meanings were just substitution classes, then meaning-preserving translation between distinct
languages would be impossible; and even more absurdly, all expressions of a language
would have to change their meanings each time a new word is added to it! This certainly
speaks against a construal of meanings in terms of abstraction, even though synonymy
classes have proved useful as representatives of meanings in the algebraic study of
compositionality; cf. Hodges (2001).
If contributing to truth conditions and conforming to Subst were all there is to non-
sentential meanings, then there would be no fact of the matter as to what kinds of objects
766
VII. Theories of sentence semantics
they are; moreover, there would be no obvious criteria for deciding whether two
expressions of different languages have the same meaning. Hence a substantial theory of non-
sentential meaning cannot live on Subst alone. One way to arrive at such a theory is by
finding a natural analogue to the MCP that goes beyond the realm of sentences. The
most prominent candidate is what may be dubbed the:
Rather Plausible Principle [RPPJ
If 7, and T2 are terms such that, under given circumstances, 7, refers to something
to which 72does not refer, then 7, and T2 differ in meaning.
A term in the sense of the RPP is an expression that may be said to refer to something.
Proper names, definite descriptions, and pronouns are cases in point. Thus, in one of its
usages, (i) the name Manfred refers to a certain person; under the current actual
circumstances, (ii) the description Regine's husband refers to the same person; and in an
email to Manfred Kupffer, I can use (iii) the pronoun you to refer to him. Of course, in
different contexts, (iii) can also refer to another person; hence the two terms mentioned
in the RPP must be understood as being used in the same context. Under different (past
or counterfactual) circumstances, (ii) may refer to another person, or to no one at all;
the RPP only mentions the reference of terms under the same circumstances, and in the
latter case thus applies vacuously. For the purpose of this survey we will also assume that
names with more than one bearer, like (i), are lexically ambiguous; the terms
quantified over in the RPP must therefore be taken as disambiguated expressions rather than
surface forms.
Why should the RPP be less certain than the A/CP? One possible reason comes from
ontology, the study of what there is: the exact subject matter of sentences containing
referring terms is not necessarily clearly defined, let alone obvious.The reader is referred
to the relevant literature on the inscrutability of reference; Williams (2005) contains a
good, if biased, survey.
Putting ontological qualms aside, we may conclude from the RPP that whatever the
meanings of terms may be, they must somehow determine their reference conditions, i.e.
the referents as depending on the circumstances. According to a certain brand of truth-
conditional semantics, anything beyond this most marked trait of term meanings ought
to be left out of consideration (or to pragmatics), thus giving rise to a stronger version of
our 'minimalist' semantic axiom:
Extended Basic Principle of Truth-Conditional Semantics [EBPJ
Any two declarative sentences agree in meaning just in case they agree in their
truth conditions; and any two terms agree in meaning just in case they agree in
their reference conditions.
Like the BP, the EBP is merely a statement of covariance, not an equation. The -
common - identification of term meanings with reference conditions, also requires
further 'external' evidence. However, as in the case of the BP, we will read the EBP as
suggesting that term meanings be reference conditions. Following this suggestion, and
given Subst, the EBP may then be used to determine the meanings of at least some non-
sentential expressions in a non-arbitrary and at the same time language-independent
way, employing a heuristics based on:
33. Model-theoretic semantics
767
Frege's Functional Principle [FFP]
If not specified by the EBP, the meaning of an expression A is the meaning of
(certain) expressions in which A occurs as functionally depending on A's occurrence.
Since the functional meanings of parts can be thought of as contributions to host meanings,
FFP can be seen as a version of Frege's (1884: x) Context Principle that is not committed
to the radical form of truth-conditional semantics: 'nach der Bedeutung der Worter muss
im Satzzusammenhange, nicht in ihrer Vereinzelung gefragt werden' '[«'one must ask
for the meanings of words in their sentential context, not in isolation']. Obviously, FFP
is more specific; it also deviates from the original in having term extensions determined
directly,in accordance with the EBP- and Frege's (1892) later policy.
According to FFP, the meaning of an expression A that is not a sentence or a term,
turns out to be a function assigning meanings of host expressions (in which A occurs) to
meanings of positions (in which A occurs) - where the latter can be made out in case the
positions coincide with adjacent expressions whose meanings have been independently
identified - either by the EBP, or by previous applications of the same strategy. As a
case in point, the meanings of (coordinating) conjunctions C (like and and or) may be
determined from their occurrences in unembedded coordinations of sentences A and B,
i.e. sentences of the form A C B; these occurrences are completely determined by the two
coordinated sentences and may thus be identified with the ordered pairs (A,/?). Hence,
following, the meaning of a conjunction comes out as a function from pairs of sentence
meanings to sentence meanings, i.e. a function assigning truth conditions to pairs of truth
conditions. In particular, the meaning of and assigns the truth conditions of any sentence
A and B to the pair consisting of the truth conditions of A and the truth conditions of B.
In a similar vein, by looking at simple predications of the form T P (where T is a term),
the meanings of predicates P come out as functions assigning sentence meanings to term
meanings. Since the latter are determined by the EBP, predicate meanings are thus
construed as functions from reference conditions to truth conditions.This result may in turn
be used to obtain the meanings of transitive verbs V, which may take terms T as objects
to form predicates, the meanings of which have already been identified; hence, by another
application of FFP, the meaning of a transitive verb comes out as a function assigning
predicate meanings to reference conditions (= the meanings of terms in object position).
In one form or another, the heuristics just sketched has been applied widely, and
rather successfully, in linguistic semantics. Following it, a large variety of expressions
can be assigned meanings whose primary function is to contribute to truth and reference
conditions without depending on any particular language. Nevertheless, for a variety
of reasons, following FFP (in the way indicated here) does not settle the issue of
determining meaning in a fully principled, non-arbitrary way:
- As the above examples show, the meanings of expressions depend on a choice of
primary occurrences, i.e. positions in which they can be read off the immediate context.
However, in general there is no way of telling which of a given number of grammatical
constructions ought to serve this purpose; and the choice may influence the
meanings obtained by applying FFP. Quantifier phrases - expressions like nobody; most
linguists;every atom in the universe;elc- are cases in point. Customarily they are
analyzed by putting them in subject position and thus interpret them as functions from
predicate meanings (determined in the way above) to sentence meanings (determined
770
VII. Theories of sentence semantics
quantifier in subject position may depend on the value of the predicate extension for
nameless individuals, the aforementioned idealised generalisation of predicate
extensions to functions operating on arbitrary individuals turns out to be crucial for this step.
The above extensional adaptation is much more limited in its application than FFP
in general: it only works as long as extensions behave compositionally; otherwise it is
bound to fail; and there are a number of environments that cannot be analyzed by
compositionally combining extensions. Still, a considerable portion of the grammatical
constructions of English is extensional in that the extensional version of FFP may be applied
to them, thus extending the notion of extension to ever more kinds of expressions. In
the present and the following section we will be exclusively concerned with extensional
constructions and, for convenience, pretend that all constructions arc extensional.
The generalisation of extensions to expressions of (almost) arbitrary categories leads
to a natural theoretical construction of meanings in terms of extensions and
circumstances. The key to this construction lies in a generalisation of the EBP, which comes for
free for those expressions whose extensions have been constructed in the way indicated,
i.e. using the extensional version of FFP.To see this, we have to be more specific about
truth conditions and reference conditions (in the sense intended here): since they are
truth values and referents as depending on circumstances, we will henceforth take them
to be functions assigning the former to the latter. This identification obviously
presupposes that, whatever circumstances may be, they collectively form a set that may serve as
the domain of these functions.Though wc will not be entirely specific as to the nature of
this set, we assume that it is vast, and that its members do not miss a detail:
Vastness: Circumstances may be as hypothetical as can be.
Detail: Circumstances are as specific as can be.
Vastness reflects the raison d'etre of circumstances in semantic theory: according to the
EBP, they serve to differentiate the meanings of non-synonymous sentences and terms
by providing them with distinct truth values and referents, respectively. In order to fulfill
this task in an adequate way, actual circumstances - those met in the actual world -
obviously do not suffice, as a random example shows:
(5) The physicist who discovered the X-rays died in Munich.
(6) The first Nobel laureate in physics died in Munich.
As the educated reader knows, it so happens that the German physicist Wilhelm Conrad
Rontgen discovered a phenomenon known as X-rays in 1895, was subsequently awarded
the first Nobel prize in physics in 1901, and died in Munich in 1923. Hence the sentences
(5) and (6) are both true; and their subjects refer to the same person. In fact, (5) and
(6) are true under any actual circumstances; likewise, their subjects share their referent
under any actual circumstances. Nevertheless, (5) and (6) are far from being synonymous,
and neither are the definite descriptions in their subject position. In order to reflect these
semantic differences in their truth and reference conditions, respectively, circumstances
are called for under which (5) and (6) differ in truth value and, consequently, their
subjects do not corefer. Such circumstances are not hard to imagine; the inventive reader
33. Model-theoretic semantics
771
is invited to concoct his or her pertinent favourite scenario. Since it is hardly
foreseeable what kinds of differences are needed in order to separate two non-synonymous
sentences or terms by circumstances under which their extensions do not coincide, we
will assume that any conceivable circumstances count as evidence for non-synonymy,
however far-fetched or bizarre they may be. The following pair, inspired by Macbeath
(1982), is a case in point:
(7) Dr Who ate himself.
(8) Dr Who is his own father.
Lest both (7) and (8) come out as false for all circumstances and thus as synonymous,
there better be circumstances that contradict current science, thereby reflecting the
meaning difference between the two sentences.
Detail is an assumption made mostly for convenience and definiteness;but once made,
it cannot be ignored. It captures the idea that circumstances are chosen in such a way as
to determine the truth values of any sentence whatsoever. As far as actual circumstances
are concerned, this is a rather natural assumption. My present circumstances are such
that I am sitting in a train traveling through northern regions of Germany at a fairly
high speed. I have no idea what precisely the speed is, nor do I know where exactly the
train is right now, how many passengers arc on it, how old the engineer is, etc. But all of
these details, and innumerable others, are resolved by these circumstances, which are as
specific as can be.The point of Detail is that counterfactual (= non-actual) circumstances
are just as specific. Hence if I had been on a space shuttle instead of riding a train, the
circumstances that I would have been in would have included a specific velocity, a specific
number of co-passengers, etc.To be sure, these details could be filled in in different ways,
all of which correspond to different counterfactual circumstances. Given Detail, then,
counterfactual circumstances are unlike the ordinary conception of worlds of
imagination or fiction: stories and novels usually, nay always, leave open a lot of details - like the
exact number of hairs on the protagonist's head, etc. If anything, worlds of fiction
correspond to sets of circumstances; for instance, 'the world of Sherlock Holmes' corresponds
to the set of (counterfactual) circumstances that are in accordance with whatever Conan
Doyle's stories say, and disagree on all kinds of unsettled details.
As was said above, we assume that the totality of all actual and counterfactual
circumstances forms a rather large set which is called Logical Space; cf. Wittgenstein (1922:
3.4 & passim). We will follow logical and semantic tradition and refer to the elements
of Logical Space as possible worlds rather than circumstances, even though the term is
slightly misleading in that it not only suggests lack of detail (as was just noted) but also a
certain grandeur, which the members of Logical Space need not have; it will be left open
here whether possible worlds are all-inclusive agglomerations of facts, or whether they
could also correspond to more mundane (!), medium-sized situations. What is
important is that Logical Space is sufficiently rich to differentiate the meanings of arbitrary
non-synonymous sentences and terms. Yet even the macroscopic Vastness of Logical
Space and microscopic Details of its worlds cannot guarantee that any two sentences
that appear to differ in meaning also differ in their truth conditions. Thus, e.g., (9)-(10)
are true of precisely the same possible worlds, but arguably differ in meaning:
774
VII. Theories of sentence semantics
Other (extensional) constructions may require some ingenuity on the semanticist's part
in order to determine the precise way in which extensions combine. Quantified objects
are an infamous case in point. As the reader may verify, for any possible world w, the
extension of a predicate of the form V Q, where V is transitive verb and Q is a quantifier
phrase (in direct object position),can be given by the following equation:
{l5)\\VQ\\» = \x.\\Qr{\y.\\VV{y){x))
Like (14), equation (15) completely specifies the extension of a certain kind of
complex expression in terms of the extensions of its immediate parts; and it does so
in a perfectly general manner, for arbitrary worlds w. By definition, this is so for
all extensional constructions, which arc extensional precisely in that the extensions of
the expressions they combine determine the extensions of the resulting phrases,
thus giving rise to general equations like (14) or (15). In particular, then, the
combinatorial part (b) of the specification of extensions does not depend on the particular
circumstances envisaged. It thus turns out that the world dependence of extensions
is entirely a matter of lexical specification (a). Hence the role worlds play in the
determination of extensions can be seen as being restricted to the extensions of lexical
expressions - the rest is compositionality, captured by general equations like (14) and
(15). To the extent that the determination of extensions is the only role possible worlds
w play in (compositional extensional) semantics, they could be identified with
functions assigning extensions to lexical expressions. If A is a lexical expression, we would
thus have:
(16) ||A||" = FH.(A)
In (16) Fw is a function assigning to any expression in its domain the extension of that
expression at world w. What precisely is the domain of FM.? It has just been noted that
it only contains lexical expressions - but does it include all of them? A brief
reflection shows that this need not even be so; for some of the equations of the form (16)
do not depend on a particular choice of w. For instance, as noted above, under any
circumstances w the extension of a (sentence-) coordinating conjunction and is a binary
function on truth values; the same goes for other 'logical' particles:
(17) a. ||and||"=XM.Xv.Mx v [= (12e)]
b. ||or||**' = Xu.Xv. (u + v) - (u x v)
c. ||not||H' = Xw.l -u
The equations in (17) imply that, in any worlds w and w\ the extensions of and, or, and
not remain stable: ||and||H' = ||and||H', ||or||H' = ||or||H', and ||not||H' = ||not||""'.This being
so, the specifications of the world-dependent extensions of lexical expressions may safely
skip such logical words. A similar case can be made for quantificational determiners and
the is of identity:
(18) a. ||every||H=XP.X0.l-Pc0H [= (12a)]
b. ||no||H- = XP.X0.l-Pn 0 = 0-1
c. \\\s\\w = Xx.Xy.\-x=y-\
776
VII. Theories of sentence semantics
(ii-a) leveryl^'1 = XP. XQ. \- P c QH where PcUw and 0c Uw
(Hi) \A\Mw = Fw(A)MAeNE
(iv-a) \DN\M» = \D\M"(\N\M«)
if D N is a quantifier phrase, where D is a quantificational determiner and
TV is a count noun;
In order to complete (29), clauses (i) and (ii) would have to take care of all logical words
of E. We will now give a semantic characterisation of logical words that helps to draw the
line between the non-logical part NE- or NL in general - and the rest of the lexicon. We
have already seen that one characteristic feature of logical words is that their extension is
largely world-independent: it may depend on which individuals there are in the world but
it does not depend on any particular worldly facts. However, having a world-independent
intension (even stricto sensu) is not sufficient for being a logical word. Indeed, there
arc good reasons for taking the intensions of proper names as constant functions over
Logical Space.; cf. Kripke (1972), where terms with constant intensions are called rigid
designators. Still, a name like Jesus can hardly be called logical even if its bearer turned
out to be present in every possible world. What, then, makes a word, or an expression
in general, logical') The (rather standard) answer given here rests on the intuition that
the extension of a logical word can be described in purely structural terms. As a case in
point, the determiner no is logical because its extension may be described as the relation
that holds between two sets of individuals just in case they do not overlap - or, in terms
of characteristic functions: that function which, successively applied to two characteristic
functions (of sets of individuals), yields (the truth value) 1 just in case these functions do
not both assign 1 to any individual. If one thinks of functions as configurations of arrows
leading from arguments to values, then the extension of no - and that of a logical word
in general - can be described solely in terms of the abstract arrangement of these arrows,
without mentioning any particular individuals or other extensions - apart from the truth
values. In other words, the descriptions do not depend on the identity of the individuals,
which may be anything - or replaced by any other individuals. The idea, then, is to
characterise logical words as having extensions that arc stable under any replacement of
individuals. In order to make this intuition precise, a bit of notation should come in handy.
It is customary to classify the extensions of expressions according to whether they
derive (i) from the EBP, or (ii) by application of FFP: (i) the extensions of sentences
are said to be of type t, those of terms are of type e; (ii) if the extension of an expression
operates on extensions of some type a resulting in extensions of some type b, it is said to
be of type (a,b). In other words, (i) t is the type of truth values;e is the type of individuals
(or entities); (ii) (a,b) is the type of (total) functions from a to b. More precisely, t is a
canonical label of the set of truth values, etc.; 'f' is mnemonic for truth value, V
abbreviates entity. In this notation, the extensions of sentences, terms, coordinating conjunctions,
predicates, transitives, quantificational phrases, and determiners are of types t; e; (t, (t,t));
(e,t); (e,(e,t)); ((e,t),t); and ((e,t),((e,t),t)), respectively. It should be noted that, unlike the
extensions of the expressions, their types remain the same throughout Logical Space
(and thus across all material models).The function \L assigning to each expression of a
33. Model-theoretic semantics
777
language L the unique type of its extensions is called the type assignment of L; we take xL
to be part of the specification of the language L (and will usually suppress the subscript).
Now for the characterisation of logicality. Given (not necessarily distinct) possible
worlds w and w\ a replacement (of w by w') is a bijective function from the domain
of individuals of w to the domain of individuals of w' (which must therefore have the
same cardinality). Replacements may then be generalized from individuals (of type e)
to all types of extensions: given a replacement p (from w to w'), the following recursive
equations define corresponding functions pa, for each type a:
- pe = p;
- P, = l(0*0)*(M)) [= Xjc.jc, where 'jc' ranges over truth values];
" P(a.h) = Xf- l(P„M' P*00l /W = y\ [where lf ranges over functions of type (a,b)].
In other words, the generalised replacement leaves truth values untouched - because
they define the structure of extensions like characteristic functions - and maps functional
extensions to corresponding functions, replacing arrows from x to y by arrows between
substitutes; it is readily seen that, for any type a, pa is the identical mapping on the
extensions of type a if (and only if) pis the identical mapping on the domain of individuals.
Given the above generalisation of replacements from individuals to objects of arbitrary
types, a logical word A can be defined as one whose extensions cannot be affected by
replacements. More precisely, if / is an intension of type a (i.e. a function from W to
extensions of type a), then / is (replacement-) invariant just in case for any replacements
pandp'of why some world tv',it holds that pa(f(w)) = p'a(f(w));and logical words may
then be characterised as lexical expressions A with invariant intensions: PT(/l)(llA||H) =
p'T(/l)(||A||H),for any worlds w and replacements p and p' of w by some world w\
The definition implies that pT(A)( IIA ||H) = || A ||,v, whenever A is a logical word and p is a
replacement from a world w to itself (or any world with the same domain of individuals) -
which is a classical criterion of logicality, going back to Lindenbaum &Tarski (1935), and
rediscovered by a number of scholars since; cf. M acFarlane (2008: Section 5) for a survey.
Extensive treatments of replacements in (models of) Logical Space can be found in Fine
(1977) and Rabinowicz (1979).
Derivatively, we will also call extensions of types a invariant if they happen to be the
values of invariant intensions of type a. Invariant extensions are always of a peculiar,
combinatorial kind. In particular, and ignoring all too small universes (of cardinalities
1 and 2), extensions of type e are never invariant; there are only two invariant extensions
of type (e,t), viz. the empty one and the universe; and four of type (e,(e,t)), viz. identity,
distinctness, and the two trivial (empty and universal) binary relations;extensions of type
((e,t),t) are invariant just in case they classify sets of individuals according to their
cardinalities; etc. As a consequence, the extensions of logical words are severely limited by the
invariance criterion. Still, it takes more for an intension to be invariant than just having
invariant extensions; in particular, a function that assigns different invariant extensions
to worlds with the same domain cannot be invariant. More generally, if the domains
U and U' of two material models M and M' have the same cardinality, the invariance
condition on logical words of the second kind (ii) ensure that, intuitively speaking, they
have analogous extensions.Thus, e.g., a determiner whose extension relative to M is the
subset-relation on U will denote the subset on U' in M'.
33. Model-theoretic semantics
779
by (22a) is the case iff [w | |5|'M» = 1} c [w | |5'|'M»- = 1). Similar arguments apply to the
other sense relations defined above.
Let us define the L-intension \A\ of an expression A (of a language L) as the
function that assigns A's extension (according to MH) to each material model: |A|(JWH.)
= |A|-'M»-, i.e. \A\ = ^MH.. |A|-'M»-. Observation (22) shows that L-intensions are as good as
intensions when it comes to determining extensions; and we have seen that they may
also be used to reconstruct sense relations defined via intensions, basically on account
of (22b). Moreover, the Boolean structure of the set of propositions expressible in L (=
those that happen to be the intensions of a sentence of L) turns out to be isomorphic to
the Boolean structure of L-propositions (= the L-intcnsions of the sentences of L): as
the patient reader may want to verify, the mapping from ||5||H to iSl^tv is one-one
(infective) and preserves algebraic (Boolean) structure in that ||S||B" c ||S'||B" iff |S|'M»- c \S'\M»;
\\S\\wn ||5'||H'= |5|:M» n |5'|'M'setc.The tight relation between Logical Space and the set
of material models can also be gleaned from considering worlds that cannot be
distinguished by the expressions of a given language L:
Definition
If w and w' are possible worlds and L is a language, then w is L-indistinguishable
from tv-in symbols: w = Lw' - iff ||A||,V = ||A||H", for any expression A of L.
In general, L-indistinguishability will not collapse into identity.The special case in which
it does, will be marked for future reference:
Definition
A language L is discriminative iff no two distinct possible worlds w and w' are
L-indistinguishable.
Hence, in a discriminative language L any two distinct possible worlds w and w\can be
distinguished by some expression A: ||A||"'* ||A||H'';it is not hard to see that in this case
the material models stand in a one-one correspondence with Logical Space. In particular,
if L contains a term the referent of which at a given world is that world itself, L will
certainly be discriminative; arguably, the definite description the world is such a term,
thus rendering English discriminative. In any case, for almost all languages L, any two
L-indistinguishable worlds give rise to the same material model:
(23) If w = Lw\ then Mw = Mw..
As a conscquence,material models stand in a one-one correspondence to the equivalence
classes induced by L-indistinguishability (which is obviously an equivalence relation).
Since L-intensions and material models are suited to playing the role of worlds in
determining extensions and to characterising Boolean structure,one may wonder whether
they could replace worlds and intensions in general. In other words, is it possible to do
(extensional) semantics without Logical Space altogether? In particular, could meaning
be defined in terms of material models instead of possible worlds? Ontological thrift
seems to support this option: on the face of it, material models are rather down-to-earth
mathematical structures as compared to the dubious figments of imagination that
780
VII. Theories of sentence semantics
possible worlds may appear to the semantic layman. However, this impression is
mistaken: though material models are mathematical structures, they consist of precisely the
stuff that possible worlds are made of, viz. fictional individuals under fictional
circumstances. Still, replacing intensions with L-intensions may turn out to be, if not ontologi-
cally less extravagant, at least theoretically more parsimonious, sparing us a baroque
theory of Logical Space and its structure. However, this is not obvious either:
- So far material models have been defined in terms of Logical Space, since the
extension a material model assigns to an expression A depends on details about w. If
material models arc to replace worlds altogether, an independent characterisation for
them would have to be given. For instance, we have seen that no possible world
separates (9) from (10), particularly because there is no intensional difference between
woodchuck and groundhog. Hence these two nouns have the same L-intension: this
is a consequence of the above definition of a material model, relating it to the world
w on which it is based; it is reflected in the general equation (20m). Now, if the very
notion of a material model is going to be defined independently of Logical Space,
then the coincidence of |woodchuck|-'M>» and |groundhog| M» in all material models Mw
would have to be guaranteed in some other way, without reference to the worlds these
models are based on. Hence some restriction to the effect that no model can assign
different extensions to the nouns under consideration would have to be formulated.
And even if in this particular case the restriction should appear unwelcome, there arc
other cases in which similar restrictions are certainly needed - like the infamous
inclusion relation between the extensions of bachelor and unmarried. Arguably, at the end
of the day the restrictions to be imposed on the material models amount to a theory
of Logical Space; cf. Etchemendy (1990:23f) for a similar point.
- Whereas Logical Space is absolute, material models are language-dependent. In
particular (and disregarding neurotic cases), for any two distinct languages L and L', the
sets of material models of L and of L' will not overlap. Hence if material models were
to replace worlds in semantic theory, expressions from different languages would
always have different intensions. In particular, sentences would never be true under
the same circumstances, if circumstances corresponded to models.To make up for this
obvious inadequacy,some procedure for translating material models across languages
would be called for. For instance, a material model for English assigning a certain
set of individuals to the noun groundhog would have to be translated into a
material model for German assigning the same set to the noun Murmeltier. In general,
then, circumstances would correspond to equivalence classes of material models of
different languages. Obviously, at the end of the day the construction of these
equivalence classes again recapitulates the construction of Logical Space;presumably, a one-
one correspondence between material models and possible worlds requires an ideal
'language' of Logical Space, as envisaged by Wittgenstein (1922).
It thus appears that Logical Space can only be eliminated from (extensional) semantics
at the price of a theory of something very much like Logical Space. The upshot is that
neither ontological objcctionability nor theoretical economy arc sufficient motives for
reversing the conceptual priority of Logical Space over the set of material models. On
the other hand, this does not mean that material models ought to be dispensed with. On
33. Model-theoretic semantics
781
the contrary, they may be seen as compressions of possible worlds, reducing them to the
barest necessities of semantic analysis.
2.3. Intensionality
The adaptation of FFP to derive extensions for expressions other than sentences and
terms only works in so-called extensional contexts, i.e. (syntactic) constructions in which
extensionally equivalent parts may replace each other without affecting the extension of
the host expression. However, a number of environments prove to be non-extensional.
Clausal complements to attitude verbs like think and know are classical cases in point:
their contribution to the extension of the predicate cannot be their own extension which
is merely a truth value; if it were, materially equivalent clausal complements (= those
with identical truth values) would be substitutable salva veritate (= preserving the truth
[value] of the host sentence) - which they are not. E.g., (24) and (25) may differ in truth
value even if the the underlined clauses do not.
(24) John thinks Mary is home.
(25) John thinks Ann is pregnant.
This failure of substitutivity obviously blocks the derivation of the extension of think
via the extensional version of FFP: if it were a function / assigning the predicate
extension to the extension of the complement clause, then, as soon as the complement clauses
have the same truth value (and are thus co-extensional),/ would have to assign the same
extension to the predicates:
(26) If: || Mary is home||" = || Ann is pregnant ||",
then: /(|| Mary is home||") =/( || Ann is pregnant||").
The argument (26) shows that an attitude verb like think cannot have an extension /
that operates on the extension (= truth value) of its complement clause. It does not show
that attitude verbs do not have extensions, nor that FFP cannot be used to determine
them. Rather, since FFP seeks to derive the extension of an expression in terms of the
contribution made by its natural environment (or primary context), the lesson from
(26) ought to be that this contribution must consist in more than a truth value. On the
other hand, given Subst, it is safe to assume that it is the intension of the complement
clause; cf. Frege (1892). After all, by the BP, any two intensionally equivalent sentences
are synonymous and may thus replace each other in any (sentential) position - including
non-cxtcnsional ones - without affecting the truth conditions of the host sentence. In
particular, any two sentences of the forms T thinks S and T thinks S' (where T is a term)
will be synonymous if the sentences S and S' have the same intension. But then at any
possible world, the extensions of the predicates thinks S and thinks S' coincide, and thus
so do their intensions. In other words, there can be no difference in the extensions of
the predicates without a difference in the intensions of the complement clauses, which
is to say that the former functionally depend on the latter. Following the spirit of FFP,
then, one can think of the intensions of the embedded clauses as the contributions they
33. Model-theoretic semantics
783
extensions relative to material models of E may also contain clauses like the following,
where £ is a more inclusive fragment of English than E:
(iv-c) \VS\M» = |V\M»- (kw'. \S\*w)
if V S is a predicate, where V is an attitude verb and 5 is a clausal complement.
Equations like (iv-c) show that the programme of eliminating Logical Space in favour of
the set of material models cannot be upheld beyond the realm of extensional
constructions. For,unlike the intension of the embedded clause, Xtv'.^I^ISl'*1"', the corresponding
L-intension, \MW>. |S|'M»', cannot serve as an argument to the extension \V\M* of the
attitude verb: the latter is the value of a lexical extension assignment /v, which itself is a
component of Mw, which in turn is a member of the domain of XMW<. |5|'M»' - which cannot
be, for set-theoretic reasons.
It should be noted that the intension of the embedded clause is defined in terms of its
extensions relative to all other material models. While this does not present any technical
problem, it does make the material models less self-contained than their extensional
counterparts, which contain all the information needed to determine the extensions of all
expressions. If need be, this seeming defect can be remedied by tucking in all of Logical
Space and its inhabitants as components of material models:
Definition
Given a possible world w* and a language L, the intensional material model {for
L based on w*) is the quadruple J#H., = (W, U, w*, F) consisting of the set W of
all possible worlds; the domain function U assigning to each possible world w the
domain of individuals Uw; the world w* itself; and the lexical interpretation
function F which assigns to every non-logical lexical expression A of L the intension
of A at w.
Two complications arising from this definition arc worth noting:
- Universal and existential quantification over possible worlds are prime candidates for
logical operations of type ((s,t),t), the former yielding the truth value 1 only if applied
to Logical Space itself, whereas the latter is true of all but the empty set of possible
worlds. Since the extensions of lexical expressions need not be of extensional types,
the logicality border needs some adjustment. For the time being we will leave this
matter open, returning to it in Section 3.2 with a natural extension of the replacement
approach to logicality.
- Lest clauses like (iv-c) should make reference to anything outside the intensional
model, the lexical interpretation function is defined for all worlds w. As a
consequence, two distinct intensional material models only differ in their 3rd component,
and each determines the extensions of all expressions at all possible worlds. Hence the
procedure for determining the extensions is more general than expected; its precise
format will also be given in Section 3.2.
Intensional material models are rather redundant objects, which is why they are not
used in real-life semantics; we have mainly defined them here for future reference. Once
again, it is obvious that set theory precludes Logical Space as it appears in them, from
784
VII. Theories of sentence semantics
being replaced by the set of all intensional material models. On the other hand, its role
might be played by the set of extensional material models, i.e. those that only cover
the extensional part of L - containing only its lexical expressions with extensions of
extensional types, and its extensional syntactic constructions.
3. Model-theoretic semantics
3.1. Extensional model space
One rather obvious strategy for constructing (extensional) models within set theory is
to start with material models and replace their individuals by (arbitrary) set-theoretic
objects; as it turns out, this can be done by a straightforward generalisation of the
replacements used in the above characterisation of logical words (cf. Section 2.2). However,
since the individuals to be replaced also function as the building blocks of extensions, the
latter will have to be generalised first. Given any non-empty set {/, the U-extensions of
type e arc the elements of {/; the {/-extensions of type t arc the truth values; and if a and
b are extensional types, then the {/-extensions of type (a,b) are the (total) functions from
{/-extensions of type a to {/-extensions of type b. Hence, {/-extensions (of extensional
types) are to the elements of {/what ordinary extensions are to the individuals of a given
possible world; and clearly, ordinary extensions come out as {/^-extensions, where U
happens to be the domain of individuals in w.
Definition
Given a language L, a formal model (for L) is a pair !M = ({/, F) consisting of a
non-empty set U (= the universe of M) and a function F which assigns to every
non-logical lexical expression A of L a {/-extension of type *L(A).
Using the recursive procedure applied earlier, we can extend any bijection p between
any (not necessarily distinct) non-empty sets U and U of the same cardinality, to a family
of functions pa taking {/-extensions to corresponding {/-extensions (where a is an
extensional type): pe = p; pt(x) = x, if x is a truth value; and ph(f(x)) = p(a h)(f)(p „(■*)),
whenever/and x arc {/-extensions of types (a,b) and a, respectively. It is readily verified that
replacements assign structural analogues to the {/-extensions they are applied to. For
instance, if/is a {/-extension of some type a, then p(af)(/) is (i.e., characterises) the set of
all {/-extensions of the form pa(x), where x is a {/-extension of type a; in particular, and
given that pa is a bijection,/and p(a t)(f) are of the same cardinality. It is also worth noticing
that the values of invariant extensions are themselves invariant, in an obvious sense:
Observations
Let {/H. be the domain of individuals of some world w, X an invariant Uw-
extension of some type a, and p a bijection from Uw to a set U of the same
cardinality. Then:
(*) p'a{X) = p"a(pa(X)), for any bijections p' and p" from Uw and U to some set
{/' of the same cardinality, respectively;
(**) pa(X) = p'a(X), for any bijection p* from UH. to U.
The proofs of (*) and (**) are rather straightforward and thus left to the readers.
33. Model-theoretic semantics
787
this representation is a perfect match if the language L is discriminative. However, even
if the material models correspond to the possible worlds in a one-one fashion, there is
no guarantee that so do the classes of ersatz models in (34). More precisely, even if (35)
holds of any worlds w and w' and the corresponding material models Mw and Mw. the
analogous implication (36) about the latter and their representations in Ersatz Space
need not be true:
(35) If w * w\ then Mw * Mw.
(36) If Mw * Mw. then |3f Js * \MJS
In other words, distinct material models may be represented by the same ersatz models
- which will be the case precisely if L allows for distinct, but isomorphic material models
in the first place. Discriminativity does not exclude this possibility: it only implies some
extensional differences between any two material models; but these differences could be
made up for by the replacements used in representing material models in Ersatz Space. In
order to guarantee a perfect match between Logical Space and Ersatz Space, a stronger
condition on L is needed than discriminativity. The natural candidate is completeness,
which is defined like discriminativity, except that it is not based on indistinguishability
but the weaker notion of equivalence:
Definitions
If w and w' are possible worlds and L is a language, then w is L-equivalent to w' -
in symbols: w =*Lw' - iff ||5||H = ||5||H', for any declarative sentence S of L.
A language L is complete iff no two distinct possible worlds w and w' are
L-equivalent.
Unlike L-indistinguishability, L-equivalence is not affected by replacements in that
it only concerns the truth values of sentences rather than the extensions of arbitrary
expressions.
The following observations about arbitrary worlds w and w' and languages L are not
hard to establish:
(37) a. If w= Lw\ then w*= Lw'.
b. If L is complete, then L is discriminative.
c. If L is complete and w * w', then |^MJ_* l-^H.|=
In effect, (37c) says that, via the isomorphicity classes, the Ersatz Space of a complete
language matches Logical Space. Thus, completeness plays a similar role for the
adequacy of ersatz models as does discriminativity in the case of material models. However,
whereas discriminative languages are easy to find, completeness seems a much rarer
property. Of course, if a language is incomplete, its ersatz models could still match
Logical Space in the sense of (37a), but then again their isomorphicity classes may equally
well contain, and thus conflate, representations of distinct worlds. However, since the
differences between distinct worlds that correspond to the same ersatz models are -
by definition - inexpressible in the language under investigation, this imperfect match
788
VII. Theories of sentence semantics
between Logical Space and Ersatz Space does not necessarily conflict with the general
programme of replacing possible worlds with formal models. As far as its potential in
descriptive semantics goes, then, Ersatz Space does seem to earn its name. However,
as the reader will have noticed, its very definition still appears to be of no avail when
it comes to soothing ontological worries: by employing the concept of representation,
it depends on material models - and thus presupposes Logical Space. To overcome this
embarrassment, a definition of Ersatz Space is needed that does not rely on Logical
Space. The usual strategy is one of approximation, starting out from a maximally wide
model space and gradually restricting it by eliminating those formal models that do not
represent any material counterparts; the crucial point is that these restrictions be
formulated without reference to Logical Space, i.e. in the language of pure set theory, or
with reference only to urelements that are less dubious than the inhabitants of Logical
Space. We will refer to the natural starting point of this enterprise as L's Model Space,
and identify it with the class of all pure models (for a given language L), i.e. all formal
models whose universe is a pure set. It should be noted that the concept of a pure model
only depends on L's syntax and type assignment, both of which in principle may be given
in terms of pure set theory.
In order to exploit Model Space for semantic purposes, the procedure for determining
the extensions of expressions must be generalised from ersatz models to arbitrary pure
models. Hence, for any pure model M = (U, F), the extensions of logical expressions
need to be specified, as well as the combinations of extensions corresponding to the
syntactic constructions. As in the case of Logical Space and Ersatz Space, this should not
pose any particular problems. In fact, these extensions and combinations are invariant
and should only depend on the universe U. Thus, for those U that happen to be of the
same cardinality as the universe Uw of some material model Mw = (Uw, FB>), their
specifications may be taken over from Ersatz Space that is bound to contain at least some pure
model with universe U (and isomorphic to ^MH). For all other models they would have
to be suitably generalised. Thus, e.g., it is natural to define the extension of the English
determiner every as characterising the subset relation on a given universe U, even if
the latter is larger than any of the universes Uw encountered in Logical Space; similarly,
the extension of a quantifier phrase D N, where D is a determiner and N a count noun,
can be determined by functional application - which, suitably restricted, is an invariant
{/-extension of type ((e,t),t),(e,t);eic.
At the end of the day, then, the formal models resemble their material and ersatz
counterparts. In particular, the specification of extensions is strikingly similar:
(38) For any expression A of £ and any formal model M = (U, F) for E, the extension
of A relative to M - [A]-"*1 - is determined by the following induction (on the
grammatical complexity of A):
(i-a) [and]:M = Xu. Xv. u x v ... where ue {0,1} and ve {0,1}
(ii-a) [every]*1 = XP. XQ. \-P c Q-\ where P^U and Q^U
(Hi) [A]-w = FH.(A),ifAGyV£
(iv-a) [D N}M = \D\M ([S]:Kt)
33. Model-theoretic semantics
789
if D N is a quantifier phrase, where D is a quantificational determiner and
N is a count noun;
Given specifications of extensions in the style (38), one may start approximating the
Ersatz Space of a language L by restricting its Model Space. A common strategy to this
end is to eliminate models that have certain sentences come out false, viz. analytic
sentences that owe their truth to their very meaning. As a case in point, one may require of
appropriate models for English that the extension of (39) be the truth value 1. For the
sake of the example it may be assumed that the extension of a predicative adjective is a
set of individuals and that it is passed on to the predicate (Copula + Adjective); cf. Heim
&Kratzer(1998:6lff).
(39) No bachelor is married.
More generally, a set £ of sentences of L may be used to characterise the class of all pure
models M (for L) such that [5]|'M = 1, for all Sel. Ever since Carnap (1952), sentences
of a language L that are used to characterise the appropriateness of formal models L are
called meaning postulates; and we will refer to a set of sentences used for this purpose as
as a postulate system.
Obviously, the truth of a meaning postulate like (39) may guarantee the truth or
falsity of certain other sentences:
(40) Every bachelor is not married.
(41) Some bachelor is married.
Once (39) is adopted as a meaning postulate, there is no need to also take on (40),
or to rule out (41) separately, because the truth values of these two sentences come
out as intended in any formal model for English relative to which (39) is true.
Alternatively, if (40) is taken as a meaning postulate, (39) and (41) come out as desired. As this
example suggests, when it comes to approximating Ersatz Space by meaning postulates,
there need not be a general criterion for preferring one postulate system over another.
And it is equally obvious that the effect of a meaning postulate, or a postulate system,
may be achieved by other means. Thus, e.g., the effect of adopting (39) as a meaning
postulate may also be obtained by the following constraint on appropriate formal models
M = (U,F) for E:
(42) F(bachelor) n F(married) = 0
In other words, the effect of adopting (39) or (40) as a meaning postulate is to establish
a certain sense relation between the noun bachelor and the participle married (which
we take to be a lexical item, if only for the sake of the example). The relation is known
as incompatibility and holds between any two expressions just in case their extensions
of some type (a,t) cannot overlap. For future reference, we also define the relation of
compatibility as it holds between box and wooden - and in general between expressions
A and B if their extensions may overlap. Interestingly, compatibilities do not have to be
790
VII. Theories of sentence semantics
established by meaning postulates because they are guaranteed by the existence of any
model attesting the overlaps; hence compatibility will hold as long as Model Space is not
(erroneoulsy) narrowed down so as to exclude all of these models.
Obviously any meaning postulate S can be replaced by a corresponding constraint on
formal models to the effect that S come out true; and given a procedure for determining
the extensions of arbitrary expressions, this constraint can be given directly in terms of
the extensions of the (non-logical) lexical expressions S contains. On the other hand, not
every constraint on formal models needs to correspond to a meaning postulate, or even
a postulate system;cf.Zimmermann (1985) for a concrete example.
On top of restrictions on the range of possible extensions of lexical expressions, more
global constraints may rule out pure models that do not belong to Ersatz Space for
cardinality reasons. In general, then, the set-theoretic reconstruction of Logical Space consists
in formulating suitable constraints on the members of Model Space, i.e. the pure models.
Taken together, these constraints define a class DCof appropriate models (according to the
constraints). Ideally, this class should coincide with Ersatz Space.The closer DCgeis to this
ideal, and the more it coincides with Ersatz Space in semantically relevant aspects, the
more descriptively adequate will ^Cbe.Thus, e.g., the set of DC-valid sentences - those that
are true according to all DAeDC- should be the same as the set of sentences that are true
throughout Logical Space.
If two formal models are isomorphic, both represent precisely the same material
models, and hence there is no reason for ruling out (or counting in) one but not the other.
Appropriateness constraints are thus subject to the following meta-constraint on classes
DCoi appropriate models:
(43) If DA = D4\ then D4e DC iff D4'e DC.
In mathematical jargon, (43) says that the class DC of appropriate models is closed under
isomorphism: if D4 and D4' are isomorphic formal models (for some language L), then
DA is appropriate according to a given set of constraints just in case D4' is appropriate
according to the same constraints.
Two things about (43) are noteworthy. First, as a consequence of (31), constraints
formulated in terms of meaning postulates (or postulate systems) always satisfy (43):
any two isomorphic models make the same sentences true. Secondly, if the constraints
are jointly satsifiable at all (which they should be), the appropriate models always form
a proper class; this is so because for any formal model D4 = (U,F) and any set U' of the
same cardinality as U, there is an isomorphic model of the form D4' = (U\ F') - and
hence any pure set whatsoever will be a member of some universe of an appropriate
formal model.
One interesting consequence of (43) is that, no matter how closely a given system of
constraints and/or meaning postulates may approximate Logical Space, it will never be
able to pin down a specific model, let alone specific extensions of all expressions. In fact,
given any term A (of some language L) and any pure set x, there will be an appropriate
model D4x = (Ux, F) (for L), according to which [Aj^ = x; D4x may be constructed from
any given appropriate model D4 = (U, F) by replacing [A]|'M with x (and
simultaneously x with U, in case xe U) - thereby preserving appropriateness, by (43). This only
reflects the strategy of having inhabitants of Logical Space represented by arbitrary set-
thcorctic objects and is thus hardly surprising. However, a similar line of thought docs
33. Model-theoretic semantics
791
give rise to interesting consequences for the relation between reference and truth. This
is the gist of the following permutation argument, made famous by Putnam (1977,1980)
with predecessors including Newman (1928:137ff), Jeffrey (1964:82ff), Field (1975), and
Wallace (1977);see Devitt (1983), Lewis (1984), Abbott (1997),Williams (2005:89ff), and
Button (2011) for critical discussion of its impact.
Given a model M for a language L, the extensions according to any isomorphic model
M* may be characterised directly by an inductive specification in terms of M, without
mention of M*. For the present purposes, it suffices to consider the special case in which
M = (U,F) and M* = (£/*, F*) are models for our extensional fragment of English and
share their universe U = U*. More specifically, since any bijection n on U is a model-
isomorphism from M to a model M* = (U*,F*), the following induction characterises the
extensions relative to M* entirely in terms of M:
(44) For any expression A of £ and any formal model M = (U,F) for £, the permuted
extension of A relative to M- IIAIIM - is determined by the following induction (on
the grammatical complexity of A):
(i-a) Hand//™ = \u.\v. u x v ... where ue {0,1} and ve {0,1}
(ii-a) //every//^ = XP XQ. \-P c Q-\ where Pcf/ and Q^U
(Hi) IIAll™ = 7tT(/l)(FH.(A)), if As NE
(iv-a) IID Nil™ = IIDII'M (IIAII'M)
if D N is a quantifier phrase, where D is a quantificational determiner and N
is a count noun;
Obviously, IIAllM = rcT(/t)([A]|'M) = [Aj^*, for any expression A of £. Moreover, there
is a striking similariy between (44) and the specification (38) of the extensions relative
to !M. In fact, the only difference lies in clause (Hi): according to (44), non-logical
words are assigned the 7t-image of the {/-extension they are assigned according to (38).
The formulation of (44m) may suggest that the values IIAII'M somehow depend on the
permutation k when of course, they are perfectly independent {/-extensions in their
own right - just like the values [Afl^: any instantiation of either (38m) or (44m) will
assign some specific {/-extension to some specific expression. In particular, then, (44)
is no more complicated or roundabout than (38); it is just different. Yet, as far as the
specification of the truth values of sentences 5 of L is concerned, the two agree, since
nsirM = *T([s]TM) = [syM.
The construction (44) can also be carried out if M = Mw is a material model. Of
course, in this case the permutation model M* cannot be expected to be a material
model too, but then again it is not needed for the definition of the //A/^-valucs anyway.
All that is needed is a permutation k of the domain of w. (44) will then deliver exactly
the same truth valuation of L as (38).The comparison between (44) and (38) thus
illustrates that truth is independent of reference in that the latter is not determined by the
former. In particular, then, although the extensions of terms help determining the truth
values of the sentences in which they occur, this role hopelessly undcrdctcrmincs them:
792
VII. Theories of sentence semantics
if reference is merely contribution to truth, then reference is arbitrary;else, reference has
to be grounded independently.
3.2. Intensional Model Space
Beyond the realm of extensionality the strategy of approximating Logical Space in terms
of formal models is bound to reach its limits. Even when restricted to the extensional
part of a language L and constrained by meaning postulates or otherwise, Model Space
is a proper class and thus cannot serve as the domain of any function. As a consequence,
the set-theoretic reconstruction of intensional semantics cannot have intensional models
assign L-intensions to expressions. However, this does not mean that formal models
cannot be adapted to intensional languages. Rather, they would each have to come
with their own set-theoretic reconstruction of Logical Space, consisting of an arbitrary
(non-empty) set W representing the worlds and a system 0 of arbitrary (non-empty)
universes, i.e. one set per world (representation):
Definition
A formal ontology is a pair (W, 0), where W is a non-empty set (the worlds
according to (W, 0) and 0 is a function with domain W such that 0(w) is a
non-empty set whenever we W [= the individuals of w, according to (W, 0)].
In the literature on modal logic, the requirement of non-emptiness is sometimes
weaker, applying to the union of all domains rather than each individual domain; cf. Fine
(1977: 144). Given a formal ontology (W, 0), one may generalise {/-extensions from
extensional to arbitrary types. Due to the world-dependence of individual domains, these
generalised extensions also depend on which element w of W is chosen. More precisely,
(W, 0, w)-extensions may be defined by induction on (the complexity of) their types:
(i) (W, 0, tv)-extensions of type t are truth values; (ii) (W, 0, tv)-cxtensions of type e are
members of 0(w); (iii) whenever a and b arc types, (W, 0, tv)-cxtcnsions of type (a,b)
are functions assigning (W, 0, tv)-extensions of type b to all (W, 0, tv)-extensions of type
a; and (iv) (W, 0, tv)-extensions of type (s,a) are functions / with domain W assigning
to any w'e W a (W, 0, tv')-extensions of type a (where, again, a is a type). At first glance,
this definition may appear gruesome, but closer inspections shows that it merely mimics
(a staightforward generalisation of) the replacements defined in the previous part; we
invite the reader to check this for her- or himself.
(W, 0, tv)-extensions give rise to (W, 0)-intensions of any type a as functions assigning
a (W, 0, tv)-extension of type a to any we W. We thus arrive at the following:
Definition
Given a language L, an intensional formal model (for L) is a quadruple
M = (W,0, w*,F), where (W, 0) is a formal ontology; a member w* of W
(= the actual world according to !M); and a function F which assigns to every non-
logical lexical expression A of L a (W, 0)-intension of type tt(A) (= the lexical
interpretation function according to M).
Moreover we say that an intensional formal model J^ =(W, U, w*,F) is based on the
ontology (W, 0) and call the members of W and (J U(w), the worlds and individuals
u-elV
800
VII. Theories of sentence semantics
3.4. Mathematical Model Theory
Model-theoretic interpretation originated in mathematical logic, where it is not used
to approximate Logical Space but rather accounts for semantic variation among
various (fragments of) formal languages; cf. Etchemendy (1990) for more on this difference
in perspective, which incidentally does not affect any mathematical technicalities and
results.Though a large part of the latter concern certain varieties of first-order predicate
logic, they may have repercussions on model-theoretic semantics of natural language,
especially if they concern questions of expressiveness.The two most fundamental results
of mathematical model theory - usually derived as corollaries to completeness of first-
order logic - are cases in point:
- According to the Compactness Theorem, a set Z of (closed) first order formulae can
only imply a particular (closed first-order) formula if the latter already follows from
a finite subset of I. As a consequence, no first-order formula can express that the
universe is infinite (though it may imply this) - because such a formula would be
implied by the set of (first-order expressible) sentences that say that there are at least
n objects, but not by any of its finite subsets.
- According to the Lowenheim-Skolem Theorem, a set Z of (closed) first order
formulae that is true in a model with an infinite universe, is true in models with universes
of arbitrary infinite cardinalities. In particular, no collection of first-order formulae
can express that the universe has a particular infinite cardinality.
According to a fundamental result of abstract model theory due to Lindstrom (1969),
the above two theorems characterise first-order logic in that any language exceeding its
expressive power is bound to fail at least one of them.There is little doubt that natural
languages have the resources to express infinity; if they can also be shown to be able
to express, say, the countability of the universe, Lindstrom's Theorem would imply that
the notion of validity in them cannot be axiomatised - cf. Ebbinghaus et al. (1994) for
the technical background. The modcl-thcorctical study of higher-order logics is
comparatively less well investigated, a major theme being the distinction between standard
and non-standard models, the latter being introduced to restore axiomatisability - at
the price of losing the kinds of idealisations in the set-theoretic construction of
denotations mentioned and motivated in Section 2; a survey of the most fundamental results of
higher-order model theory can be found in van Benthem & Doets (1983).
4. References
Abbott, Barbara 1997. Models, truth and semantics Linguistics & Philosophy 20,117-138.
Abbott, Barbara & Larry Hauser 1995. Realism, Model Theory, and Linguistic Semantics. Paper
presented at the Annual Meeting of the Linguistic Society of America. New Orleans, January 8,
1995. http://cogpriiits.Org/256/l/realism.litni. December 9,2010.
Benthem, Johan van & Kees Doets 1983. Higher-order logic. In: D. M. Gabbay & F. Guenthner
(eds.). Handbook of Philosophical Logic, vol. I. Dordrecht: Reidel, 275-329.
Button, Tim 2011. The metamathematics of Putnam's model-theoretic arguments Erkenntnis 74,
321-349.
Carnap, Rudolf 1947. Meaning and Necessity. Chicago. IL:The University of Chicago Press.
Carnap, Rudolf 1952. Meaning postulates Philosophical Studies 3,65-73.
Casanovas, Enrique 2007. Logical operations and invariance. Journal of Philosophical Logic 36,
33-60.
33. Model-theoretic semantics
801
Cresswell, Maxwell J. 1982. The autonomy of semantics In: S. Peters & E. Saarinen (eds). Processes,
Beliefs, and Questions. Dordrecht: Kluwer, 69-86.
Devitt, Michael 1983. Realism and the renegade Putnam: A critical study of meaning and the moral
sciences Nous 17,291-301.
Dowty, David 1979. Word Meaning and Montague Grammar. Dordrecht: Kluwer.
Ebbinghaus, Heinz-Dieter, Jorg Flum & Wolfgang Thomas 1994. Mathematical Logic. 2nd edn.
Berlin: de Gruyter.
Etchemendy, John 1990. The Concept of Logical Consequence. Cambridge. MA: The MIT Press.
Field, Hartry 1975. Conventionalism and instrumental ism in semantics Nous 9,375-405.
Fine, Kit 1977. Properties, propositions, and sets Journal of Philosophical Logic 6,135-191.
Frege, Gottlob 1884/1950. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersu-
chung iiber den Begriff der Zahl. Breslau: W. Koebner 1884. English translation in J. Austin.
The Foundations of Arithmetic: A Logico-mathematical Enquiry into the Concept of Number.
1st edn. Oxford: Blackwell, 1950.
Frege, Gottlob 1892/1980. Uber Sinn und Bedeutung. Zeitschrift fur Philosophic undphilosophische
Kritik 100,25-50. English translation in: P. Geach & M. Black (cds). Translations from the
Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1980,56-78.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum & Ivan A. Sag 1985. Generalized Phrase Structure
Grammar. Cambridge, MA:The MIT Press.
Heim, Irene & Angelika Kratzer 1998. Semantics in Generative Grammar. Oxford: Oxford
University Press.
Hodges, Wilfried 2001. Formal features of compositionality. Journal of Logic, Language and
Information 10,7-28.
Janssen,Theo M.V. 1983. Foundations and Applications of Montague Grammar. Doctoral
dissertation. University of Amsterdam.
Jeffrey, Richard J. 1964. Review of Sections IV-V of E. Nagel et al. (eds.). Logic, Methodology, and
Philosophy of Science (Stanford, CA 1962). Journal of Philosophy 61,79-88.
Kripke. Saul A. 1972. Naming and necessity. In: D. Davidson & G. Harman (eds.). Semantics of
Natural Language. Dordrecht: Kluwer, 253-355.
Kupffer, Manfred 2007. Contextuality as supervenience. In: M. Aloni, P. Dekker & F. Roelofsen
(cds). Proceedings of the Sixteenth Amsterdam Colloquium. Amsterdam: ILLC, 139-144. http://
www.illc.uva.nl/AC2007/uploaded_files/proceedings-AC07.pdf. December 9,2010.
Lasersohn. Peter 2000. Same, models and representation. In: B. Jackson & T. Matthews (eds.).
Proceedings from Semantics and Linguistic Theory (=SALT) X. Ithaca, NY: Cornell University, 83-97.
Lasersohn, Peter 2005. The temperature paradox as evidence for a presuppositional analysis of
definite descriptions Linguistic Inquiry 36.127-134.
Lewis, David K. 1984. Putnam's paradox. Australasian Journal of Philosophy 62,221-236.
Lindenbaum, Adolf & Alfred Tarski 1935/1983. Uber die Beschranktheit der Ausdrucksmittel
deduktiver Theorien. Ergebnisse eines mathematischen Kolloquiums 7, 15-22. English
translation in: J. Corcoran (ed.). Logic, Semantics, Metamathematics. 2nd edn. Indianapolis, IN: Hackett
Publishing Company, 1983,384-392.
Lindstrom, Per 1969. On extensions of elementary logic. Theoria 35,1-11.
Lobner, Sebastian 1979. Intensionale Verben und Funktionalbegriffe.Tub'mgen: Niemeyer.
Macbeath, Murray 1982. "Who was Dr Who's Father?". Synthese 51,397-430.
MacFarlane, John 2008. Logical constants In: E. Zalta (ed.). The Stanford Encyclopedia of
Philosophy (Fall 2008 Edition). Online Publication. Stanford Encyclopedia of Philosophy, http://
plato.stanford.edu/cntrics/logical-constants December 9,2010.
Machover, Moshe 1994. Review of G. Slier. The Bounds of Logic. A Generalized Viewpoint
(Cambridge, MA 1991). British Journal for the Philosophy of Science 45,1078-1083.
Mendelson, Elliott 1997. An Introduction to Mathematical Logic. 4th edn. London: Chapman &
Hall.
Menzel, Christopher 1990. Actualism.ontological commitment, and possible world semantics
Synthese 85,355-389.
804
VII. Theories of sentence semantics
These developments are accompanied by a newly found interest in the linguistic and
ontological foundation of events.To the extent that more attention is paid to less typical
events than the classical "Jones buttering a toast" or "Brutus stabbing Caesar", which
always come to the Davidsonian semanticist's mind first, there is a growing awareness
of the vagueness and incongruities lurking behind the notion of events and its use in
linguistic theorizing. A particularly controversial case in point is the status of states.The
question of whether state expressions can be given a Davidsonian treatment analogous
to process and event expressions (in the narrow sense) is still open to debate.
All in all, Davidsonian event arguments have become a very familiar "all-purpose"
linguistic instrument over the past decades, and recent years have seen a continual
extension of possible applications far beyond the initial focus on verb semantics and adver-
bials also including a growing body of psycholinguistic studies that aim to investigate the
role of events in natural language representation and processing.
This article will trace the motivation, development, and applications of event
semantics during the past decades and provide a picture of current views on the role of events
in natural language meaning. Section 2 introduces the classical Davidsonian paradigm,
providing an overview of its motivation and some classical and current applications, as
well as an ontological characterization of events and their linguistic diagnostics.
Section 3 discusses the Neo-Davidsonian turn with its broader perspective on eventualities
and the use of thematic roles.This section also includes some notes on decompositional
approaches to event semantics. Section 4 turns to the stagc-lcvcl/individual-lcvcl
distinction outlining the basic linguistic phenomena that are grouped together under this label
and discussing the event semantic treatments that have been proposed as well as the
criticism they have received. Section 5 returns to ontological matters by reconsidering
the category of states and asking whether indeed all of them, in particular the referents
introduced by so-called "statives", fulfill the criteria for Davidsonian eventualities. And,
finally, section 6 presents some experimental results of recent psycholinguistic studies
that have tested the insights of Davidsonian event semantics.The article concludes with
some final remarks in section 7.
2. Davidsonian event semantics
2.1. Motivation and applications
On the standard view in Pre-Davidsonian times, a transitive verb such as to butter in (la)
would be conceived of as introducing a relation between the subject Jones and the direct
object the toast, thus yielding the logical form (lb).
(1) a. Jones buttered the toast,
b. butter (jones, the toast)
The only individuals that sentence (la) talks about according to (lb) are Jones and the
toast. As Davidson (1967) points out such a representation does not allow us to refer
explicitly to the action described by the sentence and specify it further by adding, e.g.,
that Jones did it slowly, deliberately, with a knife, in the bathroom, at midnight. What,
asks Davidson, does it refer to in such a continuation? His answer is that action verbs
introduce an additional hidden event argument that stands for the action proper. Under
34. Event semantics
805
this perspective, a transitive verb introduces a three-place relation holding between the
subject, the direct object and an event argument. Davidson's proposal thus amounts to
replacing (lb) with the logical form in (lc).
(1) c. 3e [butter (jones, the toast, e)]
This move paves the way for a straightforward analysis of adverbial modification.
If verbs introduce a hidden event argument, then standard adverbial modifiers may
be simply analyzed as first-order predicates that add information about this event; cf.
article 54 (Maienborn & Schafer) Adverbs and adverbials on the problems of alternative
analyses and further details of the Davidsonian approach to adverbial modification.Thus,
Davidson's classical sentence (2a) takes the logical form (2b).
(2) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. 3e [butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) &
at (e, midnight)]
According to (2b), sentence (2a) expresses that there was an event of Jones buttering the
toast, and this event was located in the bathroom. In addition, it was performed by using
a knife as an instrument, and it took place at midnight. Thus, the verb's hidden event
argument provides a suitable target for adverbial modifiers. As Davidson points out, this
allows adverbial modifiers to be treated analogously to adnominal modifiers: Both target
the referential argument of their verbal or nominal host.
Adverbial modification is thus seen to be logically on a par with adjectival modification:
what adverbial clauses modify is not verbs but the events that certain verbs introduce.
Davidson (1969/1980: 167)
One of the major advances achieved through the analysis of adverbial modifiers as
first-order predicates on the verb's event argument is its straightforward account of the
characteristic entailment patterns of sentences with adverbial modifiers. For instance, we
want to be able to infer from (2a) the truth of the sentences in (3). On a Davidsonian
account this follows directly from the logical form (2b) by virtue of the logical rule of
simplification; cf. (3*). See, e.g., Eckardt (1998,2002) on the difficulties that these
entailment patterns pose for a classical operator approach to adverbials such as advocated by
Thomason & Stalnaker (1973), see also article 54 (Maienborn & Schafer) Adverbs and
adverbials.
(3) a. Jones buttered the toast in the bathroom at midnight.
b. Jones buttered the toast in the bathroom.
c. Jones buttered the toast at midnight.
d. Jones butterd the toast with the knife.
e. Jones buttered the toast.
(3') a. 3e [butter (jones, the toast,e) & in (e, the bathroom) & at (e, midnight)]
b. 3e [butter (jones, the toast, e) & in (e, the bathroom)]
c. 3e [butter (jones, the toast, e) & at (e, midnight)]
806
VII. Theories of sentence semantics
d. 3e [butter (jones, the toast, e) & instr (e, the knife)]
e. 3e [butter (jones, the toast, e)]
Further evidence for the existence of hidden event arguments can be adduced from
anaphoricity, quantification and definite descriptions among others: Having introduced
event arguments, the anaphoric pronoun it in (4) may now straightforwardly be analyzed
as referring back to a previously mentioned event, just like other anaphoric expressions
take up object referents and the like.
(4) It happened silently and in complete darkness.
Hidden event arguments also provide suitable targets for numerals and frequency
adverbs as in (5).
(5) a. Anna has read the letter three times/many times,
b. Anna has often/seldom/never read the letter.
Krifka (1990) shows that nominal measure expressions may also be used as a means of
measuring the event referent introduced by the verb. Krifka's example (6) has a reading
which does not imply that there were necessarily 4000 ships that passed through the lock
in the given time span but that there were 4000 passing events of maybe just one single
ship. That is, what is counted by the nominal numeral in this reading are passing events
rather than ships.
(6) 4000 ships passed through the lock last year.
Finally, events may also serve as referents for definite descriptions as in (7).
(7) a. the fall of the Berlin Wall
b. the buttering of the toast
c. the sunrise
See,e.g.,Bierwisch (1989),Grimshaw (1990),Zucchi (1993),Ehrich & Rapp (2000),Rapp
(2007) for event semantic treatments of nominalizations; cf. also article 51 (Grimshaw)
Deverbal nominalization. Engelberg (2000: lOOff) offers an overview of the phenomena
for which event-based analyses have been proposed since Davidson's insight was taken
up and developed further in linguistics.
The overall conclusion that Davidson invites us to draw from all these linguistic data
is that events are things in the real world like objects; they can be counted, they can be
anaphorically referred to, they can be located in space and time, they can be ascribed
further properties. All this indicates that the world, as we conceive of it and talk about it,
is apparently populated by such things as events.
2.2. Ontological properties and linguistic diagnostics
Semantic research over the past decades has provided impressive confirmation of
Davidson's (1969/1980:137) claim that "there is a lot of language we can make
systematic sense of if we suppose events exist". But, with Quine's dictum "No entity without
34. Event semantics
807
identity!" in mind, we have to ask: What kind of things are events? What are their
identity criteria? And how are their ontological properties reflected through linguistic
structure?
None of these questions has received a definitive answer so far, and many versions
of the Davidsonian approach have been proposed, with major and minor differences
between them. Focussing on the commonalities behind these differences, it still seems
safe to say that there is at least one core assumption in the Davidsonian approach that
is shared more or less explicitly by most scholars working in this paradigm. This is that
eventualities are, first and foremost, particular spatiotemporal entities in the world.
As LePore (1985:151) puts it,"[Davidson's] central claim is that events are concrete
particulars - that is, unrepeatable entities with a location in space and time." As the past decades'
discussion of this issue has shown (see, e.g., the overviews in Lombard 1998, Engelberg
2000, and Pianesi & Varzi 2000), it is nevertheless notoriously difficult to turn the above
ontological outline into precise identity criteria for eventualities. For illustration, I will
mention just two prominent attempts.
Lemmon (1967) suggests that two events are identical just in case they occupy the
same portion of space and time. This notion of events seems much too coarse-grained,
at least for linguistic purposes, since any two events that just happen to coincide in
space and time would, on this account, be identical.To take Davidson's (1969/1980:178)
example, we wouldn't be able to distinguish the event of a metal ball rotating around its
own axis during a certain time from an event of the metal ball becoming warmer during
the very same time span. Note that we could say that the metal ball is slowly becoming
warmer while it is rotating quickly, without expressing a contradiction.This indicates that
we are dealing with two separate events that coincide in space and time.
Parsons (1990), on the other hand, attempts to establish genuinely linguistic identity
criteria for events: "When a verb-modifier appears truly in one source and falsely in
another, the events cannot be identical." (Parsons 1990: 157). This, by contrast, yields a
notion of events that is too fine-grained; see, e.g., the criticism by Eckardt (1998: § 3.1)
and Engelberg (2000:221-225). What we are still missing, then, are ontological criteria of
the appropriate grain for identifying events.This is the conclusion Pianesi & Varzi (2000)
arrive at in their discussion of the ontological nature of events:
[...] the idea that events are spatiotemporal particulars whose identity criteria are
moderately thin [...] has found many advocates both in the philosophical and in the linguistic
literature. [...] But they all share with Davidson's the hope for a 'middle ground' account of
the number of particular events that may simultaneously occur in the same place.
Pianesi & Varzi (2000:555)
We can conclude, then, that the search for ontological criteria for identifying events will
probably continue for some time. In the meantime, linguistic research will have to build
on a working definition that is up to the demands of natural language analysis.
What might also be crucial for our notion of events (besides their spatial and temporal
dimensions) is their inherently relational character. Authors like Parsons (1990), Carlson
(1998), Eckardt (1998), and Asher (2000) have argued that events necessarily involve
participants serving some function. In fact, the ability of Davidsonian analyses to make
explicit the relationship between events and their participants, either via thematic roles
or by some kind of decomposition (sec sections 3.2 and 3.3 below), is certainly one of
the major reasons among linguists for the continuing popularity of such analyses. This
34. Event semantics
809
elaborate on the internal functional set-up of events. This was already illustrated by our
sentence (2);see article 54 (Maienborn & Schafer) Adverbs and adverbials for details on
the contribution of manner adverbials and similar expressions that target the internal
structure of events.
This is, in a nutshell, the Davidsonian view shared (explicitly or implicitly) by current
event-based approaches. The diagnostics in (10) provide a suitable tool for detecting
hidden event arguments and may therefore help us to assess the Neo-Davidsonian claim
that event arguments are not confined to action verbs but have many further sources, to
which we will turn next.
3. The Neo-Davidsonian turn
3.1. The notion of eventualities
Soon after they took the linguistic stage, it became clear that event arguments were
not to be understood as confined to the class of action verbs, as Davidson originally
proposed, but were likely to have a much wider distribution. A guiding assumption
of what has been called the Neo-Davidsonian paradigm, developed particularly by
Higginbotham (1985, 2000) and Parsons (1990, 2000), is that any verbal predicate may
have such a hidden Davidsonian argument as illustrated by the following quotations
from Higginbotham (1985) and Chierchia (1995).
The position E corresponds to the 'hidden' argument place for events, originally suggested
by Donald Davidson (1967).There seem to be strong arguments in favour of. and little to be
said against, extending Davidson's idea to verbs other than verbs of change or action. Under
this extension, statives will also have impositions.
Higginbotham (1985:10)
A basic assumption I am making is that every VP, whatever its internal structure and
aspectual characteristics, lias an extra argument position for eventualities, in the spirit of
Davidson's proposal. [...] In a way, having this extra argument slot is part of what makes
something a VP, whatever its inner structure.
Chierchia (1995:204)
Note that already some of the first commentators on Davidson's proposal took a
similarly broad view on the possible sources for Davidson's extra argument. For instance,
Kim (1969:204) notes: "When we talk of explaining an event, we are not excluding what,
in a narrower sense of the term, is not an event but rather a state or a process." So it was
only natural to extend Davidson's original proposal and combine it with Vendler's (1967)
classification of situation types into states, activities, accomplishments and achievements.
In fact, the continuing strength and attractiveness of the overall Davidsonian enterprise
for contemporary linguistics rests to a great extent on the combination of these two
congenial insights: Davidson's introduction of an ontological category of events present in
linguistic structure, and Vendler's subclassification of different situation types according
to the temporal-aspectual properties of the respective verb phrases;cf., e.g., Pirion (1997),
Engelberg (2002), Saeb0 (2006) for some more recent event semantic studies on the
lexical and/or aspectual properties of certain verb classes.
810
VII. Theories of sentence semantics
The definition and delineation of events (comprising Vendler's accomplishments and
achievements), processes (activities in Vendler's terms) and states has been an extensively
discussed and highly controversial topic of studies particularly on tense and aspect. The
reader is referred to the articles 48 (Filip) Aspectual class and Aktionsart and 97 (Smith)
Tense and aspect. For our present purposes the following brief remarks shall suffice:
First, a terminological note:The notion "event" is often understood in a broad sense,
i.e. as covering, besides events in a narrow sense, processes and states as well. Bach (1986)
has introduced the term "eventuality" for this broader notion of events. In the remainder
of this article I will stick to speaking of events in a broad sense unless explicitly indicated
otherwise. Other labels for an additional Davidsonian event argument that can be found
in the literature include "spatiotemporal location" (e.g. Kratzer 1995) and "Davidsonian
argument" (e.g. Chierchia 1995).
Secondly, events (in a narrow sense), processes, and states may be characterized in
terms of dynamicity and telicity. Events and processes are dynamic eventualities, states
are static. Furthermore, events have an inherent culmination point, i.e., they are telic,
whereas processes and states, being atelic, have no such inherent culmination point; see
Krifka (1989,1992,1998) for a mereological characterization of events and cf. also Dowty
(1979), Rothstein (2004).
Finally, accomplishments and achievements, the two subtypes of events in a narrow
sense, differ wrt. their temporal extension. Whereas accomplishments such as expressed
by read the book, eat one pound of cherries, run the 100m final have a temporal extension,
achievements such as reach the summit, find the solution, win the 100m final are
momentary changes of state with no temporal duration. See Kennedy & Levin (2008) on so-
called degree achievements expressed by verbs like to lengthen, to cool, etc. The variable
aspectual behavior of these verbs - atelic (permitting the combination with a for-PP) or
telic (permitting the combination with an m-PP) - is explained in terms of the relation
between the event structure and the scalar structure of the base adjective; cf. (12).
(12) a. The soup cooled for 10 minutes. (atelic)
b. The soup cooled in 10 minutes. (telic)
Turning back to the potential sources for Davidsonian event arguments, in more recent
times not only verbs, whether eventive or stative, have been taken to introduce an
additional argument, but other lexical categories as well, such as adjectives, nouns and also
prepositions. Motivation for this move comes from the observation that all
predicative categories provide basically the same kind of empirical evidence that motivated
Davidson's proposal and thus call for a broader application of the Davidsonian analysis.
The following remarks from Higginbotham & Ramchand (1997) are typical of this view:
Once we assume that predicates (or their verbal, etc. heads) have a position for events,
taking the many consequences that stem there from, as outlined in publications originating
with Donald Davidson (1967), and further applied in Higginbotham (1985, 1989), and
Terence Parsons (1990), we are not in a position to deny an event-position to any predicate;
for the evidence for, and applications of, the assumption are the same for all predicates
Higginbotham & Ramchand (1997:54)
As these remarks indicate, nowadays Neo-Davidsonian approaches often take event
arguments to be a trademark not only of verbs but of predicates in general. We will come
back to this issue in section 5 when we reconsider the category of states.
34. Event semantics
811
3.2. Events and thematic roles
The second core assumption of Neo-Davidsonian accounts, besides assuming a broader
distribution of event arguments, concerns the way of relating the event argument to the
predicate and its regular arguments. While Davidson (1967) introduced the event
argument as an additional argument to the verbal predicate thereby augmenting its arity,
Neo-Davidsonian accounts use the notion of thematic roles for linking an event to its
participants. Thus, the Neo-Davidsonian version of Davidson's logical form in (2b) for
the classical sentence (2a), repeated here as (13a/b) takes the form in (13c).
(13) a. Jones buttered the toast in the bathroom with the knife at midnight.
b. 3e [butter (jones, the toast, e) & in (e, the bathroom) & instr (e, the knife) &
at (e, midnight)]
c. 3e [butter (e) & agent (e, jones) & patient (e, the toast) & in (e, the bathroom)
& instr (e, the knife) & at (e, midnight)]
On a Neo-Davidsonian view, all verbs are uniformly one-place predicates ranging over
events. The verb's regular arguments are introduced via thematic roles such as agent,
patient, experiencer, etc., which express binary relations holding between events
and their participants; cf. article 18 (Davis) Thematic roles for details on the nature,
inventory and hierarchy of thematic roles. Note that due to this move of separating the
verbal predicate from its arguments and adding them as independent conjuncts, Neo-
Davidsonian accounts give up to some extent the distinction between arguments and
modifiers. At least it isn't possible anymore to read off the number of arguments a verb
has from the logical representation. While Davidson's notation in (13b) conserves the
argument/modifier distinction by reserving the use of thematic roles for the integration
of circumstantial modifiers, the Neo-Davidsonian notation (13c) uses thematic roles both
for arguments such as the agent Jones as well as for modifiers such as the instrumental
the knife;sec Parsons (1990: 96ff) for motivation and defense and Bierwisch (2005) for
some criticism on this point.
3.3. Decompositional event semantics
The overall Neo-Davidsonian approach is also compatible with adopting a
decompositional perspective on the semantics of lexical items, particularly of verbs; cf. articles 16
(Bierwisch) Semantic features and primes and 17 (Engelberg) Frameworks of
decomposition. Besides a standard lexical entry for a transitive verb such as to close in (14a) that
translates the verbal meaning into a one-place predicate close on events, one might also
choose to decompose the verbal meaning into more basic semantic predicates like the
classical cause, become etc; cf. Dowty (1979). A somewhat simplified version of Parsons'
"subatomic" approach is given in (14b); cf. Parsons (1990:120).
(14) a. to close: Xy Xx Xe [close (e) & agent (e, x) & theme (e, y)]
b. to close: Xy Xx Xe [agent (e, x) & theme (e, y) & 3e' [cause (e, e') & theme
(e',y) & 3s [become (e',s) & closed (s) & theme (s,y)]]]
According to (14b) the transitive verb to close expresses an action c taken by an agent
x on a theme y which causes an event e' of y changing into a state s of being closed.
On this account a causative verb introduces not one hidden event argument but three.
34. Event semantics
813
used and combined with further semantic instruments such as decompositional and/or
thematic role approaches.
4. The stage-level/individual-level distinction
4.1. Linguistic phenomena
A particularly prominent application field for contemporary event semantic research
is provided by the so-called stage-level/individual-level distinction, which goes back to
Carlson (1977) and, as a precursor, Milsark (1974,1977). Roughly speaking, stage-level
predicates (SLPs) express temporary or accidental properties, whereas individual-level
predicates (ILPs) express (more or less) permanent or inherent properties; some
examples are given in (17) vs. (18).
(17) Stage-level predicates
a. adjectives: tired, drunk, available,...
b. verbs: speak, wait, arrive,...
(18) Individual-level predicates
a. adjectives: intelligent, blond, altruistic,...
b. verbs: know, love, resemble,...
The stage-level/individual-level distinction is taken to be a conceptually founded
distinction that is grammatically reflected. Lexical predicates are classified as being either
SLPs or ILPs. In the last years, a growing list of quite diverse linguistic phenomena have
been associated with this distinction. Some illustrative cases will be mentioned next;cf.,
e.g., Higginbotham & Ramchand (1997), Fernald (2000), Jager (2001), Maienborn (2003:
§2.3) for commented overviews of SLP/ILP diagnostics that have been discussed in the
literature.
4.1.1. Subject effects
Bare plural subjects of SLPs have, besides a generic reading ('Firemen are usually
available.'), also an existential reading ('There are firemen who are available.') whereas bare
plural subjects of ILPs only have a generic reading ('Firemen are usually altruistic.'):
(19) a. Firemen arc available. (SLP: generic + existential reading)
b. Firemen are altruistic. (ILP: only generic reading)
4.1.2. There-coda
Only SLPs (20) but not ILPs (21) may appear in the coda of a f/iere-construction:
(20) a. There were children sick. (SLP)
b. There was a door open.
(21) a. *Thcrc were children tall. (ILP)
b. There was a door wooden.
34. Event semantics
815
In sum, the standard perspective under which all these contrasts concerning subject
effects, tv/ieAi-conditionals, locative modifiers, and so on have been considered is that they
are distinct surface manifestations of a common underlying contrast. The stage-level/
individual-level hypothesis is that the distinction of SLPs and ILPs rests on a
fundamental (although still not fully understood) conceptual opposition that is reflected in
multiple ways in the grammatical system.The following quotation from Fernald (2000)
is representative of this view:
Many languages display grammatical effects due to the two kinds of predicates, suggesting
that this distinction is fundamental to the way humans think about the universe.
Fernald (2000:4)
Given that the conceptual side of the coin is still rather mysterious (Fernald (2000: 4):
"Whatever sense of permanence is crucial to this distinction, it must be a very weak
notion"), most stage-level/individual-level advocates content themselves with
investigating the grammatical side (Higginbotham & Ramchand (1997: 53): "Whatever the
grounds for this distinction, there is no doubt of its force"). We will come back to this
issue in section 4.3.
4.2. Event semantic treatments
A first semantic analysis of the stage-level/individual-level contrast was developed by
Carlson (1977). Carlson introduces a new kind of entities, which he calls "stages".These
are spatiotemporal partitions of individuals. SLPs and ILPs are then analyzed as
predicates ranging over different kinds of entities: ILPs are predicates over individuals, and
SLPs are predicates over stages.Thus, on Carlson's approach the stage-level/individual-
level distinction amounts to a basic difference at the ontological level. Kratzer (1995)
takes a different direction locating the relevant difference at the level of the argument
structure of the corresponding predicates. Crucially, SLPs have an extra event argument
on Kratzer's account, whereas ILPs lack such an extra argument. The lexical entries for
a SLP like tired and an ILP like blond are given in (27).
(27) a. tired: Xx Xe [tired (e, x)]
b. blond: Xx [blond (x)]
This difference in argument structure may now be exploited for selectional restrictions,
for instance. Perception verbs, e.g., require an event denoting complement; see the
discussion of (11) in section 2.2.This prerequisite is only fulfilled by SLPs, which explains
the SLP/ILP difference observed in (24). Moreover, the ban of ILPs from depictive
constructions (see (25) vs. (26)) can be traced back to the need of the secondary
predicate to provide a state argument that includes temporally the main predicate's event
referent.
A very influential syntactic explanation for the observed subject effects within
Kratzer's framework has been proposed by Diesing (1992). She assumes different
subject positions for SLPs and ILPs: Subjects of SLPs have a VP-internal base position;
subjects of ILPs arc base-generated VP-cxtcrnally. In addition, Diesing formulates a
816
VII. Theories of sentence semantics
so-called Mapping Hypothesis, which serves as a syntax/semantics interface condition on
the derivation of a logical form. (Diesing assumes a Lewis-Kamp-Heim style tripartite
logical form consisting of a non-sclcctivc quantifier Q, a restrictive clause (RC), and a
nuclear scope (NS).) Diesing's (1992) Mapping-Hypothesis states that VP-material is
mapped into the nuclear scope, and VP-external material is mapped into the restrictive
clause. Finally, Diesing takes the VP-boundary to be the place for the existential closure
of the nuclear scope.The different readings for SLP and ILP bare plural subjects follow
naturally from these assumptions: If SLP subjects stay in their VP-internal base position,
they will be mapped into the nuclear scope and, consequently, fall under the scope of the
existential quantifier. This leads to the existential reading; cf. (28a). Or they move to a
higher, VP-external subject position, in which case they are mapped into the restrictive
clause and fall under the scope of the generic operator.This leads to the generic reading;
cf. (28b). ILP subjects, having a VP-external base position, may only exploit the latter
option.Thus, they only have a generic reading;cf. (29).
(28) a. 3e,x [Ns firemen (x) & available (e,x)] (cf. Kratzer 1995:141)
b. Gen e, x [Rc firemen (x) & in (x, e)] [Ns available (e, x)]
(29) Gen x [Rc firemen (x)] [ns altruistic (x)]
Kratzer's account also offers a straightforward solution for the different behavior of
SLPs and ILPs wrt. locative modification; cf. (23). Having a Davidsonian event
argument, SLPs provide a suitable target for locative modifiers, hence, they can be located in
space. ILPs, on the other hand, lack such an additional event argument, and therefore do
not introduce any referent whose location could be further specified via adverbial
modification. This is illustrated in (30)/(31). While combining a SLP with a locative modifier
yields a semantic representation like (30b), any attempt to add a locative to an ILP must
necessarily fail;cf. (31b).
(30) a. Maria was tired in the car.
b. 3c [tired (c, maria) & in (c, the car)]
(31) a. */??Maria was blond in the car.
b. [blond (maria) & in (???, the car)]
Thus, on a Kratzerian analysis, SLPs and ILPs indeed differ in their ability to be located
in space (see the above quote from Fernald), and this difference is traced back to the
presence vs. absence of an event argument. Analogously, the event variable of SLPs
provides a suitable target for tv/iew-conditionals to quantify over in (22a), whereas the
ILP case (22b) lacks such a variable; cf. Kratzer's (1995) Prohibition against Vacuous
Quantification.
A somewhat different event semantic solution for the incompatibility of ILPs with
locative modifiers has been proposed by Chierchia (1995). He takes a Neo-Davidso-
nian perspective according to which all predicates introduce event arguments. Thus,
SLPs and ILPs do not differ in this respect. In order to account for the SLP/ILP
contrast in combination with locatives, Chierchia then introduces a distinction between two
kinds of events: SLPs refer to location dependent events whereas ILPs refer to location
34. Event semantics
817
independent events; see also McNally (1998). The observed behavior wrt. locatives
follows on the assumption that only location dependent events can be located in space.
As Chierchia (1995:178) puts it: "Intuitively, it is as if ILP were, so to speak, unlocatcd.
If one is intelligent, one is intelligent nowhere in particular. SLP, on the other hand, are
located in space."
Despite all differences, Kratzer's and Chierchia's analyses have some important
commonalities. Both consider the SLP/ILP contrast in (30)/(31) as a grammatical effect.That
is, sentences like (31a) do not receive a compositional semantic representation; they are
grammatically ill-formed. Kratzer and Chierchia furthermore share the general intuition
that SLPs (and only those) can be located in space.This is what the difference in (30a) vs.
(31a) is taken to show. And, finally, both analyses rely crucially on the idea that at least
SLPs, and possibly all predicates, introduce Davidsonian event arguments.
All in all, Kratzer's (1995) synthesis of the stage-level/individual-level distinction
with Davidsonian event semantics has been extremely influential, opening up a new
field of research and stimulating the development of further theoretical variants and of
alternative proposals.
4.3. Criticisms and further developments
In subsequent studies of the stage-level/individual-level distinction two tendencies can
be observed. On the one hand, the SLP/ILP contrast has been increasingly conceived
of in information structural terms. Roughly speaking, ILPs relate to categorial
judgements, whereas SLPs may build either categorial or thetical judgements;cf.,e.g., Ladusaw
(1994), McNally (1998), Jager (2001). On this move, the stage-level/individual-level
distinction is usually no longer seen as a lexically codified contrast but rather as being
structurally triggered.
On the other hand there is growing skepticism concerning the empirical adequacy of
the stage-level/individual-level hypothesis. Authors such as Higginbotham & Ramchand
(1997),Fernald (2000), and Jager (2001) argue that the phenomena subsumed under this
label are actually quite distinct and do not yield such a uniform contrast upon closer
scrutiny as a first glance might suggest.
For instance, as already noted by Bauerle (1994: 23), the group of SLPs that support
an existential reading of bare plural subjects is actually quite small; cf.( 19a).The majority
of SLPs, such as tired or hungry in (32) behaves more like ILPs, i.e., they only yield a
generic reading.
(32) Firemen arc hungry/tired. (SLP: only generic reading)
In view of the sentence pair in (33) Higginbotham & Ramchand (1997: 66) suspect
that some notion of speaker proximity might also be of relevance for the availability of
existential readings.
(33) a. (Guess whether) firemen are nearby/at hand.
b. ?(Gucss whether) firemen arc far away/a mile up the road.
r/iere-constructions, on the other hand, also appear to tolerate ILPs, contrary to what
one would expect;cf. the example (34) taken from Carlson (1977:72).
820
VII. Theories of sentence semantics
VP-modifiers. They should not be confounded with locative frame adverbials such as
those in (39).These are sentential modifiers that do not add an additional predicate to a
VP's event argument but instead provide a semantically underspecified domain
restriction for the overall proposition. Locative frame adverbials often yield temporal or
conditional interpretations (e.g.'When he was in Italy, Maradona was married.' for (39c))
but might also be interpreted epistemically, for instance ('According to the belief of the
people in Italy, Maradona was married.'); see Maienborn (2001) for details and cf. also
article 54 (Maienborn & Schafer) Adverbs and adverbials.
(39) Locative frame adverbials:
a. By candlelight, Carolin resembled her brother.
b. Maria was drunk in the car.
c. In Italy, Maradona was married.
Secondly, we are now in a position to make more precise what is going on in sentence
pairs like (23), repeated here as (40), which are often taken to demonstrate the different
behavior of SLPs and ILPs wrt. location in space; cf. the discussion in section 4.
(40) a. Maria was tired/hungry/nervous in the car. (SLP)
b. ??Maria was blond/intelligent/a linguist in the car. (ILP)
Actually, this SLP/ILP contrast is not an issue of grammaticality but concerns the
acceptability of these sentences under a temporal reading of the locative frame; cf. Maienborn
(2004) for a pragmatic explanation of this temporariness effect.
Thirdly, sentences (38d/e) are well-formed under an alternative syntactic analysis that
takes the locative as the main predicate and the adjective as a depictive secondary
predicate. Under this syntactic analysis sentence (38d) would express that there was a state of
the dress being on the clothesline, and this state is temporally included in an
accompanying state of the dress being wet.This is not the kind of evidence needed to substantiate
the Neo-Davidsonian claim that states can be located in space. If the locative were a true
event-related modifier, sentence (38d) should have the interpretation:There was a state of
wetness of the dress, and this state is located on the clothesline. (38d) has no such reading;
cf. the discussion on this point between Rothstein (2005) and Maienborn (2005b).
Turning back to our event diagnostics, the same split within the group of state
expressions that we observed in the previous cases also shows up with manner adverbials,
comitatives and the like - that is, modifiers that elaborate on the internal functional
structure of events. State verbs combine regularly with them, whereas statives do not, as
(41) shows. Katz (2003) dubbed this the Stative Adverb Gap.
(41) Manner adverbials etc.:
a. Bardo slept calmly/with his teddy/without a pacifier.
b. Carolin sat motionless/stiff at the table.
c. The pearls gleamed dully/reddishly/moistly.
d. *Bardo was calmly/with his teddy/without a pacifier tired.
e. *Carolin was restlessly/patiently thirsty.
f. *Andrea resembled with her daughter Romy Schneider.
g. *Bardo owned thriftily/generously much money.
34. Event semantics
821
There has been some discussion on apparent counterexamples to this Stative Adverb
Gap such as (42). While, e.g., Jager (2001), Mittwoch (2005), Dolling (2005) or Rothstein
(2005) conclude that such cases provide convincing evidence for assuming a Davidsonian
argument for statives as well, Katz (2000,2003,2008) and Maienborn (2003,2005a,b,2007)
argue that these either involve degree modification as in (42a) or are instances of event
coercion, i.e. a sentence such as (42b) is, strictly speaking, ungrammatical but can be
"rescued" by inferring some event argument to which the manner adverbial may then apply
regularly; see the discussion in section 6.2. For instance, what John is passionate about in
(42b) is not the state of being a Catholic but the activities associated with this state (e.g.
going to mass, praying, going to confession). If no related activities come to mind for some
predicate such as being a relative of Grit in (42'b) then the pragmatic rescue fails and the
sentence becomes odd. According to this view, understanding sentences such as (42b)
requires a non-compositional reinterpretation of the stative expression that is triggered
by the lack of a regular Davidsonian event argument.
(42) a. Lisa firmly believed that James was innocent.
b. John was a Catholic with great passion in his youth.
(42')b. ??John was a relative of Grit with great passion in his youth.
Sec also Rothmayr (2009) for a recent analysis of the semantics of stative verbs including
a decompositional account of stative/eventive ambiguities as illustrated in (43):
(43) a. Hair obstructed the drain. (stative reading)
b. A plumber obstructed the drain, (preferred eventive reading)
A further case of stativc/cvcntivc ambiguities is discussed by Engclbcrg (2005) in
his study of dispositional verbs such as German helfen (help), gefahrden (endanger),
erleichtern (facilitate).These verbs may have an eventive or a stative reading depending
on whether the subject is nominal or sentential; cf. (44).Trying to account for these
readings within the Davidsonian program turns out to be challenging in several respects.
Engelberg advocates the philosophical concept of supervenience as a useful device to
account for the evaluative rather than causal dependency of the effect state expressed
by these verbs.
(44) a. Rebecca helped Jamaal in the kitchen. (eventive)
b. That Rebecca had fixed the water pipes helped Jamaal in the kitchen, (stative)
In view of the evidence reviewed above, it seems justified to conclude that the class of
statives, including all copular constructions, does not behave as one would expect if they
had a hidden Davidsonian argument, regardless of whether they express a temporary
or a permanent property. What conclusions should we draw from these linguistic
observations concerning the ontological category of states? There are basically two lines of
argumentation that have been pursued in the literature.
Maienborn takes the behavior wrt. the classical event diagnostics in (10) as a
sufficiently strong linguistic indication of an underlying ontological difference and assumes
that only state verbs denote true Davidsonian eventualities, i.e., Davidsonian states,
822
VII. Theories of sentence semantics
whereas statives resist a Davidsonian analysis but refer instead to what Maienborn calls
Kimian states, exploiting Kim's (1969,1976) notion of temporally bound property
exemplifications. Kimian states may be located in time and they allow for anaphoric reference,
Yet, in lacking an inherent spatial dimension, They are ontologically "poorer", more
abstract entities than Davidsonian eventualities; cf. Maienborn (2003, 2005a, b, 2007)
for details.
Authors like Dolling (2005), Higginbotham (2005), Ramchand (2005) or Rothstein
(2005) take a different track. On their perspective, the observed linguistic differences call
for a more liberal definition of eventualities that includes the referents of stative
expressions. In particular, they are willing to give up the assumption of eventualities having
an inherent spatial dimension. Hence, Ramchand (2005: 372) proposes the following
alternative to the definition offered in (8):
(45) Eventualities are abstract entities with constitutive participants and with a
constitutive relation to the temporal dimension.
So the issue basically is whether we opt for a narrow or a broad definition of events.
40 years after Davidson's first plea for events we still don't know for sure what kind of
things event(ualitie)s actually are.
6. Psycholinguistic studies
In recent years, a growing interest has emerged in testing hypotheses on theoretical
linguistic assumptions about event structure by means of psycholinguistic
experiments. Two research areas involving events have attracted major interest within the still
developing field of semantic processing; cf. articles 15 (Bott, Featherston, Radd &
Stoltcrfoht) Experimental methods, 102 (Frazicr) Meaning in psycholinguistics. These
are the processing of underlying event structures and of event coercion.
6.1. The processing of underlying event structures
The first focus of interest concerns the issue of distinguishing different kinds of events
in terms of the complexity of their internal structure. Gcnnari & Pocppcl (2003) show
that the processing of event sentences such as (46a) takes significantly longer than the
processing of otherwise similar stative sentences such as (46b).
(46) a. The visiting scientist solved the intricate math problem. (eventive)
b. The visiting scientist lacked any knowledge of English. (stative)
This processing difference is attributed to eventive verbs having a more complex decom-
positional structure than stative verbs; cf. the Bierwisch-style representations in (47).
(47) a. to solve: Xy Xx Xe [e: cause (x, become (solved (y))]
b. to lack: Xy Xx Xs [s: lack (x, y)]
Thus, the study of Gennari & Poeppel (2003) adduces empirical evidence for the event
vs. state distinction and it provides experimental support for the psychological reality of
34. Event semantics
823
structuring natural language meaning in terms of decompositional representations.This
is, of course, a highly controversial issue; cf. the argumentation in Fodor, Fodor & Garrett
(1975), de Almeida (1999) and Fodor & LePore (1998) against decomposition, and see
also the more differentiated perspective taken in Mobayyen & de Almeida (2005).
McKoon & Macfarland (2000, 2002), taking up a distinction made by Levin and
Rappaport Hovav (1995) investigate two kinds of causative verbs, viz. verbs denoting an
externally caused event (e.g. break) as opposed to verbs denoting an internally caused
event (e.g. bloom). Whereas the former include a causing subevent as well as a change-
of-state subevent, the latter only express a change of state; cf. McKoon & Macfarland
(2000: 834). Thus, the two verb classes differ wrt. their decompositional complexity.
McKoon and Macfarland describe a scries of experiments that show that there arc clear
processing differences corresponding to this lexical distinction. Sentences with external
causation verbs take significantly longer to process than sentences with internal
causation verbs. In addition, this processing difference shows up with the transitive as well
as the intransitive use of the respective verbs; cf. (48) vs. (49). McKoon and Macfarland
conclude from this finding that the causing subevent remains implicitly present even if
no explicit cause is mentioned in the break-case. That is, their experiments suggest that
both transitive and intransitive uses of, e.g., awake in (49) are based on the same lexical
semantic event structure consisting of two subevents. And conversely if an internal
causation verb is used transitively, as wilt in (48a), the sentence is still understood as denoting a
single event with the subject referent being part of the changc-of-statc event.
(48) Internal causation verbs:
a. The bright sun wilted the roses.
b. The roses wilted.
(49) External causation verbs:
a. The fire alarm awoke the residents.
b. The residents awoke.
In sum, the comprehension of break, awake etc. requires understanding a more complex
event conceptualization than that of bloom, wilt etc. This psycholinguistic finding
corroborates theoretically motivated assumptions on the verbs' lexical semantic
representations. Sec also Hartl (2008) on a more thorough and differentiated study on implicit
event information. Hartl discusses whether, to what extent, and at which processing
level implicit event participants and implicit event predicates are still accessible for
interpretation purposes.
Most notably, the studies of McKoon & Macfarland (2000, 2002) and Gennari &
Poeppel (2003) provide strong psycholinguistic support for the assumption that verb
meanings are represented and processed in terms of an underlying event structure.
6.2. The processing of event coercion
The second focus of psycholinguistic research on events is devoted to the notion of event
coercion. Coercion refers to the forcing of an alternative interpretation when the
compositional machinery fails to derive a regular interpretation. In other words, event coercion
is a kind of rescue operation which solves a grammatical conflict by using additional
824
VII. Theories of sentence semantics
knowledge about the involved event type;cf. also article 25 (de Swart) Mismatches and
coercion.There are two types of coercion that are prominently discussed in the literature.
The first type, the so-called complement coercion is illustrated in (50). The verb to begin
requires an event-denoting complement and forces the given object-denoting
complement the book into a contextually appropriate event reading. Hence, sentence (50) is
reinterpreted as expressing that John began, e.g., to read the book; cf., e.g., Pustejovsky
(1995), Egg (2003).
(50) John began the book.
The second kind, the so-called aspectual coercion refers to a set of options for adjusting
the aspectual type of a verb phrase according to the demands of a temporal modifier. For
instance, the punctual verb to sneeze in (51a) is preferably interpreted iteratively in
combination with the durative adverbial for five minutes, whereas the temporal adverbial for
years forces a habitual reading of the verb phrase to smoke a morning cigarette in (51b),
and the stative expression to be in one's office receives an ingressive reinterpretation
due to the temporal adverbial in 10 minutes in (51c); cf., e.g., Moens & Steedman (1988),
Pulman (1997), de Swart (1998), Dolling (2003), Egg (2005). See also the classification of
aspectual coercions developed in Hamm & van Lambalgen (2005).
(51) a. John sneezed for five minutes.
a. John smoked a morning cigarette for years.
b. John was in his office in 10 minutes.
There are basically two kinds of theoretical accounts that have been developed for the
linguistic phenomena subsumed under the label of event coercion: type-shifting accounts
(e.g., Moens & Steedman 1988, de Swart 1998) and underspecification accounts (e.g.,
Pulman 1997, Egg 2005); cf. articles 24 (Egg) Semantic underspecification and 25 (de
Swart) Mismatches and coercion. These accounts and the predictions they make for
the processing of coerced expressions have been the subject of several psycholinguistic
studies;cf., e.g.,de Almeida (2004), Pickering McElrcc & Traxlcr (2005) and Traxlcr ct al.
(2005) on complement coercion and Pinango,Zurif&Jackendoff (1999),Pinango,Mack&
Jackendoff (to appear), Pickering et al. (2006), Bott (2008a, b), Brennan & Pylkkanen
(2008) on aspectual coercion.The crucial question is whether event coercion causes
additional processing costs, and if so at which point in the course of meaning composition such
additional processing takes place.The results obtained so far still don't yield a fully stable
picture. Whether processing differences are detected or not seems to depend partly on the
chosen experimental methods and tasks;cf. Pickering et al. (2006). Pylkkanen & McElree
(2006) draw the following interim balance: Whereas complement coercion always raises
additional processing costs (at least without contextual support), aspectual coercion does
not appear to lead to significant processing difficulties. Pylkkanen & McElree (2006)
propose the following interpretation of these results: Complement coercion involves an
ontological type conflict between the verb's request for an event argument and a given
object referent.This ontological type conflict requires an immediate and time-consuming
repair; otherwise the compositional process would break down. Aspectual coercion, on
the other hand, only involves sortal shifts within the category of events that do not seem
to affect composition and should therefore best be taken as an instance of semantic
35. Situation Semantics and the ontology of natural language 83_[
- The priority of information: language has external significance, as model theoretic
semantics has always emphasized, but, as cognitive scientists of various stripes
emphasize, it also has mental significance, yielding information about agents' internal states;
in this respect see also article 11 (Kempson) Formal semantics and representation-
alism. What is needed is a way of capturing the commonality between the external
and the mental, a matter exacerbated when multimodal meaning (gesture, gaze, visual
access) enters into the picture.
- Cognitive readability: in common with all other biological organisms, language users
are resource bounded agents.This requires that only relatively "small" entities feature
in semantic accounts, hence the emphasis on situations and their characterization in a
computable fashion.
- Structured objects: semantic objects such as propositions need to be treated in a way
that treats their identity conditions very much on a par with 'ordinary' individuals.
Such entities are structured objects:
(1) The primitives of our theory are all real things: individuals, properties, relations,
and space-time locations. Out of these and objects available from the set theory we
construct a universe of abstract objects. (Barwise & Perry 1983:178)
That is, structured objects arrive on the scene with certain constraints that 'define them'
in terms of other entities of the ontology in a manner that is inspired by proof theoretic
approaches. This way of setting up the ontology has the potential of avoiding various
foundational problems that beset classical theories of properties and propositions. For
propositions, these problems typically center around doxastic puzzles such as logical
omniscience and its variants.
An important component in fulfilling these desiderata, according to Barwise and
Perry, is a theory by means of which external (and internal) reality can be represented -
an ontology of some kind. The formalism that emerged came to be known as
Situation Theory - its make up and motivation constitute the focus of sections 2,3, and 4 of
the paper. These proceed in an order that reflects the evolution of Situation Theory:
initially as a theory of situations, then as a theory that includes both concrete entities
such as situations and abstract ones such as propositions, finally as a more extended
ontology, comprising in addition entities such as questions, outcomes, and
possibilities. Section 5 of the paper concerns the emergence of a type theoretic version of
the theory, within a formalism initiated by Robin Cooper. Section 6 provides some
concluding remarks.
2. Introducing situations into semantics: Empirical motivations
Situation Semantics owes its initial prominence to its analysis of the naked infinitive (NI)
construction, exemplified in (2). Here is a construction, argued Barwise (1989b), that
intrinsically requires positing situations - spatio-temporally located parts of the world.
One component of this argument goes as follows: the difference in meaning between
(2a) and (2b) illustrates that "logically equivalent" NIs (relative to an evaluation by a
world) are not semantically equivalent. And yet, the intuitive validity of the inference
from (2b) to (2c) and the inference described in (2d) shows that NIs bring with them
834
VII. Theories of sentence semantics
Lewis 1979), until Barwise and Perry's proposal, the requisite relativization was not
considered a matter to be handled in semantic theory. Barwise and Perry's essential idea is
that in language use more than one situation comes into the picture: they make a
distinction between the described situation, the situation which roughly speaking a declarative
utterance picks out, and a resource situation, so called because it is used as a resource
to fix the range/reference of an NP. Cooper's argument is based on data such as (7),
modelled on an example from Lewis (1979), where two domains are in play, a local one
and a New Zealand one. The former is exploited in the first two sentences, after which
the New Zealand domain takes over. At the point marked II we are to imagine a sudden
shift back to the local domain. By assuming that domains are situations we capture the
fact that once a shift is made, it encompasses the entire situation, ensuring that the dog
referred to is local:
(7) The dog is under the piano and the cat is in the carton.The cat will never meet our
other cat because our other cat lives in New Zealand. Our New Zealand cat lives
with the Cresswells and their dog. And there he'll stay because the dog would be sad
if the cat went away. II The cat's going to pounce on you. And the dog's coming too.
For computational work using resource situations, integrated also with visual information
see Poesio (1993). For experimental work on the resolution of definitcs in conversation
taking a closely related perspective sec Brown-Schmidt & Tancnhaus (2008).
3. The Austinian picture
The ontology we have discussed so far comprises situations and situation types (as well
as of course the elements that make up these entities - individuals, role to individual
assignments). Situations are the main ingredient in a treatment of bare perceptual
reports and play a significant role in underpinning NP meaning and assertion. This is
essentially the ontology of Situations and Attitudes in which there were no propositions.
These were rejected as'artifact(s) of the semantic endeavor' (Barwise & Perry 1983). As
Barwise & Perry (1985) subsequently admitted, this was not a move they were required
to make. Indeed propositional-like entities, more intensional than situations, are a
necessary ingredient for accounts of attitude reports and illocutionary acts. Sets of situations,
although somewhat more fine grained than sets of worlds, are vulnerable to
sophisticated variants of logical omniscience (see e.g. Soames' puzzle in Soames 1985).
Nonetheless, Angelika Kratzer has initiated an approach, somewhat confusingly also known as
Situation Semantics, that does attempt to exploit sets of situations for precisely this role
and develops accounts for a wide range of linguistic phenomena, including modality,
donkey anaphora, cxhaustivity, and factivity. Sec Kratzcr (1989) for an early version of
this approach, and Kratzer (2008) for a detailed, recent survey.
The next cheapest solution available within the Situations and Attitudes ontology
would be to draft the situation types to serve as the propositional entities. Indeed,
situation types are competitive in such a role: they can distinguish identity statements
that involve distinct constituents (e.g. (8a) corresponds to the situation type in (8c),
whereas (8b) corresponds to the situation type in (8d), while allowing substitutivity of
co-referentials and cross-linguistic equivalents, as exemplified respectively by (8e) and
(8f), the Hebrew analogue of (8b):
35. Situation Semantics and the ontology of natural language 835
(8) a. Enesco is identical with himself.
b. Poulenc is identical with himself.
c. ((Identical; enesco, enesco))
d. ((Identical; poulenc, poulenc))
e. He is identical with himself.
f. Poulank zehe leacmo.
Nonetheless, post 1985 situation theory did not go for the cheapest solution; as we will
see in section 4., not succumbing to ontological stinginess pays off when scaling up the
theory to deal with other abstract entities.
Building on a conception articulated 30 years earlier by Austin (1970), Barwisc &
Etchemendy (1987) developed a theory of propositions in which a proposition is a
structured object prop(s,a), individuated in terms of a situation s and a situation type a Here
the intuition is that s is the described situation (or the belief situation, in so far as it is
used to describe an agent's belief, or utterance token, in the case of locutionary
propositions discussed below), with the relationship between s and a being the one introduced
above in our discussion of NIs, leading to a straightforward notion of truth and falsity:
(9) a. prop(s, a) is true iff s : a (s is of type a).
b. prop(s, a) is false iff s :la(s is not of type a).
In saying that a proposition prop(s, a) is individuated in terms of s and a, the intention is
to say that prop(s, 6) = prop(f, r) if and only if s = t and <T= r. Individuating propositions
in terms of their "subject matter" (i.e. the situation type component) is familiar, but what
is innovative and/or puzzling is the claim that two propositions can be distinct despite
having the same subject matter.
I mention three examples from the literature of cases which motivate differentiating
propositions on the basis of their situational component.The first is one we saw above in
the case of definiteness resolution, where the possibility of using 'the dog' is underwritten
by distinct presuppositions; the difference in the presuppositions resides in the different
resource situations exploited:
(10) a. pvop(slocal, ((UNIQUE, Dog)))
b. prop(snewzealantl,((UNIQUE, Dog)))
A second case are the locutionary propositions introduced by Ginzburg (2011). Ginzburg
argues that characterizing both the update potential and the range of utterances that can
be used to seek clarification about a given utterance uQ requires reference to the
utterance token u0, as well as to its grammatical type Tu (see article 36 (Ginzburg) Situation
Semantics for details). By defining propositions (locutionary propositions) individuated
in terms of «Qand Tu one can simultaneously define update and clarification potential
for utterances. In this case, there are potentially many instances of distinct locutionary
propositions, which need to be differentiated on the basis of the utterance token -
minimally any two utterances classified as being of the same type by the grammar.
The original motivation for Austinian propositions was in the treatment of the Liar
paradox by Barwise & Etchemendy (1987).This paradox concerns sentences like (lla,b)
which, pretheoretically, are false if true and true if false. Although one approach to this
838
VII. Theories of sentence semantics
viable theory of propositions can on its own deliver a viable theory of the attitudes. This
is because attitudes have structure not perfectly mirrored by their external content, a
realization of which prompted Barwise and Perry to abandon their initial essentially
proposition-based account.The most striking illustration of this is in puzzles like Kripke's
Pierre (Kripke 1979), who is unaware that the wonderful city of Londres about which
he learnt as a child is the same place as the squalid London, where he currently resides.
While his beliefs are perfectly rational, we can say of him that he believes that London is
pretty and also does not believe that London is pretty.
One possible conclusion from this (see e.g. Crimmins 1993), a way out of paradox,
is that attitude reports involve implicit reference to attitudinal states: relative to
information acquired in France, Pierre believes London is pretty; relative to
information acquired on the ground in London, he believes the opposite. Here is yet another
important role for situations in linguistic description. One way to integrate this in an
account of complementation was offered in Cooper & Ginzburg (1996) and Ginzburg
(1995) for declarative and interrogative attitude reports, respectively. This constitutes a
compositional reformulation of the philosophical accounts cited above.
The main idea is to assume that attitude predicates involve at least three arguments:
an agent, an attitudinal state and a semantic entity. For instance with respect to belief, this
relates an agent's belief in a proposition to facts about the agent's mental situation.This
amounts to linking a positive belief attribution of proposition p relative to the mental
situation ms with the existence of an internal belief state whose content is p. An example
of such a mental situation is given in section 5.
4. Awiderontological net
The ontology of Situation Theory (ST) was originally designed on the basis of a rather
restricted data set. One of the challenges of more recent work has been to extend this
ontology in order to account for two related key domains for semantics: root clauses in
conversational use and verb complementation. A large body of semantic work that has
emerged since the late 1970s demonstrating that interrogative clauses possess
denotations (questions) distinct in semantic type from declarative ones; imperative and
subjunctive clauses possess denotations (dubbed outcomes by Ginzburg & Sag 2000) distinct
in semantic type from declarative and interrogative ones; facts are distinct from true
propositions; for detailed empirical evidence for these distinctions, see Vendler (1972),
Asher (1993), Peterson (1997) and Ginzburg & Sag (2000); see also article 66 (Krifka)
Questions and article 67 (Han) Imperatives.
The main challenge in developing an ontology which distinguishes the diverse
menagerie of abstract entities including propositions, questions, outcomes and facts is
characterizing the structure of these entities, indeed figuring out how the distinct entities relate
to each other. As pointed out by Ginzburg & Sag (2000), quantified NPs and certain
adverbs are possible in declarative, interrogative and imperative semantic environments.
Hence, the ontology must provide a semantic unit which constitutes the input/output of
such adverbial modifiers and of NP quantification. To make this concrete - the
assumption that the denotation of imperatives is of a type distinct from t (however cashed out)
is difficult to square with (a simplistic implementation) of the received wisdom that
NPs such as 'everyone' are of type « e, t >, t >. If the latter were the case, composing
'everyone' with 'vacate the building' in (14c) would yield a denotation of type t:
35. Situation Semantics and the ontology of natural language
839
(14) a.
b.
c.
d.
e.
f.
Everyone vacated the building.
Did everyone vacate the building?
Everyone vacate the building!
Kim always wins.
Does Kim always win?
Always wear white!
As we will see subsequently, a good candidate for this role are situation types. These, as
we observed in section 3., are not conflated with propositions in the situation theoretic
universe.
Ginzburg & Sag (2000) set out to construct an ontology that appropriately
distinguishes these entities and yet retains the features of the ST ontology discussed earlier.
The ontology, dubbed a Situational Universe with Abstract Entities (SU+AE), was
developed in line with the strategy of Barwise and Perry's (l).This was implemented on two
levels, one within a universe of type-based feature structures (Carpenter 1992).This
universe underpinned grammatical analysis, using Head Driven Phrase Structure Grammar
(HPSG). A denotational semantics was also developed in the Axiom of Foundation
with Atoms (AFA)-based set theoretic framework of Seligman & Moss (1997). In what
follows, I restrict attention to the latter.
A semantic universe is identified with a relational structure S of the form [A, 5,,...,
Sm; Rv ..., Rm]. Here A - sometimes notatcd also as I S| - is the universe of the structure.
From the class of relations we single out the 5,,..., Sm which are called the structural
relations, as they are to capture the structure of certain elements in the domain. Each 5. can
be thought of as providing a condition that defines a single structured object in terms of
a list of n objects xv ..., xn.
Situations and situation types serve as the 'basic building blocks' from which the
requisite abstract entities of the ontology arc constructed:
Propositions are structurally determined by a situation and a situation type. (See
discussion in section 3.)
Intuitively, each outcome is a specification of a situation which is futurate relative
to some other given situation. Given this, outcomes are structurally determined by a
situation and a situation type abstract whose temporal argument is abstracted away,
thereby allowing specification of fulfillcdncss conditions.
Possibilities, a subclass of which constitutes the universe's facts, are structurally
determined by a proposition. This reflects the tight link between propositions and
possibilities. As Ginzburg & Sag (2000) explain, there is no obvious priority between
possibilities and propositions: one could develop an ontology where propositions
are built out of possibilities.
An additional assumption made is that the semantic universe is closed under
simultaneous abstraction. Simultaneous abstraction, originally defined by Aczel & Lunnon
(1991), is a semantic operation akin to X-abstraction with one significant extension:
abstraction is over sets of elements, including the empty set. Moreover, abstraction (including
over the empty set) is potent - the body out of which abstraction occurs is distinct from
the abstract.The assumption about closure under simultaneous abstraction is akin to the
common type theoretic assumption about closure under functional type formation.
842
VII. Theories of sentence semantics
in Barwise (1989c), and comprehensively developed in Seligman & Moss (1997), were
rather complex. Concretely, simultaneous X-abstraction with restrictions is a tool with
a variety of uses, including quantification, questions, and the specification of attitudinal
states and meanings (for the latter see article 36 (Ginzburg) Situation Semantics). Its
complex set theoretic characterization made it difficult to use. Concomitantly, the theory
in this form required an auxiliary coding into a distinct formalism (e.g. typed feature
structures) for grammatical and computational applications. Neither of these versions of
the theory provides an adequate notion of role-dependency that has become standard in
recent treatments of anaphora and quantification on which much semantic work has been
invested in frameworks such as Discourse Representation Theory and Dynamic
Semantics; sec Gawron & Peters (1990) for a detailed theory of anaphora and quantification
in situation semantics, though one that is not dynamic.
Motivated to a large extent by such concerns, the situation theoretic outlook has been
redeveloped using tools from Type Theory with Records (TTR), a framework initiated
by Robin Cooper. Ever since Sundholm and Ranta's pioneering work (Sundholm 1986;
Ranta 1994), there has been interest in using constructive type theory (often referred to
as Martin-LofType Theory) as a framework for semantics (see e.g. Fernando 2001 and
Krahmer & Piwek 1999). TTR is a model theoretic outgrowth of constructive type
theory. Its provision of entities at both levels of tokens and types allows one to combine
aspects of the typed feature structures world and the set theoretic world, enabling its use
as a computational grammatical formalism. As wc will sec,TTR provides the scmanticist
with a formalism that satisfies the desiderata I mentioned in section 1. Cooper (2006a)
has shown that the lion's share of situation theoretic results can be recast in TTR - the
main exception being those results that depend explicitly on the existence of a non-well-
founded universe, for instance Barwise and Etchemendy's account of the Liar; the type
theoretic universe is well founded. But one could, according to Cooper (p.c), recreate
non-wcll-foundcdncss at the level where witnessing of types occurs. In addition, TTR
allows for DRT-oriented or Montogovian treatment of anaphora and quantification. For
a computational implementation of TTR, see Cooper (2008); for a closely related
framework, the Grammatical Framework see Ranta (2004).
The move to TTR is, however, not primarily a means of capturing and perhaps mildly
refining past results, but crucially underpins a theory of conversational interaction on
both illocutionary and mctacommunicative levels. A side effect of this is, via a theory of
generation, an account of attitude reports.
In the remaining space, I will briefly exposit the basics of TTR, show its ability to
underpin SU+AEs and briefly sketch how this can be used to define basic information
states in dialogue. One linguistic application will be provided, one that ties up situations,
information states, and meaning: a specification of the meaning of a discourse-bound
pronoun.
5.1. Generalizing the situation/situation type relation
The most fundamental notion of TTR is the typing judgement a : T classifying an
object a as being of type 7. This can be seen as a generalization of the situation
semantics judgement s : a, generalization in that not only situations can figure as subjects of
typing judgements. Note that the theory provides the objects and the types, but this
form of judgement, as well as other forms are metatheoretical. Examples are given in
846
VII. Theories of sentence semantics
goes on to offer a criterion of type individuation of record types using X-types, where the
corresponding'labels' function as bound variables.
Ginzburg (2005a) shows how to formulate a theory of questions as propositional
abstracts in TTR, while using the standard TTR notion of abstraction. In this way, a
possible criticism of the approach of Ginzburg & Sag (2000), that they use an ad hoc and
complex notion of abstraction, can be circumvented. Similarly, Ginzburg (2005b) shows
how to explicate outcomes within TTR.
5.3. Ontology in interaction
The most active area in the application of TTR to the description of NL is in the area
of dialogue. Larsson (2002) and Cooper (2006b) showed how to decompose interaction
protocols, such as those specified situation theoretically in Ginzburg (1996), by using
TTR to describe update rules on the information states of dialogue participants.This was
extended by Ginzburg (2011) to cover a variety of illocutionary moves, metacommunica-
tive interaction (see article 36 (Ginzburg) Situation Semantics for some discussion) and
conversational genres. Fernandez (2006) uses TTR to develop a wide coverage of the
range of non-sentential utterances that occur in conversation.
In these works, information states are assumed to consist of a public and unpublicized
part. For current purposes wc restrict attention to the public part, also known as each
participant's dialogue gameboard (DGB). Each DGB is a record of the type given in (27) -
the spkr, addr fields allow one to track turn ownership, Facts represents conversationally
shared assumptions, Pending and Moves represent respectively moves that are in the
process of/have been grounded, QUD tracks the questions currently under discussion:
(27) rspkr:Ind
addr:Ind
c-utt: addrcssing(spkr,addr)
Facts : Set(Prop)
Pending : list(LocProp)
Moves : list(LocProp)
I QUD : poset(Question)
We call a mapping that indicates how one DGB can be modified by conversationally
related action a conversational rule, and the types specifying its domain and its range
respectively the preconditions and the effects. Here I exemplify the use of TTR to give
a partial characterization of the meaning of pronouns in dialogue, a task that links
assertion acceptance, situations, and meaning.
The main challenge for a theory of meaning for pronouns is of course how to
characterize their antecedency conditions; here I restrict attention to intersentential cases,
see Ginzburg (2011) for an extension of this account to intra-sentential cases. Dialogue
takes us away quite quickly from certain received ideas on this score: antecedents can
arise from queries (28a), from partially understood or even disflucnt utterances ((28b,c)
respectively). Moreover, as (28d) illustrates, where 'he' cannot refer to 'Jake', the shelf
life of an antecedent is potentially quite short. Although the data are subtle, a plausible
assumption is that for non-referential NPs anaphora are not generally possible from
35. Situation Semantics and the ontology of natural language
847
within a query (polar or wh) (Groenendijk 1998), or from an assertion that has been
rejected (e.g. (28e,f)).
(28) a. A: Did John phone? B: He's out of contact in Daghestan.
b. A: Did John phone? B: Is he someone with a booming bass voice?
c. Peter was, well he was fired.
d. A: Jake hit Bill. / B: No, he patted him on the back. / A: Ah. Is Bill going to the
party tomorrow? /B: No./ A: Is #he/Jake?
e. A: Do you own an apartment? B: Yes. A: Where is it located?
f. A: Do you own an apartment? B: No. A: #Where might you buy it?
This means, naturally enough, that witnesses to non-referential NPs can only emerge in a
context where the corresponding assertion has been accepted. A natural move to make
in light of this is to postulate a witnessing process as a side effect of assertion acceptance,
a consequence of which will be the emergence of referents for non-referential NPs. For
uniformity's sake, we can assume that these witnesses get incorporated into the
contextual parameters (c-params) of that utterance, which in any case includes (witnesses for)
the referential NPs.This means that c-params serves uniformly as the locus for witnesses
of'discourse anaphora'. The rule of incorporating non-referential witnesses in c-params
is actually simply a minor add on to the rule that underwrites assertion acceptance (see
Ginzburg 2011, chapter 4) - the rule underpinning the utterance of acceptances - it can
be viewed as providing for a witness for situation/event anaphora since this is what gets
directly introduced into c-params. In cases where the witness is a record (essentially
when the proposition is positive), NP witnesses will emerge. In (29) the preconditions
involve the fact that the speaker's latest move is an assertion of a proposition whose type
isTl.The effects change the speaker/addressee roles (since the asserted to becomes the
accepter) and adds a record w, including a witness for Tl, to the contextual parameters.
(29) Accept move:
spkr:Ind 1 1
addr:Ind
_|~sit = sitl "I
P~[sit-typc = TlJ' r°P
LatestMoveco,,,c,"= Assert(spkr,addr,p): IllocProp J
spkr = preconds.addr: Ind 1
addr = prcconds^pkr: Ind
w = preconds.LatestMove.c-params U [sit = sitl]: Rec
Moves = ml ©preconds.Moves:list(LocProp)
mjco.ucni _ Accept(spkr,addr,p): LocProp
m0.c-param = w : Rec JJ
Wc can now state the meaning of a singular pronoun somewhat schematically as follows:
it is a word whose contextual parameters include an antecedent, which is to be sought
from among the constituents of an active move; the pronoun is identical in reference to
preconds:
effects
848
VII. Theories of sentence semantics
this antecedent and agrees with it. Space precludes a careful characterization of what
it means to be active, but given the data we saw above it is a composite property
determined by QUD - essentially being specific to an clement of QUD - and Pending. Sec
article 36 (Ginzburg) Situation Semantics for the justification for including reference to
an utterance's constituents in grammatical representation. In (30), I provide a lexical
entry in the style of HPSG that captures this specification: here m represents the active
move and a the antecedent; the final condition on c-params requires that within /n's
contextual parameters is one whose index is identical to that of a's content:
(30)
PHON:(shc)
m : LocProp
a '.Sign
c-params:I cl: member(a,m.constits)
c2: ActivcMovc(m)
m.sit.c-params: [a.c-params.index = a.cont.index : Ind]
head: N
ana : +
agr = c-params.m.cat.agr
cont: [index = a.cont.index : Ind]
num =sg: Number
gen = fem : Gender
pers = third: Person
:syncat
6. Conclusions
Situation semantics initiated an ontology-oriented approach to semantics: the aim being
to develop means of representing the external and internal reality of agents in a cogni-
tivcly tractable way. The initial emphasis was on situations, based in part on evidence
from the naked infinitival construction. Situations, parts of the world, have proved to be
of significant importance to a variety of phenomena ranging from the domains
associated with NP use to negation and, one way or the other, play a significant role in the very
act of asserting. The theory of situations subsequently lead to a theory of propositions
(Austinian propositions), structured objects constructed from situations and situation
types: Austinian propositions are significantly more fine-grained than possible worlds
propositions, but coarse grained enough to pass translation and paraphrase criteria.They
also have a potential construal in terms of differingly coarse grained memory traces.
The technique of individuating abstract entities as structured objects enables the
theory to scale up: by integrating questions, outcomes and facts into the ontology,
Situation Theory was able to underpin a rudimentary theory of illocutionary interaction
(entities such as questions, propositions and outcomes serve as the descriptive content
of queries, assertions and requests) and a theory of complementation for attitudinal
predicates.
35. Situation Semantics and the ontology of natural language 849
A recent development has been to recast the theory in type theoretic terms, concretely
using the formalism of Type Theory with Records. Type Theory with Records has many
similar characteristics to situation theory - an ontology - oriented approach,
computational tractability, structured objects. This enables most of the results situation theory
achieved to be maintained. On the other hand, Type Theory with Records brings with it
a more transparent formalism and the existence of dependent types allows both dynamic
semantic and unification grammar techniques to be utilized. Indeed, all these can be
combined to construct a theory of illocutionary and metacommunicative interaction, one of
the key areas of development for semantics in the early 21st century.
7. References
Aczel, Peter 1988. Non Well Founded Sets. Stanford, CA: CSLI Publications
Aczel, Peter, David Israel, Stanley Peters & Yasuhiro Katagiri (eds) 1993. Situation Theory and Its
Applications, III. Stanford, CA: CSLI Publications
Aczel, Peter & Rachel Luniioii 1991. Universes and parameters In: J. Barwise ct al. (eds). Situation
Theory and its Applications, II. Stanford, CA: CSLI Publications, 3-24.
Allen. James 1983. Maintaining knowledge about temporal intervals Communications of the ACM
26,832-843.
Aslicr, Nicholas 1993. Reference to Abstract Objects in English: A Philosophical Semantics for
Natural Language Metaphysics. Dordrecht: Kluwei.
Austin. John L. 1970. Truth. In: J. Urinson & G. J. Warnock (eds). Philosophical Papers. 2nd edn.
Oxford: Oxford University Press, 117-133.
Barwise, Jon 1989a. Branch points in Situation Theory. In: J. Barwise. The Situation in Logic.
Stanford, CA: CSLI Publications 255-276.
Barwise, Jon 1989b. Scenes and other situations In: J. Barwise. The Situation in Logic. Stanford, CA:
CSLI Publications 5-36.
Barwise, Jon 1989c. The Situation in Logic. Stanford, CA: CSLI Publications
Barwise, Jon & John Etchemendy 1987. The Liar: An Essay on Truth and Circularity. Oxford:
Oxford University Press
Barwise, Jon ct al. (eds) 1991. Situation Theory and Its Applications, II. Stanford, CA: CSLI
Publications
Barwise, Jon & John Perry 1983. Situations and Attitudes. Cambridge, MA: The MIT Press
Barwise, Jon & John Perry 1985. Shifting situations and shaken attitudes Linguistics & Philosophy
8,399^52.
Brown-Schmidt. S. & Michael K.Tanenhaus 2008. Real-time investigation of referential domains in
unscripted conversation: A targeted language game approach. Cognitive Science: A Multidisci-
plinary Journal 32,643-684.
Carpenter, Bob 1992. The Logic of Typed Feature Structures: With Applications to Unification
Grammars, Logic Programs, and Constraint Resolution. Cambridge: Cambridge University
Press
Cooper, Robin 1993. Generalized quantifiers and resource situations In: P. Aczel et al. (eds).
Situation Theory and Its Applications, III. Stanford, CA: CSLI Publications 191-211.
Cooper, Robin 1996. The role of situations in Generalized Quantifiers In: S. Lappin (ed.).
Handbook of Contemporary Semantic Theory. Oxford: Blackwell, 65-86.
Cooper, Robin 1998. Austinian propositions Davidsonian events and perception complements In:
J. Ginzburg et al. (eds). The Tbilisi Symposium on Logic, Language and Computation: Selected
Papers. Stanford, CA: CSLI Publications 19-34.
Cooper, Robin 2006a. Austinian truth in Martin-L6f Type Theory. Research on Language and
Computation 3,333-362.
35. Situation Semantics and the ontology of natural language 851
Kamp, H. 1979. Events, instants and temporal reference. In: R. Bauerle, U. Egli & A. von Stechow
(eds.). Semantics from Different Points of View. Berlin: Springer. 376-417.
Kirn, Yookyung 1998. Information articulation and truth conditions of existential sentences
Language and Information 1,67-105.
Kiahmei. Emiel & Paul Piwek 1999. Presupposition projection as proof construction. In: H. Bunt &
R. Muskens (eds.). Computing Meaning: Current Issues in Computational Semantics. Dordrecht:
Kluwer, 281-300.
Kratzer. Angelika 1989. An investigation of the lumps of thought. Linguistics & Philosophy 12.
607-653.
Kratzer. Angelika 2008. Situations in natural language semantics In: E. N. Zalta (ed.). The
Stanford Encyclopedia of Philosophy (Fall 2008 Edition), http://plato.stanford.edu/cntries/situation-
semantics/. December 15,2010.
Ki ipke, Saul 1975. Outline of a theory of truth. The Journal of Philosophy, 690-716.
Kripke, Saul 1979. A puzzle about belief. In: A. Margalit (cd.). Meaning and Use. Dordrecht: Rcidel,
239-283.
Larsson, Staffaii 2002. Issue based Dialogue Management. Doctoral dissertation. University of
Gothenburg.
Lewis, David K. 1979. Score keeping in a language game. In: R. Bauerle, U. Egli & A. von Stechow
(eds.). Semantics from Different Points of View. Berlin: Springer, 172-187.
McCawley, James D. 1979. Presupposition and discourse structure. In: C.-K. Oh & D. Dinneen
(eds.). Presupposition. New York: Academic Press, 371-388.
McGee, Vann 1991. Review of J. Barwise & J. Etchemendy. The Liar (Oxford, 1987). Philosophical
Review 100,472^74.
Moss, Lawrence 1989. Review of J. Barwise & J. Etchemendy. The Liar (Oxford, 1987). Bulletin of
the American Mathematical Society 20,216-225.
Muskens, Rcinhard 1989. Meaning and Partiality. Doctoral dissertation. University of Amsterdam.
Reprinted: Stanford, CA: CSLI Publications, 1995.
Neale, Stephen 1988. Events and logical form. Linguistics & Philosophy 11,303-321.
Peterson, Philip 1997. Fact, Proposition, Event. Dordrecht: Kluwer.
Poesio, Massimo 1993. A situation-theoretic formalization of definite description interpretation
in plan elaboration dialogues In: P. Aczel ct al. (eds). Situation Theory and Its Applications, III.
Stanford, CA: CSLI Publications, 339-374.
Ranta, Aarne 1994. Type Theoretical Grammar. Oxford: Oxford University Press.
Ranta.Aarne 2004. Grammatical framework. Journal of Functional Programming 14,145-189.
Richard. Mark 1990. Propositional Attitudes: An Essay on Thoughts and How We Ascribe Them.
Cambridge, MA:The MIT Press.
Russell, Bertrand 1905. On denoting. Mind 14,479^193.
Schulz, Stephen 1993. Modal Situation Theory. In: P. Aczel et al. (eds.). Situation Theory and Its
Applications, III. Stanford, CA: CSLI Publications, 163-188.
Seliginan, Jerry & Larry Moss 1997. Situation Theory. In: J. van Benthcm & A. tcr Meulen (eds).
Handbook of Logic and Linguistics. Amsterdam: North-Holland, 239-309.
Soames, Scott 1985. Lost innocence. Linguistics & Philosophy 8,59-71.
Sundholm, Goran 1986. Proof Theory and meaning. In: D. Gabbay & F. Guenthner (eds.).
Handbook of Philosophical Logic. Oxford: Oxford University Press, 471-506.
Vendler, Zeno 1972. Res Cogitans. Ithaca, NY: Cornell University Press.
Vogel, Carl & Jonathan Ginzburg 1999. A situated theory of modality. Paper presented at the
3rd Tbilisi Symposium on Logic, Language, and Computation, Batumi. Republic of Georgia,
September 1999.
Jonathan Ginzburg, Paris (France)
VI11. Theories of discourse semantics
36. Situation Semantics: From indexicality
to metacommunicative interaction
1. Introduction
2. Desiderata for semantics
3. The Relational Theory of Meaning
4. Meaning, utterances, and dialogue
5. Closing remarks
6. References
Abstract
Situation Semantics emerged in the 1980s with an ambitious program of reform for
semantics, both in the domain of semantic ontology and with regard to the integration of context
in meaning. This article takes as its starting point the focus on utterance (as opposed to
sentence) interpretation. The far reaching aims Barwise and Perry proposed for semantic
theory are spelled out. Barwise and Perry's Relational Theory of Meaning is described, in
particular its emphasis on utterance situations and on the reification of information. The
final part of the article explains how conceptual apparatus from situation semantics has
ultimately come to play an important role in a highly challenging enterprise, modelling
dialogue interaction, in particular metacommunicative interaction.
1. Introduction
Situation Semantics emerged in the 1980s with an ambitious program of reform for
semantics, both in the domain of semantic ontology and with regard to the integration of
context in meaning. In their 1983 book Situations and Attitudes (Barwise & Perry 1983),
as well as a host of other publications around that time collected in Barwise (1989)
and Perry (2000), Barwise and Perry argued for the preeminence of a situation-based
ontology and took contexts of utterance to be situations, thereby offering the potential
for a richer view of context than was available previously. For situation semantics and
ontology, see article 35 (Ginzburg) Situation Semantics and NL ontology. This article
takes as its starting point the focus on utterance (as opposed to sentence) interpretation.
In section 2 I spell out the far reaching aims Barwise and Perry proposed for semantic
theory. In section 3 I sketch Barwise and Perry's Relational Theory of Meaning, in
particular its emphasis on utterance situations and on the reification of information. I also
point out some of the weaknesses of Barwise and Perry's enterprise, particularly the
approach to context. One of these weaknesses, in my view, is that the theory is quite
powerful, but it was, largely, applied to dealing with traditional, scntcncc-lcvcl semantics.
The final section of this article, section 4, explains how conceptual apparatus from
situation semantics has ultimately come to play an important role in a highly challenging
enterprise, modelling dialogue interaction.
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 852-872
36. Situation Semantics: From indexicality to metacommunicative interaction 853
2. Desiderata for semantics
Barwise and Perry's starting point is model theoretic semantics, as developed in the
classical Montague Semantics tradition (sec e.g. Montague 1974; Dowty, Wall & Peters 1981;
Gamut 1991 and article 33 (Zimmcrmann) Model-theoretic semantics): a natural
language is likened to a formal language (first order logic, intensional logic etc). On this
approach, providing a semantics for such a language involves primarily assigning
interpretations (or denotations) to the words of the language and rules that allow phrases to be
interpreted in a compositional manner.This allows both the productivity of NL meaning
and the potential for various kinds of ambiguity to be explicated. Contexts, on this view,
are identified with indices - tuples consisting of a small and fixed number of dimensions,
prototypically providing values for speaker, addressee, time, location. Interpretations of
words/phrases arc then all taken to be relative to contexts, thereby yielding two essential
semantic entities: characters/meanings which involve abstracting away indices from
contents/interpretations. These - supplemented by lexical meaning postulates - can be used
to explicate logically valid inference.
Barwise and Perry view this picture of semantics as significantly too restrictive. The
basic perspective they adopt is one in which linguistic understanding is assimilated to
the extraction of information by resource bounded agents in their natural environment
(inspired in part by the work of Gibson, e.g. Gibson 1979).This drives their emphasis on a
number of unorthodox seeming fundamental desiderata for semantic theory, desiderata
we will subsequently come to see find considerable resonance in the desiderata for a
theory of meaning for conversational interaction.
The first class of desiderata are metatheoretical in nature and can be summed up as
follows:
Desideratum 1: The priority or information
Language has external significance, as model theoretic semantics has always emphasized,
but, as cognitive scientists of various stripes emphasize, it also has mental significance,
yielding information about agents' internal states. What is needed is a way of capturing
the commonality between the external and the mental, the flow of information - the
chain from fact to thought in one participant's mind to utterance to thought in another
participant's mind, graphically exemplified in Fig. 36.1.
Fig. 36.1: The Flow of Information. (Barwise & Perry 1983: 17)
854
VIII. Theories of discourse semantics
An important component in fulfilling this desideratum, according to Barwise and Perry,
is a theory by means of which external (and internal) reality can be represented - an
ontology of some kind. This is what developed into situation theory and type theory
with records (see article 35 (Ginzburg) Situation Semantics and NL ontology). A key
ingredient in such a theory are some notion of constraints, a way of capturing necessary,
natural, or conventional linkages between situations (e.g. smoke means fire, image being
such and such means leg is broken etc.), along with a theory of how agents in a situation
extract information using constraints.The other crucial component is the naturalization
of linguistic meanings - their reduction to concepts from the physical world - in terms
of constraints.
The other two pivotal desiderata put forward by Barwise and Perry arc more directly
aimed at repositioning the semantic fulcrum, from interpretation towards context.
Desideratum 2: Information content is underdetermined by interpretation
We might provide news about the Argentinean elections using any of the following
sentences in (1). All three sentences uttered in these circumstances intuitively have
the same external significance - we would wish to identify their content and, on some
accounts, their meaning as well. Nonetheless, different information can be acquired from
each: for instance, (lb) allows one to infer that Kirchner is a woman, whereas Lavagna
is a man.
(1) a. Kirchner beat Lavagna.
b. Senora Kirchner defeated Senor Lavagna.
c. Cristina's losing opponent was Lavagna.
Desideratum 3: Language is an efficient medium
Barwise and Perry emphasize that the flip side of productivity gets less attention as a
fundamental characteristic of NL: the possibility of reusing the same expression to say
different things. Examples of the phenomena Barwise and Perry had in mind are in (2),
which even in 2009 are tricky. By 'tricky' I don't mean we lack a diagnosis, I mean there
is no single formal and/or implemented semantic/pragmatic theory that picks them all
off with ease, interfacing along the way with inter alia theories of gesture, gaze, and visual
access.
(2) a. A: I'm right, you're wrong. B: I'm right, you're wrong.
b. I want you, you, and you to stand here and I want you, you, and you to stand
here, (based on examples in Levinson 1983; Pollard & Sag 1994)
c. A: John is irritating John no end. B: He can't be annoying him so badly.
d. In last week's FoLLI dissertation prize meeting sadly the linguist voted for the
linguist, whereas the logician voted for the logician, (based on an example in
Cooper 1996)
3. The Relational Theory of Meaning
At the heart of situation semantics is the Relation Theory of Meaning. There are two
fundamentally innovative aspects underlying this theory, which bear significant importance
to current semantic theory in the wider sense:
36. Situation Semantics: From indexicality to metacommunicative interaction 855
(3) a. Meaning Reification: the reification of meanings as entities on which humans
reason (rather than as metatheoretical entities, as standard in logic),
b. Speech Events as Semantic Entities: recognition of speech events (incl speakers,
addressees, the speech token) as fundamental semantic units; sentences
arc viewed as derivative: type-like entities that emerge from utterances, or, as
Barwise and Perry put it, uniformities over utterances.
To get a feel for the theory, consider a simple example. (4b), taken to be the meaning of
(4a), is a crude representation of an early version of the Relational Theory of Meaning:
a (declarative) meaning relates all utterance events u in which there exists a speaker
a, addressee b, spatiotcmporal locations /, t, referents ;', m (for the names 'Jacky' and
'Molly' respectively) to described events e in which ; bites m at /.This relation is
exemplified graphically in Fig. 36.2, which emphasizes the reality of the utterance situation.
I have purposely used quasi-Davidsonian notation (see article 34 (Maienborn) Event
semantics) to indicate that the central insight there is independent of the various more
and particularly less standard formalisms in which the Relational Theory of Meaning has
been couched. As we will soon see, there are various ways which differ significantly to
cash out the characterization of u,e and their interrelation.
(4) a. Jacky bit Molly.
b. { u,e | Ba,b,l,j,m,t [uttering(a,'Jacky is biting Molly',u) a addressee(u,b) a
In(u,l) a rcfcrring(a,j,'Jacky') a Namcd(j,'Jacky') a rcfcrring(a,m,'Molly') a
Namcd(m,'Molly') a coincidcnt(l,t) a dcscribing(a,c) a bitc(c,j,m,t)J }
Fig. 36.2: The meaning of 'Jacky is biting Molly' as a relation between situations in
which this construction is uttered and events in which a Jacky bites a Molly. (Barwise &
Perry 1983: 122)
Of the two assumptions, Speech Events as Semantic Entities was introduced by Barwise
and Perry in a stronger form than (3), graphically exemplified in Fig. 36.2 - not only do
they make reference to speech event, but Barwise and Perry actually posit a compositional
aspect to speech events:
(5) a. If oris a phrase with sub-constituents X,Y, then uttcring(a, a, u) entails the
existence of two subevents of e er e2 such that
856
VIII. Theories of discourse semantics
b. e{ < e2 (el temporally precedes e2)
c. uttering(a, X, u{)
d. uttcring(a, Y, u2)
This formulation raises a variety of issues concerning syntax, presupposing essentially a
strongly surfacey and linearized approach. For obvious reasons of space I cannot enter
into these, but they constitute an important backdrop. Speech Events as Semantic
Entities underlay a number of grammar fragments subsequent to Barwise & Perry (1983)
(e.g. Gawron & Peters 1990; Cooper & Poesio 1994), but on the whole was not the focus
of much interest until Poesio realized its significance for conversational processing, as
we discuss in section 4.2. In contrast, issues concerning Meaning Reification drove much
research in the hey day of Situation Semantics. The relation examplified in (4b) is
certainly a relation with relevance to the semantics of utterances of 'Jacky is biting Molly': it
relates events in which the speaker mouths a particular linguistic form while referring to
a Jacky and a Molly with an event the speaker is describing in which that Jacky bit that
Molly. Barwise and Perry view attunement - the awareness of similarities between
situations and of relationships that obtain between such similar situations - to the constraint
in (4) as being what underlies our competence to use and understand such utterances.
Nonetheless, there are two aspects which the formulation above abstracts away from:
contextual parameter instantiation and truth evaluation. (4) does not make explicit the
fact that understanding such an utterance involves finding appropriate referents for the
two NP sub-utterances, as indeed in certain circumstances - e.g. for an overhearer who
cannot see the speech participants or hears a recording - for the speaker and the time. In
fact, in the original formulation of the Relational Theory of Meaning Barwise and Perry
made a point of not packaging all of context in one event/situation, but distinguished
three components of context: (a) the discourse situation, comprising the public aspects
of an utterance (including all the standard indexical parameters), (b) the speaker
connections, comprising information pertaining to a speaker's referential intentions, and (c)
resource situations, events/situations distinct from the described situation, used to serve
as referential/quantificational domains. Although the discourse situation/speaker
connection dichotomy docs not seem to have survived - examples such as (2b) illustrate
the importance of speaker intention even with 'simple indexicals', the ultimate insight to
be drawn here, it seems, is the unbounded nature of contextual dependence. Resource
situations are one of the important contributions of situation semantics (see particularly
Cooper 1996), and are further discussed in article 35 (Ginzburg) Situation Semantics and
NL ontology.
Returning to (4), the formulation of the Relational Theory of Meaning as a
relation between contextual situations (the discourse situation, speaker connections, zero
or more resource situations) and described situations, is problematic. It means that the
latter cannot serve as the denotations of declarative utterances (since they are not truth
bearers), nor does it generalize to non-declarative meaning.This reflects the fact that in
Situations and Attitudes Barwise and Perry attempt to stick to an avowedly "concrete"
ontology, one which eschews abstract entities such as propositions, leading them into
various foundational problems.
This stance was abandoned soon after - various notions of propositions emerged as
situation theory developed. Hence, in works such as Gawron & Peters (1990), Cooper
& Poesio (1994), works from a maturer version of situation semantics, (declarative)
36. Situation Semantics: From indexicality to metacommunicative interaction 861_
(rank: 30; 5% of turns). Clarification Requests (CRs) constitute approximately 4-5%
of all utterances (see e.g. Purver 2004; Rodriguez & Schlangen 2004). Moreover, there is
suggestive evidence from artificial life simulation studies that the existence of CRs is not
an incidental feature of interaction but a key component in the long-term viability of a
language. Macura & Ginzburg (2006) and Macura (2007) show that when repair acts are
a part of a linguistic interaction system, a stable language can be maintained over
generations. Whereas, in a community endowed with a language that /acfoCRification,asI refer
to the interaction brought about by a CR, the emergent divergence among language
users is so high that the language eventually dies out. Ignoring metacommunicative
interaction, as has been the case for just about the entire tradition of formal semantics, means
missing out one of the basic building blocks of linguistic interaction. Situation Semantics
was itself complicit in this. However, the view of language it provides, with its reference
to speech events as part of the semantic domain, and the reification of meanings provides
important building blocks for a theory of metacommunicative interaction.
How then to integrate metacommunicative aspects into the semantic process? Such
phenomena have been studied extensively by psycholinguists and conversational
analysts in terms of notions such as grounding and feedback (in the sense of Clark 1996 and
Allwood 1995, respectively) and of repair (in the sense of Schegloff 1987). The main
claim that originates with Clark & Schaefer (1989) is that any dialogue move m{ made
by A must be grounded (viz acknowledged as understood) by the other conversational
participant B before it enters the common ground; failing this CRification must ensue.
While Clark and Schacfer's assumption about grounding is somewhat too strong, as
Allwood argues, it provides a starting point, indicating the need to interleave the
potential for grounding/CRification incrementally; the size of the increments being
an important empirical issue. From a semantic theory, we might expect the ability to
generate concrete predictions about forms/meanings of metacommunicative
interaction utterances in context. Such a characterization needs to cover both the range of
possibilities associated with successful communication (grounding), as well as with
imperfect communication - indeed it has been argued that /m'scommunication is the
more general case (see e.g. Healey 2008). Thus, we can suggest that the adequacy of
semantic theory involves the ability to characterize for any utterance type the update
that emerges in the aftermath of successful grounding and the full range of
possible CRs otherwise. This is, arguably, the early 21st century analogue of truth
conditions. The update component of this criterion builds on earlier adequacy criteria that
emerged from dynamic semantics' frameworks (see article 38 (Dckkcr) Dynamic
semantics). Nonetheless, these frameworks have abstracted away from metacom-
munication.
I now consider two general approaches that strive to develop semantic theories
capable of delivering grounding conditions/CRification potential. The first approach,
an extension of Discourse Representation Theory (DRT) (see article 37 (Kamp &
Reyle) Discourse Representation Theory), aims at explicating inter alia the potential for
acknowledgements and utterance-oriented presuppositions; the second approach,
constructed from the start as a theory of dialogue perse, shows how to characterize
CRification potential.
A crucial assumption both approaches bear in common, one that distinguishes them
from other dynamic semantic work (e.g. Roberts 1996; Groenendijk 1998; Dekker 2004;
Asher & Lascarides 2003), but one that seems inescapable if metacommunicative
36. Situation Semantics: From indexicality to metacommunicative interaction 863
the two utterances in (16a).The discourse entities eel and ce2 can serve as antecedents
both of implicit anaphoric references, e.g. in the case of 'backward' acts like answers to
questions, and of explicit ones. Consider (17): this may be viewed as performing at least
two functions here: implicitly accepting the option proposed in eel, and performing a
query. Indeed backward-looking acts - (for the backward/forward-looking dialogue act
dichotomy see Core & Allen 1997) such as accept are all implicitly anaphoric to a
previous conversational event (eel in this case), hence the assumption that conversational
events introduce discourse markers just like normal events do.
(17) a. A: We should send an engine to Avon. B: Shall we use engine E3?
b. [ccl,cc2,cc3|ccl:opcn-option(A,B,[x,w,c|cnginc(x),Avon(w),c:scnd(A,B,x,w)]),
ce2: accept(B,cel) ce3: ask(B,A,[y,e'| engine(y), E3(y),e ':use(A,B,y)])]
In fact, as mentioned earlier, Poesio and Traum develop their theory on the basis of a
strong and dynamicized version of Speech Events as Semantic Entities: an utterance is
taken to be a sequence of micro-conversational events (MCEs). On this view, the
discourse situation is updated not just when a complete sentence has been observed, but
whenever a new event is observed. Psychological research suggests that such updates
can take place every few milliseconds (Tanenhaus & Trueswell 1995), so that observing
the utterance of a phoneme is sufficient to cause an update; but in practice PTT typically
assumes that updates take place after every word. The incremental update hypothesis
is not just motivated by psychological findings about incremental interpretation in
sentential utterances, but by the fact that in dialogue many types of conversational acts
are hardly, if ever, performed with full sentences. A class of non-sentential utterances
that quite clearly lead to immediate updates of the discourse situation are those used to
perform dialogue control acts such as take-turn, keep-turn and release-turn actions
whose function is to synchronize the two participants in the conversation as to whom is
holding the floor (Traum & Hinkelmann 1992) and acknowledgements. These
conversational actions are sometimes performed by sentential utterances that also generate
a core speech act (e.g., the second utterance in (17a)), but more commonly they are
generated by single-word discourse markers like 'mmh','okay','well','now'.
In PTT, lexicon and grammar are formulated as defeasible rules characterizing the
update potential of locutionary acts. The motivation for defeasibility include psycholin-
guistic results about lexical access, e.g. work such as Onifcr & Swinncy (1981)
demonstrating that conversationalists simultaneously access all meanings of ambiguous words.
Lexical entries and syntactic rules link a precondition stated in terms of the
phonological/syntactic characteristics of a micro-conversational event and a possible effect stated
in terms of the possible meaning of that event. In particular, syntactic rules enable the
construction of compound locutionary events, whose atomic constituents are the MCEs
corresponding to utterances of individual words. Each locutionary act laf sets up the
potential for a subsequent illocutionary act il{ (one of whose) effects is to constitute an
acknowledgement of /a,.
This provides the basis for a treatment of grounding and dialogue control particles.
I illustrate this for 'okay' in its use as an acknowledgement particle; PTT assumes that
locutionary acts generate - here in a causal sense introduced by Goldman (1970) - core
speech acts.The lexical entry could be specified, roughly, as in (18), where u represents a
locutionary and ce an illocutionary act resepctively:
866
VIII. Theories of discourse semantics
lexemes (e.g.'yes','no'), through reprise fragments, can be analyzed as indexical
expressions relative to the DGB.
Context change is specified in terms of conversational rules, rules that specify the
effects applicable to a DGB that satisfies certain preconditions. This allows both illo-
cutionary effects to be modelled (preconditions for and effects of greeting, querying,
assertion, parting etc), interleaved with locutionary effects, our focus here. In the
immediate aftermath of the speech event u, Pending gets updated with a record of the
[sit = u
sit-typc =
(of type LocProp (locutionary proposition)). Here Tn is a grammatical
type that emerges during the process of parsing u, as already exemplified above in (6).The
relationship between u and Tn - describable in terms of the Austinian proposition (see
[sit = u
_ -
can be utilized in providing an analysis of grounding/CRification conditions:
(21) a. Grounding: pn is true: the utterance type fully classifies the utterance token,
b. CRification: Tn is weak (e.g. incomplete word recognition); u is incompletely
specified (e.g. incomplete contextual resolution).
Thus, pending utterances are the locus off of which to read grounding/CR conditions.
Without saying much more, wc can formulate a lexical entry for CR particles like
'eh?' (Purver 2004). Given a context that supplies speaker, addressee and a pending
utterance the content expressed is a question querying the intended content of the
utterance:
(22)
phon:(eh)
cat = interjection: syncat
spkr:IND
addr:IND
pending :utt
[c2: address(addr.spkr,pending)
cont = Ask(c-params.spkr,c-params.addr,
[Xjc Mean(c-params.addr,c-params.pending,x)):IllocProp
c-params:
(22) is straightforward apart from one point - what is the type utt. This actually is a
fundamental semantic issue, one which, as we will see, responds to the underdetermina-
tion of information by interpretation desideratum raised in section l:what is the semantic
type of pending? In other words, what information needs to be associated with pending
to enable the formulation of grounding conditions/CR potential? The requisite
information needs to be such that it enables the original speaker to interpret and recognize
the coherence of the range of possible clarification queries that the original addressee
might make.
872
VIII. Theories of discourse semantics
Taiienhaus, Michael & JohnTrueswell 1995. Sentence comprehension. In: J. Miller & P.Eimas(eds.).
Handbook of Perception and Cognition, vol. 11: Speech, Language and Communication. New
York: Academic Press, 217-262.
Traum, David & Elizabeth Hinkelmann 1992. Conversation acts in task-oriented spoken dialogue.
Computational Intelligence 8,575-599.
Jonathan Ginzburg, Paris (France)
37. Discourse Representation Theory
1. Introduction
2. DRT at work
3. Presupposition and binding
4. Binding in DRT
5. Lexicon and inference
6. Extensions
7. Direct reference and anchors
8. Coverage, extensions of the framework, implementations
9. References
Abstract
Discourse Representation Theory (DRT) originated from the desire to account for aspects
of linguistic meaning that have to do with the connections between sentences in a discourse
or text (as opposed to the meanings that individual sentences have in isolation). The
general framework it proposes is dynamic: the semantic contribution that a sentence makes to
a discourse or text is analysed as its contribution to the semantic representation - Discourse
Representation Structure or DRS - that has already been constructed for the sentences
preceding it. Interpretation is thus described as a transformation process which turns DRSs
into other (as a rule more informative) DRSs, and meaning is explicated in terms of the
canons that govern the construction of DRSs. DRT's emphasis on semantic
representations distinguishes it from other dynamic frameworks (such as the Dynamic Predicate
Logic and Dynamic Montague Grammar developed by Groenendijk and Stokhof, and
numerous variants of those). DRT is - both in its conception and in the details of its
implementation - a theory of semantic representation, or logical form.
The selection of topics for this survey reflects our view of what are the most important
contributions of DRT to natural language semantics (as opposed to philosophy or
artificial intelligence).
1. Introduction
1.1. Origins
The origins of Discourse Representation Theory (DRT) had to do with the semantic
connection between adjacent sentences in discourse. Starting point was the analysis of tense,
Maienbom, von Heusinger and Portner (eds.) 2011, Semantics (HSK 33.1), de Gruyter, 872-923
37. Discourse Representation Theory 873
and more specifically the question how to define the different roles of Imperfect (Imp)
and Simple Past (PS) in French.The semantic effects these tenses produce are often
visible through the links they establish between the sentences in which they occur and the
sentences preceding them.
A telling example, which has become something of a prototype for the sentence
linking role of tenses, is the following. (1) is the original example, in French; (2), its
translation into English, establishes the same point.
(1) Quand Alain ouvrit (PS) les yeux, il vit (PS) sa femme qui £tait (Imp) debout pres
de son lit.
a.Ellcluitsourit.(PS)
b. Elle lui souriait. (Imp)
(2) When Alain opened his eyes he saw his wife who was standing by his bed.
a. She smiled.
b. She was smiling.
The difference between (la) and (lb) is striking:The PS-sentence in (la) is understood
as describing the reaction of Alain's wife to his waking up, the Imp-sentence in (lb) as
describing a state of affairs that already holds at the time when Alain opens his eyes and
sees her: the very first thing he sees is his smiling wife.
In the late seventies the study of the tenses in French and other languages led to the
conviction that their discourse linking properties are an essential aspect of their meaning
and an effort got under way to formulate interpretation rules for different tenses that
make their linking roles explicit. In 1980 came the awareness that the mechanisms which
account for the inter-sentential connections that are established by tenses can also be
invoked to explain the inter- and intra-scntcntial links between pronouns and their
anaphoric antecedents. (An analogy in the spirit of Partee 1973, but within the realm
of anaphora rather than deixis.) DRT was the result of working out the details for a
small fragment dealing with sentence-internal and -external anaphora. This first
fragment (Kamp 1981a) dealt only with pronominal anaphora, but a treatment of temporal
anaphora, which offered an analysis of, among others, the anaphoric properties of PS and
Imp, followed in the same year (Kamp 1981b). The theory presented in Kamp (1981a)
proved to be equivalent to the independently developed File Change Semantics of Hcim,
which became available to a general audience at roughly the same time (Heim 1982).
However, DRT and FCS were inspired by different intentions from the start, a difference
that became more pronounced with the advent of Groenendijk and Stokhofs Dynamic
Semantics (Groenendijk & Stokhof 1991; Groenendijk & Stokhof 1990). Dynamic
Semantics in the spirit of Groenendijk and Stokhof followed the lead of FCS, not DRT
(Barwise & Perry 1983;Rooth 1987).
From the very beginning one of the strong motivations of DRT was the desire to
capture certain features of the way in which interpretations of sentences, texts and
discourses are represented in the mind of the interpreter, including features that cannot be
recaptured from the truth conditions that the chosen interpretation determines.This
representational aspect of DRT was at first seen by some as a draw-back, viz. as an
unwelcome deviation from the emphatically anti-psychologistic methods and philosophy of
Montague Grammar (Montague 1973; Montague 1970a; Groenendijk & Stokhof 1990);
874
VIII. Theories of discourse semantics
but with time this resistance appears to have lessened, largely because of the growing
trend to see linguistics as a branch of cognitive science. (How good the representations of
DRT are from a cognitive perspective, i.e. how much they tell us about the way in which
humans represent information - or at least how they represent the information that is
conveyed to them through language - is another matter, and one about which the last
word has not been said.)
In the course of the 1980s the scope of DRT was extended to the full paradigm of tense
forms in French and English, as well as to a range of temporal adverbials, to anaphoric
plural pronouns and other plural NPs, the representation of propositional attitudes and
attitude reports, i.e. sentences and bits of text that describe the propositional attitudes
of an agent or agents (Kamp 1990; Ashcr 1986; Kamp 2003), and to ellipsis (Ashcr 1993;
Lerner & Pinkal 1995;Hardt 1992.) The nineties saw,besides extension and consolidation
of the applications mentioned, a theory of lexical meaning compatible with the general
principles of DRT (Kamp & RoBdeutscher 1992; Kamp & RoBdeutscher 1994a), and an
account of presupposition (van der Sandt 1992; Beaver 1992; Beaver 1997; Beaver 2004;
Geurts 1994; Geurts 1999; Geurts & van der Sandt 1999; Kamp 2001a; Kamp 2001b; van
Genabith, Kamp & Reyle 2010). The nineties also saw the beginnings of two
important extensions of DRT that have become theories in their own right and with their
own names, U(nderspecified) DRT (Reyle 1993) and S(egmented) DRT (Asher 1993;
Lascarides & Asher 1993; Asher & Lascarides 2003). SDRT would require a chapter on
its own and wc will only say a very few words about it here; UDRT will be discussed (all
too briefly) in Section 8.2. In the first decade of the present century DRT was extended
to cover focus-background structure (Kamp 2004; Riester 2008; Riester & Kamp 2010)
and the treatment of various types of indefinites (Bende-Farkas & Kamp 2001;Farkas&
de Swart 2003).
2. DRT at work
In this section we show in some detail how DRT deals with one of the examples that
motivated its development.
2.1. Tense in texts
As noted in Section 1, the starting point for DRT was an attempt in the late seventies
to come to grips with certain problems in the theory of tense and aspect. In the sixties
and early seventies formal research into the ways in which natural languages express
temporal information had been dominated by temporal logics of the kind that had been
developed from the fifties onwards, starting with the work of Prior and others (Prior 1967;
Kamp 1968; Vlach 1973). It became increasingly clear, however, that there were aspects
to the way in which temporal information is handled in natural languages which neither
the original Priorcan logics nor later extensions of them could handle.
One of the challenges that tenses present to semantic theory is to determine how they
combine temporal and aspectual information and how those two kinds of information
interact in the links that tenses establish between their own sentences and the ones
preceding them. (1) and (2) are striking examples of this challenge. Here we will look at a
pair of slightly simplified discourses which illustrate the same point.
37. Discourse Representation Theory
(3) Alain woke up.
a. His wife smiled.
b. His wife was smiling.
We will assume that (3a) and (3b) are related to the first sentence of (3) in the same way
as the second sentences of (1) and (2) are related to the first sentences there: in the case
of (3b) Alain's wife was smiling when Alain opened his eyes, in (3a) she smiled as a
reaction to that.These respective interpretations may not be as compelling as they are in the
case of (2) or (1), but they are there and it is these readings for which we are now going
to construct semantic representations.
We assume that the first sentence has the syntactic structure given in (4).
(4)
Note that this structure makes the assumption, familiar from syntactic theories based
on the work of Chomsky, that the interpretation provided by tense is located at a
node T high up in the sentence tree and above the one containing the verb. We will
see presently what implications this has for the construction of a semantic
representation. We assume that the construction is bottom up (unlike in the first explicit
formulations of DRT (Kamp 1981a; Kamp & Reyle 1993) where it is top down, and which
for many years were treated as a kind of standard in DRT). Before we describe the
construction procedure, we show the resulting DRS in (5), so that the reader will have
an idea of what we are working towards. We will then describe the procedure in more
detail.
(5)
t<n
Alain (x)
ect
e:wake_up'(x)
876
VIII. Theories of discourse semantics
Formally DRSs are pairs <U,Con> consisting of (i) a set U of discourse referents, and
(ii) a set Con of conditions. (5) exemplifies the graphical convention in DRT to
represent DRSs as 2-dimensional structures, with the universe displayed at the top and the
condition set below it. We will keep to that convention throughout this article. Discourse
referents play the role of representatives of entities.They behave much like the variables
of predicate logic, but not quite. In fact, the behaviour of some discourse referents
resembles more closely that of individual constants - details will become clear as we go along.
In general, discourse referents come with certain sortal restrictions built into them.
For instance, (5) has discourse referents of three different sorts - x stands for an
individual (by individual we understand any entity that can be the referent of a definite noun
phrase), c stands for an event and t for a time. DRS-conditions arc in essence formulas
of predicate logic, built from predicates and discourse referents. The discourse referents
play the part of argument terms.The predicates are either translations of predicate words
from the represented natural language - such as e.g. the verb wake up in (5) or the proper
name Alain - into the DRS representation formalism, or else they are primes of the
representation formalism, to which no simple word of the represented natural language has
a privileged connection. Examples of such primes are the relational predicates < and c.
(The condition t < n means that the point or interval of time t precedes the utterance
time n.The condition e c t means that the event e is temporally included in the time
point or interval t.)
Wakc_up' is the predicate corresponding to the English verb wake up. The discourse
referent e stands for the waking up event that is described by the sentence.The notation
e:wake_up'(x) is to be read as"e is an event of the type x waking up". We could also have
written wake_up'(e,x) - a notation one often sees elsewhere - but stick with the
notation using ":" which was originally introduced to highlight the asymmetry between the
referential argument e of wake up and its non-referential argument x.
A brief explanation may be needed here of the distinction between referential and
non-referential arguments. We assume with Williams (1977) that all words which function
semantically as predicates (names, prepositions, adjectives and adverbs) have one
referential argument and in addition one or more non-referential arguments.The referential
argument of a verb is always the event or state the verb is used to describe. (Whether
this argument is an event or a state is determined by the lexical properties of the verb in
question.) Verbs also always have at least one non-referential argument, which is realised
as the grammatical subject when the verb is used in the active voice. Intransitive verbs
have just this one non-referential argument, simple transitive verbs have two and so
on. Most nouns, adjectives and adverbs just have a referential argument but no non-
referential arguments. But there are also relational nouns, adjectives and adverbs, which
have non-referential arguments as well. An example is the noun wife. In a DP such as
Alain's wife the non-referential argument is the referent of the embedded DP Alain.
The referential argument is not represented by a separate phrase but introduced by the
predicate word wife itself; it becomes the referent of the complete phrase of which the
noun wife is the lexical head - here the DP Alain's wife.
The non-relational noun woman only has a referential argument, which for instance
can be expressed by a containing DP like the woman. Wc write woman(y) and wifc(y,z)
rather than y:woman or y:wife-of(z), which might have been expected given what has
just been said in connection with verbs.This arguably is a slight inconsistency of notation,
but it is harmless and it has become standard practice within DRT, so we will stick to it
37. Discourse Representation Theory 877
here. Proper names like Alain are also treated as predicates, with a built-in referential
uniqueness. For instance, Alain(x) means that the discourse referent x represents the
individual referred to (in the represented utterance) by the name Alain.
This completes the informal description of the different parts of the DRS in (5). A
formally precise description is provided by the model-theoretic semantics for DRSs.
From a logical point of view DRSs are the formulas of DRT's semantic representation
languages. These languages come - like other logical formalisms, such as the predicate
calculus - with a syntax which defines the possible forms of expressions (here: the well-
formed DRSs and DRS-conditions), and with a model theory which describes for each
of the well-formed expressions its denotation in each of the models that it specifies for
the given DRS-language. We won't go into the formal definition of the models for DRS
languages that include DRSs like that in (5), but refer the reader to the literature (Kamp
& Reyle 1993 or van Genabith, Kamp & Reyle 2010).
Since the DRSs of our formalism involve discourse referents for entities of various
sorts, models must have a fairly complicated structure (much more so than the models for
standard first order logic): they must have a time structure as well as temporal relations
between times and eventualities (eventuality is used as a cover term for both events and
states). For any model M the denotation of (5) in M will be the set of all functions of the
universe of (5) into the universe of M such that: (i) f(t) is a time of M (i.e. an interval or
point of the time structure of M), (ii) f(e) is an event of M, (iii) f(x) is an individual of M,
(iv) the conditions of (5) are all satisfied in M by the f-values of their arguments: (a) f(t)
temporally precedes the utterance time of the first sentence of (3); (b) f(e) is temporally
included in f(t); (c) f(x) is Alain (i.e. the person referred to by the speaker in using Alain
on the given occasion; it is assumed that Alain is one of the individuals of M); (d) f(e) is
an event of f(x) waking up. Such functions, from the universe of a DRS into the universe
of a model, arc usually called embedding functions or, simply, embeddings. If an
embedding function verifies all the conditions of the DRS in the model - in the case of (5) this
means that it satisfies the requirements (a) - (d) - then it is called a verifying embedding.
With this definition of the denotation of (5) in a model comes a definition of truth: (5)
is true in M if there exists a verifying embedding of (5) in M,i.e. if the denotation of (5) in
M is non-empty. Note the existential form of this definition: (5) is true in M if there exists
a way of associating entities from M with the universe of (5) such that the conditions of
(5) are satisfied in M.
To construct (5) from (4) we proceed bottom up, associating semantic representations
with the nodes of (4) as we traverse the tree from its leaves to its root, working our way
up, roughly speaking, from bottom right to top left. We start by replacing the given
occurrence of the verb, wake up, with the appropriate instantiation of its semantic
representation.This representation is provided by the lexical entry for the verb, which we assume is
given in the following form.
(6) wake up verb nom
e x
Selectional Restrictions: event animal
Semantic Representation: e:wake_up'(x)
This entry says that wake up is a verb, that its referential argument is an event (see
the Selection Restrictions), that wake up has one non-referential argument, which must
VIII. Theories of discourse semantics
The one remaining task is the binding of t. In the present case, where the S node we are
dealing with is the S node of a discourse-initial main clause, binding of the discourse
referent t is existential, which once again amounts to transferring the discourse referent
from its store to the universe of the DRS adjacent to it. The resulting representation is
the one that was already shown as (5).
We now turn to the two second sentences (a) and (b) of (3). This is where the special
features of DRT, which concern the semantic relations between successive sentences
in a discourse, come into prominence. We start with sentence (3b). To avoid
unhelpful complications we treat the form be smiling as a single verb, which serves to
describe states: the condition s: be_smiling'(x) means that s is a state to the effect that x
is smiling.
Construction of the semantic representation of (3b) proceeds in much the same way
as that of the first sentence of (3). The only differences have to do with the temporal
location of the state s, with the representation of the subject phrase his wife and with the
binding of the temporally locating discourse referent t'.The first difference manifests
itself in the transition from VP to TP. We assume that states are located at times in the
sense that the time is one at or during which the state holds; that is, the location
relation between t' and s is t' c s (rather than the converse relation which was assumed for
events). So we get as representation at this construction stage:
(10)
The representation of the subject phrase his wife differs from that of Alain in several
respects. First, the discourse referent introduced as referential argument of wife' must be
a fresh discourse referent (i.e. one that is distinct from the discourse referents that have
been previously introduced into the representation of the discourse) so that no clashes
can occur between the referential roles they are meant to play. In particular, the new
882
VIII. Theories of discourse semantics
(13)
t' s x' u
wife'(x', u)
t'<n t'cs t' = t u =
s:be_smiling'(x')
The identification of t' with t also amounts to a kind of anaphora resolution. In fact, past
tenses in non-initial sentences of a discourse tend to be anaphoric in this sort of way.
It is this anaphoric dimension which enables them to contribute to the temporal
connectedness of texts and discourses, something that the present examples are meant to
illustrate.
Note that (13) is not a proper DRS in the sense that some of the discourse referents
occurring in its conditions (viz. x and t) do not occur in its universe. Such improper DRSs
do not have well-defined denotations in the sense we specified earlier: embeddings of
their universes do not determine the satisfaction of all of their conditions.The improper-
ness of (13) disappears, however, when it is merged with the representation of the first
sentence (by forming the union of their universes and the union of their condition sets).
The result of the merge is given in (14).
(14)
t e x t' s x' u
Alain (x)
t <
: t
e:wake_up' (x)
wife'(x',u)
t' < n t' c s t' = t u = x
s:be_smiling'(x')
(14) is the joint representation of the two sentences of (3b). It is a proper DRS and thus
has a denotation and a truth value in every model.
The representation construction for (3a) is in most respects like that for (3b). The
only differences have to do with the fact that (3a) functions as an event description, and
not, like (3b), as a state description.This difference shows up, first, in that the verb smile
introduces an event discourse referent e' (and not a state discourse referent) and, second,
in that the temporal relation established between e' and e is different from that between
s and e in the case of (3b).
Intuitively, wc saw, the relationship between c' and c is that c' follows c. But how
does this relationship get established? Although this has been the subject of discussion
for several decades, it continues to be a topic for ongoing research and debate. But this
much is clear: How successions of event sentences are interpreted depends very much on
the rhetorical relations between them, and these vary. For instance, the event described
inthesecondoftwoconnectedsentencesinadiscourseissometimesunderstoodasareacfioft
to the event described in the first sentence - the natural interpretation in the case of (3a) -
but in other cases it can also be understood as the cause of that event - the most salient
interpretation for John fell. Bill pushed him; or it can be understood as an elaboration
37. Discourse Representation Theory
of the event of the first sentence - as in John went to see his mother. He took the bus.
(And these are just some of the various possibilities.) The interactions between
rhetorical relations and temporal relations have been a central issue in SDRT (Segmented
Discourse Representation Theory; see Lascarides & Asher 1993; Asher & Lascarides
2003), and in fact, they were an important impetus to the development of that approach.
They are a topic that we can do no more than mention here; the reader is referred
to the just mentioned publications and to further work cited in Asher & Lascarides
(2003).
The rhetorical relation between (3a) and the first sentence of (3) is called
Narration in SDRT (as well as in some other accounts of discourse relations, cf. Mann &
Thompson 1988, as well as article 39 (Zeevat) Rhetorical relations). The Narration
relation between two successive sentences is one of the relations which entail that the
event of the second sentence temporally follows that of the first. We implement the
implications of this for DRS construction by assuming that the same relation holds
between the location times t and t' of these events and use for this the condition t < t'.
This time the joint representation of the two sentences together has the form given
in (15).
(15)
t e x t' e' x' u
Alain (x)
t <n e c t
e:wake_up'(x)
wife'(x',u)
|t'< n e'et' t<t' u =
e':smile'(x')
We have dealt with these two examples at a level of detail that some readers may have
thought excessive, and inappropriate for a handbook. We have done so for two reasons.
First, the DRS-construction principles we have discussed reflect some of the
fundamental assumptions of DRT about the nature of verbal and nominal predication.These
are intimately connected with the principles of sentence-external and -internal discourse
connection for which DRT has long been primarily known and to which much that will
be said in this article will be devoted. Without the details wc have gone into in our
discussion of (3) it would have been much more difficult to explain these more specifically
dynamic aspects of DRT.
Through the work of van der Sandt and Geurts starting in the late eighties (van der
Sandt 1992; van der Sandt & Geurts 1991; Geurts 1994; Geurts 1999), the basic
architecture of DRT underwent a fundamental change. One part of that change is illustrated by
the two-step procedure wc have just used in dealing with the second sentences of (3): first
a preliminary representation was constructed solely on the basis of the sentence itself,
and then the issues that this construction had left unsettled were resolved on the basis
of the context representation provided by the antecedent part of the discourse. (In our
example this was just the representation of the first sentence of (3).)
884
VIII. Theories of discourse semantics
What is still missing, though, is a precise articulation of the steps that provide the links
between the two sentence representations. A more detailed proposal for how such links
are established will be discussed below in Section 3.
2.2. Donkey sentences and complex DRS conditions
Before we get to that discussion we will first have a look at examples, given in (16) and
(17), of the two types of donkey sentences which were used in Geach (1962) to illustrate
the puzzle known as the donkey sentence problem or the donkey pronoun problem. The
early development of DRT, and its initial reception, were closely connected with this
problem.
(16) If Alain owns a donkey he beats it.
(17) Every farmer who owns a donkey beats it.
In both these sentences the pronoun it is understood as referring back to the indefinite
DP a donkey. The methods that are suggested by classical predicate logic for constructing
logical forms for these sentences, in which pronouns are treated as bound variables and
indefinite DPs as existential quantifiers, have trouble with such sentences - their logical
forms come out either as ill-formed or they give the wrong truth conditions. The two
apparent options for (16) are shown in (18) and (19).
(18) By(donkey'(y) a own'(a,y)) -» beat'(a,y)
(19) By(donkey'(y) a (own'(a,y) —► beat'(a,y))
In (18) the final occurrence of y is not bound, so we have an open formula, which doesn't
determine any definite truth conditions.This problem does not arise for (19),but the truth
conditions of this formula are clearly not those of (16).The solution of DRT (Heim 1982;
Kamp 1981a) was based on a conceptual parallel between the sentences in (16) and (17)
and a two sentence discourse like (20).
(20) Alain owns a donkey. He beats it.
In (20) the pronouns he and it of the second sentence are anaphoric to Alain and a
donkey in the first sentence in the same way that his is anaphoric to Alain in (1) and
(3); and a DRT-based interpretation of (20) will establish these anaphoric links in the
same way in which we assumed the link was established between his and Alain in the
treatment we have presented of (3b).The point about (16) and (17) is that here too the
same linking mechanism is involved. Let us focus first on (16).This sentence has the form
of a conditional "if A then B".The DRT conception of a conditional is this: the
antecedent A functions as the description of a certain type of situation, and the conditional as
a whole has the function to express that if a situation is of this type then it is also of
the type described by the consequent B. And since it is about situations of the type
described by A that the claim is made that they also satisfy the description given by B, it is
37. Discourse Representation Theory 887
model M and X is a set of discourse referents, g 2t f means that Dom(g) = Dom(f) u X.
Moreover, when K is a DRS, then g ^K f is short for g 3t f, where X = U^.
Definition 2 (Verification of complex DRS conditions)
Let f be a function from discourse referents to elements of the model M, and let K,
and K2 be DRSs.
(a) f verifies -iK, in M if there is no extension g 3k, f which verifies all the conditions of
K, in M.
(b) f verifies K, => K2 in M if every extension g ^k, f which verifies all the conditions of
K, in M has an extension h 3^ which verifies all the conditions of K2 in M.
(c) f verifies K, v K2 in M if either there is an extension gl ^k, f which verifies all the
conditions of K, in M or there is an extension g2 3k2S which verifies all the
conditions of K2 in M.
/K
(d) [verifies K, / \K2in JWif every extension g2i/f, where U = Uk, u{a},such that
g verifies all the conditions of K{ in M has an extension h 3k2 g which verifies all the
conditions of K2 in M.
DRS languages which include the operators mentioned in Def. 1 (with the semantics
given for them in Def. 2) have at least the expressive power of first order predicate logic.
As in standard formulations of first order logic the set of operators is redundant - some
operators in the set can be expressed with the help of others. But for the given set of DRS
operators the redundancy is more extreme. Since existential quantification and
conjunction are built into the structure of DRSs, just adding -i already suffices to express all the
remaining classical operators.
It should be stressed, however, that the operators of extensional logic cover only a
small part of the operator-like constructions that are found in natural languages and that
play a crucial part in making natural languages the powerful and flexible instruments of
expression and communication they are. In order to cover more of these operations DRS
languages have been extended repeatedly as time went on. Some such extensions will be
discussed in Section 6.
We return to the discussion of the sentences in (16) and (17). The point of departure
for that discussion was the analogy between those sentences and discourses like (20):
The discourse context that the first sentence (20) provides for the second sentence of
(20) is provided in the conditional sentence (16) by the conditional's antecedent for the
conditional's consequent. A similar analogy applies to (17), where it is the restrictor of
the quantifier that provides a discourse context for the nuclear scope.The DRS for (17)
is given in (22).
37. Discourse Representation Theory 889
fact, there were two new perspectives, each of them focussed more on the structure and
use of natural language than on the properties of logical calculi. The first, best known
through the writings of Stalnaker (Stalnaker 1972; Stalnaker 1974; Stalnaker 1979), is
the view that presupposition is a pragmatic phenomenon, in the following sense. When
we use language to communicate we always assume that there is much information we
already share with those we are trying to communicate with; without such a supporting
Common Ground it would be virtually impossible to convey any information concisely
and clearly. So it is a natural and ubiquitous feature of verbal communication that
much information is presupposed without which we couldn't express ourselves as
concisely as we want to. The second perspective - sometimes somewhat unfortunately
referred to as "semantic" - emphasises the conventional aspect of presupposition:
Certain words and grammatical constructions come with presuppositions in the sense
that no utterance in which these words or constructions occur can be considered
felicitous unless the context in which the utterance is made verifies the presuppositions
they introduce. A speaker who uses such words or constructions cannot help but make
the corresponding presuppositions, the Common Ground of which she assumes that
it obtains between her and her interlocutors must verify those presuppositions.
Historically this second view, according to which linguistic presuppositions are conventionally
associated with their triggers, is primarily connected with the names of Langendoen and
Savin, Kiparsky, and Karttunen (Langendoen & Savin 1971; Kiparsky & Kiparski 1970;
Karttuncn 1973; Karttunen 1974). (Sec also article 91 (Beaver & Gcurts)
Presupposition.) It is this second perspective that motivates the treatment of presupposition
within DRT.
With the semantic perspective came the awareness that presupposition is much
more widespread than had previously been assumed. Presuppositions are not only
triggered by definite descriptions and other singular terms, but also by many other words
and constructions. Indeed, when investigations inspired by this new awareness got under
way, a significant part of the work consisted in identifying the presupposition triggers
that can be found in English and other natural languages; and even today that search is
far from over.
The central challenge that this way of looking at presupposition was soon seen to
present for the theoretical linguist, however, was the Projection Problem (Karttunen
1973). Karttunen noted that presuppositions carried by simple sentences often disappear
when these sentences arc embedded within more complex ones. Our intuitions about the
truth and felicity conditions of simple sentences give us reliable information about which
words and constructions are presupposition triggers, and about what presuppositions
they trigger;but sometimes, when the triggers occur in logically complex sentences, these
very presuppositions seem to be absent nonetheless. To give an example: the word again
carries the presupposition that an event or state of the kind described by the clause
containing it happened or obtained at some time preceding the event or state that is being
described. Thus compare the utterances (23a) and (23b).
(23) a. He will come.
b. He will come again.
c. He won't come again.
d. Will he come again?
892
VIII. Theories of discourse semantics
pronoun is different in form from the presupposition imposed by a word like again, but
that is as far as the difference goes. In van Genabith, Kamp & Reyle (to appear) a
distinction is made between anaphoric and non-anaphoric presuppositions. Anaphoric
presuppositions always involve anaphoric discourse referents and impose on the context
the constraint that it supplies discourse referents which can serve as antecedents for
those. The anaphoric discourse referent is then interpreted as standing to its antecedent
in a certain relation; often this relation is coreference, but not always (not e.g. in cases of
temporal anaphora). Non-anaphoric presuppositions are presuppositions in the sense of
presupposition theory "before DRT" and act as entailment requirements: they express
presuppositions that the context is required to entail. (As a matter of fact, both these
types of presuppositions can be seen as special cases of a more general form of anaphora
cum presupposition: a constraint to the effect that the context provides one or more
discourse referents for which it entails certain properties and/or relations; arguably van der
Sandt's theory can be interpreted as advocating this generalised notion.)
Here we follow the version of presupposition theory presented in van Genabith,
Kamp & Reyle (to appear). In this version anaphoric pronouns are treated as triggers
of presuppositions of the form: <x?,Cl(x),..,Cn(x)>, where x is the discourse referent
representing the pronoun and Cl,..,Cn are conditions connected with the pronoun in
question (e.g. a condition to the effect that the referent must be a person of the female
sex) and as before the question mark behind the discourse referent x serves as indicator
that the context must provide an antecedent discourse referent with which x is to be
identified. With this new treatment of anaphoric pronouns the representation of (23b)
will no longer require a store. (However, we will see below that stores are still needed.)
Instead the two pronoun occurrences now each contribute a presupposition. The new
representation, which replaces (25b), is given in (26).
(26)
yesterday (t)
x'
person(x)
male(x)
}>.
t,e,
t,< n
e,ct,
t,ct
e:come'(x)
H
person (x')
male(x')
e':come'(x')
t,e,
n < t,
e2ct,
e2:come'(x')
Resolution of the first pronoun presupposition is possible only if a suitable discourse
referent is provided by the utterance context. In such a context the second pronoun
presupposition can then also be resolved to that discourse referent. This will lead to
37. Discourse Representation Theory 893
identification of x' with x. And once this identification has been made, the antecedent
DRS will entail the agam-presupposition.
(26) is a typical example of a preliminary sentence representation, in which all
anaphoric and presuppositional constraints that parts of the sentence impose on their
respective contexts have been explicitly represented in the form of anaphoric and non-
anaphoric presuppositions.
One of the unsolved issues in presupposition theory is accommodation. It often
happens that a speaker utters a sentence which carries a presupposition that is neither
entailed by some local context provided by the sentence itself nor by what the
interpreter has reason to assume is the global context (or Common Ground; we do not make
a distinction in this article between global context and Common Ground). In such cases
the interpreter will normally accommodate the presupposition by adjusting the global
context so that it satisfies the constraint that the presupposition imposes. Apparently,
thus the implicit reasoning behind accommodation, the speaker is assuming a global
context distinct from the one I was assuming myself, for otherwise she would not have used
the sentence she did use (Beaver 1997; Beaver 2002). (A problem arises in those cases
where the necessary adjustments conflict with some of the interpreter's beliefs: should
he give these up or should he rather dismiss the speaker's utterance as infelicitous? How
recipients resolve conflicts of this sort can depend on all sorts of factors.The
presupposition theory of DRT has nothing to say about this.)
It should be stressed that accommodation of a presupposition need not take the
form of simply adopting the presupposition itself as a new context constituent. Often
the recepient will assume that the Common Ground contains additional information
that exceeds what is needed to entail a presupposition, simply because that is the more
plausible adjustment. And on the other hand it is often the case that the accommodated
information need not entail the presupposition on its own but only in combination with
information that is already present, say in the local context where the presupposition
can be resolved. In such cases the accommodated information may be weaker than the
presupposition (or the two may be logically independent). The first of these points has
been emphasised by Beaver and the second in Kamp & RoBdeutscher (1994a), which
proposes the term presupposition justification to cover all cases of (a) verification of
the presupposition by the context as is, (b) wholesale accommodation and (c)
satisfaction by a combination of old and new (that is, accommodated) contextual information.
A further intriguing feature of presupposition justification is that the given discourse
context often points to a particular accommodation that will guarantee presupposition
verification; and the force of this pointing may be so powerful that it confers upon the
indicated accommodation the apparent status of a valid inference (Kamp 2001b). One
famous illustration of this last effect is an example due to Kripke (Kripke 2009), which
once more involves the presupposition trigger again. Compare the sentences in (27).
(27) a. We won't have pizza on John's birthday if we are going to have pizza on Mary's
birthday.
b. We won't have pizza again on John's birthday if we are going to have pizza on
Mary's birthday.
c. We already had pizza on Billic's birthday last week. So we won't have pizza
again on John's birthday if we are (also) going to have pizza on Mary's birthday.
37. Discourse Representation Theory 897
location times can be bound in each of the three ways listed above for nominal discourse
referents. In our examples so far we have only encountered structural binding of
location times. But instances of the other two types are also found. Quantificational binding
of location times occurs in the presence of temporal adverbs like always and mostly.
These adverbs give rise to temporal duplex conditions and introduce location time
discourse referents that are bound by the quantifiers which the adverbs introduce (see
Reyle, Rossdeutscher & Kamp 2008 for details).
There are also cases where location times are bound presuppositionally. In (29a) the
temporal relative clause ten minutes after Bill called her serves to locate the event
described by the main clause Mary left the house. The relative clause introduces a location
time for the calling event which serves to locate the event described by the main clause.
(29) a. Mary left the house ten minutes after Bill called her.
b. Mary didn't leave the house ten minutes after Bill called her.
That both the Bill called event and its location time have a presuppositional status is
shown by (29b), the negation of (29a). The most prominent interpretation of this
sentence is one according to which Bill did call Mary at some time t and that ten minutes
after t Mary did not leave the house. (An interpretation according to which the time at
which Mary left the house wasn't ten minutes after the time when Bill called because
Bill didn't call seems possible only in quite special contexts.) This indicates that both the
event discourse referent of the relative clause and the discourse referent for its location
time are bound presuppositionally.
5. Lexicon and inference
The meanings of the sentences we utter and the texts we write are a function of the
meanings of the words from which they are built. So any account of linguistic meaning
must start with the meanings of words. And that entails that any semantic theory must
include a lexicon. In fact, the lexicon is arguably the most demanding part of semantics,
on the one hand because languages have so many words, and on the other because
capturing the meaning of even a single word - or crafting a satisfactory semantic entry for
it - can be very hard.
How hard will depend on what purpose the entry is supposed to serve. But we have
already hinted at what that purpose should be: the semantics of words that is specified
by their entries must be such that the meanings of sentences and texts can be computed
from these entries in a systematic, compositional fashion. From the start this has been
the guiding principle for lexicon development within DRT (see in particular Kamp &
RoBdeutscher 1994a). And here it takes the following form: The semantic specifications
given in the lexical entries of words should be such that they can be imported into the
semantic representations of the sentences and texts that can be constructed out of those
words, and these representations should capture the meanings of the represented texts
and sentences in the specific sense that they support the inferences that human
interpreters arc able to draw from those texts and sentences.
In DRT the problem of semantic lexical specification is thus tied directly to the
problem of inference: Suppose that T is a text and that S is a sentence which an interpreter
898
VIII. Theories of discourse semantics
would naturally infer (or be able to infer) from T.Then the DRS for T should entail the
DRS for S and our theory should allow us to show this. And whether the theory can do
that will in almost all cases depend on what the lexical entries for the words that occur in
T and S contribute to their representations.
Although the development of a DRT-based lexicon has been a concern for many
years, there still is a long way to go. In particular, current DRT-based lexicons lack the
broad coverage that is indispensable in automated natural language processing. Even
so, existing results exceed by a wide margin what could be reported within the limits of
this article. We have therefore decided to focus on some of the methodological
problems of semantic lexicon design as they present themselves within DRT, instead of
presenting an inevitably fragmentary overview of the proposals for the entries of particular
words and word classes. (For more on lexical semantics within a DRT-based framework
see Kamp & RoBdeutscher 1994a; Kamp & RoBdeutscher 1994b; RoBdeutscher 1994;
RoBdeutscher 2000; Solstad 2007, and for a more recent perspective Lechler &
RoBdeutscher 2009; Hamm & Kamp 2009.)
Our first encounter with the lexicon was in Section 2 when we looked at the DRS
construction for the sentences in (3). DRS construction always starts with the insertion
of semantic representations for the words of the sentence for which a DRS is being built.
This presupposes that the lexical entries for those words provide the representations that
are suited for this task. For verbs, we saw, this minimally involves two requirements: the
entry must (i) provide a semantic representation with slots for the verb's arguments and
(ii) contain information that is needed by the syntactic parser (which provides the input
trees to the DRS construction algorithm) to link these slots to argument phrases in the
sentence.
But these are just the minimal requirements. In addition we would also like lexical
entries to provide semantic information that will make it possible to deduce from
DRSs for sentences or bits of discourse those conclusions that arc readily available to a
human interpreter. On this point the one entry we have so far presented - that for the
verb wake up, given in (6) and repeated here as (30) - seems to score pretty close to
the bottom of the scale: To simply use a DRS predicate wake_up' as semantics for the
verb wake up doesn't tell us very much about what wake up means, or what inferences
it supports.
(30) wake up (verb) nom
e x
Sel.Restr.: event animal
Sem.Repr.: |e:wake_up'(x)|
Of course there is little that can be predicted about the inferential power of an entire
system from just one entry. Most of the inferences that we draw from what we hear or
read have to do with the semantic connections between words and not just with the
meanings of single words. To take one, very simple example: from the information that
Alain woke up at t we can infer that he was awake immediately after t and that he was
asleep or unconscious immediately before. It is conceivable that even with the simplistic
entry in (30) these inferences could be computed as long as enough is said, in some part
of the overall theory, about the connections between the predicate wake_up' and the
902
VIII. Theories of discourse semantics
The second principle has to do with the punctual character of the time specification
10 o'clock. Intuitively such a time t cannot extend beyond the duration of an event
c happening at that timc.This entails that if c happens at t and s is a state that right-abuts
e, then s must include some initial segment of the period that stretches from t into the
future. Using "punct" to represent the notion of punctuality that is conveyed by
expressions like 10 o'clock, we can formalise this principle as in (38).
With the help of (37) and (38), (36) can be modified into (39), from which (34) can be
obtained by renaming discourse referents and thinning (i.e. throwing away discourse
referents and conditions).
(39)
t<n
t =*=t'
t t' e x s t"
lo-o-clock(f) let' e c t
res(s.e) e dc s
t" c s Alain(x) s:awake'(x)
To infer the DRS for (33c) from that of (33a) a little more is involved. We now need in
addition: (i) entries for asleep and unconscious (which are like (34)) and the meaning
postulate (32): (ii) a general principle to the effect that if a state s of a certain type is the
result state of an event e, then immediately before e there was a state of the opposite
type, i. c. a state s' which is characterised by the negation of the condition that
characterises s. (Often s' and s are referred to as the prestate and poststate of e.)
The prestate of an event e will hold at the moment e begins. Sometimes it will last
throughout e, but it may also cease before that, in which case the gap between it and
the result state will be bridged by some transitional state. However, if we assume that
for each state s and interval t included in the duration of s there is a state s' of the same
time as s and with t as its duration, then the Prestate Principle can be formulated simply
that there is a state which abuts e on the left and is of a type opposite to e's result state.
Note that the Prestate Principle has the status of making explicit a presupposition.
Compare for instance the sentences Bill left the room and Bill didn't leave the room. For
both of these the salient interpretation is that according to which, at the time in question,
Bill is inside the room. For some reason the question is raised whether Bill left the room
at that time, and the two sentences resolve that issue in opposite ways - the one says that
Bill did leave the room and the other that he remained in it. (The negated sentence can
be used in relation to a situation in which Bill is not in the room at t, as in Bill didn 't leave
the room for the simple reason that he wasn't in the room to begin with at that time. But
37. Discourse Representation Theory
903
such contexts are marked, and they typically involve information which denies that the
pre-state held at the relevant time.)
The upshot of this is that the general principle according to which any combination
of an event e and a result state s implies the existence of a corresponding prestate s'
must take the form of adding to the representation of event e and result state s a
presupposition saying that e is left-abutted by a state whose characterisation is the
opposite of that of s. (40) gives the template for adding pre-state presuppositions to
arbitrary representations of this kind (i.e. to arbitrary representations introduced
by change-of-state verbs). K is a schematic letter for DRSs, K' for a DRS or a
DRS-condition.
(40)
To infer the DRS for (33c) from the DRS (36) for (33a) we first expand the latter by
an application of (40) in which K' is instantiated by the conditon awake'(x). ((41)
shows this instantiation of the second term in (40) of the second occurrence of ©.) The
condition that the event of x waking up was immediately preceded by a sate to the
effect that x was not awake, is incorporated into (36) as a presupposition (see (42)).
After the presupposition of (42) has been justified (possibly through accommodation),
it is available as non-presuppositional information and can be merged with the rest of
(42), see (43). At this point we can apply (32) to replace the characterisation awake'(x)
of the pre-state sQby asleep'(x) v unconscious'(x). Moreover, we can also apply to the
state s0 a principle that is the mirror image of principle (38): if t"is punctual, e c t", and
s0dc e, then sQ includes some period of time t'"that left-abuts t". With this last application
a DRS has been obtained from which the DRS for (33.c) can be derived by renaming
and thinning.
(42)
t e x f
t
1
(
<n 10(f) t c
s0
s0: -i awake'(x)
so zxze
'
1
L
|
e c t Alain(x
s
res(s.e)
s:awake'(x)
)
37. Discourse Representation Theory 905
the application of these postulates to the premise representation that the inference rules
of classical formal logic came into play. In the examples we have looked at the uses we
have made of such rules have been very elementary, and in this respect our examples
have been good illustrations of what is involved in such inferences generally: The real
action - or there is of it - in such inference processes is in selecting the postulates that
make it possible to transform the premise DRS in the ways required. Since the number
of meaning postulates that are part of knowing a language is very large, choosing the
right ones for a given inferential task could well be a non-trivial problem. It seems
however that this is something at which language users are remarkably adept. So, inasmuch
as knowing meaning postulates and making the right choices from the set of meaning
postulates one knows is arguably part of our linguistic competence, the conclusion would
seem to be that our inferential abilities depend as much as on our knowledge of language
as on a language-independent capacity for abstract formal reasoning.
We conclude this discussion with a word of caution. The inferences reconstructed
above were strict, deductively valid inferences. But most of the inferences people draw
in the day to day business of real life aren't like that; they are approximate and
defeasible. Lexical entries should be such that they can support such inferences too. However,
our understanding of how defeasible inferencing works is still fragmentary and tentative.
And so long as our grasp of this part of the general problem of inference remains as
feeble as it is today, there is no way of telling how well the lexical semantics we propose
will stand up in the longer run.
6. Extensions
6.1. Plurals
Natural languages such as English mark the distinction between singular and plural.
Plural nouns arc as common as singular ones. This in itself would be reason enough to
demand of a theory dealing with the semantics of such a language that it can handle
plurals as well as singulars. But in the case of DRT there is a special reason for wanting
to cover plurals and not only singulars, which has to do with the original motivations for
the theory. As noted earlier, one of the main selling points of early DRT was the way it
treats the anaphoric relation between pronouns and their indefinite antecedents. A
crucial ingredient of that account is the assumption that the interpretation of an anaphoric
pronoun always involves a discourse referent that is either made available by the DRS
of the preceding discourse or by the DRS of the sentence itself. It is to this discourse
referent that the pronoun is then related (often, though not invariably, in the sense of
conference). Plural pronouns, however, do not behave in strict accordance with this principle.
They are often interpreted as referring back to elements of the discourse context that
are not (yet) represented by discourse referents and that must first be constructed from
material that is explicitly contained in the representation.
Our first task in this section will be to review the relevant observations about the
behaviour of plural pronouns and DRT's account of them. But the extension of DRT to
cover plurals also has another important aspect: By extending DRT so that plurals are
covered as well as singulars we move from first to second order logic. This point is not
as obvious as it might seem, and requires careful discussion. But it is methodologically
906
VIII. Theories of discourse semantics
important. We will discuss the two issues - plural anaphora and the expressive power of
the extended DRS language - in that order.
Many plural DPs denote sets with two or more members. (Not invariably, see Kamp
& Reyle (1993, Ch.4);but in this brief review we will confine ourselves to DPs for which
this is the case.) This applies also to anaphoric plural pronouns. But often the sets that
these pronouns refer to are not explicitly represented in the discourse representation
that should supply the pronoun's antecedent - not, at least, when that representation has
been constructed from the antecedent discourse along the lines sketched in Section 2.
An example is (44).
(44) Alan took his wife out to dinner. They shared the hors d'ocuvrc.
Here the pronoun they in the second sentence clearly refers to the pair consisting of Alan
and his wife. But if the DRS for the first sentence is constructed according to the
standard rules of DRS construction, then it will have no discourse referent representing this
pair; it will only have discourse referents - x and y, say - that represent Alan and his wife
separately. To obtain a discourse referent that can serve as antecedent for they we have
to apply the operation of Summation. This application takes the form of introducing a
new plural discourse referent Z and adding the condition Z = x 0 y, which defines Z as
the sum of x and y.
The possibility of constructing discourse referents that can serve as antecedents for
plural pronouns is of methodological interest because it is subject to certain restrictions.
This is illustrated by (45).
(45) Half of the shareholders were present at the meeting. They learned about what
had been said and decided at the meeting only from the newspapers the next
morning.
In this sentence they can only refer to the half of the shareholders that were at the
meeting, even if that interpretation makes little sense in the given context. The
interpretation that would make more sense - the one according to which they refers to the
other half, who were not at the meeting - is blocked: subtraction of one set from another
is not an operation that is available for creating pronominal antecedents. (It has been
argued that subtraction is possible in certain circumstances, though speakers vary in their
judgements about such cases. For extensive discussion of this issue see Nouwen 2003
and Kibble 1997.) Creation of pronoun antecedents is thus not just a matter of logical
inference from the DRS representing the given discourse context. Only certain
operations, such as the Summation operation that was used to synthesise the Z in constructing
a representation for (44), are permitted. (For more details see Kamp & Reyle 1993).
Moreover, the restrictions to which this process is subject, and which distinguish it from
standard logical deduction, are specific to the interpretation of pronouns:There is no
difficulty in referring to the half of the shareholders that were not at the meeting, provided
we use a definite description. (In fact, we just did; and in (45) the phrase the other half
would have done as well.)
The upshot of these considerations:There are a number of principles that can be used
to construct antecedents for plural pronouns from material that is explicitly present in
the given DRS, but these principles do not exhaust the full power of logical inference.
37. Discourse Representation Theory 907
Moreover, these principles, with their built-in restrictions, apply not only within the
sentence but equally to cases of inter-sentential anaphora. And as we just saw, they are
specific to pronouns; anaphoric definite descriptions, for instance, are not subject to them.
The methodological interest of these observations is that what we are seeing here is a
distinctive property of some particular linguistic category (plural pronouns), and which
applies not just to what happens within single sentences, but equally to multi-sentential
discourse. This is a case, in other words, where the grammatical properties of a certain
type of expression exert their influence beyond the sentences in which they occur.
The restrictions on the construction of plural pronoun antecedents illustrated in (45)
find a parallel in the famous "ball" example due to Partee:
(46) a. One of the ten balls is not in the bag. It/the missing ball is under the sofa,
b. Nine of the ten balls are in the bag. *It/the missing ball is under the sofa.
Getting the intended antecedent for it in (46b) by subtracting the mentioned set of nine
balls (the balls in the bag) from the mentioned set of ten balls (the set of all balls) doesn't
work, or at any rate not very well.
This gets us to a question that imposes itself once it has been accepted that the
antecedents of plural pronouns may be constructed using a certain range of special purpose
principles: How does this account for plural anaphoric pronouns relate to the one for
singular pronouns that was outlined in Section 2? The simplest relationship would be
that of subsumption: If we accept that singular pronouns must always refer to single
individuals (or singleton sets, the distinction doesn't matter here), then the operations that
permit antecedent construction will be otiose; they cannot create a discourse referent for
a single individual, unless they start from a discourse referent that belongs to the
representation already and which they then simply return. As it turns out, things are not quite
that simple. One of the operations that can be used to construct antecedents for plural
pronouns is Abstraction. It takes a nominal predication represented within the DRS as
input and returns its extension as output. An example is (47), in which Abstraction can
be applied to the predicate friend that Alan had invited and returns as output a discourse
referent for the extension of this predicate which can then serve as antecedent for the
pronoun they in the second sentence.
(47) No friend that Alan had invited came to his party. They were too busy.
If the same rules that can be used to produce antecedents for plural pronouns are
available in principle for singular pronouns as well, couldn't Abstraction be used to yield
antecedents for singular pronouns as well, in those cases where the predicate to which it
is applied has a unique instance? As it turns out, the evidence suggests that this is indeed
the case. A variant of an often quoted example is (48).
(48) It is not true that there is no rabbi officiating at this wedding. He is standing over
there, right next to the buffet.
One way to account for the felicity of this discourse is to construe the antecedent of he
as the result of applying Abstraction to the predicate rabbi officiating at this wedding
together with the general assumption that the number of officiating rabbis at a wedding
37. Discourse Representation Theory 909
Both (50b) and the sentence (50a) are known to be essentially second order (meaning
that their truth conditions cannot be captured by any first order statement) and to yield,
when conjoined with (representations of) first order sentences that express the familiar
properties of addition and multiplication, a non-axiomatisable theory. This establishes
the non-axiomatisability of the logic expressible in a fragment of English with plurals,
and likewise of a DRS-language which has plural discourse referents and just a handful
of predicates (among which e and the *-predicate that occurs in (50b)).
DRS languages which licence DRSs like (50b) (such as e.g. the DRS-language assumed
in Ch. 4 of Kamp & Reyle 1993) thus have the power of second order logic. (Versions of
the argument given here have been known for several decades. A seminal paper on the
topic is Boolos 1984, in which a very similar method to the one used here is attributed to
David Kaplan.) Attempts to find natural and independently motivated characterisations
of sublanguages of such DRS languages which retain the ability of representing a
significant portion of English sentences with plurals, but whose logic is axiomatisable (and
which must therefore exclude DRSs like (50b)), have thus far been without success, and
at this point it seems doubtful that such sublanguages can be found.
6.2. Intensionality
Only some natural language constructions that build clauses out of other clauses are
extensional.The majority are intensional, in the sense that the denotation of the resulting
clause is a function of the intensions of the clause or clauses from which it is constructed,
and not from their truth values. Modifying DRT so that it can handle intensional
constructions is straightforward in principle. All that is needed for this is that we replace the
extensional models we have been considering up to now by intensional models. We can
take these intensional models to be bundles of extensional models indexed by possible
worlds. A simple formalisation of this notion is as a pair M = <W,M), where W is a
nonempty set of possible worlds and M is a function from W to extensional models. In
relation to such intensional models M we can define the denotation of a DRS K in M, \[K]\ M,
to be the set of all pairs <w,f) where w e W and f is an embedding of K into the universe of
A/(w) which verifies the conditions of K in A/(w); and the proposition expressed by K in
M can be defined as the set jw | B f «w,f) e [[K]] ,M)}. In analogous ways we can also define
the intensions in a model M that are determined by expressions other than sentences.
The details are straightforward.
For this to be of any practical use in linguistic analysis we of course also need
linguistically motivated predicates and functors that take intensions as arguments. Here it is
useful to distinguish between modal and attitudinal predicates and functors, even if it is
not possible to draw a sharp line between these categories. As far as modal notions are
concerned, there is (to our knowledge) not much work that has been done towards their
analysis within a specifically DR-thcorctical context. But any of the existing proposals
for the analysis of modal words or constructions (including countcrfactuals and other
conditionals) can be incorporated into DRS-languages once intensional models have
been adopted, since all further structure that might be needed (such as, for instance,
various Kripkean accessibility relations between worlds) can be imposed on these models.
The analysis of propositional attitudes is another matter. The account of propositional
attitudes and of the operators (believes, intends, etc.) that arc used in natural language
to describe them differs significantly from the paradigm provided by modal logic, and,
37. Discourse Representation Theory 913
6.3. Propositional attitudes
The classical treatment of belief, knowledge and other propositional attitudes as it has
come to us through the work of Carnap, Montague, Kaplan, Lewis and others treats the
objects of the attitudes - that which is believed, known, etc. - to be intensions, as defined in
the last subsection. There is one major objection to this kind of analysis, often referred to
as the problem of logical omniscience: intensions do not allow for distinctions that arc
sufficiently fine-grained. In particular, any two sentences that are logically equivalent express
the same proposition (i.e. propositional intension). An account that takes intensions to be
the objects of belief cannot explain how a person could stand in a relation of belief to a
sentence S while failing to stand in that relation to a sentence S' in cases where S and S'
happen to be logically equivalent, but where this is not so easy to see and the person in
question hasn't seen it. If the person professes belief in S but denies belief in S', then all
the theory could say is that she must either be wrong about S or about S'.There is wide (if
not universal) agreement that this conclusion is incompatible with the actual aetiology of
belief.
To obtain an ontology of attitudinal objects that is immune to this objection Asher
(Asher 1986; Asher 1993) offers a DRT-based account in which the objects of the
attitudes are identified as equivalence classes of DRSs. The equivalence relation that
generates these classes is defined in terms of the structural properties of DRSs and is
substantially tighter than the relation of logical equivalence in classical logic.This approach
has the merit of providing an explicit definition of intensional identity that escapes
the problem of logical omniscience (at least in its most obvious and unacceptable
manifestations). A potential drawback is that it is hard to see which relation of structural
equivalence between DRSs gives us the intuitively correct notion of intensional identity.
(The problem seems to be in part that intensional identity varies with context.)
A second DRT-based approach, related to Asher's, but different at certain points,
was first outlined in Kamp (1990) and subsequently developed in explicit formal detail
in Kamp (2003). (See also Kamp 2006 and van Genabith, Kamp & Reyle (to appear).)
Here the emphasis is on the development of a representation formalism that is capable
of representing not just single propositional attitudes but also combinations of
attitudes involving different attitudinal modes - e.g. the combination of a belief and a
desire, or of a belief and a doubt. Moreover, it can represent such combinations of two
or more attitudes not only for a single agent at a single time, but also combinations of
attitudes that a person entertains at different times, or that are held by different
persons at the same or at different times.The basic construct of this extension of DRT is a
predicate 'Att' whose first argument slot is for the bearer of the represented attitudinal
state, while the second slot is for a representation of the attitudinal state that is being
ascribed to him. (There is also a third slot, which is reserved for connections between
discourse referents occurring in the second slot and objects in the world to which these
are anchored, so that these discourse referents function as directly referring terms. For
ease of exposition we ignore this slot until further notice; but see the next subsection.)
The representations that fill the second slot of 'Att1 are sets of pairs consisting of an
attitudinal mode indicator (such as BEL for belief, DES for desire, etc.) and a DRS
specifying the propositional content of the attitude represented by the pair.
An important feature of this way of representing attitudinal states is that the
representations which occur as second members of the pairs may share discourse referents
916
VIII. Theories of discourse semantics
to capture the effects of direct reference. Let us return to example (57), and focus on the
belief it attributes to Alain, while forgetting about the desire. Let z' and y' be discourse
referents that we, as attributors of the belief, use to represent the road and the coin
on which the belief we attribute is targeted. Then we can use the formalism of the last
section to express that the attributed belief is (doubly) singular - with respect to z and
the road and with respect to y and the coin - by inserting in the third argument
position of'Att' external anchors for y and z. These external anchors can be given the simple
form of ordered pairs <z,z'> and <y,y'>, where z' and y' are again discourse referents,
and for the term we insert into the third slot of 'Att' we can use the canonical way of
denoting the set consisting of these two anchors, i.e. {<z,z'>, <y,y'>). If we now also add
z' and y' to the universe of the main DRS of (57), then we get a DRS which says
that there are two entities (represented by z' and y') such that Alain has a belief that
is singular with respect to these entities.The model-theoretic semantics for 'Att'
guarantees that the contents of the attributed attitudes are singular propositions in the sense
defined above.
We have assumed that in order to entertain a proposition that is singular with
respect to an object an agent must (i) stand in some non-representational relation to
d (representational relations are inherently descriptive, yielding general, not singular
propositional content), and (ii) take himself to stand in such a relation to the object.
Thus the discourse referent x he uses to represent the object must on the one hand
stand in a non-representational relation to the object (parasitic on the
non-representational relation between the object and the agent himself). And on the other hand
x must have conditions associated with it at the level of internal representation
which express that x is non-representationally connected with the object in the relevant
way. A condition, or set of conditions, which expresses that a certain discourse referent
is non-representationally connected to the object it represents is called an internal
anchor.
Sometimes an agent will have an internally anchored discourse referent for which
there is no corresponding external anchor. These are cases where the agent is under
some kind of illusion, as when he thinks he is seeing a coin in the middle of the road, but,
in reality there is nothing there, neither a coin nor some other object that he mistakes for
a coin. Such representations are deficient; they presuppose the existence of an external
anchor, but the presupposition fails to be true. Strictly speaking such representations
have no well-defined propositional content - the content tries to be that of a singular
proposition, but doesn't succeed. On the other hand an external anchor for a discourse
referent of an internal representation for which there is no matching internal anchor will
be irrelevant to propositional content; this content would be the same that it would be if
the external anchor didn't exist.
These contributions of DRT to questions of direct reference depend essentially on
extending its representation formalism with the predicate 'Att' discussed in the last
section. What we said in the concluding paragraph of that section also applies to the
contributions to the theory of direct reference discussed in this one: Direct reference, according
to the view presented here, is in the first instance an aspect of thought; thoughts -
beliefs, desires etc - can be directly referential by virtue of involving representations with
anchored discourse referents. Utterances can be directly referential as well, but only by
virtue of expressing directly referential thoughts.
37. Discourse Representation Theory 917
8. Coverage, extensions of the framework, implementations
8.1. Coverage
The preceding sections have given some impression of the linguistic scope of DRT:
Singular and plural DPs of various types; the behaviour of tenses and temporal adverbs, as
well as certain aspect operators; presuppositions of any kind; certain modal and inten-
sional operators (the aspect operators among them); propositional attitudes and direct
reference. Among contributions to these topics that make a significant use of DRT but
have not yet been mentioned are Stirling (1993) on switch reference, Farkas & de Swart
(2003) on incorporation, de Swart (1991), Reyle, Rossdeutscher & Kamp (2008) on
temporal quantification, and de Swart & Molendijk (1999) on the interaction between tense
and negation.
There are also natural language phenomena that so far have not been mentioned
at all, but where there is substantial DRT-based work. We mention two: (i) Ellipsis
(Hardt 1992; Klein 1987; Asher,Hardt & Busquets 1997; Bos 1993) and (ii) Information
Structure (Kamp 2004; Riester 2008).
8.2. Extensions: SDRT and UDRT
DRT has led to a number of extensions that differ from it substantially and that have
come to be known by their own names. The best known of these arc S(egmented) DRT
and U(nderspecified) DRT. Each of these would need a separate contribution. Rather
than compromising by attempting a presentation that would be far too brief to do them
justice we limit ourselves here to a mere statement of their goals and a few references.
a. SDRT (Asher & Lascarides 1993, 2003): SDRT extends DRT in that it uses DRSs as
building blocks of more complex structures in which rhetorical and other discourse
relations between the successive sentences that make up a discourse or text arc analysed
as relations between DRSs representing those sentences. This adds an additional level
of discourse structure, with its own principles of structure and computation. SDRT has
developed into what is arguably the most sophisticated current approach to some of
the most challenging questions on the borderline between semantics and pragmatics.
b. UDRT (Reyle 1993; Reyle 1996; Reyle, Rossdeutscher & Kamp 2008): The aims of
UDRT are quite different from those of SDRT. UDRT was motivated by the problem of
computing DRT's semantic representations - i.e. DRSs - from syntactic input. A major
problem that affects the computation of semantic representations generally (whether
DRSs or representations of some other form) is that natural language is full of ambiguity.
Words are often ambiguous and of the ambiguous words many are multiply ambiguous,
with three or more (often many more) different readings. Ambiguity arises also in
connection with certain syntactic configurations (e.g. those that give rise to scope
ambiguities). But in the practice of language use these ambiguities are much less of a problem
than they might have been, for mostly they are resolved by context, provided either by
the sentences in which they occur or by the circumstances in which these sentences are
produced. For the theorist, however, disambiguation is a challenge; and it is an inexorable
challenge for those who try to build effective, scmantically sophisticated NLP systems.
Disambiguation is not only an issue that arises in connection with interpreting
nearly every sentence we ever come across; it is also something that almost always
918
VIII. Theories of discourse semantics
requires inference. One of the premises for the inferences that are needed for resolving
an ambiguity in a given sentence is always the still ambiguous representation of the
sentence itself. It is important to structure this representation in such a way that the
required inferences can be drawn efficiently. In particular it will as a rule be useful to
avoid long sentence level disjunctions in which each disjunct represents a potential
complete reading of the reading of the sentence as a whole.
This is the central concern of UDRT. UDRT offers ways of representing ambiguous
premises that permit deductions which avoid many of the duplications that are
typically involved in such 'proofs by cases'. At the same time the 'underspecified'
representations proposed by UDRT (known as 'UDRSs') support the inferences that are typically
needed for ambiguity resolution. Thus the inferential component of a UDRT-based
system serves a double purpose: (i) assist in ambiguity resolution and (ii) draw
inferences from premises for which complete disambiguation can't be achieved. (See also
article 24 (Egg) Semantic underspecification. The advantages of underspecification have
been challenged in Ebert 2005.)
8.3. DRT and Dynamic Semantics
DRT has often been compared with Dynamic Semantics as developed by Groenendijk
and Stokhof and others (see in particular Groenendijk & Stokhof 1991; Groenendijk
& Stokhof 1990; Dckkcr 1993, as well as article 38 (Dckkcr) Dynamic semantics). In
Dynamic Semantics sentences are explicitly analysed as relations between information
states: The meaning of a sentence S manifests itself in the way S changes the
information state of one who receives it, interprets it and accepts it. So in Dynamic Semantics
it is the concept of meaning itself that has become dynamic (meanings are transformers
of information states into other information states), whereas in DRT the notion of
meaning remains static. As in classical Montague Grammar meanings in DRT are
relations between sentences or texts (or their semantic representations) on the one hand
and models or possible worlds on the other hand. Here the dynamics doesn't concern the
notion of meaning as such, but the process of interpretation.
Muskens (1994) showed how DRT can be formalised within a version of the X-calculus
(a modest extension of Montague's HOIL) and as part of that how DRSs can be given
a compositional relational semantics of the kind proposed by Dynamic Semantics (see
also Eijck & Kamp 1997; Muskens, van Benthem & Visser 1997).This is an elegant and
insightful way of eating one's cake and having it. Since the mid-nineties this dynamic
approach to the analysis of meaning in natural language has been developed further,
especially during the past decade in the work of, among others, Bittner and Brasoveanu
(Bittner 2005; Bittner 2007; Bittner 2008; Brasoveanu 2007).
8.4. Cognitive significance
From the very beginning one of the central motivations behind DRT has been the hope
that it can tell us more about the processes of language interpretation and semantic
representation in the human mind than Montague Semantics - the approach out of which
DRT developed and which it shares most of its theoretical and methodological
commitments - cither can or wants to.The DRSs of DRT ought to tell us some things about the
38. Dynamic semantics
927
(9) Once upon a time there was an old king, who didn't have a son. He did have a
daughter, though. Whenever she saw a frog, she kissed it.
(10)
x,t
"
OLD(x, t)
KING (x, t)
y
SON_OF(x,y, t)
x,t,z
-
DA
OLD(x, t)
KING(x, 0
y
SON_OF (x, y, t)
UGHTEFLOF (x, z, t)
X, t,Z
OLD(x, 0
KING(x, t)
SON_OF (x, y, t)
DAUGHTERJDF (x, z, t)
v, f
FROG(u)
SEE (z, v, f)
KISS (z, v, t)
These three DRSs represent the contents of the discourse in (9) after processing the
first sentence (10a), the first two sentences (10b), and after processing the whole (10c).
Notice that the material contributed by the second and the third sentence gets added in
the representations that result from processing the first and the first two sentences. In
this way, the pronouns he and she are appropriately related to the established domain of
discourse.
1.4. Historical remarks
We end this introductory section with some historical remarks on the treatment of
indefinite anaphoric relationships in terms of discourse reference.The subject has gained
prominence by, among many others, the logico-philosophical (Geach 1962), and the
seminal but relatively informal work on discourse reference in Karttunen (1968). Kamp
(1984) and Heim (1988) were the first, independently, to present a formal framework
of interpretation for anaphoric phenomena, DRT and File Change Semantics (FCS)
930
VIII. Theories of discourse semantics
[I3*0]L = l(g, h) | for some k: g[x]k and <*, h) e [[0l]J
110 a f\\M = ((&A>|forsomcfc(g,A:>6lI^Ilil|and<A,A>6lIVfIlJ
[IV*0]] „ = [ig,h)\g = h and for all k: if g[x]k
then there is h: (k, h) e [[01] J
110 v ^M = (<g, /t> | g = /t and for some *:<g, A:> € [[0]]^
or for some k: (g,k)e [[ y/]\M)
U^¥\\M = (<g, *> |g = /i and for all*: if <g,*>e [[*]]„
then there is h:(k,h)e \[ifJ\\M)
Most clauses require the input assignment g and output assignment h to be the same,
besides some standard predicate logical conditions. For instance, if an atomic formula
like Rtx... tn or r = tt is true relative to M and g then the input-output pair (g, g) is in the
interpretation of such a formula. Intuitively this says that, if the standard test succeeds, g
is accepted as possible input and the interpretation of the formula docs not change
anything in its output. If the test fails then g is not accepted as possible input and in that case
there is no assignment h such that (g, h) is in the interpretation of that formula.
Exactly when this is the case, that is, when the conditions imposed by a formula 0upon
M and g is not satisfied, then its negation is satisfied, and g is a possible input for -i0
relative to M.In other words, if 0 cannot be executed upon input g, then -i0can, and its
interpretation will yield g again as output. Apart from the clauses for existentially quantified
formulas and conjunctions, the other clauses in the definition are also static, in the sense
that they do not allow input assignments to really change.The associated conditions are
straightforward adjustments of the static interpretation of the embedded formulas to
the dynamic (relational) interpretation. Only the interpretation of an implication is a bit
more involved. An implication (0 -» y/) is satisfied (relative to M and g) iff relative to all
ways of satisfying 0 on input g in M, y/is true as well. Since y/here gets evaluated relative
to outputs (k) of interpreting 0, dynamic effects of 0 may affect the interpretation of y/.
Changes in assignment functions are due to the interpretation of existentially
quantified formulas. According to the above definition, if we have some input assignment g,
then the interpretation of 3jc0 requires us to try out any assignment k which differs from
g only in its valuation of x, then see if it serves as an input for interpreting 0, and if it does
and outputs h, then h is also a possible output for interpreting 3x</> on input g. Notice
that if x indeed, as in most examples, occurs free in 0, and 0 imposes certain conditions
on the valuation of x, then the output valuation of x will have to satisfy these
conditions. Metaphorically speaking, a 'discourse referent' with the properties attributed to x
is introduced by such a formula, and it is labeled with the variable x.
A conjunction docs not change any context all by itself, but it docs preserve, or rather
compose, possible changes brought about by the combined conjuncts.That is to say, if 0
accepts an input g and produce some possibly different output k, and if ^accepts k as
input and delivers h as possible output, then the conjunction (0 a y/) accepts g as
possible input upon which h is a possible output.This implements the dynamic idea that the
interpretation of (0 a y/) involves the interpretation of 0 first and y/next.
38. Dynamic semantics 931_
2.2. Dynamic binding
By way of illustration, let us first consider a simple example in detail, throughout
neglecting reference to a model M.
(12) A farmer owned a donkey. It was unhappy. It didn't have a tail.
3x(Fx a 3y(Dy a Oxy)) a Uy a -3z(Tz a Hyz)
Relative to input assignment g this will have as output assignment h if wc can find
assignments k and / such that k is a possible output of interpreting 3x(Fx a 3y(Dy a Oxy))
relative to g, and / a possible output of interpreting Uy relative to k, and h a possible output
of interpreting —3z(Tz a Hyz) relative to /. Since the second formula is atomic, and the
third a negation, we know that in that case k = I and I = h.
Assignment h (that is: k) is obtained from g by resetting the value of x so that h(x) e
1(F), and by next resetting the value ofyso that h(y) e 1(D) and (h(x),h(y)) e /(O).That
is,h(x) is a farmer who owns a donkey h(y). Observe that for any farmer/and donkey d
that /owns, there will be a corresponding assignment /i':g[{-*,.y}]/i' and such that h'(x) =
fandh'(y) = d.
The second conjunct first tests whether y is unhappy, that is, whether h(y) = l(y) e
I(U). The third conjunct, a negation, tests whether assignment h cannot serve as input
to satisfy the embedded formula 3z(Tz a Hyz).This is the case if we cannot change the
valuation of z into anything that is a tail had by h(y). Putting things together, (g, h) is in
the interpretation of our example (12) if, and only if, g[[x, y\\h and h(x) is a farmer who
owns a donkey h(y) which is unhappy and does not have a tail. Observe that for any
farmer/and unhappy tail-failing donkey d that /owns, there will be a corresponding
assignment/I'lgljjc,^}]/!'and such \na\h'(x) = f and h'(y) = d.
In the example discussed above we see that a free variable y, for instance in the second
conjunct, gets semantically related to, or effectively bound by, a preceding existential
quantifier which does not have the variable in its syntactic scope. This is an example of
a much more general fact about interpretation in DPL, which goes under the folkloric
name of a 'donkey equivalence'.
Observation 1 (Donkey Equivalences) For any formulas 0and y/
(Bjt0a vO = Bjt(0a V)
(3x0 -> \ff) = Vjc(0 -> iff)
These equivalences are classical, but for the fact that they do not come with the proviso
that x not occur free in y/iThis is dynamic binding at work. In the first equivalence we see
that a syntactically free variable x in y/may get semantically bound by a previous
existential quantifier. The second one shows that this semantic binding gains strong, universal,
force in implications.The use of the second equivalence is exemplified by the following,
canonical, examples.
(13) If a farmer owns a donkey, he beats it.
(3jc(Fjc a 3y(Dy a Oxy)) -> Bxy)
932
VIII. Theories of discourse semantics
(14) Every farmer beats every donkey he owns.
Vx(Fx -> Vy((Dy a Oxy) -> Bxy))
These two sentences have generally been judged equivalent, and so are the naturally
associated translations in DPL. (As a historical remark, back in 1979 Urs Egli has
proposed the above equivalences, in order to account for the anaphoric puzzles that
plagued the literature. One of the merits of DPL is that the equivalences show up as true
theorems.)
The following, classical, equivalences are also valid in DPL.
Observation 2 (Equivalences that Hold)
i i 10= —10 Vjf0=-i3jt-i0
(0 V ^)=-.(-.0A-.^) (0~> V)=-.(0A-.^)
As we have seen —■ is an operator that introduces tests without any further dynamic
impact, and V, v and -> do likewise.That is, if 0 contains a quantifier with binding
potential, this potential gets lost when it occurs in the scope of -i, V, v or ->. As a consequence
other equivalences typically do not hold in DPL.
Observation 3 (Equivalences that do Not Hold)
—1—10 * 0 Bx0 * —iVjt—10
(0 A Iff) « -,(-,0V-,^) (0 A Iff) * -,(0 -> -i^f)
These non-cquivalcnccs arc motivated by the observation that the pronouns in the
following examples do not seem to be resolved, or at least not bound by the indefinite which
figures in the scope of one of the mentioned operators.
(15) Farley doesn't have car. It is red.
(16) Every man here owns a car. It is a mustang.
(17) Mary has a donkey or she doesn't have one. It brays.
The facts about (undoing) dynamic binding, which follow by the definition of the
semantics, correspond one to one with the facts about (in-)accessibility of discourse referents
in basic DRT, cf. article 75 (Geurts) Accessibility and anaphora.
2.3. Dynamic consequences
DPL has been motivated by the desire to bring out the logic of a system of interpretation
that accounts for anaphoric relationships, like DRT. It allows us to study the logical
consequences in full formal detail. Before we can see this more clearly, we have to present
the DPL notions of truth and dynamic entailment first.
936
VIII. Theories of discourse semantics
The dynamics of discourse is much more involved than these simple examples suggest. It
is not just the passing on of information exchanged, and not just the creation and
utilization of discourse referents.This type of information comes by in a structured manner, as
the following examples serve to show.
Notice, first, that plural pronouns may pick up plural entities which have not as such
been introduced in the discourse.
(24) Bob and Carol went to play bridge with Ted and Alice. They had a wonderful
evening.
Obviously the pronoun "they" should not be taken to refer to cither Bob, or one of the
others. The pronoun can, however, refer to the couple of Bob and Carol, and also to
the whole group of four, Bob, Carol,Ted and Alice. However, this group of four has not
been mentioned as such in the first line of example (24). Somehow this plural referent
may have to be constructed from the four persons that have been explicitly introduced.
(This process of forming plural discourse referents is called 'summation' in Kamp &
Reyle 1993.)
Besides this type of summation of individual referents more is at stake. When we
deal with plural anaphoric dependencies all questions about the distributive, collective
or cumulative interpretation of plurals play up, in a dynamic fashion. Consider the
following example.
(25) Seven pupils and four teachers wrote five ballads and some rhymes.They performed
them at an evening during the spring holiday.
The first sentence here can be taken to introduce a group X of pupils and teachers,
and a group Y of ballads and rhymes, such that X wrote V and X performed Y at a
certain evening. Of course, the writing of Y by X can be analyzed in more detail.
Maybe the intended reading is that the pupils wrote the ballads and the teachers the
rhymes. Also, the pronouns "they" and "them" can be taken to require further
analysis, possibly depending on the particular kind of reading associated with the first
sentence. The performers can be taken to have performed, individually or group-wise, the
ballads and rhymes they wrote themselves. Upon this rather natural analysis, the
truth conditions of the second sentence arc dependent on the analysis chosen for the
first, so that the dynamics of interpreting the first sentence must, not only deliver just
two plural discourse referents, but some internal relation between these referents
as well.
This point also shows in quantified constructions.
(26) Almost all students chose a book. Most of them wrote an essay about it.
The first sentence in this example can be taken to yield a referent set of students who
chose a book, and this is the set of individuals which the pronoun "them" intuitively
refers to. However, assuming not everyone of them chose the same book, there is no
singular referent figuring as the chosen book, which the pronoun it could have referred to. A
natural interpretation of the second sentence of (26) is that the students wrote an essay
about a book that they individually chose. We witness again a close interaction between
38. Dynamic semantics 939
Informative types of discourse not only consist of assertions, but they are, typically,
also guided by questions: interrogatives, topics which the interlocutors raise, and
questions they themselves face. Already at the outset of formal semantic theorizing about
questions their dynamic role, their role in discourse or dialogue, has been obvious.
Adopting a dynamic theoretical perspective, discourses or dialogues can be described as
games of stacking and answering 'questions under discussion' or as processes of 'raising
and resolving issues'.
Such processes are not unstructured, but governed by structural linguistic rules,
and highly pragmatic principles of reasonable or rational coordination (Ginzburg 1995;
Roberts 1996; Hulstijn 1997). A quite minimal way to account for this proceeds by
representing information as a set of possibilities, one of which is supposed to be actual.The
possibilities are grouped in sorts.The sorting indicates that the current issue is, not which
of the possibilities is the actual one, but which is the sort of the actual one, that is, in which
sort of possibilities the actual world can be found. Information states then can be taken
to be sets of sets of possibilities, which may get updated with further questions and more
data in the development of a discourse.
Such a tentative sketch already provides the basics for characterizing certain basic
discourse and dialogue notions like that of a coherent and felicitous dialogue. For a dialogue
to proceed coherently and felicitously, one may require assertions to be consistent with the
current information state, but also informative: logically speaking it is of no use to accept
a state of inconsistent information or to assert what is already (commonly) known.
Questions can be required to be non-superfluous as well,so that one doesn't raise an issue which
is already there. Assertions can also be required to address issues at stake and not to
provide unsolicited information.These observation and requirements, and many others, find
a neat formulation in update style systems of interpretation. Asher & Lascarides (2003)
gives an impression of the wealth of data to be covered in this direction, and Aloni, Butler
& Dckkcr (2007) gives an overview of recent formalizations in a dynamic paradigm.
3.3. Modality and presupposition
Like we said, a pragmatic system of interpretation along the lines of Veltman's update
semantics provides a ground for evaluating epistemic modals, assertions made with the
auxiliaries may and must or adverbials like maybe and evidently. The sententials
operators may and must neatly seem to fit in the dynamic paradigm, as the first can be used
to express consistence with the current context of information, and must to express
something that can be derived from this context.Thus, the effect of a modal mod, which
expresses contextual epistemic possibility, can be tentatively defined as follows.
(32) (t)[mod0] = r, if (r)[0] is possible, and (t)[mod0] = J. otherwise.
Since the interpretation of the modality is stated in terms of the possible update of the
current information state rwith the embedded sentence 0, the interpretation of these
modal sentences may be variable. For instance, it may be the case at one point in an
exchange that Nancy might be home, as far as we all know, while later in the discourse
we may have collected information which rules out that she is home. Epistemically
used modals might and must may change their truth, or acceptability, in the course of
events. The dynamic logic of such epistemic operators is investigated in detail in various
38. Dynamic semantics 941
function composition, the second formula's presupposition that 0 ("Mike has children")
is automatically satisfied by the update of the context with the first conjunct 0. Not so
according to the rendering of example (3b) as (x0 a 0).This conjunction as a whole still
presupposes that 0, while its second conjunct still needlessly and explicitly conveys what
has already been presupposed by the first. An update semantics precisely accounts for
this difference.
The AB theory of presupposition (van der Sandt 1992; Geurts 1999) is dynamic like
the satisfaction theory, but it presents a different account of resolution. According to the
AB theory, presuppositions appear in a preliminary phase of interpretation, and at some
intermediary level, as distinguished information units which have to be 'resolved' for the
interpretation process to be completed. Being resolved roughly means being semanti-
cally invisible, the AB counterpart of being satisfied. If they are resolved these
information units as it were dissolve at the intermediary level of DRTs discourse representation
structures. If they are not resolved, they get 'accommodated', or, rather, they
accommodate themselves, like true squatters. In such a case their contents are settled in a relevant
part of a representation which resolves them in the slot where they originally appeared.
For a very sketchy illustration, consider again example (33).
(33) Sally believes that Harry didn't quit smoking cannabis.
Nowhere in example (33) it is literally said or communicated that Harry did smoke
cannabis, or that Sally believes so. So if the preceding context of such an example does not
supply a way of resolving this presupposition, it has to accommodate itself. Even though
this may require some further contextual support, one way for this presupposition to
accommodate itself is right there where it stands, thus bringing about a reading according
to which Sally is taken to believe that it is not the case that Harry smoked cannabis and
stopped, similar to the last reading above. It may be easier to obtain a reading by
accommodating the presupposition in Sally beliefs. Sally would then be taken to believe that
Harry did smoke cannabis and didn't stop doing so, as one gets from the second reading
above. Probably the most straightforward interpretation is one where the presupposition
accommodates itself at the main level of interpretation, so that example (33) is taken to
say that Harry used to smoke cannabis, and that according to Sally he didn't stop doing
so, basically the first reading again.
While the underlying ideas of the satisfaction and the binding theory on
satisfaction and resolution are similar, and while both are dynamic, the specific treatments are
quite different.The first is a logical approach, in that it predicts presuppositions as a type
of entailments following from an independently specified semantics; according to the
second presuppositions arc typically computational constructions.The difference is vast,
but not unbridgeable. A semantic variant of the AB theory, which fleshes out a logical
notion of the meaning or interpretation of unresolved presuppositional structures, is
presented in Dekker (2008).
4. Methodological issues
In this section we discuss two more theoretical issues. They arc related to the classical
Fregean themes of compositionality, representationalism, dynamicity and contextuality,
which show to be very actual still in current debates.
39. Rhetorical relations
953
(13) a. John fell. Bill pushed him. (Nonvolitional Cause)
b. John pushed Bill. He was angry. (Volitional Cause)
c. John pushed Bill. I saw it myself. (Justify)
A special case are also elaborations that give extra information about participants or the
location, for the purpose of identification or for motivating or explaining the pivot.
This whole group of relations can - with some charity - be described as relations
between the state or event described by the current sentence and the state or event
described by the pivot. All other relations are different. They are Contrast, Concession,
Narration and List and seem primarily related to the strategic level: what is reported
belongs together because it is relevant to the same issue, but the events and states
that are reported are not necessarily related to each other. This does not mean that
they cannot have causal, temporal or local relations to each other. Contrast and
Concession often go together with temporal simultaneity and local overlap. The elements
of Lists also often have non-accidental temporal and local relations to each other.
But none of these relations seems to entail any particular temporal, local or causal
relationship between the events and states involved. If they nonetheless have a
relation of this kind it would be due to their shared relation with an overarching topic.
Narration (Sequence in the list of relations above) is connected with moving through
time and will imply temporal succession and that is a reason for distinguishing Narration
from List.
In a Narration, a story is told.The pivot event is abandoned and the current sentence
reports the next event that is relevant for the story. That is the definitional property for
Narration: the new event happens after the pivot event. Hobbs (1985) adds another
element: the new event is contingent on the pivot event. Contingency should be defined in
terms of causality, but it is not obvious how this must be done. The idea is that the pivot
event does not itself cause the utterance event, but establishes one of its preconditions.
This can be expressed as a counterfactual (14) and an illustration is (15).
(14) If the pivot event had not happened, the current event would not have happened
either.
(15) John stepped out of his car and walked up to the door.
contingency: John's stepping out of the car brings him in a state and at a place which
makes it possible for him to walk to the door.
counterfactual: If John had not stepped out of his car, he would not have walked up
to the door.
Stories are held together not just by temporal succession and contingency, but also by
protagonists and locations (the aboutness topic). One can try to express the unity of
stories by thinking of them as an Elaboration of a single event. Another way of defining the
unity of a story tries to see the whole story as an answer to a single question (the story
topic) that gets answered by the events that make it up. Unfortunately, clear ideas about
the construction of this topic question from the story have not been forthcoming, except
for simple constructed cases.
Narration cannot be reduced to a relation between the reported events, not even if
with Hobbs one adopts Contingency. On the topic view, it is the relation to the event
954
VIII. Theories of discourse semantics
described by the whole story or the overarching topic question that makes the pivot and
the current sentence stand in the Narration relation.
Topic questions (see article 72 (Roberts) Topics) do a much better job with Lists. A
List (a sequence of sentences connected by the List relation) can be seen as an answer
in many parts to a single topic question, typically in a situation in which single sentence
answers would not do the same job. Here there are algorithms, e.g. Priist, Scha & van
den Berg 1994 that compute the topic question (or a closely related object) for simple
cases. Lists can also be seen as an answer to the problem that one cannot fully specify
binary or ternary relations in a single sentence, unless one is very lucky. For instance, let
John love Mary and Sue, Bill Mary and Harriet and Tom only Sue, while there are other
boys and girls. It is impossible to give a simple clause which specifics the whole relation,
in response to a question: Which boy loves which girl? (16) is true but fails to give the
details.
(16) John, Bill and Tom love Mary, Sue and Harriet.
But a List like (18) does. Full specification is possible by splitting up the given question
into subquestions as in (17) and answering each in turn as in (18).
(17) Which girls does x love?
The List in (18) can be seen as a strategy to avoid the problem.
(18) John loves Mary and Sue. Bill Mary and Harriet. And Tom Sue.
Contrast is the topic of ongoing discussions. Umbach (2001) provides a definition
for the case when the English "but" or rather the German "aber" does not express
Concession but what is normally called Formal Contrast. The definition runs as
follows:
A positively addresses a topic question Q which B (directly or indirectly) addresses
negatively. This is illustrated in (19).
(19) John is tall but Bill is small.
topic question: Are John and Bill tall?
background knowledge: Small implies not tall.
(19) should be contrasted with (20) which requires a different topic question.
(20) John is tall and Bill is small.
topic question: How tall are John and Bill?
The generality of this approach can be undermined however by looking at languages
where Concession and Formal Contrast arc expressed by different words. A case in
point is Russian, where the expressions a and no both correspond to English but with no
specialising in the Concessive readings (Malchukov 2004).
39. Rhetorical relations
955
A is sometimes rendered by but in English. In many cases however, the correct
translation is and. The problem is that the other Russian conjunction / is subject to strong
restrictions which prevent it to be the uniform translation of and. One hypothesis (due
to Jasinskaja & Zeevat 2008) is that a relates to double contrast: a doubly distinct answer
to a double wh-question, as in example (21).
(21) John likes Susan "a" Bill likes Harriet.
topic question:^'ho like who?
No would not be adequate in this case, since no marks a special case of an answer to
a double wh-question, namely the case where one wh-element is why and the other is
whether. The polarity switch is then due to the fact that there are only two distinct
polarities that can answer a whether-question.
Concession can be defined along the lines proposed by Konig (1991) as anticausative:
Concession(A, B) iff A normally causes B to be false. This is however not fully
general, as was already noted by Ducrot (1973). In an example like (22), the first conjunct
argues for going to a restaurant, the second against going there. Typically, the speaker
can be taken as committing herself to the drift of the second conjunct, i.e. as proposing
not to go.
(22) The food is good but it is expensive.
This can be fitted into the scheme of double wh-questions by making the concessive case
provide doubly distinct answers to a why-whether question. In (22), the first conjunct
gives a reason for going to the restaurant, and the second conjunct a reason for not going
there. The anticausal readings are then the ones where the second conjunct itself is the
conclusion that is argued against (in the first conjunct) and argued for (by itself) in the
second conjunct.
Concession has its origin in conversation. A Concession is typically partly
acknowledgment of what the other said. Suppose the other said "A". Then one can concede any
part of A or a consequence of A. It goes together with not accepting A in its entirety,
normally with a rejection of A. In a conversational concession what is conceded is already
accepted by the other speaker and so has common ground status. What is not conceded
from A, lacks this status. This may explain why subordinate concessive clauses are
presupposition triggers.
In English, it is necessary to mark Concession, by markers like but, (al)though,
however and nonetheless. In languages like German and Dutch there are concessive
markers like zwar and weliswaar inside the first clause that indicate that one is in a
Concession and announce the aber- or /naar-clause that will follow. So these languages
have a coordinate structure that is unambiguously concessive, unlike the English
construction with but.
All rhetorical relations allow asyndetic expression (expression without any overt
marking) and there are often lexical markers. A curious fact is that while most relations
can be expressed by specific grammaticalised markers and can be marked as a relation
956
VIII. Theories of discourse semantics
between constituent clauses in a syntactically integrated complex construction there are
exceptions to this principle.
The analysis of many of the markers as listed in Tab. 39.2 is problematic. Good
analyses would give an account of the rhetorical relation(s) they can mark.The growing body
of research on the semantics and pragmatics of particles is therefore directly relevant
for a better understanding of rhetorical relations (see also article 76 (Zimmermann)
Discourse particles).
Tab. 39.2: Grammaticalised cues for the RST relations
Grammaticalised markers
Antithesis
Background
Concession
Enablement
Evidence
Justify
Motivation
Preparation
Restatement
Summary
Circumstance
Condition
Elaboration
Evaluation
Interpretation
Means
Non-volitional Cause
Non-volitional Result
Otherwise
Purpose
Solutionhood
Unconditional
Unless
Volitional Cause
Volitional Result
Conjunction
Contrast
Disjunction
Joint
List
Multinuclear Restatement
Sequence
but
but, though
because, since
because, since
because, since
so
while
if
namely
so
because, since
so
otherwise, else
anyway
unless
so
so
and
but, and
or
and, also
then
39. Rhetorical relations
957
4. Where must rhetorical relations be assumed?
The table at the end of section 3. already can serve to make an important point: it is
not just the sentences of a discourse that arc related by rhetorical relations. Rhetorical
relations must also must be assumed within a single sentence as obtaining between
coordinated clauses and between main clauses and subordinated clauses. And between
subordinate clauses as in the following example (23).
(23) Stepping out of his car and walking to the door John noticed a squirrel in the tree.
Stirling (1993) reports that this is in fact the favourite way of telling a story in switch-
reference languages. In Latin - where the case system allows a far more reliable way of
tracking the different participants in a sentence than the pronominal systems of many
modern languages - multiple participial constructions with rhetorical relations holding
between them are much favoured. (24) is from Caesar (1869: book 1,24).
(24) a. ipsi confcrtissima acic, rciccto nostro cquitatu, phalange facta sub primam nos-
tram aciem successerunt
b. they themselves in very close order, after having repulsed our cavalry and
formed a phalanx, advanced up to our front line.
In (24), the relation of Narration between throwing back the cavalry and forming a
phalanx, both expressed by an absolute ablative is only expressed by the linear order.
This raises the question how deep one should go into syntactic structure for applying
rhetorical relations. Very far it seems. Any subordinate predication can in principle
be related by a rhetorical relation to another predication, although not to any other
predication. Predications have to be related to other ones, since the speaker has put
them there for a reason and the hearer needs to figure out why the material was
deemed useful by the speaker. An exception is material that is used for the purpose of
identification, but also such material can be simultaneously used to give Causes or
Justifications.
(25) a. The angry farmers blocked the road. (Volitional Cause)
b. Pushing the button, John blew up the wall. (Means)
c. Holding the flowers in his arm, John crossed the road. (Circumstance)
d. When he reached the crossing, he turned right, following Mary's instructions.
(Volitional Cause)
One can even quite legitimately ask the opposite question. How much of the semantic
connections expressed by lexical and syntactic means arc in fact rhetorical relations? As
it turns out many are. Typically all the connectors between sentences have a semantics
that is reminiscent of a rhetorical relation.The thematic relations expressed by case and
word order have to be analysed by the proto-thematic properties following the analysis
of Dowty (1989). And those are suspiciously reminiscent of rhetorical relations: Cause,
Volition, Affected, Beneficiary, Result, Instrument. Complement sentences can be related
to Elaborations. On this perspective, one could say that there are fundamental semantic
relations for natural language which should be recognised in interpretation and that
960
VIII. Theories of discourse semantics
the same quantifier scopes should be preferentially assumed etc. Kehler has however
convincingly argued that this is not a general default but one that depends for its
operation on the assumption of discourse relations that force parallellism: List, Formal
Contrast and (cases of) Concession and Narration. The effects of parallelism can be illustrated
by the following examples. In (32)-without special intonation - she musi be Marian and
cannot be Susan. (With stress on she, it is the other way round, due to the distinctness
between she and Marian expressed by the contrastive stress: the most parallel reading
that is compatible with contrastive stress on she.).
(32) Marian likes Susan. And she likes Tom.
In the second clause of (33), the ellipsed VP is "like Susan at dinner parties" and not "like
Susan". In (34), this default is broken by the provision of a different parallel modifier.
(33) Marian likes Susan at dinner parties. And Tom does too.
(34) Marian likes Susan at dinner parties. And Tom does in dancing class.
Kehler's point is that in cases like these, but not with other rhetorical relations the syntax of
the connected clauses must be very similar.The effects show up especially in ellipsis.
Compare (35) with (36). (36) allows two intcrprctations:Tom likes himself or Tom likes Marian,
interpretations that are both unavailable for (35). ((35) will be repaired to have one of the
indicated readings, but And Tom does too. is just not the right way to express them.)
(35) Marian likes herself. (?) And Tom does too.
(36) Marian likes herself. Because Tom docs too.
Further compare (37) with (38). Here the unexpressed reflexive in the ellipsed clause
creates a problem in (37), while this does not seem to matter in (38).
(37) Marian likes Tom. (?) And he does too.
(38) Marian likes Tom. Because he docs too.
What seems to be going on is that the List, Formal Contrast and (cases of) Concession
force syntactic parallelism and that the antecedent as a syntactic object must allow the
interpretation. This restriction is removed when the relation is not one of Similarity.
It is quite tempting to think that this can be analysed with quaestios (von
Stutterheim & Klein 1989), discourse topics (Asher 2004) or schemas (Prust,Scha & van
den Berg 1994). Lists are then joint answers to a single question (or different elements
falling under one discourse topic or sharing a schema). The problem then seems to be
that the underlying questions (topic, schemata) in the cases considered are not well-
formed. This would block the interpretations that need them. For example, in Priicst,
Scha & van den Berg (1994) And Tom does too, is expanded to And Tom likes herself
which cannot be interpreted.
39. Rhetorical relations
963
5.3. Information structure
There is a seemingly opposite school in the study of rhetorical structure represented
by scholars like von Stutterheim & Klein (1989) and van Kuppevelt (1995). In this
approach, the starting point is the task of the speaker, conceived as a question. The
structure of a text is then a complex answer to this question. This immediately leads
to a distinction between partial answers to the question and satellites to such partial
answers (which answer questions of their own, raised by their nucleus). The question
of a text also leads to the fixation of the topic and foci of the partial answers and
the fixation and movement of the referential parameters in these texts. This leads
to interesting parallels with treatments of discourse semantics such as Discourse
Representation Theory.
One can say from the perspective of this school that RST is just a classification of a
set of natural relations that arise in the goal-directed enterprise that is telling a story or
producing an overview of some states of affairs or to provide an extended explanation.
A difference with RST is that the question immediately makes a connection with the
task of the text as a whole. Moreover, it is possible to develop an account of especially
those discourse relations that seem to be governed by information structure rather than
by semantic relations almost directly on the basis of the questions structure: especially
van Kuppevelt works out this connection and treats both information structure and text
structure in terms of questions.This makes it possible to link up with areas such as
theories of topic-focus articulation (cf. article 71 (Hinterwimmer) Information structure or
Grosz, Joshi & Weinstein 1995 on centering).
The theory seems to have the potential to serve as a foundation for rhetorical
relations and rhetorical structure. Any element of a text needs to be assigned a role with
respect to the question of the text or with respect to one of its answers. If this role can
be expressed as another question, specific for the particular element, this question will
determine both the rhetorical relation and the pivot.
5.4. Text interpretation
While rhetorical relations started in text generation as a theory of how to structure texts,
section 5.1. should have made it clear that there is quite a lot of semantic mileage in
having a grasp on rhetorical relations in interpretation. The main obstacle for achieving
such a grasp is the fact that rhetorical relations quite commonly are not overtly marked
at all.
Two remarks are in order. First of all, it is quite normal that there is not a marker
in NL for all the conceptual distinctions that are expressed in the sense that speakers
are aware of the distinction and hearers are supposed to figure out what it is. The many
meanings of common prepositions in English like "with", "of" or "in" are a clear case
in point. A natural language understanding system faces the task of filling in the blanks
there, if it is aiming for understanding that is comparable with what humans get out of
the language input. A lot of ambiguity resolution is necessary.
The second remark is that nonetheless NL is full of markers for rhetorical relations
and that texts are full of these markers: various coordinating and subordinating
connectors express one or more rhetorical relations, many particles at least rule out some
rhetorical relations. (Webber et al. 2003 argues for what seems to be the correct view: the
39. Rhetorical relations
965
one needs to be able to compare explanations by a combination of logical and empirical
methods to get anywhere at all. It would however not seem that there is anything going
on in the area of rhetorical relations that is not just normal computational linguistics. It
would therefore seem that further development of empirical computational semantics
will bring solutions for asyndetic marking of rhetorical relations.
SDRT has dominated rhetorical structure in recent years. It started as an attempt
to get rid of some of the limitations of the treatment of tense and aspect in DRT, a
treatment that is largely a refinement of the treatment of tense and aspect in Hinrichs
(1986) assuming a very basic narrative structure (Kamp's earlier collaborative work with
Christian Rohrer on French tense and aspect (Kamp & Rohrer 1983) - the discovery
context for discourse representation theory - was also limited to narration, this time
based on Gustave Flaubert's Education Sentimentale). It was however clear to
everybody, that the limitation to narration led to generalisations that are not tenable when one
considers a wider category of texts. The original work is reported in Lascarides & Asher
(1993) and is influenced by attempts at that time to investigate the applicability of default
logic to phenomena in natural language semantics and pragmatics. The combination of
rhetorical structure and non-monotonic reasoning was not new, since Hobbs had
proposed a combination of abduction and his work on text coherence, but SDRT aims for
a comprehensive model for the first time. It brought applications to presupposition and
the development of a hybrid logical system which separates the logic of rhetorical
relations (a propositional non-monotonic logic) from the logic in which the proper natural
language semantics is taking place (a variant of standard first order logic). Merging the
two logics would lead to an intractable system. While it seems that SDRT can profit
as much from empirically acquired statistical data, it requires argument to assume that
non-monotonic logic is the most suitable formalism for exploiting it. The aim of SDRT
is to develop a implementable logical model of rhetorical structure. And certainly,
the work collected in the monograph Asher & Lascarides (2003) comes closer to that
goal than any other effort in the area. Moreover, almost all issues in rhetorical structure
have come by in this enterprise and no student of rhetorical structure would be wise
to ignore SDRT.
6. Outlook
I have tried to show in this overview that rhetorical structure is a necessary ingredient of
accounts of text generation and of generalised pronominal resolution, including next to
pronouns, tense and aspect, ellipsis and gapping, particles and information structure and
therefore an essential part of context-based interpretation of natural language. Here the
qualification "context-based" can be safely dropped: there is no natural language
interpretation worthy of the name that is not context-based.
With the exception of SDRT, there is no theoretical model of the recognition of
rhetorical relations that has been worked out on a larger scale. For SDRT too, the proper
recogniser of rhetorical structure has not been achieved yet. With the exception of the
rather limited cue-based recognition of rhetorical relations, there is therefore nothing
that can be immediately incorporated into natural language understanding systems.
In text generation, rhetorical structure is part of most systems. Nonetheless there are
many phenomena that need improved understanding. For example, while recent years
have seen a substantial improvement in our understanding of the meaning of particles
966
VIII. Theories of discourse semantics
and conjunctions, the question when to deploy these devices in a text generator is still
not well understood. It seems that knowing the meaning is not enough, one also needs
a proper description of the triggering conditions beyond meaning. For an area that is
as central as rhetorical structure, the research investment so far has been very small
indeed. As compared e.g. to the investment in the syntax of sentences where hundreds if
not thousands of researchers are active there are maybe 75 people altogether who have
contributed to the area of rhetorical structure and for a few of them only it has been a
substantial part of their career. The intellectual interest of the two problems is hard to
compare, but theoretical syntax seems to have only a minor impact on natural language
understanding and generation and the potential of rhetorical structure seems to be far
greater for both of these areas.
I would gamble on two new impulses into the area in the coming years. One is the
advent of corpora that are annotated with rhetorical structure. Such corpora have
already been developed and annotation schemes have been provided. For useful
references, see http://www.isi.edu/ marcu/discourse/. More recent is the Penn Discourse Tree-
bank (http://www.ldc.upenn.edu).They can be combined with the development of more
semantic understanding within the stochastic paradigm in NLP in combination with
logical techniques. This will lead within the coming years to the possibility of estimating
the plausibility of a given interpretation in a context and preferring the most plausible
interpretation. This will also allow the recognition of rhetorical relations and thereby
liberate research on rhetorical structure from the fixation on the problem of asyndetic
expression, a problem outside the reach of current technologies. While these new
possibilities will help considerably in other areas of semantics and pragmatics, the
improvement of rhetorical understanding will have strong repercussions on the overall quality of
semantic and pragmatic understanding.
A second promise is optimality theory. Both generation and interpretation can
usefully be seen as a choice between alternatives where the choice is guided by principles,
i.e. as optimisation problems. Both generation and interpretation also need blocking
and blocking is typical for optimisation problems. Certain interpretations may be
obscured by other and better ones, certain possible realisations may be blocked by
preferred other realisations. Two interpretational constraints seem directly relevant
to rhetorical structure and define defaults there.The first is a principle that maximises
the givenness of any element of an interpretation, called *NEW, DOAP,
♦ACCOMMODATE by different authors. This creates defaults in the interpretation of
rhetorical relations as indicated in the following diagram, taken from Zeevat & Jasinskaja
(2007).
Explanation Contrast
Reformulation > Elaboration > > Narration >
Justification Concession
The principle also underpins the default in pivot identification: the lowest element on the
right frontier that fits.
A good thing about the area of rhetorical relations is that, while schools have
been formed, they have not led to divisions other than in general ideology. The Right