The Companions of Iranian Languages and Linguistics. Volume 3. Tajik Linguistics - Ido S., Mahmoodi-Bakhtiari B., Korangy A.

Authors: Ido S., Mahmoodi-Bakhtiari B., Korangy A.

Tags: linguistics, reference book, Tajik language

ISBN: 978-3-11-061940-9

Year: 2023

Tajik Linguistics

The Companions of
Iranian Languages
and Linguistics

Editor
Alireza Korangy

Volume 3

Tajik Linguistics

Edited by
Shinji Ido and Behrooz Mahmoodi-Bakhtiari

ISBN 978-3-11-061940-9
e-ISBN (PDF) 978-3-11-062279-9
e-ISBN (EPUB) 978-3-11-061953-9
ISSN 2627-0765
Library of Congress Control Number: 2022945599
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de
© 2023 Walter de Gruyter GmbH, Berlin/Boston
Typesetting: Integra Software Services Pvt. Ltd.
Printing and binding: CPI books GmbH, Leck
www.degruyter.com

Contents

List of contributors

Lutz Rzehak
1 How Tajik was made into a national language

Shinji Ido
2 Standard Tajik phonology

Sepideh Koohkan, Roohollah Mofidi
3 Modality and mood in Tajik

Roohollah Mofidi, Negin Mohammadi Nafchi
4 Aspect in Tajik

Justin M. Power
5 Tajik Sign Language in context

Leyli R. Dodykhudoeva
6 Tajik dialects of Badakhshan and Shughnani: A comparative perspective

Dilia Hasanova
7 Linguistic landscape of Bukhara: The ambiguous future of Tajik

Mirzo Hassan Sulton
8 Terminology in Tajik

Index

List of contributors
Leyli R. Dodykhudoeva
Institute of Linguistics, Russian Academy of
Sciences
1 Bolshoy Kislovsky pereulok
Moscow, 125009
Russia
leiladod@yahoo.com

Behrooz Mahmoodi-Bakhtiari
University of Tehran
College of Fine Arts,
Department of Performing Arts
Enghelab Avenue, Tehran
Iran
mbakhtiari@ut.ac.ir

Dilia Hasanova
School of Journalism, Writing, and Media
The University of British Columbia
205-1873 E Mall
Vancouver, BC V6T 1Z1
Canada
dilia.hasanova@ubc.ca

Roohollah Mofidi
Imam Khomeini International University
Qazvin
Iran
mofidi@hum.ikiu.ac.ir

Mirzo Hassan Sulton
Lexicography and Terminology Division
Institute of Language and Literature
21 Rudaki Avenue, 734025 Dushanbe
Tajikistan
sulton_66@mail.ru

Shinji Ido
Nagoya University, Nagoya
Japan
ido@nagoya-u.jp

Sepideh Koohkan
Tarbiat Modares University, Tehran
Iran
sepideh.koohkan@gmail.com

https://doi.org/10.1515/9783110622799-203

Negin Mohammadi Nafchi
Imam Khomeini International University
Qazvin
Iran
mohammadinegin65@gmail.com

Justin M. Power
Department of Linguistics
University of Texas at Austin
305 E. 23rd Street, Stop B5100
Austin, Texas 78712
USA
justin.power@utexas.edu

Lutz Rzehak
Central-Asian Seminar,
Humboldt-Universität zu Berlin
Unter den Linden 6
10099 Berlin,
Germany
lutz.rzehak@hu-berlin.de

Lutz Rzehak

1 How Tajik was made into a national language
Abstract: The establishment of Soviet power in Central Asia greatly changed many social spheres, among them the ethnic and linguistic realities of the region. Based on contemporary publications from 1919 to the late 1920s, this chapter examines how the Persian-speaking population of Central Asia tried to orient itself under the new political circumstances. As early as 1919, the Bolsheviks appealed to the local population to organize into so-called “national sections”, but only selected Persian-speaking groups in Samarkand responded and founded a “Persian section”, whereas other groups the appeal had in mind did not feel addressed by it. As an argument for joining the “Persian section”, the ideologeme of the “mother tongue” was introduced. The chapter questions the historical roots of this ideologeme and examines its function in the language-political debates of those years. It asks why the originally planned project of a “Persian nation” could not prevail and was soon abandoned. Attention is drawn to the question of how the established practice of bi- and multilingualism and the competing project of a “Turkistani nation” affected the language policy debates of the period. It is argued that the territorial-administrative reorganization of Central Asia in 1924 brought about a change in the attitude of many Persian-speaking groups toward their first language, which was subsequently accompanied by the emergence of a Tajik national consciousness. The new political circumstances meant that a language once considered the leading language of culture, education and Islamic religion in a multilingual milieu was transformed into a language whose function was largely reduced to its role as the first language of a speech community defined according to newly introduced ‘national’ criteria. Outwardly, this change in the function of a language manifested itself in the change of the language’s name: “Persian” became “Tajik”.
The Persian language has existed in Central Asia for centuries in a multilingual society, where it held a leading position in the fields of religion, science, literature, administration, correspondence, and trade according to the principles of a functional hierarchy of languages.1 After the establishment of Soviet power in Central Asia, this medium of communication underwent a fundamental transformation into a language whose function was largely reduced to its role as the first language of a speech community defined according to newly introduced ‘national’ criteria. This article describes that linguistic change, which manifested itself in the change of the language name from Persian to Tajik.

1 This paper is a shortened and revised translation of chapters 3 and 5 of Lutz Rzehak, Vom Persischen zum Tadschikischen. Sprachliches Handeln und Sprachplanung in Transoxanien zwischen Tradition, Moderne und Sowjetmacht (1900–1956), Wiesbaden: Reichert Verlag 2001.
https://doi.org/10.1515/9783110622799-001

1 Language and revolution
1.1 In search of a unity of language and nation
In early 1918, Russian railroad workers from Tashkent proclaimed the Soviet Republic of Turkistan. However, they were cut off from Soviet Russia by the Orenburg
Cossacks, and their influence was limited to a few cities. In much of Central Asia,
therefore, the revolution took the form of conquest by the Red Army, which imposed
the same forms of centralized party and military control on these areas as on the
rest of Russia. By 1920, the awareness was gaining ground that conquest alone
would not be sufficient to control non-Russian territories unless constant resistance
from the native population was accepted. The recruitment of “national” leaders
was intended to give Soviet power in these areas a “national” veneer and blur the
impression that it was a new form of Russian domination. The 10th Party Congress
of the Russian Communist Party (Bolsheviks) in March 1921, therefore, adopted a
resolution calling for the promotion of “national” cultures.
The “solution of the national problem for the multinational peoples of Central
Asia” (Varejkis and Zelenskij 1924: 3) was given an importance that extended
far beyond the borders of this region. The successful construction of the newly
created states in Central Asia was to serve as a model for other countries in the
Asian region and thus encourage further revolutions.
In the tension between social-theoretical, real-political, and strategic considerations, decisions were made and facts created in the few years between 1917 and 1928. These were decisions that would have a lasting effect on the development of the Persian/Tajik2 language until today. Drawing on contemporary press publications, the following discussion will show what confusion the nationality and language policy of those years initially caused among the groups that used the Persian/Tajik language and how these people eventually managed to cope with the newly created national orders. We will see that ethnic identities and linguistic loyalties are neither natural nor unchangeable givens, but rather avowals that could also be changed depending on the social conditions – not arbitrarily, but within the framework of certain decision-making alternatives. Years of upheaval like these are characterized by a particularly dynamic development. Within a few months, ideological guidelines and political directives, but also their understanding and associated misunderstandings, could change fundamentally. For this reason, a strictly chronological presentation seems appropriate.

2 In this article, the language name “Persian” is used whenever (but not, of course, in direct or indirect quotations) it refers to the corresponding temporal varieties of the written and, in part, the spoken language that were widespread in the territory of Transoxiana. The designation “Tajik” stands for spoken varieties which were called tojik by their speakers, and for the temporal varieties that began to develop with the first reference of this name to written language from 1924. The combinatorial designation “Persian/Tajik” should make clear that in the respective context the totality of all historical forms of existence of this language is meant which existed in the period and area under investigation.

1.1.1 The project of a “Persian nation”
In Stalin’s concept of nationality, which he had developed in 1912/1913 in his “Marxism and the National Question” (Stalin 1950: 272), and which became the basis of
the Bolsheviks’ nationality policy in the non-Russian territories, language played
an overriding role alongside the territorial principle. Nationality was defined primarily by language, and when Stalin speaks of the “community [commonality] of
language”, he naturally means the commonality of one language. Multilingualism
was not envisaged in this concept. If multilingualism was perceived at all, it was
only seen as a deviation from the norm, and a rather unwelcome one at that.
In the multilingual milieu of Central Asia, Persian did not simply coexist with
other languages. In a functional set of priorities, Persian existed and competed
with other languages. On the one hand, Persian served as the first language for
members of various ethnic groups. These included the sedentary inhabitants who
were referred to by others – at least in the cities – as tojik in the catchment area
of Zarafshan, Amu Darya, and Syr Darya, as well as in the foothills of the Pamirs.
But Persian was also used by other inhabitants of the cities who were bilingual or
had even abandoned their previous idioms in favor of this language, and who, according to their place of residence, simply called themselves Buxoroi, Samarqandi,
or Xujandi. Persian/Tajik was also the first language of Central Asian Jews, some
Arabs, and Éronī, as well as Afghans and exiled Iranians residing in Bukhara and
Turkistan. On the other hand, Persian was used as a second language by many
other inhabitants of Central Asia in its capacity as a language of faith and as
an established high-level language for literature, religious purposes, correspondence, and school education. Transoxianian multilingualism was characterized
by a functional coexistence and hierarchy of Persian, Turki, and other languages.
Against such a background, the idea that nationality was defined by a common
language was bound to cause great confusion. Even those who sympathized
with the new rulers from the beginning and supported their activities were not
exempt from such confusion. In early 1919, when a bitter civil war was still raging
in large parts of Central Asia, the first “national sections” were established at the
Commissariat for National Affairs in Tashkent, and instructions were issued to the
corresponding commissariats in the controlled cities to establish such “national
sections” as well.
1.1.1.1 The initial initiative of the Éronī
In this wave of founding “national sections”, a telegraphic instruction arrived in
early May 1919 in Samarkand, the most important urban center in the Turkistan
area of distribution of the Persian language (Bukhara was still under the rule of
Amir Olim Xon and formally independent at that time). In the instruction, the
founding of a “Persian section” was demanded. The corresponding message, published in the revolutionary weekly Šū”lai inqilob3 “Flame of the Revolution” (see
Figure 1), offers some insight into how the Bolsheviks’ concept of nationality
was understood:
The Persian section (šū”ba-ji fors) (that is: éronī, afǧon, tojik) opened in our center Tashkent
at the Commissariat of National Affairs, has sent a telegram these days to the Commissariat
of National Affairs of Samarkand, proposing the opening of such a section in that authority. It has also recommended the opening of schools, clubs, reading rooms, and the printing and publishing of literature for the fors and in the forsi language. On behalf of all our
forsi-speaking brethren, we express our gratitude to the Center from the bottom of our hearts.
(Šū”lai inqilob 12, 15.5.1919, 8)

As we can see, language was taken very seriously as a decisive criterion for defining this “national section”. The “Persian section” was addressed to all Persian
speakers: Éronī, afǧon and tojik are mentioned by name. Left without mention
and thus without consideration are such important Persian-speaking groups
as the indigenous (also: Bukharan) Jews, some Arabs, or representatives of the
Pamir peoples.
The ranking of the groups mentioned, at least as far as the naming of the Éronī in the first place is concerned, is no coincidence, but an expression of the political conditions at that time. From the beginning, most of the Éronī of Samarkand had supported the goals of the Bolsheviks and their nationality policy more resolutely than representatives of the other Persian-speaking groups in Central Asia, and had worked in the Communist Party. Éronī of Samarkand established the journal Šū”lai inqilob as the first Persian-language press publication of the Soviet period, which appeared with some interruptions as a weekly in Samarkand from April 10, 1919, to December 1921. The official publisher was the Samarkand Regional Committee of the Communist Party.4 The magazine was created at the suggestion of the Committee of Communist Workers of Boǧi šamol, a predominantly Éronī-inhabited district of Samarkand. Its director and editor-in-chief was Sayyid Rizo Alizoda (1887–1938), who himself belonged to the Éronī of Boǧi šamol and had already published in the journal Oyina a few years earlier (Alizoda 1913).

Figure 1: The journal Šū”lai inqilob from 20.8.1919, section of the title page.

3 In the text of this article, a Romanized transcription system for proper names and Persian/Tajik words is used, which for reasons of uniformity and recognizability is based on the pronunciation standard valid today, as fixed in the Cyrillic writing system of Tajik.

4 Information about the editorship of press products of this period should be treated with great
caution. Only a nominal editor was named on the front pages. Under the conditions of the civil
war, all press products were de facto published jointly by the party and revolutionary committees
as well as the councils and political departments of the Red Army (Abdullaev 1989: 7).

It is not surprising that individual Éronī quickly identified with the goals
of the revolution and became politically active in a prominent position. In the
Sunni-dominated environment, the Éronī defined themselves primarily through
their Shiite denomination. The core of the Éronī (also: marvī < Marv) were descendants of the Persian-speaking Shiites of Merv, who had moved to Bukhara after the
destruction of Merv in 1785. Some of them stayed in Bukhara, and others moved on
to different cities of Turkistan and came to Samarkand. Later migrants, who came
from Persia and other parts of Central Asia until the beginning of the 20th century,
also joined the group of the Éronī. The Éronī of Bukhara were predominantly Persian-speaking, whereas the Samarkand Éronī maintained active bilingualism. In
addition to Persian, they spoke an idiom – still poorly studied – that is said to be
very close to the Oghuz-Uzbek dialects of Khorezm or Azerbaijani (Suxareva 1966:
153–165). In 1919, 15,000 Éronī lived in the Samarkand district
of Boǧi šamol. They were generally not considered wealthy, and the memory of
earlier expulsions and oppression had become firmly imprinted in their minds.
After the bloody Shiite-Sunni clashes in Bukhara in 1910, some of
the Éronī population of Bukhara began to refer to themselves not as éronī but as
fors (Suxareva 1966: 153). This name (fors) was intended to emphasize the similarities with the rest of the population, i.e., the common Persian language, and to
push the confessional differences into the background.
Being an ethnic and confessional minority, the Éronī had enough reasons to
pin their hopes on the Bolsheviks when they promised, in the style of the
time, liberation from their yoke to all working people, the poor, the destitute, and
the oppressed. At the beginning of July 1919, the šū”bai fors ‘Persian Section’ was
founded at the Commissariat for National Affairs in Samarkand according to the
Tashkent model. The chairman of this section was Alizoda.
1.1.1.2 The ideologeme of the “mother tongue”
In a contribution to the journal Šū”lai inqilob, Alizoda (1919: 1) put all his journalistic skills at the service of the idea of a “Persian nation”. He enthusiastically
explained the nationality policy of the Bolsheviks, which allowed everyone,
everywhere, to speak in his own language and obliged every assembly to provide
an interpreter even if only a single participant wished to express his thoughts in a
language different from that of the majority. Without going into the conditions
in Turkistan in more detail, he referred to the chauvinistic language policy of the
Tsarist regime, using the example of Poland. In order to explain how he imagined
a “Persian nation” and who, in his opinion, should belong to this nation, Alizoda
coined in this contribution the words that were later often quoted with relish,
in transfigured form and with other intentions:

Language is the great pillar of a nation, and once the language disappears, the nation
that converses in that language will also disappear and perish. No nation in the world can
ensure its existence and survival unless it guards and protects its mother tongue.
(Šū”lai inqilob 12, 10.7.1919, 1)

Alizoda masterfully reproduced the prescribed equation of “language” and
“nation”, and with it the term zaboni modari ‘mother tongue’ entered
a language-ideological process that continues to this day in relation to the
Persian/Tajik language of Central Asia. Characteristically, this concept remains
as undefined as the concept of nation. The pretense is that mother tongue and
nation are primordial givens that define groups in a natural way and therefore
beyond a shadow of a doubt.5 The very question of the Bukharan Jews, who were
left out of Alizoda’s project of a Persian nation, would have exposed the equation
of “mother tongue” and “nation” as a construct, since the Bukharan Jews also
have sufficient claim to regard Persian as their “mother tongue”. Furthermore,
it remains unclear in this context whether Alizoda intended, for example, those
Turkic tribes of Central Asia to be part of the “Persian nation” who were known
as tojik-čiǧatoy and inhabited hundreds of settlements between Širobod, Baysundaryo, Surxandaryo, Kofarnihon, and Qyzyl-Su. Most of them used Persian/Tajik
idioms, referred to themselves as tojik, but all the same saw themselves as belonging to various Turkic tribes (see Materialy po rayonirovaniyu 1926: 231–232). Cases
are known from Urgut district where the members of such a tribal lineage spoke
Persian/Tajik or Turki idioms, depending on the village they lived in, but still felt
they belonged to one group (Suxareva 1966: 128).
It was undoubtedly useful for Alizoda’s argumentation not to define the
concept of “nation” or “mother tongue” as in many cases this would have been
difficult to do. The situation in the bilingual milieu of Samarkand may have
allowed a more or less clear distinction between who had learned Persian “from
his mother” and who had learned Turki. But the spoken language in the distribution areas of the northern Tajik dialects exhibits so many Turkic features in
lexicon, morphology and syntax that its speakers, especially if illiterate, hardly
notice the shift to the neighboring Turki dialects. In a concept where the question of “mother tongue” is linked to an either-or, there is no place for the speakers of these Northern Tajik dialects, which Doerfer (1967: 57) later called a “Turkic language in nascent state”. The incompatibility of the Bolshevik concept of nation with the traditional bi- and multilingualism of Transoxiana will concern us repeatedly in the following.

5 An identity of “mother tongue” and “nationality” was already assumed by the Russian side in the census of 1897, when information on the “mother tongue” (rodnoj jazyk) was primarily intended to register the nationality of the persons surveyed. See Bauer, Kappeler, and Roth (1991: A, 144–146).

Nevertheless, the term “mother tongue” is particularly well suited as an ideological vehicle, because the combination of the words modar ‘mother’ and zabon
‘tongue’ refers to more than the person from whom one learned the basics of the
so-called “mother tongue” in early childhood. The term “mother tongue” also
implies a strong moral call to respect this idiom, and to respect it as one respects
one’s mother. To say “mother tongue” is to call for nurturing and preserving that
idiom. This deontic connotation comes to the fore when this term is addressed
to persons who, for practical, pragmatic, or other reasons, but in any case as a
result of their own decision, predominantly use another idiom in everyday communication. In view of widespread bilingualism, and given that
Central Asian Turki had become increasingly attractive to many Persian
speakers of Samarkand and other areas of the Zarafšon Valley about fifty years
after the Russian conquest of Central Asia, there is reason to believe that Alizoda
chose the term “mother tongue” because of its moral weight.
There are no indications of how the term “mother tongue” was received and
understood at that time: that is, in July 1919. A glance at the press of August 1924,
however, yields some conclusions. On the immediate eve of the “national delimitation of Central Asia” and more than three years after the ideologeme “mother
tongue” had been introduced into the discourse, a contributor to the newspaper Ovozi tojik used the following phrase in this context: “In the provinces Turkistan and Farǧona, as well as in Eastern Bukhara and Masčoh, there are many
Tajiks who generally speak Tajik at home and in the bazaar, that is, their ordinary
language (zaboni oddii išon) is Tajik” (Ovozi tojik 4.9.1924: 1). Although the term
“mother tongue” would undoubtedly be appropriate by today’s standards, the text
says instead that a language is spoken “generally” and is therefore the “ordinary
language”. Obviously, zaboni modari “mother tongue” was not at all part of the
general and ordinary vocabulary of the implied Persian/Tajik “native speakers”
at that time. A few lines later, this word does appear in the same text. The author
uses it when he explains that in the schools of Samarkand some children understand faster in Uzbek while others understand faster in Tajik,
and concludes from this that one acquires knowledge faster ba zabonī oddi va
modari-ašon “in one’s ordinary and mother tongue” (Ovozi tojik 4.9.1924: 1). So
even in 1924 the ideologeme of the “mother tongue” still required an additional
explanation as “ordinary language”, i.e., an adaptation to the general contemporary use of language.
Already in July 1919, when Alizoda introduced the term “mother tongue” into
the debate and declared it to be the “great pillar of the nation”, he himself still
had great linguistic difficulties in clearly formulating the idea of a nation based
on the Persian language. In the above-quoted text, the expressions millati fors
‘Persian nation’, millathoi fors ‘Persian nations’, forsiyon ‘Persian[-speaker]s’,
and forsho ‘Persians’ are used alternately without any discernible logical connection, although they are quite different in content. Such linguistic inconsistency is, of course, closely connected with the conventional use of the word
millat ‘nation’, which until then had denoted anything but a linguistic community.
Therefore, Alizoda could not avoid naming the population groups he had in mind
with their established appellatives: tojik ‘Tajiks’, éronī ‘Éronī’, afǧon ‘Afghans’,
hindi ‘Indians’.
1.1.1.3 Limited acceptance for the project of a “Persian nation”
The real problem with the ideologeme of the “mother tongue” was that it treated
identities that existed only as constructs as if they were natural phenomena. Therefore,
again, only Éronī felt truly addressed. Following the call of the “Persian Section”,
some of them founded the anjumani muovanati éronīyon ‘Society for the Representation of Iranians’ on July 15, 1919. The word éronī was used here in a
narrower sense. This society saw itself as the representative of those Iranians
who held Iranian citizenship, and it wanted to look after their interests until such
time as there was diplomatic representation (Šū”lai inqilob 13, 17.7.1919, 8).
These were merchants from Iran who resided in Samarkand, but also younger
Iranian migrants. The vast majority of Samarkand Éronī, who could also be called
marvī because of their descent from Merv, and who had been living in Samarkand for over a hundred years, were not among them. Transcending religious and
denominational differences, this society wanted to unite all Iranians (in the aforementioned legal sense) of Samarkand. At the founding meeting, ten Muslims were
elected to the fifteen-member presidium. Five additional positions were reserved
for Jews, Bahais, and Iranian Armenians, to be elected at a later date.
In August 1919, the national affairs administration in Samarkand declared
that the following “national sections” existed: a Persian section, an Uzbek
section, a Jewish section, an Armenian section, a Ukrainian section, a Tatar
section, and a Turkish (Azerbaijani) section (Šū”lai inqilob 18, 20.8.1919, 7). The
“Persian section” published another appeal at that time whose desperate tone
already reveals a growing resignation:
Brothers! Speakers of the same language (hamzabonho)! You should know well that everyone
cries over his own grave. No nation knows your needs and sorrows like yourselves . . . How many
fors and tojiks are there in Samarkand and how many Persian schools do they have? Now we
have abolished the classes and reading rooms! If you are fed up with your mother tongue, if
you have renounced your mother tongue, then tell us officially so that we can close our store.
(Šū”lai inqilob 18, 20.8.1919, 7)

The refusal of the numerous Persian speakers to join the project of a “Persian
nation” is closely linked to the attractiveness of another project, which was based
on the historically developed priority of regional identities over ethnic or linguistic identities and was pursued during this period by notable representatives of
the Central Asian Enlightenment. This project was oriented primarily not toward language but
toward region, and can be summarized under the keyword millati turkiston
‘nation of Turkistan’. Such a project had already been propagated in the magazine Sadoi Turkiston ‘Voice of Turkistan’ from 1914 to May 1917. The great uprising
against the forced recruitment among the Central Asian population – which Tsar
Nicholas II had decreed in 1916 – helped Turkistan’s patriotism achieve its final
political breakthrough. Such an understanding of millat, based on the Turkistan
region, had received further impetus when the Autonomous Republic of Turkistan
was established on April 30, 1918. Turkistani patriotism began to transform into
Turkistani nationalism. The Bolshevik idea that the formerly oppressed peoples
of Russia first needed a national revolution before they could tackle a social one
must be seen against the background of the awakening of Turkistan’s nationalism, which, incidentally, could also be instrumentalized for ideas that extended
beyond the spatial borders of Turkistan.6 Linguistically, this change manifested
itself in the fact that, on the pages of the journal Šū”lai inqilob too, the word turon
“Turan” began to be used more often than the word Turkistan.
At that time, the idea of a Turkistani patriotism that would unite all Muslims
in Turkistan could inspire even those who primarily used the Persian language.
Sadriddin Aynī, who, unlike other linguists and contemporaries, had never completely renounced the Persian/Tajik language, composed a Turan March in the
Turki language at that time, in which he called on Uzbeks, Kazakhs, Tatars, and
Turkmen to unite (Komatsu 1989: 129).
Against this background, the anjumani doniši forsiyon ‘Knowledge Society
of Persian Speakers’, which was founded in Samarkand in early September
1919, also went largely unnoticed. This society addressed all Persian speakers of
Samarkand and the surrounding area, with its commitment to educational issues
and the publication of Persian-language literature. Despite the comprehensive
claim made by the name of this society, this organization was also exclusively a
child of the Éronī of Boǧi šamol.

6 For more details on Turkistan nationalism see Zenkovsky (1960: 225–253), Komatsu (1989), and
Allworth (1990: 173–209).

1.1.1.4 Early misunderstandings around the ideologeme “national language”
However, by this time the political learning process in the Bolshevik sense had
progressed further: When the journal Šū”lai inqilob reported on the founding
of the “Knowledge Society of Persian Speakers”, the term zaboni millī ‘national
language’ was probably introduced for the first time with direct reference to
the Persian language of Central Asia (Šū”lai inqilob 19, 11.9.1919, 7). Where the
concept of nation was so strongly fixed on language as in the understanding of
the Bolsheviks at that time, the term “national language” could not be long in
coming. When the expression zaboni millī was written down in September 1919,
and disseminated through the medium of the journal, it was at first no more than
a formal copy of the then very common Russian expression nacionalʹnyj jazyk.
But the Russian original had a somewhat different meaning in the vernacular of
the time than did its Persian replica. The Turkistan nationalists, in turn, attached
yet another meaning to the term “national language”.
In the pre-revolutionary period, the Central Asian languages were usually
summarized by the Russian authorities in a way that – out of imperial posturing –
did not further differentiate them with the attributes signifying tuzemnyj ‘native’,
musulʹmanskij ‘Muslim’, or inorodčeskij ‘foreign-born’, the latter being used preferentially for writing and school systems. It is essential to note that such designations grouped all other languages together without further distinction and
contrasted them with Russian.7 Although the Bolsheviks pursued different political intentions in 1919 than the tsar’s officials previously had, they had no better
knowledge of the ethnic and linguistic situation in Central Asia. With regard to
Central Asia, therefore, the “solution of the national question” was no more than
an abstract idea in those months. This is shown by the fact that national oppression and Russian chauvinism at the time of the tsarist empire were always clearly
noted in the Bolshevik propaganda intended for the Central Asian population –
only on the basis of examples from the European part of Russia, but never on the
basis of Central Asian examples. Preference was given to the persecution of Jews
and the situation of Poles and Ukrainians.8 Examples from Central Asia would
certainly have been much more convincing but they were not given, probably
because the situation in Central Asia was not known well enough. In these years,
however, day-to-day politics was essentially determined by the civil war, and
surveys or detailed investigations on the “national question” – or the language
situation – in Central Asia were a distant dream before the arrival of Turkkomissija, which will be further elaborated on. For this reason, some of the Bolsheviks’
formulations initially differed from the language of the tsarist era only in appearance, but not in content.

7 A veritable treasure trove of appropriate formulations is offered by a report on the education
system of the Syr-Darya province, submitted by the then director of elementary schools of this
region, S. Gramenidskij (1916).
8 See Šū”lai inqilob (1, 10.4.1919 and 12, 10.7.1919, 1).

The term inorodčeskij was considered a taboo for the Bolsheviks because of
its imperial connotations, but other terms – tuzemnyj more than musulʹmanskij –
were also identified with the language of yore and therefore increasingly replaced
by the word nacionalʹnyj, which corresponded much better with their political
ambitions. But when the Bolsheviks first mentioned nacionalʹnyj in Central Asia
in 1919, it initially often meant nothing more than an indifferent “native”, or
“non-Russian”. At that time, for instance, the term nacionalʹnaja pečatʹ ‘national
press’ was used to refer to all publications in Central Asian languages to contrast
them with the russkaja pečatʹ ‘Russian press’, without further distinction. As late
as 1923, nacionalizacija ‘nationalization’ was used to describe the inclusion of
indigenous forces in public administration (Abdulloev 1989: 7). The republics
founded after the territorial reorganization in Central Asia were grouped together
in contemporary language use under the designation nacionalʹnye respubliki (or
nacrespubliki) ‘national republics’ and contrasted with Russia. Therefore,
nacionalʹnyj jazyk also often meant nothing more than “indigenous language” or
“non-Russian”.9
But for Alizoda and the other Éronī who had founded the “Knowledge Society
of Persian Speakers” in Samarkand in September 1919, the expression zaboni
millī, formed after a Russian model, obviously had a concrete meaning. They did
not necessarily see the term “national language” as an antithesis to “Russian”,
although such usage could not be ruled out. First and foremost, this term was
intended to mark a new status for their own idiom, which henceforth was to
assert a right to exist, to be nurtured, and to be promoted.
The Éronī in particular, perhaps more than other groups, sought an answer to
the question of what the term “national language” should mean, because, from a
linguistic point of view, it was mainly Central Asian Turki that could benefit from
a Turkistan nationalism that was very popular at the time. In the Soviet republic
of Turkistan, proclaimed in 1918, Turki was, along with Russian, de facto the predominant administrative language, whereas Persian could only maintain its last
stronghold in the Emirate of Bukhara. Against this background, the then already
widespread conviction that Turki was the language of the new civilization and
Persian the language of the past was given new impetus. The members of the language and literature circle Čiǧatoy gurungi ‘Chaghatai Discussion Circle’ around
Abdurrauf Fitrat tried to develop a literary language on the basis of Central Asian
Turki in order to do justice to the changed political and social status of this idiom
(Baldauf 1991: 83). The definition of this conceived literary language took place
in conscious demarcation against Persian, and it was pursued in corresponding
purisms.10

9 This meaning of the word nacionalʹnyj has survived situationally until the recent past. The
departments of Russian libraries in which literature in the languages of the non-Slavic peoples
of the former Soviet Union is kept are called otdel nacionalʹnoj literatury ‘department of national literature’. Members of Central Asian peoples may also be referred to in everyday speech as
nacionaly ‘nationals’ and, without further differentiation, may be grouped exclusively in their
capacity as “non-Russians” or “non-Europeans”.

As a general lingua franca, as well as with its acceptance in the administrative
system of Turkistan, Turki had thus long since received the status of a “national
language” – in this Turkistani understanding. This status was also to be asserted
against other languages of the region.
When Alizoda and the other activists of the “Knowledge Society of Persian
Speakers” henceforth called for the rights of a “national language” to be claimed
for Persian as well, the intention was to prevent this language from continuing
to be seen only as a “language of faith” and discredited as a “language of the
past”. However, as mentioned above in connection with the founding of this society, it was
called, in a restricted sense, only zaboni millii éroniyon ‘national language of the Éronī/Iranians’.
The original idea, formulated in May 1919, to unite all Persian speakers of Central
Asia into a “Persian nation” was thus already abandoned in September of the
same year.
The project of a “Persian nation” was finally doomed to failure. The journal
Šū”lai inqilob, which propagated this project, and was intended for the members
of this imagined “Persian nation”, had to bid farewell to its readers on October 16,
1919, after a brief existence of only six months. Despite repeated advertising and
agitation campaigns, the necessary number of 1,000 subscribers was not reached.
The expenses became an unsurmountable burden, and the Party Committee of
Samarkand was not ready to provide more support. There was a shortage, not
only of subscribers but also of authors. In six months, only six Persian-language
articles from outside had reached the editors (Šū”lai inqilob 22, Oct. 16, 1919, 8).
It should probably be considered a consolation when, alongside the editorial
board’s farewell letter, an advertisement was printed for the Persian-language
newspaper Najot ‘Salvation’, which was published by the “Muslim section” of the
Communist Party in Ashgabad. Šū”lai inqilob was the only Persian magazine in
Arabic script published in Transoxiana in early Soviet time.11

10 A comprehensive account of Fitrat’s transformation into an anti-Persian Turkistani language
nationalist for several years is offered by Komatsu (1989: 123–128). At the end of the 1920s, Fitrat
was again very committed to the Persian Tajik language in connection with Latinization and the
discussion about the development of a national literary language.

1.1.2 The project of “bilingualism”
The priority of regional identities over linguistic or ethnic ones expressed itself
politically as a strengthening Turkistan nationalism, about which many Persian
speakers were also enthusiastic at that time. Hence another path to
promote the Persian language and literature appeared, at least temporarily: at the time, it seemed
more promising than the failed project of a “Persian nation”.
This way consisted in the conscious continuation of the traditional Turki-Persian
bilingualism, which already had characterized the journal Oyina in pre-revolutionary times.12 The editor of this journal Mahmud Behbudī declared in the first
issue: “Turkistanis need to know turki, forsi, Arabic and Russian. Turki is necessary because the majority of the Turkistani population speak ūzbaki. Forsi is the
language of the madrasa and of literature. [. . .] Persian poetry will never lose its
elegance. [. . .] Turki is the language of modern learning which opens the way to
Tolstoy and Jules Verne, Kepler, Copernicus, and Newton.” (Oyina. 1, 20.8.1913,
12, quoted after Baldauf 1992).

1.1.2.1 Custom and habit became a project
The project of “bilingualism” was a matter of course for those who pursued
it. When it was undertaken, it was associated with two names in particular during
the years 1919–1921: Sadriddin Aynī and Hojī Muin.

11 From the end of 1919, wall newspapers for the native population of Turkistan were published
under the title Rost on the initiative of Turkkomissija (Abdullaev 1989: 12). Pestovskij (1927: 427)
and Sajidūf (1931b: 1) refer to a wall newspaper of the same name, which was published or distributed free of charge in 1920–1921 by R. Badalūv for Bukharan Jews, i.e., also in Persian, but in
Hebrew script.
12 From August 1913 to October 1915, Mahmudxoja Behbudī (1875–1919) published the magazine
Oyina ‘Mirror’ in Samarkand, which ran to 68 issues. It played a major role in the dissemination of Enlightenment ideas and, if one disregards some advertisements in Russian, appeared as
a bilingual paper in Turki and Persian. However, these two languages were represented to varying
degrees in this journal. The dominance of Turki is evident in the fact that about one-third of all
contributions were in Persian and all others in Turki. A clear functional division is also evident in
the selection of texts that were printed in one language or the other. New text genres such as news
reports, which were created with the medium of the journal in the first place, were published exclusively in Turki. This also applied to editorial and scientific contributions. Persian, on the other
hand, was the language of essays and more abstract treatises of a philosophical or similar nature.

Hojī Muin ibn Šukrullo Samarqandī had already made a name for himself
in the 1910s as a teacher, school founder, author, and translator, as well as one
of the most important publishers of New Methodist textbooks and Enlightenment literature in Samarkand. In 1914 he appeared on the scene as the author
of a Persian poetry collection for New Methodist schools (Hojī Muin 1332), and a
religious textbook Aqidai islomiya ‘Islamic view’. In 1916, his three socially critical plays appeared in Turki with the titles Juvonbozlik qurboni ‘Sacrifice of Boy’s
Love’, Eski maktab – yangi maktab ‘Old School – New School’, and Turkiston
maišatidan: Kūknori ‘From the Life of Turkistan: The Opium Smokers’. As early
as 1911, Hojī Muin had also translated Fitrat’s Munozara from Persian into Turki
and published it first in the newspaper Turkiston viloyatniñ gazeti, and two years
later in his own publishing house as a book. In 1914 (or 1915) Hojī Muin became
temporarily the editor of the journal Oyina, where many of his contributions also
appeared. Between 1914 and 1917, Hojī Muin also published in the newspaper
Sadoi Turkiston (Baldauf 1992: 9–10; Allworth 1990: 179–180).
Aynī’s journey from exclusively speaking Persian to becoming bilingual has
already been alluded to. After a six-month recuperative stay in Tashkent, Aynī
returned to Samarkand on October 22, 1918, where he taught language, literature, and history classes, in Persian and Turki, in a school founded by the Soviet
authorities (Aynī 1958a: 102).
Among the political conditions to be considered in this context, the arrival
of the Turkistanskaja komissija (Turkkomissija) ‘Turkistan Commission’ must be
mentioned, which had been appointed by Lenin on October 3, 1919, and was
directly subordinate to the Central Executive Committee and the Council of People’s Commissars of the Russian Soviet Federative Socialist Republic. The Turkkomissija arrived in Tashkent on November 4, 1919 with a large staff of party
workers. Among other things, a special commission was formed to prepare a
report on the state of the Soviet administration in Turkistan, concluding that
the imperialist policies of the tsarist era had been continued in many places by
cruel methods under the pretext of waging class struggle. On the initiative of the
Turkkomissija, the abolition of Islamic jurisdiction, which had been decreed in 1918,
was withdrawn and mosques were (temporarily) removed from under the yoke
of Soviet control (see Figes 1989: 750–751). An announcement in the magazine
Šū”lai inqilob shows how this event was received in Samarkand at that time. In
contemporary diction it was said:
Yes, after the October Revolution, Soviet power was established in Turkistan as well, and
parties and organizations had been formed in the name of the Communists and the downtrodden. But those brothers who had taken the lead in the name of the downtrodden, and
in the name of the Party of Communists mostly did not act according to the program of the
Communists. That is why there were many improper appearances, unjustified requisitions,
and the like. Therefore, our European brothers still presented themselves as the upper class,
cultivated a bad image about the Muslims and denied them equal rights. Then, if one of the
Muslims stood up and demanded the rights of the Muslims, he was attacked with the words
“You are a nationalist! You are a reactionary! You are a supporter of the rich!” . . . Now that
the military actions on the road from Tashkent to Orenburg have ended, and railroad traffic
has been restored, a commission has come from Moscow to correct the affairs in Turkistan.
The mentioned commission has full authority to do anything it wants to do.
(Šū”lai inqilob 29, 9.2.1920, 1)

The Turkkomissija also attached great importance to press and publications.
Under the leadership of its member M. K. Trojanovsky, wall newspapers (Rost)
with political statements, appeals, and news reports appeared regularly. After
Trojanovsky sent a copy of the first issue of these wall newspapers to Moscow, a
telegram arrived from Moscow with the following message: “Develop this matter
further. Lenin is especially in favor of newspapers in local languages. Money will
be given as much as necessary” (quoted after Abdullaev 1989: 12–14).
Under these changed political conditions, the journal Šū”lai inqilob was also
to be given a second chance. On December 7, 1919, barely two months after its
farewell to its readers, it appeared again. In an editorial contribution, Alizoda
reported that some like-minded people, especially Hojī Muin, had not come to
terms with the end of this journal and had lobbied for its reappearance. The necessary material resources were requested from the Samarkand provincial Committee of Muslims and delivered to where they were needed. In addition, comrades Aynī, Muxtorī, and Muhammadī had agreed to cooperate (Šū”lai inqilob 23,
7.12.1919, 1).
Aynī simultaneously took up work in the Turki language newspaper Mehnatkašlar tovuši ‘Voice of the Working People’, and wrote his poems and essays
during this period in both Persian and Turki. Ostensibly to give the impression
of authorial diversity despite the shortage of literati, Aynī signed his essays
with different initials at that time (Aynī 1958a: 103). In contrast to the historical model of the journal Oyina, bilingualism was no longer cultivated within one
publication organ, but rather in separate newspapers that appeared in different
languages but to which the same people could contribute. However, only one
journal existed in Persian, whereas numerous publications of several kinds existed
in Turki.
Nevertheless, the spell seemed to have been broken. With such a prominent
advocate as Hojī Muin – Aynī was still relatively unknown at the time – the journal
Šū”lai inqilob could no longer be regarded merely as the exclusive enterprise of
the Éronī of Boǧi šamol. In the unmistakable hope of gaining as many Persian
speakers as possible as readers, from then on, all formulations that would indicate a claim to a “Persian nation” were dispensed with. Moral appeals to the need
to nurture a “national language” were also abandoned. When Aynī explained the
necessity of a Persian-language journal in his first essay for this journal (Tanviri
afkor ‘Enlightenment of Thought’, Šū”lai inqilob 23, 7.12.1919, 1–3), he therefore
used – in accordance with the new claim – only innocuous expressions with the
meaning “Persian speaker” (mostly: forsizabonon, only once: forsiyon). Aynī hoped
to win new readers for this magazine not by appealing to any “national feelings”
or by again straining the deontic concept of the “mother tongue”. He pointed
instead to the general benefits of reading periodicals and addressed the hoped-for readers with the simple remark that this periodical is, after all, ba zaboni
xudaton našr mešavad ‘published in your own language’. The article states in
detail:
In the two years since the revolution, all our Uzbek brothers have become acquainted with
newspapers. In the cities, there are hardly any people who can read and do not know about
newspapers.
But we Persians, unfortunately, still stand as spectators in the alley of cluelessness,
thinking that the press and newspapers are only there to make money for the publicists.
When there was no Persian newspaper in Turkistan, speakers of Persian could be forgiven
for not reading newspapers. With the six-month existence of Šū”lai inqilob magazine, there
is nothing to forgive. We have been forced to acknowledge that Persian speakers do not yet
know what newspapers are. Even though vast sums of money are spent to print a single
magazine, no one wants to buy it even for just one-eighth [of the production price], which
is almost free.
But since the entire care of the Soviet Government and the Party Committee is devoted
to enlightening the minds of the downtrodden, it was decided to continue Šū”lai inqilob.
For all downtrodden speakers of Persian, the door to this traveling school [to newspapers –
L.R.] remained open.
Now I turn to the Persian-speaking brothers with all my hope and in my deepest
sincerity:
Brothers! Devote one hour of your life every week to reading this newspaper, and use
this newspaper, written in your own language, as best you can, so that in return for this one
hour you may get back that life which you have lived in cluelessness.
(Šū”lai inqilob 23, 7.12.1919, 2–3)

The self-image that this journal began to convey in the following period was even more
clearly mo turkistoniyon “we Turkistanis” or occasionally mo tūroniyon “we
Turanis”. The journal Šū”lai inqilob considered itself responsible for those Turkistanis who used the Persian language exclusively or primarily.
In contrast to 1919, there is no evidence of the term millat ‘nation’ having
been used explicitly in connection to the Persian/Tajik language in 1920 and
1921. Most often, the word millat was related to “Turkistan”. This confirms
the wide acceptance Turkistan nationalism received at that time, even among
Persian speakers. In such a sense, Alizoda wrote on the occasion of Behbudī’s
death in April 1920:
The martyrdom of Mahmud Xoja has not only put the people of Samarkand into mourning
and gloom, but has affected and saddened all ten million [members] of the nation of Turkistan (millati turkiston). Today, the whole of Turkistan participates in this general misfortune and national grief, is wrapped in black clothes, and is gripped by desperate pain.
(Šū”lai inqilob 35, 8.4.1920, 1)

In an analogous way, referring to the whole of Turkistan without linguistic differences, Aynī also used the word millat when he wrote, for example:
As a result of the general war [World War – L. R.], the enslaved of Russia have carried out an
uprising and overthrown the Tsar’s throne with belongings, given freedom to all poor toilers
and enslaved peoples (millat), and also given us Turkistanis a republic as well as separate
autonomy.
(Šū”lai inqilob 40, 17.6.1920, 1)

However, the understanding of millat was not as clear-cut as these examples
would lead one to believe. An exclusive reference to the Persian language was
ruled out, but millat could still be understood in the conventional sense as a
“community of faith” and used accordingly:
Since the Soviet rule was established in Turkistan, the way to this kind of education is open
for the whole population, the materials for it are available. But, unfortunately, we Turkistanis do not use these benefits, do not collect the ripe grain sheaves. And yet, of all the
peoples (millat) living in Turkistan, we Muslims need knowledge and education the most.
(Šū”lai inqilob 44, 26.7.1920, 2)

The poem Xitob ba tūroniyon ‘Call to the Turanis’ by Mullo Nodiro, published in
the journal Šū”lai inqilob in March 1920, not only confirms a synonymous use
of the term millat referring to the Islamic religious community. It also shows us
how well the use of the Persian language could be united with a Turkistan or Turan-oriented self-awareness in the understanding of the time. It states:
mekunam ay aziz bo [sic] tu bayon
ki ba ahli basirat ast ayon
kist on modare ki asli tust?
xoki poki muqaddasi tūron
millatat čist? millatat islom!
hodiyat kist? hodiyat qur”on!

O beloved, I say to you,
what is clear to perceptive people.
Who is that mother from whom you come?
The pure holy earth of Turan.
What is your nation? Your nation is Islam.
Who is your guide? Your guide is the Quran.
(Šū”lai inqilob 33, 8.3.1920, 5)

1.1.2.2 Persian/Tajik indifference
In Persian language usage, as it has come down to us in the form of the journal
Šū”lai inqilob from the years 1920 and 1921 (from 1922 to 1924 there were no
Persian press products in Turkistan), two circumstances stand out that should be
of particular importance for the following development:
First: The Persian language is referred to exclusively as forsī “Persian”. The
term tojik(ī) “Tajik” does not occur as a language designation. This corresponded
to the linguistic habits known from pre-revolutionary times. Forsī was a general
designation for the written Persian language and a proper designation for the
idiom spoken in the cities of the Zarafšon plain. Only Turkic-speaking groups
in Central Asia referred to all groups of people as tojik who belonged to the
Islamic community, led a sedentary lifestyle, and did not have tribal structures.
However, tojik was not used as a self-designation by all who could be so called
by others. Only the inhabitants of the mountainous regions, for example in and
near Khujand, in the Zarafšon Mountains, and in Qarotegin and Darwoz, called
themselves tojik. In Falǧar and Darwoz, tojik was used both as a self-designation
and as a foreign name by the neighboring Jaǧnobi. In the upper reaches of the
Panj, speakers of East Iranian Pamir languages could also refer to themselves as
Tajik, while their Persian-speaking neighbors in Darwoz, who saw themselves as
tojik, were referred to by them as porsigūy / forsigūy “Persian speakers” (Andreev
1925: 156).13 Hence, the self-confidence of Persian speakers living in the mountains, many of whom were quite wont to refer to themselves and their language as
tojik and zaboni tojikī, had virtually no influence on the formation of the opinions
of Samarkand publicists. The use of the term forsī also testifies to a linguistic
independence from the Russian-dominated Soviet power, since the term tadžikskij, derived from tojik, had already established itself in Russian usage at the time
with a meaning similar to that in the Turkic languages of Central Asia.
Second: Unlike Persian speakers, Turki speakers were not grouped together
as such, being called instead ūzbekon ‘Uzbeks’ and also defined as a group in
word compounds such as maktabi ūzbekiya ‘Uzbek school’ or toifayi ūzbek ‘Uzbek
people’. Their language continued to be called turkī. The Persian-speaking groups
of Turkistan, on the other hand, are still defined exclusively by their language
and grouped together as forsizabonon ‘Persian speakers’ (more rarely: forsiyon).
From a historical perspective, the complete absence of any formulations with the
word tojik(ī) is remarkable. Aynī also did not use any wording during this period
that would indicate an identity as a tojik.

13 Regarding the origin, semantic change, and especially the – historically seen – predominantly
social-cultural meaning of the term tojik, reference is made to the summary accounts in Bartolʹd
(1964), Fragner (1989, 1999: 19–21, 41–43, 2021: 24–29), Lentz (1933: 9–15), and Schoeberlein-Engel
(1996: 123–172). Lorenz (1964: 574–579) offers an overview of etymological attempts at interpretation from the 19th and early 20th centuries, some of which proved scientifically untenable, but
which continue to inspire a wide variety of persons in Central Asia, as well as Iranian and Afghan
intellectuals.

In other words, the primarily Turkic-speaking inhabitants of Turkistan began
to develop a group identity as “Uzbeks”, which – despite the absence of a generally recognized literary language and their tribal or socio-cultural diversity –
was able to unite both linguistic commonalities and the regionally based self-image as “Turkistanis”. The term ūzbek/o’zbek was originally reserved only for the
semi-nomadic population, who were considered descendants of the Dašti Qipčaq
conquerors and, unlike the inhabitants of the cities, had preserved tribal structures. After this word was first used in a broader sense, probably by the Russians,
it was soon able to gain increasing appeal as a proper designation for individual
representatives of settled Turkic-speaking population groups in Turkistan (with
the exception, of course, of Turkmen, Kazakhs and Kyrgyz) (Baldauf 1991: 86–89).
Primarily Persian-speaking Turkistanis in the cities, who had long been
called tojik by others and were also listed as such in Russian statistics, unlike the
inhabitants of the mountains at the time, preferred regionally based identities.
They did not reveal a group identity, which would have combined a linguistic,
ethnic and a regional self-image, at least in the press or in exposed political positions. Together with other Persian-speaking groups, which were very different
in ethnic and religious terms and in part also had different regional origins (one
thinks, for example, of the Éronī and Afghans), they were seen “only” as a linguistic unit and were combined with them to form a linguistic community.
Therefore, it certainly corresponds to a certain – or more correctly: a missing
or unarticulated – self-image when the constitution of Soviet Turkistan, which
was adopted in 1920, named only kirgizy ‘Kirghiz’, uzbeki ‘Uzbeks’ and turkmeny
‘Turkmen’ among the korennye nacionalʹnosti “long-established nationalities”,
but not “Tajiks” or, for example, “Persian speakers” (Bartolʹd 1925: 111). “Tajiks”
had not identified themselves as a separate group – at least in the cities. In the
mountainous areas, where this might have been expected because tojik was
accepted as a self-designation, anti-Soviet sentiments dominated during this
period. The military activities of the Basmachi, which were centered there, continued in part until the end of the 1920s. “Persian speakers” were recognizable as
a group, but far too heterogeneous in their ethnic and religious composition to be
collectively considered a “long-established nationality”. The fact that, of all the
Persian speakers in Turkistan, representatives of the Éronī minority were the first
to speak out politically, may also have influenced the formation of opinion among
the Bolsheviks about the composition of Turkistan’s Persian-speaking population.
But it was not only a lack of a “Tajik” identity. The Persian language, or at
least the only magazine in that language, did not enjoy great popularity even
among the sympathizers of the revolution. Despite all efforts to gain new readers
by developing a self-image oriented toward all of Turkistan, the editors of the
weekly Šū”lai inqilob still had to complain about a lack of acceptance more than
half a year after the new beginning and point to the positive example of the
“Uzbek brothers”:
In Turkistan there is no large city where a newspaper for the Muslims would not be published. But in relation to the population there, not even one in a hundred, or even one in
a thousand, is a reader. Persian newspapers in particular seem to have no readers at all.
Thus, for more than a year, the magazine Šū”lai inqilob has been published for the Persian
speakers of Turkistan. It has some readers from the city of Samarkand, but even these have
appeared only through the efforts of government and party personnel. From the rest of the
Persian-speaking population in Ūroteppa, Xujand, Koni Bodom, Isfara, and elsewhere, we
have perceived no sign of attention.
Our Uzbek brothers are showing a hint of awakening. Signs of attention and correspondences arrive to the editorial office of our brother newspaper Mehnatkašlar tovuši from
all corners of Samarkand province, even from other places, albeit sparsely. Thus letters,
news and essays come from Andijon, Namangon, Avliyoato and [the city of] Turkiston.
But not a single letter has arrived to our journal from any Persian-speaking corner of
Turkistan. This shows the complete indifference of our Persian-speaking brethren.
(Šū”lai inqilob 40, 17.6.1920, 2)

The deep despair that gripped Aynī in the face of this ignorance cannot be denied.
Nevertheless, he persisted with his “bilingualism” project. In June 1920, on the
eve of the Bukhara Revolution, Aynī was summoned to Tashkent for a month by
Bukharan communists to write projects, speeches, and other agitational materials for the people of Bukhara on their behalf. Aynī wrote these materials in two
languages: Persian and Turki (Aynī 1958a: 104). Thus, the newspaper Qutuluš,
which was published by Bukharan revolutionaries in Tashkent with a total of
eleven issues from June to August 1920, included Persian prose texts and poems
in addition to the articles written mostly in Turki (Aynī 1926: 577).
With the victory of the Bukharan Revolution in September 1920, the “bilingualism” project was to fail for good. The overthrow of the Emir of Bukhara had
been achieved with the significant participation of foreign forces. In addition to
the Red Army, this included Uzbeks from Tashkent and Farǧona as well as prisoners of war from Turkey. The newspaper Šū”lai inqilob, which sent its own correspondent, namely Aynī, to Bukhara in November 1920 to report on the situation, initially painted a very contradictory picture of the situation under the just
two-month-old Soviet power. On the one hand, it praised the discipline among
the revolutionaries and among the local population, who even voluntarily held
out in the square in front of the mosque after Friday prayers to attend a meeting,
whereas in Samarkand people usually had to be driven by force of arms into the
walls of a madrasa for a meeting (Šū”lai inqilob 59, 2.12.1920, 4). On the other
hand, “the small and inexperienced organization called javonbuxoroyon ‘Young
Bukharans’ was completely overwhelmed in setting up a functioning administration. Even the help of “three to four Turkish and Tatar Turkistani brothers” could
not make much difference (Šū”lai inqilob 61, 30.12.1920, 2).
The young revolutionary impetus to turn everything upside down and the
inexperience of the new rulers were also to have linguistic consequences. The
newspaper Buxoro axbori ‘Bukharan News’, which was published in Bukhara
from September 1920, was still published bilingually in Turki and Persian (Šū”lai
inqilob 50, 20.9.1920, 8), but soon a new consciousness began to develop in the
course of that revolutionary enthusiasm, which is characteristic of so many overthrows of this kind. Under the influence of the Turks and Tatars who had rushed
to Bukhara, the Persian/Tajik language was pushed more and more into the background. With the establishment of the Republic of Bukhara in September 1920,
turkī-ūzbekī ‘Turki/Uzbek’ became the de facto administrative language of the
Bukhara council government. Even in eastern Bukhara, where the population
did not understand “Turki/Uzbek”, Turki was designated as the language of local
administration and the language of schooling (Muhiddinuf 1928: 16 and Xūǧojif
1930: 6).14
Abdulqodir Muhiddinuf offers a retrospective view of this change. He came
from the family of a wealthy Bukharan factory owner and had joined the Communist Party in Moscow in 1918 (Eisener 1991: 23). Muhiddinuf was a leading
member of the revolutionary movement in Bukhara and later held senior posts
in the state and party apparatus of Tajikistan. In 1928, he recalled – in the style
of an admission of guilt, which was not unusual for the late 1920s – the period of
the Bukharan Revolution:
The political ideology, which at the establishment of the Republic of Bukhara and also for
a long time afterwards had completely influenced the activity of the employees and leaders
of the Soviet government of Bukhara, was the ideology of Pan-Islamism and Pan-Turkism.
In the initial period of the establishment of Bukhara, three groups of people were at the
forefront:
1) Turkish prisoners from the World War;
2) Uzbeks from Farǧona and Tashkent, all of whom were of the same ideology and belief as
the jadidī of Bukhara;
3) We, the jadidī of Bukhara, who had received our first education and ideological development in the circle of the development of the ideology of pan-Islamism and pan-Turkism in
Central Asia and had been under the perfect influence of the ideology of Pan-Turkism and
Pan-Islamism for a long time.

14 Bartolʹd (1925: 110) does point out that Persian (in his usage of 1925: tadžikskij jazyk) remained the state language not only in the Emirate of Bukhara, “but also in the People’s Republic
of Bukhara,” but this formal status found no counterpart in linguistic practice.

The panturkists said:
Uzbeks, Kyrgyz, Kazakhs, Turkmen and the other nations which have relation to a Mongol
origin and all of which now form their own independent nation, are in fact parts of one
nation. The Tajik-speaking population of Bukhara is also of Turkic origin; under the influence of Iran’s literature and civilization, they have lost their language and their nationality.
We must make them turk again and create from all of them together one millati buzurgi turk
“great turk nation”.
(Muhiddinuf 1928: 16)

A description very similar in content and style to that of Muhiddinuf is provided
in an article by Š. Džabbarov published in the journal Za partiju in 1929, almost
at the same time (Masov 1991: 152–160). When evaluating such statements, the
following should be taken into account: The Bolsheviks were later hardly able to
control the spirit they had evoked with their demand that the oppressed peoples
of the East must first undergo national revolutions before a social revolution. In
order to banish forever any ideas that – like Turkistan nationalism – were not
compatible with the “national delimitation” of 1924, they were subsequently
elevated to the status of conspiratorial “pan-Islamism” or “pan-Turkism”. The
choice of words by Muhiddinuf and Džabbarov can therefore be attributed to the
political thinking and language usage of the late 1920s. This includes the use of
the word tojik, which could be used for identification to a much greater extent
after the “national delimitation” than before it. The meaning attached to it in
1929 cannot simply be applied to the situation in the early 1920s.
But even if we subtract all that springs from the spirit of the post-1924 period,
the retrospective view of Muhiddinuf and Džabbarov offers eloquent testimony to
that intellectual and linguistic change that had received further impetus after the
Bukharan Revolution and also gripped many individuals who had once used primarily the Persian language. In the Turkistani and even more so in the Bukharan
understanding of that time, a commitment to the Bolshevik ideology was inextricably linked to a commitment as Turk or – as one gradually began to say – as
ūzbek, which manifested itself most clearly through language.
In other words, the indifference of some “Persian-speaking brothers” lamented
by Aynī went hand in hand – at least for politically active individuals – with a commitment to Turkistan, which in the final analysis could also be a linguistic commitment. The language question, which had been formulated from the beginning
as an “either-or” question, had found an answer, and the language of Turkistan
nationalism was not Persian. Those who, to put it casually, did not want to miss the
train of time spoke and read Turki. Those who wanted to exist as communists professed to be ūzbek. Such a confession was by no means an attempt to join a “strong
nation” in an inconspicuous and clever way in order to gain personal advantages.
The “Uzbek nation” meant here did not exist at all until then, but was only created
during these years. For this new nation, an old name with a new meaning was
used, which virtually invited such confessions. Those who professed to be ūzbek
wanted to participate and take part in this creative process. In a society where
speech habits were not rigidly identified with a particular language name, the
esteem of a social ideal could encourage one to identify with a speech group that
was not that of one’s first language.
In a wall newspaper circulated in 1924 in Xujand this confession was
expressed by a person who would be called a “Tajik panturkist” a few years later:
“Therefore, I believe that we will be left behind the culture for years to come if
we take upon ourselves or make a claim to Tajikization [an insistence on Tajik
language and identity – L. R.]” (Novyj put’ 1924: 14, quoted after Masov 1991: 154).
It is not uncommon that people living in a bilingual milieu, where each language, in addition to its function as the first language for one or the other, also
dominates entire social spheres based on the division of labor, choose to use one
language or the other depending on the situation. Such behavior appears reprehensible only when language is instrumentalized ideologically. This means that
language is primarily no longer seen as a means of communication that is supposed to fulfill a certain function for the one who uses it and is therefore selected
according to the criterion of functionality. These functions may include using language as an embodiment of a cultural or political identity. The fact that the use of
Turki and the renunciation of Persian/Tajik did not always have to be imposed,
but could be the result of a voluntary decision – as demonstrated first by Fitrat
and, from 1920 at the latest, also by Muhiddinuf and some of his political comrades-in-arms – is not readily perceived.
Muhiddinuf’s memoirs show that language loyalties were not inherited by
nature, but were the content of confessions. In the successful case – here: for
Central Asian Turki – such confessions had a language-constituting effect. For
Persian/Tajik, they had the opposite effect. In the period from 1922 to 1924, the
view that the tojik were actually Turkic and had lost their original language under
Iranian influence was also propagated in the Turkistani newspapers Zarafšon
and Turkiston. This was accompanied by the demand that only Turki or, as it was
now called, o’zbek tili – ‘Uzbek’ should be considered the official and general colloquial language. In the People’s Republic of Bukhara, in the membership papers
of the party and youth organization, under the heading “nationality”, an entry as
“Persian speaker” or “Tajik” was not provided for. Thus, even in the remote parts
of eastern Bukhara, all party and Komsomol members were listed as “Uzbek” in
1924 (Šakuri 1997: 150–151). Finally, the use of the Persian language was banned
and fined in some of Bukhara’s Soviet administrations, including the People’s
Commissariat for Education headed by Fitrat. In the course of this development,
the journal Šū”lai inqilob, the only Persian-language press publication of the
time, was finally discontinued in December 1921.

2 Tajik emancipation
2.1 Prescribed nations
The situation changed with the territorial-administrative reorganization of Central Asia, which was carried out as a hasty and barely prepared campaign between
February and October 1924. Unlike many other problems discussed and decided
in 1924 in connection with “national delimitation” (lit. nacional’noe razmeževanie
‘national unmixing’), there was agreement in principle on the Tajik question from
the very beginning. The Central Committee of the Bukharan CP first discussed
the “delimitation of Soviet Central Asia into a number of republics according to
national characteristics” on February 25, 1924, and the Executive Committee of
the Bukharan Communist Party adopted corresponding theses on March 10, 1924.
At this time, it was already clear: “The Tajik people form an autonomous Tajik
region from Matča (sic), Karategin and Garm within the framework of Uzbekistan-Bukhara” (quoted after Masov 1991: 31). Thus, it was clear that Tajikistan was
to be given only an autonomous status within Uzbekistan and was to be limited
essentially to the remote mountainous areas of former Eastern Bukhara and some
mountainous areas on the upper reaches of the Zarafšon, the latter having until
then belonged to the Samarkand province of the Turkistan Autonomous Socialist
Soviet Republic (ASSR).
A Tajik sub-commission, which did not have voting rights, was appointed
only a few days before the meeting of August 21, 1924, at which the boundaries of
the Tajik Autonomous Region were on the agenda. A total of 1.24 million “Tajiks”
were assumed for Turkistan and Bukhara, defined as such essentially according
to linguistic criteria. As language designations, both the designation farsidskij
“Persian” and tadžikskij “Tajik” appear in the minutes of this meeting (the language of negotiation was Russian) with the latter dominating. The members of
the Tajik sub-commission, none of whom, incidentally, came from the areas
under consideration for the autonomous Tajik territory, admitted at the meeting
of August 21, 1924, that the population in the districts of Samarkand and Xujand
(Samarkandskij and Xudžandskij uezd) as well as in the city of Bukhara was
approximately 95 percent Persian/Tajik-speaking. At the same time, however,
they themselves took into account that these population groups were closely
linked to the Uzbeks in economic and administrative terms and that it would not
be possible to incorporate these Tajiks into the Tajik autonomous regions on the
territory of eastern Bukhara because of the spatial distance as well as the lack
of transportation routes. It had therefore been agreed with the Uzbek comrades
that the large number of Tajiks remaining in Uzbekistan would receive education
and culture in their mother tongue. In addition, the major Tajik cities such as
Bukhara and Samarkand would have to remain temporary cultural centers for
training officials for eastern Bukhara (Masov 1991: 42–43).
The repeated references to the economic and cultural similarities between
the Tajiks and Uzbeks living in Bukhara and Samarkand and to the backwardness of the Tajiks living in the mountains of eastern Bukhara, formally speaking,
were a practical realization of the directive formulated by Zelenskij (1924: 72–73),
according to which the question of national affiliation must occasionally take a
back seat to other aspects, including economic and cultural ones, in the “national
delimitation” process. Notwithstanding the appeal that a self-confession as a
“Tajik” gradually began to develop, the attitude expressed in these references
also correlates with the traditional primacy of regional and social identities over
linguistic ones. The self-image as a “Bukharan” or “Samarkandi” was ultimately
more decisive than the primarily linguistically oriented decision as to whether
one was an “Uzbek” or a “Tajik”. With the regionally based identity, it was ultimately also possible to accept the principle of regarding the cities as belonging to
the population by which they are surrounded. The precept that the economically
and culturally more developed city dwellers had to provide civilizational assistance to the backward inhabitants of the mountains of eastern Bukhara and that
Samarkand and Bukhara therefore had to belong to the prosperous Republic of
Uzbekistan, moreover, draws in a flattering way on old resentments against the
Tajiks of the mountains. Such resentments could also be expressed in a pejorative designation as ǧalča.15 The regionally determined identity as “Bukharans”
or “Samarkandis” also meant a social demarcation from the inhabitants of the
mountain regions, which manifested itself in the feeling of cultural and economic
superiority of the townspeople and was reflected in a disdainful view of the dialects spoken in the mountain regions.16 An additional political confirmation of
this resentment towards the Tajiks of the mountains at that time was the fact that
the armed anti-Soviet resistance of the so-called basmači, which had by and large
been put down in 1923, was still continuing in the mountainous areas of eastern
Bukhara at that time. In October 1924, the following administrative regions
belonged to the Tajik ASSR:

15 Lentz (1933: 9–15) provides an overview of the use of the term in the first third of the 20th
century.
16 The Russian scholar A. A. Semenov presents in his memoirs a vivid description of such
reservations, which had been firmly established since pre-Soviet times. For example, the student
body at the medressas of Bukhara, numbering several thousand, was divided into two factions
formed on the basis of regional origin. The students from the city and the adjacent administrative
areas in the plains of the Zarafšon Valley (mullobačahoi tumanī) were opposed to the students
from the mountainous areas known as mullobačahoi kūhistonī (Semenov 1960: 988–989).

– From the former People’s Republic of Bukhara: the province of Qurǧonteppa,
the eastern parts of the province of Sari Osjo, and the provinces of Dušanbe,
Kūlob, and Ǧarm;
– From the former Turkistan ASSR: the eastern part of the Samarkand district (upper reaches of the Zarafšon), the western Pamirs (Rušon, Šuǧnon,
Iškošim, Langar); the district of Rošorv on the upper reaches of the Bartang,
and the western parts of the Vaxon.

Dušanbe (from 1929 to 1961: Stalinabad) was declared the capital, but it had
suffered great damage during the battles with the armed opposition. In 1924
Dušanbe had only 42 houses and 242 inhabitants (Materialy 1926: 135).

2.2 The return of the Persian/Tajik language into the public sphere
In 1924, the Persian/Tajik language returned to the public sphere after a three-year
suppression by Turki hegemonic claims. The establishment of the Tajik Autonomous Soviet Socialist Republic provided the occasion and legitimation for this.
It also determined the character and status that this language would have in the
future: Persian/Tajik was initially tolerated and promoted only in its function as
the first language for the majority of the inhabitants of those remote mountainous
areas that, from then on, comprised the territory of the Tajik ASSR. This functional
limitation, in turn, shaped the status of Persian/Tajik in those cities that for centuries were considered centers of Persian/Tajik culture but were assigned to Uzbekistan rather than Tajikistan during the “national delimitation”. The inhabitants of
these cities were given the task of providing cultural and political assistance to the
mountainous regions of the Tajik ASSR, which were considered backward in cultural and economic terms. (Above all, it was a matter of first establishing a functioning Soviet administration in these opposition-controlled areas.) Knowledge of
the Persian/Tajik language, which was widespread in the cities of the plains, was
a welcome prerequisite for the fulfillment of this missionary task, but no more.
Bukhara itself was considered backward and its inhabitants, as the chairman of the Central Asia Bureau put it, were “more sinister and retrograde and
more liable than others to succumb to the provocations and fanatical agitations
of the mullahs and ešons” (Zelenskij 1924: 79). For this reason, Samarkand, which
already had been under Russian rule for many decades, was to have a special role
in providing assistance to the mountainous areas of the Tajik ASSR. However, the
first measures in this direction were initiated in Tashkent, the political center of
Turkistan.

As early as January 15, 1924, a special class for Tajiks was established at the
Communist Central Asian University, a kind of cadre school for party and state
functionaries in Tashkent, in order to train political cadres for the areas of the
later Tajik ASSR. The initiative for this had been taken by the Afghan émigré and
educational politician Nisor Muhammad. However, this group met with little
approval at the time, as potential students were still very suspicious of the idea
of a special class for Tajiks due to their experience with the dominance of Uzbek
and Russian in the administration of Turkistan (Ovozi tojik, 25.8.1924). At the
end of 1924, a teacher training institute was established in Tashkent specifically
for Tajiks, where two hundred young students were to be trained as elementary
school teachers in a five-year course of study. The first rector of this institute was
again Nisor Muhammad, who also appeared as a book author in 1924 and published a primer and a reading book for Tajik schoolchildren (Nisor Muhammad
1924 a, b).

2.3 The new language designation “Tajik” and its meaning
On August 25, 1924, when the territorial borders of the future Tajik ASSR were
already largely determined, the first issue of the socio-political and literary newspaper Ovozi tojik ‘Voice of the Tajik’ appeared in Samarkand. Abdulqayum Qurbī
was appointed as politically responsible editor, but the actual editing was in the
hands of Sayyid Rizo Alizoda. With Hojī Muin as an editorial board member and
Sadriddin Aynī as an author, other personalities who had already been involved
in Persian/Tajik journalism in earlier years collaborated on this newspaper.
The newspaper’s target audience is unambiguously stated in its title. Similarly, Sadriddin Aynī writes in the editorial of the first issue about the language of
the newspaper that this will be, as its title says, zaboni tojik “the Tajik language”
(Ovozi tojik 25.8.1924, 1). This was the first time that the name “Tajik” was used
instead of “Persian” in reference to the written language. This was more than a
change of label. Behind the new name was also a new content. However, in Aynī’s
usage at the time, the word tojik had a limited meaning. It hardly referred to the
Persian/Tajik speaking population of Bukhara or Samarkand. At that time, that
is, in August and September 1924, Aynī preferred to reserve the term tojik for the inhabitants of those mountainous areas to which the territory of the Tajik ASSR, then
still in the process of formation, was to be limited.
This is confirmed by his choice of words in an article entitled “On Tajik
Schools and Education”, which he had written in September 1924 for the second
issue of the newspaper Ovozi tojik (4.9.1924: 1). Here it is said:

Everyone knows that the Tajiks of Turkistan have lagged behind others in terms of science
and education. A question on which life and death of the Tajik people depends today is the
school and education question. [. . .]
I think that in Samarkand, which is close to the mountainous areas of the Tajiks and
is considered the center of culture, a course should be opened at the next opportunity. Students from Falǧar, Mastčoh [sic] and the rest of the mountainous areas, as well as from the
villages of the Tajiks, should come to this course; forty to fifty young people who can read
and write should be gathered and trained in four to five months. The intellectual potential
for such a course is available in Samarkand, because those Samarkand teachers who currently run the Teacher Training Institute and the educational courses in Samarkand know
Tajik well.
(Ovozi tojik 4.9.1924: 1)

In this article, the terms tojikon and samarqandī are de facto juxtaposed. In any
case, they are not congruent in Aynī’s use of language at the time. His statement
that the Tajiks were backward compared to the other peoples of Turkistan only
makes sense if the Persian/Tajik-speaking population of Samarkand remains
excluded from it, because they were given the task of helping to overcome this backwardness. The phrase that Samarkand is close to the mountainous areas of the
Tajiks also delimits the settlement areas of the population group to which Aynī
wishes to apply the term tojik. In the next sentence, he adds the villages of the
Tajiks and expands this space to include the rural settlement areas in the plains.
However, as far as the urban population of Samarkand is concerned, Aynī makes
the statement that the Samarkand teachers know Tajik well, thus deciding against
a formulation of the kind that the Samarkand teachers are themselves Tajiks. This
choice of words was by no means accidental. In an article that appeared only
a week later (Ovozi tojik 12.9.1924: 1) he referred to the Persian-speaking urban
population in the plains as forsiyon ‘Persian speakers’, thus also contrasting
them terminologically with the tojik of the mountain areas and villages. When
Aynī actually mentioned Tajiks in Samarkand and other cities of the plains, he
meant exclusively immigrants from the mountain regions who took on casual
work in the cities and hired themselves out as day laborers, transport workers or
guards.
In a paper on textbooks for Tajiks, Aynī also raised the question of which
dialect should be used as the basis for the written “Tajik” language. The language
of Samarkand, Ūroteppa and Xujand, Aynī said, is too mixed with Uzbek and
therefore unsuitable:
Therefore, the language of the Tajik mountain areas must be adopted in Tajik school textbooks. The language of the Tajik mountain areas is a simple forsi, free from Iranian inflections, not mixed with rarely heard Arabic words, and it corresponds to Persian grammar.
Yes, in pronunciation it has a certain turgidity in relation to the language of urban Persian
speakers. But it is pure and it conforms to the rules. Such a language is widespread from
Falǧar and Mastčoh [sic] to Qarotegin and Darvoz and is universally understood . . . From
these considerations, it can be said that the language of the textbooks of the Tajiks should
be a simple Persian language. In other words, the language of the majority of the Tajiks of
the mountainous areas.
(Ovozi tojik 12.9.1924: 1)

Aynī’s demand that the “Tajik language” be aligned with the idioms of the mountain regions was not only based on linguistic arguments. In the foreground was
the educational and developmental mission, which he, as a politically active
person of that time, also liked to propagate:
The destitute people of the Tajiks in the mountains are deprived of all the benefits of urban
life and are bound to difficult labor . . . These destitute people are seized with sufferings
from which they can find no relief; they live in torment and have no one to help them. The
Tajik language will be a translator especially to such destitute people.
(Ovozi tojik 12.9.1924: 1)

In calling for a social orientation of the written language, Aynī was in line with
the political guidelines of the Communist Party and the doctrine of “turning to
the village”. On October 19, 1924, a conference on the press in Central Asia was
held in Tashkent, attended by editors of all Central Asian newspapers and magazines. At this conference, the peasantry was named as the most important target
group of Soviet press organs. The editors were therefore called upon to write in
a simple and generally understandable language in order to attract this social
stratum, which until then had shown little interest in Soviet newspapers and
magazines. However, no measures were proposed on how to achieve this goal
(Ovozi tojik 30.10.1924: 1).
The language that returned to the public life of Central Asia in 1924 under the
name “Tajik” differed from the written language that had been called “Persian”
until then not only by its new name. In Aynī’s mind, the new, Tajik language was
to have two defining characteristics:
– In social terms, it should address the poorest classes and be distinguished from “Persian” by its simple, popular, and generally understandable character;
– In spatial terms, it should not rely on the dialects of the urban cultural
centers, that is, not on the dialects of Bukhara, Samarkand, Ūroteppa, or
Xujand, but on the idioms spread in the mountainous areas from Mastčoh and
Falǧar on the upper reaches of the Zarafšon as far as Qarotegin and Darvoz.
The preference of these idioms was justified by low Uzbek influences and the
rare use of complicated Arabic phrases. However, by pointing to the absence
of allegedly “Iranian” turns of phrase in these dialects, Aynī’s definition of
the new, Tajik language is also directed against the linguistic dominance of
Éronī in the written Persian language of the early post-revolutionary years.

Such a territorially and socially determined understanding of language appears
to be the consistent realization of Stalin’s linguistically fixed concept of nation,
according to which a nation is based primarily on the unity of language and territory. Aynī’s definition of the Tajik language, however, was more than political
conformism or the thoroughly credible attempt, as a representative of urban high
culture, to provide civilizational development aid for the inhabitants of the mountain regions, who were regarded as backward. Aynī’s understanding of language
also correlates with the traditional contrast between the Persian/Tajik idioms of
the plains and those of the mountain regions, which was already known from
pre-Soviet times. As is well known, this contrast also found expression in the
fact that one’s own idiom was called forsi in the cities, whereas the name tojik
was used by – predominantly non-settled – Turki speakers as a foreign designation. Only in most mountainous areas could the expression zaboni tojikī also be
used by Persian/Tajik speakers to designate their own idiom, but it then referred
mainly to spoken language.
The fact that Aynī, as a representative of the urban culture characterized by bilingualism, leaned on the Turki-specific use of the word tojik with the unmistakable
basic meaning “the linguistically and culturally different”, also found its linguistic
expression at that time: in the mid-1920s Aynī preferred the expression zaboni tojik
for “Tajik language”. Similarly, in the title of a book published in 1926, he uses the
phrase adabiyoti tojik “Tajik literature” (Aynī 1926, see Figure 2). According to the
linguistic conventions of Persian/Tajik, however, it should be either zaboni tojikī
“Tajik language” or adabiyoti tojikī “Tajik literature” (with the adjective tojikī as an
attribute) or zaboni tojikon “language of the Tajiks” or adabiyoti tojikon “literature of
the Tajiks” (with the noun tojik in the plural as an attribute). The expressions zaboni
tojik and adabiyoti tojik can be explained only if one takes into account Aynī’s active
bilingualism and understands these formulations as recreations of the morphosyntactic structure of corresponding Uzbek phrases, namely tojik tili and tojik adabijoti,
respectively. In Uzbek, tojik can be used equally as an adjective and as a noun. This
peculiarity was transferred to the homonymous Persian/Tajik word in the forms
zaboni tojik and adabiyoti tojik preferred by Aynī at the time.
Linguistic idiosyncrasies that appear to be recreations of Uzbek phrases or
grammatical structures are not uncommon in the Persian/Tajik idioms of Central
Asia. However, it is more than a linguistic problem if even in the introduction of
a new language name morphosyntactic features of an Uzbek model were copied.
It also gives us an indication of mental models on which this new name was
based. It must be taken into account that Aynī had already lived for several years
in Samarkand at that time, where the dominant influence of Uzbek was particularly strong. Nevertheless, Aynī saw himself far more than other politically active
contemporaries in the urban centers of Turkistan as an advocate for the people
of the Tajik Autonomous Region and, with the resources of a publicist, worked
primarily to disseminate factual information about these mountainous regions
among urban readers who were burdened with many prejudices.17

2.4 The awakening of Tajik nationalism
The creation of Tajikistan was not a consequence of Tajik nationalism, but its
birth. The establishment of Tajik autonomy had been under discussion for months
before its realization, but only the legal completion of “national delimitation” and
the actual creation of the Tajik ASSR, as well as the convening of independent
Tajik administrative bodies, could finally eliminate the mistrust that obviously
existed regarding the sincerity of this project.

2.4.1 First attempts to catch up on Uzbekistan’s national delimitation
The realization of Tajik autonomy made the commitment to a Tajik identity suddenly appear attractive even to those who previously had not wanted to profess
it in such a sense or had not dared to do so because of the sociopolitical climate
in the early 1920s. It was not until the establishment of the Tajik ASSR that the
determination to assert a separate linguistic identity in the face of Uzbek claims
to hegemony was reaffirmed, even in the cultural centers that had been characterized by Persian-Uzbek bilingualism and were to become part of Uzbekistan
after national delimitation. Many language activists in Samarkand and Tashkent
saw themselves strengthened and began to claim a “Tajik” identity for themselves and those segments of the population in Uzbekistan that used exclusively
or primarily the Persian/Tajik language.
The newspaper Ovozi tojik also unmistakably struck new notes in this regard
from the end of October 1924. Although none of its authors lived in the Tajik ASSR
at the time, hardly anyone refrained from expressing “a thousand thanks” to
the Soviet government for the establishment of this republic. Immediately after
the Tajik ASSR was proclaimed, Hojī Muin demanded that school instruction in
Tajik be provided in Samarkand and its environs as well (Ovozi tojik 30.10.1924: 1).

17 The newspaper Ovozi tojik published a two-part report entitled tojikoni kūhiston ‘The Tajiks of
the Mountainous Areas’, in which Aynī reported on their customs, economy, and social structure
(Ovozi tojik 21.9.1924: 3 and 5.10.1924: 1–2). A detailed description of the fate of emigrants from
Qarotegin who hired themselves out on the cotton plantations of Farǧona is given by Aynī in his
novel Odina (Aynī 1958b).

Analogous demands were made for all Tajik settlement areas in Uzbekistan by a
“Tajik commission” that had met at the Central Asian Communist University in
Tashkent after the establishment of the Tajik ASSR. Great hopes were placed in
the commissions for administrative reorganization (rajonirovanie) of the newly
established republics, which began their activities at the end of 1924 and also
conducted surveys of the national composition of the population in individual
areas. From the results of these surveys, the young Tajik nationalists hoped to
revise previous decisions on language policy in the territory of Uzbekistan.
However, the demands for Tajik-language schools could not even be met in the
Tajik ASSR at that time. The People’s Commissariat for Education, which was established in December 1924 and at that time had to administer only seven schools with
152 pupils and 26 teachers in the whole of Tajikistan, in the second year of autonomy admitted that school instruction in Tajikistan was still conducted in Uzbek. The
lack of Tajik-language teaching materials, instructional programs, and teachers was
pointed to as the main cause. Only a few primers and reading books existed. Moreover, there were hardly any accompanying didactic materials for teachers (Obidov
1967: 3–4). All other teaching materials that existed in 1926, including a mathematics book, a textbook on natural history, and political textbooks, were translations
from Russian and intended for higher grades. Sayyid Rizo Alizoda, who in those
years also produced the first grammar of the Tajik language (Alizoda 1344), had rendered outstanding services as a translator. However, this book was also not suitable
for teaching beginners. The remote location of Tajikistan and the poor transportation routes posed additional difficulties for the implementation of school reform.
These objective difficulties were compounded by efforts on the part of Uzbek
state and party officials to maintain the linguistic-political status quo and to prevent
a return of the Persian/Tajik language to public life in Uzbekistan by any means
necessary. On this occasion, the Tajik party functionary Širinšoh Šohtemur felt compelled to submit a report addressed to Stalin on June 25, 1926, in which he described
in detail the situation of the Tajiks in Uzbekistan and called for the convening of a
special commission of the Central Committee of the Communist Party (Bolsheviks).
Šohtemur pointed out, for example, that illegal methods were being used to try to
prevent the circulation of the newspaper Ovozi tojik in Uzbekistan, and that graduates of Tajik teacher training courses held in Samarkand in 1925 were subjected
to persecution and threats when they returned to their schools if they wanted to
conduct classes in Tajik. Particular political tensions arose in 1926 at a celebration
of the anniversary of the founding of Tajikistan, held for Tajik workers and students
in Tashkent but conducted in Uzbek from beginning to end. Participants demanded
that speakers should speak Tajik or Russian. Ultimately, the event ended up being
a political demonstration against Uzbekistan. As a result, the director of the Tajik
Teacher Training Institute in Tashkent was dismissed (cf. Masov 1991: 72–77).

Šohtemur’s objection did not go unheard. The Central Committee of the Communist Party of Uzbekistan had to address the situation of the Tajik minority in
1926. The Uzbek People’s Commissar for Education, Mū”min Xoja, advocated in
several Central Asian newspapers the establishment of Tajik-language elementary schools, literacy courses, clubs, and reading rooms in Bukhara and Samarkand, for which he received high praise from Tajik language activists such as
Zehnī (Ovozi tojik 28.9.1926: 2). In the 1926/1927 school year, Uzbek was replaced
as the language of instruction by Tajik in most of the schools in Samarkand’s
Old City. In 1927, the Tajik Teacher Training Institute in Tashkent was transferred
under the administration of the corresponding authorities in the Tajik ASSR.
But what did the Tajik-Uzbek “delimitation” mean for the Éronī of Samarkand? They were still considered Shiite immigrants from Merv in 1926 and used
a non-written Turkic idiom in addition to Persian. This constellation made it difficult for them to assume a Tajik identity, and they had few advocates for their
interests when education in the schools of the Samarkand district of Boǧi Šamol,
which they inhabited, was changed from Persian to Uzbek in 1926. In this situation, they were not helped by the fact that their most notable representative,
Alizoda, had already rendered great services to Persian/Tajik journalism in Central
Asia for many years. Additional confusion was caused by the demand raised by
some Éronī: If everyone was to be taught in his or her mother tongue, then the
Éronī would have to introduce Azeri Turkic, not Uzbek, in their schools (Ovozi
tojik 31.5.1926: 4). The move from “Persian” to “Tajik”, or, to put it another way, the
evolution from a multifunctional language to one reduced to its sole function as
the primary idiom of a nationally defined community, was to mean for the Éronī
a renunciation of Persian, which they had used for centuries in its function as a
written language as well as a language of literature, religion, and science.

2.4.2 Linguistic models of demarcation
The young Tajik nationalism was as language-fixated as Stalin’s concept of nationhood, which served as its intellectual model. The advocates of Tajik interests were
therefore confronted with the same problems in the multilingual milieu of Uzbekistan that the Bolsheviks had previously had to deal with in the “national delimitation of Central Asia”. Even according to “purely linguistic” criteria, it was bound to
be extremely difficult or even impossible in some areas of intensive Uzbek-Tajik
linguistic contact to make a clear separation into “Uzbeks” and “Tajiks”. Tajik
nationalists were unwilling to recognize the interpenetration of Uzbek and Tajik as
a given and to make this intermingling the basis of their understanding of nationality. Instead, they tried to support their claims with dubious statistical arguments.

These included, for example, the assertion that someone who used more than fifty
percent Tajik words in Uzbek could not actually speak Uzbek and therefore had a
right to newspapers and education in Tajik, as declared by Hojī Muin (Ovozi tojik
2.8.1926: 3).
On the other hand, Tajik language activists were confronted with the accusation that the spoken Tajik language of Uzbekistan had too many Uzbek features and in this respect deviated too much from “real” Tajik to be considered an
independent language distinct from Uzbek. The proponents of this theory could
even refer to Aynī, who had claimed that the dialects of Samarkand, Xujand and
Ūroteppa should not be made the basis of the written Tajik language, because
they were too much mixed with Uzbek.
The most energetic attempt to refute a Turkic influence on the spoken Tajik
language at the time was made by the young literary figure Tūraqul Narsiqulov,
alias Zehnī. With a clear demarcation from Uzbek influences and with his call to
reflect on what is supposedly good, true and genuine, Zehnī’s “Polemic on Language and Literature” marks a new stage in the development of a Tajik language
consciousness. His polemic was published in several parts in the newspaper
Ovozi tojik in 1926.18 Here we can see the beginnings of a Tajik ideology of authenticity. As is almost always the case with purist claims of this kind, language was
not seen primarily in terms of its function as a means of communication and,
accordingly, was not evaluated according to functional criteria. Rather, the focus
was on the linguistic-political intention of being able to clearly distinguish Tajik
from Uzbek by “purifying” it from Uzbek influences and to better assert one’s own
language interests against Uzbek claims to hegemony against the background of
greater diversity. From around 1921 onward, corresponding attempts had been
made to purify Uzbek of “Tajik” elements for similar reasons (Rahbari doniš
1928: 17–18; see also Baldauf 1993a: 379, 448, 594 and Baldauf 1993b: 663).
After the “national delimitation” of the peoples of Central Asia, attempts
to “segregate” the history of Central Asia equally according to national criteria
must have seemed logical and permissible. The linguistic fixity of the prevailing
concept of nation should allow history to be “unmixed” according to linguistic
characteristics as well and to be instrumentalized for one’s own interests. It is
therefore not surprising that Tajik nationalists also tried to legitimize their linguistic-political claims by referring to the former Persian linguistic hegemony and
the entire Persian-language literature from past times. Thus, Sadriddin Aynī, who
had decided in 1924 to devote himself only to literary activity, was commissioned

18 Three articles were published under the title Mubohisa dar borai zabon va adabiyot ‘Dispute
on language and literature’ in Ovozi tojik (25.5.1926: 2; 9.7.1926: 2 and 14.7.1926: 2).

by the government of Tajikistan in the spring of 1925 to compile an anthology of
examples from the history of what was called “Tajik” literature. Aynī immediately undertook this task and after a few months submitted a treatise totaling 40
printed sheets. He had summarized examples of 220 Persian-language poets and
literary figures from the early 10th century to the 1920s, prefacing each of these
examples with a brief historical introduction (Aynī 1958a: 110–111).
This anthology was published in Moscow in 1926 as a 626-page book under
the title Namunai adabiyoti tojik 300–1200 hijrī “Examples of Tajik Literature.
300 to 1200 h.” (Aynī 1926, see Figure 2).

Figure 2: Namunai adabiyoti tojik (Moscow 1926), First page.

The political disputes over this literary-historical work began before its publication. Originally, the book was to be published under the title “Examples of Thousand-Year Tajik Literature”.19 The summary reference to a literary tradition that
spans a thousand years was replaced in the printed edition by a factual statement
of the corresponding years, probably because it all too obviously formulated the
19 This title was announced in Ovozi tojik (6.9.1926: 1).

claim of wanting to legitimize current political demands from historical depth.
Against this background, it is also not surprising that this work, financed by the
Tajik government, was published by a Moscow publishing house, although the
Tajik government usually had its books, newspapers and magazines printed in
Uzbekistan at the time. Because of the political explosiveness that this seemingly
innocuous literary treatise possessed, a “neutral” publishing house in Moscow
seemed more reliable than an Uzbek printing house. The communist poet Abulqosim Ahmadzoda, alias Lohutī (1887–1957), who had immigrated from Persia, is
also said to have lobbied for the printing of this book. He lived in Moscow from
1923 and enjoyed a certain influence in political circles. In any case, Lohutī was
involved in the technical completion as a typesetter.
When Aynī, in the preface to this book, described the suppression of the
Persian/Tajik language in the preceding years and explained the motives that had
led him to compose this anthology, he cloaked all the accusations in decidedly
moderate tones and focused on the political ignorance of the Tajik speakers:
Because the Soviet power in Turkistan and Bukhara was established in Uzbek after the
October Revolution, all newspapers, novels, instructions, textbooks, and political works
were also in Uzbek. The Tajik people, possessing an ancient culture and a precious literature, were deprived of the new revolutionary literature, Soviet schools, internal and external transformations, political order, that is, on the whole, the fruits of the revolution. Since
the Tajik people could not properly use the Uzbek newspapers, they also did not know what
position the Soviet government and the Communist Party had on the national question,
to what extent they valued the languages and literatures of the nations, what value and
respect they attached to literature in general.
(Aynī 1926: 4)

Despite all attempts at appeasement, this book provoked heated discussions.
Criticism referred to the hitherto unresolved relationship of “Tajik” to that language which, in the totality of its temporal and spatial varieties, was commonly
associated with the designation “Persian”. It thus took up a contradiction in
the creation of which Aynī was not uninvolved. When Aynī introduced the term
zaboni tojik in 1924, describing it as a simple, easy-to-understand idiom that was
territorially oriented to the dialects of the mountainous areas and intended to
function as a “translator” for the poorest strata of the Tajik people, he reserved
the term “Tajik” for a language that was defined primarily by the criterion of
its social and territorial orientation. Thus, even according to Aynī’s definition,
tojik(ī) was not a full-fledged synonym to the designation forsī, with which the
Persian language was associated in its function as a supra-regional lingua franca
as well as a centuries-old high language of religion, science, literature, and commerce. In his “Examples of Tajik Literature”, Aynī had attempted to resolve this
contradiction by including only those poets and poetesses who came from the
Transoxianian region.

2.4.3 Changes in the Tajik-Uzbek relationship
Bolshevik nationalities policy in Central Asia in the 1920s resembled a complicated balancing act: the Bolsheviks had to recognize that the effective exercise of
power in these areas required the recruitment of leaders who were proficient in
the native languages and could give their rule a national flavor. Therefore, as early
as the Tenth Party Congress in March 1921, a resolution had been adopted calling
for the promotion of cultures in the non-Russian territories. The corresponding
political slogan of the 1920s was korenizacija “rooting”. It circumscribed the education and involvement of indigenous forces as well as the promotion of national
cultures. In this way, the Bolsheviks simultaneously promoted the development
of nationalist ideologies, which they in turn saw as a political danger and did not
want to tolerate unreservedly. The 1920s therefore saw constant changes in
the corresponding political guidelines.
Turkistan nationalism, on which the Bolsheviks liked to rely when they
wanted to establish their rule in Central Asia, was successfully crushed after the
stabilization of their power with the territorial reorganization of Central Asia in
1924. This created new nations, which in turn sought to legitimize themselves
through appropriate ideologies and assert themselves through nationally oriented policies. The Bolsheviks promoted the new Uzbek culture and thus also
Uzbek nationalism, because it promised a departure from the ideas of a Turkistan
nation. Emerging Tajik nationalism was encouraged, in part because it provided
an opportunity to put Uzbek nationalism in its place when it was seen as politically dangerous. But support for Tajik nationalism also ceased at the point when
it sought to legitimize itself by appealing to pre-Soviet culture.
This play on national ideas and sentiments exposed Central Asian literary
figures, publicists, and politicians to a constant roller coaster of emotions. Individuals such as Abdulqodir Muhiddinuf, who during the period of the Bukharan Revolution presented themselves as Turkistan patriots and spread the view
that the population groups referred to as Tajiks were actually of Turkic origin
and had only lost their language under Iranian influence, emerged after 1924
as ardent Tajik nationalists. Muhiddinuf now described his earlier stance as
wrong. He accused Uzbek politicians of continuing and exacerbating in Uzbekistan the error and treachery committed during the Bukharan Revolution. For
his change of heart, Muhiddinuf was publicly praised in the Tajik press. Other
Tajiks who had previously indicated their nationality as “Uzbek” were called
upon to follow Muhiddinuf’s example and apply to change the registration of
their nationality.
Abdurauf Fitrat, who at the time of the Bukharan Revolution pursued a language nationalism explicitly directed against Persian/Tajik, also changed his

linguistic-political attitudes in the mid-1920s. Possibly, he was one of the few
native intellectuals of his time who saw through the Bolshevik play on national
ideas and sentiments. In any case, he did not turn around and become a Tajik
nationalist. Instead, he recalled the tradition of bilingualism. In addition to
numerous works in Uzbek, he also wrote in Tajik from the second half of the 1920s
and was unmistakably committed to the development of the Tajik language and
culture. In 1927, his historical stage play in Tajik Šūriši Vose” ‘The Revolt of Vose”’
(Fitrat 1927) was published and performed in Xujand in 1933 and in Stalinabad.
In 1930 Fitrat wrote the first grammar of Tajik, which – unlike the grammars presented by S. Alizoda (1344, 1927) – was no longer based on the terms and models
of thought of Arabic morphology and syntax (Fitrat 1930). In 1927, Fitrat participated in teacher training courses in Samarkand conducted for teachers from
Tajikistan and taught Tajik language and literature there.
Fitrat’s commitment to a continuation of traditional bilingualism was nowhere
more evident than in discussions about the creation of a Latin alphabet for Tajik.
Fitrat put forward the only alphabet project that took into account the phonological alignment of the Tajik and Uzbek languages, and cited mainly historical
reasons, in addition to economic ones, for the uniformity of the Uzbek and Tajik
Latin alphabets that he called for (Rahbari doniš 4/5: 13–16 and 10: 8–10). The
question of whether and to what extent the Latin alphabets for Tajik and Uzbek
should be aligned became a central point of contention in the discussions about
the Latin alphabet. Tajik nationalists instrumentalized it for their emancipation
efforts against Uzbek claims to hegemony.
Meanwhile, Tajik politicians and language activists continued their efforts to
change the status of Uzbekistan’s Tajik-speaking population in political terms as
well. They received tailwind from a political campaign that no longer condemned
only the Turkistan nationalism of the early 1920s but also some later manifestations of Uzbek nationalism and elevated them to counterrevolutionary “pan-Turkist” or “pan-Islamic” movements. In party circles, it was even claimed that Uzbek
nationalists were striving for Uzbek independence and secession from the Soviet
Union (Rahbari doniš 5/6: 4). Articles of similar content also appeared in other
Central Asian and central press publications at the time (Masov 1991: 160–169).
Thus, the time seemed ripe to put on the agenda the question of a political
unification of the Tajik speakers of Uzbekistan with the Tajik ASSR. In mid-1928,
nineteen leading members of the party and state apparatus of the Tajik ASSR
drafted a letter to the Communist Party Politburo demanding a revision of the
borders between Uzbekistan and Tajikistan. According to their demands, the territories of Xujand, Surxandarjo, Samarkand, and Bukhara were to be annexed to
Tajikistan (Eisener 1991: 46–54). Although most of these territorial claims were
rejected, the Tajik nationalists scored another success: Henceforth, the fortunes

of Tajikistan would no longer be at the mercy of Uzbek politics. On October 5, 1929,
Tajikistan was declared an independent Soviet Socialist Republic. The territory of
Tajikistan was also expanded to include the Xujand area north of the Zarafšon
Mountains. Tajikistan was thus granted only those areas that had already had the
status of de facto autonomous Tajik territory in Uzbekistan as of February 1927.
This defined the boundaries of the area in which Tajik was henceforth to
develop with the status of a national language. The cities of Bukhara and Samarkand remained definitively outside these borders. With Xujand, Tajikistan had
been assigned a center that was primarily of economic importance and had
already been under Russian rule much longer than most areas of the former Tajik
ASSR. This largely predetermined the exposed position that individual representatives of this region would occupy in Tajikistan’s political life in the following decades. This also affected issues such as changing the writing system and
setting a standard for the modern Tajik literary language.

References
The Persian/Tajik sources on which this contribution is based were written in different writing
systems, some of which had distinctly different variants. Sometimes the author appears in a
different writing system than his contribution. For this reason, a unified transcription system is
used for Persian/Tajik sources that is based on the pronunciation standard valid today as fixed
in the Cyrillic writing system of Tajik.
Abdullaev, K. N. 1989. Oružiem pečatnogo slova (Gazety Sovetskogo Turkestana v
ideologičeskom obespečenii razgroma basmačestva v Sredney Azii) [With the weapon of
the printed word (Newspapers from Soviet Turkistan for the ideological protection of the
crushing of the Basmachi in Central Asia)]. Dušanbe: Doniš.
Alizoda, Sayyid Rizo. 1344. Sarfu nahwi tojikī (baroi maktabhoi ibtidoī wa miyonai tojikon tartib
dodaast) [Tajik morphology and syntax (compiled for primary and middle schools of the
Tajiks)]. Samarqand.
Alizoda. 1913. Moro islohi madoris wa makotib lozim ast [We need a reform of madrassas and
schools]. Oyina. 3. 15.11.1913. 74–76.
Alizoda[i Samarqandī], Sayyid Rizo. 1927. Sarfu nahwi tojikī (baroi sinfi 5–6 maktabhoi tojik
tartib šudaast) [Tajik morphology and syntax (compiled for the 5th and 6th classes of Tajik
schools)]. Samarqand-Dušanbe.
Allworth, Edward A. 1990. The Modern Uzbeks. From the Fourteenth Century to the Present. A
Cultural History. Stanford: Hoover Institution Press.
Andreev 1925. Po étnografii tadžikov. Nekotorye svedenija [On the ethnography of the Tajiks.
Some information]. In N. L. Korženovskij (ed.), Tadžikistan. Sbornik statej [Tajikistan.
Collective volume], 151–178. Taškent: Obščestvo po izučeniju Tadžikistana i iranskix
narodnostej za ego predelami.

Aynī, S. 1958a. Muxtasari tarjimai holi xudam [A summary of my biography]. Stalinobod:
Našriyoti davlatii Tojikiston.
Aynī, S. 1958b. Odina [Odina]. In S. Aynī, Kulliyot [Collected works]. J. 1, 183–327. Stalinobod:
Našriyoti davlatii Tojikiston.
Aynī, Sadriddin (jam”kunanda). 1926. Namunai adabiyoti tojik [Examples of Tajik literature].
Maskav: Našriyoti markazii xalqi ittihodi ǧamohiri šurawii susiyolistī.
Baldauf, Ingeborg. 1992. Mahmýd Xŭǧa Behbūdī and his journal Ojina (Samarkand, 1913–15):
Pragmatic pluralism versus ethnicist monism. In Sprachkontakt und Mehrsprachigkeit in
iranischen Kulturen (unpublished conference paper).
Baldauf, Ingeborg. 1993a. Schriftreform und Schriftwechsel bei den muslimischen Rußland- und Sowjettürken (1859–1937): Ein Symptom ideengeschichtlicher und kulturpolitischer
Entwicklungen. Budapest: Akadémiai Kiadó.
Baldauf, Ingeborg. 1993b. Tatarismus in Mittelasien: Das tatarische Vorbild in der Entwicklung
der uzbekischen Sprache. In Jens Peter Laut & Klaus Röhrborn (eds.), Sprach- und
Kulturkontakte der türkischen Völker. Materialien der zweiten Deutschen Turkologen-Konferenz Rauischholzhausen, 13.–16. Juli 1990, 13–50. Wiesbaden: Harrassowitz.
Baldauf, Ingeborg. 1991. Some Thoughts on the Making of the Uzbek Nation. Cahiers du Monde
russe et sovietique. Paris: Ecole des Hautes Etudes en Sciences Sociales. XXXII(1). 79–96.
Bartol’d, V. V. 1964. Ešče o slove “sart” [Something more about the word ‘Sart’]. In V. V. Bartold,
Sočineniya [Collections]. II/2, 310–314. Moskva: Nauka.
Bauer, Henning, Andreas Kappeler & Brigitte Roth (eds.). 1991. Die Nationalitäten des
Russischen Reiches in der Volkszählung von 1897. A: Quellenkritische Dokumentation und
Datenhandbuch, B: Ausgewählte Daten zur sozio-ethnischen Struktur des Russischen
Reiches – erste Auswertungen der Kölner NFR-Datenbank. (Quellen und Studien zur
Geschichte des östlichen Europa 32 A & B). Stuttgart: Steiner.
Doerfer, G. 1967. Türkische Lehnwörter im Tadschikischen. Wiesbaden: Franz Steiner.
Eisener, Reinhard. 1991. Auf den Spuren des tadschikischen Nationalismus. Aus Texten und
Dokumenten zur Tadschikischen SSR. (Ethnizität und Gesellschaft: Occasional Papers 30).
Berlin: Das Arabische Buch.
Figes, Orlando. 1998. Die Tragödie eines Volkes. Die Epoche der russischen Revolution 1891 bis
1924. Berlin: Berlin-Verlag.
Fitrat. 1927. Šūriši vose”. Čārparda. Yak fojiai ta”rixī az hayyoti tojikhoe ki zeri farmoni amiri
buxoro budand [The Revolt of Vose. A play in four acts. A historical catastrophe in the life
of some Tajiks who lived under the rule of the Emir of Bukhara]. Samarqand-Dušanbe.
Fitrat. 1930. Qoidahoi zabono tojik [Rules of the Tajik language]. Istalinobod/Toškand: Našrijoti
Davlatii Tojikiston.
Fragner, Bert G. 1989. Probleme der Nationswerdung der Usbeken und Tadschiken. In Andreas
Kappeler, Gerhard Simon & Georg Brunner (eds.), Die Muslime in der Sowjetunion und in
Jugoslawien. Identität, Politik, Widerstand, 19–34. (Nationalitäten- und Regionalprobleme
in Osteuropa 3). Köln: Markus-Verlag.
Fragner, Bert G. 1999. Die “Persophonie“: Regionalität, Identität und Sprachkontakt in der
Geschichte Asiens. (ANOR 5). Berlin: Das Arabische Buch.
Fragner, Bert G. 2021. Elements of Iranian identities: Historic dimensions of a contemporary
discourse. In: Redkollegiia (ed.). The Written and the Spoken in Central Asia – Mündlichkeit
und Schriftlichkeit in Zentralasien. Festschrift for Ingeborg Baldauf, 17–37. Potsdam:
edition tethys.

Gramenidskij, S. 1916. Položenie inorodčeskogo obrazovanija v Syr-dar’inskoj oblasti [The
situation of local education in Syr Darya oblast]. Taškent.
Hojī Muin[i Mehrī ibn Šukrulloh]. 1332. Guldastai adabiyot [Bouquet of literature]. Samarqand.
Komatsu, Hisao. 1989. The evolution of group identity among Bukharan intellectuals in
1911–1928: An overview. The Memoirs of the Toyo Bunko 47. 115–144.
Lentz, Wolfgang. 1933. Pamir-Dialekte I. Materialien zur Kenntnis der Schugni-Gruppe.
(Ergänzungshefte zur Zeitschrift für vergl. Sprachforschung auf dem Gebiete der
indogermanischen Sprachen 12). Göttingen: Vandenhoeck & Ruprecht.
Lorenz, Manfred. 1964. Die Tâǧiken. Zeitschrift für Phonetik, Sprachwissenschaft und
Kommunikationsforschung. 17(6). 571–579.
Masov, Raxim. 1991. Istorija topornogo razdelenija [The history of the division with an axe].
Dušanbe: Irfon.
Materialy. 1926. Materialy po rayonirovaniyu Sredney Azii. Kn. I: Territoriya i naselenie Buxary i
Xorezma. Čast’ 1: Buxara. [Materials for the administrative division of Middle Asia. Book 1:
Territory and population of Bukhara and Chorezm. Part 1: Bukhara] Taškent: Komissija po
rajonirovaniju Srednej Azii.
Muhiddinuf. 1928. Mardumi šahru atrofi buxoro tojikand yo ūzbak [Are the people in the city of
Bukhara and in its surroundings Tajiks or Uzbeks?]. Rahbari doniš 8/9. 15–18.
Nisor, Muhammad. 1924a. Čand hikoyatho ba zaboni tojikī baroi xondani bačagoni tojikon
[Some stories in Tajik for reading by Tajik children]. Toškand.
Nisor, Muhammad. 1924b. Alifboi zaboni tojikī baroi soli avvali makotibi ibtidoiyai Turkiston
[Tajik reading primer for the first classes of primary schools in Turkistan]. Toškand.
Obidov, I. 1967. Az ta”rixi taraqqiyoti maorifi xalq dar RSS Tojikiston [On the history of the
development of popular education in the SSR Tajikistan]. Maktabi sovetī 8. 3–6.
Pestovskij, B. A. 1927. Iz poétičeskogo tvorčestva buxarskix evreev [From the poetic work of
Bukharian Jews]. In Aleksandr E. Šmidt (ed.), V. V. Bartol’du – ego turkestanskie druz’ja,
učeniki i počitateli [For V. V. Bartold – his friends, students and admirers from Turkistan],
426–429. Taškent: Obščestvo dlja izučenija Tadžikistana i iranskix narodnostej za ego
predelami.
Sajidūf, A. 1931. Yaki digar zafari mo dar fronti revolyutsijai madanī [One more of our victories
on the front of the cultural revolution]. Hayoti mehnat 1. 1–2.
Schoeberlein-Engel, John Samuel. 1996. Identity in Central Asia: Construction and contention
in the conceptions of “Özbek“, “Tâjik“, “Muslim“, “Samarquandi“ and other groups. Ann
Arbor: UMI Dissertation Services.
Semenov, A. A. 1960. K prošlomu Buxary [On the past of Bukhara]. In Sadriddin Ayni,
Vospominanija. Perevod s tadžikskogo Anny Rozenfel’d [Memories. Translation from Tajik
by Anna Rosenfeld], 980–1015. Moskva and Leningrad: Izdatel’stvo AN SSSR.
Stalin, Josef W. 1950. Werke Bd. 2. Berlin: Dietz.
Šukuri, Muhammadjon (M. Šukurov). 1997. Xuroson ast in jo: Ma”naviyat, zabon va éhyoi millii
tojikon [Khorasan is here: Spirituality, language and national renaissance of the Tajiks].
Dušanbe: Oli Somon.
Suxareva, O. A. 1966. Buxara XIX – načalo XX v.: Pozdnefeodal’nyy gorod i ego naselenie.
[Bukhara in the 19th and at the beginning of 20th century: A late feudal city and its
population]. Moskva: Nauka.

Varejkis, I. & I. Zelenskij. 1924. Nacional’noe razmeževanie Sredney Azii [The national
delimitation of Middle Asia]. Taškent.
Xūjoyif, Ikrom. 1930. Čaro mehnatkašoni tojikzaboni buxoro ba nazar girifta namešavad? [Why
are the Tajik speaking workers of Bukhara not considered]. Rahbari doniš. 4/5. 6–7.
Zelenskij, I. 1924. Nacional’noe razmeževanie Srednej Azii [The national delimitation of Middle
Asia]. In I. Varejkis & I. Zelenskij, Nacional’noe razmeževanie Srednej Azii [The national
delimitation of Middle Asia], 69–89. Taškent.
Zenkovsky, Serge A. 1960. Pan-Turkism and Islam in Russia. Cambridge, Massachusetts:
Harvard University Press.

Shinji Ido

2 Standard Tajik phonology
Abstract: The present study offers an overview of standard Tajik phonology,
focusing mainly on its phonemes and their phonetic representations. Prosodic
units and intonation are largely ignored in this article, though some analyses of
interrogative intonation patterns are presented in Section 3.
This article also surveys previous studies on the phoneme inventory of standard Tajik. It aims to reconcile contradictory statements made in those studies,
thereby consolidating them into a coherent description of the standard Tajik
phoneme inventory. It will be demonstrated that the contradiction derives, in
part, from the fact that some major sound changes that have taken place in standard Tajik since its inception are not acknowledged in the Tajik linguistic literature. Accordingly, particular attention is devoted to the diachronic changes that
have taken place in the standard Tajik phoneme inventory, and in the phonetic
representations of some of the phonemes it comprises.
In describing the changes, this study relies not only on the existing literature
in Tajik phonology, most of which was produced during the Soviet period, but
also on a speech corpus of present-day standard Tajik. The speech corpus, which
the present author compiled in 2012, contains recorded speech produced by newsreaders and announcers working at Dushanbe-based television and radio stations.
This use of different data sources facilitates comparison between the standard
Tajik in the Soviet period as it is described in the literature, and that in post-civil
war Tajikistan, allowing us to identify some recent changes in standard Tajik.
This article is organized in four sections. The first section introduces the terminology adopted in this article, after which it describes the development of standard Tajik in relation to its phoneme inventory and the phonetic realization of the
phonemes it contains. The section also explains the relationship between standard
Tajik and the dialects that have affected it. The second section then provides an
overview of the standard Tajik phoneme inventory. It briefly explains the aforementioned speech corpus, after which it describes the standard Tajik phoneme
inventory. It reviews the phoneme inventory that has been widely circulated and
routinely replicated in grammars and textbooks. This is followed by a discussion of
issues, some of them contentious, related to the inventory. Section 2 also contrasts the
prescribed realization of some Tajik phonemes with the realization actually used in
standard spoken Tajik where the two differ. Section
3 touches upon the use of intonation in yes/no and wh- question-answer pairs
identified in the aforementioned Tajik speech corpus. The study concludes with
a summary of the insights gained from the overview of standard Tajik phonology.

Acknowledgements: This research was partially supported by a Grant-in-Aid for Young Scientists
(B) (22720162) from the Ministry of Education, Culture, Sports, Science and Technology in Japan.
I thank the informants for participating in this investigation. I also thank Corey Miller and Lutz
Rzehak for their comments on earlier versions of this article and gratefully acknowledge the assistance Zubaidullo Ubaidulloev provided during the preparation of this article.
https://doi.org/10.1515/9783110622799-002

Shinji Ido

1 Preliminaries
1.1 Tajik
Tajik is a south-western Iranian language that is spoken in Tajikistan as well as
in all of its neighbouring states except China.1 It is accorded the “state language”
status by the Tajik constitution and is designated as the primary language of politics, society, economy, scholarship, and education by the language law of Tajikistan
(“Qonuni” 2017). Tajik as a standardized and autonomous (Stewart 1968: 534–535)
language variety of Central Asia did not exist until 1924,2 before which it had been
identified by its speakers as a distinct variety of forsī ‘Persian’ and conceivably also
as a centre in a Persian pluricentric dialectal continuum that geographically spans
certain areas within present-day Afghanistan, Iran, Tajikistan, and Uzbekistan.3
1 Northern Tajik dialect speakers were among Central Asian émigrés to the Kunduz area in Afghanistan in the 1920s and 1930s (Shalinsky 1979: 3–4, 12). According to Arlund and Ibrukhim
(2013: 15), in China’s Xinjiang province, there are different groups of people speaking different
(non-Tajik) languages whose self-reference is (rather confusingly) /tɔdʒ͡ik/. This endonym is pronounced as [tɔdʒ͡ic] in the audio recordings accompanying Arlund and Ibrukhim (2013: 33, 62).
2 According to Khalid (2015: xvi), the term Tajik was used for the first time in 1924 in reference to
the south-western Iranian variety of Central Asia.
3 Its speakers considered their variety distinct from the Persian of Iran. This is manifest in the
fact that, “when Bukhārā-yi sharīf (Bukhara the Noble), the first newspaper to be published in
Bukhara, appeared, its Azerbaijani editor was criticized by many readers for using a language
they considered too Iranian” (Khalid 2015: 293). There were, as there are today, linguistic elements that were characteristically Bukharan. Sadr Ziyaʼ et al. (2004: 72–75) contains examples
of such elements, with which “[m]odernist writers did seek to write “in the idiom of Bukhara,”
as Fitrat (1886–1938) did in his early works. A Bukharan Jew (Simon Ḥakham [1843–1910]) also
identified his own variety as one that is uniquely Bukharan, calling it “the Persian dialect that is
current in the towns of Bukhara” (Ido 2016: 218). As such, the autonomy of Tajik from the Persian
of Iran was emphasized during the period of standardization (see, e.g., Zehnī 1928, 1929: 39–41;
“Qarori” 1930), with Zehnī (1928) putting Tajik and zaboni «forsī»-i imrūzai Éron ‘the present-day
“Persian” language of Iran’ in contrast and enumerating lexical, grammatical, and phonological
discrepancies between the two varieties.

2 Standard Tajik phonology

Differences between the numerous regional varieties of Tajik are large, with
those spoken in Uzbekistan in particular “becoming unintelligible to Tajik speakers outside of their own region” (Beeman 2010: 145).4 Tajik dialects are typically
classified into Northern, Southern, Central, and South-eastern groups (Rastorgueva 1964).5 These dialect groups are characterized primarily by their differing vowel systems. Northern Tajik dialects comprise the varieties spoken in such
major cities as Bukhara, Samarkand, and Khujand, and utilize a broadly similar
set of phonemes (Éšniyozov 1977: 64–65). On the other hand, Southern Tajik dialects, whose phoneme inventory differs from that of Northern Tajik dialects (Éšniyozov 1977: 69–71), include the dialect of Kūlob, one of the most populous cities
in Tajikistan (Figure 1).

1.2 Standard Tajik
The present article uses the term “standard Tajik” to refer to the variety of Tajik
whose form is accepted as “correct” by Tajik speakers and is subject to description in grammars and textbooks. It is hence also typically the variety that is learnt
by learners of Tajik and generally warrants high social acceptability in Tajikistan.
The referent of the term zaboni adabii (hozirai) tojik ‘lit. the (modern) Tajik literary language’, which has had wide currency in the Tajik linguistic literature
particularly during the Soviet period, is also identified as standard Tajik in this
article, because it would be described most aptly as the standard Tajik language
in Western linguistic terminology.
In this article, the term standard Tajik is also applied to the spoken variety
that generally conforms to the standard system of pronunciation codified in grammars, textbooks, and dictionaries. In modern times, this is typically the spoken
variety used by newsreaders and announcers, who, by the nature of their work,
gravitate towards socially acceptable pronunciation; recall the use of “BBC
English” as an alternative name for “Received Pronunciation” in English. I will
refer to the standard spoken variety as standard spoken Tajik wherever it needs
to be specifically contrasted with standard Tajik as a whole.

4 The unintelligibility is mutual. Thus, for example, the passive construction with the auxiliary
verb šudan ‘to become’ is unknown to the average Bukharan Tajik speaker (İdo 2002: § 1.1), as are
such basic lexical items as abr ‘cloud’, abrū ‘eyebrow’, and angušt ‘finger’. The commonly used
Bukharan Tajik counterparts of these items are /bulut/ (from Uzbek bulut ‘cloud’), /qoɕ/ (from
Uzbek qosh ‘eyebrow’), and /lili/ (etymology unknown), respectively.
5 The reader is referred to Melex (1960) and Éšniyozov (1977: particularly 62–64), for different
classifications of Tajik dialects.


Figure 1: Dialect map of the main area where Tajik is spoken. Pentagons, circles, squares, and
triangles respectively represent approximate locations where Northern, Southern, Central, and
South-eastern dialects are spoken. The classification and locations of dialects are reproduced
from Rastorgueva (1964: 182).

Needless to say, as is the case with the speakers of any language variety, Tajik
speakers’ perception of what phonetic features standard (spoken) Tajik should
have (i.e., what pronunciation norms standard Tajik should be subjected to)
changes over time, as does the standard system of pronunciation codified in grammars, textbooks, and dictionaries.6 Accordingly, this study will make reference
to changes and variation in the standard system of pronunciation and pronunciation norms, thus effectively describing diachronic changes and synchronic
variation in standard spoken Tajik. It is important to note that the “diachronic
changes and synchronic variation in standard Tajik” do not predate the early 20th
century because, as was explained in § 1.1, the variety that is herein called standard spoken Tajik came into existence in the early 20th century.

6 For instance, Henton (1983) reports changes in the phonetic realizations of RP vowel phonemes. See also Cruttenden (2014: xvii), who divides RP into General British and more traditional
Conspicuous General British (Cruttenden 2014: 80–81), as well as Ōnishi and Shibata (2000) who
discuss the loss of the velar nasal as a phoneme in standard Japanese.

1.3 Development of standard Tajik phonology
Standard Tajik has been modelled on, and orientated to, Northern Tajik dialects
(§ 1.1). In particular, its phoneme inventory and phonetic realization of phonemes
have been unmistakably in line with those of certain major Northern Tajik dialects. This section explains how this situation came about by describing the foundational roles the Northern dialects of Bukhara and Samarkand played at the
inception of standard Tajik, as well as by demonstrating how another Northern
dialect, namely the dialect of Khujand, may have kept standard Tajik orientated
towards Northern Tajik dialects during the Soviet period.
§ 1.3.1 explains the standardization of standard Tajik phonology and covers
the period down to about 1930 when the Tajik Soviet Socialist Republic (Tajik
SSR) was created as a successor to the Tajik Autonomous Soviet Socialist Republic within the Uzbek Soviet Socialist Republic. § 1.3.2 then explains facts related
to standard Tajik phonology in the period down to the 1990s when the Tajikistani
Civil War erupted. Finally, § 1.3.3 discusses the situation concerning standard
Tajik phonology in post-civil war Tajikistan.

1.3.1 Standardization in the early 20th century
The standardization of Tajik took place in the early 20th century. There is general
agreement among linguists that Northern dialects, in particular those of Bukhara
and Samarkand, served as the basis of standard Tajik (Melex 1968: 22; Comrie 1981:
164; Fajzov 1985: 3; Kerimova 1995: 118–119, 1997: 97). This rather unusual situation
where the basis of a state’s standard language is “extrastate” in origin undoubtedly came about due to the fact that, when efforts for the standardization of Tajik
commenced in the early 20th century, all but a few Tajik literati were residents of
Bukhara, Samarkand, or both.7 It is therefore hardly surprising that
the resolution adopted by the Scientific Conference of Uzbekistan Tajiks, held in
Samarkand in February 1930, emphasized the necessity of adopting the Bukhara
dialect as the phonetic and orthographical basis for the development of standard
Tajik (“Qarori” 1930; “Baroi” 1930). The fact that the resolution endorsed Bukharan Tajik as the basis of standard Tajik, not only in orthography (imlo), but also in
phonetics (savtiyot), shows that Bukhara, whose standing was in retreat vis-à-vis
Samarkand (the capital of the Uzbek Soviet Socialist Republic to 1930), still commanded prestige among Tajiks (see also “Dar kengoši” 1930), thanks to such early
20th-century literary doyens and political heavyweights as Ahmad Doniš, Abdurauf
Fitrat, Abdulvohid Munzim, Abduqodir Muhiddinov, and Sadriddin Aynī.8

7 They include, among others, Saidrizo Alizoda, Sadriddin Aynī, Bahriddin Azizī, Narzulloi Bektoš, Abdurauf Fitrat, Abduqodir Muhiddinov, Abdulvohid Munzim, and Tūraqul Zehnī.

It should be noted that, in the context of this resolution, “orthographical”
mainly meant “in terms of the phoneme inventory”, because orthographic transparency was a priority in the standardization of Tajik (e.g., Fitrat 1927; Lohutī 1928;
Odilzoda 1930), which in 1930 was first and foremost about establishing a new
(Latin-based) alphabet for Tajik.9 Assigning a specific letter to a specific vowel or
consonant was therefore more or less tantamount to admitting that sound as a
Tajik phoneme. As such, standard Tajik was to be equipped with Bukharan Tajik
phonemes and pronunciation, at any rate in the minds of the participants of the
Samarkand conference. Tajik intellectuals in general would also habitually refer
to the Northern dialects of Bukhara and Samarkand in relation to standard Tajik
phonemes and pronunciation norms (see Nabavī et al. 2007).
This habit of referring to Bukharan and Samarkandi Tajik in relation to standard Tajik came to a rather abrupt end at around the time when the nation-state
delimitation and subsequent administrative changes that took place in Central
Asia in the 1920s put Bukhara and Samarkand outside the borders of the Tajik
SSR, the predecessor of present-day Tajikistan (Abazov 2008: maps 37, 38). Indeed,
the First Tajik Linguists’ Conference,10 which was held in August 1930 in Stalinabad (today’s Dushanbe) in the then newly established Tajik SSR, and which purportedly had the final say on issues related to the standardization of Tajik, did
not mention Bukharan Tajik in relation to the development of standard Tajik in
its decisions. The resolution adopted at the Samarkand conference was of a recommendatory nature and hence did not dictate decisions to the Stalinabad conference (A”lozoda 1930: 5; Dehotī 1930; Asimova 1982: 64).11

8 Sadriddin Aynī, the first president of the Academy of Sciences of Tajik SSR, was from Soktare,
which is situated in the outskirts of Bukhara, and was educated in Bukhara. Additionally, the
vernacular of Bukhara was apparently deemed less “corrupt” (read “Uzbekified”) than that of
Samarkand (Zehnī 1929: 42; Buxorī 1930).
9 Before the attempts at standardization commenced in the 1920s, Tajik had been written almost
exclusively in the Perso-Arabic alphabet (Bukharan Jews would write Tajik in Hebrew script).
10 The conference held in August 1930 in Stalinabad has been variously referred to as Anjumani naxustini ilmii Tojikiston (“Dar anjumani” 1930) ‘the first scientific conference of Tajikistan’,
Anjumani ilmii zabonšinosii umumii Tojikiston (“Qaror” 1930) ‘the scientific conference of general
linguistics of Tajikistan’, Anjumani zabonšinosoni Tojikiston (Dehotī 1930) ‘the conference of Tajikistan linguists’, Anjumani yakumi ilmii zabonšinosoni Tojikiston (Rasulī 1931) ‘the first scientific conference of Tajikistan linguists’, Anjumani avvalini lingvistii tojik (Kalontarov 1974: 3) ‘the
first Tajik linguistic conference’, Anjumani avvalini zabonšinosoni tojik (Olimova 2007: 67) ‘the
first Tajik linguists’ conference’, Anjumani naxustini ilmii zabonšinosoni Tojikiston (Nabavī 2007:
27, 33) ‘the first scientific conference of Tajikistan linguists’, etc.

However, this ostensible indifference of the Stalinabad conference to Bukharan (and Samarkandi) Tajik belies the fact that the Latin-based Tajik alphabet officially established at the conference (Appendix 2) corresponds with the
phoneme inventory of Bukharan and Samarkandi Tajik. For example, the Latin-based Tajik alphabet of 1930 includes the letter ‹ū›12 for a vowel that is largely
unique to Northern Tajik dialects, of which, as explained above, those of Bukhara
and Samarkand were the most prominent (Uluǧzoda 1930; ‘Ayn. 1930: 18; Ismatī
1930; A”lozoda 1930: 6; Dehotī 1930; Toşpūlotuf et al. 1932: 16), while lacking any
letter representing the voiced and voiceless pharyngeal fricatives, which most
Northern Tajik dialects (unlike, e.g., most Southern dialects) lack as phonemes
(Rastorgueva 1964: 166). Since it was effectively through the establishment of
the Latin-based alphabet that the sounds to be contrasted in standard Tajik were
determined, the establishment of the alphabet resulted in standard Tajik adopting the phoneme inventory of the Northern dialects of Bukhara and Samarkand
as its own.
The provision of the phoneme inventory by the Northern dialects of Bukhara
and Samarkand to standard Tajik was probably inevitable because in the early
20th century, the dialect of Bukhara and perhaps to a somewhat lesser degree
that of Samarkand dominated the Tajik literati. In addition, the absence of Tajik
dialectology at the time of standardization meant that little information on the
phoneme inventories of Tajik dialects other than those of Bukhara and Samarkand was available to Tajik language planners. As such, it was neither practical
nor possible to standardize Tajik based on any dialect but those of Bukhara and
Samarkand, both of which had already been subject to phonological investigation in 1927 (Semënov 1927; Orfinskaja 1945).13

11 Incidentally, the omission of the reference to Bukhara in the Stalinabad conference may be
partly due to the fact that the Tajik participants of the conference who reported on, and probably also were involved with, the establishment of the new alphabet and orthography (Narzulloi
Bektoš and Rahim Hošim) were not from Bukhara but from Samarkand (Nabavī 2007: 27). See
also Rzehak (2001: 255–256) who ascribes some decisions made at the Stalinabad conference to
factional strife.
12 Following the International Phonetic Association (1999: 27), in this article, guillemets enclose orthographic letters and words.

That the dialects of Bukhara and Samarkand provided standard Tajik with
its phoneme inventory is important, because it entails that, at least initially, the
same dialects also provided standard Tajik with the phonetic realization of the
phonemes contained in that inventory (Rastorgueva 1955: 24).
To be sure, there are records indicating that some Tajiks unsuccessfully challenged this entailment. For instance, a note accompanying the decisions made
at the First Tajik Linguists’ Conference contains a passage which can be understood to mean that standard Tajik tolerates different readings for the letter ‹ū›
so that someone whose native dialect lacks the Northern Tajik vowel phoneme
represented by ‹ū› can read it differently, using a vowel that is native to his own
dialect.14 Had this practice been followed, for instance, the third person singular
pronoun, whose orthographical representation in the Latin-based Tajik alphabet
of 1930 is ‹ū›,15 would have been pronounced differently as [ʊ] by early 20th-century
Northern Tajik speakers (§ 2.2.1.1, Table 1), and as [uː] by Southern Tajik speakers, yet both would have been tolerated as standard pronunciation (perhaps in a
similar way that Australian English tolerates both /æ/ and /ɑː/ as possible readings of ‹a› in castle). However, despite the note, perhaps predictably, the Northern
Tajik realization of the phoneme was promoted as the standard reading of ‹ū›.
For instance, a textbook published in 1932 (Toşpūlotuf et al. 1932: 16) explicitly
prescribes that the reading of the letter ‹ū› be “intermediate between [Russian]
u and o”, a vowel largely unique to Northern Tajik dialects (§ 2.2.1).16 The textbook
adds that the vowel is found in the Northern Tajik dialects of Khujand, Bukhara,
and Samarkand, and that it is absent in the Southern dialects of Stalinabad (present-day Dushanbe), Kūlob, and Ǧarm. Thus, standard Tajik at its inception was
(standardized) spoken Bukharan and Samarkandi Tajik not only in terms of the
phoneme inventory but also in terms of the phonetic realization of phonemes.

13 In contrast, the use of morpho-syntactic and lexical elements which existed in the dialects
of Bukhara and Samarkand but which intellectuals identified as being too vernacular (or overly
Uzbekified) has tended to be suppressed in written standard Tajik. For example, case suffixes,
the circumposition to …-(y)a ‘until, to’ and the pre-nominal relative clause (see İdo 2002 for a
description), etc. that were certainly in wide use at any rate in Bukhara (see Aynī 1928; Azizī 1928;
Munzim 1928; Zehnī 1928; Xojaev 1929; Zehnī 1929), were either never used or used inhibitedly
in writings in the early 20th century before they fell into general disuse in subsequent years in
standard Tajik, and in serious publications.
14 This idea had been present prior to the conference in debates on Tajik orthography, e.g. in
Ismatī (1930), Baqozoda (1930), and ‘Ayn (1930: 19).
15 Note that the macron in this ‹ū› does not represent [ː] in the Latin-based Tajik alphabet of
1930 (see Table 1).
16 Some Central Tajik dialects, whose vowel system was not known to linguists in the early 20th
century, reportedly utilize a back vowel approximating to [ʊ] (Rastorgueva 1964: 15, 36).

Table 1: Representations of early 20th-century Tajik [u], [uː], [ʊ], [i], and [iː] in three different Tajik
alphabets. Note that the macron does not represent [ː] in the alphabets introduced in 1930 and
1940.17 In ‹ī› and ‹ӣ›, which occur only word-finally, the macron indicates that the /i/ that they
represent is not the izafet particle.
Early 20th-century Tajik phone(me)s | [u] | [uː] | [ʊ] | [i] | [iː]
1928 Latin-based alphabet | ‹u› | ‹ū› | ‹ů› | ‹i› | ‹ī›
1930 Latin-based alphabet | ‹u› | ‹u› | ‹ū› | ‹i›, ‹ī› | ‹i›, ‹ī›
1940 Cyrillic-based alphabet | ‹у› | ‹у› | ‹ӯ› | ‹и›, ‹ӣ› | ‹и›, ‹ӣ›
Present-day standard Tajik phonemes | /u/ | /u/ | /ɵ/ | /i/ | /i/

1.3.2 Standard Tajik phonology in the Tajik SSR
In 1930, the phoneme inventory of Bukharan and Samarkandi Tajik was codified in
the form of an official (Latin-based) alphabet (§ 1.3.1), but the phonetic realizations
of the phonemes in the inventory were not codified as formally. In theory, then,
the phonetic realizations of phonemes in standard Tajik could diverge from those
in Bukharan and Samarkandi Tajik. After all, the phonetic realizations of phonemes are volatile in a number of language varieties including standard ones; New
Zealand English, for example, changed the phonetic realizations of some vowel
phonemes over the last century and a half (Maclagan and Hay 2007) while
British English shifted the phonetic realization of /uː/ to [üː], [ʊu], or [ʉː].18
One may be tempted to speculate that departure from Bukharan and Samarkandi Tajik in phonetic realization of phonemes was more likely in the standard
Tajik of post-delimitation Tajikistan because it was customary for Tajik linguists
not to cite Bukharan and Samarkandi Tajik as reference points since they lay
outside the new nation state.
The general absence of Uzbekistan Tajik varieties in descriptions of standard
Tajik published after the nation-state delimitation is conspicuous. For instance,
Buzurgzoda (1940: 4) identifies the speech of progressive intelligentsia congregating in the capital and that of progressive workers of Tajikistan as standard
Tajik. Similarly, Fajzov (1983: 61) presents the speech of intellectuals residing
in Dushanbe as representing the legitimate (standard) pronunciation of Tajik.
Furthermore, a monograph titled Fonetika tadžikskogo jazyka ‘the phonetics of
the Tajik language’ (Sokolova 1949) utilizes data obtained not from Bukharan or
Samarkandi Tajik but from another Northern Tajik dialect, namely that of Varzob,
a settlement situated 25 kilometres north of Dushanbe (in combination with supplementary data from three other dialects). Thus, in post-delimitation Tajikistan,
references to the dialects of Bukhara and Samarkand have generally been avoided
in relation to standard Tajik.

17 Why ‹ū› was preferred to ‹ů› as the representation of /ʊ/ (§ 2.2.1.1) in 1930 is unclear, but given
that ease of handwriting was on the agenda in the new Tajik alphabet project in the late 1920s (“O
novom” 1928: 245), it is possible that ‹ů› fell into disfavour because of the perceived difficulty in
handwriting it. For instance, Uluǧzoda (1930) proposes that ‹o› with a stroke through it be used
instead of ‹ů› on the ground that the latter is cumbersome to handwrite.
18 The changes in British English cited here are described in Trudgill and Hannah (2008: 17–18),
Hughes et al. (2013: 51), and Cruttenden (2014: 132–133), among others.

It may seem surprising, then, that there is every sign that the phonetic realization of phonemes in standard Tajik mostly coincided with that in Bukharan and
Samarkandi Tajik throughout the whole Soviet period (see, however, § 2.3.4). For
example, Melex (1968: 5) points to the near-complete correspondence in fonetika
‘phonetics’ between standard Tajik and the Northern Tajik dialect of Ǧižduvon,
a town situated in the outskirts of Bukhara and in between Bukhara and Samarkand. As will be explained in § 2.2.1, there are even records indicating that, during
the Soviet period, the phonetic realization of phonemes in standard Tajik changed
in coincidence with the changes that took place in Northern Tajik dialects.
How, then, did this enduring correspondence come about? The endurance
of the correspondence could be ascribed, at least in part, to the general intent of
Soviet-era Tajik linguists to bring standard Tajik into accord with the speech of
the residents of the capital city, because, as will be explained below, in the Tajik
SSR, the Tajik elite in Dushanbe spoke the Northern Tajik dialects of Bukhara,
Samarkand, and Khujand.
In the Tajik SSR, a number of Tajik phoneticians seemingly attempted to
locate the basis of standard spoken Tajik within Tajikistan, e.g., in the speech of
intellectuals based in Dushanbe, as exemplified above. Unfortunately, no comprehensive phonological or phonetic descriptions exist of the speech of Dushanbe-based intellectuals, or, for that matter, Dushanbe residents, in the twentieth
century (Fajzov’s study is limited to measurements of vowel lengths).19 Despite
this, it does not seem too far-fetched to assume that the pronunciation of Tajik
intellectuals in Dushanbe was in agreement with, or at least not at odds with,
that of Northern Tajik dialect speakers from Bukhara, Samarkand, and Khujand,
because “the Tajik elite well into the 1950s had been drawn from Bukhara and
Samarkand” (Kalinovsky 2018: 228) and elites from Khujand famously monopolized privileged positions in politics, commerce, and state management in the
Tajik SSR (Niyazi 1998: 148–149, 169; Tunçer-Kılavuz 2009: 327; Nourzhanov and
Bleuer 2013: 96; Kasymov 2013: 7). Anecdotal evidence such as remarks made
by the linguist Saxidod Xorkašev (quoted in “Ševa” [2005]) and Dodikhudoeva
(2004: 282) also suggests that, prior to the Tajik civil war, Dushanbe intellectuals’
spoken Tajik was that of Northern Tajik dialect speakers from the Khujand area.
In other words, in the Tajik SSR, Bukharan and Samarkandi Tajik and Khujandi
Tajik, which share basically the same phonetic realization of phonemes,20 represented the speech of Dushanbe intellectuals.

19 Describing the speech of the residents of Dushanbe as a single monolithic entity may have
been unrealistic in the first half of the twentieth century, because the population of Dushanbe
consisted almost entirely of non-locals speaking different language varieties. Dushanbe, in 1924,
was only a small village with “forty-two houses and about 260 inhabitants” (“Stalinabad” 1954:
315). With its designation as the capital city of Tajikistan, the population of Dushanbe grew
to hundreds of thousands by the mid-20th century, when, in fact, Russians comprised
nearly half of the city’s population (Guboglo 1990: 27). In other words, for a good part of the
twentieth century, the residents of Dushanbe consisted of migrants from outside Dushanbe (including Russia). As a result, consensus among its residents on what constitutes acceptable pronunciation in Dushanbe probably did not form until the late 20th century.
20 To be sure, the phonetic realizations of phonemes in the three Northern Tajik dialects have
been in near, but not exact, coincidence. See footnotes for § 2.2.1.2.

Given this, one can fairly reasonably assume that the aforementioned correspondence between standard Tajik and Bukharan and Samarkandi Tajik in
the Soviet period was really a correspondence between standard Tajik and the
Northern dialects of Bukhara, Samarkand, and Khujand. Because it represented,
at different times during the Soviet period, the speech of the elite in Dushanbe,
the standard phonetic realization of Tajik phonemes remained in agreement with
that of the Northern dialects of the major cities throughout the Soviet period.

1.3.3 Standard Tajik phonology in post-civil war Tajikistan
Regionalism in Tajik political life appears to have impacted the standard Tajik
phoneme inventory and pronunciation during the Soviet period (§ 1.3.2). This
leads one to wonder if there are any particular regions that supplied elites whose
speech had an impact on standard spoken Tajik in post-Soviet and post-civil war
Tajikistan.
In post-civil war Tajikistan, the elite Northerners from the Khujand region
no longer have a monopoly over privilege, which is now shared with people from
other regions, among them the most notable being the Kūlob region. As such, in
post-civil war Tajikistan, the Northern dialect of Khujand is purportedly in retreat
vis-à-vis the Southern dialect of Kūlob; Wiegmann (2009: 51) writes about Southerners from the Kūlob area “introducing their dialect as the new ‘high Tajik’ used
on radio and television”.21
The question then arises as to whether the purported rise of the Southern
dialect of Kūlob to prominence is changing standard Tajik. While a degree of
uncertainty may be present on the part of intellectuals about what dialect standard (spoken) Tajik should emulate and to what degree, the acoustic analyses presented in § 2 show that the rise of the Southern Tajik dialect to prominence is
not (yet) much in evidence in standard spoken Tajik. In other words, the affinity
between standard spoken Tajik and the Northern dialects of Bukhara, Samarkand, and Khujand has so far remained largely intact.

1.3.4 Summary
The segmental phonology and orthoepy of standard Tajik were standardized in
the early twentieth century with those of the Northern Tajik dialects spoken in
Bukhara and Samarkand as their primary bases, with the result that those dialects provided standard Tajik with its phoneme inventory and the phonetic realization of the phonemes contained in that inventory. The affinity of Bukharan and
Samarkandi Tajik with standard Tajik persisted throughout the Soviet era, thanks
in large part to Khujandi Tajik, another Northern Tajik dialect, which apparently
represented Dushanbe intellectuals’ speech and exerted influence on standard
Tajik during the Soviet era. It remains to be seen whether the purported rise of
Kūlobi Tajik to prominence in post-civil war Tajikistan will affect post-civil war
standard Tajik phonology.

2 Phoneme inventory of standard Tajik
Tajik grammarians and linguists have often been in disagreement as to what set
of consonants and vowels adequately represents the phoneme inventory of standard Tajik. This is despite the standardization of Tajik that took place in the early
20th century, which effectively determined what sounds are (to be) contrasted in
standard Tajik (§ 1.3.1). In the case of the consonant phoneme inventory, the disagreement derives largely from different ideas the Tajik grammarians and linguists
have had about which consonants of non-native origin should be accorded the
status of phonemes in standard Tajik. In the case of the vowel phoneme inventory, their disagreement stems mainly from the differing views they have about
which vowels are contrasted in Tajik.

21 Other observations pertaining to the Kūlob dialect’s rise to prominence in Tajik politics and
the demography of Dushanbe include Dodikhudoeva (2004: 283), Saxidod Xorkašev (quoted in
“Ševa” 2005), Ido (2014: 88), and Grassi (2018: 218).

In the subsections to follow, § 2.1 explains the Tajik speech corpus used
throughout this section, after which § 2.2 and § 2.3 describe the phoneme inventory of standard Tajik, paying special attention to those consonants and vowels
whose phonemic status has been either ambiguous or contentious.

2.1 The audio data
The present section not only describes the phoneme inventory of standard Tajik,
but also analyses the present-day realizations of certain phonemes that the inventory comprises. Accordingly, audio data are used from a Tajik speech corpus compiled from read speech of 15 residents of Dushanbe recorded during fieldwork in
2012. The recordings were made in sessions where the informants were requested
to read aloud several word lists, sentences including question-answer pairs, and a
Tajik translation of the ‘North Wind and the Sun’ story. Audio was recorded onto a
Sony PCM-D50 recorder using a wired Audix HT5 head-worn omnidirectional microphone. Most of the recording sessions took place in either quiet rooms or recording
booths within the headquarters of television and radio broadcasting stations.
In the remainder of this article, the audio recordings in the corpus are
divided, based on the informants’ linguistic and professional backgrounds, into
those of “Newsreaders”, “Announcers” (distinguished below), and other residents of Dushanbe. The first 2 data sets comprise recordings of 4 newsreaders (1
female and 3 males) and 8 announcers (4 females and 4 males), who all work at
broadcasting stations based in Dushanbe and speak on air as part of their job. The
3rd data set comprises recordings from other Dushanbe residents (1 female and 2
males) whose primary work could not be confirmed to involve speaking on air.
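
The composition of the corpus can be tallied in a few lines. The following is only a minimal sketch: the group sizes are those stated above, while the dictionary layout and the function names are illustrative assumptions, not part of the corpus itself.

```python
from collections import Counter

# Group sizes as stated in § 2.1 (the dictionary layout is an assumption).
CORPUS = {
    "Newsreaders": {"female": 1, "male": 3},
    "Announcers": {"female": 4, "male": 4},
    "Other residents": {"female": 1, "male": 2},
}


def group_size(name):
    """Number of informants in one data set."""
    return sum(CORPUS[name].values())


def totals_by_sex(groups):
    """Add up female and male informants over the named data sets."""
    total = Counter()
    for name in groups:
        total.update(CORPUS[name])
    return dict(total)


total_informants = sum(group_size(g) for g in CORPUS)
broadcasters = totals_by_sex(["Newsreaders", "Announcers"])

print(total_informants)  # 15
print(broadcasters)      # {'female': 5, 'male': 7}
```

The two broadcaster sets together thus contain 5 female and 7 male informants, which are the group sizes given in the caption of Figure 7.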

2.1.1 “Newsreaders”
The first set comprises recordings that belong to 1 television newsreader and 3
news radio announcers. The 4 informants were 1) a female editor and announcer,
born 1971, working at the Tajik language department of Xovar, the government-run national news agency, which operates its own radio station, 2) a male
radio announcer, born 1971, working at Imrūz, a private news radio broadcasting

Shinji Ido

station, 3) a male television newsreader, born 1982, working at Safina, a state-run
television broadcasting station, and 4) a male radio editor and announcer, born
1986, working at the news report department of Xovar, at the time of fieldwork.
In the remainder of this article, these four informants will be referred to respectively as F71, M71, M82, and M86, and collectively as the “Newsreaders”. Both
F71 and M71 were born and raised in Dushanbe, while M82 and M86 were born
and brought up in Almosī in the Hisor district and Čūzī in the Šahrinav district,
respectively. They all have native proficiency in Tajik and stated that they speak
only Tajik with their parents.
During recording sessions with the Newsreaders, they were requested to
read the word lists and text aloud in the same way they would when they are on
air. This was to elicit their pronunciation in broadcast speech rather than their
speech in everyday conversation.
The recordings of these 4 informants are classified separately from those of
the other announcers (§ 2.1.2), because newsreaders can be expected to gravitate
more strongly towards pronunciation that warrants high social acceptability than
do other types of broadcasters such as reporters and presenters.22 Accordingly,
in this article, where appropriate, the recordings obtained from the Newsreaders will be identified as most closely representing present-day standard spoken
Tajik.

2.1.2 “Announcers”
The second set consists of recordings from 8 broadcasters (4 females and 4 males; reporters and presenters, among others) who, at the time of fieldwork,
were working at various Dushanbe-based radio and television broadcasting stations and stated that they had on-air experience at their respective Dushanbe-based
national broadcasting stations. Unlike that of the Newsreaders, these 8 informants’ work
could not be confirmed to include a large component of news script reading.
They will be referred to collectively as the “Announcers” hereafter. All the Announcers, like the Newsreaders, said that they have Tajik as their first language and
speak only Tajik with their parents.

22 This is probably the case also with newsreaders in a number of other countries. For instance,
Schiffman (1998: 368) notes that “much wider tolerance is now permitted in the pronunciation of
standard Englishes [. . .] although there seems to be less tolerance in news broadcasting”.

2.1.3 Other residents of Dushanbe
The third set contains recordings from three supplementary informants consisting of two academics, one female and one male, and one male journalist, all
residing in Dushanbe. The male academic stated that he speaks Tajik with his
living parent and would also speak it with his now deceased parent, while the
female academic stated that she speaks both Tajik and Uzbek with her parents.
The male journalist said that he uses Shughni in addition to Tajik with his mother
and only Tajik with his father. Herein, audio data obtained from the two academics and one journalist are used only for Figures 9B and 10.

2.2 Vowels
The standard Tajik vowel system is fairly unremarkable in terms of size and the distribution of vowels in the (acoustic) vowel space. It has one of the more common
configurations of vowel phonemes among many languages of the world (see Becker-Kristal 2010: 191). From an articulatory point of view, its utilization of a central
rounded vowel may be less usual among many languages of the world, though
another language variety in Central Asia, namely Uzbekistan Arabic, has been
reported to utilize the same central rounded vowel (Zimmerman 2008: 614–615),
as have some Eastern Iranian languages spoken in Tajikistan (Novák 2013: Ch. 1).23
Synchronically and prescriptively, the vowel system of standard Tajik contains the six vowels schematically arranged in Figure 2, which is reproduced from
Kerimova (1997: 97). Figure 2, in which vowels are represented by symbols based
on Cyrillic letters, represents the vowel system that has been prescribed in grammars and generally accepted as standard since the late 20th century (e.g. Fajzov
1985: 26; Xaskašev 1985: 26–27, 1983: 64; Efimov et al. 1982: 21).

Figure 2: Standard Tajik vowels. Figure reproduced from Kerimova (1997: 97).

23 Judging from Efimov’s (1965: 12) description, Hazaragi (spoken in Afghanistan) may also
have [ʊ] or a central rounded vowel as a phoneme.

The vowels in Figure 2 represented by Cyrillic “и”, “e”, and “а” are unrounded
while those represented by Cyrillic “о”, “у”, and “ẙ” are classified as rounded
vowels (Xaskašev 1983: 71–75, 1985: 27).24 Accordingly, the symbols “и”, “e”, “а”,
“у”, “ẙ”, and “о” in Figure 2 can be interpreted as representing vowels approximating to [i], [e], [a], [u], [ө], and [o], respectively. The vowel system of standard
Tajik can hence be represented as one consisting of /i/, /e/, /a/, /u/, /ө/, and /o/
(Figure 3).

Figure 3: Standard Tajik vowels with example words. The bracketed vowels are explained
in §§ 2.2.1, 2.2.3.

Yet, although synchronically true, the statement that the standard Tajik vowel
system consists of /i/, /e/, /a/, /u/, /ө/, and /o/ belies the heated debate that took
place among Tajik intellectuals in the early 20th century about what vowels should
be contrasted in standard Tajik. The bracketed close vowels in Figure 3, namely
[iː] and [uː], represent the vowels whose phonemic status was once advocated by
a number of intellectuals and linguists (§ 2.2.3). Indeed, as will be explained in
the following subsections, standard Tajik initially (i.e., at its inception in the early
20th century) had a different vowel system from the one shown in Figure 2. The mid
rounded vowels in square brackets (Figure 3), namely [ʊ] and [ɔ], respectively represent the approximate positions within the standard Tajik vowel system that /ɵ/
and /o/ once occupied, and, in the case of the former, may still marginally occupy
(§ 2.2.1).
There will be an explanation of the mid rounded vowels and long close
vowels in § 2.2.1 and § 2.2.3. In addition, a brief discussion on pharyngealization
affecting formant frequencies in standard Tajik will be given in § 2.2.2.

24 Tajik grammarians differ in how many degrees of frontedness they identify in open vowels.
As a result, /a/, which is simply open or open front (as distinct from back) for some grammarians
(e.g. Kerimova, Sokolova, Rastorgueva, Efimov) is open central (as distinct from both front and
back) for others (e.g. Xaskašev and Fajzov). Here the phoneme is simply regarded as /a/, i.e. an
open unrounded vowel that is not (fully) back.

2.2.1 Shifting of mid rounded vowels
The mid rounded vowels in present-day standard Tajik can be aptly represented
by /ɵ/ and /o/ (§ 2.2.1.3). However, in the early 20th century, they were /ʊ/ and /ɔ/,
respectively. As will be explained later, the shifting of /ʊ/ to /ɵ/ and that of /ɔ/
to /o/ can be interpreted as constituting part of the Northern Tajik Chain Shift,
which dislocated Early New Persian ō and ā to present-day standard Tajik /ɵ/ and
/o/, respectively.
In this subsection, the chain shift consisting of /ʊ/→/ɵ/ and /ɔ/→/o/ in
standard Tajik is explained in detail, after which the audio data of present-day
standard Tajik is analysed for the present-day phonetic realization of the vowels
in question.
2.2.1.1 /ʊ/→/ɵ/ and /ɔ/→/o/ in standard Tajik
Descriptions of and references to the mid rounded vowels found in early writings
about Tajik differ from those found in later works. For example, in 1927, Fitrat, a
prominent Bukharan intellectual who wanted the Tajik Latin-based alphabet to
be as similar as possible to its Uzbek counterpart (Fitrat 1928b: 9), proposed that
‹o› and ‹a› be used to represent the Tajik phonemes that correspond to present-day standard Tajik /ɵ/ and /o/, respectively (Fitrat 1927).25 This proposal suggests
that present-day standard Tajik /ɵ/ and /o/ were respectively more back (perhaps
like [ʊ̠]) and open (presumably like [ɔ]) in the Tajik of the late 1920s, because ‹o›
and ‹a› in the Uzbek Latin-based alphabet in the making in the late 1920s (Jamolxonov and Sapaev 2007: 31–35) represented Uzbek vowels that were intermediate
between Russian /u/ and /o/ (Gromatovič 1930: 3)26 and intermediate between
Russian /o/ and /a/ (Gromatovič 1930: 2),27 respectively.

25 See also Polivanůf (1934: 21) where he writes that ‹o›, ‹ů›, and ‹a› used in writing zaвoni
jahudihoi mahali ‘the language of local (i.e. Bukharan) Jews’ (Ido 2017: 85–88) correspond with
Uzbek ‹a›, ‹o›, and ‹ə›, respectively.
26 This description (i.e. “intermediate between Russian /u/ and /o/”) is in agreement with
Fitrat’s characterization of Tajiks’ pronunciation of the vowel in ‹زور›. Fitrat characterized the
vowel in ‹زور›, which is zūr /zɵɾ/ ‘strength’ in present-day standard Tajik, as u (damma) that is
close to Russian /o/ (Fitrat 1928a: 14).
27 According to Fitrat, Tajik shared with Uzbek the same vowel, namely the vowel that was
intermediate between Russian /o/ and /a/, hence his choice of ‹a› for the Tajik vowel phoneme.

Similarly, a description of phonetic characteristics of Tajik based mainly on
Samarkandi Tajik data collected in 1927 identifies the same phonemes as close
“o” and open “o” (Orfinskaja 1945: 88); in other words, it describes the phonetic
realizations of the phonemes in question as vowels approximating to [ʊ]/[o̝] and
[ɔ], respectively. Orfinskaja (1945: 89–90) also notes a diphthongal and fronted
realization of the former, and transcribes it as “ᵘo˫”, which might represent something like [u̯o̟] or [ʊ̯o̟] in today’s IPA.28 Orfinskaja’s description, then, is an indication that the phonemes that are /ɵ/ and /o/ in present-day standard Tajik were, respectively, more retracted and lower
than they are today and hence were [ʊ]-like and [ɔ]-like, respectively, in the Tajik
of the late 1920s.
Furthermore, in the late 1920s, Russian /o/ would often be represented in the
Tajik Latin-based alphabet of the time by ‹ů› (Appendix 1), the letter for the Tajik
phoneme that corresponds with present-day Tajik /ɵ/ (Odilzoda 1930; Diyokuv
1930). This suggests that the phoneme which is /ɵ/ today had a
back realization in early 20th-century Tajik.29 Indeed, “O novom” (1928: 244)
explains that ‹ů› represents a (Tajik) vowel that is intermediate between (Russian)
“o” and “u”.
These records indicate that, in early 20th-century Tajik, the phonetic realization of the vowel phoneme that corresponds with present-day Tajik /ɵ/ approximated to [o] or [ʊ], while that which corresponds with present-day Tajik /o/
was relatively more open like [ɔ]. Accordingly, selecting somewhat arbitrarily a
symbol for each phoneme, the early 20th-century (Northern) Tajik vowel system
may be represented as something like /i e a u ʊ ɔ/.
This early 20th-century Tajik vowel system was adopted into standard Tajik
apparently “as is”; as a result, the early 20th-century standard Tajik vowel system
consisted of /i e a u ʊ ɔ/. That the mid rounded vowels in the standard Tajik vowel
system were initially not /ɵ/ and /o/ but /ʊ/ and /ɔ/ is evident from descriptions
of standard Tajik vowels in textbooks published in the first half of the 20th century.
For example, a textbook published in 1932 characterizes the vowel represented by ‹ū›, the letter in the 1930 Latin-based alphabet (Appendix 2) for the early
20th-century standard Tajik phoneme that corresponds with present-day standard
Tajik /ɵ/ (Table 1), as very narrow and perceptually intermediate between “u” and
“o” (Toşpūlotuf et al. 1932: 16), perhaps like [ʊ], [ʊ̠] or [o̝]. Perhaps most tellingly, in
Fonetikaji zaвoni adaвiji toçik ‘the phonetics of literary/standard Tajik’ published in
1940, Buzurgzoda (1940: 40–42) describes the two mid rounded vowel phonemes
as back vowels, representing them by “ʊ∥ọ”30 and “ɔ” (Figure 5A; [ọ] in the IPA of
1938 represents a very close [o]).31 This strongly suggests that early 20th-century
standard Tajik had /ʊ/ and /ɔ/ where present-day standard Tajik has /ɵ/ and /o/.

28 Note that the IPA in 1945 did not have [◌̟] or [ʊ] as a symbol.
29 Phonetic realizations of Russian /o/ include [ᶷo], [ᶷɔᶺ], or [ɑ̟], according to Yanushevskaya
and Bunčić (2015: 225).
30 Buzurgzoda (1940) uses the symbol ∥ here presumably as a notation for logical disjunction
(i.e. “or”).

Thus, the standard Tajik vowel system evidently underwent a shift consisting of /ʊ/→/ɵ/ and /ɔ/→/o/, with the result that it now is a system consisting not
of /i e a u ʊ ɔ/ but of /i e a u ө o/ (Figures 2 and 3). In other words, the standard
Tajik mid rounded vowels underwent a chain shift where early 20th-century Tajik
/ʊ/ was fronted and early 20th-century Tajik /ɔ/ was raised.32 The chain shift is
schematized in Figure 4. Figure 4 Inset A incorporates transcriptions33 of the late-1920s phonetic realizations of the phonemes in question as they appear in the
works cited above. [ʊ̜] is incorporated in Figure 4 Inset B because it seems to be
acceptable, at least marginally, as a standard phonetic realization of /ɵ/ in present-day standard spoken Tajik (see § 2.2.1.3).

Figure 4: Chain shift that took place in standard Tajik after its inception in the early 20th century.
A early 20th-century standard Tajik mid rounded vowels; B present-day standard Tajik mid
rounded vowels.
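
The Standard Tajik Chain Shift described above can be sketched as a mapping applied to the early 20th-century inventory. This is only an illustrative model: the phoneme symbols are those used in the text, whereas the variable and function names are assumptions made for the example.

```python
# Early 20th-century standard Tajik vowel phonemes (§ 2.2.1.1).
EARLY_SYSTEM = ["i", "e", "a", "u", "ʊ", "ɔ"]

# Standard Tajik Chain Shift: /ʊ/ is fronted to /ɵ/ and /ɔ/ is raised to /o/.
STANDARD_TAJIK_CHAIN_SHIFT = {"ʊ": "ɵ", "ɔ": "o"}


def apply_shift(system, shift):
    """Replace each shifted phoneme and leave all other phonemes unchanged."""
    return [shift.get(vowel, vowel) for vowel in system]


present_day = apply_shift(EARLY_SYSTEM, STANDARD_TAJIK_CHAIN_SHIFT)
print(" ".join(present_day))  # i e a u ɵ o
```

The number of contrasts stays at six: the shift relocates two phonemes within the vowel space without merging them with any other phoneme.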

The chain shift that took place in standard Tajik, which might be referred to as
the Standard Tajik Chain Shift, is also visible in Figure 5, where different vowel
charts depicting the standard Tajik vowel system are aligned in the chronological
order of publication.
A comparison of the charts in Figure 5 reveals the fronting of the phoneme
corresponding with present-day standard Tajik /ɵ/ (represented variously by
“ʊ∥ọ”, “ů”, and “ẙ”) as well as the raising of early 20th-century /ɔ/ to the position of /o/ (represented by “o” and “ô”). Note that Figure 5 Inset D represents the
current standard Tajik vowel system shown in Figure 2, meaning that the chain
shift had reached its culmination by the late 20th century.

Figure 5: Vowel triangles/trapezoids published in A Buzurgzoda (1940: 42), B Sokolova
(1949: 19), C Rastorgueva (1955: 25), and D Efimov et al. (1982: 21).

31 Buzurgzoda (1940: 40) does write that the tongue is raised towards kom ‘palate’ in the phonetic realizations of /ʊ/, which could suggest fronting of the tongue, but judging from his description of [u] (Buzurgzoda 1940: 39), during the articulation of which, according to him, the
tongue is raised in the direction of kom, what he calls kom probably comprises not only the palate
but also the uvula (and possibly also the pharynx).
32 That the chain shift consisting of /ʊ/→/ɵ/ and /ɔ/→/o/ took place only recently, i.e., in the
20th century, is consistent with the fact that the phonemes under discussion are transcribed as
/uɔ/ and /ɑ/~/uɑ/, respectively, in huihuiguan zazi, a Timurid Persian (a 15th-century Samarkandi
variety of New Persian)-Chinese glossary compiled in Ming China (Ido 2015).
33 Transcriptions are by the author.

Thus, the end product of the Standard Tajik Chain Shift appears to have been
generally accepted by linguists and Tajik speakers alike by the late 20th century;34
hence the presence in various late 20th-century and present-day grammars and
textbooks (e.g., Efimov et al. 1982: 21; Xaskašev 1983: 64, 1985: 26–27; Fajzov 1985:
26; Kerimova 1997: 97) of the vowel system that Figure 2 typifies.
2.2.1.2 The Northern Tajik Chain Shift
The Standard Tajik Chain Shift can in fact be identified as constituting the latter
half of the Northern Tajik Chain Shift (Figure 6). The Northern Tajik Chain Shift
is a chain shift in which Early New Persian ō was fronted to the position of present-day Tajik /ɵ/ perhaps through /ʊ/, while Early New Persian ā was raised to
the position of present-day Tajik /o/ through /ɔ/. As its name suggests, it is a
chain shift that took place in Northern Tajik dialects.35
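
The two stages of the Northern Tajik Chain Shift can be sketched as the composition of two mappings. This is a simplified model under stated assumptions: the intermediate stage /ʊ/ for Early New Persian ō is treated as given, although the text marks it as tentative ("perhaps through /ʊ/"), and the names used in the code are illustrative.

```python
# Stage 1: Early New Persian -> early 20th-century (Northern) Tajik.
# The intermediate value for ō is tentative in the text.
STAGE_1 = {"ō": "ʊ", "ā": "ɔ"}

# Stage 2: early 20th-century Tajik -> present-day Tajik. This is the
# stage that the text identifies with the Standard Tajik Chain Shift.
STAGE_2 = {"ʊ": "ɵ", "ɔ": "o"}


def compose(first, second):
    """Chain two sound-change mappings: apply `first`, then `second`."""
    return {source: second.get(mid, mid) for source, mid in first.items()}


NORTHERN_TAJIK_CHAIN_SHIFT = compose(STAGE_1, STAGE_2)
print(NORTHERN_TAJIK_CHAIN_SHIFT)  # {'ō': 'ɵ', 'ā': 'o'}
```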

34 Arzumanov and Džalalov’s (1969: 116–117) description of the standard Tajik vowel system
published in 1969 agrees with the vowel system shown in Figure 2.
35 Although the Northern Tajik Chain Shift certainly took place in Bukhara and Samarkand,
as it evidently did also in Panjakent, Iskandar, and Khujand (Orfinskaja 1945: 90), the exact
geographical, societal, and temporal extent of its spread is unknown. Apparently, the shifting
of vowel phonemes in the chain shift has not been constant or synchronous even among major
Northern Tajik dialects. For example, Khujandi Tajik reportedly spearheaded the fronting of /o/
to /ө/ among Northern Tajik dialects (Orfinskaja 1945: 90), while the fronting of early 20th-century Tajik /ʊ/ may still be in progress among certain groups of Northern Tajik dialect speakers.
For example, one female Bukharan informant who was in her late 30s at the time of fieldwork in
2013 consistently produced an [ø̞]-like vowel for /ɵ/, which was realized as a close-mid central
vowel by most other Bukharans (Ido 2014, 2018). In addition, Bukharan males’ /ɵ/ tended to be
realized in the range of [ʊ] to [ɵ]. A similar gender imbalance in the degree of frontedness is also
observable in the production of the phoneme in question in Jewish Bukharan Tajik (Ido 2017).

Figure 6: Northern Tajik Chain Shift reproduced from Ido (2017: 97) with modification.
A Early New Persian; B early 20th-century Tajik; C present-day Tajik.

The resemblance of the Standard Tajik Chain Shift (Figure 4) to the latter half
of the Northern Tajik Chain Shift (Figure 6 Insets B and C) seems appreciable.
The identification of the Standard Tajik Chain Shift as part of the Northern
Tajik Chain Shift explains why the former took place in standard Tajik in the first
place in spite of the standardization that had prescribed in the early 20th century
that standard Tajik vowel phonemes be /i e a u ʊ ɔ/ (and not /i e a u ө o/ as they
are in present-day standard Tajik).
Had Tajik government officials tried to prevent the standard vowel system
from undergoing the chain shift, the vowel system comprising /i e a u ʊ ɔ/ might
have persisted as the standard Tajik vowel system to date. However, in reality,
no preventive measures appear to have been implemented by officials; one can
speculate that the Northerners’ dialects played a role here as they did in the early
20th century (§ 1.3). Given the fact that the elite in the Tajik SSR were speakers of
major Northern Tajik dialects (§ 1.3.2), in which the Northern Tajik Chain Shift was
underway in the 20th century, it seems conceivable that the Standard Tajik Chain
Shift took place because changes in major Northern Tajik dialects would permeate easily into standard spoken Tajik during the Soviet era.
2.2.1.3 The current situation
In this subsection, I describe the phonetic realization of the 6 standard Tajik
vowel phonemes in present-day standard Tajik, paying particular attention to the status
of /ɵ/ and /o/, which underwent major changes in the last century (§ 2.2.1.1).
First, observe Figure 7 in which the F1 and F2 values of vowels produced by
the Newsreaders and Announcers (§ 2.1) are shown in scatterplots. The vowels
whose formant frequency data are shown in Figures 7–10 were produced in isolation as well as in the test words of saxt /saχt/ ‘hard’, se /se/ ‘three’, sī /si/ ‘thirty’,
sū /sө/ ‘side’, sūxt /sөχt/ ‘s/he burnt’, soxt /soχt/ ‘s/he made’, and Suǧd /suʁd/
‘Sughd (province)’.36 The allophonic variation in /u/ and /o/ that is evident from
the F1 values of the vowels in /suʁd/ and /soχt/ will be explained in § 2.2.2.
Scatterplots in Figure 7 are in agreement with the vowel system presented
in Figures 2 and 3. This indicates that, in standard spoken Tajik, unsurprisingly,
vowel phonemes are realized as per the prescribed standard vowel system.
However, the agreement conceals the disproportionately wide variation in
the phonetic realization of one particular phoneme, namely /ө/. The phonetic
realization – in particular the second formant – of the phoneme /ө/ exhibits particularly wide variation among the Newsreaders and Announcers. The width of
the variation is quantitatively analysed later in the present subsection.
One may expect the variation in the phonetic realization of /ө/ to be minimal
among the Newsreaders. However, the present acoustic analysis shows that this
is not necessarily the case. The high variability of the second formant of /ɵ/ manifests itself even in the Newsreaders’ pronunciation. Observe, for instance,
in Figure 8, that while F71 and M71, both of whom are from Dushanbe, coincide
almost completely in their production of vowels in isolation, /ө/ is noticeably
more advanced in M82’s pronunciation than in F71 and M71’s pronunciation in
the F1-F2 plane.
Thus, even the Newsreaders are not immune to the inter-speaker variation
in the phonetic realization of the phoneme /ө/. It is unclear what induces this
inter-speaker variation, but it may be partially due to the fact that some of the
Newsreaders and Announcers do not natively have the central close-mid rounded vowel,
which is absent in many non-Northern Tajik dialects (Rastorgueva
1964: 31–41).37 (See Figure 10 Inset A for an example of the Northern Tajik vowel
system from Khujandi Tajik which does have [ө] for /ө/.) Moreover, not all dialects
customarily classified as Northern Tajik dialects or subtypes thereof have [ө] for
/ө/ either, meaning that the phonetic realization of /ө/ is not necessarily uniform,
even among Northern Tajik speakers.

Figure 7: F1 and F2 values in Hz of the Tajik vowels /i e a u ө o/ produced in isolation and in the
test words of /saχt, se, si, sө, sөχt, soχt, suʁd/ by the Newsreaders and Announcers, of whom A
5 are females and B 7 are males. Larger points represent means.

36 Two of these words, namely se /se/ and sī /si/, were randomized along with 11 other monosyllabic numerals to form 4 different lists of words. The other 5 test words were similarly randomized with 21 other monosyllabic words to form another 4 lists of words. Consequently, a total
of 4 repetitions per informant was recorded for each test word. Each informant produced the
vowels in isolation 4 times by reading aloud 4 differently randomized lists of the letters ‹и›, ‹э›,
‹а›, ‹у›, ‹ӯ›, and ‹о›. Some male Announcers produced certain vowels more or fewer than
4 times. As a result, the number of tokens obtained from male Announcers for a vowel varies
between 33 and 35 (Appendix 3). Regardless, all the tokens obtained from the Newsreaders and
Announcers are plotted in Figure 7. The formant frequency data used here are mean frequencies
over the middle 50 milliseconds of each vowel, which were obtained automatically using
a script in Praat (Boersma and Weenink 2015). The data plotting and analysis were carried out in
R (R Development Core Team 2019) using the phonR (McCloy 2016), cowplot (Wilke 2019), and
psych (Revelle 2018) packages.

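
The measurement described in footnote 36 (mean formant frequency over the middle 50 milliseconds of each vowel) can be illustrated with a short sketch. The Praat script actually used is not reproduced in this chapter, so the function below is only an assumed re-implementation of the idea in Python; the formant track, the 10 ms sampling step, and the inclusive window boundaries are all hypothetical.

```python
from statistics import mean


def mid_window_mean(times_ms, values_hz, window_ms=50):
    """Mean of the samples lying within `window_ms` centred on the vowel midpoint.

    `times_ms` are sample times in milliseconds, the first and last of which
    are taken to be the vowel onset and offset (an assumption of this sketch).
    """
    midpoint = (times_ms[0] + times_ms[-1]) / 2
    lower, upper = midpoint - window_ms / 2, midpoint + window_ms / 2
    selected = [v for t, v in zip(times_ms, values_hz) if lower <= t <= upper]
    if not selected:
        raise ValueError("no formant samples fall inside the window")
    return mean(selected)


# Hypothetical F2 track (Hz), sampled every 10 ms over a 150 ms vowel.
times_ms = list(range(0, 151, 10))
f2_track = [1500, 1520, 1550, 1580, 1600, 1620, 1640, 1650,
            1660, 1650, 1640, 1600, 1560, 1520, 1500, 1480]

f2_mid = mid_window_mean(times_ms, f2_track)
print(round(f2_mid, 1))
```

Restricting the measurement to the central portion of the vowel reduces the influence of formant transitions into and out of the neighbouring consonants, which is presumably why a mid-vowel window was chosen.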
For example, a radio announcer from the Zirakī village in the Hisor district
produces [ʊ] for /ө/ (Figure 9 Inset A), which conceivably is the “default” sound
he uses for /ө/ in his native dialect (see Uspenskaja 1962: 12 for a description of
/ө/ in the Hisor “mountain” dialect).38 Note that M82, a television newsreader
who produces a distinctively advanced [ө] for /ө/ (Figure 8), is from Navrūzī, a
village only a kilometre south of Zirakī.39 Their different realizations of /ө/ could
be ascribed to the possible absence of the close-mid central rounded vowel in
their native dialect(s).

37 Tellingly, during fieldwork, a radio programme presenter, whose data are not used in this article, indicated in apparent embarrassment that she was not good at producing standard Tajik /ө/.
38 An apparent ‘native variety-to-standard variety’ transfer of this kind has been reported for
Osaka Japanese speakers’ realization of standard Japanese /ɯ/ (Fujisaki and Hasegawa 1983;
Fujisaki et al. 1983; Sugitoh 1997: 4).
39 This allows the speculation that M82’s advanced /ө/ results from hypercorrection.

Figure 8: F1 and F2 values in Hz of the Tajik vowels /i e a u ө o/ produced in isolation and in the
test words of /saχt, se, si, sө, sөχt, soχt, suʁd/ by the 4 Newsreaders F71, M71, M86, and M82
(clockwise from top left). Larger points represent means.

Thus, while a pronunciation norm seems to exist which dictates that Tajik
vowels should be pronounced in standard Tajik in accordance with the vowel
system shown in Figure 2, considerable variation in the phonetic realization of
/ɵ/ is observed in standard spoken Tajik.
It should also be noted that, notwithstanding the pronunciation norm, people
in Dushanbe, in general, are not unanimous in accepting the current standard pronunciation, nor do they feel obliged to emulate it in their own pronunciation. For example,
in one recording session, a male informant, who is an academic originally from
the Qulmunda village in the Hisor district, and who occasionally appears in the
media, used, without any apparent inhibition, a non-standard vowel system in which /ɵ/ is realized as a fully back
vowel (Figure 9 Inset B). Another informant, a
male journalist, declared his dislike of the standard phonetic realization of /ө/,
adding that he would pronounce it as [uː] even when he speaks on the air. Incidentally, his insistence on producing [uː] for /ө/ conversely bespeaks the prescribed
status of [ө] as the standard realization of /ө/. The journalist, whose vowel system
is presented in Figure 10 Inset B, is from Ǧarm, the dialect of which lacks the close-mid central rounded vowel.41

Figure 9: F1 and F2 values in Hz of the Tajik vowels /i e a u ө o/ produced in isolation and in the
test words of /saχt, se, si, sө, sөχt, soχt, suʁd/ by A a male radio (Sadoi Dushanbe)40 announcer
from the Zirakī village in the Almosī area of the Hisor district and B a male academic affiliated
with the Tajik Academy of Sciences from the Qulmunda village in the Dehqonobod area of the
Hisor district. Larger points represent means.

As stated earlier in this subsection, the width of the variation in F2 in the
phonetic realization of /ө/ remains to be quantified. The width of the
variation can be observed in Table 2, which shows that the standard deviation for
the second formant of /ө/ is more than double that of any other standard Tajik
vowel phoneme.
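
The "more than double" claim can be checked directly against Table 2. The sketch below assumes that the second figure reported for each vowel and group in Table 2 is the standard deviation, as the text describes; the function name is illustrative.

```python
# Standard deviations (Hz) of F2, read from Table 2.
F2_SD = {
    "female": {"i": 163, "e": 196, "a": 180, "o": 101, "u": 122, "ɵ": 400},
    "male": {"i": 136, "e": 101, "a": 91, "o": 114, "u": 134, "ɵ": 304},
}


def ratio_to_next_largest(sds, target="ɵ"):
    """Ratio of the target vowel's SD to the largest SD among the other vowels."""
    others = [sd for vowel, sd in sds.items() if vowel != target]
    return sds[target] / max(others)


ratios = {group: ratio_to_next_largest(sds) for group, sds in F2_SD.items()}
for group, ratio in ratios.items():
    print(group, round(ratio, 2))
# female 2.04
# male 2.24
```

In both groups the ratio exceeds 2 even against the most variable of the other five vowels (/e/ for the female informants, /i/ for the male informants).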
Articulatorily, much of this variation in F2 can be ascribed to inter-speaker variation in the place of constriction within the palatal region, because F1 and F3 for
/ө/ are both fairly constant among the informants (Table 3) and at the values suggestive of lip rounding (Ericsdottir 2005: § 8.4; Lindbolm and Sundberg 2014: 746).

40 Sadoi Dušanbe is a state-run radio broadcasting station.
41 This characteristic of the Ǧarm dialect has been known since the early 20th century (see
Polivanůf 1934: 20).

Figure 10: F1 and F2 values in Hz of the Tajik vowels /i e a u ө o/ produced in isolation and in the
test words of /saχt, se, si, sө, sөχt, soχt, suʁd/ by A a Tajik-Uzbek bilingual female university
lecturer from Guliston, a town situated fifteen kilometres east of Khujand, and B a Tajik-Shughni bilingual male journalist from Ǧarm. Larger points represent means.

This said, given that F2 values of non-front vowels are highly susceptible
to lip rounding, the variation in F2 mentioned above may be ascribed in part to
inter-speaker variation in lip rounding; after all, Bobomurodov (1978: 6) observes
that /ɵ/ is less rounded than /u/ in standard Tajik. It is interesting to note in this
respect that some Tajik speakers do seem to pronounce /ɵ/ as a [ʊ̜]-like vowel, a
vowel that is less rounded than [ө]. This may not be surprising given that many
Tajik dialects lack the close-mid central rounded vowel (recall that the close-mid
central rounded vowel is largely unique to Northern Tajik dialects) and rounded
non-back vowels can be difficult to distinguish from unrounded or moderately
rounded back vowels for those who are not familiar with such vowels.42 In addition, diphthongization seems to be present in certain contexts of some Tajik
speakers’ realization of /ө/, which, naturally, is also a potential source of variation in F2.43

42 For example, as noted by Lewin (2018: 174), in an experiment conducted by Ladefoged (1967:
133–141), phoneticians’ judgments of Scottish Gaelic “/ɯ(ː)/ and /ɤ(ː)/ varied greatly in degree of
perceived rounding and backness”.
43 Diphthongal realization of /ɵ/ has been reported for different varieties of Tajik (e.g., in
Buzurgzoda 1940: 40; Orfinskaja 1945: 89; Rastorgueva 1956: 13–14). See also § 2.2.1.1.

Table 2: Mean F2 values in Hz of the Tajik vowels produced
by the 5 female (n. of tokens: 20 per vowel) and 7 male
(n. of tokens: 33 to 35 per vowel) Newsreaders and Announcers.

        Female          Male
        Mean    SD      Mean    SD
/i/     2839    163     2285    136
/e/     2577    196     2016    101
/a/     1546    180     1288     91
/o/      917    101      879    114
/u/      766    122      776    134
/ө/     1651    400     1421    304


Table 3: Mean F1, F2, and F3 values in Hz of the Tajik vowel
/ө/ produced by the 5 female (n. of tokens: 20) and 7 male
(n. of tokens: 33) Newsreaders and Announcers.

        Female          Male
        Mean    SD      Mean    SD
F1       466     68      401     60
F2      1651    400     1421    304
F3      2933    209     2521    160
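
One possible way of comparing the variability of the three formants of /ɵ/ is to express each standard deviation relative to its mean. The sketch below uses the values of Table 3, reading the second figure for each formant as the standard deviation; the coefficient of variation is an illustrative choice made here, not a measure used in the chapter, whose comparison appears to rest on the absolute standard deviations.

```python
# (mean, SD) in Hz for /ɵ/, read from Table 3.
TABLE_3 = {
    "female": {"F1": (466, 68), "F2": (1651, 400), "F3": (2933, 209)},
    "male": {"F1": (401, 60), "F2": (1421, 304), "F3": (2521, 160)},
}


def coefficient_of_variation(mean_hz, sd_hz):
    """Standard deviation expressed as a fraction of the mean."""
    return sd_hz / mean_hz


cv_percent = {
    group: {
        formant: round(100 * coefficient_of_variation(m, sd), 1)
        for formant, (m, sd) in formants.items()
    }
    for group, formants in TABLE_3.items()
}

for group, values in cv_percent.items():
    print(group, values)
```

On either yardstick, absolute or relative, F2 is the most variable of the three formants in both groups.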

2.2.1.4 Summary
The mid rounded vowels shifted in standard Tajik in the 20th century. The shifting of the mid rounded vowels comprises fronting and raising, i.e., /ʊ/→/ɵ/ and
/ɔ/→/o/. It can be identified as constituting part of the Northern Tajik Chain Shift,
which in standard Tajik culminated by the late 20th century in the establishment
of the vowel system comprising the six vowels /i e a u ө o/.
There is considerable variation in the phonetic realization of /ө/ among the
standard Tajik-speaking informants. The articulatory variation might involve not
only the place of articulation but also the degree of lip rounding and/or diphthongization.
While standard spoken Tajik generally adheres to the vowel system presented
in grammars and textbooks, Dushanbe residents today generally do not seem to
feel obliged to use the standard vowel system in their speech.

2.2.2 Pharyngealization
Figures 7–10 indicate allophony of back vowels involving uvular fricatives. The
F1 of the /u/ in /suʁd/ is higher than that of /u/ produced in isolation, as is, to a
lesser degree, the F1 of the /o/ in /soχt/ in comparison with that of /o/ produced
in isolation. On the other hand, it has been previously reported that, when adjacent to a uvular consonant, the close front vowel phoneme /i/ is realized as a
more central and open allophone (Sokolova 1949: 21; Ido 2012: 22–23).
These seemingly disparate phenomena may both result from the constriction
of the vocal tract at the pharyngeal region, which Shahin (2002: 24) suspects is
a corollary of uvular articulation,44 because pharyngeal constriction (Tiede 1996;
Fulop et al. 1998; Fulop and Warren 2014) has been associated with raising of F1
and lowering of F2 in various acoustic studies on languages utilizing uvular consonants such as Arabic (Harrel 1957: 46; Bin-Muqbil 2006: 41–45), Interior Salish
(Bessel 1998: 5–6; Flemming et al. 2008), and West Greenlandic (Wood 1971). The
classical study by Chiba and Kajiyama (1941) also corroborates this association.
However, admittedly, the involvement of pharyngeal constriction in the allophonic realizations of the phonemes in question cannot be positively established
without examinations of other vocal tract modifications.45

44 See also Delattre (1971), who describes the ‘backing-and-rising’ motion of the tongue observed
in the production of uvular consonants in Arabic, German, Spanish, and French.
45 For instance, in terms of the three-parameter model of vowel production by Stevens and
House (1955), these allophonic variations would be analysed as resulting from differing degrees
of mouth opening with the constriction of the vocal tract fixed at 6 to 7 centimetres from the
glottis for /o/ and /u/, and at around 9 centimetres (with the cross-sectional area at the
constriction being 0.3 cm²), or 11 centimetres (with the cross-sectional area at the constriction
being 0.4 cm²) from the glottis for /i/.

2.2.3 Length contrast in close vowels
This subsection reviews facts and claims about vowel length distinction in standard Tajik
and then analyses audio data taken from the Tajik speech corpus (§ 2.1) in order to see
whether, and how, a distinction is made between “long” and “short” close vowels in
present-day standard (spoken) Tajik. There have been conflicting views on which vowels
should be contrasted in standard Tajik. Vowel length was a point of contention during the
period of the standardization of Tajik, when Tajik intellectuals were in disagreement as to
whether [iː] and [uː] should be accorded the status of phonemes distinct from /i/ and /u/ in
standard Tajik, and whether they should receive unique orthographical representations
(Fitrat 1927; Halimzoda 1929; Baqozoda 1930; Ismatī 1930; ‘Ayn 1930; Odilzoda
1930; Sulaymonova 1930; Uluǧzoda 1930).46 Tajik intellectuals in the early 20th
century eventually decided that the Tajik alphabet should be rid of unique letters
for [iː] and [uː], thereby dismissing vowel length as phonologically irrelevant in
standard Tajik (Table 1). Since then, the general consensus among Tajik linguists
on vowel length has been that no phonological vowel length distinction needs to
be incorporated into the phonology of standard Tajik; Xaskašev (1983: 63) goes
as far as stating that no minimal pairs where vowel length is phonologically distinct can be found in modern standard Tajik. As a result, prescriptively, standard
Tajik today does not contrast vowel length in its phonology. Nevertheless, heated
debates about length contrast in close vowels would resurface from time to time
(see Karimov 1973, 1982; Fajzov 1983: 60), long after the vowel system that contrasts six vowels (Figure 2) had generally been accepted as standard.
2.2.3.1 Facts in favour of /iː/ and /uː/
There are some facts that are in favour of the admission of [iː] and [uː] as phonemes
in standard Tajik. First, the predecessor of Tajik, namely Early New Persian, has
short i and long ī, as well as short u and long ū as its phonemes (Miller 2012: 165);
in other words, vowel length is phonologically distinctive in Early New Persian
close vowels. The debates about length contrast in close vowels are therefore
about whether the historical vowel length distinction is still intact in Tajik and
hence should be incorporated into the segmental phonology of standard Tajik.
Second, [iː] and [uː] do occur in a number of Tajik dialects, among which are
the ones that served in the early 20th century as the dialectal basis for standard
Tajik, namely the Northern Tajik dialects of Bukhara and Samarkand.47 There are abundant
reports of words in which historically long close vowels are pronounced long in the
Northern dialects of Bukhara and Samarkand (e.g., Begbudi 1956: 5; Kerimova 1959: 6).
Indeed, the Latin-based alphabet that was in limited use (along with the Perso-Arabic
alphabet) between 1928 and 1930 in the nascent Tajik publishing industry was equipped
with four letters for short and long close vowels, namely ‹i›, ‹ī›, ‹u›, and ‹ū› (Appendix 1,
Table 1).

46 Indeed, in the standardization of Tajik, representing vowels in writing was a constant source
of contention as much as it was a centre of attention for Tajik intellectuals (Kumitai markazii
alifboi navi tojikī 1929). The establishment of the new alphabet was arguably the single most
debated issue in the orthographic standardization of Tajik, because it was closely connected to
various other contentious issues, such as the selection of a variety (or varieties) to provide the
basis for standardization and the determination of the phoneme inventory of standard Tajik.
47 This resulted in an interesting statement about the letter ‹u› (which represents /u/ in the
Latin-based Tajik alphabet established in August 1930; see Table 1) in the textbook of Tajik
published in 1932, to the effect that vowel length distinction is not phonological, though
in pronunciation there should be vowel length distinction (Toşpūlotuf et al. 1932: 14).
Incidentally, the same textbook treats the contrast between short and long close front vowels
(both represented graphemically with ‹i›) as semantically relevant, citing only one minimal
pair, namely bino /bino/ ‘building’ vs. bino /bi[ː]no/ ‘sighted’.

In addition, the importance of the existence of vowel length distinction in
poetry recitation has often been invoked by the proponents of admitting [iː] and
[uː] as phonemes and/or representing them graphemically in standard Tajik
(e.g., Karimov 1982: 87). Apparently, the professed importance of vowel length
distinction in poetry recitation was met with general apathy, if not resistance
(e.g., A”lozoda 1930: 6–7).
2.2.3.2 Facts in favour of [iː] and [uː]
The reason why vowel length is dismissed as phonologically irrelevant in standard Tajik is
obvious: semantically relevant contrasts between short and long close
vowels are rare in Northern Tajik dialects. Apparently, minimal pairs involving
vowel length distinction were already rare when attempts at standardizing Tajik
commenced in the early 20th century, with both Sulaymonova (1930) and Uluǧzoda
(1930) providing only the oft-cited bino /bino/ ‘building’ vs. bino /bi[ː]no/ ‘sighted’
as a minimal pair. The bino vs. bino pair is one of the few minimal pairs that are
distinguished by vowel length and consist of commonly used words.
There do, and did, exist phonetically long close vowels in Bukharan and
Samarkandi Tajik (§ 2.2.3.1), but their occurrence is conditioned (§ 2.2.3.3) and is
limited to certain words. As such, in the early 20th century, the vowel length distinction in close vowels in the dialect of Samarkand was described as seemingly
non-existent (Zarubin 1927: 355; Orfinskaja 1945: 90). Similarly, reports on vowel
length in Bukharan varieties of Tajik that have appeared in the dialectological
literature typically do not contain minimal pairs distinguished exclusively by
vowel length (e.g., Kerimova 1959: 6; Melex 1968: 5). Unsurprisingly, vowel length
distinction in Northern dialects has routinely been described as “rudimentary”
(Rastorgueva 1964: 23), or “residual” (Sokolova 1949: 19).
2.2.3.3 The current situation
Prescriptively, standard Tajik today does not contrast vowel length in its phonology. Hence, theoretically, the historically long vowels [iː] and [uː] need not be
pronounced longer than their historically short counterparts in standard Tajik.
Are they, then, pronounced no differently from their historically short counterparts?

Studies of vowel length in standard Tajik published in the mid- to late 20th
century show that historically long vowels are pronounced long (in at least certain
words) under certain conditions. More specifically, according to those studies, in
standard Tajik, phonetically long close vowels occur that correspond with historically long vowels, albeit only in the open unstressed syllable (Rastorgueva 1955:
39–40) that precedes the stressed syllable (Fajzov 1983). Fajzov (1983) presents
data in support of this observation (Figure 11C).
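The conditioning described in those studies can be summarized as a simple predicate. The sketch below is illustrative only: the `Syllable` record, its field names, and the function `expects_long_vowel` are hypothetical, and the syllabifications and stress placements are supplied by hand rather than derived from the corpus.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    text: str                 # broad transcription of the syllable
    is_open: bool             # True if the syllable ends in a vowel
    historically_long: bool   # True if its close vowel continues a long vowel


def expects_long_vowel(syllables, stress_index, index):
    """Return True if the vowel at `index` meets the reported conditions:
    historically long, in an open syllable, immediately before the
    stressed syllable (and therefore itself unstressed)."""
    syllable = syllables[index]
    return (
        syllable.historically_long
        and syllable.is_open
        and index == stress_index - 1
    )


# Hand-annotated examples; stress falls on the final syllable in each.
SHISHA = [Syllable("ʃi", True, True), Syllable("ʃa", True, False)]
SHIFO = [Syllable("ʃi", True, False), Syllable("fo", True, False)]
DUR_FAR = [Syllable("duɾ", False, True)]

if __name__ == "__main__":
    print(expects_long_vowel(SHISHA, 1, 0))   # historically long, open, pre-stress
    print(expects_long_vowel(SHIFO, 1, 0))    # historically short
    print(expects_long_vowel(DUR_FAR, 0, 0))  # stressed closed monosyllable
```

The predicate only restates the reported conditions. It says nothing about how consistently individual speakers realize the length difference.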
Thus, [iː] and [uː] (not /iː/ or /uː/) occurred, under certain conditions in certain
words, in standard Tajik in the mid- to late 20th century. What, then, is the situation concerning vowel length in the standard Tajik of post-civil war Tajikistan?
To find out, an analysis was performed on some words in the Tajik speech corpus
(§ 2.1) that have been identified in previous studies as containing phonetically
long close vowels.
Figure 11 Insets A–B present the results of the analysis, which measured the durations
of /i/ and /u/ in the following test words as produced by the Newsreaders and
Announcers: bino /bi[ː]no/ ‘sighted’, bino /bino/ ‘building’, dur /du[ː]ɾ/ ‘far’, dur /duɾ/
‘pearl’, šifo /ʃifo/ ‘healing’, šiša /ʃi[ː]ʃa/ ‘glass’, šištan /ʃiʃtan/ ‘to sit’, sukut /sukut/
‘silence’, surat /su[ː]ɾat/ ‘appearance’, surx /suɾχ/ ‘red’, surxak /suɾχak/ ‘reddish’, and
surud /suɾud/ ‘song’. Figure 11 Inset C shows Fajzov’s (1983) measurements of the
durations of /i/ and /u/, which he obtained from informants whose speech was recorded
between 1980 and 1982 (historically long vowels are marked with [ː] in Figure 11).
An analysis of the audio data largely confirms the previous observations; the
analysis reveals a tendency for historically long vowels to be pronounced longer
in the pre-stressed open syllable in certain words in present-day standard Tajik.
However, it also reveals variation in the strength of the tendency across informants and test words.
For example, Figure 11 shows that, while /du[ː]ɾ/ and /duɾ/ are pronounced
identically in terms of vowel length, close vowels in /ʃi[ː]ˈʃa/ and /su[ː]ˈɾat/ are
pronounced longer than in /ʃiˈfo/ and /suˈkut/ or /suˈɾud/, respectively. While
this result is indeed in agreement with previous observations of vowel length in
standard Tajik, in the pronunciation of the bino pair, there is evidence for interspeaker variation in the length of the historically long vowel [iː] relative to that
of the historically short vowel [i]; some informants pronounce the close vowel
in /bi[ː]no/ distinctively longer than that in /bino/, while others make little or no
length distinction between them. A comparison between Figure 11 Insets A–B and
C also suggests that present-day newsreaders and announcers are not as unanimous as Fajzov’s informants were in the 1980s in pronouncing the historically
long /i/ in /bi[ː]ˈno/ longer than the historically short /i/ in /biˈno/.
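Inter-speaker variation of this kind can be quantified as a per-speaker duration ratio. The following sketch is illustrative only: the speaker labels, the millisecond values, and the 1.2 threshold are invented for the example and are not measurements from the corpus or from Fajzov (1983).

```python
# HYPOTHETICAL durations in milliseconds of the first-syllable vowel,
# keyed by speaker: (historically long /bi[ː]no/, historically short /bino/).
DURATIONS_MS = {
    "speaker_A": (96.0, 60.0),
    "speaker_B": (71.0, 68.0),
    "speaker_C": (64.0, 66.0),
}

# HYPOTHETICAL cut-off above which a speaker counts as making a distinction.
RATIO_THRESHOLD = 1.2


def length_ratio(long_ms, short_ms):
    """Return the duration of the historically long vowel relative to the short one."""
    return long_ms / short_ms


def speakers_with_distinction(durations, threshold=RATIO_THRESHOLD):
    """Return the sorted list of speakers whose long/short ratio exceeds the threshold."""
    return sorted(
        speaker
        for speaker, (long_ms, short_ms) in durations.items()
        if length_ratio(long_ms, short_ms) > threshold
    )


if __name__ == "__main__":
    for speaker, (long_ms, short_ms) in DURATIONS_MS.items():
        print(speaker, round(length_ratio(long_ms, short_ms), 2))
    print(speakers_with_distinction(DURATIONS_MS))
```

A ratio is used rather than an absolute difference so that speakers with different speech rates remain comparable. The choice of threshold is arbitrary here.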

Figure 11: Durations in milliseconds of /i/ and /u/ in the word-initial syllables of A the test
words of /duɾ, du[ː]ɾ, suɾχ, sukut, suɾχak, su[ː]ɾat, ʃifo, ʃiʃtan, ʃi[ː]ʃa, bino, bi[ː]no/ produced
by the Newsreaders, B the same test words produced by the Announcers, and C (data from
Fajzov 1983) the test words of /duɾ, du[ː]ɾ, suɾud, su[ː]ɾat, ʃi[ː]ʃa, bino, bi[ː]no/ produced by six
intellectuals residing in Dushanbe.

Given that the Newsreaders and Announcers have on-air experience
at their respective Dushanbe-based national broadcasting stations (§ 2.1.2), it
seems fair to say that standard Tajik as it is used in the media today is highly
tolerant of variation in vowel length.
2.2.3.4 Summary
Vowel length distinction in close vowels was a contentious issue in the standardization of Tajik in the early 20th century when conflicting claims were made about
whether it was phonologically distinctive in major Northern dialects of Tajik.
A vowel-length measurement run on present-day standard Tajik audio data
reveals that phonetic vowel length distinction exists as a tendency in present-day
standard Tajik, as it reportedly also did in the standard Tajik of the Soviet period,
albeit only in close vowels and only in the pre-stress open syllable in certain
words.48 The analysis also evinces inter-speaker variation in the strength of the
tendency, which in turn suggests the high tolerance of present-day standard
(spoken) Tajik to the lack (or presence) of vowel length distinction. Consequently,
as has been repeatedly pointed out by a number of Tajik intellectuals and linguists since the early 20th century, there does not seem to be much evidence in
support of phonological vowel length distinction in standard Tajik.

2.3 Consonants
The consonant phoneme inventory of standard Tajik is unremarkable in terms
of size, though it comprises three uvular consonants, which Maddieson (2005:
30) observes to be “one of the less common types of consonants”. Its more
remarkable characteristic may be its resemblance to the consonant phoneme
inventory of Uzbek (Kononov 1960: 24; Sjoberg 1962; Jamolxonov 2009: 117–139),
with which Tajik has been in intensive contact for centuries and which also
utilizes the same three uvular consonants.
Table 4 presents the consonant phonemes of standard Tajik. Consonants whose
phonemic status in Tajik is contentious or ambiguous are
bracketed. Arrows indicate diachronic sound changes that have taken place in
standard Tajik since its inception in the early 20th century.

48 It is worth noting that, unlike in standard Tajik, the unstressed (historically short) /i/ and
/u/ in šifo ‘healing’ and sukut ‘silence’ can be devoiced or elided outright in Bukharan Tajik (Ido
2014: 99), as this may represent one aspect in which Bukharan Tajik and standard (spoken) Tajik
have diverged since the early 20th century.

Table 4: Consonant phonemes of Tajik.
Places of articulation: Bilabial, Labio-dental, Denti-alveolar, Alveolar, Postalveolar,
Palatal, Velar, Uvular, Glottal
Plosive: d; k ɡ; [ʔ]
Nasal: m
Affricate: [t͡s]; t͡ʃ ←t͡ɕ; d͡ʒ ←d͡ʑ
Tap: ɾ
Fricative: f; ʃ ←ɕ; [ʒ] ←[ʑ]
Approximant:
Lateral approximant:

In the following subsections the phonemic statuses of [ʔ], [t͡s], [ʒ], and [ɕ] are discussed
in some detail. The contention over the phonemic status of [ʔ] and [t͡s] ultimately arises
from the complexities in Tajik phonology that loanwords (from Arabic in the case of [ʔ]
and from Russian in the case of [t͡s]) bring about. As for [ʒ], its phonemic status is subject
to contention because of its limited occurrence in Tajik. [ɕ], [t͡ɕ], and [d͡ʑ] had all been
phonemes in standard Tajik until the mid-20th century, but subsequently gave way to /ʃ/,
/t͡ʃ/, and /d͡ʒ/, respectively.
2.3.1 The status of [t͡s]
There have been conflicting views among Tajik grammarians and linguists as to
whether the affricate [t͡s] should be accorded the status of a phoneme in standard
Tajik.49 The conflict essentially derives from the different approaches grammarians and
linguists take in describing standard Tajik phonology.

49 A similar situation concerning [t͡s] in Russian loanwords exists among Uzbek grammarians
and linguists, as illustrated in Jamolxonov (2009: 133–135).

2.3.1.1 Facts in favour of /t͡s/
One of the approaches taken by Tajik grammarians and linguists in describing
standard Tajik phonology is to expand the scope of description beyond the native
stratum of the Tajik lexicon to include the stratum of borrowed lexical items. This
leads to the inclusion of [t͡s] in the consonant inventory of standard Tajik, because
the occurrence of the voiceless affricate [t͡s] in standard Tajik is limited to loanwords.
Most such loanwords have entered Tajik from Russian, which utilises
the phoneme /t͡s/, which has its own orthographical representation, namely ‹ц›.
This approach is popular among pedagogically oriented Tajik linguists. Thus,
instructional materials are generally less inclined to dismiss [t͡s] as a consonant
foreign to standard Tajik. For example, in a textbook prepared for use at Tajik
universities, Niyozī (1956: 28) includes [t͡s], which he writes as “ц”, among standard
Tajik consonant phonemes. His insistence on the status of [t͡s] as a phoneme in
standard Tajik is echoed in a number of other textbooks, such as one for secondary
education (Niyozmuhammadov et al. 1955: 15) and Arzumanov and Sanginov’s (1988:
122–124) textbook for learners of Tajik at higher education institutions, though they do
not use the term fonema “phoneme” in reference to the sound in question. Karimov’s
chapters in textbooks (Karimov 1973: 96–100, 1982: 88–96) also count (for no stated
reason) [t͡s] among standard Tajik phonemes. One characteristic of these works is that
they typically do not provide data (such as minimal pairs) in support of their
identification of [t͡s] as a standard Tajik phoneme, beyond simply stating that [t͡s] is
used in words borrowed from Russian into Tajik (e.g., in Niyozī 1956: 28).
2.3.1.2 Facts in favour of [t͡s]
The other approach to the description of standard Tajik phonology is to limit
the scope of description to the native stratum of the Tajik lexicon. This results
in excluding the affricate [t͡s] from the consonant inventory of standard Tajik,
because the voiceless affricate does not occur in native Tajik words. With all
loanwords excluded from consideration, a description of standard Tajik phonology
can safely ignore [t͡s].
Descriptive works of Rastorgueva (1954: 533–534), Xaskašev (1983: 64, 1985:
21), and Kerimova (1997: 98) are among those that employ this approach in describing
standard Tajik phonology. Tajik orthography has also taken this approach,
dispensing with the orthographic representation of [t͡s], albeit only before 1940
and after 1998, because from 1940 to 1998, Russian loanwords in Tajik would
be written in accordance with Russian orthography, in which ‹ц› exists for the
Russian phoneme /t͡s/.
This approach makes historical sense in that, only a century ago, [t͡s] was unequivocally
not a phoneme in Tajik. That none of the descriptions of the Tajik of the
late 1920s and early 1930s contain any mention of [t͡s] testifies to its foreignness
to Tajik phonology in the early 20th century, when the contact between Tajik and
Russian was limited (Semёnov 1927; “Alifboi” 1928; “O novom” 1928; Zarubin 1928;
Toşpūlotuf et al. 1932: 13–17; Orfinskaja 1945; Nabavī et al. 2007). In 1940, [t͡s] receives
one of its first mentions in Fonetikaji zaвoni adaвiji toçik “the phonetics of literary/
standard Tajik”, which, however, notes that Russian /t͡s/ is replaced by (Tajik) /s/ in
loanwords (Buzurgzoda 1940: 48). The fricative realization of the Russian affricate
in Tajik has been reported in other works too (Rastorgueva 1992: 8; Gacek 2012: 358).
2.3.1.3 The current situation
Does [t͡s], then, occur in standard Tajik today? The answer is yes; in fact, it may
have started to occur in standard Tajik as early as 1940, because the aforementioned
Fonetikaji zaвoni adaвiji toçik (1940) notes that the affricate was making its
way into Tajik, suggesting that, in Russian loanwords in 1940, the Russian affricate
was at least sometimes pronounced as [t͡s], a sound that had arguably been absent
from Tajik prior to its contact with Russian.
Today [t͡s] does occur in standard spoken Tajik, despite the orthographic change
in 1998 which purged the letter ‹ц›, the orthographic representation of Russian /t͡s/,
from the Tajik alphabet. Thus, for instance, the consonant in the penultimate syllable
of the Russian loanword konstitutsiya ‹конститутсия› ‘constitution’, whose
orthographic representation in the source language is konstitucija ‹конституция›,
is pronounced consistently as [t͡s] by announcers appearing in video clips prepared
by none other than the National Centre of Legislation.50 One could therefore claim
that there are comparatively more grounds for treating [t͡s] as a phoneme in standard
Tajik today than there were in the early 20th century.
On the other hand, interestingly, it is not the case that the affricate in Russian
loanwords is invariably realized as [t͡s] in present-day standard Tajik. For instance,
one can hear the loanword konsepsiya ‹консепсия› ‘conception’ being pronounced
as [konsept͡sija] in broadcasts. Note that the two instances of Russian
/t͡s/ in koncepcija ‹концепция›, its Russian original, are rendered differently, as /s/
and [t͡s], in the broadcasts. In other words, a pronunciation norm seems to exist
which mandates that Russian /t͡s/ be pronounced as [t͡s] in Russian loanwords,
but this norm does not extend to all instances of /t͡s/ in Russian loanwords.
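The two spellings cited above are consistent with a positional respelling of Russian ‹ц›. The sketch below is a hedged illustration, not a statement of the official orthographic rules. It assumes that ‹ц› is written ‹тс› after a vowel letter and ‹с› elsewhere, a rule that the present text does not state, and the function name `respell_ts` is hypothetical. It models spelling only; as noted above, pronunciation need not follow it.

```python
# Vowel letters of Russian Cyrillic, used to test the preceding context.
VOWEL_LETTERS = set("аеёиоуыэюя")


def respell_ts(russian_word):
    """Respell ‹ц› as ‹тс› after a vowel letter and as ‹с› elsewhere.

    ASSUMPTION: this positional rule is adopted for illustration only.
    """
    result = []
    previous = ""
    for letter in russian_word.lower():
        if letter == "ц":
            result.append("тс" if previous in VOWEL_LETTERS else "с")
        else:
            result.append(letter)
        previous = letter
    return "".join(result)


if __name__ == "__main__":
    print(respell_ts("конституция"))
    print(respell_ts("концепция"))
```

Applied to the two Russian source words, the sketch yields the Tajik spellings cited in this subsection. That agreement shows only that the assumed rule is compatible with these two examples.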
2.3.1.4 Summary
The affricate [t͡s] occurs in standard spoken Tajik today, though, in the apparent
absence of minimal pairs involving the affricate, its phonemic status is ambiguous.
The Russian alveolar affricate is rendered as either [t͡s] or /s/ in standard Tajik
today, but it is unclear what conditions, if any, determine when the Russian affricate
is rendered as [t͡s] in Tajik; this certainly merits further investigation.

50 Incidentally, konstitutsiya as pronounced in a video clip that is put on the website of the
National Centre of Legislation exhibits not only the affricate [t͡s] in its penultimate
syllable, but also the replacement of /o/ with /a/ (so-called Russian akan’e; Yanushevskaya
and Bunčić 2015: 225) in its initial syllable. As L. Rzehak (p.c. 11 April 2021) points out,
Tajik-Russian bilingualism may be among the factors that affect one’s propensity to produce
[t͡s] in Russian loanwords.

2.3.2 The status of [ʔ]
The glottal plosive [ʔ] is another of the consonants placed in square brackets in
Table 4. As is the case with [t͡s], its phonemic status in standard Tajik is as
ambiguous as it is contestable. [ʔ] is absent from the lists of standard Tajik consonant
phonemes put forward by Rastorgueva (1954: 533–534), Niyozmuhammadov
et al. (1955: 15), Niyozī (1956: 25–28), and Kerimova (1997: 98). In contrast, Xaskašev
(1983, 1985) is explicit in identifying it as a phoneme in standard Tajik. See Table
5, where Xaskašev uses the letter “ъ” to represent the glottal stop in his handbook
of phonetics prepared for use by students studying linguistics (filologiya) at
Tajik universities. Tajik grammars and textbooks typically use Cyrillic letters in
representing Tajik phonemes, hence the absence of IPA symbols in the table.
Xaskašev’s use of the letter “ъ” for the glottal stop is unlikely to be an arbitrary
choice, because ‹ъ› is used in Tajik orthography as the representation of the
word-medial and word-final ‹ع› (‘ayn) and ‹ء› (hamza) (Olimov and Aliev 1999: 11;
“Qoidahoi” 2011) in Arabic loanwords,51 and ‘ayn and hamza represent the voiced
pharyngeal fricative and glottal stop, respectively, in Arabic orthography (Ryding
2005: 13). Moreover, the letter has often been called alomati sakta ‘stop sign’ and
described in orthographical dictionaries as representing hamsadoi halqī ‘glottal
consonant’ (Kalontarov 1974: 9; Maniyozov and Mirzoev 1991: 10), suggesting that
‹ъ› is understood as representing the glottal stop by a number of Tajik linguists.
The functional rationale for representing Arabic ‘ayn and hamza in Tajik
orthography is the existence of pairs of words that are orthographically distinguished only by the letter ‹ъ›. Such pairs include nav ‹нав› ‘new’ vs. nav” ‹навъ›
‘sort’, bad ‹бад› ‘bad’ vs. ba”d ‹баъд› ‘after’, and azo ‹азо› ‘mourning’ vs. a”zo
‹аъзо› ‘member’.52 Do these words constitute minimal pairs and is [ʔ] therefore a
phoneme? This is a matter of contention, hence the aforementioned inconsistencies among different consonant inventories proposed by different linguists.
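Pairs of this kind can be collected mechanically from a word list. A minimal sketch follows, assuming a plain list of Cyrillic spellings. The function name `hard_sign_pairs` is hypothetical, and the word list contains only the six words cited above.

```python
def hard_sign_pairs(words):
    """Return (without ‹ъ›, with ‹ъ›) pairs whose spellings differ only by ‹ъ›."""
    vocabulary = set(words)
    pairs = []
    for word in sorted(vocabulary):
        if "ъ" in word:
            stripped = word.replace("ъ", "")
            if stripped in vocabulary:
                pairs.append((stripped, word))
    return pairs


# The six spellings cited in the text.
WORDS = ["нав", "навъ", "бад", "баъд", "азо", "аъзо"]

if __name__ == "__main__":
    for plain, marked in hard_sign_pairs(WORDS):
        print(plain, marked)
```

Such pairs are orthographic only. Whether they are also phonological minimal pairs is the question taken up in this section.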

51 To be sure, not every non-word-initial ‘ayn or hamza in Arabic orthography is represented
with ‹ъ› in Tajik orthography (Olimov and Aliev 1999: 12). For example, the Arabic source of the
loanword muallim ‹муаллим› ‘teacher’ contains an ‘ayn.
52 Kamoliddinov (2007: 13) presents thirteen such orthographic “minimal pairs”.

Table 5: Tajik consonant phoneme chart adapted from Xaskašev (1983: 65) in translation.53
[Chart: consonants arranged by place of articulation (labial; lingual: apical, laminal,
dorsal; uvular; glottal), by manner of articulation (occlusive, continuant, plosive, affricate,
vibrant, single-constriction, double-constriction), by participation of obstruction and
sonority (obstruent, sonorant), and by voicing (voiceless, voiced).]

53 Obvious errors in the original table are corrected in Table 5. Note that pešzabonī, whose
literal translation would be ‘front-lingual’, is translated as ‘apical’ here, though phonetic
descriptions in the Tajik linguistic literature tend to be indiscriminate in that they use a
single expression such as pešzabonī for both the tip and the front/blade of the tongue.
Accordingly, ‘apical’ in Table 5 should probably be better understood as meaning ‘of the tip
and the front/blade of the tongue’.

2.3.2.1 Facts in favour of /ʔ/
As has been explained in § 2.3.2, in Tajik orthography, ‹ъ› (or its predecessor ‹’› in
pre-1940 Latin-based Tajik alphabets; Appendices 1–2) is primarily a placeholder
for non-word-initial ‹ع› and ‹ء› in Arabic loanwords, and hence can be seen as
representing not a Tajik phoneme but two Arabic letters. That ‹ъ› exists in Tajik
orthography, therefore, does not necessarily imply that it has a stable representation
in (standard) Tajik pronunciation; nor does it necessarily mean that ‹ъ› is
read in standard Tajik as [ʔ], which hamza represents, or, for that matter, as [ʕ],
which ‘ayn represents in Arabic. Naturally, if [ʔ] were absent from standard Tajik, we
would not be able to even discuss its phonemic status.
Descriptions of ‹ъ› in grammars indicate that [ʔ] does occur in standard Tajik.
For example, Rastorgueva (1992: 11) writes that [ʔ] occurs in the intelligentsia’s
careful pronunciation. She also writes that, within a word, ‹ъ› preceded by a
letter representing a consonant and followed by a letter representing a vowel “is
not pronounced but indicates a syllable division”. This explanation is unclear as
to how the division of syllables is (phonetically) achieved in pronunciation, but
could indicate the presence of [ʔ] or creak (see below) where ‹ъ› occurs in writing.
In fact, in today’s speech too, when a Tajik speaker attempts to speak “right”,
e.g., in broadcast speech, creak or a suspended closure of the glottis, namely
[ʔ], can be clearly identified where orthographical ‹ъ› occurs, even word-finally
(Figure 12).

Figure 12: Creaky voicing (the irregular voicing at the time-point indicated with arrows) in the
word voze” ‹возеъ› ‘founder’ produced in isolation by a male academic.

This seems to mean that producing [ʔ] or creak where ‹ъ› appears in writing
is perceived by Tajik speakers as a pronunciation norm that standard spoken
Tajik should observe. Indeed, a textbook for Tajik secondary education contains
a pronunciation exercise for [ʔ] (Kamoliddinov 2007: 13). This explains why a
number of newsreaders, in reading words whose orthographical representations
contain the letter ‹ъ›, produce apparent glottal or laryngeal constriction even in
their very rapid newsreading. It therefore seems possible to argue for the phonemic
status of [ʔ] in standard Tajik based on the existence of this pronunciation
norm.
2.3.2.2 Facts in favour of [ʔ]
There are also facts that pose problems for the argument that [ʔ] is a phoneme in
standard Tajik.
First, the ostensible (orthographical) “minimal pairs” involving ‹ъ› such as
the ones listed in § 2.3.2 invariably involve words of Arabic origin. Therefore, if one
confines oneself to describing standard Tajik phonology as it is represented in the
native stratum of the Tajik lexicon, none of these pairs would need to be acknowledged
as a minimal pair. Second, according to Rastorgueva (1992: 11), while ‹ъ› is
realized as [ʔ] in the intelligentsia’s careful pronunciation, it can otherwise disappear
altogether in the word-final position or lengthen the vowel preceding it when it
is followed by a consonant.54 Finally, glottal constriction has a non-phonemic use
in Tajik, at least in its present-day standard spoken variety. In standard spoken
Tajik today, creak or [ʔ] resulting from varying degrees of glottal or laryngeal
constriction occurs optionally before the word-initial vowel, apparently for boundary
marking.55
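Rastorgueva’s (1992: 11) description of ‹ъ›, as summarized in this section, can be laid out as a small decision procedure. The sketch below is illustrative only: the function name `describe_hard_sign`, the returned labels, and the vowel-letter set are hypothetical, and the intervocalic case is left unspecified because the description is silent on it (footnote 54).

```python
# Vowel letters of the Tajik Cyrillic alphabet (used only to classify context).
VOWEL_LETTERS = set("аеёиӣоуӯэюя")


def describe_hard_sign(word, careful):
    """Describe how the first ‹ъ› in `word` is realized, following the summary above."""
    position = word.find("ъ")
    if position == -1:
        return "no ‹ъ› in word"
    if careful:
        return "[ʔ]"
    before = word[position - 1] if position > 0 else ""
    after = word[position + 1] if position + 1 < len(word) else ""
    if after == "":
        return "dropped word-finally"
    if after not in VOWEL_LETTERS:
        return "lengthens preceding vowel"
    if before not in VOWEL_LETTERS:
        return "not pronounced; marks syllable division"
    return "unspecified (intervocalic)"


if __name__ == "__main__":
    for word in ["навъ", "баъд", "фаъол"]:
        print(word, describe_hard_sign(word, careful=False))
```

The labels restate the description rather than model actual speech, in which, as this section goes on to show, glottal constriction also has a non-phonemic use.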
For example, observe in Figure 13 the presence of creak at around the 0.8
second time-point that marks the word boundary between joma-aš-ro (padded
robe-3sg-acc) ‘his/her padded robe’ and az ‘from’.56
Figure 13: An example of creaky voicing that precedes the word-initial vowel of az ‘from’ in the
phrase . . . jomaašro az tan . . . read by F71.57

Importantly, the occurrence of boundary-marking creak or [ʔ] is optional
in standard spoken Tajik. The optionality of boundary-marking with creak/[ʔ]
is obvious in Figures 14 and 15. Observe Figure 14, in which creak/[ʔ] precedes
each of oftob /oftob/ ‘sun’, az /az/ ‘from’, and ū /ɵ/ ‘s/he’, and compare it with
Figure 15, where creak/[ʔ] precedes only az.58 The utterances whose spectrograms
are shown in Figures 14 and 15 could therefore be transcribed broadly as [(ki)
ʔoftob ă̰ az ʔɵ] and [ki oftob ă̰ az ɵ], respectively. The optional occurrence of [ʔ]
at word boundaries leaves the phonemic status of [ʔ] at the word-final position
such as the one in ‹возеъ› voze” ‘founder’ open to question, because word-final
[ʔ] can coincide with boundary-marking [ʔ] whose occurrence is optional. In other
words, word-finally, the glottal stop cannot reliably produce minimal pairs. The
optional occurrence of [ʔ] at word boundaries also means that [ʔ] cannot reliably
serve as a phoneme word-initially either. However, this fact does not in itself count
against identifying [ʔ] as a phoneme, because the letter ‹ъ› never appears in
the word-initial position (Xaskašev 1985: 22) and the (orthographical) “minimal
pairs” mentioned earlier do not contain words starting with ‹ъ›.

Figure 14: The passage . . . (ki) Oftob az ū . . . read by F71.

Figure 15: The passage . . . ki Oftob az ū . . . read by a female radio announcer at Sadoi Dušanbe.

2.3.2.3 Summary
In sum, there are grounds for establishing [ʔ] as a phoneme in standard Tajik,59
if one admits that Arabic loanwords are a constituent part of the Tajik lexicon
on which to base a phonological description. However, if one identifies [ʔ] as
a phoneme in Tajik, the distribution of the phoneme would be limited to the
word-medial position in standard Tajik, in which non-phonemic glottal/laryngeal
constrictions such as creak and [ʔ] optionally occur at word boundaries.

54 Rastorgueva (1992) is silent about what happens to [ʔ] in the intervocalic position within a
word.
55 A similar use of the glottal stop as a boundary marker has been attested in a number of other
languages such as Dutch, English, and Finnish (Jongenburger and van Heuven 1991; Umeda
1978; Lennes et al. 2006).
56 Note also that no fall in intensity is observable at around the 0.4 second time-point, where
there is a morpheme boundary between joma and -aš, suggesting that boundary-marking creak
or [ʔ] occurs (optionally) not within, but between words.
57 The phrase is an excerpt from a Tajik translation of North Wind and the Sun (Appendix 5).
58 The phrase is an excerpt from a Tajik translation of North Wind and the Sun (Appendix 5). In
Figures 14–15, it is preceded by ki /ki/ ‘that’.
59 There may be fewer grounds for establishing [ʔ] as a phoneme in careless (non-standard)
speech. The inclusion of such entries as faol ‹фаол› ‘active’ as an alternative spelling for fa”ol
‹фаъол› ‘active’ in a dictionary (Mamatov et al. 2005) may also imply the phonemically
ambiguous status of [ʔ] in non- or less standard Tajik.

2.3.3 The status of [ʒ]
As is clear from the preceding subsections, Russian and Arabic loanwords are
mainly responsible for some grammarians and linguists’ inclusion of [t͡s] and
[ʔ] in the standard Tajik consonant phoneme inventory. However, limiting one’s
attention to the native lexicon does not remove all inconsistencies among the standard Tajik consonant phoneme inventories put forward by different
Tajik linguists; [ʒ], another consonant whose phonemic status is ambiguous, has
its place in some grammarians’ Tajik consonant inventories primarily by virtue
of its occurrence in words native to Tajik. The main reasons why its phonemic
status is ambiguous have to do with the rarity of its occurrence and the absence
of minimal pairs involving [ʒ].
[ʒ] has a unique orthographic representation as ‹ж› in the Cyrillic-based
Tajik alphabet. However, the number of native Tajik words whose orthographic
representations contain ‹ж› is limited. Moreover, in the phonetic realizations of
such native Tajik words, [ʒ] is not always present, because in the majority of them
‹ж› is read not as [ʒ] but as /d͡ʒ/. For instance, many Tajik speakers read ‹жола›
žola ‘hail’ not as [ʒola] but as /d͡ʒola/. Such affricativization of [ʒ] among Tajik
speakers is not a recent phenomenon; as early as 1930, a contributor to rahвari
doniş, a journal which served as a forum for discussion about issues related to
Tajik orthography, maintained that Persian ž ‹ژ›, representing [ʒ], was not used
in the language of contemporary Tajiks (‘Ayn 1930: 19). They endorsed a Russian
orientalist’s proposal to remove from the Latin-based Tajik orthography the letter
‹ƶ› representing the voiced fricative and to use ‹ç› representing a voiced affricate
in its stead,60 on the grounds that Tajiks replaced ž ‹ژ› representing [ʒ] with j ‹ج›
representing [d͡ʒ] in reading such words as žāle ‹ژاله› ‘hail’ and aždar ‹اژدر› ‘dragon’.61 (There is some evidence that the voiced postalveolar fricative in Tajik was
palatal like [ʑ] until the mid-20th century, hence the arrow in [ʒ]←[ʑ] in Table 4
[see § 2.3.4].)
In fact, the alternation of the fricative with the affricate has been repeatedly
noted in many works on Tajik phonology (e.g., Zarubin 1928: 106; Sokolova 1949:
86; Xaskašev 1983: 83; 1985: 38). The reason for such affricativization of [ʒ] is
unclear, though one can suspect an influence exerted on Tajik from Uzbek, which
has a voiced postalveolar affricate but natively lacks [ʒ] as a phoneme. In any
case, the occurrence of [ʒ] in Tajik is obviously limited.
Another fact that potentially undermines the phonemic status of [ʒ] is the
apparent lack of minimal pairs involving [ʒ]. This is inevitable considering the
limited occurrence of [ʒ] in Tajik, but in the absence of minimal pairs involving it, [ʒ]
cannot be straightforwardly claimed to have a phonemic status.
On the other hand, there are also some facts that potentially support the
establishment of [ʒ] as a phoneme in Tajik. For example, there exist words in
which [ʒ] appears to be relatively resistant to affricativization. They include such
Russian loanwords as žurnalist ‘journalist’, and a few specific native words such
as každum ‹каждум› ‘scorpion’ and (despite the aforementioned statement from
a contributor to rahвari doniş) aždar ‹аждар› ‘dragon’.62 Perhaps more importantly, today’s standard spoken Tajik is prone to “spelling pronunciation”. As a
result, in broadcasts, [ʒ] tends to occur wherever ‹ж› occurs in writing, despite the
general tendency among many Tajik speakers to affricativize it to /d͡ʒ/. In other
words, a pronunciation norm seems to exist according to which every ‹ж› is read
as [ʒ] in careful pronunciation. These facts point to the phonemic status of [ʒ] in
standard spoken Tajik.

60 The Russian orientalist (Nabavī et al. 2007: 704) who is referred to in their article simply as
‹سوخه‌روا› is probably Ol’ga Aleksandrovna Suxareva (1903–1983).
61 The contributor in fact writes that Tajiks write ‹اژدر› aždar ‘dragon’ and read it as ‹اجدهار›
ajdahār.
In summary, the phonemic status of [ʒ] is ambiguous, with arguments both for and against it. On the one hand, it
occurs infrequently and is prone to affricativization in spoken Tajik; on the
other hand, unlike [t͡s] and [ʔ], which are foreign in origin, [ʒ] is a consonant that
is native to Tajik and tends not to undergo affricativization in standard spoken
Tajik, which inclines towards spelling pronunciation.

2.3.4 The status of [ɕ]
There are records indicating that the voiceless postalveolar fricative phoneme in standard
Tajik was not /ʃ/ but /ɕ/ until the mid-20th century. Admittedly, the view that this
fricative changed from /ɕ/ to /ʃ/ in standard Tajik over the last half century
is not the received wisdom of Tajik linguistics. However, the
change in question can be inferred rather straightforwardly from a number of
descriptions of the Tajik postalveolar fricative published in the 20th century.
On the basis of her analysis of data collected in 1927 from various Tajik dialects including Samarkandi Tajik, Orfinskaja (1945: 100) concludes that the Tajik
voiceless postalveolar fricative is always pronounced with palatalization and
is articulated “dorsally (dorsal’no)”. This description strongly suggests that the
postalveolar fricative in Tajik was alveolo-palatal in 1927. Incidentally, Orfinskaja
(1945: 100) also notes that both the voiced and voiceless affricates are palatalized
in the Tajik of 1927; hence the arrows (/t͡ʃ/←/t͡ɕ/ and /d͡ʒ/←/d͡ʑ/) in Table 4.
The identification of [ɕ] as the phonetic realization of the Tajik voiceless
postalveolar fricative consonant phoneme was apparently duly transferred into
standard spoken Tajik. This is evident in some descriptions of (standard) Tajik
published in the mid-20th century. For example, Rastorgueva (1992: 9; the original Russian was published in 1954: 533–534) points to the raising of the front and
middle parts of the tongue towards the palate that takes place as part of the articulation for the postalveolar fricatives in standard Tajik. She also writes that the
Tajik postalveolar fricatives “are distinguished from the corresponding Russian
sounds by greater softness”, which is in agreement with Sokolova’s (1949: 88–89)
description of the Tajik postalveolar fricatives.63 In addition, palatograms presented in Rastorgueva (1955: 47–48) indicate that their articulation involves the
formation of a much narrower palatal channel than observed in the articulation
of Russian /ʃ/, which is characterized by a lack of palatalization (Skalozub 1963;
Wade 1992: 3; Yanushevskaya and Bunčić 2015).64 These descriptions indicate
that the standard Tajik voiceless postalveolar fricative in the mid-20th century
was “softer” than Russian /ʃ/ and was palatal like [ɕ].

62 It remains to be seen whether the apparent decrease in the number of Russian loanwords
used in broadcasts in favour of native Tajik words will affect the status of [ʒ]. For example, broadcast programmes seem to give precedence to rūznomanigor ‘journalist’ over žurnalist ‘journalist’,
which are a native Tajik word and a loanword from Russian, respectively.

In contrast, “softness” is generally not mentioned in post-mid-20th-century
descriptions of the standard Tajik voiceless postalveolar fricative,65 nor is the fricative in question pronounced as [ɕ] in standard spoken Tajik today. The Newsreaders and Announcers (§ 2.1) on average do not pronounce the voiceless postalveolar fricative “soft”; they pronounce it as [ʃ], rather than as [ɕ]. Some acoustic
measurements exist that are compatible with this observation. The first spectral
moment (centre of gravity, COG) calculated from the noise spectra of /ʃ/ produced by
the Newsreaders and Announcers falls in the same general range as that calculated from the noise spectra of Russian /ʃ/. For instance, the average COG value
for Tajik /ʃ/ in šast ‘sixty’ produced by the male Newsreaders and Announcers
differs by only 34 Hz from that for Russian /ʃ/ in šali ‘shawls’ produced by male
speakers of standard Russian (Ido 2019; Kochetov 2017: 345).66

63 Buzurgzoda’s (1940: 47) description of the Tajik postalveolar fricatives, which he represents
as ‹ş› and ‹ƶ› using letters from the Tajik Latin-based alphabet of the early 20th century (Appendix
2), differs somewhat from those of Rastorgueva and Sokolova. With such a statement as nūgi
zaвon andak вardoşta şuda, вa milk nazdik meşavad ‘the tongue tip is raised a little and approaches the alveolar ridge’, his description may appear more in line with the articulation of [ʃ]
or [ʒ]. However, given that phonetic descriptions in the Tajik linguistic literature are frequently
indiscriminate in their application of the expression “the tip of the tongue” to both the tip of the
tongue and the front of the tongue, Buzurgzoda’s description does not necessarily contradict that
of Rastorgueva and Sokolova. This said, since sibilant fricatives have previously been reported to
be variable in articulatory gestures between speakers, at any rate in English (Fletcher and Newman 1991: 856), the difference may be attributed in part to inter-speaker variability.
64 Alternative transcriptions for the Russian (non-palatalized) voiceless postalveolar fricative
include “ʂ” and “š” (e.g., Kortlandt 1973; Zygis 2003; Kochetov 2017).
65 One exception is the textbook written by Arzumanov and Sanginov (1988: 122) who do mention softness in the phonetic realization of the fricative in question. However, the section (Arzumanov and Sanginov 1988: §8) in which this description appears seems to be a reformatted version of
a similar section in Arzumanov and Džalalov (1969: 118). The mention, therefore, likely belongs
not to the 1980s but to the 1960s.

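
As an illustration of the measure mentioned above, the first spectral moment (centre of gravity) of a spectrum can be computed as a weighted mean of frequency. The sketch below is not the procedure behind the Tajik and Russian measurements, which the text does not specify. It assumes power weighting (`power=2.0`), a plain discrete Fourier transform, and a synthetic test signal; all function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies in Hz, magnitudes) of the one-sided DFT of `samples`."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        # Naive DFT for bin k; adequate for short illustrative signals.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def centre_of_gravity(freqs, mags, power=2.0):
    """First spectral moment: mean frequency weighted by |X(f)| ** power."""
    weights = [m ** power for m in mags]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum contains no energy")
    return sum(f * w for f, w in zip(freqs, weights)) / total


def tone(freq_hz, sample_rate, n):
    """Synthetic sinusoid, used only to exercise the functions above."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


# Hypothetical signal: equal-amplitude components at 1000 Hz and 3000 Hz.
SAMPLE_RATE = 8000
N = 64
signal = [a + b for a, b in zip(tone(1000, SAMPLE_RATE, N),
                                tone(3000, SAMPLE_RATE, N))]
freqs, mags = magnitude_spectrum(signal, SAMPLE_RATE)
cog_hz = centre_of_gravity(freqs, mags)
```

With two equal-amplitude components, the weighted mean falls midway between them. Two fricatives whose noise spectra yield similar values on this measure, as reported above for Tajik and Russian /ʃ/, concentrate their energy in a similar frequency region.
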
To sum up, an analysis of existing descriptions of the postalveolar fricative
consonant phoneme in Tajik points to a sound change which took place in standard Tajik after the mid-20th century. As a result of the change which shifted /ɕ/
to /ʃ/, the phonetic realization of the voiceless postalveolar fricative is no longer
“palatal” or “soft” in present-day standard Tajik.

2.3.5 Voice Onset Time
Voice Onset Time (VOT) seems to have attracted limited attention in Tajik linguistics. As a result, there is not a large body of research to review regarding
VOT in standard Tajik either.67 Accordingly, this subsection presents a pilot
study of standard Tajik VOT, which measures the VOT values of word-initial /p
b t d k ɡ q/, as well as one word-medial /p/ as they are produced by the Newsreaders (§ 2.1).
The VOT values of the word-initial plosives /p b t d k ɡ q/ in pur /puɾ/ ‘full’,
bur /buɾ/ ‘cut!’, tar /taɾ/ ‘wet’, dar /daɾ/ ‘door’, kūr /kɵɾ/ ‘blind’, gūr /ɡɵɾ/
‘tomb’, qūr /qɵɾ/ ‘embers’, and the word-medial plosive in sipas /sipas/ ‘then’
are shown in Figure 16, in which each data point represents one VOT measurement.68 Figure 16 presents VOT data of only one non-word-initial plosive,
because the Tajik speech corpus (§ 2.1) does not contain many occurrences of
word-medial plosives.
The data presented in Figure 16 suggest that voiceless plosives are aspirated word-initially in standard Tajik. In addition, the standard deviations given in Table 6 show that the length of pre-voicing in word-initial voiced plosives varies widely
among the Newsreaders.

66 Kochetov (2017) represents this phoneme as /ʂ/.
67 Ido (2014: 88–89) discusses VOT in Bukharan Tajik, but only briefly.
68 The plotting of data points was carried out in R (R Development Core Team, 2019) using the
ggplot2 package (Wickham et al. 2019). As in Abramson and Whalen (2017: Figure 9a–c), positive
VOT was measured as the interval between the onsets of release burst and glottal pulsing and
negative VOT as the interval between the onsets of voiced closure and release burst.

Figure 16: VOT in milliseconds of the plosives in /puɾ buɾ taɾ daɾ kɵɾ ɡɵɾ qɵɾ sipas/ produced by
the Newsreaders.

Table 6: Mean values and standard deviations of VOT in milliseconds of the word-initial /p, b, t,
d, k, ɡ, q/ and intervocalic /p/ in test words produced by the Newsreaders.

            Word-initial, voiceless             Word-initial, voiced
                  Word     Mean    SD                 Word     Mean      SD
Bilabial    /p/   /puɾ/    66.28   10.37        /b/   /buɾ/    −108.4    30.95
Alveolar    /t/   /taɾ/    54.49   22.08        /d/   /daɾ/    −108.16   37.18
Velar       /k/   /kɵɾ/    69.94   18.05        /ɡ/   /ɡɵɾ/    −100.93   40.56
Uvular      /q/   /qɵɾ/    74.59   29.61

            Word-medial, voiceless
                  Word      Mean    SD
Bilabial    /p/   /sipas/   17.11   5.96

The small amount of data considered here precludes drawing any firm conclusions about VOT in standard Tajik, but there seems to exist conspicuous variation among the VOT values of voiced plosives produced by the Newsreaders,
despite their shared profession, namely script reading. This may be an indication
of the usefulness of VOT as a metric for identifying inter-speaker, or inter-varietal,
differences in Tajik.
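
The sign convention adopted for the measurements in this subsection, with positive VOT measured from the release burst to the onset of glottal pulsing and negative VOT from the onset of voiced closure to the release burst, amounts to a single signed difference. The following sketch is illustrative only: the time stamps are invented and are not the Newsreaders' data, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator underlies Table 6.

```python
from statistics import mean, stdev


def vot_ms(burst_onset_s, voicing_onset_s):
    """Signed VOT in milliseconds.

    Positive when glottal pulsing starts after the release burst (voicing lag),
    negative when voicing starts during closure, before the burst (pre-voicing).
    """
    return (voicing_onset_s - burst_onset_s) * 1000.0


def summarise(tokens):
    """Return (mean, sample SD) of VOT over (burst_onset_s, voicing_onset_s) pairs."""
    values = [vot_ms(burst, voicing) for burst, voicing in tokens]
    return mean(values), stdev(values)


# Invented annotations in seconds: (release burst onset, voicing onset).
hypothetical_tokens = {
    "voiceless, voicing lag": [(0.100, 0.160), (0.200, 0.270), (0.300, 0.380)],
    "voiced, pre-voiced": [(0.250, 0.150), (0.400, 0.290), (0.500, 0.380)],
}

for label, tokens in hypothetical_tokens.items():
    vot_mean, vot_sd = summarise(tokens)
    print(f"{label}: mean {vot_mean:.1f} ms, SD {vot_sd:.1f} ms")
```

Keeping the sign in a single value lets voiceless and voiced plosives share one axis, as in Figure 16, and makes the spread of pre-voicing directly comparable across speakers.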

Intonation
In Tajik, the yes/no question can be distinguished from the corresponding declarative sentence by intonation. The distinction is observable in Figure 17, which
shows the pitch contours of the interrogative Navišt? (wrote.3sg) ‘did s/he write?’
and its answer Navišt (wrote.3sg) ‘s/he did’ read aloud by the Newsreaders.

Figure 17: Pitch contours of navišt /naviʃt/ (wrote.3sg) produced as a yes/no question-answer
pair (Navišt? and Navišt) by the Newsreaders.

Figure 17 indicates that F71 uses an apparently different interrogative intonation
from the rising intonation used by M71, M82, and M86. This may suggest a need
to identify, in standard Tajik, multiple interrogative intonations corresponding to
different types of yes/no questions.69

69 For instance, Di Cristo (1998: 203) distinguishes between “Yes/No questions for confirmation
(where one specific response is expected) and Yes/No questions for information” in his description of French intonation patterns.

Wh-questions and their answers also exhibit some characteristic intonation
patterns. Figure 18, which shows the pitch contours of three question-answer
pairs produced by F71, exemplifies some such patterns. Sentences (1a–c and 2)
in which ‘accusative’ and ‘third person singular’ are abbreviated as acc and 3sg,
respectively, combine to form the three question-answer pairs shown in Figure 18.
(1) a. Kī       Sūhrob-ro    kušt?
       /ki      sɵhɾobɾo     kuʃt/
       who      Sūhrob-acc   killed.3sg
       ‘Who killed Sūhrob?’
    b. Rustam   ki-ro        kušt?
       /ɾustam  kiɾo         kuʃt/
       Rustam   who-acc      killed.3sg
       ‘Who did Rustam kill?’
    c. Kī       ki-ro        kušt?
       /ki      kiɾo         kuʃt/
       who      who-acc      killed.3sg
       ‘Who killed whom?’
(2)    Rustam   Sūhrob-ro    kušt
       /ɾustam  sɵhɾobɾo     kuʃt/
       Rustam   Sūhrob-acc   killed.3sg
       ‘Rustam killed Sūhrob’
Figure 18 shows that the interrogative pronoun kī ‘who’ induces a sequence of
a rise and a fall in f0 in the syllable immediately following it. This intonation
pattern is readily identifiable in Figure 18 Inset A, where a sharp rise and fall is
visible in the first syllable of Sūhrob, which F71 starts at a much higher f0
than that of kī. It can also be found in other questions, e.g., in the accusative
case marker -ro following kī in Figure 18 Inset B. In Figure 18 Inset C, the first kī
appears to raise f0 in the second kī, which in turn raises it further in the accusative case marker that follows it.
Figure 18 also suggests that a noun receiving focus is (unless it is followed
immediately by another focus) followed by a sharp fall in f0, for instance in the
answers in Figure 18 Insets B–C where f0 is much lower in the accusative case
marker -ro than in the noun preceding it, namely Sūhrob. Note the absence of a
similar fall between Sūhrob and -ro in the answer in Figure 18 Inset A.

Figure 18: Pitch contours of the question-answer pairs, /ki sɵhɾobɾo kuʃt/-/ɾustam sɵhɾobɾo
kuʃt/, /ɾustam kiɾo kuʃt/-/ɾustam sɵhɾobɾo kuʃt/, and /ki kiɾo kuʃt/-/ɾustam sɵhɾobɾo kuʃt/,
produced by F71.

While these observations are admittedly based on a limited number of
samples, and hence are of doubtful generalizability to standard Tajik as a whole,
they are replicated in a number of other samples not presented here and are
worthy of future investigation.

Concluding remarks
This article provided an overview of standard Tajik phonology focusing mainly on
its phonemes as well as their phonetic realizations. It looked into the historical
process through which the phoneme inventory of standard Tajik was determined
and ascribed the affinity between the phoneme inventory of standard Tajik and
that of the Northern Tajik dialects to the speech of the elite in the Tajik SSR who
were Northern Tajik dialect speakers from Bukhara, Samarkand, and Khujand (§ 1).
The present article then described the phoneme inventory of standard Tajik
based on both previous observations of Tajik phonemes and acoustic analyses
performed on present-day standard Tajik audio data (§ 2).
It discussed the phonemic status (or lack thereof) of [t͡s], [ʔ], [ʒ], [iː], and [uː], and
investigated some phonetic properties that are amenable to acoustic measurement
in standard Tajik. The investigation 1) acoustically characterized the vowel system,
2) identified a vowel chain shift, which in turn can be identified as constituting part
of the Northern Tajik Chain Shift, 3) identified pharyngealization in the allophonic
variations of /i o u/, 4) confirmed that, while historically long vowels tend to be pronounced longer in certain words under certain conditions, vowel length is hardly
phonologically distinctive, 5) suggested a diachronic sound change where the voiceless postalveolar fricative has become depalatalized, and 6) revealed some features
of intonation patterns for question-answer pairs in standard Tajik (§ 3).

Appendices

Appendix 1 The Latin-based alphabet which was adopted at a conference in
Tashkent in October 1928 (“O novom” 1928: 246). This alphabet is
the predecessor of the official Latin-based alphabet adopted at the
First Tajik Linguists’ Conference in August 1930 (Appendix 2), until
which it was in limited use (along with the Perso-Arabic alphabet)
for a few years in the nascent Tajik publishing industry centred in
Tashkent, Samarkand, and Dushanbe (Ismatī 1929).

Appendix 2 The Latin-based alphabet that was officially adopted in 1930 at the
First Tajik Linguists’ Conference held in Stalinabad. (Toşpūlotuf
1932: 17).
Appendix 3 Formant frequency values of Tajik vowels produced by the Newsreaders and Announcers.70
Females (n=5)

Males (n=7)
Mean

/i/

20 tokens

Mean

333

288

2839

163

2285

136

3515

253

2915

229

/i/

35 tokens

70 Some male Announcers produced certain vowels more or less than 4 times. As a result, the
number of tokens obtained from male informants for a vowel varies between 33 and 35.

(continued)
Females (n=5)

Males (n=7)
Mean

/e/

/a/

/o/

/u/

/ɵ/

20 tokens

Mean

482

455

2577

196

2016

101

3231

223

2644

198
76

/e/

33 tokens

898

706

1546

180

1288

2919

248

2549

173

539

494

917

101

879

114

2942

166

2576

131

391

334

766

122

776

134

2846

184

2526

159

466

401

1651

400

1421

304

2933

209

2521

160

/a/

/o/

/u/

/ɵ/

34 tokens

33 tokens

34 tokens

33 tokens

Appendix 4 The means and standard deviations of the lengths in milliseconds
of /i/ and /u/ in the word-initial syllables of twelve test words. In
the table, “n.”, “4 news”, “8 annc”, “Fajzov”, “Hist.”, “Str.”, and
“Open” represent “number of tokens”, “four Newsreaders”, “eight
Announcers”, “data taken from Fajzov (1983)”, “historical length”,
“(syllable is) stressed”, and “(syllable is) open”, respectively. Historically long vowels are marked with [ː].
Word

Hist.

Str.

Open

4 news
n.

8 annc

Mean

Fajzov
SD n. Mean

/bi[ː]no/

long

yes

129

100

120

/bino/

short

yes

100

/du[ː]ɾ/

long

yes

225

213

/duɾ/

short

yes

222

206

/ʃifo/

short

yes

/ʃi[ː]ʃa/

long

yes

108

/ʃiʃtan/

short

20
(continued)

(continued)
Word

Hist.

Str.

Open

4 news
n.

8 annc

Mean

Fajzov
SD n. Mean

/sukut/

short

yes

/su[ː]ɾat/

long

yes

110

107

/suɾχ/

short

yes

144

112

/suɾχak/

short

/suɾud/

short

yes

Appendix 5 Tajik translation of The North Wind and the Sun71
Bodi Šimol va Oftob bo ham bahs mekardand, ki kadome az onho zūrtar ast. Hangomi bahsi
onho musofire omad, ki ba jomai garme pečida bud. Bodi Šimol va Oftob ba qarore omadand,
ki har kase, ki peštar jomai on musofirro az tanaš berun kunad, hamon zūrtar donista
mešavad. Sipas, Bodi Šimol bo tamomi quvvataš vazid. Ammo, har qadare ū saxttar mevazid,
hamon qadar musofir xudro beštar ba joma mepečonid. Dar oxir, Bodi Šimol az bahri in kor
baromad. Sipas, Oftob duraxšid va musofir az garmī zud jomaašro az tan berun kard. Hamin
tariq, Bodi Šimol majbur šud to iqror kunad, ki Oftob az ū zūrtar ast.
Orthographic version
Боди Шимол ва Офтоб бо ҳам баҳс мекарданд, ки кадоме аз онҳо зӯртар аст. Ҳангоми
баҳси онҳо мусофире омад, ки ба ҷомаи гарме печида буд. Боди Шимол ва Офтоб ба
қароре омаданд, ки ҳар касе, ки пештар ҷомаи он мусофирро аз танаш берун кунад,
ҳамон зӯртар дониста мешавад. Сипас, Боди Шимол бо тамоми қувваташ вазид.
Аммо, ҳар қадаре ӯ сахттар мевазид, ҳамон қадар мусофир худро бештар ба ҷома
мепечонид. Дар охир, Боди Шимол аз баҳри ин кор баромад. Сипас, Офтоб дурахшид
ва мусофир аз гармӣ зуд ҷомаашро аз тан берун кард. Ҳамин тариқ, Боди Шимол
маҷбур шуд то иқрор кунад, ки Офтоб аз ӯ зӯртар аст.

References
‘Ayn. 1930. оvozhoji lahçaji toçikī [Sounds of the Tajik dialect]. rahвari doniş January 1930.
18–19. (Reprinted in Nabavī et al. eds. 2007. 578–583).
A”lozoda, F. 1930. Boz ham dar atrofi mas”alai alifboi navi tojikī [A reprise on the issue of the
new Tajik alphabet]. rahвari doniş March 1930. 5–7. (Reprinted in Nabavī et al. eds. 2007.
696–602).

71 I thank Zubaidullo Ubaidulloev for his translation of the fable into Tajik.

Abazov, Rafis. 2008. The Palgrave concise historical atlas of central Asia. New York: Palgrave
MacMillan.
Abramson, A. S. & D. H. Whalen. 2017. Voice Onset Time (VOT) at 50: Theoretical and practical
issues in measuring voicing distinctions. Journal of Phonetics 63. 75–86.
Alifboi navi tojikī. Samarqand: Našriyoti Kumitai Alifboi Navi Tojikī. 1928.
Arlund, Pam & Neikramon Ibrukhim. 2013. A Chinese Tajik reader: An introduction to Sarikoy
(Sarikol) Tajik. Grandview: All Nations Publishing.
Arzumanov, Stepan Džavodovič & Ahmadžon Sanginov. 1988. Zaboni tojikī [the Tajik language].
Dushanbe: Maorif.
Arzumanov, Stepan Džavodovič & Obid Džalalovič Džalalov. 1969. Zaboni tojikī [the Tajik
language]. Dushanbe: Irfon.
Asimova, B. S. 1982. Jazykovoe stroitel’stvo v Tadžikistane (1920–1940 gg.) [Language
construction in Tajikistan (1920–1940)]. Dushanbe: Doniš.
Aynī, Sadriddin. 1928. Zaboni tojikī [the Tajik language]. rahвari doniş November–December
1928. (Reprinted in Nabavī et al. eds. 2007. 318–327).
Azizī, Bahriddin. 1928. Ba zaboni darī durri suxan suftan mexoham [I want to compose poetry in
Dari]. tojikistoni surx 28 December 1928. (Reprinted in Nabavī et al. eds. 2007. 357–363).
Baqozoda, Hamid. 1930. Dar girdi alifboi navi tojikī [About the new Tajik alphabet]. оvozi tojik
24 January 1930. (Reprinted in Nabavī et al. eds. 2007. 517–519).
Baroi tayyorī ba kanferensiyai ilmii Istalinobod [In preparation for the Stalinabad scientific
conference]. rahвari doniş June 1930. 1–4. (Reprinted in Nabavī et al. eds. 2007. 629–635).
Becker-Kristal, Roy. 2010. Acoustic typology of vowel inventories and Dispersion Theory:
Insights from a large cross-linguistic corpus. Los Angeles: University of California, Los
Angeles, dissertation.
Beeman, William O. 2010. Sociolinguistics in the Iranian world. In Martin J. Ball (ed.), The
Routledge handbook of sociolinguistics around the world, 139–148. Abingdon: Routledge.
Begbudi, Nadim Masudovič. 1956. Govor samarkandskix tadžikov [The dialect of Samarkand
Tajiks]. Moscow: Moscow State University dissertation.
Bessel, Nicola J. 1998. Local and non-local consonant-vowel interaction in Interior Salish.
Phonology 15. 1–40.
Bin-Muqbil, Musaed S. 2006. Phonetic and phonological aspects of Arabic emphatics and
gutturals. Madison: University of Wisconsin-Madison dissertation.
Bobomurodov, Šakar. 1978. Sadonoki “ū” va mavqei on dar sistemai vokalizmi zaboni
adabii tojik [The vowel ū and its place in the vocalic system of literary Tajik]. Dushanbe:
Universiteti Davlatii Tojikiston ba nomi V. I. Lenin.
Boersma, Paul & David Weenink. 2015. Praat: doing phonetics by computer. Version 6.0.04.
http://www.praat.org/ (retrieved 1 November 2015).
Buxorī. 1930. Dar girdi mas”alai zaboni adabii tojik [On the issue of literary Tajik]. оvozi tojik 13
February 1930. (Reprinted in Nabavī et al. eds. 2007. 602–606).
Buzurgzoda, L. 1940. Fonetikaji zaвoni adaвiji toçik [The phonetics of literary Tajik]. Stalinoвod;
Leningrad: Naşrijoti davlatiji Toçikiston.
Chiba, Tsutomu & Masato Kajiyama. 1941. The vowel: Its nature and structure. Tokyo: Kaiseikan.
Comrie, Bernard. 1981. The languages of the Soviet Union. Cambridge: Cambridge University
Press.
Cruttenden, Alan. 2014. Gimson’s pronunciation of English. 8th edn. London: Routledge.
Dar anjumani naxustini ilmii Tojikiston [At the first scientific conference of Tajikistan]. tojikistoni
surx 1 September 1930. (Reprinted in Nabavī et al. eds. 2007. 665–668).

Dar kengoši ilmii tojikoni ūzbakiston [At the scientific conference of Uzbekistan Tajiks]. оvozi
tojik 17 February 1930. (Reprinted in Nabavī et al. eds. 2007. 567–570).
Dehotī, Abdusalom Pirmuhammadzoda. 1930. Dar borai imlo, alifbo, istiloh va zaboni adabii
tojikī [On orthography, the alphabet, terminology, and literary Tajik]. оvozi tojik 29
December 1930. (Reprinted in Nabavī et al. eds. 2007. 672–678).
Delattre, P. 1971. Pharyngeal features in the consonants of Arabic, German, Spanish, French,
and American English. Phonetica 23. 129–155.
Di Cristo, Albert. 1998. Intonation in French. In Daniel Hirst & Albert Di Cristo (eds.), Intonation
systems: A survey of twenty languages, 198–218. Cambridge: Cambridge University Press.
Diyokuv. 1930. Dar girdi mas”alai zabon va alifbo [On the issue of the language and alphabet].
rahвari doniş July 1930. (Reprinted in Nabavī et al. eds. 2007. 642–653).
Dodikhudoeva, Leyla. 2004. The Tajik language and the socio-linguistic situation in the
mountainous Badakhshan. Iran and Caucasus 8(2). 281–288.
Efimov, Valentin Aleksandrovič. 1965. Jazyk afganskix xazara: Jakaulangskij dialekt [The
language of the Afghan Hazara: The Yakawlang dialect]. Moscow: Nauka.
Efimov, Valentin Aleksandrovič, Vera Sergeevna Rastorgueva & E. N. Šarova. 1982. Persidskij,
tadžikskij, dari [Persian, Tajik, Dari]. In Vera Sergeevna Rastorgueva et al. (eds.), Osnovy
iranskogo jazykoznanija: Novoiranskie jazyki: Zapadnaja gruppa, prikaspijskie jazyki
[Basics of Iranian linguistics: New Iranian languages: Western group, Caspian languages],
5–230. Moscow: Nauka.
Ericsdottir, Christine. 2005. Articulatory-acoustic relationships in Swedish vowel sounds.
Stockholm: Stockholm University dissertation.
Éšniyozov, Misr. 1977. Dialektologiyai tojik (Qismi yakum) [Tajik dialectology (Volume 1)].
Dushanbe: Universiteti Davlatii Tojikiston ba nomi V. I. Lenin.
Fajzov, Maxram. 1983. K voprosu o količestvennoj xarakteristike glasnyx v sovremennom
tadžikskom literaturnom jazyke [On the question of the quantitative characteristics of
vowels in modern literary Tajik]. Voprosy jazykoznanija 1983(5). 59–69.
Fajzov, Maxram. 1985. Tadžikskoe literaturnoe proiznošenie [Literary Tajik pronunciation].
Dushanbe: Doniš.
Fitrat, Abdurauf. 1927. Loihai alifboi navi tojikī [The new Tajik alphabet project]. rahвari doniş
March 1927. (Reprinted in Nabavī et al. eds. 2007. 105–111).
Fitrat, Abdurauf. 1928a. Dar girdi alifboi toza [On the new alphabet]. rahвari doniş April–May
1928. 13–16. (Reprinted in Nabavī et al. eds. 2007. 155–162).
Fitrat, Abdurauf. 1928b. Dar girdi alifboi nav [On the new alphabet]. rahвari doniş 10(13). 8–10.
(Reprinted in Nabavī et al. eds. 2007. 215–222).
Flemming, Edward, Peter Ladefoged & Sarah Thomason. 2008. Phonetic structures of Montana
Salish. Journal of Phonetics 36. 465–491.
Fletcher, Samuel G. & Dennis G. Newman. 1991. [s] and [ʃ] as a function of linguapalatal contact
place and sibilant groove width. Journal of the Acoustical Society of America 89(2).
850–858.
Fujisaki, Hiroya & Ki-ichi Hasegawa. 1983. Kyōtsūgo boin ni okeru kojinsa oyobi hōgensa
[Inter-speaker and inter-dialectal differences in the pronunciation of standard Japanese
vowels]. Proceedings of the Meeting of Acoustical Society of Japan 1983(3). 185–186.
Fujisaki, Hiroya, Hiroyoshi Morikawa & Ki-ichi Hasegawa. 1983. Nihongo kyōtsū boin no
tokuchō to sono hendō [The characteristics of standard Japanese vowels and their
variation]. Transactions of the Committee on Speech, Acoustical Society of Japan S83(32).
245–252.

Fulop, Sean A. & Ron Warren. 2014. An acoustic analysis of advanced tongue root harmony in
Karaja. Proceedings of Meetings on Acoustics 21(1).
Fulop, Sean A., Ethelbert Emmanuel Kari & Peter Ladefoged. 1998. An acoustic study of the
tongue root contrast in Degema vowels. Phonetica 88. 80–98.
Gacek, Tomasz. 2012. Some remarks on the pronunciation of Russian loanwords in Tajik. Studia
Linguistica Universitatis Iagellonicae Cracoviensis 129 supplementum. 353–361.
Grassi, Evelin. 2018. One country, two (official) languages: Remarks on Pashto-Dari coexistence
in Afghanistan and Tajik-Russian coexistence in Tajikistan (20th–21st centuries). Cahier de
Studia Iranica 61. 205–222.
Gromatovič, K. D. 1930. Kratkoe rukovodstvo po izučeniju uzbekskogo jazyka dlja
kratkovremennyx kursov vzroslyx evropejcev, služaščix sovetskix učreždenij Uzbekskoj SSR
[A concise guide to learning Uzbek for short courses for European adults and employees at
Soviet institutions of the Uzbek SSR]. 2nd edn. Tashkent: Pravda Vostoka.
Guboglo, Mikhail. 1990. Demography and language in the capitals of the Union Republics.
Journal of Soviet Nationalities 1(4). 1–42.
Halimzoda. 1929. Dar borai imloi muvaqqatii nav [On the new provisional orthography].
tojikistoni surx 26 July 1929. (Reprinted in Nabavī et al. eds. 2007. 466–468).
Harrel, Richard S. 1957. The phonology of colloquial Egyptian Arabic. New York: American
Council of Learned Societies.
Henton, C. G. 1983. Changes in the vowels of received pronunciation. Journal of Phonetics 11(4).
353–371.
Hughes, Arthur, Peter Trudgill & Dominic Watt. 2013. English accents & dialects. 5th edn.
London: Routledge.
İdo, Shinji. 2002. Şimdiki Buharalı gençlerin Tacikçesinin sözdizimsel ve şekilbilgisel özellikleri
[Syntactic and morphological characteristics of the Tajik of today’s young Bukharans]. İlmî
Araştırmalar 13. 51–65.
Ido, Shinji. 2012. Tajikugo bunpō binran [A handbook of Tajik grammar]. Sendai: Tohoku
University Press.
Ido, Shinji. 2014. Bukharan Tajik. Journal of the International Phonetic Association 44(1). 87–102.
Ido, Shinji. 2015. New Persian vowels transcribed in Ming China. Cahier de Studia Iranica 57.
99–136.
Ido, Shinji. 2016. A late 19th-century Uzbek text in Hebrew script. Turkic Languages 20. 216–233.
Ido, Shinji. 2017. The vowel system of Jewish Bukharan Tajik: With special reference to the Tajik
vowel chain shift. Journal of Jewish Languages 5. 81–103.
Ido, Shinji. 2018. Formant frequency values of vowels produced by ‘Iranians’ in Bukhara. In Rohi
abrešim va robitahoi baynifarhangii Avruosiyo / Šelkovyj put’ i evrazijskie mežkul’turnye
otnošenija / Silk Road and Eurasian transcultural relations, 16–19. Dushanbe: Avicenna
Tajik State Medical University.
Ido, Shinji. 2019. A spectral analysis of the voiceless postalveolar fricative in two varieties of
Tajik. Manuscript in preparation.
International Phonetic Association. 1999. Handbook of the International Phonetic Association.
Cambridge: Cambridge University Press.
Ismatī, Obid. 1929. Yak nigoh ba matbuoti inqilobii tojikī [A look at the Tajik revolutionary
press]. rahвari doniş September 1929. 3–5.
Ismatī, Obid. 1930. Dar girdi zabon, imlo va alifboi navi tojikī [On language, orthography, and
the new Tajik alphabet]. оvozi tojik 9 February 1930. (Reprinted in Nabavī et al. eds. 2007.
522–526).

Jamolxonov, Hasanboy. 2009. O‘zbek tilining nazariy fonetikasi [The theoretical phonetics of
Uzbek]. Tashkent: Fan.
Jamolxonov, Hasanboy & Qalandar Sapaev. 2007. Imlo muammolari [Problems of orthography].
Tashkent: Nizomiy nomidagi Toshkent Davlat Pedagogika Universiteti.
Jongenburger, Willy & Vincent J. van Heuven. 1991. The distribution of (word initial) glottal stop
in Dutch. Linguistics in the Netherlands 8(1). 101–110.
Kalinovsky, Artemy M. 2018. Laboratory of socialist development: Cold War politics and
decolonization in Soviet Tajikistan. Ithaca: Cornell University Press.
Kalontarov, Ya. I. 1974. Luǧati imloi zaboni adabii tojik [An orthographical dictionary of literary
Tajik]. Dushanbe: Irfon.
Kamoliddinov, Bahriddin. 2007. Zaboni tojikī: Kitobi darsī baroi sinfi 11 maktabi tahsiloti umumī
[The Tajik language: A textbook for general educational school 11th graders]. Dushanbe:
Sobiriyon.
Karimov, Hilol. 1973. Fonetika [Phonetics]. In B. Niyozmuhammadov (ed.), Zaboni adabii hozirai
tojik (Qismi 1 Leksikologiya, fonetika va morfologiya): Kitobi darsī baroi fakul’tethoi
filologiyai maktabhoi olī [Modern literary Tajik (Volume 1 lexicology, phonetics, and
morphology): A textbook for faculties of linguistics at schools of higher education],
83–108. Dushanbe: Irfon.
Karimov, Hilol. 1982. Fonetika [Phonetics]. In Šarofiddin Rustamov (ed.), Zaboni adabii hozirai
tojik, Qismi 1: Kitobi darsī baroi fakul’tethoi filologiyai maktabhoi olī [Modern literary
Tajik, Volume 1: A textbook for faculties of linguistics at schools of higher education],
76–101. Dushanbe: Maorif.
Kasymov, Shavkat. 2013. Regional fragmentation in Tajikistan: The shift of powers between
different identity groups. Asian Geographer 30(1). 1–20.
Kerimova, Aza Alimovna. 1959. Govor tadžikov buxary [The dialect of Bukhara Tajiks]. Moscow:
Izdatel’stvo vostočnoj literatury.
Kerimova, Aza Alimovna. 1995. Ob osnovnyx processax razvitija sovremennogo tadjikskogo
literaturnogo jazyka [The basic process of the development of modern literary Tajik].
Voprosy jazykoznanija 1995(3). 118–126.
Kerimova, Aza Alimovna. 1997. Tadžikskij jazyk [The Tajik language]. In Vera Sergeevna
Rastorgueva, Vjačeslav Vladimirovič Moškalo & Džoj Iosifovna Édel’man (eds.), Iranskie
jazyki I. Jugo-zapadnye jazyki [Iranian languages I: Southwestern languages], 96–120.
Moscow: Indrik.
Khalid, Adeeb. 2015. Making Uzbekistan: Nation, empire, and revolution in the early USSR.
Ithaca: Cornell University Press.
Kochetov, Alexei. 2017. Acoustics of Russian voiceless sibilant fricatives. Journal of the
International Phonetic Association 47(3). 321−348.
Kononov, Andrej Nikolaevič. 1960. Grammatika sovremennogo uzbekskogo literaturnogo jazyka
[A grammar of modern literary Uzbek]. Moscow: Izdatel’stvo Akademii Nauk SSSR.
Kortlandt, F. H. H. 1973. Phonetics and phonemics of standard Russian. Tijdschrift voor
Slavische Taal- en Letterkunde 2. 73–83.
Kumitai markazii alifboi navi tojikī. 1929. Ba diqqati ommai muallimon, muallif parvaron va
tarafdoroni alifboi navi tojikī! [To the attention of the cohort of teachers, fans of writers,
and proponents of the new Tajik alphabet!]. rahвari doniş December 1929. 5.
Ladefoged, Peter. 1967. Three areas of experimental phonetics. London: Oxford University Press.
Lennes, Mietta, Eija Aho, Minnaleena Toivola & Leena Wahlberg. 2006. On the use of the
glottal stop in Finnish conversational speech. In Reijo Aulanko, Leena Wahlberg & Martti
Vainio (eds.), Fonetiikan päivät 2006 / The phonetics symposium 2006, 93–102. Helsinki:
Hakapaino Oy.
Lewin, Christopher. 2018. The vowel /əː/ ao in Gaelic dialects. Papers in Historical Phonology 3.
158–179.
Lindblom, Björn & Johan Sundberg. 2014. The human voice in speech and singing. In Thomas
D. Rossing (ed.), Springer handbook of acoustics, 703–746. 2nd edn. Berlin: Springer.
Lohutī, Abulqosim. 1928. Dar girdi loihai alifboi navi tojikī [On the new Tajik alphabet project].
rahвari doniş January–February 1928. (Reprinted in Nabavī et al. eds. 2007. 146–149).
Maclagan, Margaret & Jennifer Hay. 2007. Getting fed up with our feet: Contrast maintenance and
the New Zealand English “short” front vowel shift. Language Variation and Change 19. 1–25.
Maddieson, Ian. 2005. Uvular consonants. In Martin Haspelmath, Matthew S. Dryer, David Gil
& Bernard Comrie (eds.), The world atlas of language structures, 30–33. Oxford: Oxford
University Press.
Mamatov, Jahangir, S. J. Harrell, Kathy Kehoe & Karim Khodjibaev (eds.). 2005. Tajik-English
dictionary. Springfield: Dunwoody Press.
Maniyozov, Abduqodir & Abdusattor Mirzoev. 1991. Luǧati imlo [A dictionary of orthography].
Dushanbe: Maorif.
McCloy, Daniel R. 2016. phonR: Tools for phoneticians and phonologists. R package version 1.0-7.
Melex, N. A. 1960. Tadžikskie govory i ix rasprostranenie [Tajik dialects and their distribution].
Vestnik Leningradskogo universiteta 14. 149–151.
Melex, N. A. 1968. Gižduvanskij govor tadžikskogo jazyka [The Gʻijduvon dialect of Tajik].
Leningrad: Leningrad State University dissertation abstract.
Miller, Corey. 2012. Variation in Persian vowel systems. Orientalia Suecana 61. 156–169.
Munzim, Mirzo Abdulvohidi. 1928. Zaboni buxoriyon tojikī yo ūzbakī [Is the language of
Bukharans Tajik or Uzbek?]. tojikistoni surx 14 December 1928. (Reprinted in Nabavī et al.
eds. 2007. 380–383).
Nabavī, Abduxoliqi. 2007. Bahshoi ilmī va mafkuravī oid ba zaboni tojikī dar solhoi 20-um
[Academic and ideological debates on Tajik in the 1920s]. In Abduxoliqi Nabavī,
Nurmuhammad Odinaev & Parvin Olimova (eds.), Zaboni tojikī dar mabnoi mubohisaho:
Majmūai maqolahoi solhoi 20-um [Tajik in the context of debates: A collection of articles
from the 1920s], 9–38. Dushanbe: Irfon.
Nabavī, Abduxoliqi, Nurmuhammad Odinaev & Parvin Olimova (eds.). 2007. Zaboni tojikī dar
mabnoi mubohisaho: Majmūai maqolahoi solhoi 20-um [Tajik in the context of debates: A
collection of articles from the 1920s]. Dushanbe: Irfon.
Niyazi, Aziz. 1998. Tajikistan I: The regional dimension of conflict. In Michael Waller, Bruno
Coppieters & Aleksei Malashenko (eds.), Conflicting loyalties and the state in post-Soviet
Russia and Eurasia, 145–170. London: Frank Cass.
Niyozī, Š. 1956. Fonetika. In B. N. Niyozmuhammadov, Š. N. Niyozī & D. T. Tojieva (eds.),
Grammatikai zaboni tojikī (Qismi 1 fonetika va morfologiya): Kitobi darsī baroi maktabhoi
olī [A grammar of Tajik (Volume 1 phonetics and morphology): A textbook for schools of
higher education], 17–32. Stalinobod: Našriyoti Davlatii Tojikiston.
Niyozmuhammadov, B., Š. Niyozī & L. Buzurgzoda. 1955. Grammatikai zaboni tojikī:
Fonetika va morfologiya baroi maktabhoi haftsola va miyona [A grammar of Tajik: Phonetics
and morphology for seven-year and middle schools]. Stalinobod: Našriyoti Davlatii Tojikiston.
Nourzhanov, Kirill & Christian Bleuer. 2013. Tajikistan: A political and social history. Canberra:
ANU E Press.

Novák, Ľubomír. 2013. Problem of archaism and innovation in the Eastern Iranian languages.
Prague: Charles University dissertation.
O novom tadžikskom (latinizirovannom) alfavite. 1928. Izvestija obščestva dlja izučenija
tadžikistana i iranskix narodnostej za ego predelami 1. 242–247.
Odilzoda, A. 1930. Dar girdi mas”alai alifboi navi tojikī [On the issue of the new Tajik alphabet].
оvozi tojik 3 January 1930. (Reprinted in Nabavī et al. eds. 2007. 505–508).
Olimov, Abdurahmon & Mahmadnazar Aliev. 1999. Imloi zaboni tojikī (Bo qarori Hukumati
Jumhurii Tojikiston az 3 sentyabri soli 1998, No 355 tasdiq šudaast) [The orthography of
Tajik (authorised by the 3 September 1998 no. 355 resolution of the Government of the
Republic of Tajikistan)]. Dushanbe: Irfon.
Olimova, Parvin. 2007. Mas”alahoi zaboni adabii tojikī az nigohi Tūraqul Zehnī [Issues
of literary Tajik as seen from Tūraqul Zehnī’s perspective]. In Abduxoliqi Nabavī,
Nurmuhammad Odinaev & Parvin Olimova (eds.), Zaboni tojikī dar mabnoi mubohisaho:
Majmūai maqolahoi solhoi 20-um [Tajik in the context of debates: A collection of articles
from the 1920s], 57–68. Dushanbe: Irfon.
Ōnishi, Katsunari & Minoru Shibata. 2000. Anaunsā no bidakuon shiyō jittai to onsei bunseki
sofuto ni yoru hantei ni tsuite [On announcers’ use of the velar nasal and its
software-mediated identification]. Hōsō kenkyū to chōsa 50(4). 30–47.
Orfinskaja, V. K. 1945. Materialy k xarakteristike fonetičeskogo sostava tadžikskogo jazyka
[Materials for characterizing the phonetic composition of Tajik]. In I. I. Meščaninov (ed.),
Iranskie jazyki 1 [Iranian languages 1], 87–106. Moscow: Izdatel’stvo Akademii Nauk SSSR.
Polivanůf, J. D. 1934. Masəalahoji zavoni adabiji jahudihoji maⱨali [Issues of the literary
language of local Jews]. Tashkent: Naşrijoti Davlatiji Uz SSR.
Qaror dar borai mas”alai zaboni adabī va dar borai zaboni ta”lim va adabiyoti ommagii tojik
[Decision about the issue of the literary language and about the language of instruction
and Tajik people’s literature]. оvozi tojik 14 September 1930. (Reprinted in Nabavī et al.
eds. 2007. 668–671).
Qarori majlisi mašvarati ilmii tojikoni ūzbakiston: Dar borai zaboni adabii tojik (Az ta”rixi 10–15
fevrali soli 30-um) [Decision of Uzbekistan Tajiks’ scientific counsel meeting: On literary
Tajik (made as of 10–15 February 1930)]. оvozi tojik 2–4 April 1930. (Reprinted in Nabavī et
al. eds. 2007. 614–618).
Qoidahoi imloi zaboni tojikī (Bo qarori Hukumati Jumhurii Tojikiston az 4 oktyabri soli 2011, No
458 tasdiq šudaast) [Orthographical rules of Tajik (authorised by the 4 October 2011 no.
458 resolution of the Government of the Republic of Tajikistan)]. 2011. Dushanbe.
Qonuni Jumhurii Tojikiston «Dar borai zaboni davlatii Jumhurii Tojikiston» [The law of Tajikistan
“On the state language of the Republic of Tajikistan”]. 2009.
http://www.kumitaizabon.tj/tg/content/konuni-chumkhurii-tochikiston-dar-borai-zaboni-davlatii-chumkhuriitochikiston (accessed 17 August 2018).
R Development Core Team. 2019. R: A language and environment for statistical computing
(version 3.5.2). R Foundation for Statistical Computing, Vienna, Austria.
Rastorgueva, Vera Sergeevna. 1954. Kratkij očerk grammatiki tadžikskogo jazyka [A short
sketch of Tajik grammar]. In Muhammedžan V. Rahimi & Ljudmila Vladimirovna
Uspenskaja (eds.), Tadžiksko-russkij slovar’ [Tajik-Russian dictionary], 531–570. Moscow:
Gosudarstvennoe izdatel’stvo inostrannyx i nacional’nyx slovarej.
Rastorgueva, Vera Sergeevna. 1955. Kratkij očerk fonetiki tadžikskogo jazyka [A short sketch of
the phonetics of Tajik]. Stalinabad: Izdatel’stvo Akademii Nauk Tadžikskoj SSR.

Rastorgueva, Vera Sergeevna. 1956. Leninabadsko-kanibadamskaja gruppa severnyx
tadžikskix govorov [The Leninabad-Kanibadam group of Northern Tajik dialects]. Moscow:
Izdatel’stvo Akademii Nauk SSSR.
Rastorgueva, Vera Sergeevna. 1964. Opyt sravnitel’nogo izučenija tadžikskix govorov [An
attempt at a comparative study of Tajik dialects]. Moscow: Nauka.
Rastorgueva, Vera Sergeevna. 1992. A short sketch of Tajik grammar. Translated and edited by
Herbert H. Paper. Bloomington: Indiana University.
Rasulī, M. 1931. Mas”alai alifbo va imloro ba zudī hal kardan darkor [It is necessary to resolve
the issue of the alphabet and orthography soon]. rahвari doniş May 1931. (Reprinted in
Nabavī et al. eds. 2007. 678–684).
Revelle, William. 2018. psych: Procedures for psychological, psychometric, and personality
research. R package version 1.8.12.
Ryding, Karin C. 2005. A reference grammar of Modern Standard Arabic. Cambridge: Cambridge
University Press.
Rzehak, Lutz. 2001. Vom Persischen zum Tadschikischen: sprachliches Handeln und Sprachplanung
in Transoxanien zwischen Tradition, Moderne und Sowjetmacht (1900–1956) [From Persian
to Tajik: Linguistic behaviour and language planning in Transoxiana between traditions,
modernism, and Soviet power (1900–1956)]. Wiesbaden: Reichert Verlag.
Sadr Ziyaʼ, Sharif Jan Makhdum, Rustam Shukurov & Muhammadjon Shukurov. 2004. The
personal history of a Bukharan intellectual: The diary of Muḥammad Sharīf-i Ṣadr-i Z̮iya.
Leiden: Brill.
Schiffman, Harold F. 1998. Standardization or restandardization: The case for “Standard”
Spoken Tamil. Language in Society 27. 359–385.
Semёnov, Aleksandr Aleksandrovič. 1927. Kratkij grammatičeskij očerk tadžikskogo jazyka s
xrestomatiej i slovarёm [A short grammatical sketch of Tajik with a reader and dictionary].
Tashkent.
Ševa yo zaboni adabī? [A dialect or the literary language?]. Radioi Ozodī 24 August 2005.
https://www.ozodi.org/a/603114.html (accessed 9 September 2020).
Shahin, Kimary N. 2002. Postvelar harmony. Amsterdam: John Benjamins.
Shalinsky, Audrey. 1979. Central Asian émigrés in Afghanistan: Problems of religious and ethnic
identity. New York: Afghanistan Council.
Sjoberg, Andrée Frances. 1962. The phonology of standard Uzbek. In Nicholas Poppe (ed.),
American studies in Altaic linguistics, 237–262. Bloomington: Indiana University Press.
Skalozub, Larisa Georgievna. 1963. Palatogrammy i rentgenogrammy soglasnyx fonem
russkogo literaturnogo jazyka [Palatograms and radiographs of standard Russian
consonants]. Kiev: Izdatel’stvo Kievskogo Universiteta.
Sokolova, Valentina Stepanovna. 1949. Fonetika tadžikskogo jazyka [The phonetics of Tajik].
Moscow: Izdatel’stvo Akademii Nauk SSSR.
Stalinabad – the new capital. 1954. Central Asian Review 2(1). 314–321.
Stevens, Kenneth & Arthur S. House. 1955. Development of a quantitative description of vowel
articulation. Journal of the Acoustical Society of America 27(3). 484–493.
Stewart, William A. 1968. A sociolinguistic typology for describing national multilingualism. In
Joshua A. Fishman (ed.), Readings in the Sociology of Language, 531–545. The Hague: Mouton.
Sugitoh, Miyoko. 1997. Nihongo onsē no onsēgakuteki tokuchō [Phonetic characteristics of the
sounds of Japanese]. BME (Biomedical Engineering) 11(4). 2–8.
Sulaymonova, M. 1930. Dar girdi alifboi tojikī [On the Tajik alphabet]. оvozi tojik 2 February
1930. (Reprinted in Nabavī et al. eds. 2007. 521–522).

Tiede, Mark K. 1996. An MRI-based study of pharyngeal volume contrasts in Akan and English.
Journal of Phonetics 24. 399–421.
Toşpūlotuf, M., B. Gitelmaxfr & S. Klimcitskij. 1932. Zaвoni toçikī вaroji avrupoijon [Tajik for
Europeans]. Stalinoвod: Naşri Toçik.
Trudgill, Peter & Jean Hannah. 2008. International English. 5th edn. London: Routledge.
Tunçer-Kılavuz, İdil. 2009. Political and social networks in Tajikistan and Uzbekistan: ‘Clan’,
region and beyond. Central Asian Survey 28(3). 323−334.
Uluǧzoda, M. 1930. Alifboi tojikī čī guna boyad šavad [What should the Tajik alphabet be like?].
оvozi tojik 15 January 1930. (Reprinted in Nabavī et al. eds. 2007. 508–514).
Umeda, Noriko. 1978. Occurrence of glottal stops in fluent speech. The Journal of the Acoustical
Society of America 64(88). 88–94.
Uspenskaja, Ljudmila Vladimirovna. 1962. Govory tadžikov gissarskogo rajona [The dialects of
Tajiks in the Hissar region]. Dushanbe: Akademija Nauk Tadžikskoj SSR.
Wade, Terence. 1992. A comprehensive Russian grammar. Oxford: Blackwell.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus
Wilke & Kara Woo. 2019. ggplot2: Create elegant data visualisations using the grammar of
graphics. R package version 3.1.1.
Wiegmann, Gunda. 2009. Socio-political change in Tajikistan. Hamburg: University of Hamburg
dissertation.
Wilke, Claus O. 2019. cowplot: Streamlined plot theme and plot annotations for ‘ggplot2’. R
package version 0.9.4.
Wood, Sidney. 1971. A spectrographic study of allophonic variation and vowel reduction in West
Greenlandic Eskimo. Working papers (Phonetic Laboratory, Lund University) 4. 58–94.
Xaskašev, Talbak Nabotovič. 1983. Fonetikai zaboni adabii hozirai tojik (Qismi 1) [The phonetics of
modern literary Tajik (Volume 1)]. Dushanbe: Universiteti Davlatii Tojikiston ba nomi V. I. Lenin.
Xaskašev, Talbak Nabotovič. 1985. Fonetika [Phonetics]. In Šarofiddin Rustamov & Razzoq
Ǧafforov (eds.), Grammatikai zaboni adabii hozirai tojik (Jildi 1), 13–78. Dushanbe: Doniš.
Xojaev, Valī. 1929. Dar mas”alai soda kardani zaboni kitobii tojik [On the issue of simplifying
literary Tajik]. rahвari doniş February–March 1929. (Reprinted in Nabavī et al. eds. 2007.
413–422).
Yanushevskaya, Irena & Daniel Bunčić. 2015. Russian. Journal of the International Phonetic
Association 45(2). 221–228.
Zarubin, Ivan Ivanovič. 1927. Otčet ob étnologičeskix rabotax v Srednej Azii letom 1926 goda
[A report on the ethnological work(s carried out) in Central Asia in the summer of 1926].
Izvestija Akademii nauk SSSR. VI serija 21(3). 351–360.
Zarubin, Ivan Ivanovič. 1928. Očerk razgovornogo jazyka Samarkandskix evreev [A sketch of the
spoken language of Samarkand Jews]. Iran 2. 95–180.
Zehnī, Tūraqul. 1928. Fikri man dar borai zaboni čopagī [My thought on the language of printing].
оvozi tojik 4 and 6 December 1928. (Reprinted in Nabavī et al. eds. 2007. 334–347).
Zehnī, Tūraqul. 1929. Maslihathoi man dar borai zabon [My pieces of advice about language].
rahвari doniş May–June 1929. 39–45. (Reprinted in Nabavī et al. eds. 2007. 431–443).
Zimmerman, Gerit. 2008. Uzbekistan Arabic. In Kees Versteegh, Mushira Eid, Alaa Elgibali,
Manfred Woidich & Andrzej Zaborski (eds.), Encyclopedia of Arabic language and
linguistics, volume 4, 612–623. Leiden: Brill.
Zygis, Marzena. 2003. Phonetic and phonological aspects of Slavic sibilant fricatives. ZAS
Papers in Linguistics 3. 175–213.

Sepideh Koohkan, Roohollah Mofidi

3 Modality and mood in Tajik
Abstract: This chapter aims to introduce modality and mood in Tajik, both conceptually and in terms of the linguistic elements that express them. To do this, we
use Nuyts’ approach (2000–2017) for modality, which groups the three primary
categories of modality—dynamic, deontic, and epistemic—into a qualificational
category with aspect, time, and evidentiality. Based on the data from Tajik grammars, our fieldwork data (gathered through interviews), and data analysis of three
movies, we examine modal auxiliaries from a historical, syntactic, and semantic
viewpoint. We also introduce those adverbs, adjectives, nouns, lexical verbs, and
prepositional phrases which express modality, to show that modality is far broader
than what Tajik grammars usually address. Most modal elements, particularly the
auxiliaries, are polyfunctional, expressing a range of modal meanings. Finally,
assuming mood as a morphological category, we consider four subcategories
of it from Tajik grammars: indicative, imperative, subjunctive, and conjectural.
Taking a critical approach throughout, we conclude that these subcategories should be redefined using more precise criteria. Developing
trends in Tajik mood markers should also be studied in future research.

1 Introduction
This chapter discusses the lexical and functional devices which denote modality
in Tajik and addresses the semantic content of each subcategory of modality, as
well as the mood distinctions (indicative, subjunctive, imperative, etc.) which are
distinguished morphologically in Tajik.
The majority of data in this chapter comes from the available literature on
Standard Tajik. Furthermore, we collected data from the Tajik dialect of Dushanbe,
which is a Southern dialect of Tajik according to Aliev and Okawa (2010).1 These
data were collected using a questionnaire in a face-to-face interview with a male
Tajik informant, aged 23.2 The questionnaire, adopted from Koohkan (2019), contained
200 scenarios and 80 sentences. The scenarios, in an open-ended format,
pertained to hypothetical ‘situations’ in which the informant was expected to use
a target modal element while talking about them. In addition to the situations, the
informant was instructed to translate 80 sentences containing modal elements
from Persian to Tajik. Along with the interviews, two hours of daily conversations
and two hours of monologues (story-telling and diaries) were recorded
by the informant.3 We also extracted the sentences containing modal elements
from three movies: Mihmon-i noxonda-1 ‘Uninvited guest-1’ (2017), Arus-i zamonavi
‘Modern bride’ (2016), and Mujassama-i išqi ‘Statue of love’ (2003).*
Following this introduction, Section 2 provides the required theoretical
background, i.e., the categorization we adopt from Nuyts (2005, 2006, 2016, 2017 and
MS.4) for modality and our understanding of mood. Section 3 reviews some previous
works on modality and mood in Tajik. In Sections 4 and 5, the research
engages with modality from two perspectives, formal and functional, respectively, in
order to identify modal elements and modal meanings. Section 6 is devoted to mood,
critically assessing the mood categories proposed by Tajik grammarians.

1 Aliev and Okawa (2010) believe that there are three Tajik dialects spoken in Tajikistan: 1) the
Northern dialect in the northern region of Tajikistan; 2) the Central dialect in Zarafšān and Hesār
valleys, which is similar to Standard Tajik; and 3) the Southern dialect, spoken in Dushanbe and
some other regions.
2 In this article we intended to rely solely on spoken Tajik; however, we quickly found out that
we could not disregard written, formal Tajik and previous studies. Moreover, we lost contact with
most of our informants (10 males and 10 females) and we were unable to complete our full set of
questionnaires and interviews with most of them. As a result, we opted to only use data from one
informant, whose interviews were completed.
3 The conversations, including phone calls and face-to-face dialogues of the informant with his
family members and friends, were recorded by him during his stay in Dushanbe.
* All data from the previous literature and also those from our sources (interviews, daily recordings,
and movies) are transliterated and re-glossed (where necessary) according to the Leipzig Glossing
Rules. The examples from the literature are all cited. The data from our informant will be
labeled as (#Dushanbe), and those from the movies are marked as (#Film1) for Mihmon-i noxonda-1,
(#Film2) for Arus-i zamonavi, and (#Film3) for Mujassama-i išq.
4 MS. is used to refer to Nuyts, Book Manuscript. Modality in mind.
Acknowledgements: We would like to extend our deep gratitude to Mr. Voris Muqimi, our informant,
for participating in various interviews and being available online whenever necessary. We
would also like to thank Professor Johan van der Auwera for his valuable suggestions and
constructive comments.
https://doi.org/10.1515/9783110622799-003

2 Theoretical assumptions
The term modality was first used in logic and philosophy (van der Auwera and Aguilar
2016: 16), where Immanuel Kant employed it to refer to “the necessity and possibility of
propositions” (Pape 1966: 14–15). The term entered linguistics only in the
twentieth century (cf. van der Auwera and Aguilar 2016 for a detailed historical
background of mood and modality).
The definition of modality is a matter of dispute. A more general understanding of
modality considers it a representation of the speaker’s attitudes towards a State
of Affairs (SoA). The term ‘state of affairs’ refers to “any type of situation, event
or state, which can be evaluated in terms of its existence” (Van Linden 2012: 2).
The category traditionally covers “obligation, probability, and possibility” (Bybee
et al. 1994: 176; also cf. Nuyts 2006). It is “a notional category which is similar to
time (as opposed to the grammatical category of tense), to sex (as opposed to
the grammatical category of gender), etc.” (Rothstein and Thieroff 2010: 3). The
current literature on modality encompasses several of its different subcategories,
among which the basic and widely accepted ones are dynamic, deontic, and epistemic. In this chapter, we will rely on Nuyts (2005, 2006, 2016, 2017 and MS.), to
specify our understanding of modality as a type of “semantic modification”.5
From Nuyts’ point of view, the category TAM (tense, aspect, and mood), or,
more recently, TAME (tense, aspect, mood, and evidentiality), is inadequate,
because “the labels ‘tense’ and ‘mood’ in the traditional term only refer to the grammatical devices expressing time and modality, which is an undue limitation” (Nuyts,
MS: 56). He therefore suggests a more semantic term, qualificational category, which
is the conceptual representation of “all aspects of the semantic organization of an
utterance which concern the modification, situation or evaluation of the state of
affairs” (Nuyts, MS: 40). It includes phasal aspect, dynamic modality, quantitative
aspect, time, deontic modality, epistemic modality, and evidentiality.6 Beyond the
qualificational category, there is communication planning that covers directivity,
volition, intention, and even evidentiality. In all the members of this category “the
regulation of interaction is the central (and often only) meaning or function” (Nuyts
MS: 217), and therefore, they are different from modality in that modality’s semantic
core property relies only on the speaker’s attitude, and not the interactions with
others (cf. Nuyts 2001; Byloo and Nuyts 2014; Nuyts and Byloo 2015; Nuyts 2017).
The three main types of modality, i.e., dynamic, deontic and epistemic, are
all members of the qualificational category. They are defined below, with examples
from Nuyts (2005, 2006, 2016):

5 Nuyts’ approach to modality includes numerous precise subcategories that make the framework appropriate for examining a non-European language. Furthermore, it enables the researcher to arrange the modal elements in their proper space more accurately.
6 For the sake of clarity, brief definitions of the other notions are provided here from Nuyts’
point of view: evidentiality marks the information source; time concerns situation of the SoAs
in time; quantitative aspect marks the frequency of the SoAs (iterative, habitual, etc.); phasal
aspect marks the state of deployment of the SoAs (inchoative, progressive, egressive, etc.).

a) Dynamic modality concerns an ability or need of the first argument participant (which can be, but is not necessarily, the speaker) in the SoA, inherent or
imposed by the circumstances. There are three subtypes: participant-inherent (1a), participant-imposed (1b), and situational modality (1c).
(1) a. John can dance the tango. [Participant-inherent]
b. I must go now if I want to catch my bus. [Participant-imposed]
c. It can rain here any time of the year. [Situational potential]

b) Deontic modality involves a specification of the degree of moral acceptability, desirability or necessity of the SoA, as in (2a–b).
(2) a. You cannot let those poor people stand there in the pouring rain like that. [Moral unacceptability]
b. We’d better make sure such bad things won’t happen anymore. [Moral desirability]
c) Epistemic modality is “an indication of the assessment, typically but not
necessarily by the speaker, of the degree of likelihood that the state of affairs
expressed in the clause applies in the world or not” (Nuyts, MS: 82–83), as
in (3).
(3) I hear someone opening the front door, that will be Susan coming home from work. [High probability]
The notions of permission and obligation are traditionally subtypes of deontic
modality (cf. Bybee et al. 1994; Palmer 2001; van der Auwera and Plungian
1998; among many others). In the view introduced and advocated in this
chapter, these notions are considered as directives, and therefore, non-modal.
Directivity differs from deontic modality in that a) it does not indicate the
first argument participant’s commitment to the SoA (which is a prerequisite for
any definition of modality); rather, (s)he is addressed in an interaction; and b)
directivity is non-scalar. This definition is illustrated with (4a–b), as types of
directivity:
(4) a. You may come in now. [Permission]
b. You must leave now. [Obligation]

Based on this understanding of modality and its subtypes, we will investigate the
modal elements of Tajik in Sections 4 and 5.
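
As an illustration only, the classification above can be written down as a small annotation scheme. The sketch below is not part of Nuyts’ framework or of the authors’ method; the names Category, Annotation, EXAMPLES, and modal_only are hypothetical and introduced here for illustration. It encodes examples (1)–(4) with their labels and treats directives as non-modal, as proposed in this chapter.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The three modal categories plus the non-modal directive category."""
    DYNAMIC = "dynamic"
    DEONTIC = "deontic"
    EPISTEMIC = "epistemic"
    DIRECTIVE = "directive"  # treated as non-modal in this chapter


@dataclass(frozen=True)
class Annotation:
    """One annotated sentence (hypothetical structure, for illustration)."""
    example: str       # example number as used in the text
    sentence: str
    category: Category
    label: str         # bracketed label given next to the example

    @property
    def is_modal(self) -> bool:
        # Permission and obligation are directives, hence not modal here.
        return self.category is not Category.DIRECTIVE


# Examples (1)-(4) from the text, with their labels.
EXAMPLES = [
    Annotation("1a", "John can dance the tango.",
               Category.DYNAMIC, "participant-inherent"),
    Annotation("1b", "I must go now if I want to catch my bus.",
               Category.DYNAMIC, "participant-imposed"),
    Annotation("1c", "It can rain here any time of the year.",
               Category.DYNAMIC, "situational potential"),
    Annotation("2a", "You cannot let those poor people stand there "
                     "in the pouring rain like that.",
               Category.DEONTIC, "moral unacceptability"),
    Annotation("2b", "We'd better make sure such bad things "
                     "won't happen anymore.",
               Category.DEONTIC, "moral desirability"),
    Annotation("3", "I hear someone opening the front door, that will be "
                    "Susan coming home from work.",
               Category.EPISTEMIC, "high probability"),
    Annotation("4a", "You may come in now.",
               Category.DIRECTIVE, "permission"),
    Annotation("4b", "You must leave now.",
               Category.DIRECTIVE, "obligation"),
]


def modal_only(annotations):
    """Return only the annotations that count as modal under this view."""
    return [a for a in annotations if a.is_modal]
```

Filtering with modal_only(EXAMPLES) keeps examples (1)–(3) and leaves out (4a–b), which mirrors the decision to treat permission and obligation as directives rather than as subtypes of deontic modality.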

Representing the speaker’s attitudes, beliefs, and thoughts towards the SoA is not
only carried out by modality, but also by mood. More precisely, “the grammatical
realization of modality via verb inflections is known as mood” (Collins 2009: 11;
also cf. Binnick 1991: 73; Whaley 1997: 219). Mood is “a morphological category of
the verb, just as are the verbal categories person, number, aspect, tense and voice”
(Rothstein and Thieroff 2010: 2). In effect, at one end of the continuum, modality (as
a notional category) can be represented through mood (as a grammatical and morphological category), and at the opposite end of the continuum, it can be expressed
with lexical categories such as nouns, adjectives, adverbs, etc. In the middle of this
notional continuum lie the auxiliaries, which are less grammatical than the former,
and less lexical than the latter. (5) sketches the continuum of modality and mood:
(5) lexical categories        auxiliaries        grammatical categories
    (V, N, Adj, Adv)                             (inflections on verbs)

Portner (2009: 4) distinguishes verbal moods, such as indicative and subjunctive, as
opposed to sentence moods, such as imperative, interrogative, and declarative. He
believes that both verbal and sentence moods are “distinct in terms of morphosyntactic [morphosyntax] and in terms of meaning”, but “they are closely related” (Portner
2009: 4–5; also cf. Bybee et al. 1994: 176). According to Timberlake (2007: 326), “a distinction of at least imperative as opposed to realis, or indicative, mood is nearly universal”, where the latter (indicative or realis) is the unmarked mood, and the former
(imperative) is “semantically extremely rich and in that sense marked”, though it
“is not uncommonly the barest stem form of a verb”. Then, a third, subjunctive category could be distinguished in the mood system of the languages, provided that the
morphology of the language supports the distinction, i.e., if there is a morphological
device to distinguish this third category from the indicative and imperative in the language. If the morphology of a language provides more complicated marking strategies, there would be more distinctions in the mood system, such as optative, irrealis,
conditional, jussive, etc. (cf. Aikhenvald 2010 for the definitions of these distinctions).

3 Review of the literature: Modality and mood
in Tajik
Rzehak (1999: 51), Baizoyev and Hayward (2004: 145–149), Perry (2005: 330–337),
Ido (2005: 69), Khojayori and Thompson (2009: 69), and Windfuhr and Perry (2009:
492–493) all consider tavonistan ‘be able’, xostan ‘want’, and boistan ‘must’, as
modal auxiliaries or defective verbs to express modality. However, Baizoyev and
Hayward (2004: 145–149), Ido (2005: 69), and Khojayori and Thompson (2009:
98–105) add the third person singular form of the verb šoistan, i.e., šoyad meaning
‘maybe’, to this list, as an expression of possibility, while Perry (2005: 342–344)
and Windfuhr and Perry (2009: 492–493) consider šudan ‘to become’ as a modal
to express possibility. Rzehak (1999: 52) categorizes šoyad ‘maybe’, mumkin (ast)
‘it can be, maybe’, darkor and éhtimol both meaning ‘probably, maybe’ as modal
words. Perry (2000) not only addresses modality directly, but also dedicates his
paper to epistemic modality, which can be expressed through different types of
perfect tenses, i.e., quotative, inferential, and presumptive, as well as through a
speculative perfect in the past, present, and future.
In Baizoyev and Hayward (2004: 145–149), darkor and lozim, ‘necessary’,
both can replace boyad, ‘must’, and the adjective mumkin can be used in place
of šoyad. The combination of “the second form of the past participle (e.g. raftagī)
with the abbreviated form of the copula -st and the subject marker verb endings”
is an alternative way to express probability (Baizoyev and Hayward 2004: 149–150).
In addition to the modal auxiliaries, modality in Perry’s (2005) terminology is
featured through the adjectives lozim, darkor, and zarur, all meaning ‘necessary’,
and the modal adverbials no-čor, no-iloj, and čor-nočor, all meaning ‘without
recourse’, to represent necessity and obligation (Perry 2005: 332–334). Presumption, probability, and possibility are indicated with boyad and darkor, the conjectural mood. The modal idioms and adverbials such as éhtimol dorad (ki) ‘there
is the probability that’, (ba) éhtimol ‘in probability, probably’, az aft=aš ‘probably’, šoyad ‘maybe’; the adjective mumkin, and the nouns imkon and imkoniyat all
express ‘possibility’ (Ibid. 334–337). Ability is expressed through tavonistan ‘can’,
and the compound adjectives which express ‘capable of being’, such as xūrdanī
‘edible’, and very rarely the adjective qodir ‘able, capable’ (Ibid. 337–340).
Regarding the issue of mood, Rastorgueva (1954[1992]), as one of the oldest
scholarly publications on Tajik grammar, distinguishes four moods, namely
indicative, subjunctive, conditional, and imperative in Tajik. According to her,
indicative mood is used to report an evident or non-evident fact. Subjunctive, in
her definition, has two forms on the level of tense (past and present), and it refers
to uncertain actions through “wish, possibility, admission, supposition, intention, expectation, condition, etc.” (Rastorgueva 1992: 68–69). The imperative
mood has two forms for the second person (singular and plural). The conditional
mood includes past, present-future, and present definite forms, and expresses
conjecture (Rastorgueva 1992: 76–77).
Rzehak (1999: 25) distinguishes five types of mood in Tajik, viz. indicative,
narrative (auditory), aorist (subjunctive), presumptive, and optative. Apart from
indicative and subjunctive (aorist in his terminology),7 which are compatible with
the traditional definitions, narrative mood emphasizes the knowledge that the
speaker has received from hearsay or on the basis of a logical conclusion (Rzehak
1999: 80). Presumptive mood denotes the modal meanings of probability or possibility (Ibid. 87), and the optative indicates wish (Ibid. 36). Baizoyev and Hayward
(2004: 163) introduce imperative and conditional as the distinctions of mood.
According to Perry (2005), Tajik has three types of mood: indicative, conjectural as the expression of “unsupported presumption of the action”, and subjunctive which includes “Prohibitive, Optative, Precative, and Imperative” as its
subtypes (Perry 2005: 8).
Ido (2005: 52) asserts that there are “six principal mood categories, namely
indicative, inferential, imperative, conditional, speculative, and intentional”. Inferential is a type of mood which signals “that the speaker’s proposition
is based on inference drawn from a certain situation or on hearsay (reported/second-hand information)”, and not from direct evidence (Ido 2005: 58). Speculative
expresses “a degree of uncertainty”, and Intentional is “the mood that expresses
one’s intention to perform the action denoted by the verb” (Ido 2005: 63, 65).
In Khojayori and Thompson (2009: 69), mood can be expressed through
indicative, subjunctive, imperative, and probable mood. However, they do not
define probable mood, nor do they discuss it further. Windfuhr and Perry
(2009: 452–459) distinguish between indicative, subjunctive, and counterfactual as subtypes of mood, where counterfactual pertains to an unlikely or unreal
action (Ibid. 488–490).
Table 1 summarizes the linguistic elements which are treated as the expressions of modality and mood by the above-mentioned authors:
Table 1: Modality and mood in the literature.

Rastorgueva (1992)
  Modality: Defective verbs: boistan, šoistan
  Mood: Indicative; Subjunctive; Conditional; Imperative

Rzehak (1999)
  Modality: Modal verbs: boistan, xostan, tavonistan
            Modal words: šoyad, mumkin, darkor, éhtimol
  Mood: Indicative; Narrative (auditory); Presumption; Optative; Aorist (subjunctive)

Perry (2000)
  Modality: Epistemic: Perfect tense, Speculative perfect
  Mood: –

Baizoyev and Hayward (2004)
  Modality: Modal verbs: boistan, šoistan, xostan
            Probability: darkor, lozim, mumkin, -agī
  Mood: –

Perry (2005)
  Modality: Defective verbs: boyad, tavonistan
            Verbs: majbur šudan, éhtimol doštan
            Adjective: qodir, lozim, darkor, zarur
            Adverbial: no-iloj, no-čor, čor-no-čor, darkor, az aftaš, šoyad, mumkin
            Nouns: imkon, imkoniyat
  Mood: Indicative; Conjectural; Subjunctive (Prohibitive, Optative, Precative, Imperative)

Ido (2005)
  Modality: Modal verbs: tavonistan, xostan, boistan, šoistan
  Mood: Indicative; Imperative; Conditional; Inferential; Speculative; Intentional

Khojayori and Thompson (2009)
  Modality: Modal verbs: xostan, tavonistan, boyad, šoyad
  Mood: Indicative; Subjunctive; Imperative; Probable

Windfuhr and Perry (2009)
  Modality: Modal verbs: xostan, boyad, tavonistan
            Circumlocution: darkor
  Mood: Indicative; Subjunctive; Counterfactual

7 The term aorist is mainly used following “Turkological nomenclature” (Windfuhr and Perry
2009: 456).


4 Modal elements of Tajik
The most common markers of modality are auxiliaries (e.g., can, may,
must), adverbs (e.g., possibly, perhaps, doubtless), adjectives (e.g., able, probable, necessary), and nouns (e.g., ability, possibility, obligation) (Jacobson 1982:
61). This section introduces these modal elements in Tajik, as extracted from the
grammars and from the fieldwork data collected for this essay.

4.1 Modal auxiliaries
As members of the larger class of auxiliaries, modal auxiliaries exhibit several
inflectional and syntactic properties that distinguish them from main, lexical
verbs. Firstly, in a similar way to other auxiliaries, and different from lexical
verbs, they accept the inflectional negative prefix (Collins 2009: 12). Secondly,
modals are morphologically defective, lacking some forms which are common for
full-fledged lexical verbs. Finally, they are not semantically the main predicates
of the sentences, and “in an unreal conditional, the first verb of the apodosis
must be a modal” (Collins 2009: 13).
There are two candidates for being modal auxiliaries, which were already
introduced in Section 3: boyad (from the verb boistan), and tavonistan. These
will be discussed in depth in the subsequent sections. Furthermore, šudan can
be added to the list, as the third candidate, though it is not usually counted as
a modal auxiliary in Tajik grammars. However, Khojayori and Thompson (2009:
106–107) and Perry (2005: 342) discuss the modal meaning of the third person
singular form of šudan.
Šoyad ‘maybe’ (from the Classical Persian verb šāyestan, which is no longer in use in Tajik or other dialects of Persian) is included in the list
of modal auxiliaries in some Tajik grammars (cf. Ido 2005: 69; Khojayori and
Thompson 2009: 102), but we do not find convincing evidence in support of this
claim, and we will categorize it as a modal adverb (see Section 4.2.1).
Similarly, xostan is usually mentioned as a type of deontic or volitional modal
auxiliary in Tajik grammars (cf. Baizoyev and Hayward 2004: 145; Ido 2005: 69;
Khojayori and Thompson 2009: 98; Rzehak 1999: 51–52), but we will not deal with it
as a modal auxiliary, following the theoretical framework which has been adopted
in the chapter. In fact, in Nuyts’ view, all these notions (volition and intention) are
“beyond the borders of the attitudinal category . . . or qualificational hierarchy”, and
along with directivity, they are related to the cognitive domain of communication
planning (Nuyts, MS: 156). He defines volition as “an indication of a desire/wish . . .
of the speaker, that the state of affairs in the clause will get realized” (Ibid.: 150).


4.1.1 boistan ‘to be necessary, must’
The most prototypical modal auxiliary in many Iranian languages, including Tajik,
is derivationally related to the Classical Persian bāyistan, derived from apāyītan, ‘to
be proper, necessary, fitting’ in Middle Persian, and aβāyišn, ‘necessity, need’, in Parthian (cf. Cheung 2007: 155; Hassandoust 2014: 403–404; Rastorgueva 2000: 172–173).
Morphologically, the Tajik auxiliary boyad is the third person singular form of the verb
boistan. In Classical Persian, some other persons and numbers of the verb were also
used (Mahmoodi-Bakhtiari 2009: 165–167), but today, the third person singular is the
only remaining form in both Persian and Tajik.8 It can also take the past suffix, with or
without the imperfective prefix, to form boist(-i) (must.PST[3SG](-IRR)) and me-boist
(IPFV-must.PST[3SG]), which are the less frequent variants (cf. Baizoyev and Hayward
2004: 421; Perry 2005: 332; Rastorgueva 1992: 61; Rzehak 1999: 51; Windfuhr and Perry
2009: 490). Perry (2005: 332) considers them synonymous in a sentence like (6):
(6) vay   boyad / boist    / me-boist      xona me-raft-Ø.
    (s)he must  / must.pst / ipfv-must.pst home ipfv-go.pst-3sg
    ‘He had to go home.’ (Perry 2005: 332)
Structurally, the presence of this auxiliary usually calls for the subjunctive mood
for the main verb of the clause, which always follows the auxiliary (for the placement of the verbs, cf. Ido 2005: 70; Khojayori and Thompson 2009: 102–103). This
main verb can appear in the present-future tense (7a), the imperfective past (7b),
or the participial form (7c).
(7) a. man boyad ba šumo   yak čiz=ro    guy-am.
       I   must  to you.pl one thing=acc say.prs-1sg
       ‘I must tell you one thing.’ (#Film2)
    b. soro boyad to    hozir post=šon      me-kand-Ø.
       Sara must  until now   skin=3pl.poss ipfv-peel.pst-3sg
       ‘Sara had to peel them by now.’ (#Dushanbe)
    c. boyad kor=šon       tamum    šud-a           boš-a.
       must  work=3pl.poss finished become.pst-ptcp be.prs-3sg
       ‘Their work must have been finished.’ (#Dushanbe)
8 The third person singular form of this verb (apāyēt/abāyed) was employed as a modal/impersonal
verb in Middle Persian (cf. Brunner 1977: 188–191; Rastorgueva 2000: 172–173), as it is in today’s Persian. In Classical Persian, probably because of analogy, this verb was conjugated for the other persons
and numbers as well. We would like to thank an anonymous reviewer who brought this to our attention.


In negative sentences, the negative prefix na- is added to either boyad or the
main verb (or even both). Depending on what the speaker decides to negate,
the meaning of the sentences will change. According to Rzehak (1999: 51), if the
modal auxiliary is negated, it means that it is not necessary for the subject to
perform the predicate function, or (s)he does not need to do it (as in [8a] below).
But if the main verb is negated, it means that the subject is not allowed to do the
action (as in [8b]). Moreover, Khojayori and Thompson (2009: 102) claim that if
the auxiliary is negated, “it has more emphatic sense of prohibition”.9
(8) a. hozir na-boyad ba-xand-am.10
       now   neg-must sbjv-laugh.prs-1sg
       ‘I should not laugh now.’ (#Dushanbe)
    b. musulmon   boyad šarob na-xūr-ad.
       [A] Muslim must  wine  neg-eat.prs-3sg
       ‘Muslims are not allowed to / shouldn’t drink wine.’ (Rzehak 1999: 51)
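
The scope contrast that Rzehak (1999: 51) is reported to describe can be restated as a small lookup. The Python sketch below is only an illustration under stated assumptions: the function name, the `Reading` type, and the reading labels are hypothetical and do not come from any of the grammars cited, and the sketch does not model the differing claim by Khojayori and Thompson (2009: 102).

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """Hypothetical container for a reading; the labels are illustrative only."""
    label: str
    paraphrase: str


def boyad_negation_reading(neg_on_auxiliary: bool, neg_on_main_verb: bool) -> Reading:
    """Map the position of the prefix na- to the reading attributed to Rzehak (1999: 51).

    Assumption: only the two single-negation readings are described in the text;
    the double-negation pattern is reported as unattested in the Tajik data.
    """
    if neg_on_auxiliary and neg_on_main_verb:
        return Reading("double negation", "no Tajik example found in the data")
    if neg_on_auxiliary:
        # na-boyad + main verb
        return Reading("absence of necessity", "the subject does not need to do it")
    if neg_on_main_verb:
        # boyad + na-main verb
        return Reading("prohibition", "the subject is not allowed to do it")
    return Reading("necessity", "the subject must do it")


if __name__ == "__main__":
    print(boyad_negation_reading(neg_on_auxiliary=False, neg_on_main_verb=True).label)
```

The lookup keeps the two single-negation readings apart so that the contrast stated in the prose can be checked against individual examples; it makes no claim about how speakers actually interpret any particular sentence.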

It is also possible to use the auxiliary with the non-finite form of the main verb. In
this case, “the short infinitive replaces the subjunctive” (Perry 2005: 331), as in (9),
whose subject is a generic one, without any specification for person and number:
(9) boyad kor  kard.
    must  work do.sinf11
    ‘One (we/you/people) has to work’. (Perry 2005: 331)

9 The third form (negating both the auxiliary and the main verb) is theoretically and pragmatically possible, but we could not find any example for it in Tajik. In Persian, the form is not
frequent, but it is used:
in   harf-hā=rā  na-bāyad na-goft,     behtar ast-Ø      goft-e       šav-ad.
this talk-pl=acc neg-must neg-say.sinf better be.prs-3sg say.pst-ptcp become.prs-3sg
‘One shouldn’t refrain from saying these things, they are better to be said.’ (Akhlāghi 2008: 108)
Depending on the tense of the main verb, the sentence can mean either ‘the speaker shouldn’t refrain
from doing the action’ (past tense) or ‘it is necessary for her/him not to refrain from the action’.
10 For more examples and details of the subjunctive prefix bi-, see Section 6.3.
11 We use sinf as an abbreviation for short infinitive, which is the past stem of the verb without
the infinitive marker -an (cf. Perry 2005: 256). In using this label, we follow the treatment of the infinitive (masdar) and the short infinitive (masdar-e moraxxam) in traditional grammars of Persian
(e.g., Anvari and Ahmadi-Givi 2010: 106).


4.1.2 tavonistan ‘to be able to, can’
Tavonistan is “conjugated to show the person and number of the subject”
(Baizoyev and Hayward 2004: 145), and it participates in various TAM constructions, including present imperfective, future, present perfect, simple past, past
imperfective, past perfect, perfect progressive, infinitive, and short infinitive (cf.
Perry 2005: 337–340); and also present subjunctive (based on the fieldwork data).
Therefore, the verb is not defective, as far as the spoken variety of Tajik is concerned.12 Nonetheless, it is considered an auxiliary here, mainly because it has
some other auxiliary properties introduced by Heine (1993: 22–23): it is an expression for one of the “notional domains, i.e. modality”; it is “neither clearly lexical”
(as nouns, verbs, adjectives and other content words are) “nor clearly grammatical”
(as inflectional affixes are), and it is not the main predicate of the sentence. Whenever tavonistan appears as the only verb of the clause, a subordinate action can be
assumed, either as an explicit deverbal noun, as in (10a) below, or as an implicit
verb recovered by the situation, as in (10b). More commonly, however, it is found
with a clausal object that includes the main verb, as is shown below (11a–b).13
(10) a. omidvor=am     ki   ğarq  na-šav-am          dar ob
        hopeful=be.1sg that drown neg-become.prs-1sg in  water
        čun     man ob-bozi    na-me-ton-am.14
        because I   water-play neg-ipfv-can.prs-1sg
        ‘I hope I won’t drown in water because I can’t swim.’ (#Dushanbe)
     b. zan=aš        zoid-a              na-tavonist-a,       dard-i
        wife=3sg.poss give.birth.pst-ptcp neg-can.pst-ptcp.3sg pain-ez
        saxt      kašid-a       istod-a        ast-Ø.
        difficult bear.pst-ptcp stand.pst-ptcp be.prs-3sg
        ‘His wife can’t deliver her baby and is in great pain.’ (Windfuhr and Perry 2009: 538)
(11) a. zan-on   dar injo me-ton-an        bayt ba-xon-an.
        woman-pl in  here ipfv-can.prs-3pl song sbjv-sing.prs-3pl
        ‘Here, women can sing songs.’ (#Dushanbe)
     b. me-tavonist-Ø    tojik gap  zan-ad?
        ipfv-can.pst-3sg Tajik talk hit.prs-3sg
        ‘Could (s)he speak Tajik?’ (#Film2)

12 However, there is a defective form of this modal in the present tense, as an alternative to the
fully inflected form in the same tense: me-tavon, as illustrated in the following example. Perry
(2005: 340) alludes to this form as “judged as literary in register”, which is in line with the fact
that this is quite a common form in Classical Persian (cf. Ahmadi-Givi 2001: 1362–1364). Furthermore, we did not find this form in our fieldwork data which targeted Spoken Tajik.
me-tavon     ba   osonī      in   kor=ro   ijro    namud.
ipfv-can.prs with simplicity this task=acc perform do.sinf
‘This task can easily be carried out.’ (Perry 2005: 340)

13 In Persian, the auxiliary boyad can govern tavonistan, as in the sentence below. However, this
structure was not attested in Tajik.
bāyad be-tun-e         barā-ye in   kār=eš        dalil  bi-yar-e.
must  sbjv-can.prs-3sg for-ez  this work=3sg.poss reason sbjv-bring.prs-3sg
‘(S)he must be able to give a reason for what (s)he has done.’

14 The verbal root in this example represents a colloquial pronunciation of the modal (also cf.
Baizoyev and Hayward 2004: 145 for an indication to this option for the present forms, i.e., me-ton-am, me-ton-ī, etc.).

Generally, the main verb of the clause may appear in the subjunctive (as in 11a–b),
the infinitive (as in 12a below), the short infinitive (see Footnote 11 in the current
section), or the past participle (12b) (cf. Khojayori and Thompson 2009: 98, who
describe the main verb as “occurring either in a non-finite form or in the subjunctive”; Ido 2005: 69; Perry 2005: 338). The infinitive and the participle precede the
auxiliary, while the subjunctive follows it. The examples in (12) illustrate the infinitival and
participial uses of the main verb:
(12) a. dinašab    boron na-borid-a        emroz bozor  raftan
        last.night rain  neg-rain.pst-ptcp today bazaar go.inf
        me-ton-am.
        ipfv-can.prs-1sg
        ‘It didn’t rain last night; I can go to the bazaar today.’ (#Dushanbe)
     b. šumo   soat-i  panj omad-a        me-tavonist-ed?
        you.pl hour-ez five come.pst-ptcp ipfv-can.pst-2pl
        ‘Could you come at five o’clock?’ (#Dushanbe)
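
The ordering generalization stated for tavonistan (infinitive and participle before the auxiliary, subjunctive after it) can be restated as a short rule. The Python sketch below is a hypothetical illustration: the function name and the form labels are assumptions introduced here, the rule orders only the two verbs relative to each other and ignores any intervening material such as objects, and the short infinitive is left out because its position is not stated in the passage above.

```python
def linearize_with_tavonistan(auxiliary: str, main_verb: str, main_form: str) -> str:
    """Order the main verb relative to a form of tavonistan.

    Assumption: only the three complement forms whose position is stated in the
    text are covered; any other label raises an error instead of guessing.
    """
    forms_preceding_auxiliary = {"infinitive", "past participle"}
    forms_following_auxiliary = {"subjunctive"}

    if main_form in forms_preceding_auxiliary:
        return f"{main_verb} {auxiliary}"
    if main_form in forms_following_auxiliary:
        return f"{auxiliary} {main_verb}"
    raise ValueError(f"position not stated for main-verb form: {main_form!r}")


if __name__ == "__main__":
    # Verb forms taken from examples (12a) and (12b)
    print(linearize_with_tavonistan("me-ton-am", "raftan", "infinitive"))
    print(linearize_with_tavonistan("me-tavonist-ed", "omad-a", "past participle"))
```

Raising an error for unlisted forms, rather than defaulting to one order, keeps the sketch from asserting more than the description does.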
Both the auxiliary and the main verb can change to a negative form by adding the
prefix na-. If the main verb is negated, it “indicates refraining from that action”,
and the negative auxiliary “indicates an inability into the action” (Khojayori and
Thompson 2009: 99). Theoretically, both the main verb and the auxiliary can be
negated to show an inability, impossibility or not having permission to refrain
from the action. Examples (10a–b) above illustrate the auxiliary in negative use, and (13a)
below shows how the main verb can be negated, while the second example (13b)
shows both the auxiliary and the main verb in negative forms:


(13) a. me-ton-i         ba:d-i   abit  xob   na-kun-i?
        ipfv-can.prs-2sg after-ez lunch sleep neg-do.prs-2sg
        ‘Can you not sleep after lunch?’ (#Dushanbe)
     b. na-me-ton-a      injo bozi na-kun-a?  hama  jo    zir
        neg-ipfv-can-3sg here play neg-do-3sg every place down
        me-kun-a.
        ipfv-do.prs-3sg
        ‘Can’t (s)he not play here? (S)he disarranges everywhere.’ (#Dushanbe)
As a final remark, tavonistan is a stative verb, and it cannot be progressivized, either
formally or semantically.15 With regard to form, it cannot participate in
the Tajik periphrastic progressive construction with istodan ‘to stand’, either in the
present tense or in the past (cf. Khojayori and Thompson 2009: 99, who note that
“it cannot take the present or past continuous tenses”).
Although it takes the imperfective prefix me- in the present and past (10a, 11a–b,
12a–b, and 13a–b), it is not interpreted as progressive in these examples. Rather, it
denotes a general ability of the subject (to do an action) in a permanent manner,
which reflects its stativity. This can be observed in all the examples of tavonistan in this section and, interestingly, even in (14) below, a progressive construction in which the focus is on ‘a time point’ but in which tavonistan itself cannot be
claimed to be in progress:
(14) az   kor  dast kašid-a       na-tavonist-a    istod-a=and.
     from work hand pull.pst-ptcp neg-can.pst-ptcp stand.pst-ptcp=3pl
     ‘They are unable to stop working’ (Perry 2005: 339).

4.1.3 šudan ‘to become’
Today, šudan ‘to become’ is an inchoative verb in Tajik, employed as a copula, a light
verb in complex predicates, and an auxiliary in passive constructions (cf. Tabibzādeh
2012: 183 for Persian; Windfuhr and Perry 2009: 498–499 for Tajik). It also serves in
the modal system to express ‘to be possible’, which is the concern of this section. As
far as could be ascertained, the only grammars that consider a modal role for šudan
in Tajik are Perry (2005: 342–344) and Windfuhr and Perry (2009: 492–493), who
mention it as an indication of social acceptability and possibility. These grammars
do not provide any information on the chronology of this development (šudan as a
modal). However, Ahmadi-Givi (2001: 1412) mentions that “nowadays [i.e., in Contemporary Persian] it is used as a semi-auxiliary that is semantically close to tavānestan”, from which we can infer that, at least in his opinion, this function of šudan
did not exist in Classical Persian. This inference accords with our observation that
this verb is absent from the list of Classical modals in the grammars.

15 This limitation can be stated for boistan and its derivatives as well. However, as boistan is
generally a defective verb, such a limitation has not been noted. By contrast, it is conspicuous for tavonistan, since morphologically this verb behaves like any other full-fledged lexical
verb, as mentioned earlier in this section.

As a modal auxiliary, however, the use of šudan is restricted to the third person
singular,16 and it appears in three forms: the present form me-šav-ad (colloquially
me-š-a in our informant’s dialect), the subjunctive šav-ad (colloquially, šav-a or
ba-šav-a), and the past form me-šud; all of which can be negated, if required.17
The main predicate either precedes the auxiliary (15), or follows it. In the latter
case, it appears as a short infinitive (as in 16), a present subjunctive (with or
without be-, as in 17b and 17a, respectively), or even a past subjunctive (as in 18):
(15) az   injo ba onjo  raft-a      na-me-šav-ad.
     from here to there go.pst-ptcp neg-ipfv-become.prs-3sg
     ‘It is not possible to get there from here’ (Khojayori and Thompson 2009: 107).18
(16) asr     kofa me-š-a              yoft=aš.
     evening café ipfv-become.prs-3sg find.sinf=3sg.obj19
     ‘In the evening, it is possible to find him/her in the café.’ (#Dushanbe)

16 Elsewhere, in its lexical and less grammatical uses, šudan is a fully inflecting verb, having
all TAM forms.
17 It is possible to negate the auxiliary (as in [17]), the main verb (as in [a] below), or both (as
in [b]):
a. me-šav-ad           soro na-rav-ad      xona?
   ipfv-become.prs-3sg Sara neg-go.prs-3sg home
   ‘May/Can Sara refrain from going/not go home?’ (#Dushanbe)
b. na-me-šav-ad            soro na-rav-ad      tūy-i      apa=š.
   neg-ipfv-become.prs-3sg Sara neg-go.prs-3sg wedding-ez older.sister=3sg.poss
   ‘It is not possible that she doesn’t go to her older sister’s wedding (party).’ (#Dushanbe)

18 They translate the sentence as ‘You can’t get there from here.’
19 The pronominal clitic =aš on the main verb (yoft) refers to direct object. The same function
is observed in 17a.


(17) a. na-me-š-a               bi      yagon dalēl  sər=aš
        neg-ipfv-become.prs-3sg without any   reason head=3sg.obj
        ken-im.
        do.prs-1pl
        ‘It is not possible to fire him/her without any reason.’ (#Dushanbe)
     b. me-š-a              injo be-nišin-am?
        ipfv-become.prs-3sg here sbjv-sit.prs-1sg
        ‘May I sit here?’ (#Dushanbe)
(18) ey xudo, me-šav-ad           ki   mo mošin došt-a        boš-em.
     O  god   ipfv-become.prs-3sg that we car   have.pst-ptcp be.prs-1pl
     ‘O God, is it possible that we have a car?’ (#Dushanbe)

4.2 Modal adverbs
As expected of any type of modal element, modal adverbs also “express the
speaker’s attitude to what he is saying, his evaluation of it, or shades of certainty
or doubt about it” (Greenbaum 1969: 94). According to Guimier (1988: 256), inference and epistemic meanings are the distinguishing features of modal adverbs.
The modal adverbs detected in our data and observed in the literature cover
a range of elements for expressing possibility, probability, and certainty (šoyad
and balki ‘maybe’, ba éhtimol ‘probably’, az aft=aš ‘most likely’, hatman ‘definitely’, haqiqatan ‘truly’, be-šubha ‘undoubtedly’), estimation (taqriban, qarib
and qaribat ‘approximately’), and necessity and obligation (majburan ‘forcedly’,
no-čor, čor-no-čor, no-iloj, xoh=u no-xoh ‘without recourse’).
Unlike the modal auxiliaries, the above-mentioned adverbs are categorized here in
three traditional semantic groups: a) assumption, b) approximation, and c)
necessity (for rather similar classifications, cf. Khojayori and Thompson 2009:
127–129; Perry 2005: 330–337). This categorization rests on two grounds.
Firstly, unlike auxiliaries, the modal adverbs constitute an open, lexical category,
and it is not easy to list them exhaustively. Therefore, in order to introduce them in a
straightforward and unambiguous manner, we had to opt for a selective strategy. Instead of creating new terms or going into the semantic analysis of the framework we have adopted, we selected a terminology used by many linguists (cf. Bybee
et al. 1994; Palmer 2001; Portner 2009; van der Auwera and Plungian 1998, among
many others) to arrange the modal adverbs. The second reason is that modal auxiliaries are polysemous, i.e., each one can express various modal notions, while
most of the modal adverbs convey a single modal notion or a group of closely related
notions, which can be distinguished from the other members of the category.

4.2.1 Assumption
– šoyad ‘maybe, possibly’

Once a verb, šoyad could be conjugated for person and number until the end
of Early New Persian, in the forms šāy-am, šāy-i, šāy-ad (first, second, and third
person singular), šāy-im, šāy-id, and šāy-and (first, second, and third person
plural) (Mahmoodi-Bakhtiari 2009: 159–161). Examples (19a–b) below are from Classical Persian;
the former shows šāyestan as the main verb, and the latter
as an auxiliary:
(19) a. ān   . . . dār-ol-molk  rā  be-šāyest-Ø.
        that . . . capital.city acc pfv-deserve.pst-3sg
        ‘That . . . deserved the capital city’
        (Tārix-i Bal’ami; Ahmadi-Givi 2001: 1405).
     b. qal’e-ye  u     ne-mi-šāyest-Ø           setadan.
        castle-ez (s)he neg-ipfv-deserve.pst-3sg capture.inf
        ‘It was not appropriate/fitting to capture his castle’
        (Fārs-nāmeh; Ahmadi-Givi 2001: 1406).
Several Tajik grammars, including Baizoyev and Hayward (2004: 148), Ido (2005:
70), Khojayori and Thompson (2009: 102), and Perry (2005: 336) call šoyad the
only, or the frozen, form of šoistan ‘to be able, be worthy’, which belongs to the
category of modal verbs, along with boistan, xostan, and tavonistan. Baizoyev and
Hayward (2004: 149) claim that “the modal šoyad is used to denote the subjunctive – that is, to express hypothesis or contingency”, and Perry (2005: 336) calls it
“a defective impersonal verb”. Khojayori and Thompson (2009: 102) indicate that
the modal auxiliaries boyad and šoyad are “frozen forms of verbs that have otherwise fallen out of use in all modern Persian dialects; they thus act like adverbs
but can take [the negative marker] na-”. They do not provide any example for the
combination of na- and šoyad; they possibly had in mind the Classical Persian
usage na-šāy-ad (neg-deserve-3sg) ‘it does not deserve’, as in (20):
(20) na-šāy-ad           ke   nām=at        nah-and     ādami
     neg-deserve.prs-3sg that name=2sg.poss put.prs-3pl human
     ‘You are unworthy to be called human’
     (‘The name of human does not befit you’)
     (Golestān-e Sa’di; Taleghani 2008: 18).
This negative form is no longer in use in Tajik, from which we conclude that šoyad does
not have the verbal property of allowing negation. As Ahmadi-Givi (2001: 1405)
mentions, the verb šāyestan has no use today in the Persian of Iran either, and
the only remnant of this verb (i.e., šāyad) is definitely an adverb that means ‘it is
possible or probable’.
A second argument in favor of the adverbial status of šoyad/šāyad is that it can
appear with indicative verbs, just like its synonymous adverbs (e.g., éhtimol as shown
below [21a]). The other options for the main verb with this element are present subjunctive with or without be- (21b–c), present progressive (21d), past subjunctive with past
or past-future reference (21e), and past perfect (21f). Khojayori and Thompson (2009:
104–105) explain that if the following verb is in the present subjunctive, the sentence
indicates “that the subject might perform an action in the future”. This explanation,
originally fitting (21b–c), can also be extended to other expressions (21a). However, the
perfect subjunctive after šoyad demonstrates that “the subject might have performed
an action in the past”, while employing šoyad with the main verb in past perfect suggests that “the subject had possibility to perform an action in the past”.
(21) a. šoyad barnomi=aš    ba-ham      me-riz-a
        maybe plan=3sg.poss in-together ipfv-pour.prs-3sg
        ‘Maybe his plan will mess up’ (#Dushanbe)
     b. havo    xunuk=a,  berun na-me-y-am,           šoyad
        weather cold=3sg  out   neg-ipfv-come.prs-1sg maybe
        kasal š-am
        sick  become.prs-1sg
        ‘The weather is cold, I won’t come out at all, I might get sick’ [Lit.:
        ‘Maybe I get sick.’] (#Dushanbe)
     c. šoyad pagoh    bē-y-an.
        maybe tomorrow sbjv-come.prs-3pl
        ‘They might come tomorrow’ [Lit.: ‘Maybe they come.’] (#Dushanbe)
     d. jor=am           šoyad jornal    me-xon-a
        brother=1sg.poss maybe newspaper ipfv-read.prs-3sg
        ‘My brother is probably reading a newspaper’ (#Dushanbe)
     e. hamsoya-ho   šoyad in   kor=ro   kard-a      boš-and
        neighbors-pl maybe this task=acc do.pst-ptcp be.prs-3pl
        ‘Perhaps the neighbors did this’ (Perry 2005: 336)
     f. tu     šoyad ba joy=aš    lağmon  me-xūrd-ī
        you.sg maybe in place=3sg laghmon ipfv-eat.pst-2sg
        ‘You might have eaten laghman instead [of it]’
        (Khojayori and Thompson 2009: 104)

– (ba) éhtimol ‘probably’

This element, consisting of an optional preposition ba ‘with’,20 and a noun
éhtimol ‘probability’, is an adverb that may call for a verb in subjunctive mood.
There is a cognate adverb ending in the suffix -an, éhtimolan ‘probably’,
which Perry (2005: 335) claims “is not used in Tajik”, although Nazarzoda
et al. (2008: II/673) include it as a synonym of farzan, mumkin ast, and šoyad. The
noun éhtimol can be used in predicative constructions and may be a coverb for a
compound predicate (see Sections 4.4.1 and 4.5.1). In our data, we did not encounter the use of (ba) éhtimol, and in all the sentences that we constructed with this adverb
our informant replaced it with the adverb mumkin.
The grammars illustrate this adverb as in (22a–b):
(22) a. éhtimol, fardo    boron bor-ad
        probably tomorrow rain  rain.prs-3sg
        ‘Perhaps it will rain tomorrow’
        (Baizoyev and Hayward 2004: 444)
     b. ba-éhtimol     dar hayrat ham  mond-a        boš-and
        in-probability in  wonder also stay.pst-ptcp be.prs-3pl
        ‘Probably they wondered, too’ (Perry 2005: 336)

– az aft=aš ‘most likely’

20 The preposition ba primarily indicates ‘to, in’. However, in rare cases, such as ba ehtimol, ba
zur ‘with force, forcefully’, it equals the preposition bo ‘with’. This role appears to be a remnant
of Classical Persian where be (ba in Tajik), among other meanings, could also signify ‘with’, as in
the following example from Nāsir Khusraw Qubādiyāni (11th century):
be   kašti-hā qasd-e       ānjā  kon-and
with ship-pl  intention-ez there do.prs-3pl
‘They (intend to) go there by ships’ (Anvari 2002: 1072)


(Az) aftaš (or az aft-i kor ‘from face-ez task’) is mentioned by Perry (2005: 336) as
a “pertinent sentence adverbial”, and elsewhere as one of the “adverbial idioms”
(Perry 2005: 160), meaning ‘probably, by the look of things’, or as Baizoyev and
Hayward (2004: 444) translate it, ‘most likely’ (also cf. Nazarzoda et al. 2008: I/95,
who translate it as ‘it turns out, it seems’). In this compound adverb, az ‘from’
is a preposition, and aft is a noun meaning ‘face, appearance, shape, facial
expressions’ (Nazarzoda et al. 2008: I/95), while =(a)š is the third person singular
clitic (called possessive suffix in the grammars). Example (23) shows it with the
accompanying main verb in the indicative form:
(23) az   aft=aš        ū     az   in   hodisa xabar
     from face=3sg.poss (s)he from this event  information
     na-dor-ad
     neg-have.prs-3sg
     ‘Most likely, he doesn’t have any information about this event’
     (Baizoyev and Hayward 2004: 444)
If we accept the translation of this construction as ‘it seems, apparently’, we take
it to be a type of evidentiality, which is traditionally classified in the realm of
modality. However, in the framework adopted in this chapter, evidentiality is not
a subtype of modality, since it is not the speaker’s perspective towards the SoA;
rather, it indicates the source of information. On the other hand, the equivalents
that Baizoyev and Hayward (2004: 444) and Perry (2005: 336) introduce for (az)
aftaš – and the examples they present – prompt us to see a modal role in it. There
are some other constructions, such as ma’lum me-boš-ad (apparent ipfv-be.sbjv-3sg) ‘it seems’, and the adverb zohiran ‘evidently’, that clearly have no modal
role and only signify evidentiality, which is not considered in this research.
– balki ‘maybe’

Although the use of this adverb as a modal element is infrequent in the grammars,
and even in our fieldwork data, it can be considered a modal adverb,21 as it
expresses possibility, in addition to its function of representing “a strong contrast
between the actions in two VPs” (Perry 2005: 310), meaning ‘but’.22 (24a) shows
one use of this adverb from our fieldwork data, and (24b) is an arguably modal
example: arguable because the suffix -agī, rather than the adverb, could be
responsible for the epistemic meaning of the sentence. However, balki is unlikely
to express strong contrast here; therefore, in this example it expresses ‘maybe’.
(24) a. iltimos ū=ro   girift-a        balki fešor=am          bolo šav-ad
        please  he=acc arrest.pst-ptcp maybe pressure=1sg.poss high become/go.prs-3sg
        ‘Please arrest him; maybe my blood pressure will increase’
        (#Film2)
     b. balki dar oyanda hamroh-i     ū     . . . zindagī ba sar me-burd-ag-em
        maybe in  future companion-ez (s)he . . . life    to end ipfv-take.pst-conj-1pl
        ‘Maybe in the future he and I will live our lives together’
        (Perry 2005: 245)

21 In Iranian languages, including Colloquial Persian, Balochi, and many others, balke/balki
has the same two-sided function. In Balochi, its modal role is quite active, and it is used instead of other Iranian adverbs such as šāyad and ehtemālan, both meaning ‘maybe, probably’
(cf. Koohkan 2019: 304–305).
– hatman ‘definitely’

Perry (2005: 150) lists hatman as an adverb, and Baizoyev and Hayward (2004:
444) as a modal word, meaning ‘definitely, certainly, for sure’. According to
Palmer (2001: 35) and Magni (2010: 210), adverbs with such meanings are modal
adverbs. However, we consider a grammatical element a modal adverb only if it
presents the state of the speaker’s mind (25a–b). This usage can appear in the
indicative present-future (25a) or the present perfect (25b). The adverb may have two
meanings, viz. ‘surely’ (as Perry suggests) or high probability (if he has not gone
to the city, it is highly probable that he is at home) (25a). In our fieldwork
data, the second reading is more probable, given the context that we provided
for the informant (25b).

22 In the following example, this role is clear:
na  tanho man, balki doxtar=am         ham  ū=ro   did-Ø
not only  I    but   daughter=1sg.poss also he=acc see.pst-3sg
‘Not only I, but my daughter too, saw him’ (Perry 2005: 311)


(25) a. agar dar šahr na-raft-a       boš-ad,    hatman     dar xona xoh-ad       bud.
        if   in  city neg-go.pst-ptcp be.prs-3sg definitely in  home want.prs-3sg be.sinf
        ‘If he hasn’t gone to the city, he will surely be at home’
        (Perry 2005: 381)
     b. gorba injo ne-st-Ø,       hatman     az   xona gurext-a
        cat   here neg-be.prs-3sg definitely from home escape.pst-ptcp.3sg
        ‘The cat is not here; it is highly probable that it has escaped from home’
        (#Dushanbe)
However, hatman (and similar adverbs, such as haqiqatan) are also means for
intensifying the propositions, usually in the direct or indirect imperatives,
without implying any modal concepts. (26a–b) illustrate the non-modal function
of hatman, i.e., intensifier.
(26) a. hatman     ba xona bi-yo-Ø.
        definitely to home imp-come.prs-2sg
        ‘Definitely come to home.’
        (Perry 2005: 150)
     b. man ū=ro      hatman     nazd-i tu     ravon me-kon-am.
        I   (s)he=acc definitely to-ez  you.sg going ipfv-make.prs-1sg
        ‘I will definitely send him/her to you.’
        (#Film1)

– haqiqatan ‘truly’

This adverb (borrowed from Arabic) was not attested in our fieldwork data, but it
was mentioned in a source, cited as (27) below. In the same way as hatman, this
adverb can be used to imply modality or to intensify the proposition.
(27) haqiqatan, in   utoq xele mayda me-boš-ad.
     truly      this room very small ipfv-be.prs-3sg
     ‘As a matter of fact, the room is too small.’
     (Conroy and Shukurov 1998: 106)
– be-šubha ‘undoubtedly’


This is an adverb very close to the English ‘undoubtedly’ and Persian bi-šak and
bi-šobhe, composed of the preposition be- ‘without’ and the borrowed Arabic
noun šubha ‘doubt’.
(28) be-šubha,     ū     az   ūhda-i         in   kor  me-bar-oy-ad.
     without-doubt (s)he from undertaking-ez this task ipfv-up-come.prs-3sg
     ‘Undoubtedly, he will rise to the occasion’
     (‘Undoubtedly, he can do it’.)
     (Baizoyev and Hayward 2004: 444)

4.2.2 Approximation
– taxminan/taqriban/maqruban/qarib/qaribat ‘approximately, almost, nearly’

These are sentential adverbs, usually employed at the beginning of the clause (cf.
Baizoyev and Hayward 2004: 138–141, 527; Ido 2005: 41). Taqriban, maqruban,
and taxminan consist of an originally Arabic noun (taxmin, maqrub and taqrib
‘conjecture, guess’) and the suffix -an, which changes the noun to an adverb in
such cases as (29a–b). Besides, qarib is an adjective in Arabic, but in (29c), it
functions as an adverb. The adverb qaribat, as in (29d), could have been qaribatan, from which the last part has been dropped. As the examples suggest, these
adverbs can be used in indicative contexts, and they do not collocate with subjunctive mood, unless there is another modal element, such as boyad, for which
the dependent verb can be subjunctive (29e).
(29) a. rost     rav-Ø,     taxminan      ba”d  az   sesad metr  dar taraf-i rost  me-bin-ī.
        straight go.prs-2sg approximately after from 300   meter on  side-ez right ipfv-see.prs-2sg
        ‘Go straight, and after about 300m you’ll see it on your right.’
        (Baizoyev and Hayward 2004: 141)
     b. taqriban bist=o     šeš sola      hast-am.
        almost   twenty=and six years.old be.prs-1sg
        ‘I am almost 26 years old.’
        (#Dushanbe)
     c. qarib         har   roz me-bin-am=aš.
        approximately every day ipfv-see.prs-1sg=3sg.obj
        ‘I see him/her nearly every day.’
        (#Dushanbe)
     d. qaribat       ham-sin-i   xud=aš   hast-Ø.
        approximately same-age-ez self=3sg be.prs-3sg
        ‘(S)he is about her/his age.’
        (#Dushanbe)
     e. boyad qarib         panjoh kilometr  rost     rav-ed.
        must  approximately fifty  kilometer straight go.prs-2pl
        ‘You must go straight approximately for fifty kilometers.’
        (#Dushanbe)

4.2.3 Necessity
– majburan ‘surely’

The only source in which we could find this adverb was Perry (2005: 334), who classifies majburan among modal adverbials, translating it as ‘must (surely)’. The adverb is
composed of the adjective majbur and the suffix -an, which converts it to an adverb:
(30) ū     majburan halok    šud-a           ast-Ø.
     (s)he surely   perished become.pst-ptcp be.prs-3sg
     ‘He must (surely) have perished’.
     (Perry 2005: 334)
– no-čor, no-iloj, čor-no-čor, xoh=u no-xoh ‘without recourse’

Perry (2005: 333–334) lists these elements as modal adverbials, expressing “necessity, obligation, or more accurately force majeure, qualifying any appropriate VP”.
He translates nočor as ‘have to, without recourse’, no-iloj as ‘obliged’, čor-nočor
as ‘forced’, and xoh=u noxoh as ‘whether he likes it or not’, as in (31a–d). No-čor
is made of the negative morpheme (no-) and the old noun čor, the short form of
čora ‘solution, remedy, cure’ (Nazarzoda et al. 2008: II/560). In Nazarzoda et al.
(2008: I/942), no-čor is a synonym of no-iloj, majbur, and no-guzir, all meaning
‘obliged’, and also a synonym of hatman ‘definitely’. It may also mean ‘helplessly,
necessarily, having no other way, without recourse’ as its equivalent (nāčār) in
Persian (Anvari 2002: 7609).
(31) a. man no-čor       sabr kard-am.
        I   not-recourse wait do.pst-1sg
        ‘I had to wait.’ [‘I waited without recourse.’]
        (Perry 2005: 333)
     b. no-iloj      yak asp-i    nağz-i  xud=ro   peškaš  dod-a         ast-Ø.
        not-recourse one horse-ez good-ez self=acc tribute give.pst-ptcp be.prs-3sg
        ‘He was obliged to give one of his good horses as tribute’.
        (Perry 2005: 333)
     c. čor-no-čor            rozī      šud-a           ast-Ø.
        recourse-not-recourse convinced become.pst-ptcp be.prs-3sg
        ‘He was forced to accept.’ [‘He agreed perforce.’]
        (Perry 2005: 333–334)
     d. xoh=u        no-xoh       ba tasvir-i     zamon . . . me-pardoz-ad.
        want.prs=and not-want.prs to depicting-ez time  . . . ipfv-deal.prs-3sg
        ‘Whether he likes it or not/willy-nilly sets himself to depict [his own] time.’
        (Perry 2005: 334)

4.3 Modal adjectives
Kamp and Partee (1995) classify adjectives into two categories, which they call subsective and non-subsective. Adjectives of the former category combine with nouns,
and the combination refers to a subset of the referents of the noun modified, e.g. a
big house is a type of house. The latter category of adjectives modifies nouns as well,
but the combination is not a type of the noun at the reference time, e.g. a possible
solution is not a solution at the reference time. Modal adjectives are non-subsective
adjectives; they are intensional and refer to the speaker’s attitude towards the SoA. In
contrast to modal adverbs, there are negative modal adjectives, such as improbable in
English (and qeir-e momken ‘non-ez possible’ in Persian). According to Bellert (1977:
345), “modal adjectives are predicates over the fact, event, or state of affairs referred
to by the sentence, and sentences with modal adjectives express one complex proposition”. In her definition, sentences such as (32a), with a modal adverb, express two
propositions (‘being probable’ and ‘John will come’), while (32b), the corresponding sentence with a modal adjective, expresses one proposition (‘John will come’):
(32) a. It is probably true that John will come.
     b. It is probable that John will come.
However, some studies (such as Lang 1979, mentioned by Nuyts 1993: 936) argue that
modal adjectives are part of the proposition, while modal adverbs can only demonstrate
the speaker’s attitude toward the proposition. Van Linden (2012: 3) divides non-epistemic modal adjectives into two sets, “namely weak and strong adjectives”. She
(2012: 47) adds that strong adjectives, such as essential in English (and zaruri ‘essential’ in Persian), “express a stronger degree of desirability than weak adjectives such
as proper” in English (and monāseb in Persian). She argues (2012: 49) that weak adjectives “are conceived of as unbounded: they are not associated with a boundary, but
represent a range on a scale . . . they are fully gradable in that they occur in the comparative and superlative. In addition, they combine with scalar degree modifiers”.
Perry (2005: 340) considers compound adjectives such as xurdani ‘edible’ as
modal adjectives, claiming that they involve a notion of “capable of being [done]”
(cf. Ilkhānipour 2013: 54–55 for similar examples in Persian). Rejecting this claim,
Koohkan (2019: 310–314) investigates these adjectives in different contexts and argues
that it is plausible to assume that the sense of being capable is inherently in the edible
object. It is not the speaker who estimates the capability of being edible or breakable;
rather the edible or breakable objects already have these features: a glass, desirable to
the speaker or not, is breakable and not edible. If we consider these elements as modal
adjectives, then lots of everyday words and expressions can be grouped within the
scope of modality. Consider the sentence ‘the chocolate is too bitter’. On the one hand,
it is an inherent property of chocolate to be bitter. On the other hand, it might be bitter
for one speaker and normal for others. Stating ‘it’s too bitter’, then, is the speaker’s
idea and stance about the chocolate, and therefore, very close to the spirit of modality.
But we never group them under the category of modality, since bitterness, as a flavor,
is the feature of a 98% dark chocolate, not an estimation of the speaker about it.
In our interviews, questionnaire and fieldwork data, we targeted the adjectives
meaning ‘possible, necessary, probable, essential, certain, definite, compulsory
and obligatory’. The results showed that mumkin ‘probable, possible’, vojib
‘compulsory’, and zarur(i) ‘necessary’ are highly frequent. In the literature, though,
lozim ‘necessary’, qodir ‘able, capable’, ma”qul ‘reasonable, sensible, acceptable, pleasing’, majbur ‘compelled, forced, obliged’, and the controversial darkor
‘needed, necessary’ were also introduced as modal adjectives. Like the adverbs,
and for the same reasons, we investigate these adjectives under three subsections:
‘assumption’, ‘necessity’, and a general category named ‘other concepts’.

4.3.1 Assumption
– mumkin ‘possible’

According to Perry (2005: 337), mumkin ‘possible’ is used to “introduce sentential
complements”. In addition to this meaning, Olson (1994: 64) identifies another
meaning for the combination of this adjective and the third person singular form
of the verb budan ‘to be’ in the present and past tenses, viz. ‘it is allowed’. In
our fieldwork data, mumkin is frequently employed instead of šoyad. Baizoyev
and Hayward (2004: 144) support our observation, reporting that in Colloquial Tajik, mumkin is used instead of šoyad. Khojayori and Thompson (2009:
128) assert that the present form of budan, i.e., ast (and the clitic form =a), is
omitted in spoken Tajik. Mumkin is followed by a subordinate clause, which can be
introduced either with or without the complementizer ki ‘that’ (Rzehak 1999: 52).
The dependent verb might be in the present or past subjunctive (33a–b) or the
imperfective present tense (33c).
(33) a. na-me-don-am,         mumkin   pagoh    bi-ra-m         peš=aš.
        neg-ipfv-know.prs-1sg possible tomorrow sbjv-go.prs-1sg near=3sg
        ‘I don’t know, maybe tomorrow I’ll go to him/her.’
        (#Dushanbe)
     b. mumkin=a     raft-a      boš-a      donišgo.
        possible=3sg go.pst-ptcp be.prs-3sg university
        ‘It is possible that (s)he has gone to the university.’
        (#Dushanbe)
     c. mumkin=a     ki   bi tūy=iš           fikr  me-kon-a.
        possible=3sg that to wedding=3sg.poss think ipfv-do.prs-3sg
        ‘It is possible/maybe (s)he is thinking about her/his wedding.’
        (#Dushanbe)
In Colloquial Tajik, it is possible to have the infinitive form of the main verb at the
beginning of the sentence before mumkin, to make an impersonal construction
(Rzehak 1999: 52). (34a) illustrates this construction in present/future and (34b)
in past tense:
(34) a. na-raftan=aton      mumkin   ne.23
        neg-go.inf=2pl.poss possible neg
        ‘It’s impossible for you not to go.’
        (Perry 2005: 337)
     b. bar-vaqt ba onjo  rasidan    mumkin   na-bud-Ø.
        on-time  to there arrive.inf possible neg-be.pst-3sg
        ‘Arriving there on time was not possible.’
        (Rzehak 1999: 52)

23 The copula budan ‘be’ (with a default interpretation of present tense, i.e. ast) has been omitted from this sentence (and also in [59f]). The form ne is comparable to ne-st-Ø (as in [25b]), in
which the zero agreement refers to the third person singular. In Classical Persian, it is common
for ni (as an allomorph of the negative marker) to host other agreement markers as clitics, to
make ni=yam ‘I am not’, ni=yi ‘You are not’, ni=yim ‘We are not’, ni=yid ‘You are not’, ni=yand
‘They are not’ (cf. Ahmadi-Givi 2001: 1271 for the examples).

In the examples (33a–c) and (34a–b), there is an estimation of the first argument
participant towards the SoA: in his/her perspective, the action is either probable
or improbable. Therefore, they all indicate modality. Olson (1994: 64), however,
provides an example, in a conversation about visiting a zoo, in which he translates mumkin nest as ‘it is not allowed’:
(35) Speaker 1: tu     ba palang yagon čiz   dod-ī ?
                you.sg to tiger  any   thing give.pst-2sg
                ‘Did you give the tiger anything?’
     Speaker 2: ne, mumkin   ne-st-Ø.
                No, possible neg-be.prs-3sg
                ‘No, it is not allowed.’
                (Olson 1994: 64)

Modal elements are famous for being polysemous. It is no surprise then that
a modal expresses a non-modal meaning. The choice is between an epistemic
modality and a permission reading, as a directive notion.

4.3.2 Necessity
– darkor ‘necessary, needed, probably’

A modal word for Nazarzoda et al. (2008: 415), an adverb for Olson (1994: 50), an
adjective for Perry (2005: 331) in its combination with the copula budan ‘to be’,
and a circumlocution for Windfuhr and Perry (2009: 491), darkor is translated as
‘necessary, needed’ (Baizoyev and Hayward 2004; Ido 2007; Nazarzoda et al. 2008;
Olson 1994; Perry 2005), ‘probably, possibly’ (Rzehak 1999: 52) and ‘in the act,
appropriate’ (Windfuhr and Perry 2009: 491). It is also a synonym for lozim and a
substitute for boyad. The adjective darkor sketches “self-assessed need as distinct
from imposed obligation” (Perry 2005: 331), and it usually comes at the end of
the sentence (Perry 2005: 334; Rzehak 1999: 52). The examples in (36a–c) express
necessity and need, and (37) illustrates probability. The subject of this predicative
adjective may be nominal (as in 36a–b), infinitival (36c), or clausal (37). In the
infinitival usage, the pronominal clitic =am (called possessive suffix by Baizoyev
and Hayward 2004: 148) has attached to the infinitive raftan to refer to the agent.

(36) a. Rostam darkor.
        Rostam needed
        ‘Rostam is needed.’
        (#Film2)
     b. čand     to     kitob darkor ast-Ø?
        how.many number book  needed be.prs-3sg
        ‘How many books are needed?’
        (Olson 1994: 50)
     c. soat-i  du  raftan=am       darkor.
        hour-ez two go.inf=1sg.poss necessary
        ‘I need to go at two o’clock.’ [Lit.: ‘It is necessary for me to go at two o’clock’]
        (Ido 2007: 20)
(37) onho dar roh boš-and    darkor.
     they in  way be.prs-3pl probably
     ‘They must be on the way.’ [‘It’s probable that they are on the way.’]
     (Perry 2005: 334)
– vojib, lozim, zarur, farz ‘necessary, needed’

Like mumkin and darkor, lozim can either stand alone or it may be used with the
present or past form of the copula budan ‘to be’, which cannot be omitted in the
past tense. As in the case of darkor, the main verb of these modals may appear
in infinitival form, and its agent is marked with pronominal (possessive) clitics
(Khojayori and Thompson 2009: 129). In Colloquial Tajik, lozim and darkor can
substitute for the auxiliary boyad (Baizoyev and Hayward 2004: 148). (38a) shows
the use of lozim with the present tense, and (38b) illustrates the past tense of the
modal predicate with an infinitive.
(38) a. ba man tabib-e     lozim ast-Ø      ki
        to I   doctor-indf need  be.prs-3sg that
        ‘I need a doctor who . . .’
        (Conroy and Shukurov 1998: 136)
     b. ba xona raftan=aš       lozim     bud.
        to home go.inf=3sg.poss necessary be.pst-3sg
        ‘He had to/needed to go home.’ [‘His going home was necessary.’]
        (Perry 2005: 332)
Zarur is also listed as a modal adjective by Perry (2005: 331). Along with lozim and
darkor, zarur(i) indicates an internal need or necessity rather than an external
obligation:


(39) ruboi-ho=ro   judo     kardan-i  mo zarur.
     ruba’i-pl=acc separate do.inf-ez we necessary
     ‘We must separate the ruba’is.’ [‘Our separating the ruba’is [is] necessary.’]
     (Perry 2005: 331)
– majbur ‘forced’

Unlike the adjectives above, majbur ‘forced’ expresses external obligation
imposed by an external force (Khojayori and Thompson 2009: 128; Perry 2005:
332). The adjectives vojib and farz ‘essential’, which were attested in our fieldwork
data, can show the same range of meanings. Perry (2005: 331) defines majbur
as an adjective, meaning ‘obliged, forced’, which modifies the agent, while Khojayori and Thompson (2009: 128) treat it as an auxiliary, accompanied by the
clitics for present and the copula budan for past tense to mean ‘to be obliged,
compelled, forced’.
(40) a. onho majbur=and taslim      šav-and.
        they forced=3pl surrendered become.prs-3pl
        ‘They are being forced to surrender.’ (‘They have to surrender.’)
        (Khojayori and Thompson 2009: 128)
     b. ota, šumo   majbur bod-i      ki   bor-e  man mošin be-xar-en.
        dad  you.pl forced be.pst-2sg that for-ez I   car   sbjv-buy.prs-2pl
        ‘Dad, you had to buy me a car.’
        (#Dushanbe)

4.3.3 Other concepts
– qodir ‘able’; ma”qul ‘acceptable’; ravo ‘admissible’

Qodir is an adjective meaning ‘able, capable’, used instead of tavonistan “in more
formal style” (Perry 2005: 340):
(41) ū     ba ijro-i        in   kor  qodir   ast-Ø.
     (s)he in performing-ez this task capable be.prs-3sg
     ‘He is capable of carrying out this task.’
     (Perry 2005: 340)


Ma”qul ‘reasonable, sensible, acceptable, pleasing’ may precede the third person
singular form of budan in the present tense (ast) or participate in a complex predicate ma”qul šudan/aftidan ‘to like (it)’ (Perry 2005: 138):
(42) in   fikr-i     šumo   ba man on   qadar  ma”qul     ne-st-Ø.
     this thought-ez you.pl to I   that amount acceptable neg-be.prs-3sg
     ‘This idea of yours doesn’t seem all that good to me.’ [‘I don’t much like this idea of yours.’]
     (Perry 2005: 157)
Perry (2005: 258) classifies ravo ‘permissible’ as a verbal adjective that can express
moral quality in attributive and predicative constructions with a form of the
copula budan. He presents the following example:
(43) ravo        ne-st-Ø        (raftan).
     permissible neg-be.prs-3sg go.inf
     ‘It is not permissible (to go).’ [Lit.: ‘It is not morally acceptable (to go).’]
     (Perry 2005: 258)

4.4 Modal nouns
In linguistic investigations, modal nouns are rarely addressed, and among the
language users, they are less frequent, compared to other modal elements. Modal
nouns can be modified by adjectives, and replace modal adverbs. Jacobson (1982:
63) classifies nouns such as ‘ability, capability, capacity, compulsion, consent,
decision, demand, desirability, determination, impossibility, inability, intention,
necessity, need, permission, possibility, probability, refusal, request, willingness, wish, etc.’ as modal nouns. In predicative constructions, the modal nouns
require a that-clause. In these constructions, the that-clause “reports a proposition” and the modal noun “reports the author’s stance towards that proposition”
(Biber et al. 1999: 648). Modal nouns, in a similar way to other modal elements,
involve the participants and the extent to which they are committed to the SoA
expressed in that-clause (Nuyts 2005: 17).
The most frequent use of modal nouns in Iranian languages is their role in
constructing complex predicates, in combination with light verbs. In our fieldwork data, we targeted the nouns which meant ‘probability, possibility, necessity, permission’ in Tajik, including éhtimol ‘probability’ and imkon ‘possibility’,
ruxsat ‘permission’, man” ‘prohibition’, and zarurat ‘necessity’, all borrowed from
Arabic. In what follows, we introduce these nouns in two subsections: assumption and necessity. However, in some Tajik contexts that contain these nouns
(including their predicative role in combination with the verb budan and also
with the third person singular clitic =a), the adjectival reading is also possible. In
(45) and (46), these elements are translated as adjectives.

4.4.1 Assumption
– éhtimol ‘probability’

Baizoyev and Hayward (2004: 444) and Rzehak (1999: 52) both treat éhtimol as a modal
word or construction meaning ‘probably, maybe, perhaps’ without distinguishing
its grammatical functions. This is probably due to the multiple roles of
éhtimol in different constructions. In addition to its role in constructing the adverbial (ba) éhtimol, as a modal noun it can either be used with the copula ‘to be’
in a predicative form such as éhtimol ast ‘it is probable’, followed by a subordinate clause, with or without ki (Rzehak 1999: 52), or act as a coverb in a complex
predicate. However, in none of our sources, neither the literature nor our fieldwork
data, could we find the éhtimol ast construction. The use of éhtimol alone is very
common, but as shown above, it seems that this is the adverb (ba) éhtimol whose
preposition has been omitted.
– imkon/imkoniyat ‘possibility’

Imkon and its less frequent “extension” imkoniyat are used in a main clause with
budan which requires a that-clause complement (Perry 2005: 337). However, their
primary role is to participate in constructing the modal complex predicate imkon/
imkoniyat doštan ‘to have the possibility’. In every context we have checked,
our informant was reluctant to use imkon, and he restructured the sentences to
employ mumkin instead. The only source that introduces imkon as a modal noun
and provides an example is Perry (2005: 337). The example he gives is not the
stand-alone noun or its non-finite use with budan. Instead, it is a complex predicate of the noun imkon and the light verb doštan (see Section 4.5). Baizoyev and
Hayward (2004: 326) use the noun imkoniyat with the subjunctive form of budan
to make a request in the following example from a conversation, in which the
speaker asks to arrange a meeting with the ambassador and, when asked for
his preferred date and time, says:

(44) agar imkoniyat   boš-ad,    pagoh    soat-i  dah-i  pagohī.
     if   possibility be.prs-3sg tomorrow hour-ez ten-ez morning
     ‘If possible, tomorrow morning at 10 o’clock.’
     (Baizoyev and Hayward 2004: 326)

4.4.2 Necessity
– zarurat ‘necessity’

Éhtiyoj, hojat, niyoz ‘need’ and luzum (the nominal form of lozim) ‘necessity’ are
all considered as synonyms of zarurat in Nazarzoda et al. (2008: I/513). According
to them, the expression az rūi zarurat means ‘out of necessity’. Although the noun
exists in the language, as seen in Nazarzoda et al. (2008), we could not find any
example for it. This supports our claim that modal nouns are less frequent than
other modal elements.
– ruxsat/ruhsat ‘permission’

The nominal usage means ‘permission’24 (Nazarzoda et al. 2008: II/175), and in
negative predicative constructions, it means ‘be forbidden’ (Conroy and Shukurov 1998: 141). The latter use is modal and therefore of concern to us. (45) is the only
example we could find for the modal usage in a predicative construction. Even in
this example, the permission reading is also possible: ‘It is not allowed that you
smoke in here’.
(45) ruhsat     ne-st-Ø,       ki   šumo   dar injo sigor     čok-ed.
     permission neg-be.prs-3sg that you.pl in  here cigarette smoke.prs-2pl
     ‘It is forbidden for you to smoke here.’ [‘It is not allowed/permitted that you smoke here.’]
     (Conroy and Shukurov 1998: 141)

24 It is also used as a noun in the complex predicate ruxsat dodan ‘to let’ (Perry 2005: 349).

– man”/ma”n ‘prohibition’

Unlike ruxsat, we can unambiguously translate man” with a form of the copula
budan as ‘to be forbidden’. With the light verbs kardan and šudan, man” functions
as a nominal to participate in constructing the complex predicates man” kardan
‘to prohibit, forbid’ and man” šudan ‘to be prohibited, be forbidden’ with a following subjunctive verb in a subordinate clause. (46) represents this noun in the
predicative usage, and (51) shows it in a complex predicate:
(46) ob-bozi       kardan ma”n      ast-Ø.
     water-playing do.inf forbidden be.prs-3sg
     ‘Swimming is forbidden.’
     (Conroy and Shukurov 1998: 142)

4.5 Modal lexical verbs
Lexical verbs are among “the most precise and versatile means to express modality” (Salazar and Verdaguer 2009: 210). Verbs such as think, believe, know, feel,
seem, appear, guess, etc., and their equivalents in other languages are among the
modal lexical verbs.
Most Tajik modal verbs that we could identify are complex predicates, also
called coverb constructions. Classically, a complex predicate is composed of more
than one linguistic element. It mainly “involves two constituents: a coverb and a
verb” (Amberber, Baker and Harvey 2010: 13). In Iranian languages, the coverb (or
non-verbal element) is either a noun, an adjective, an adverb, or a prepositional
phrase, plus a light verb which can be a simple or a prefixal verb (cf. Anvari and
Ahmadi-Givi 2010: 25; Dabir-Moghaddam 2006: 90–97), as in fikr kardan ‘to think’
(noun+verb), and nağz didan ‘to prefer’ (adjective+verb) in Tajik. The whole compound has a single meaning; however, its parts can be separated by other elements. The
non-verbal element, in theory, is responsible for the semantic properties of the
predicate, i.e., it carries the core meaning of the construction and expresses the
overall meaning of the complex predicate. The light verb is in charge of grammatical functions via receiving inflectional prefixes (such as negation, aspect, mood
prefixes) to designate the grammatical situation of the whole composition.
Exploring the modal verbs in our fieldwork data, we could identify the following complex predicates in Tajik: fikr kardan ‘to think’, gumon kardan ‘to suppose’,
xayol kardan ‘to think’, éhtimol doštan ‘to have probability’, bovar-i doštan ‘to
believe’, majbur šudan ‘to be obliged to’, man” kardan ‘to forbid (someone)’,
iloj/čora doštan ‘to have a solution, have a way out’, imkon doštan ‘to have possibility,
be possible, be probable’, imkon doštan/imkoniyat doštan ‘to be permitted, have
the possibility’, haq doštan ‘to have the right’, nağz didan ‘to prefer’, behtar donistan/šumurdan ‘to know/consider better’.
Based on their semantic scopes, these verbs are investigated under the headings of
assumption, obligation, and preference in the following subsections.

4.5.1 Assumption
– fikr kardan/gumon kardan/xayol kardan ‘to think, assume’

Fikr kardan ‘to think, assume’ is one of the most frequent modal complex predicates. It has two synonyms: 1) gumon kardan ‘to think, suppose’ (Rastorgueva
1992: 73), ‘to imagine’ (Rastorgueva 1992: 75), or ‘to have an idea’ (Baizoyev and
Hayward 2004: 221); and 2) xayol kardan ‘to think, dream’ (Rastorgueva 1992:
73). All three verbs indicate uncertainty. The verb fikr kardan “in present or past
continuous tense” is “followed by an object clause with its verb in the present
subjunctive” (Khojayori and Thompson 2009: 127). Our materials show that the
complement clause after fikr kardan, gumon kardan and xayol kardan does not
necessarily need a verb in present subjunctive (as in 47a). Other types of subjunctive (47b), indicative (47c) and past tense (47d) are also possible:
(47) a. fikr    me-kon-am       in   tadobir  bar šumo   maqbul   ba-šav-ad.
        thought ipfv-do.prs-1sg this measures to  you.pl accepted sbjv-become.prs-3sg
        ‘I think these measures will be acceptable to you.’
        (#Film2)
     b. fikr    na-kon-Ø       ki   man durūğ guft-a       istod-a        boš-am.
        thought neg-do.prs-2sg that I   lie   say.pst-ptcp stand.pst-ptcp be.prs-1sg
        ‘Don’t think that I am lying.’
        (Rastorgueva 1992: 76)
     c. ba tarz-e    ki   binanda  onho=ro  murd-a       gumon       me-kard-Ø,      xobid-a        bud-and.
        in manner-ez that onlooker they=acc die.pst-ptcp supposition ipfv-do.pst-3sg sleep.pst-ptcp be.pst-3pl
        ‘They were sleeping so [soundly] that an onlooker would think them dead.’
        (Perry 2005: 367)
     d. to     xayol   kard-i     man to=ro      na-šnoxt-am?
        you.sg thought do.pst-2sg I   you.sg=acc neg-know.pst-1sg
        ‘Did you think I didn’t know you?’
        (#Film1)
– éhtimol doštan ‘to have probability’

According to Perry (2005: 335) éhtimol doštan, meaning ‘it is probable, perhaps’
or more accurately ‘to have the probability’, is a “modal idiom” that requires “the
Subjunctive (Present for present or future reference, Past for past)” in the following clause. The frozen third singular forms of this predicate (éhtimol dorad in
present tense, and éhtimol došt in past tense) signify probability. That could be
the reason that Perry calls it a modal idiom. However, following the definition of
the complex predicate, we look at éhtimol doštan as a complex predicate, composed of the nominal éhtimol and the verb doštan.
We could not find any example of this verb in the past tense in the literature.
Our informant also preferred mumkin bud over éhtimol došt. (48) refers to the use
of the verb in the present tense, with a following present subjunctive main verb
(ras-am):
(48) éhtimol     dor-ad       ki   dar ayn-i   vaqt=aš  ba moskva ras-am.
     probability have.prs-3sg that in  same-ez time=3sg to Moscow arrive.prs-1sg
     ‘I will probably get to Moscow just in time.’ (Rzehak 1999: 52)

– bovar-i doštan ‘to believe’

It is composed of the noun bovar-i ‘belief’ and a verb meaning ‘have’. Literally,
the compound means ‘to have belief or faith’. It needs a complement clause with
a verb in the indicative. (49) shows the complement clause with a verb in the
present-future negative form of indicative:
(49) tu     bovar-i     dor-i        ki   pidar=at        na-me-fahm-ad?
     you.sg belief-indf have.prs-2sg that father=2sg.poss neg-ipfv-understand.prs-3sg
     ‘Are you sure/do you believe that/do you think that your father won’t understand?’ (#Film2)

4.5.2 Obligation
– majbur šudan ‘to be obliged to’

This verb represents external obligation and compulsion (Khojayori and Thompson 2009: 128). Majbur šudan requires a complement clause with a subjunctive
verb, unless the verb itself is in the subjunctive form (as in conditionals, such as
agar majbur šav-am ‘if I have to, if I become obliged’). In contrast to other verbs
that can be passivised by a form of the auxiliary šudan, the complex predicates
formed with šudan are either inherently passive (such as majbur šudan) or pertain
to a change of state (e.g. bidor šudan ‘to wake up’).
Khojayori and Thompson (2009: 127) also mention the combination of the
adjective majbur ‘obliged, forced’ with the copula budan in the present (majbur ast)
and with the copular clitics (majbur=am, etc.), treating it as an auxiliary accompanied by a subjunctive main verb. In our definition of complex predicates, majbur budan is an
adjective in a predicative construction. Šudan, on the other hand, is viewed as
a light verb which contributes to the formation of the complex predicate, as in
(50a-b):
(50) a. majbur  šud-Ø          ki   xona rav-ad.
        obliged become.pst-3sg that home go.prs-3sg
        ‘He had to/was obliged to go home.’ (Perry 2005: 332)
     b. ba xona raftan majbur  šud-Ø.
        to home go.inf obliged become.pst-3sg
        ‘He had to/was obliged to go home.’ (Perry 2005: 332)

– man” kardan ‘to forbid (someone)’

It is made of the nominal man” ‘forbidden’ and the light verb kardan ‘to do’, as
in (51) from Spoken Tajik:
(51) ino=ya   man       kun-it,      na-rav-an.
     they=acc forbidden make.prs-2pl neg-go.prs-3pl
     ‘Stop [them] leaving.’ [Lit.: ‘Prevent them, let them not leave.’] (Perry 2005: 349)

– iloj/čora doštan ‘to have a solution, have a way out’

These are composed of the nouns iloj/čora ‘solution’ and the verb doštan ‘to have’.
The combinations are typically employed in negative forms to portray external
forces that leave the participant no way out or no solution:
(52) pagoh    soat-i  davozda boyad ba moskva šav-ad;    digar iloj/čora na-dor-ad.
     tomorrow hour-ez twelve  must  to Moscow go.prs-3sg other solution  neg-have.prs-3sg
     ‘Tomorrow at 12 o’clock he has to go to Moscow; he has no other way.’ (#Film1)

– imkon doštan/imkoniyat doštan ‘to be permitted, have the possibility’

The complex predicate imkon/imkoniyat doštan expresses possibility and permission (Conroy and Shukurov 1998: 105; Perry 2005: 337). It usually appears at
the end of the sentence, while the verb of the complement clause is in infinitival
form.
(53) kas    ba onjo  raftan imkon       na-dor-ad.
     person to there go.inf possibility neg-have.prs-3sg
     ‘One is not permitted to go in there.’ (Conroy and Shukurov 1998: 105)
This predicate is not commonly used in Spoken Tajik, maybe because there is an
alternative, viz. mumkin, which denotes potentiality in the SoA (see Section 5.3.3),
and may also be used to ask for permission to express ‘can I, may I?’ (Baizoyev
and Hayward 2004: 498).
– haq doštan ‘to be entitled to, be permitted to’

The combination means ‘to be entitled to, to be permitted to’ (Conroy and Shukurov 1998: 105) or ‘to have the right’ (Perry 2005: 315). The following clause can
occur with or without ki. Haq doštan, either in the present or past tense, requires
a subjunctive verb in this following clause:
(54) a. kas    haq   na-dor-ad        surud xon-ad.
        person right neg-have.prs-3sg song  sing.prs-3sg
        ‘No one is permitted to sing.’ (Conroy and Shukurov 1998: 105)

     b. haq   došt-ed      ki   ba onjo  na-rav-ed.
        right have.pst-2pl that to there neg-go.prs-2pl
        ‘You were right not to go there.’ [Lit.: ‘You had the right not to go there.’] (#Dushanbe)
4.5.3 Preference
– nağz didan/donistan ‘to like, prefer’

The main meaning of nağz didan is ‘to like’, composed of the adjective nağz ‘well,
good, pleasing’ and the verb didan ‘to see’. However, it has also been translated
as ‘prefer’ by Conroy and Shukurov (1998:119) in the following example:
(55) man duš=ro     nağz me-bin-am.
     I   shower=acc well ipfv-see.prs-1sg
     ‘I prefer a shower.’ (Conroy and Shukurov 1998: 120)

– behtar donistan/šumurdan ‘to prefer’

Behtar ‘better’ is a comparative adjective which, together with the verbs donistan ‘to know’ and
šumurdan ‘to count, consider’, forms a complex predicate meaning ‘to prefer’,
or literally, ‘to consider something better’. In a predicative construction, behtar,
together with the third person singular form of budan ‘to be’ (ast ‘is’ and its
clitic ‘=a’), creates a frozen form that appears at the beginning of the sentence. It
is also possible to omit the copula (Khojayori and Thompson 2009: 127). This predicative construction is more frequent in spoken Tajik, compared to the complex
predicates behtar donistan/šumurdan. While the predicative form requires a
complement clause, the complex predicate can stand alone as the main verb of a
single clause. (56a) below is the predicative use of the adjective, with the omitted
copula. (56b) illustrates the complex predicate:
(56) a. xūrok-ho-i millī    boš-ad,    behtar.
        food-pl-ez national be.prs-3sg better
        ‘National food would be better.’ (Baizoyev and Hayward 2004: 193)
     b. man futbol-bozi=ro    nisbat   ba tenis-bozi  behtar me-šumur-am.
        I   football-play=acc relative to tennis-play better ipfv-count.prs-1sg
        ‘I prefer soccer to tennis.’ (Conroy and Shukurov 1998: 141)

4.6 Modal prepositional phrases
This category includes expressions composed of the preposition ba ‘to’ and the
nouns fikr ‘thought’, gumon ‘assumption’, nazar ‘idea’, xayol ‘imagination’ and
the first person singular possessive clitic =am, to make ‘in my idea’ (ba fikr=am
and ba xayol=am), ‘in my speculation’ (ba gumon=am), and ‘in my opinion’ (ba
nazar=am). They can also be used to ask about somebody’s idea. The verb in the
complement clause can be indicative or subjunctive, both past and present, to
refer to an event in the past, present, or future. (57a–c) show them in sentential
contexts.
(57) a. ba fikr=am          ki   dar iškof-i     apa=aš                boš-ad.
        to thought=1sg.poss that in  cupboard-ez older.sister=3sg.poss be.prs-3sg
        ‘I think it should be in her sister’s cupboard.’ (#Dushanbe)
     b. ba nazar=am      me-raft-an      donišgo.
        to idea=1sg.poss ipfv-go.pst-3pl university
        ‘I think they were going to university.’ (#Dushanbe)
     c. ba xayol=am      to    man na-yo-m          na-me-xob-a.
        to idea=1sg.poss until I   neg-come.prs-1sg neg-ipfv-sleep.prs-3sg
        ‘I think he won’t sleep until I come.’ [Lit.: ‘In my idea, until I don’t come, (s)he won’t sleep.’] (#Dushanbe)

5 Expressing modality in Tajik
Following Nuyts (2005 and subsequent work), we will be concerned with three main types
of modality and many subtypes. In this section, after introducing the exact subcategories of each type of modality and their meanings, we will discuss those
linguistic elements of Tajik that can represent epistemic, deontic, and dynamic
modality. Considering the polysemous nature of modal elements in the language,
especially modal auxiliaries, we expect them to express various types of modality.

5.1 Epistemic modality
Epistemic modality indicates “that a certain hypothetical state of affairs under
consideration . . . will occur, is occurring, or has occurred in a possible world”
(Nuyts 2005: 21). It is the degree of likelihood that the SoA applies or will apply in
reality or not (Nuyts, MS: 82). The term ‘degree’ implies a continuum, going from
the positive pole to mark that the SoA is certainly true, and ending in the negative
pole showing that the SoA is certainly not true. This path from positive side to
negative side goes via the probability, possibility, and improbability of the SoA,
based on the speaker’s evaluation.

5.1.1 SoA is certainly true
The adverbs hatman (25a–b), haqiqatan (example [27] in Section 4.2.3, and [58a]
below), be-šubha (28), and the verb bovari doštan ([49] above and [58b] below)
are used to imply that the first argument participant is certain that the SoA is true:
(58) a. haqiqatan in   xele qimat ast-Ø.
        truly     this very price be.prs-3sg
        ‘In fact, it is far too expensive.’ (Conroy and Shukurov 1998: 80)
     b. bovar-i     dor-am       ki   tu     na-me-guzor-i    ki   man az   tu     xafa         šav-am.
        belief-indf have.prs-1sg that you.sg neg-ipfv-let-2sg that I   from you.sg disappointed become.prs-1sg
        ‘I am certain that you won’t let me be disappointed with you.’ (#Film1)

Be-šubha is morphologically negative (including a negative, derivational prefix
be- ‘without’), nonetheless it does not refer to the negative pole of the epistemic
modality. On the contrary, it points to a strongly positive side of the continuum
(see example 28).

5.1.2 SoA is possible, probable or improbable
The auxiliaries boyad ([7b-c] in Section 4.1.1, and [59a]) and me-šavad ([18] in
Section 4.1.3, and [59b] below), the modal adverbials šoyad, (ba) éhtimol, az aftaš,
taqriban, maqruban, qarib, qaribat, balki, taxminan (examples [21]-[24] and [29] in
Section 4.2.1), the adjectives mumkin (example [33] in Section 4.3.1) and darkor
(example [36b] in Section 4.3.2), the nouns imkon, imkoniyat (example [44] in
Section 4.4.1), the lexical verbs éhtimol doštan, imkon/imkoniyat doštan, fikr kardan,
gumon kardan, xayol kardan (Section 4.5), and the modal prepositional phrases
ba fikram, ba nazaram, ba xayolam (Section 4.6) can all denote the middle degree
of epistemic modality. (59a–f) provide some further examples.
(59) a. modar  boyad dar xona boš-ad.²⁵
        mother must  in  home be.prs-3sg
        ‘Mother must be at home.’ (Perry 2005: 334)
     b. ba fikr-i     to     me-šav-ad           saroyanda šav-am?
        in thought-ez you.sg ipfv-become.prs-3sg singer    become.prs-1sg
        ‘In your opinion, is it possible that I become a singer?’ (#Film1)
     c. éhtimol  ra”dubarq   šav-ad.
        probably thunderbolt become.prs-3sg
        ‘Probably, the thunderbolt will strike.’ (Rzehak 1999: 52)
     d. balki xato    kard-agi=m.
        maybe mistake do.pst-conj=1sg
        ‘Maybe, I made a mistake.’ (Perry 2000: 242)
     e. onho dar roh boš-and    darkor.
        they in  way be.prs-3pl probably
        ‘They are probably en route.’ (Rastorgueva 1992: 69)
     f. xayol   me-kun-am       ki   imruz ruz-i  moy ne.
        thought ipfv-do.prs-1sg that today day-ez we  neg
        ‘I think today is not our day.’ (#Film1)

Among the modal lexical verbs mentioned above, Perry (2005: 335) calls éhtimol
dorad a modal idiom that marks possibility. He plausibly uses this expression because the verb is restricted to the third person singular:
other forms (such as ✶éhtimol dor-am/ī in the first and second person singular, and ✶éhtimol dor-em/ed/and in the first, second and third person plural) do not
exist in the language. This is also true of imkon/imkoniyat doštan. Although
Perry does not call this verb a modal idiom, it is also restricted to the third
person singular.26

25 The sentence is ambiguous between deontic and epistemic readings.

It is worth noting that as far as we could see, Tajik has no separate modal item
to represent improbability. Instead, the negated predicative adjectives and nouns
take this function, as well as negative forms of boyad and šudan (i.e. na-boyad
and na-me-šav-ad, respectively), and the verbs éhtimol doštan (as in éhtimol
na-dor-ad), imkon/imkoniyat doštan (as in imkon/imkoniyat na-dor-ad). (60a–b)
illustrate improbability.
(60) a. na-me-š-a               kor-i   maryam boš-a;     un  hamon ruz-o  xona na-bud-Ø.
        neg-ipfv-become.prs-3sg work-ez Maryam be.prs-3sg she same  day-pl home neg-be.pst-3sg
        ‘It is improbable/not possible that Maryam has done it; she was not at home those days.’ (#Dushanbe)
     b. in   hodisa=ro navišt-a       giriftan=am       imkon/imkoniyat na-dor-ad.
        this event=acc write.pst-ptcp take.inf=1sg.poss possibility     neg-have.prs-3sg
        ‘It’s not possible for me to write down this episode.’
        [Lit.: ‘My writing down this episode has no possibility.’] (Perry 2005: 337)

26 In Persian, it is possible to use emkān as a specific object, followed by a possessive clitic =aš/eš.
In this construction, other conjugated forms of the simple verb dāštan are allowed, as in the
following example:
     fe’lan emkān=eš=o               na-dār-am        ke   avaz=eš        kon-am.
     yet    possibility=3sg.poss=acc neg-have.prs-1sg that change=3sg.obj do.prs-1sg
     ‘I don’t have the possibility (or the conditions) to change it yet.’
This construction would neither be called modal idiom nor complex predicate. Rather, the noun
emkān has been used in a main clause. This claim can be supported by changing the main verb
to an existential predicate:
     a. fe’lan emkān=eš             vojud na-dār-e.
        yet    possibility=3sg.poss exist neg-have.prs-3sg
        ‘The possibility does not exist yet.’
     b. fe’lan emkān=eš             ni-st-Ø.
        yet    possibility=3sg.poss neg-be.prs-3sg
        ‘It is not possible yet.’

Adverbs do not play a role in the negative side of the mid-continuum to presume
improbability and impossibility, because: a) they cannot be negated; b) they
do not participate in predicative constructions; therefore, they do not take the
negative copula; and c) modal adverbs do not participate in constructing modal
complex predicates; therefore, they do not employ the negated light verb.

5.1.3 SoA is certainly not true
The negative pole of the epistemic continuum is often produced by negating the
modal elements on the positive side. For the same reason we referred to in the
last paragraph of the previous section, this is not true for the modal adverbs.
Even negating the modal main verbs does not lead to the negative pole. Instead,
it transfers them a step backwards, semantically, to the mid-continuum. Bovari
na-dor-am ki me-rav-ad (belief neg-have.prs-1sg that ipfv-go.prs-3sg), contrary
to what we expect, does not mean ‘I am certain that (s)he won’t go’; rather, it
implies that the speaker does not think, or does not see it possible that (s)he goes.
Negating the verb of the subordinate clause also will be of no help, since
it still shows that the speaker is certain the negative proposition is true. Bovari
dor-am ki na-me-rav-ad (belief have.prs-1sg that neg-ipfv-go.prs-3sg) means
that the speaker (I) is certain that the other participant will not (or does not) go.
The negative form of both the main and subordinate clauses, as bovari na-dor-am
ki na-me-rav-ad (belief neg-have.prs-1sg that neg-ipfv-go.prs-3sg), is close to ‘I
don’t think (I am not sure) that (s)he will not (or does not) go.’
However, there are some inherently negative adjectives that can express the
negative side of the continuum. We did not mention them in section 4, mainly
because we could not find any instance of them. The adjectives mahol, ğayri
mumkin, no-mumkin (Nazarzoda et al. 2008), in their predicative use with the
third person singular clitic, represent that the SoA is certainly not true. Using the
same adjectives in the Persian of Iran might help to understand how they work:
(61) a. mahāl=e        az   in   taraf raft-e      bāš-e.
        impossible=3sg from this way   go.pst-ptcp be.prs-3sg
        ‘It is impossible that he has gone this way.’
        (The speaker is certain that ‘for the participant, going from this way is not true.’)

     b. qeir-e momken=e     tanhā bi-yā-d.
        not-ez possible=3sg alone sbjv-come.prs-3sg
        ‘It is impossible that (s)he comes alone.’
        (The speaker is certain that ‘for the participant, coming alone is not true.’)

5.2 Deontic modality
In Nuyts’ view, deontic modality is an evaluation of the moral acceptability, desirability or necessity of the SoA (Nuyts 2005: 25). This definition again involves
the notion of ‘degree’, which yields a continuum: the absolute moral necessity of the SoA forms the positive pole, and the scale runs via desirability, acceptability, and undesirability, as intermediate levels, to the absolute moral unacceptability of the SoA.
Moral acceptability is judged against the general norms among people or the moral criteria
of the first argument participant, who may be, but is not necessarily, the speaker.

5.2.1 Absolute moral necessity of the SoA
The main linguistic elements to express absolute moral necessity are the auxiliary boyad (examples [6], [7a–c], [8b] and [62]), the adjectives darkor ([36c] and
[63a–b]), vojib, lozim ([38b] and [63c]), zarur(i) ([39] and [63d]), majbur (40b), farz
(63e), and the verb lozim donistan (64). In Nuyts’ perspective, if this necessity is
about moral issues, i.e., if the first argument participant of the SoA believes that
the SoA is morally necessary, we are dealing with the positive side of the continuum of deontic modality.
(62) a. boyad naqše-i pelon=ro baro-i xud  muayyan   soz-em.
        must  map-ez  plan=acc for-ez self specified make.prs-1pl
        ‘We have to specify the plan (of the building) for ourselves.’ (#Film2)
     b. šumo   boyad sahar   ba-xiz-ed            ba-rav-ed.
        you.pl must  morning sbjv-wake.up.prs-2pl sbjv-go.prs-2pl
        ‘You must wake up in the morning to go.’ (#Film1)
     c. vay   ki   korgar ast-Ø      va  tu     ki   dehqon hast-ī,    boyad beštar ošno         šav-ed.
        (s)he that worker be.prs-3sg and you.sg that farmer be.prs-2sg must  more   acquaintance become.prs-2pl
        ‘He who is a worker and you who are a farmer must get better acquainted.’ (Perry 2005: 408)

(63) a. aknun čora-i      in=ro    andišidan=amon     darkor           ast-Ø.
        now   solution-ez this=acc think.inf=1pl.poss necessary/needed be.prs-3sg
        ‘Now, we have to find a way to solve this.’ (Perry 2005: 254)
     b. kamtar dam    giriftan darkor.
        little breath take.inf necessary
        ‘Resting a little is necessary.’ (#Dushanbe)
     c. on=ro    mo misl-i  asar,         misl-i  ša”r xub  ijod kardan lozim.
        that=acc we like-ez literary.work like-ez poem good make do.inf necessary
        ‘Like a literary work, like a poem, we have to make it fine.’ (#Film2)
     d. zarur     ast-Ø      ki   man ham  kor  došt-a        boš-am.
        necessary be.prs-3sg that I   also work have.pst-ptcp be.prs-1sg
        ‘It is necessary that I have a job, too.’ (#Dushanbe)
     e. xondan-i    namoz  va  rūza    giriftan dar islom farz       ast-Ø.
        read.inf-ez prayer and fasting take.inf in  Islam obligatory be.prs-3sg
        ‘Saying prayers and fasting is obligatory in Islam.’ (#Dushanbe)
(64) tanzim     yoftan-i    tamom-i maktab-ho-i  musulmon=ro lozim     me-don-am.
     organizing find.inf-ez all-ez  school-pl-ez muslim=acc  necessary ipfv-consider.prs-1sg
     ‘I consider it necessary to organize all the Muslim schools.’ (Perry 2005: 255)
In all the examples in (62), the main predicates (i.e., muayyan soxtan ‘to specify’,
az xob xestan ‘to wake up’, raftan ‘to go’, ošno šudan ‘to get acquainted’) express
moral necessity. However, other readings are plausible too: (62a–b) can be directives or participant-imposed dynamic (see Section 5.3.2), or they can even express
desirability or acceptability in deontic modality (see Section 5.2.2). The latter
reading is also plausible for (62c), where ošno šudan is morally acceptable or even desirable. (63a) refers to a necessity that is morally required. This need or necessity might be due to the forces in the situation or be imposed on the participant.
In these readings, they do not express deontic, but dynamic modality. The same is
true of (63b–d). Darkor is a polysemous word, expressing both possibility and
necessity. In addition to the above readings, (63b) might mean that ‘resting a little’ is possible (epistemic) or even allowed (directive). This ambiguity is
weaker in the case of farz (63e) and lozim donistan (64). In the former example, the
necessity concerns a religious issue, a topic that is always connected to morality.
In the latter example, the use of the first person singular ending -am on the verb implies
that, from the speaker’s point of view, the predicate is morally necessary.

5.2.2 Desirability and acceptability
The auxiliaries tavonistan ([11a] and [65a]) and šudan (65b), the adjectives behtar
(in predicative constructions or complex predicates, or with the clitic =aš, as in
[56a–b]), darkor (as one of the readings of [36c]), ma”qul (in complex predicates),
and the verb haq doštan (in affirmative sentences, as in [54b]) can express the
positive side of the deontic modality, i.e. desirability and acceptability based on
the moral issues. These are shown in (65a–b) for the auxiliaries, (66a–d) for the
adjectives, and (67) for the complex predicate haq doštan.
(65) a. to     baro=yaš kor  namud-a-i,      me-ton-i         oila=t=ro           ba-xoh-i;         haq=at         me-boš-ad.²⁷
        you.sg for=3sg  work do.pst-ptcp-2sg ipfv-can.prs-2sg salary=2sg.poss=acc sbjv-want.prs-2sg right=2sg.poss ipfv-be.prs-3sg
        ‘You have worked for him, you can ask for your salary; this is your right.’ (#Dushanbe)
     b. me-šav-a            soro rav-ad     nazd-i xoli=aš.
        ipfv-become.prs-3sg Sara go.prs-3sg to-ez  aunt=3sg.poss
        ‘It is possible for Sara to go to her aunt’s.’ (#Dushanbe)
(66) a. behtar ast-Ø      hič čiz   ma-gu-i.
        better be.prs-3sg any thing neg-say.prs-2sg
        ‘It’s better you don’t say anything.’ (#Film1)

27 This sentence can have still another reading which is participant-imposed dynamic modality
(see Section 5.3.2)

     b. behtar=aš  az   digar kas    purs-ed.
        better=3sg from other person ask.prs-2pl
        ‘You should ask someone else.’ [Lit.: ‘It’s better that you ask someone else.’] (Baizoyev and Hayward 2004: 139)
     c. ma”lum na-bud-Ø       ki   ū  fikr-ho-i     alijon=ro  ma”qul     yoft-a.
        known  neg-be.pst-3sg that he thought-pl-ez Alijon=acc acceptable find.pst-ptcp.3sg
        ‘It was not clear whether he . . . approved of Alijon’s idea [or not].’ (Perry 2005: 307)
     d. imruz yagšanba ast-Ø,     xob   raftan darkor.
        today Sunday   be.prs-3sg sleep go.inf necessary
        ‘Today is Sunday; one must/can/may sleep.’ (#Film1)
(67) amma-i  tu     haq   dor-ad       gumon   kun-ad     ki   man pir šud-a:-m.
     aunt-ez you.sg right have.prs-3sg thought do.prs-3sg that I   old become.pst-ptcp-1sg
     ‘Your aunt is right to think that I’m old.’ [Lit.: ‘Your aunt has the right to think that I became old.’] (#Film3)
In addition to physical ability, the verb tavonistan may also express acceptability in the deontic modality. As (65a) shows, since the first argument participant has been working for the second argument, the speaker considers it
morally acceptable to ask for his salary. This is also true for (11a), in which
one of the readings could be that here it is morally acceptable for the woman
to sing. (65b) is ambiguous between dynamic, directive and deontic meanings. Deontically, it implies that due to the situation, it is morally acceptable
for Sara to go to her aunt’s. Changing these auxiliaries to negative forms will
move the degree of acceptability to the negative side of the continuum (see
Section 5.2.3).
Behtar, either in predicative constructions or complex predicates (with donistan/šumurdan), along with showing the preferences of the participant, may
denote the degree of what is morally acceptable or desirable for the first participant of the SoA (as shown in [66a–b]). Interestingly, in the negative form, the
notion of desirability does not shift to undesirability or unacceptability. Rather, it
refers to the fact that not doing the following predicate is desirable (see example
68). Ma”qul, on the other hand, expresses what is morally and logically acceptable, and negating the construction sends it to the negative pole of the continuum
(see example [70b]). The polysemous modal element darkor, in example (66d),
among other meanings, refers to the fact that sleeping and resting on Sunday, as
a holiday, is morally acceptable. Like ma”qul, in the negative use, it implies that
the action is not desirable (as shown in [70a]).
(68) behtar=aš  na-rav-am.
     better=3sg neg-go.prs-1sg
     ‘I’d better not go.’ (Khojayori and Thompson 2009: 129)

The complex predicate haq doštan ‘to have the right, be entitled, be permitted’ in
affirmative form can denote the acceptability of SoA. Negating the verb expresses
absolute moral unacceptability (see example 71).

5.2.3 Undesirability of the SoA and moral unacceptability of the SoA
The auxiliaries boyad (as shown in [8a–b] and [69a]), tavonistan (69b–c), and
šudan ([17a] and [69c]), the predicative use of the adjectives darkor and ravo, the
noun ruxsat, and the verb haq doštan all in negative sentences, and also man”
(in positive predicative form), imply undesirability or unacceptability of the SoA.
This is shown in (69a–c), (70a–c) and (71):
(69) a. bača=ro       na-boyad dar kūča   hay=aš          kun-i      sar-i poy=aš        bi-mon-a.
        child/boy=acc neg-must in  street leaving=3sg.obj do.prs-2sg on-ez foot=3sg.poss sbjv-stay.prs-3sg
        ‘You shouldn’t leave the child in the street to stay on foot.’ (#Dushanbe)
     b. bo   yagon zan-i    dehoti dar tojikiston gap  zadan   ba rusi    na-me-ton-i.         na-me-ton-i.
        with one   woman-ez rural  in  Tajikistan talk hit.inf in Russian neg-ipfv-can.prs-2sg neg-ipfv-can.prs-2sg
        ‘You cannot speak in Russian with a rural woman in Tajikistan.’ (#Dushanbe)
     c. na-miš-a            / na-me-ton-im         bedun-i    adella-yi    sar=iš       kun-im.
        neg-become.prs-3sg  / neg-ipfv-can.prs-1pl without-ez reasons-indf head=3sg.obj do.prs-1pl
        ‘It is not possible/we cannot dismiss him without any reason.’ (#Dushanbe)

(70) a. ū     ba bimoriston raft-a,     xob   giriftan darkor    ne-st-Ø.
        (s)he to hospital   go.pst-ptcp sleep take.inf necessary neg-be.prs-3sg
        ‘(S)he has gone to the hospital, sleeping is not necessary/permitted/acceptable.’ (#Film1)
     b. in   fikr=at          ba man ma”qul           na-šud-Ø,          aziza.
        this thought=2sg.poss to I   accepted/logical neg-become.pst-3sg Aziza
        ‘Aziza, this idea of yours is not acceptable to me.’ (#Film1)
     c. sigor     čokidan   man”      ast-Ø.
        cigarette smoke.inf forbidden be.prs-3sg
        ‘Smoking is forbidden.’ (Conroy and Shukurov 1998: 142)
(71) kas    haq   na-dor-ad        čiz-e      gir-ad.
     person right neg-have.prs-3sg thing-indf take.prs-3sg
     ‘One is not entitled/permitted to take anything.’ (Conroy and Shukurov 1998: 105)
Utterances of this kind express a degree of the moral unacceptability of the SoA;
however, the degree is lower for (70b), in which Aziza’s idea is not desirable or
acceptable to the speaker. In (70a), (70c) and (71), we can think of a situation in
which the sentences have directive readings.

5.3 Dynamic modality
In Nuyts’ categorization, dynamic modality refers not only to general ability;
it also covers a capacity or need imposed on the participant, known as participant-imposed dynamic, and a capacity or need that is already present in
the SoA itself, called situational dynamic. We define each
subtype of dynamic modality in a separate subsection below and introduce
the modal elements of Tajik that express it.

5.3.1 Participant-inherent dynamic
To Nuyts, participant-inherent dynamic covers capacities/abilities/potentials
and needs/necessities which are fully inherent to the first argument participant
of the SoA (Byloo and Nuyts 2014: 89; Nuyts, MS: 77). In Tajik, the elements to
feature ability are the auxiliary tavonistan and the adjective qodir. They refer to
the ability, capacity or potentials of the participant, while the auxiliary boyad
is responsible for showing those needs and necessities which are inherent to the
participant. Sentences (10a), (11b), and (41) point to ability, and therefore, they
are instances of participant-inherent dynamic modality; i.e., the speaker does
not know how to swim (10a), the participant is able to speak Tajik (11b), and the
participant is capable of performing the task (41). (72) is an instance of inherent
necessity to the participant, indicated by the auxiliary boyad:
(72) man boyad ba”d-i   abed  xob   kun-am     baro-in-ki    sar-dard  me-gir-am.
     I   must  after-ez lunch sleep do.prs-1sg for-this-that head-ache ipfv-take.prs-1sg
     ‘I have to sleep after lunch, or I’ll have a headache.’ (#Dushanbe)
The above sentence shows that sleeping after lunch is an inherent and internal
need for the participant, because if (s)he does not do that, (s)he will have a headache.

5.3.2 Participant-imposed dynamic modality
This type of dynamic modality covers “abilities/potentials and needs/necessities of
the participant that are determined by the circumstances of the state of affairs, hence
may be partly beyond the participant’s power and control” (Nuyts, MS: 75). The three
modal auxiliaries in Tajik, i.e. boyad, tavonistan and šudan, are all capable of featuring participant-imposed dynamic modality. The sentences in (73a–c) illustrate this:
(73) a. baro-in-ki    dar=o    vo   kun-i      avval boyad dar=o    tera kun-i      ba”d kalid=ro be-čarxon-i.
        for-this-that door=acc open do.prs-2sg first must  door=acc push do.prs-2sg then key=acc  sbjv-turn.prs-2sg
        ‘To open the door, you have to push it first, and then turn the key.’ (#Dushanbe)

     b. bo   in   arus            diga    sar=at=o          bardoštan   na-me-ton-i.
        with this daughter-in-law anymore head=2sg.poss=acc pick.up.inf neg-ipfv-can-2sg
        ‘You cannot hold up your head with this daughter-in-law.’ [i.e., she disgraced you.] (#Film2)
     c. har   ruz soat-i  panj onjo=st-Ø;       asr     kofa me-šav-ad           yoft=aš.
        every day hour-ez five there=be.prs-3sg evening café ipfv-become.prs-3sg find.sinf=3sg.obj
        ‘He is there every day at five o’clock; he can be found in the café in the evening.’ (#Dushanbe)
The necessity that is needed for opening the door (73a) is not something inherent to the participant. Instead, it is imposed by the situation of the door. In fact,
there is no other way for the participant to open the door, apart from pushing it
first, and then, turning the key. Therefore, boyad expresses the necessity that is
imposed on the participant. In (73b), there is no ability targeted by the modal
element. Rather, the disgrace that the daughter-in-law of the family has caused
makes it impossible for the participant to be confident anymore. This external
force is also found in (10b), in which, due to the sickness or any other forces that
are not inherent to the participant, the woman cannot give birth to a baby. (12a)
points to an external cause (not raining) that makes the participant able to go to
the bazaar. In (73c), the only place where it is possible to find the absent participant is the café, because this is the place where he usually goes in the evenings.
The sentence adverbials no-čor, no-iloj, čor-no-čor, xoh=u no-xoh also can
point to the external necessity forced on the participant. Sentences (31a–d) represent external sources that force the participant to wait (31a), to give one of his
horses (31b), to accept something because he had no other way (31c) or to depict
time (31d).
The nouns iloj ‘solution, cure, remedy, treatment’ and čora ‘solution’, which form
the complex predicates iloj/čora doštan ‘to have a solution or cure’, are used in
negative forms to show the external forces that leave the participant with no other
way or solution. This is shown in (52) and (74a–b). In (52), because the participant
must be in Moscow at a specific time, which is an external force, (s)he does not have
any other way. In (74a), due to some external forces, probably the need to meet basic
needs, the speaker has no other way but to work at the other participant’s house.


(74) a. xona-i  šumo   kor  kun-am,    digar iloj     na-dor-am.
        home-ez you.pl work do.prs-1sg other solution neg-have.prs-1sg
        ‘I (will) work in your home; I have no other way.’
        (#Film2)
     b. juz    girya va  zorī    digar     čora-e        na-došt-em.
        except cry   and weeping otherwise solution-indf neg-have.pst-1pl
        ‘We had no recourse but to weep and wail.’
        (Perry 2005: 91)

5.3.3 Situational dynamic
The ability or necessity in a sentence containing dynamic modality is not necessarily
inherent to or imposed on the participant. In situational dynamic, the potential
possibility or inherent necessity resides in the situation or the SoA itself, not in
the participant, and is not imposed by others. This is what van der Auwera and Plungian
(1998) call participant-external modality, and what Nuyts labels situational dynamic.
This type of modality appears in clauses where there is no participant, or where the
participant is inanimate, or animate but implicit (Byloo and Nuyts 2014: 89). Alongside
their other readings, the modals boyad, tavonistan, and šudan and the predicative
adjectives mumkin, lozim, zarur(i), and darkor (63b) can also convey need or
possibility in situational dynamic. This is shown in (75a–e):
(75) a. me-ton-i         rav-i      dar un   ğor, liken avval boyad
        ipfv-can.prs-2sg go.prs-2sg in  that cave but   first must
        az   on   kūh      bi-gzar-i.
        from that mountain sbjv-pass.prs-2sg
        ‘You can go to that cave, but first you have to pass that mountain.’
        (#Dushanbe)
     b. vobasta   ba xonavoda čand namuna sardard  me-šav-ad.
        dependent to family   some type   headache ipfv-become.prs-3sg
        ‘Depending on the family [i.e. its genetics], there can be different types
        of headache.’
        (#Dushanbe)
     c. ya  ruz na ya  ruz boyad ba-mēr-em.
        one day no one day must  sbjv-die.prs-1pl
        ‘Finally (if not today, the other day), we have to die.’
        (#Dushanbe)
     d. ba xona raftan=aš       lozim     bud-Ø.
        to home go.inf=3sg.poss necessary be.pst-3sg
        ‘He had to/needed to go home.’ [‘His going home was necessary.’]
        (Perry 2005: 332)
     e. afšin  dar xona=y?  yak kor-i   zaruri    dor-am=ša.
        Afshin in  home=3sg one task-ez necessary have.prs-1sg=3sg.obj
        ‘Is Afshin at home? I have something important to do (with him).’
        (#Film2)
(75a) refers to a potential that already exists in the SoA: ‘going to the cave’ is
potentially possible, and there is only one way to do it, namely to ‘pass the
mountain’. The sentence also has a directive meaning, referring to the fact that
the addressee is allowed to go to the cave. In (75b), a genetic potential gives rise
to different types of headaches. On the other hand, the examples in (75c–e) all
feature a necessity inherent in the SoA itself. In (75c), the necessity lies in dying,
which is the SoA itself. In (75d), external forces may make it necessary for the
participant to go home (hence a participant-imposed reading), but it is also possible
to think of a situation in which the participant was late or had finished his/her job;
in that case, ‘(s)he had to go home’ carries a situational nuance. Finally, the
necessity that the speaker sees in the task itself in (75e) makes it situational
dynamic. The sentence cannot have a deontic meaning, since no moral issue is at stake
here.
Table 2 summarizes modal elements of Tajik and their semantic scope:

Table 2: Expressing modality in Tajik.

Dynamic and deontic modality
Subtypes: participant-inherent dynamic; participant-imposed dynamic modality;
  situational dynamic; absolute moral necessity of the SoA; desirability and
  acceptability; undesirability of the SoA and moral unacceptability of the SoA
Adjectives: mumkin, lozim, zarur(i), darkor; qodir; darkor, ravo; behtar, darkor,
  ma”qul; darkor, vojib, lozim, zarur, majbur, farz
Nouns: ruxsat, man”
Adverbs: no-čor, no-iloj, čor-no-čor, xoh=u no-xoh
Verbs: iloj/čora doštan; haq doštan; lozim donistan
Prepositionals: –
Auxiliaries: boyad, tavonistan, šudan; tavonistan; boyad, tavonistan, šudan;
  tavonistan, šudan; boyad

Epistemic modality
Subtypes: SoA is certainly true; SoA is possible, probable or improbable; SoA is
  certainly not true
Adjectives: mahol, qayri mumkin, no-mumkin
Nouns: –
Adverbs: hatman, haqiqatan, be-šubha; šoyad, (ba) éhtimol, az aftaš, taqriban,
  maqruban, qarib, qaribat, balki, taxminan
Verbs: bovari doštan; éhtimol doštan, imkon/imkoniyat doštan, fikr kardan, gumon
  kardan, xayol kardan
Prepositionals: ba fikram, ba nazaram, ba xayolam
Auxiliaries: boyad, šudan

6 Tentative categories of mood: Discussion
As mentioned in Section 2, a mood distinction can be recognized if, and only
if, morphological evidence supports the existence of that distinction in the
grammar of the language. In other words, the mood system is defined on the basis of
morphological oppositions, not semantic criteria. Therefore, regardless of
the categorizations of mood in Tajik grammars (as discussed in Section 3), we will
check and discuss the existence of these morphological oppositions in Tajik, and
at some points our discussion turns out to be inconclusive. For this reason, we
propose the label tentative categories for the mood system, which implies
that these distinctions require further investigation in Tajik. In the following
subsections, we will address four distinctions of mood (indicative, imperative,
subjunctive, and conjectural) and show the extent to which each one is supported
morphologically.

6.1 Indicative
In Tajik, a range of verbal forms and constructions are included within the broad
category of indicative. These include the present imperfective (76a), past
imperfective (76b), past perfective (76c), present progressive (76d), and past
progressive (76e), as well as a number of less frequent forms and constructions
(cf. Perry 2005: 178 for an overview of all indicatives in a single table).
(76) a. apa=t                 har   ruz kojā  me-rav-a?
        older.sister=2sg.poss every day where ipfv-go.prs-3sg
        ‘Where does your older sister go every day?’
        (#Dushanbe)
     b. sol-i   piš  dar xojan   zindigi me-kard-am.
        year-ez last in  Khojand living  ipfv-do.pst-1sg
        ‘Last year, I was living in Khojand.’
        (#Dushanbe)
     c. az   sahar   xest-am,     raft-am    dānišga.
        from morning rise.pst-1sg go.pst-1sg university
        ‘I got up in the morning, and I went to the university.’
        (#Dushanbe)
     d. man kor  kard-a      istod-a=am.
        I   work do.pst-ptcp stand.pst-ptcp=1sg
        ‘I’m working.’
        (#Dushanbe)
     e. mo dars   xond-a        istod-a        bud-em,    ki
        we lesson read.pst-ptcp stand.pst-ptcp be.pst-1pl that
        az   berun   ovoz-i   girya-i   kūdak ba gūš rasid-Ø.
        from outside sound-ez crying-ez child to ear arrive.pst-3sg
        ‘We were studying the lesson, when we heard the sound of a child’s cry
        from outside.’
        (Baizoyev and Hayward 2004: 256)
As observed in examples (76a–e), the indicative mood has no dedicated marker, and
it can even be claimed that these instantiations have nothing in common at the
formal level. Therefore, the category of indicative cannot be defined formally by
itself, unless it is contrasted with some other forms and constructions. In fact,
even if we call the indicative the unmarked or “common” mood (the latter term from
Perry 2005: 179), a formal opposition to some marked mood(s) should be established
in the grammar. Specifically, it is noteworthy that me- (as used in [76a] and [76b])
cannot establish a general, formal opposition in the mood system; therefore, it is
not an indicative marker. Although it is theoretically possible for two grammatical
morphemes to be realized as a single portmanteau morph, Tajik me- cannot be an
instance of this phenomenon; i.e., it cannot be considered a portmanteau
aspect-mood marker.
An alternative analysis is that in the Tajik present and past imperfectives (76a–b),
me- represents aspect and mood simultaneously, but other indicative forms and
constructions (including [76c–e], as well as some others) lack a mood marker.
Some grammarians seem to tend towards this analysis, though without using
theoretical terms and notions. For example, Windfuhr and Perry (2009: 451) believe
that “with stem I forms [=the present tense forms], mi-/me- distinguishes
present/future indicative from subjunctive/optative”, calling mi-/me- an
“imperfective marker” of stem I and stem II forms (=present and past, respectively).
In a similar way, Paul (2019: 584) gives the label “present indicative, durative”
to mi-, and assigns the expression of “indicative present tense, imperfect” to all
of its dialectal allomorphs (mi, [ha]mē, mē, me) (Paul 2019: 604). As a final
example, according to Rzehak (1999: 17), the prefix me- forms the indicative in
the present tense.
However, although me- establishes a partial opposition with bi- for imperfective
forms (see Section 6.3), the non-generalized nature of this analysis (i.e., its
partial coverage of the indicative forms) weakens it as a hypothesis. In the next
section, we will look at the existence of the indicative category from another
perspective: its possible opposition to the imperative forms.

6.2 Imperative
We start our discussion in this section with a typological generalization from Timberlake (2007: 326), which explains why we concern ourselves with imperatives
before subjunctives: “a distinction of at least imperative as opposed to realis, or
indicative, mood is nearly universal”. In order to check this generalization against
Tajik data, first consider the following examples of imperative:
(77) a. in   seb=ro    gir-Ø.
        this apple=acc take.prs-2sg
        ‘Take this apple.’
        (Rzehak 1999: 33)
     b. ba xona rav-ed.
        to home go.prs-2pl
        ‘Go home.’
        (Baizoyev and Hayward 2004: 63)
     c. šumo   gap  zad-a        šin=eton.
        you.pl talk hit.pst-ptcp sit.prs=2pl
        ‘You sit and chat.’
        (Perry 2005: 240)
(78) a. injo bi-yo-Ø.
        here imp-come.prs-2sg
        ‘Come here.’
        (Baizoyev and Hayward 2004: 236)
     b. yak salat-i  naǧz ham  bi-yor-ed.
        one salad-ez good also imp-bring.prs-2pl
        ‘Also bring a good salad.’
        (Baizoyev and Hayward 2004: 194)
The imperative gir in (77a) is the bare form of the verb without any inflectional
affix (neither an aspect or mood prefix, nor a tense or agreement suffix), and it
is used by the speaker to address a singular hearer. The verb raved in (77b),
consisting of the present stem plus an agreement suffix, is used either to address a
number of people (i.e., a plural addressee) or to address a singular hearer in a
polite way; therefore, it is ambiguous in its reference. The singular reference is,
in fact, a pragmatic extension of the morphological, plural usage. Finally, (77c)
is used in colloquial speech as an alternative to (77b), providing “an explicit
plural” (i.e. it refers to a plural addressee), though “exceptionally, this may be
used for emphasis in a (familiar) singular” (Perry 2005: 240). The ending =eton is an
allomorph of the clitic pronoun of the second person plural =aton28 (cf. Rzehak
1999: 33, 43–45). It is added to the present stem immediately, or after the regular
second person plural suffix (-ed). This latter form has been mentioned by Perry
(2005: 194, 196, 240) in “Perso-Arabic script”, without specifying whether it differs
from the former phonologically or is just a morphological analysis for the
development of the letter form.

28 The full paradigm of the pronominal clitics in Tajik consists of =am (1sg), =at
(2sg), =aš (3sg), =amon (1pl), =aton (2pl), =ašon (3pl). These forms can function as
possessive/genitive and (in)direct object, as well as the agreement marker in an
impersonal construction (cf. Perry 2005: 112–117 for all these functions; Paul 2019:
591–592 for their usage in the Persian of Iran). The ending =eton “may also occur in
the Indicative and other moods”, as in
   šumo-yon  pul-i    naqd na-dor=eton?
   you.pl-pl money-ez cash neg-have.prs=2pl
   ‘Don’t you people have any cash?’
   (Perry 2005: 196)
This usage is reminiscent of a rare form in Classical Persian, which is mostly used
in irrealis contexts, as in the following example (cf. Nātel-Khānlari 1986: II/319,
327–330 for several other examples):
   agar išān bi-girift=amān-i,     bi-kušt-amān-i.
   if   they pfv-catch.pst=1pl-irr pfv-kill.pst-1pl-irr
   ‘If we had caught them, we would have killed them.’
   (Tārix-i Bal’ami; in Nātel-Khānlari 1986: II/328)
Also cf. Brunner (1977: chapter IV) for an extensive introduction to a range of
grammatical functions of these pronominal clitics in Middle Iranian; Bubenik (2019:
195) for a short account of their origin in Old Persian.

Contrary to examples (77a–c), imperatives in Tajik can be marked overtly for mood
with bi-, as in (78a–b), or its allomorph bu-,29 both of which are accented
prefixes, receiving the primary stress of the verb. Tajik grammars mention several
factors for the appearance of this alternative strategy of imperative formation:
1) the literary register, poetry, and older texts (Ido 2005: 61; Rzehak 1999: 34;
Baizoyev and Hayward 2004: 64), 2) elevated, prestigious colloquial speech (Rzehak
[1999: 34]: “der gehobenen Umgangssprache”), which is very close to the first
factor, 3) very short present stems, such as o- ‘to come’ and or- ‘to bring’
(Khojayori and Thompson 2009: 91; also cf. Rzehak 1999: 34, who believes that the
verb omadan ‘to come’ is always formed with bi-; Perry 2005: 198, who restricts the
optional occurrence of bi- to “common stem-monosyllabic verbs”), and 4) adding “a
tone of pleading or cajolery” or politeness (Perry 2005: 198, 241). Furthermore,
some verbs have been specifically mentioned as not accepting bi-: boš- ‘to be’,
prefixal verbs with omadan/ovardan/doštan, and complex predicates with kardan (cf.
Perry 2005: 241; Rzehak 1999: 33–34).

29 Cf. Perry (2005: 199) for examples of bu-.

A major question to raise now is whether bi-marking is a developing strategy in
Tajik grammar, or just an unproductive, peripheral strategy, restricted to some
specific styles or verbs. The first option means that a grammaticalizing element is
on its way to expanding in the future of Tajik. The second option would make
bi-marking a grammatically unimportant issue in Tajik,30 unless this restricted,
existing strategy triggers a grammaticalization process later (i.e. bi- begins to
expand). We suggest that the statistical pattern of the usage of this marker be
investigated in a fieldwork project among regional and social dialects of Tajik, to
provide an answer to the above question.

30 From this perspective, it is comparable to the imperative forms of istodan ‘to
stand’ and doštan ‘to have’, each one employing a peripheral strategy of
imperative-formation. The former verb receives a suffix -o in its singular form (as
in isto. ‘Stop.’ and boz-isto. ‘Halt.’; from Perry 2005: 240). The latter verb is
sometimes used as a past participle in a periphrastic imperative construction with
the auxiliary boš- (as in on ro došt-a boš-ed. ‘Have/Keep it.’; from Windfuhr and
Perry 2009: 460), though it can be used as a regular imperative form as well (as in
šarm dor-ed. ‘Shame on you’/Lit.: ‘Have shame.’; from Perry 2005: 207). The
periphrastic construction mentioned for doštan can also be used infrequently for a
restricted number of other verbs: šišt-a boš-ed. ‘Stay seated.’ (from Windfuhr and
Perry 2009: 456, who call it “perfective-resultative imperative”).
On the other hand, as far as the issue of the subtypes of mood is concerned, the
imperative verbal forms in (77a–c) and (78a–b) are all distinct from the indicative
verbal forms ([76a–c] as well as other less frequent ones). The following subsection will introduce a partial, formal overlap between imperatives and subjunctives,
which turns out to have a consequence for the issue of mood distinctions.
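
The imperative shapes illustrated in (77) and (78) can be restated as a small generative sketch. The Python below is only an illustration under simplifying assumptions: the function name `imperative_forms` and its parameters are hypothetical, the prefix is always spelled bi- (the allomorph bu- and the stylistic conditions on prefixation are ignored), and the glide -y- before a vowel-initial stem is inferred from bi-yo and bi-yor-ed in (78) rather than taken from a rule stated in the text.

```python
def imperative_forms(present_stem: str, prefixed: bool = False) -> dict:
    """Build the imperative shapes illustrated in (77) and (78) for one present stem."""
    # Assumption: a glide -y- separates bi- from a vowel-initial stem (bi-yo, bi-yor-ed).
    glide = "y" if present_stem[0] in "aeiouāēīū" else ""
    base = f"bi-{glide}{present_stem}" if prefixed else present_stem
    forms = {
        "sg": base,                     # bare stem with zero ending, e.g. gir
        "pl_or_polite": f"{base}-ed",   # stem plus 2pl suffix, e.g. rav-ed
    }
    if not prefixed:
        # Colloquial explicit plural with the clitic =eton, e.g. šin=eton.
        # Only the unprefixed shape is generated, since only that one is illustrated.
        forms["explicit_pl"] = f"{present_stem}=eton"
    return forms


for stem, prefixed in [("gir", False), ("rav", False), ("šin", False), ("o", True), ("or", True)]:
    print(stem, imperative_forms(stem, prefixed))
```

The sketch deliberately says nothing about which verbs accept bi-; that question is the empirical one raised above and cannot be settled by string manipulation.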

6.3 Subjunctive
According to reference grammars of Tajik, the category of subjunctive mood is
represented by a canonical form based on the present stem and three periphrastic
constructions based on the past stem. All of these forms and constructions share
the semantic feature of expressing some degree of uncertainty on the part of the
speaker or reporting a less-than-real situation (see Section 2). In this section,
the present-oriented canonical form will be discussed first, especially with regard
to a bi-marking strategy (rather similar to that of the imperatives), and then the
past-oriented periphrastic constructions will be introduced. Let us start with
examples (79a–c) and (80a–b), which illustrate the present-oriented category with
the unmarked and bi-marked representations, respectively:
(79) a. agar ba bozor  rav-ī,     du-se     kilo seb   xar-Ø.
        if   to market go.prs-2sg two-three kilo apple buy.prs-2sg
        ‘If you go to the market, buy two or three kilos of apples.’
        (Baizoyev and Hayward 2004: 166)
     b. mo ba parviz guft-em     ki   vay   ba dušanbe  rav-ad.
        we to Parviz say.pst-1pl that (s)he to Dushanbe go.prs-3sg
        ‘We told Parviz to go to Dushanbe.’
        (Khojayori and Thompson 2009: 126)
     c. suxan  dar bayn-i     mo mon-ad.
        speech in  between-ez we stay.prs-3sg
        ‘Let the word stay between us.’
        (Perry 2005: 243)
(80) a. me-xoh-am         ū=ro      bu-bin-am.
        ipfv-want.prs-1sg (s)he=acc sbjv-see.prs-1sg
        ‘I want to see him.’
        (Perry 2005: 340)
     b. bi-gū-em         ki
        sbjv-say.prs-1pl that
        ‘Let us say that . . .’
        (Ido 2005: 62)

(79a) is a typical conditional clause, in which the protasis expresses a possibly
realizable event (though not realized yet), and the apodosis contains a realizable
imperative (again not realized yet). In this type of conditional, the apodosis
can include an indicative as well, to refer to a future event (again a realizable
event). (79b) and (80a) contain indicatives in the main clause and subjunctives
in the subordinate clause. Finally, (79c) and (80b) are labelled as optative in their
data sources (two grammars, in this case), though both can also be interpreted as
weak obligations or recommendations: ‘The word should stay between us’ (79c), and
‘We should say that . . .’ (80b).
As far as Tajik grammars tell us, the subjunctive is “mainly” expressed without
a prefix (Windfuhr and Perry 2009: 456), though it can “occasional[ly]” be prefixed
with bi-/bu- “in the elevated style or poetry” (Perry 2005: 199, 208). Paul
(2019: 604) mentions that “in central and northern Tajik dialects, the subjunctive
is expressed by a bare present stem” (referring to Lazard 1956: 145), and adds that
“in some Tajik verbs a historical bi- (or bu-, etc.) has been lexicalized and
integrated into the verbal stem, e.g. buraftan ‘go’.” Also, Perry (2005: 234–235)
claims that the prefixation of bi- to “some common verbs” in spoken usage and to some
other verbs in literary Tajik “rarely applies to kardan”. Generally, it seems that
the bi-marking strategy for subjunctives enjoys the same status as for the
imperatives, and thus raises similar research questions (see Section 6.2).
With regard to distinguishing the moods, first of all, an opposition between
subjunctives and indicatives can be supported for both the unmarked subjunctives and the bi-marked ones. In fact, none of the subjunctive verbal forms in
(79a–c) and (80a–b) is identical with the indicatives. However, the second person
plural forms (e.g. kun-ed and bi-kun-ed) are all identical for both subjunctive
and imperative moods, i.e. they are “indistinguishable” (Perry 2005: 204) and
formally ambiguous (to be disambiguated contextually or pragmatically). This may
be the reason why Perry (2005: 240) calls the imperative “a specialized form of the
subjunctive”.
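
The formal comparison made in this paragraph can be pictured as a string comparison between paradigms. The following Python sketch is a hedged illustration, not an analysis tool: the function names are hypothetical, the person endings are those of the me-boš- paradigm cited in this chapter and are attached mechanically to the stem kun-, and colloquial endings and the allomorph bu- are ignored.

```python
# Person endings as they appear in the me-boš- paradigm cited in this chapter.
ENDINGS = {"1sg": "am", "2sg": "ī", "3sg": "ad", "1pl": "em", "2pl": "ed", "3pl": "and"}


def finite_paradigm(stem, prefix=""):
    """Attach each person ending to the (optionally prefixed) present stem."""
    return {cell: f"{prefix}{stem}-{ending}" for cell, ending in ENDINGS.items()}


def imperative_paradigm(stem, prefix=""):
    """Imperatives exist only for the second person: bare stem (sg), stem plus -ed (pl)."""
    return {"2sg": f"{prefix}{stem}", "2pl": f"{prefix}{stem}-ed"}


def shared_cells(a, b):
    """Return the paradigm cells in which two paradigms contain the same string."""
    return sorted(cell for cell in a.keys() & b.keys() if a[cell] == b[cell])


indicative = finite_paradigm("kun", "me-")
subjunctive = finite_paradigm("kun")
imperative = imperative_paradigm("kun")

print(shared_cells(subjunctive, imperative))  # cells where subjunctive and imperative coincide
print(shared_cells(indicative, subjunctive))  # cells where indicative and subjunctive coincide
```

Under these assumptions the only cell shared by the subjunctive and the imperative is the second person plural, for both the unprefixed and the bi-prefixed series, while no cell is shared with the me-marked indicative.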
As the second goal of this section, we now introduce three periphrastic
constructions, which are traditionally categorized as past subjunctive in Tajik
grammars, as exemplified in (81a–c), (82a–b), and (83) below. These constructions
are formed with the past participle of the main verb and the inflecting auxiliary
boš- ‘to be’. This past participle consists of the verbal root, the past morpheme,
and a participial suffix -a (glossed as ptcp).
(81) a. agar vay   ba xona raft-a      boš-ad,    mo ba vay
        if   (s)he to home go.pst-ptcp be.prs-3sg we to (s)he
        zang zad-a        me-tavon-em.
        ring hit.pst-ptcp ipfv-can.prs-1pl
        ‘If he went home, we can call him.’
        (Khojayori and Thompson 2009: 143)
     b. bovar  na-me-kard-Ø        ki   du  gūsfand gum
        belief neg-ipfv-do.pst-3sg that two sheep   lost
        šud-a           boš-and.
        become.pst-ptcp be.prs-3pl
        ‘He still didn’t believe that two sheep were/may have/could have been
        lost.’
        (Windfuhr and Perry 2009: 457)
     c. mabodo ma=ro na-šinoxt-a            boš-ad.
        not.be I=acc neg-recognize.pst-ptcp be.prs-3sg
        ‘Did he really not recognize me?’ [‘Could he really not have recognized
        me?’]
        (Perry 2005: 236)
(82) a. ba kujo  me-raft-a        boš-ad?
        to where ipfv-go.pst-ptcp be.prs-3sg
        ‘Where might she be going (I wonder)?’
        (Windfuhr and Perry 2009: 465)
     b. boyad šodī  ham  az   in   kor-ho-i   modar-i
        must  Shodi also from this task-pl-ez mother-ez
        mehrubon=aš   zavq      me-girift-a        boš-ad.
        kind=3sg.poss enjoyment ipfv-take.pst-ptcp be.prs-3sg
        ‘Shodi too must have derived enjoyment from the things his dear mother
        did.’
        (Perry 2005: 238)
(83) vay   kitob xond-a        istod-a        boš-ad,    xalal
     (s)he book  read.pst-ptcp stand.pst-ptcp be.prs-3sg obstacle
     na-rason-ed.
     neg-touch.prs-2pl
     ‘If (s)he is reading a book, do not interfere.’
     (Ido 2005: 63)


Semantically speaking, the examples above reflect some sort of doubt, guess, etc.
as the speaker’s attitude towards the event. These semantic features keep us in
the domain of modality. Morphologically speaking, however, no formal opposition
can be established between these forms/constructions and the indicatives or
imperatives introduced earlier. In other words, no bi-marking or even zero-marking
is observed, nor has any other morphological device been employed to express the
mood. This conclusion is confirmed by the contrast between (81a–c) and (82a–b),
both of which belong to the same (tentative) subjunctive category while differing
morphologically: the former group is unmarked, and the latter is marked with me-.
This prefix has the aspectual value of imperfective marking, and its presence or
absence makes no difference to the modality/mood value.
We believe that what is responsible for the expression of doubt, guess, etc.
in examples (81) to (83) is the inflecting auxiliary boš- ‘to be’, not the
participial main verb and its inflectional capacity (i.e. being marked with me-),
nor the auxiliary istodan ‘to stand’ of (83), which makes the aspectual
contribution of progressive. In fact, the form of ‘be’ employed in examples (81) to
(83) is the same as the present subjunctive form that is employed in the typical
contexts of condition, subordination, etc. (i.e., of the type already exemplified
in [79a–c] for other verbs), as shown in (84a–b) for the verbs ‘be’ and ‘have’.
This root of ‘be’ is exactly the one used in the imperatives (as in došt-a boš-ed;
see Section 6.2, footnote 30) and in the indicatives me-boš-am (ipfv-be.prs-1sg),
me-boš-ī, me-boš-ad, me-boš-em, me-boš-ed, me-boš-and (cf. Perry 2005: 205 for the
examples).
(84) a. agar rozī      boš-ed,    man pagoh    me-oy-am.
        if   satisfied be.prs-2pl I   tomorrow ipfv-come.prs-1sg
        ‘If you agree, I’ll come tomorrow.’
        (Baizoyev and Hayward 2004: 166)
     b. agar vaqt došt-a        boš-ad,    me-oy-ad.
        if   time have.pst-ptcp be.prs-3sg ipfv-come.prs-3sg
        ‘If he has time he’ll come.’
        (Rzehak 1999: 68)
As a concluding remark with regard to the second goal of the section, the past
subjunctives could be assumed to belong to the mood category of subjunctive
through the grammaticalization of a construction with a subjunctive auxiliary.
This auxiliary contrasts with ast-Ø (be.prs-3sg) and the agreement clitics in the
perfect construction. (85a–c) show this contrast, with raftan ‘to go’ in the third
person singular:

(85) a. raft-a      boš-ad.                    ~ raft-a      ast-Ø.
        go.pst-ptcp be.prs-3sg                   go.pst-ptcp be.prs-3sg
        ‘(S)he may have gone.’                   ‘(S)he has gone.’
     b. me-raft-a        boš-ad.               ~ me-raft-a        ast-Ø.
        ipfv-go.pst-ptcp be.prs-3sg              ipfv-go.pst-ptcp be.prs-3sg
        ‘(S)he may have been going.’             ‘(S)he has been going.’
     c. raft-a      istod-a        boš-ad.     ~ raft-a      istod-a        ast-Ø.
        go.pst-ptcp stand.pst-ptcp be.prs-3sg    go.pst-ptcp stand.pst-ptcp be.prs-3sg
        ‘(S)he may be going.’                    ‘(S)he is going.’
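
The pairs in (85) differ only in the auxiliary that closes the construction. As a minimal sketch of that observation, the Python below assembles the third person singular strings from a past stem; the function names and the labels "subjunctive" and "perfect" are hypothetical conveniences, and only the combinations shown in (85) are attested in the text.

```python
# Third person singular auxiliaries contrasted in (85).
AUXILIARY_3SG = {"subjunctive": "boš-ad", "perfect": "ast-Ø"}


def past_participle(past_stem, imperfective=False):
    """Past stem plus the participial suffix -a, optionally marked with me-."""
    return ("me-" if imperfective else "") + past_stem + "-a"


def construction(past_stem, auxiliary, imperfective=False, progressive=False):
    """Join participle, optional progressive auxiliary istod-a, and the final auxiliary."""
    parts = [past_participle(past_stem, imperfective)]
    if progressive:
        parts.append("istod-a")  # participle of istodan 'to stand'
    parts.append(AUXILIARY_3SG[auxiliary])
    return " ".join(parts)


for imperfective, progressive in [(False, False), (True, False), (False, True)]:
    left = construction("raft", "subjunctive", imperfective, progressive)
    right = construction("raft", "perfect", imperfective, progressive)
    print(f"{left}  ~  {right}")
```

Swapping the value of `auxiliary` is the only change needed to move between the two columns, which mirrors the claim that the modal contribution sits in boš- rather than in the participle.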

6.4 Conjectural
In Section 6.3, we introduced a past participle which is used in subjunctive
constructions. This form is employed in the perfect and passive constructions of
Tajik as well, and also in all three of these constructions in Classical and
Contemporary Persian (cf. Lazard 1963; Windfuhr 1979; Mahootian 1997; Yousef 2018).
In addition to this inherited participial form, Tajik grammar has developed a
different, though morphologically closely related, past participle to be used in a
variety of grammatical constructions. This form consists of the verbal root plus a
past morpheme and -agī, as in ovard-agī (bring.pst-conj) ‘may have brought’. Tajik
grammarians assign to this participle the function of expressing conjectural mood
(also called speculative or presumptive). As described by Perry (2005: 243), “it
expresses an unsubstantiated conjecture or assumption”, and it “incorporates within
itself the sense of boyad ‘must’ (which externally signals the past subjunctive of
supposition, ‘have done’)”. In simpler words, “this mood expresses a degree of
uncertainty, which in English can be expressed typically with such words as
probably and may (well)” (Ido 2005: 63). In fact, the conjectural expresses the same
meaning as the modal words and expressions šoyad ‘probably’, mumkin (ast) ‘(is)
possible’, dar-kor ‘needed, necessary’, and éhtimol ‘probably’, without having to use
these modals (Rzehak 1999: 87).
The development of this participle has been attributed to the northern Tajik
dialects (Perry 2005: 243), and it has been suggested that this form entered the
written, literary variety of Tajik “after the October revolution [of 1917]”,
gradually becoming frequent (Saidov 2011: 22, who reports the absence of this form
in a text from the 14th century [A.D.]). Paul (2019: 607) mentions the occurrence of
“forms in -agī”, often with a resultative meaning, in certain Early Judeo-Persian
exegetical texts (cf. Paul 2013 for the examples), though it is not clear whether the
Tajik usage is developmentally related to these pre-Classical forms, since, as far
as we could check in the grammars of Classical Persian, no such forms are attested
elsewhere.
With regard to the formation of the -agī-participles, not much information
is provided in Tajik grammars. Windfuhr and Perry (2009: 466) represent the
participle ending as -ag-ī, without giving any explanation. This representation
can motivate a hypothesis about the formation of the participle: -ag can be the
Middle Persian adjectival suffix, which could attach to verbal past stems, and -ī
can be the attributive suffix for adjective formation (as in raftan-ī [go.inf-attr]
‘going’, e.g. mo raftan-ī šud-em. ‘We got ready to go.’; from Perry 2005: 261). To
elaborate on the hypothesis, such Middle Persian participles as kard-ag
(from kardan ‘to do’), šud-ag (from šudan ‘to go’), and stad-ag (from stadan ‘to
take’)31 lost the final consonant [g] of their citation form in their transition to
New Persian, while they could have retained this sound in the intervocalic
environment of their combination with the vocalic suffix -ī. The attested examples
of this phonological phenomenon for participles are extremely rare (among them, the
above-mentioned Early Judeo-Persian examples), though the phenomenon is quite
common elsewhere. For example, the Middle Persian nouns stārag ‘star’ and xānag
‘house’ lost the final [g] in New Persian, and became sitāra and xāna, respectively,
but sitārag-ān ‘stars’ and xānag-i ‘of home’ retained [g] intervocalically (i.e.,
between two vowels).
Generally, the conjectural participle can be employed in at least four syntactic constructions, instead of the regular participle of these constructions.32 In
each pair, as shown in (86a–d) below (also comparable to [85a–c] of Section 6.3),
the conjectural counterpart expresses an additional mood/modal meaning.
(86) a. raft-agi=st-Ø.                         ~ raft-a      ast-Ø.
        go.pst-conj=be.prs-3sg                   go.pst-ptcp be.prs-3sg
        ‘(S)he may have gone.’                   ‘(S)he has gone.’
     b. me-raft-agi=st-Ø.                      ~ raft-a      ast-Ø.
        ipfv-go.pst-conj=be.prs-3sg              go.pst-ptcp be.prs-3sg
        ‘(S)he may have been going.’             ‘(S)he has gone.’
     c. raft-agī    istod-a        ast-Ø.      ~ raft-a      istod-a        ast-Ø.
        go.pst-conj stand.pst-ptcp be.prs-3sg    go.pst-ptcp stand.pst-ptcp be.prs-3sg
        ‘(S)he may be going.’                    ‘(S)he is going.’
     d. raft-agī    me-bud-Ø.                  ~ raft-a      me-bud-Ø.
        go.pst-conj ipfv-be.pst-3sg              go.pst-ptcp ipfv-be.pst-3sg
        ‘If (s)he might have gone . . .’         ‘If (s)he had gone . . .’

31 Cf. Brunner (1977: 34) for the examples; Rastorgueva (1966: 116); Rastorgueva and
Molčanova (1981: 129).
32 This is in line with Windfuhr and Perry (2009: 466), who believe that “while
theoretically the conjectural mood may have all tense, modal, and aspectual forms,
only four forms are used in Tajik”.

The examples (87a–d) illustrate conjecturals in sentential contexts:
(87) a. kolxoz-či-yon         dar sahro kor  kard-agi=st-and.
        Kolkhoz-of/related-pl in  field work do.pst-conj=be.prs-3pl
        ‘The Kolkhoz33 farmers probably worked in the field.’
        (Rzehak 1999: 87)
     b. vay   ba xona me-raft-agi=st-Ø.
        (s)he to home ipfv-go.pst-conj=be.prs-3sg
        ‘Probably he will go home.’/‘I suppose that he will go home.’
        (Ido 2005: 64)
     c. bača-gon ba xona omad-a        istod-agi=st-and.
        child-pl to home come.pst-ptcp stand.pst-conj=be.prs-3pl
        ‘The children are probably just coming home.’
        (Rzehak 1999: 87)
     d. agar čašm-i yodgor ro  andeša-i   oyanda-i  sioh  torik
        if   eye-ez Yodgor acc thought-ez future-ez black dark
        na-kard-agī       me-bud-Ø,
        neg-make.pst-conj ipfv-be.pst-3sg
        ‘If the thought of a black future had not darkened the vision of Yodgor,
        . . .’ / ‘If Yodgor’s vision had not been clouded by the prospect of a black
        future, . . .’
        (Windfuhr and Perry 2009: 467)
33 Kolkhoz: a form of collective farm in the former Soviet Union.

In the examples above, the inflecting auxiliary -st- seems to be the contracted
form of the existential verb hast- ‘to be, exist’, which is cliticized to the
preceding conjectural participle in (87a–b) and to the participle of the progressive
auxiliary istodan in (87c) (cf. Ido 2005: 64; Perry 2005: 244; Windfuhr and Perry
2009: 466). Alternatively, a variant of these constructions employs the pronominal
clitics instead of -st-forms; e.g. kard-agī=am (do.pst-conj=1sg) ‘I might have
done.’ (cf. Perry 2005: 244 for other colloquial variants, including the ending
=eton, which was introduced in example [77c] for the imperatives).
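
The two ways of closing a conjectural form described here, the cliticized -st- auxiliary and the pronominal clitics, can be sketched as string templates. This is an illustrative sketch only: the function names are hypothetical, the spelling -agi before =st follows the transcription of the examples in (86) and (87), and nothing is implied about which person and number combinations are actually in use.

```python
def conjectural_with_copula(past_stem, ending="Ø", imperfective=False):
    """Conjectural participle plus cliticized -st-, as in raft-agi=st-Ø or kard-agi=st-and."""
    prefix = "me-" if imperfective else ""
    return f"{prefix}{past_stem}-agi=st-{ending}"


def conjectural_with_clitic(past_stem, clitic, imperfective=False):
    """Conjectural participle plus a pronominal clitic, as in kard-agī=am."""
    prefix = "me-" if imperfective else ""
    return f"{prefix}{past_stem}-agī={clitic}"


print(conjectural_with_copula("raft"))                     # shape of (86a)
print(conjectural_with_copula("raft", imperfective=True))  # shape of (86b)
print(conjectural_with_copula("kard", ending="and"))       # verb form of (87a)
print(conjectural_with_clitic("kard", "am"))               # clitic variant cited above
```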
However, the semantics of the constructions introduced above is less
straightforward. First of all, the semantic feature of perfect is mostly restricted
to the first type (86a), and only as one of its possible interpretations; its other
instances can refer to completed actions (as in [87a]). The second type (86b)
expresses a potential (i.e. future) action, or a current (ongoing or habitual) one
(Windfuhr and Perry 2009: 467), the former interpretation being exemplified by
(87b). Perry (2005: 245) points out that this construction “appears to be the most
frequently-used of the conjectural series in Tajik literature”. In the third
construction (86c), the progressive meaning, triggered by the auxiliary istodan ‘to
stand’, is quite dominant, as in (87c) (cf. Perry 2005: 223–226 for other progressive
constructions with this auxiliary; also see example [83] in the current chapter).
Finally, the fourth construction (86d) has an irrealis interpretation,34 as in (87d),
which is “further marked by the prefix me- as a non-indicative marker” (Windfuhr
and Perry 2009: 467).
The second semantic property of the constructions under discussion is
that the semantic feature of conjecture is not necessarily present in all of the
instances. In addition to some verbal instances, such as (86d) and (87d), which
do not include the conjectural meaning, this is mostly true of the adjectival and
nominalized uses of the conjectural participle. The examples of the lack of conjectural meaning are presented below in (88a–d), (89a–b), and (90), roughly as
the respective counterparts of the semantically-conjectural forms (86a–c). In all
of the examples below, -agī has been glossed as ptcp, instead of conj, and this
glossing can be extended to (86d) and (87d) as well.
(88) a. in odam az šahr omad-agī.
this person from city come.pst-ptcp
‘This man has come from town.’ (Rastorgueva 1992: 81; in Ido 2005: 66)
b. dar=aš pūšid-agī bud-Ø.
door=3sg.poss cover.pst-ptcp be.pst-3sg
‘Her door was locked.’ (Rastorgueva 1992: 81; in Ido 2005: 66)

34 Our reason for not recognizing irrealis (or the so-called counterfactual) as a distinctive mood
of Tajik is that no dedicated morphological marker has developed for it. This semantic notion can
be expressed with the past imperfective form as well as the past perfect (cf. Perry 2005: 378–379
for several examples).
35 The symbol [oʹ] in this example from Bukharan Tajik represents a rounded mid central vowel
(Ido 2007: 3).

c. kitob-i na-xond-agī
book-ez neg-read.pst-ptcp
‘an unread book.’ (Khojayori and Thompson 2009: 73)
d. ba”ze did-agi-ho=yaš=ro hikoya me-kard-Ø.
some see.pst-ptcp-pl=3sg.poss=acc reciting ipfv-do.pst-3sg
‘He told (us) some of the things he had seen.’ (Perry 2005: 272)
(89) a. hujra-ba ham me-šišt-agi, ham avqot me-xoʹrd-agi,35
cell-loc also ipfv-sit.pst-conj also meal ipfv-eat.pst-conj
ham xob me-raft-agi, ham dars tayyor me-kard-agi.
also sleep ipfv-go.pst-conj also lesson ready ipfv-make.pst-conj
‘[Students] would live, eat meals, sleep and prepare for lessons in cells.’
(Ido 2007: 70)
b. duxtar kurta-i me-dūxt-agi=aš=ro ba modar=aš nišon dod-Ø.
girl dress-ez ipfv-sew.pst-ptcp=3sg.poss=acc to mother=3sg.poss show give.pst-3sg
‘The girl showed the dress she was sewing to her mother’ [‘the dress
(being) sewn by her’]
(Perry 2005: 276)
(90) ob-i az hawz ovard-a istod-agi=amon
water-ez from pool bring.pst-ptcp stand.pst-ptcp=1pl.poss
‘the water that we were bringing from the pool’
(Windfuhr and Perry 2009: 510)
The participles in (88a) and (89a) could be presumed to be the short forms of
(86a) and (86b), respectively, from which the copula/auxiliary ast ‘be’ has been
removed36 (cf. Perry 2005: 273, who compares an example of the [88a]-type with
Tajik regular perfect constructions). However, the example (88b) is definitely a
predicative adjective, accompanied by a copula, and the examples (88c), (89b),
and (90) are attributive adjectives. Finally, (88d) is a nominalized example, with
an accompanying existential quantifier, a possessive genitive, and an accusative
marker.37 It is noteworthy that in all these examples, the semantic difference
between the conjectural participle and the regular participle has apparently
been neutralized, leading to the removal of the feature of conjectural, sometimes
giving an interchangeable status to the participles (cf. Perry 2005: 272–276 for a
short indication of such a neutralization and interchangeability).38 This is
plausibly expected for deverbal adjectives and nouns, as these categories can
generally lose verbal characteristics, such as tense, aspect, and mood. However, this
removal phenomenon seems to go beyond such decategorializations, since there
are a good number of instances, such as (88a) and (89a), which are still verbal
instances (though from a non-finite subcategory) and lack the meaning of
conjecture. Whether this means that the conjectural mood (as distinct from the
indicative) is getting weakened is a question that could be further investigated,
to see if this neutralization is an increasing trend among verbal instances or not.

36 This phenomenon of dropping the copula can be viewed as a more general feature of Tajik
copulative constructions in the present tense, as already shown in (33a), (36b), and (39).

37 In (95.b), the pronominal possessive clitic and the accusative clitic have both attached to
the adjectival participle, but it is only their phonological host; syntactically, the whole
noun phrase hosts them.
38 As a matter of fact, the regular past participle can also be used as an adjectival or
nominalized category, as in the following example:
man šaftolu-i dirūz did-a=am=ro xarid-am.
I peach-ez yesterday see.pst-ptcp=1sg.poss=acc buy.pst-1sg
‘I bought the peaches that I saw yesterday.’ (Khojayori and Thompson 2009: 73)
Paul (2019: 607) believes that the use of -agī-forms as nouns and other such constructions “show
a tendency in Tajik towards clause subordination through non-finite verb forms (participles),
as opposed to the predominance of finite verbs used in subordination in Fārsi” (=the Persian of
Iran).

7 Conclusion
This chapter aimed to accomplish a number of objectives. At a basic level, it aimed
to introduce modal and mood elements of Tajik more comprehensively than
has been done previously in the available literature. As a second goal, we
aimed to sum up the already-known modals from different sources, and finally, at
a higher, theoretical level, we wished to examine a new method for categorizing
modal elements, based on Nuyts’ studies.
The first achievement of the chapter, in our view, is to show that the notion
of modality is far broader than what is usually addressed in the traditional
grammars of Tajik. In this broader sense, in addition to modal auxiliaries, which are
typically introduced as responsible for expressing modal meanings, there are
several other categories that contribute to this semantic field. These categories
include adjectives, adverbs, and lexical verbs, to a larger extent, and nouns and
prepositional phrases, to a lesser extent. Throughout the chapter, we introduced
some instances of each category in Tajik, though our lists can easily be extended
to include more items, as a consequence of the open nature of lexical categories.
This wider perspective helps distinguish between two autonomous, though
closely related, strata of analysis with respect to modality: i) the linguistic devices
that are employed to express modality; and ii) the modal meanings/functions
that are expressed by modal elements. In addition to the formal classification
which was mentioned above, the categorization of the semantic field of modality
in Tajik can be the second achievement of this chapter (as related to both the
second and the third goals which were stated at the beginning of the chapter).
Although some semantic distinctions of modality have been explicitly made for
Tajik in the previous studies, the mapping between these two levels of analysis
was a preliminary attempt made in this chapter, which can be followed by further
investigation. What encourages such further attempts is the fact that most modal
elements, particularly the auxiliaries, are polyfunctional, expressing a range of
modal meanings. This factor, by itself, can complicate the semantic picture of
modal elements, to a large extent.
Finally, closely related to modality, as depicted above, is the notion
of mood. This more grammaticalized field employs morphological devices, by
definition. At least, it can be claimed that traditional grammars usually do not
draw a clear border between mood and modality, which can lead to confusion,
both formally and functionally. In the case of Tajik, we tried to show that the
mood categories, as introduced and defined in the existing grammars, are not
well-distinguished from a theoretical perspective. This means that further research
is required in this respect, in order to re-define the mood system of Tajik. There
are a number of questions to be addressed with regard to the current behavior of
mood markers of Tajik, and possibly, the developing trends among these markers,
calling for further fieldwork on Tajik dialects.

References
Ahmadi-Givi, Hassan. 2001. Dastur-e tārixi-e fe’l [The historical grammar of verb]. Tehran:
Ghatreh.
Aikhenvald, Alexandra Y. 2010. Imperatives and commands. Oxford: Oxford University Press.
Akhlāghi, Faryār. 2008. Bāyestan, šodan va tavānestan: Se fe’l-e vajhi dar Fārsi-e emruz [Bāyestan,
šodan va tavānestan: Three modal verbs in Contemporary Persian]. Grammar 3. 82–132.
Aliev, Bahruddin & Aya Okawa. 2010 [last updated]. Colloquial Tajiki in comparison with Persian
of Iran. Encyclopædia Iranica. http://www.iranicaonline.org/articles/tajik-iii-colloquial
(September 24, 2010).
Amberber, Mengistu, Brett Baker & Mark Harvey. 2010. Complex predicates: Cross-linguistic
perspectives on event structure. Cambridge: Cambridge University Press.
Anvari, Hassan. 2002. Farhang-e bozorg-e soxan [The great dictionary of Sokhan]. Tehran:
Sokhan.
Anvari, Hassan & Hassan Ahmadi-Givi. 2010. Dastur-e zabān-e Fārsi, II [A grammar of Persian,
2], 3rd edn. Tehran: Fātemi.
Baizoyev, Azim & John Hayward. 2004. A beginner’s guide to Tajiki. London: Routledge Curzon.
Bellert, Irena. 1977. On semantic and distributional properties of sentential adverbs. Linguistic
Inquiry 8. 337–351.
Biber, D., S. Johansson, G. Leech, S. Conrad & E. Finegan. 1999. Longman grammar of spoken
and written English. Essex: Longman.
Binnick, Robert I. 1991. Time and the verb: A guide to tense and aspect. Oxford: Oxford
University Press.
Brunner, Christopher J. 1977. A syntax of Western Middle Iranian. Delmar, New York: Caravan
Books.
Bubenik, Vit. 2019. Grammaticalization and degrammati(calizati)on in the development of
the Iranian verb system. In Lars Heltoft, Iván Igartua, Brian D. Joseph, Kirsten Jeppesen
Kragh & Lene Schøsler (eds.), Perspectives on language structure and language change,
193–204. Amsterdam & Philadelphia: John Benjamins.
Bybee, Joan, Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect, and
modality in the languages of the world. Chicago & London: The University of Chicago Press.
Byloo, Pieter & Jan Nuyts. 2014. Meaning change in the Dutch core modals:
(Inter)subjectification in a grammatical paradigm. Acta Linguistica Hafniensia 46. 85–116.
Cheung, Johnny. 2007. Etymological dictionary of the Iranian verb. Leiden & Boston: Brill.
Collins, Peter. 2009. Modals and quasi-modals in English. Amsterdam & New York: Rodopi.
Conroy, Joseph F. & Firdaus Shukurov. 1998. Tajik-English/English-Tajik dictionary and
phrasebook. New York: Hippocrene Books.
Dabir-Moghaddam. 2006. Compound verbs in Persian. In Dabir-Moghaddam, Studies in Persian
linguistics: Selected articles. Tehran: Markaz-e našr-e dānešgāhi.
Greenbaum, Sidney. 1969. Studies in English adverbial usage. London: Longman.
Guimier, C. 1988. Syntaxe de l’adverbe anglais. Lille: Presses Universitaires de Lille.
Hassandoust, Mohammad. 2014. Farhang-e riše šenāxti-e zabān-e Farsi [An etymological
dictionary of the Persian language]. Tehran: The Academy of Persian Language and
Literature.
Heine, Bernd. 1993. Auxiliaries: Cognitive forces and grammaticalization. New York & Oxford:
Oxford University Press.
Ido, Shinji. 2005. Tajik. München: Lincom Europa.
Ido, Shinji. 2007. Bukharan Tajik. München: Lincom Europa.
Ilkhānipour, Negin. 2013. Sefāt-e vajhi dar zabān-e Farsi [Modal adjectives in Persian]. Tehran:
Markaz.
Jacobson, Sven. 1982. Modality nouns and the choice between to+infinitive and of+ing. Studia
Anglica Posnaniensia 15. 61–71.
Kamp, Hans & Barbara Partee. 1995. Prototype theory and compositionality. Cognition 57.
129–191.
Khojayori, Nasrullo & Mikael Thompson. 2009. Tajiki reference grammar for beginners.
Georgetown University Press.
Koohkan, Sepideh. 2019. The typology of modality in Iranian languages. Ph.D. Dissertation,
Tehran & Antwerp: Tarbiat Modares and University of Antwerp.
Lang, Ewald. 1979. Zum Status der Satzadverbiale. Slovo a slovesnost 40. 200–213.
Lazard, Gilbert. 1963. La langue des plus anciens monuments de la prose persane. Paris:
Librairie C. Klincksieck.
Magni, Elisabetta. 2010. Mood and modality. In Philip Baldi & Pierluigi Cuzzolin (eds.), New
perspectives on historical Latin syntax. Volume 2: Constituent syntax: Adverbial phrases,
adverbs, mood, tense, 193–275. Berlin & New York: Mouton de Gruyter.
Mahmoodi-Bakhtiāri, Behrooz. 2009. Sāxvāže af’āl-e bāyad va šāyad dar zabān-e fārsi [On the
morphology of the verbs šāyad and bāyad in Persian]. Grammar 4. 152–169.
Mahootian, Shahrzad. 1997. Persian. London & New York: Routledge.
Nātel-Khānlari, Parviz. 1986. Tārix-e zabān-e Fārsi [The history of Persian]. Tehran: Ferdows.
Nazarzoda, Sayfiddin, Ahmadjon Sanginov, Said Karimov & Mirzo Hasani Sulton. 2008.
Farhangi tafsirii zaboni Tojikī. I/II. [An exegetical dictionary of Tajik language]. Dushanbe:
Rudaki Institute of Language and Literature.
Nuyts, Jan. 1993. Epistemic modal adverbs and adjectives and the layered representation of
conceptual and linguistic structure. Linguistics 31(5). 933–970.
Nuyts, Jan. 2000. Epistemic modality, language and conceptualization. Amsterdam: John
Benjamins.
Nuyts, Jan. 2005. The modal confusion: On terminology and the concepts behind it. In Alex
Klinge & Henrik Hegel Müller (eds.), Modality: Studies in form and function, 5–38. London:
Equinox.
Nuyts, Jan. 2006. Modality: Overview and linguistic issues. In William Frawley (ed.), The
expression of modality, 1–26. Berlin: Mouton de Gruyter.
Nuyts, Jan. 2016. Surveying modality and mood. In Jan Nuyts & Johan van der Auwera (eds.), The
Oxford handbook of modality and mood, 1–8. Oxford: Oxford University Press.
Nuyts, Jan. 2017. Evidentiality reconsidered. In Juana Isabel Marín Arrese, Gerda Haßler &
Marta Carretero (eds.), Evidentiality revisited, 57–85. Amsterdam & Philadelphia: John
Benjamins.
Nuyts, Jan. Book Manuscript (MS.). Modality in mind.
Nuyts, Jan, Pieter Byloo & Janneke Diepeveen. 2010. On deontic modality, directivity, and mood:
The case of Dutch mogen and moeten. Journal of Pragmatics 42. 16–34.
Nuyts, Jan & Pieter Byloo. 2015. Competing modals: Beyond (inter)subjectification. Diachronica
32(1). 34–68.
Olson, Randall B. 1994. A basic course in Tajik (Grammar and workbook).
http://talktajiktoday.com/documents/ABasicCourseInTajik.pdf (accessed 20 April 2021).
Palmer, Frank R. 2001. Mood and modality, 2nd edn. Cambridge: Cambridge University Press.
Pape, Ingetrud. 1966. Tradition und Transformation der Modalität. Band I. Möglichkeit –
Unmöglichkeit. Hamburg: Felix Meiner.
Paul, Ludwig. 2019. Persian. In Geoffrey Haig & Geoffrey Khan (eds.), The languages and linguistics
of Western Asia: An areal perspective, 569–624. Berlin & Boston: De Gruyter Mouton.
Perry, John R. 2000. Epistemic verb forms in Persian of Iran, Afghanistan and Tajikistan. In
Lars Johanson & Bo Utas (eds.), Evidentials: Turkic, Iranian and neighbouring languages,
229–258. Berlin & New York: Mouton de Gruyter.
Perry, John R. 2005. A Tajik Persian reference grammar. Leiden & Boston: Brill.
Portner, Paul. 2009. Modality. Oxford & New York: Oxford University Press.
Rastorgueva, Vera S. 1954. Kratkij očerk grammatiki Tadžikskogo jazyka [A short sketch of Tajik
grammar]. (Translated by Herbert H. Paper in 1992). Bloomington: Research Institute for
Inner Asian Studies, Indiana University.
Rastorgueva, Vera S. 1966. Srednepersidskij jazyk [Middle Persian language]. (Translated by
Vali-allah Shadan in 2000). Tehran: Society for the appreciation of cultural works and
dignitaries.
Rastorgueva, Vera S. & E. K. Molčanova. 1981. Srednepersidskij jazyk [Middle Persian
language]. In Osnovy II. 6–146.
Rothstein, Bjorn & Rolf Thieroff. 2010. Mood in the languages of Europe. Amsterdam &
Philadelphia: John Benjamins.
Rzehak, Lutz. 1999. Tadschikische Studiengrammatik. Wiesbaden: Reichert Verlag.
Saidov, Rahimjon. 2011. Molāheze-hā rāje’ be šekl-hā-ye fe’l dar Resāle-ye Qoddusiye-ye Mir
Seyyed Ali Hamedāni [Considerations about verb forms in Resale-ye Qoddusiye]. Rudaki
32. 19–26.
Salazar, Danica & Isabel Verdaguer. 2009. Polysemous verbs and modality in native and
nonnative argumentative writing: A corpus-based study. International Journal of English
Studies. Special Issue. 209–219.
Tabibzādeh, Omid. 2012. Dastur-e zabān-e Fārsi: Bar asās-e nazariye-ye goruhhā-ye xodgardān
dar dastur-e vābastegi [Persian grammar: A theory of autonomous phrases based on
dependency grammar]. Tehran: Našr-e Markaz.
Taleghani, Azita H. 2008. Modality, aspect and negation. Amsterdam & Philadelphia: John
Benjamins.
Timberlake, Alan. 2007. Aspect, tense, mood. In Timothy Shopen (ed.), Language typology and
syntactic description, Volume III: Grammatical categories and the lexicon, 280–333. 2nd
edn. Cambridge: Cambridge University Press.
van der Auwera, Johan & Vladimir Plungian. 1998. Modality’s semantic map. Linguistic Typology
2. 79–124.
van der Auwera, Johan & Alfonso Zamorano Aguilar. 2016. The history of modality and mood.
In Jan Nuyts & Johan van der Auwera (eds.), The Oxford handbook of modality and mood,
9–30. Oxford: Oxford University Press.
van Linden, An. 2012. Modal adjectives: English deontic and evaluative constructions in
synchrony and diachrony. Berlin & Boston: De Gruyter Mouton.
Whaley, Lindsay. 1997. Introduction to typology: The unity and diversity of language. Thousand
Oaks: Sage Publications, Inc.
Windfuhr, Gernot & John Perry. 2009. Persian and Tajik. In Gernot Windfuhr (ed.), The Iranian
languages, 416–544. London & New York: Routledge.
Yousef, Saeed. 2018. Persian: A comprehensive grammar. New York: Routledge.

Pictorial source

#Film1. Mihmoni noxonda-1 [Uninvited guest]. Directed by Ališer Ravšanov, written by
Dustmorod Šaripov. 2017.
#Film2. Arusi zamonavi [Modern bride]. Directed and written by Navijon Pirmatov. 2016.
#Film3. Mujassamai išq [Statue of love]. Directed by Omid Mirzo Širinov, written by Abror Zohir.
2003.

Roohollah Mofidi, Negin Mohammadi Nafchi

4 Aspect in Tajik

Abstract: In this chapter, the syntax-morphology and semantics of the aspect
system of Tajik are addressed. Firstly, the imperfective vs. perfective aspect in
this language is represented by an overt marker of the former as opposed to the
unmarked status of the latter. In fact, an adverb of Middle Persian was
grammaticalized as a prefix, and it was inherited by Tajik as me-. This prefix is employed
by almost all verbs obligatorily in all imperfective environments, to express a
variety of imperfective meanings, including (durative and focalized) progressive
and habitual as well as the extended interpretations of future and irrealis. Three
stative verbs are the exceptions, and they do not generally take this prefix. Secondly, the lexical verb istodan ‘to stand’ was grammaticalized in Tajik as an auxiliary in a construction consisting of the participial form of the main verb plus
the (present or past) perfect form of the auxiliary. This construction expresses
the progressive meaning specifically, along with the general imperfective marker.
Thirdly, there are some other auxiliaries which are used in periphrastic constructions, to express some notions of Aktionsart. Furthermore, the chapter addresses
the interaction of Tajik aspect system with some other domains such as tense,
mood and modality, and event structure, and it is concluded by a comparison of
the aspectual devices of Tajik with Dari, Classical Persian and the Persian of Iran,
to provide a speculative chronology for the emergence of these devices in Tajik.

https://doi.org/10.1515/9783110622799-004

1 Introduction
The aspect system of languages interacts with tense, mood and modality, and
even evidentiality, referred to collectively as TAM or TAME systems, which implies
a close relationship between these categories. This interaction can partly be
manifested with shared markers, and partly with the inter-dependence of
interpretations of forms and constructions. This chapter, which is mainly intended
to describe the aspect system of Tajik, will also address the issue of such
interactions under the rubric of several points, viewing the grammatical category of
aspect as part of the verbal domain, both in formal and functional terms.
The primary goal is first to address the aspectual distinction between
perfective and imperfective. The former category is mainly identified with the lack of
aspectual marking in Tajik, as opposed to a specialized morphological device for
the latter: me- as a prefix of general imperfective. This is followed by an
introduction to a periphrastic construction for progressive (subsumed here as a
subcategory of imperfective). This construction is formed with istodan ‘to stand’ as the
auxiliary. Finally, some other periphrastic constructions are introduced, which
express some aspectual meanings. These sections show the extent to which the
research on the aspect system of Tajik has advanced and identify the gaps that
remain. Throughout the discussions, the descriptions and data provided in Tajik
grammars are consulted.1 We also had some limited fieldwork data at our
disposal, obtained from interviews that we carried out with a few Tajik
speakers from Dushanbe and Afghanistan’s Badakhshan. These data were occasionally
used to check some linguistic features during the course of our research.
The second goal of the chapter is essentially a theoretical one. To this end,
we note cases where some considerations outside the domain of
perfective-imperfective are involved. Particularly, the interaction of aspect and
tense will be discussed for Tajik perfective aspect, and we address some
interpretations of imperfective aspect which bring mood and modality to the discussion.
Furthermore, the distributional patterns of Tajik’s main aspectual device (i.e., me-)
raise some issues about affix order (mostly left open to further research),
and the limitations of this prefix with some stative verbs link the discussion to
the field of event structure. Likewise, the other aspectual device of Tajik (i.e.,
istodan) can be viewed as further grammaticalization of perfect constructions.
We conclude with the issue of the development of Tajik aspectual features, in
comparison to Dari, Persian, and Classical Persian, to provide a speculative
chronology for the emergence of these features in Tajik.

2 The unmarked category of perfective aspect
The aspectual category of perfective is classically defined as a holistic viewpoint
towards the events, in which “the whole of the situation is presented [by the
speaker] as a single unanalysable whole” (Comrie 1976: 3). Dahl (1985: 78–79)
adds a secondary feature to the definition of this category: “with a well-defined
result or end-state, located in the past”, and he explains the feature as “[t]here
is a strong tendency for PFV [=perfective] categories to be restricted to past time
reference”. Therefore, this aspectual value is dominantly correlated with the past
tense; i.e., its members are often the past forms, morphologically (the present
perfective category has also been attested in some languages with future time
reference; cf. Bybee et al. 1994: 83; De Haan 2011: 450–451). This correlation between
perfective aspect and past tense is observed in Tajik, as well, as will be shown in
the following section. Conversely, at least in Tajik, past tense forms have
perfective interpretations by default. In other words, the default interpretation of the
past tense in Tajik is perfective, unless the opposite aspectual value is specified
explicitly (by the imperfective marker me-). This aspectually unmarked status for
Tajik past forms (interpreted as perfective), as opposed to the me-marked past
forms (interpreted as imperfective), is noteworthy.

1 The Tajik data cited in the chapter has been uniformly converted to APA transliteration (from
the original, Cyrillic script), and English glosses obey Leipzig Glossing Rules, which could differ
from the glosses in the sources of data (if available in the sources at all). English translations are
cited intact from the original sources.

2.1 The interaction of aspect and tense
In Tajik, there is no overt marker to express the category of perfective aspect. In
the present tense, if we restrict the investigation of aspect to the forms of indicative mood (representing actualized events as opposed to the non-actualized
subjunctive, imperative, and irrealis), all of the finite forms have the imperfective
aspect. Also, almost all of them are marked overtly with me- (see Section 3.1 for
information on the distribution of this prefix, and Section 3.4 for the exceptions to
this general overt marking). In the past tense, almost all imperfectives are overtly
marked with the same marker (the exceptions being the same as in the present).
The absence of such an overt marking in past tense verbs would yield the perfective aspect. The establishment of such a correlation between aspect and tense is in
line with Perry (2005: 320), who refers to the past tense as the “bearer of perfective
aspect”, adding that “[t]his tense is perfective in aspect, and states that an action
was performed and (by implication) completed in the past” (Perry 2005: 213).
Morphologically speaking, Tajik verbal roots are inflected for aspect (or mood)
and agreement to refer to the present time (with no overt temporal marker), and
they obligatorily take an additional past-marking suffix to refer to the past time;
i.e. no present-past pair is left unmarked (cf. Perry 2005: 194–197 for information
on the agreement system of Tajik verbs; Rzehak 1999: 106–107 for the inflectional
forms of bin-/did- ‘to see’, as an example). The frequent past markers include -t,
-d, and -id, and the less frequent ones are -ist and -od. Generally, the choice of
these markers is lexically conditioned, because of historical morphophonemic
processes (cf. Windfuhr and Perry 2009: 447 for a short discussion). Among those
verbal roots which opt for -t or -d, these two variants are phonologically conditioned: the former appears after voiceless consonants, and the latter after vowels
or the voiced consonants [r] or [n] (the last consonant being mostly the final
sound of the causative suffix -on) (cf. Ido 2005b: 42–43). Furthermore, an
additional factor that complicates the scene is that some past markers are added to a
different allomorph of the verbal root than to the one inflected for present tense.
Perry (2005: 182–185) gives more detailed information on the verbal sub-groups of
Tajik, and Ahmadi-Givi and Anvari (2011: 34–41) classify verbal stems of Classical
Persian into eight groups, which is useful for Tajik as well (For an extensive list of
Tajik verbs in present and past stems, cf. Perry 2005: 186–193; Rzehak 1999: 103–
105; Kalbāsi 1995: 183–195). The examples (1a–b) illustrate past-marking in Tajik:
(1) a. kitob-ro ba jonona dod-am
book-acc to Jonona give.pst-1sg
‘I gave the book to Jonona’ (Baizoyev and Hayward 2004: 433)
b. šumo on-ro xar-id-ed?
you.pl that-acc buy-pst-2pl
‘Did you buy that?’ (Khojayori and Thompson 2009: 64)

The past stem dod- (1a), consisting of the lexical root and the past marker (merged
together as a result of historical processes), is inflected for person and number to
give the finite form dod-am. This form is interpreted perfectively, in contrast to the
past form me-dod-am (ipfv-give.pst-1sg) and the present form me-dih-am
(ipfv-give-1sg), which are both interpreted imperfectively, as a semantic consequence
of the presence of me- (the imperfective marker; see Section 3). Similarly,
the past form in (1b), xar-id-ed, has the perfective aspect, and its 1sg counterpart
xar-id-am contrasts with me-xar-id-am (ipfv-buy-pst-1sg) and me-xar-am
(ipfv-buy-1sg), the imperfective past and present forms, respectively.2
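
The phonological conditioning of -t and -d described in this section, together with the contrast between unmarked (perfective) and me-marked (imperfective) past forms, can be pictured with a small sketch. It is only an illustration under stated assumptions: the function names past_marker and past_form are hypothetical, the segment classes are a rough guess at the relevant inventory rather than one taken from the sources, and the rule is applied only to roots that lexically select the -t/-d pair (roots taking -id, -ist, or -od are outside its scope). The inputs kar- and raf- are the root allomorphs that appear before the past marker in forms cited in the chapters, such as me-kard-Ø and me-raft-agi.

```python
# Illustrative sketch only. The segment classes below are assumptions made for
# this example; they are not an inventory taken from the cited grammars.
VOICELESS = set("ptkqfsšxhč")
VOWELS = set("aeiouū")


def past_marker(root: str) -> str:
    """Pick -t or -d for a root that lexically selects the -t/-d pair."""
    last = root[-1]
    if last in VOICELESS:
        return "t"  # -t after voiceless consonants
    if last in VOWELS or last in ("r", "n"):
        return "d"  # -d after vowels and after [r] or [n]
    raise ValueError(f"final segment {last!r} is not covered by the stated rule")


def past_form(root: str, agreement: str, imperfective: bool = False) -> str:
    """Build a past form; me- is added only for the imperfective reading."""
    stem = root + past_marker(root)
    prefix = "me-" if imperfective else ""
    return f"{prefix}{stem}-{agreement}"


print(past_form("kar", "am"))                     # unmarked past, read as perfective
print(past_form("kar", "am", imperfective=True))  # me-marked past, read as imperfective
print(past_form("raf", "am"))                     # -t after a voiceless consonant
```

The sketch deliberately leaves the lexical side of the system out: which roots take which marker, and which root allomorph the marker attaches to, have to be listed verb by verb, as the text notes.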

2.2 Controversial historical behavior
In this section, the arguable possibility of marking the perfective with an aspectual marker in Classical Persian (as the predecessor of Contemporary Persian and
Tajik) is discussed. In Early Classical New Persian texts (largely from 10th to 13th
centuries A.D.), there is a functional element bi-, which sometimes appears with
the past tense verbs, and less frequently with perfects or other constructions.

2 Perry (2005: 178) assigns the value of “perfective aspect” to present perfect (karda-ast), past
perfect (karda buda-ast), subjunctives (kunad, karda bošad), and the conjectural (karda-gi-st) as
well. To define the perfective category in such an extensive way is highly controversial. At least
in this chapter we prefer not to do so and restrict ourselves to past forms as representatives of
the perfective aspect.

This element is usually described by Persian grammarians to have an “emphatic
function” (Ahmadi-Givi 2001: 164; Bahār 1994: I/333; Farshidvard 2008: 156;
Mashkur 1984: 99–100; Shari’at 1985: 146), or to be “redundant” (Farshidvard
2008: 156; Gharib et al. 1994: 95–96), or even “decorative” (Ahmadi-Givi 2001:
164; Khayyāmpur 1996). On the contrary, some Persian grammars get closer to the
aspectual notions in their analyses of this element. For example, Nātel-Khānlari
(1986: II/204) refers to the use of bē in Middle Persian texts, asserting that some
scholars consider its function as “expressing the completion and occurrence of
the verb” (Reference can also be made to Rastorgueva 2000: 132; and Brunner
1977: 161–162). Nātel-Khānlari rejects the decorative and emphatic functions,
specifically claiming that the latter requires explicit and extensive evidence to be
proven, and that he has not found such evidence. Sedighiān (2004:
71), in her study of some texts of the 11th and 12th centuries, rejects the emphatic
function as well, suggesting no alternative function.
Nyberg (1931, 1974) is probably the first to analyze bē in Middle Persian as
“denoting the perfective aspect of the act, viz. that it comes to an end, or has its
limit” (Nyberg 1974: 46).3 Furthermore, according to Windfuhr (1979: 95), Kaj Barr
(in Andreas 1939: 431–433) was “the first to interpret the seeming emphasis as a
marker of perfectivity” in Classical Persian; and MacKinnon (1977) showed the
perfective function of bi- in a study of a 10th-century text. Amongst grammarians
of Tajik, reference can be made to Perry (2005: 198) for such an idea: “[a]t earlier
stages of the language [i.e. Tajik], bi- had a perfective sense and application in
both stems,4 as the counterpart of the imperfective (ha)me-”. Finally, Rubinčik
(2012: 310) does not use the term perfective but mentions that the marker was
used in the Classical period to indicate a single occurrence of the action, and its
completion. This usage of bi- in Classical Persian is as follows:5

3 For a detailed report of other approaches to the function(s) of bē in Middle Persian, cf. Jügel
(2013).
4 Perry does not provide any argument for the extension of the perfective function from past
stems to present stems. Possibly, he intends the future-marking function of bi- with the present
stems in Early New Persian (cf. Windfuhr 1979: 93–94), which was lost along with its perfective
function with the past stem. In that stage of the language, bi- also served a subjunctive-imperative function with the present stem, with a low frequency (for a statistical report, cf. Mofidi 2021).
This function continued to expand its frequency afterwards, now being actively in use in Persian
(cf. Yousef 2018: 244–258), but “mainly” not in Tajik (Windfuhr and Perry 2009: 456).
5 In transcribing the vowels in Classical Persian, we are following Lazard (1963) and Windfuhr
(1979), among others, who treat these vowels as unchanged from Middle Persian. Some features
of the Classical Persian vowel system are still unchanged in Tajik, while they have experienced
much change in the Persian of Iran.


(2) a. bārān bi-bār-id         va  bar zamin  bi-ist-ād
       rain  pfv-rain-pst[3sg] and on  ground pfv-stand-pst[3sg]
       ‘It rained and the water stayed on the ground’
       (Tārikh-i Bal’ami; in Ahmadi-Givi 2001: 260)
    b. ya’qub in   du’ā   bi-xān-d.         jibra’il bi-āmad
       Jacob  this prayer pfv-read-pst[3sg] Gabriel  pfv-come.pst[3sg]
       ‘Jacob uttered this prayer. Gabriel came’
       (Faraj-i ba’d az shiddat; in Ahmadi-Givi 2001: 260)
To connect the discussion to Tajik more directly, we can refer to Mofidi (2020),
who statistically investigated the marker bi- in fifty-five New Persian texts. Among
these texts, which range from the 10th to the 20th centuries, seven were definitely
written in Central Asia, the home of Tajik, and three other texts have indirect
links to the region. This investigation shows a considerably high frequency of
bi- in some texts that date back to the 10th to 13th centuries. In the following
centuries, the frequency of the marker decreases, ultimately to zero in Contemporary Persian and Tajik. The reasons for the disappearance of this function of
bi- are open to further syntactic and sociolinguistic investigation from a diachronic
perspective.

3 A marking strategy for general imperfective aspect
This section will introduce an inflectional prefix (me-), which is employed in the
aspect system of Tajik to mark the imperfective aspect. This marker has generalized both morphologically and semantically, to be used with almost all Tajik
verbs (except for three stative verbs, i.e., ‘be’, ‘have’, and ‘should’), and to cover
an extensive range of aspectual interpretations in the imperfective domain
(including progressive, habitual, future, and irrealis), and therefore, it can be
called a general imperfective marker. In other words, the generalized character
of me- can be observed at both formal and functional levels, and these two levels
of generalization satisfactorily define a general imperfective device, which has
grammaticalized to a high level as an inflectional prefix. This marking strategy
(i.e., linguistic device) is distinguished from a periphrastic construction in the
Tajik aspect system, which is used specifically to express progressive aspect as
a subcategory of imperfective (this specific strategy, formed with the auxiliary
istodan ‘to stand’, will be addressed in Section 4).


3.1 The marker me- as a prefix
There are at least seven canonical or periphrastic constructions in Tajik in which
the prefixal marker me- is used. This variety of structural occurrence, i.e., the
structural positions of this prefix in the verbal complex, with respect to the verbal
root and other lexical and functional elements, as well as the marker’s morphological and phonological properties, is to be considered carefully. An important
point to be highlighted here is that the properties of me- (and its general imperfective nature) are verified through descriptions and examples presented in several
Tajik grammars, in which a range of imperfective interpretations has been mentioned for this prefix.

3.1.1 Distribution of me-
There is a variety of canonical and periphrastic constructions which employ me-,
either with the finite or non-finite form of the main verb, or with the inflecting
auxiliary. Table 1 provides an overview of all these constructions with xūrdan ‘to
eat’ as an example, classifying them with respect to some grammatical categories. As shown in Table 1, finite verbs (the main verbs or auxiliaries) are inflected
for the agreement features of person and number, and non-finite verbs are formed by adding the
non-finite suffix -a to the past stem of the verb.6
Table 1: Structural distribution of me-.

Present                          Past
(1) me-xūr-am                    (2) me-xūr-d-am
    (ipfv-eat-1sg)                   (ipfv-eat-pst-1sg)
    ‘I eat/am eating’                ‘I used to eat/was eating’

(3) me-xūr-d-a=am                (4) xūr-d-a me-bud-am
    (ipfv-eat-pst-ptcp=1sg)          (eat-pst-ptcp ipfv-be.pst-1sg)
    ‘I have been eating’             ‘Had I eaten . . .’ (irrealis)
                                 (5) me-xūr-d-a boš-am
                                     (ipfv-eat-pst-ptcp be.prs-1sg)
                                     ‘I might have been eating.’

6 We gloss this suffix as ptcp [=participle] in this chapter, in accordance with Leipzig Glossing
Rules. Alternative labels used in Tajik grammars are ger [=gerund] (cf. Ido 2005b) and part
[=participle] (cf. Windfuhr and Perry 2009).


Table 1 (continued)

Present                               Past
(6) me-xūr-d-agi-st-am                (7) xūr-d-agī me-bud-am
    (ipfv-eat-pst-conj-be.prs-1sg)        (eat-pst-conj ipfv-be.pst-1sg)
    ‘I might be eating’ /                 ‘Might I have eaten . . .’ (irrealis)
    ‘I’m about to eat’

In the Table, the indicative constructions (1 and 2) are canonical, only consisting of the inflecting main verb. In the present perfect construction (3), me- is
prefixed to the non-finite verb, to add or emphasize the imperfective sense, while
the construction is used more commonly without me-. In both the unmarked construction and the me-marked one, the non-finite verb hosts agreement clitics7 (as
in the example in Table 1), or it is optionally followed by ast (be.prs[3sg]) for
the third person singular.8 Likewise, the pluperfect (i.e., past perfect)9 is used
without me- far more frequently (e.g. xūr-d-a bud-am) than the me-marked construction (4), which expresses irrealis meaning (cf. Windfuhr and Perry 2009:
458 who assign a “conditional function” to this construction, illustrating their
description with raft-a me-bud-am; see Section 3.3.2 for an introduction to irrealis). Finally, for the subjunctive and conjectural10 categories and their various

7 These enclitics include =am (1SG), =ī (2SG), =em (1PL), =ed (2PL), and =and (3PL). They can
also attach to nouns, adjectives, and even prepositional phrases, functioning as the copula;
e.g., hanūz rozi=ed? ‘are you still content?’ (Perry 2005: 200), and they differ from the regular
agreement markers, both in some of the members of the paradigm and in their distribution. For
more information about Tajik perfect constructions, cf. Perry (2005: 217–219); for a sketch of other
agreement endings of the verb in Persian, which are basically preserved in Tajik as well (though
with phonological differences), cf. Mahmoodi-Bakhtiari (2018: 289).
8 Tajik grammarians believe that this verb is also cliticized to the past participle (Ido 2005b: 58;
Perry 2005: 200; also cf. Mahmoodi-Bakhtiari 2018: 289 for the same analysis in Persian). The
following example illustrates its occurrence in the present perfect construction:
padar-am        čand    marotiba ba faronsa raft-a      ast
father-1sg.poss several times    to France  go.pst.ptcp be.prs[3sg]
‘My father has been [Lit.: gone] to France a number of times’
(Baizoyev and Hayward 2004: 184)
9 Tajik pluperfect forms are all periphrastic, including an inflecting auxiliary bud- (be.pst-), as
in the following examples:
in   kitob-ro man porsol    xar-id-a     bud-am
this book-acc I   last.year buy-pst.ptcp be.pst-1sg
‘I had bought this book last year’
(Rzehak 1999: 68)
10 Abbreviated in Table 1 as conj, which is not listed in Leipzig Glossing Rules.


forms, including constructions (5, 6, and 7), which may have a perfect sense as
well, cf. Ido (2005b: 62–65); Perry (2005: 236–246). Moreover, in addition to the
seven constructions introduced in Table 1, there are several less grammaticalized
aspectual constructions in Tajik in which the auxiliary (or sometimes possibly
the light verb) is inflected in the same way as the regular constructions (1 and 2);
namely, they are present indicative or past indicative. These aspectual
constructions will be introduced in Section 5; see in particular Examples (39) and (40),
which illustrate me- in that section.

3.1.2 Morphological and phonological properties of me-
An important morphological feature of me- is that it is attached to the verb as a
prefix. Referring again to Table 1, the first argument in favor of the prefixal status
of me- arises from its position in constructions (4 and 7). In these constructions,
me- skips the main verb, attaching to the auxiliary, instead. This morphological
behavior is, at least, in contrast to adverbial particles and derivational prefixes,
which usually (but not always; see Examples 6a–c) appear at the margin of the
verbal complex, not inside it. Boz ‘back’ (3) is an example of these markers, and
it precedes the imperfective marker:
(3) vay   zardolu-ro  girift-a      xūrda-xūrda   boz  me-dav-id
    (s)he apricot-acc grab.pst.ptcp eating-eating back ipfv-run-pst[3sg]
    ‘He grabbed the apricot and ran back, eating (all the way)’
    (Perry 2005: 150)
A second, stronger and more general piece of evidence comes from complex predicates. These constructions mostly consist of a noun or adjective which is
combined with a verb to make a single structure (cf. Windfuhr and Perry 2009:
496–500 for the syntactic and semantic properties of Persian complex predicates,
which can also be generally extended to Tajik). The marker me- always appears
within these constructions, being prefixed to the verbal element. Sentences
(4a–b) exemplify this morphological phenomenon (cf. Ido 2005b: 76–77 for the
structure of these verbs, which he calls “compound verbs”; Perry 2005: 459–467
for an extensive range of examples with several verbs).
(4) a. dar berun   bača-ho  bozī me-kun-and
       in  outside child-pl play ipfv-do-3pl
       ‘The children are playing outside’ (Baizoyev and Hayward 2004: 430)


    b. tu-ro      muntazir me-šav-am
       you.sg-acc waiting  ipfv-become-1sg
       ‘I’ll wait for you’ (Perry 2005: 462)

Likewise, some prefixal predicates (formed with a derivational prefix plus a verb,
which can also be seen as a type of complex predicate, in a broad sense) support
the prefixal status of me- (just like the complex predicates in 4a–b), though
some of them diverge from this norm, requiring me- to precede the derivational
prefix. Perry (2005: 452–457) provides a particularly detailed report of the behavior of the
prefixal verbs with respect to me- and na-. He identifies seven “preverbs” in Tajik (discounting the variants): bar-, dar-, fur- (far-, etc.), faro-, boz-,
vo-, and ǧun-. According to his report, the first three in the list (bar-, dar-, fur-)
attach inseparably to some verbal stems, not allowing the inflectional prefixes
(me- and na-) to appear between the derivational prefix and the stem. It seems
that Perry would like to explain this behavior by resorting to frequency: “the three
most frequently occurring preverbs have become inseparably attached to the stem
of the most common verbs of motion” (Perry 2005: 452; with our emphasis on
frequently and common). Regardless of whether there could be alternative
explanations, the instances provided by him are cited below (5), with the
glosses within parentheses extracted from his work. Their usage is then illustrated in (6a–c).
(5) bar-omadan (on/up/over-come) ‘to come/go up, out; ascend, emerge’
    bar-ovardan (on/up/over-bring) ‘to bring up/out, produce’
    dar-omadan (in-come) ‘to come/go in, enter’
    dar-ovardan (in-bring) ‘to bring in, take in, import, introduce’
    fur-omadan (below/down-come) ‘to come/go down(stairs), descend, alight, land’
    fur-ovardan (below/down-bring) ‘to bring down, lower, unload’
    (Perry 2005: 453–454)
(6) a. me-bar-oy-am
       ipfv-out-come-1sg
       ‘I leave’ (Olson 1994: 81)
    b. me-dar-or-em
       ipfv-in-bring-1pl
       ‘We bring in’ (Rzehak 1999: 75)


    c. az in   kor  heč       čiz   na-me-bar-oy-ad
       of this work (not) any thing neg-ipfv-up-come-3sg
       ‘Nothing will come of this (matter)’
       (Perry 2005: 453)
On the other hand, the same derivational prefixes (bar-, dar-, fur-), when combined with other verbs, display a different morphological behavior: they require
me- and na- to immediately precede the verbal stem, so that the derivational prefix itself is separated from the
stem (as in 7a and 7b below). This is in line with the general
pattern among other Tajik prefixal predicates (8). The only verb which displays
structural variation in this regard is bar-doštan ‘to pick up’: it allows both patterns
(cf. Perry 2005: 453 for one example of this verb).11
(7) a. ba xona-i   xud bar-me-gard-ad
       to home-gen own back-ipfv-turn-3sg
       ‘He’ll return home’ (Perry 2005: 453)
    b. davo-ro      furū na-me-bar-ad
       medicine-acc in   neg-ipfv-take-3sg
       ‘(S)he won’t swallow the medicine’ (Perry 2005: 454)

(8) rayon-i      sanoatī    qism-i   janubi-i     šahr-ro
    district-gen industrial part-gen southern-gen city-acc
    faro   me-gir-ad
    across ipfv-take-3sg
    ‘The industrial district occupies/covers/encompasses the southern part of the city’
    (Perry 2005: 454)
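
The two placement patterns described above, inflectional prefixes before an inseparable preverb (6a–c) versus between a separable preverb and the stem (7a), can be summarized in a small sketch. The Python function below is only an illustration under stated assumptions: the name build_form, its parameters, and the hyphen-segmented output are hypothetical conventions chosen for exposition, and the separable/inseparable status must be supplied by the caller rather than being looked up in a lexicon. The sketch does not model bar-doštan, which allows both patterns.

```python
from typing import Optional


def build_form(stem: str, ending: str, preverb: Optional[str] = None,
               inseparable: bool = False, negated: bool = False) -> str:
    """Assemble a hyphen-segmented imperfective verb form (illustrative only)."""
    # Inflectional prefixes always occur in the order na- me-.
    inflectional = (["na"] if negated else []) + ["me"]
    if preverb is None:
        morphs = inflectional + [stem, ending]
    elif inseparable:
        # Pattern of (6a-c): inflectional prefixes precede the preverb.
        morphs = inflectional + [preverb, stem, ending]
    else:
        # Pattern of (7a): inflectional prefixes stand between preverb and stem.
        morphs = [preverb] + inflectional + [stem, ending]
    return "-".join(morphs)


print(build_form("oy", "am", preverb="bar", inseparable=True))                # me-bar-oy-am (6a)
print(build_form("oy", "ad", preverb="bar", inseparable=True, negated=True))  # na-me-bar-oy-ad (6c)
print(build_form("gard", "ad", preverb="bar"))                                # bar-me-gard-ad (7a)
```

Passing the inseparable flag explicitly keeps the sketch from asserting which preverb-verb combinations are lexically fused; that information would have to come from a description such as the list in (5).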
The third piece of evidence that me- is a prefix is its position
with respect to the negative marker na-. In all of the constructions in which me-
and na- co-occur (i.e., in all negative imperfective forms), the order of occurrence is
na-me-, i.e., the imperfective marker always appears after the negative marker.
If we accept the prefixal status of na- (at least, since it carries the primary stress
of the whole verb form as a single word), me- would inevitably be a prefix. On

11 For a comparison of all these prefixal verbs to their Persian counterparts, see Section 3.2,
which discusses the issue from a diachronic perspective. Also cf. Ioannesyan (1998: 151), who reports some inseparable prefixal predicates in the Dari dialect of Kabul, and the absence of this
kind of inseparability in the Dari dialect of Herat, the latter resembling the Persian dialect of
Tehran in this respect.


the contrary, if we do not accept this assumption about na-, the whole argument
will turn out to be circular. Examples (9a) and (9b) below (also see 6c above) illustrate
negative imperfective forms:
(9) a. pašša-xona     a    inja yoft-a        na-me-ton-et,
       mosquito-house from here find.pst-ptcp neg-ipfv-can-2sg
       aka-jon
       ‘Brother, you can’t find mosquito-nets here’
       (Ido 2007: 93)
    b. tu     ki   čiz-e      na-me-dod-ī,          čaro hamon
       you.sg that thing-indf neg-ipfv-give.pst-2sg why  same
       dam-i     dar  na-guft-ī?
       front-gen door neg-say.pst-2sg
       ‘Since you were not going to give me anything, why didn’t you say so,
       right at the door?’
       (Windfuhr and Perry 2009: 453)
Furthermore, as mentioned above, and as far as can be ascertained from Tajik
grammars, me- always appears in this single form, with no variation across
phonological environments. This means that no allomorphic variation can be
assumed for this marker, not even before vowels, which is a frequent phonetic context, since several verbal roots begin with vowels (e.g., omadan ‘to
come’, ovardan ‘to bring’, etc.). Whether the grammars present precise
information on this point is a question that cannot yet be answered with certainty. The issue
calls for further phonetic investigation, because even if we accept the
aforementioned feature of me-, it still needs to be validated, possibly by reference
to general phonological rules of Tajik and/or any specific morpho-phonological
properties of me-.
Finally, as a phonological feature of me-, it is worth pointing to its stress-bearing nature, unless the verb is negated with na- (whose stress-bearing force is
stronger than that of me-). Perry (2005: 27) states that “[s]tress in verb forms is basically
regressive, i.e., the first syllable of a finite, conjugated form carries the stress”,
and that “[p]refixes bi-, me-, na- are always stressed; if more than one occurs, the
first (the negative na-) is stressed”. For example, in the forms me-rav-am ‘I go’,
me-gūy-am ‘I say’, and me-šinoxt-a ‘(S)he has known (it)’, the primary stress is on
the imperfective marker, but in na-me-rav-am ‘I don’t/won’t go’, na-me-gūy-am ‘I
don’t/won’t say’, and na-me-šinoxt-a ‘(S)he hasn’t known (it)’, the primary stress
falls on the negative marker, not on the imperfective.
Windfuhr and Perry (2009: 430), however, have reservations about the regressive nature of stress: “though less so in Tajik than in Persian”. They do not give
any more details, but this could possibly be in line with Ido (2005b: 15–16), who


refers to some contradicting remarks of the grammarians, concluding that “there
appears to be no general agreement among Tajik linguists regarding stress placement in cases where me- occurs or where na- and me- co-occur.”12
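
Perry’s (2005: 27) generalization quoted above can be stated as a simple procedure. The sketch below is purely illustrative: the function name stressed_prefix and the hyphen-segmented input format are assumptions made here for exposition, and, given the disagreement reported by Ido (2005b: 15–16), the output should be read as what Perry’s rule predicts rather than as an established fact about Tajik stress.

```python
# Prefixes that Perry (2005: 27) lists as always stressed.
STRESSED_PREFIXES = ("na", "me", "bi")


def stressed_prefix(form: str):
    """Return the prefix predicted to carry primary stress, or None.

    `form` is a hyphen-segmented verb form such as "na-me-rav-am".
    Under Perry's rule, if more than one prefix occurs, the first is stressed,
    so only the initial morph needs to be inspected.
    """
    first = form.split("-")[0]
    return first if first in STRESSED_PREFIXES else None


for form in ("me-rav-am", "na-me-rav-am", "me-šinoxt-a", "na-me-šinoxt-a"):
    print(form, "->", stressed_prefix(form))
```

The function returns None for forms that begin with a derivational prefix (e.g., bar-me-gard-ad), since the quoted rule does not say how such forms are stressed.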

3.2 Diachronic development of me-
Historically, the inflectional marker me- has grammaticalized from the adverb
hamē(w) ‘always, continually, forever’ of Middle Persian (approximately 3rd–10th
centuries A.D.), which, in turn, is argued to come from the reconstructed form
*hama-aiwa- ‘same duration, time’ of Old Iranian (cf. Bubenik 2019: 199–200;
Josephson 2016: 49). Interestingly, the vowel quality of the form in Tajik (me-)
is closer to its historical origin than the form used in the Persian of Iran (mi-).
Therefore, the two varieties share the phonological reduction of dropping the first
syllable of the original adverb (also called phonetic erosion), though Persian has
experienced a general change of ē>i in all phonological environments (cf. Meier
1981 for an extensive discussion about this phonological change).
Whereas adverbs are expected to enjoy high flexibility in their structural positioning, the decategorialization from adverb to prefix (regardless of possible
intermediate categories in between) has established a fixed position for me- in
the verbal complex (as shown in Section 3.1). From this point of view, the data
presented in Table 1 clearly point to the fact that me- is always prefixed to the
verbal root. However, some sentences apparently do not follow this general statement and are exceptions to the rule (6a–c). At least in the grammars, as well as
in our fieldwork data, we do not observe any sign of dialectal variation or change
towards uniformity for these exceptions in Tajik. In the Persian of Iran, however,
these exceptional derivational prefixes follow a uniform pattern along with
other prefixal and complex predicates, all allowing the inflectional prefixes to
attach directly to the verbal root. In other words, in
Persian, none of the inflectional prefixes (including mi-, be-, and na-) precedes the
derivational prefix or the non-verbal element of the complex predicate; all of
them appear immediately before the verbal root (cf. Windfuhr and Perry 2009:
448; Yousef 2018: 179). The complex predicate (10) and the prefixal predicates
(11a–b) are from Persian, to be compared to Tajik (4a–b and 6a–c), respectively;
also cf. Yousef (2018: 225) for an example of bar-dāštan ‘to pick up’ in Persian,

12 This issue can be a future fieldwork project, and the results could possibly be considered in
relation to the general stress-bearing positions and markers in each Tajik dialect.


which uniformly requires mi- to appear between its two constituent parts, in contrast to Tajik, which allows variation in the placement of me- with this verb (bar-doštan, as mentioned in Section 3.1.2).
(10) bāzi ne-mi-kon-am
     play neg-ipfv-do-1sg
     ‘I won’t play’ (Mahootian 1997: 88)

(11) a. az   qarāen      bar-mi-āy-ad     ke   bimār ast
        from evidence.pl up-ipfv-come-3sg that sick  be.prs[3sg]
        ‘It seems that (s)he is sick’ (Tabibzadeh 2012: 99)
     b. dastkeš-am-rā      dar-mi-āvar-am
        glove-1sg.poss-acc out-ipfv-bring-1sg
        ‘I am taking off my gloves’ (Yousef 2018: 242)

3.3 Functional generalization
Following the diachronic discussion in the previous section, which addressed
phonetic erosion and decategorialization as two parameters of the grammaticalization of me-, we can now turn to its semantic and pragmatic features. Within
a grammaticalization framework, some amount of bleaching (so-called desemanticization) and pragmatic enrichment is expected to
have occurred for the grammaticalizing element (me- in this case). From an aspectual
perspective, the result of this process is a highly generalized functional element,
which covers a wide variety of imperfective interpretations. The following subsections address this functional generalization. First, the two core interpretations of
the imperfective, i.e., progressive and habitual, will be introduced, and then the two
extended interpretations, i.e., future and irrealis, will be analyzed.

3.3.1 Progressive and habitual interpretations
The progressive aspect, either as a separate grammatical device (see Section 4 for
this usage in Tajik), or as part of the general imperfective (as discussed here
for me-), indicates that the event is in progress, i.e., it is going on in a specific situation at a certain time. This viewpoint on the event, taken by the speaker,
can include two sub-divisions. Either a mere concept of duration in a time interval
is intended, or a certain time point of a continuing event is focused on, sometimes
simultaneous with another event (cf. Bertinetto et al. 2000, who introduce “durative” and “focalized” in more detail). The former can be a dynamic,
ongoing event, or more probably, a constant state, while the latter is essentially
dynamic. These two interpretational choices are instantiated by (12a–b) and (13),
respectively:13
(12) a. man zabon-i      anglisi-ro  me-fahm-am,         ammo
        I   language-gen English-acc ipfv-understand-1sg but
        xub  gap  zad-a        na-me-tavon-am14
        well talk hit.pst-ptcp neg-ipfv-can-1sg
        ‘I understand English, but can’t speak it well’
        (Baizoyev and Hayward 2004: 437)
     b. du  sol-i    raso  intizor me-kard
        two year-gen exact wait    ipfv-do.pst[3sg]
        ‘She waited two whole years’ (Perry 2005: 215)

13 Our theoretical classification here is regardless of the diverging labels which are used by
Tajik grammarians. Perry (2005) generally calls the me-marked forms “Indicative” in present
tense, and “Imperfect” in the past, mentioning “durative” as one of the uses of the latter. This
last term is also used in his work for the me-marked present perfect and subjunctive, while the
term “progressive” is limited to the periphrastic construction of istodan. Ido (2005b) distinguishes the me-marked forms and istodan-constructions with the terms “imperfective” and “progressive”, respectively. Khojayori and Thompson (2009) call the present imperfectives “present-future tense”, the past imperfectives “imperfect”, and the istodan-constructions “continuous”.
14 According to Perry (2005: 338), the past participle in such examples “may have derived from
the Infinitive construction”. His original examples are raft-an me-tavon-am and raft-a me-tavon-am, both meaning ‘I can go’, in which the latter could be derived from the former, accordingly.
The same analysis can be traced back to Lazard (1956: 176), who believes that “what looks like a
past participle in these constructions has been explained as being derived historically from an
old infinitive (raftan) whose -n has been dropped” (Paul 2019: 610). The use of the full and shortened infinitival forms (raftan and raft, respectively) is quite common in Classical Persian, though
in the modal+infinitive order (such as tavān-am raft/raftan) (cf. Ahmadi-Givi 2001: 1362–1381 for
a large number of examples in various tenses; Nātel-Khānlari 1986: II/359–360). An anonymous reviewer kindly reminded us that the opposite order (i.e., infinitive+modal, as in Tajik) is attested in
Classical Persian in a poem by Omar Khayyām (d. 12th c.): man bi may-i nāb zist-an na-tvān-am ‘I
cannot live without pure (strong) wine’. Generally, Paul (2019: 610) points out that “[t]he inverse
order of modal and full verb here [in Tajik dialects and Kaboli Darī], in contrast to modern Fārsī,
may reflect Turkic influence”.


(13) vay me-don-ist         ki   zan-i    xud dar xona ast
     he  ipfv-know-pst[3sg] that wife-gen own at  home be.prs[3sg]
     va  me-xand-ad
     and ipfv-laugh-3sg
     ‘He knew that his wife was at home (and was) laughing’ (Perry 2005: 212)
Both inflecting verbs in (12a) are present indicative, being interpreted as if they
express a state holding at the present time in a durative way. The verb in (12b) is
stative as well, as a result of a ‘long time’ adverb, which turns the waiting process
into a mental event, rather than physical (compared to waiting somewhere for an
hour, which is physical, though still stative, and not a dynamic activity). The first
verb of (13), me-don-ist, is still within the same category as the verbs in (12a–b):
knowing something as a state of affairs in the past tense. On the other hand, the
last verb of (13), me-xand-ad, is an example of the focalized progressive, which is
a more typical sort of the progressive category, being a representative instance of
an ongoing action, i.e. laughing.
As evident through our examples above, the distinction between durative
and focalized interacts with the event type, to a large extent, but generally, the
focalized interpretation is closer to the typical usage of the term progressive. As
mentioned by Mair (2012: 812), among others, “progressives are largely incompatible with stative verbs and predicates, although, of course, the degree of incompatibility varies across languages”. In Tajik, there are only three stative verbs that
refuse to take me- (see Section 3.4), and other statives take the prefix in imperfective contexts. As exemplified above, this latter group represents states being
held in their time intervals, without being focused on, and therefore, they can be
called durative. Other examples of this group include mondan ‘to stay’, istodan
‘to be standing’/‘to stay’ (but not ‘to stop’ or ‘to stand up’ [from a non-standing
position], which are punctual motion verbs), etc. (cf. Perry 2005: 221–223 for more
examples of statives; Ido 2005b: 55 for mentioning that the verbs istodan ‘to
stand’, nišastan ‘to sit’, and doštan ‘to have’ do not appear as the main verb in the
progressive form). The term durative for these statives is close to Comrie’s (1976)
“continuous” which includes progressives plus statives.
A second core interpretation of imperfective is habitual, which is cross-linguistically “more commonly included in the meaning of a more general gram
[=grammatical morpheme], such as imperfective or present, than expressed separately” (Bybee et al. 1994: 159–160). This is exactly instantiated by me- as an
imperfective marker, embracing the habitual meaning in Tajik as well. As put by
De Haan (2011: 451), “[h]abitual aspect refers to situations in which the speaker
wishes to express that the action being described occurs more than once”. Therefore, the habitual category is a viewpoint of multiple occurrences of the event, as


exemplified below. Example (14a) expresses a current habit, without specifying the periods between occurrences; (14b) refers to an event recurring
at specific time intervals in the past; and (14c) is a habitual present perfect
construction (numbered as [3] in Table 1), which expresses an event that recurs in a
period of time which begins in the past and continues until the present moment,
unless it is explicitly specified that the event stops recurring before the present
(which is not the case here).
(14) a. mo odatan  čoy-i   kabud me-nūš-em
        we usually tea-gen green ipfv-drink-1pl
        ‘We usually drink green tea’
        (Baizoyev and Hayward 2004: 412)
     b. sol-i    guzašta man har   hafta ba barodar-am
        year-gen last    I   every week  to brother-1sg.poss
        maktub me-navišt-am
        letter ipfv-write.pst-1sg
        ‘Last year I used to write letters to my brother every week’
        (Khojayori and Thompson 2009: 85)
     c. har   sol  tobiston ba qišloq-amon      bar-me-gašt-a
        every year summer   to village-1pl.poss back-ipfv-turn.pst-ptcp
        ast
        be.prs[3sg]
        ‘He [is said to have] . . . returned every year in summer to our village’
        (Perry 2005: 230)

3.3.2 Interpretational extensions: Future and irrealis
Future is essentially a temporal category, being a division on the time axis in
the languages which grammaticalize it: “a prediction on the part of the speaker
that the situation in the proposition, which refers to an event taking place after
the moment of speech, will hold” (Bybee and Pagliuca 1987, in Bybee et al. 1994:
244). In the formal/literary variety of Tajik, there is a periphrastic construction
dedicated to future time reference, made with the finite auxiliary xoh- ‘to want’,
followed by the past stem of the main verb as a non-finite form.15 This auxiliary
15 In Classical Persian, the full infinitival form (the past stem plus the infinitival suffix -an) was
sometimes used in this construction, e.g., xāh-am raft-an (want-1sg go.pst-inf) ‘I will go’ (cf.
Jahani 2008; Windfuhr 1979: 88). Grounded on this old usage, the grammarians assume that the
infinitival form in the current usage has been contracted, i.e., it has lost the infinitival suffix -an

200

Roohollah Mofidi, Negin Mohammadi Nafchi

is inflected for person, number, and negation, but not for aspect and mood. Its
lack of aspectual inflection, i.e., not receiving me-, is probably a remnant of the
historical stage of Persian in which (ha)mē had not generalized to its current
extent. The stylistically limited usage of this construction (formal/literary) has
been acknowledged by Tajik grammarians as well. Perry (2005: 216) mentions that “[t]he
tense is little used in everyday speech and writing, for which the Present Indicative is preferred”. Baizoyev and Hayward (2004: 126) call it “a feature of literary
Tajiki” which “is not used in colloquial”. Rzehak (1999: 86) calls it a categoric
future, being mainly used in written language (also cf. Ido 2005b: 58; Khojayori
and Thompson 2009: 126–127; Windfuhr and Perry 2009: 489).16 This usage is illustrated in (15a–b):
(15) a. bekorī       har   sol  ziyod xoh-ad   šud
        unemployment every year much  want-3sg become.pst17
        ‘Unemployment will increase every year’
        (Rzehak 1999: 86)
     b. bar  na-xoh-and   gašt
        back neg-want-3pl turn.pst
        ‘They will not return’ (Perry 2005: 216)

Regardless of this specific usage, the concept of time has generally been grammaticalized in spoken Tajik with two values, i.e., past and non-past, and reference to
the future is contained within the non-past tense. In this case, we are not dealing with
a primary future expressed by a dedicated marker (Rzehak’s [1999] categoric future), but with a category that has been extended from imperfective
aspect to include future. This extension to future time reference necessarily
relies on the context, i.e., it is the context which provides the required information
for the listener to interpret the event as occurring in the future. In fact, “the
form would be interpreted as indicating present if no temporal reference is established by the context” (Bybee et al. 1994: 275, who call this category aspectual
future).18 Example (16a) below contains no temporal specification, and depending on the
situational context, it may be interpreted as progressive, habitual, or future
(cf. Perry 2005: 211 for his explanation of this sentence; Olson 1994: 82 for a
similar example and his explanation). The other example (16b), on the other hand,
is enriched with future temporal information by means of an adverb.

(cf. Ido 2005b: 58; also cf. Maggi and Orsatti 2018: 48–49, who call this periphrastic future “a
Post-Classical development”, which has specialized in later Persian as the new future).
16 Windfuhr and Perry (2009: 451) refer to a “vestigial” usage of bi- that “occurs regularly only
as a morphological suppletive in Stem I forms [=present forms] of the two common verbs o-/omad-
‘come’ and or-/ovard- ‘bring’: me-bi-oyam ‘I come, am coming’; bi-or, bi-or-ed ‘bring (it)’.”
The usage of bi- in their first example (bi- + present form) may be a remnant of Classical Persian,
in which bi- with present stems could express future, or more precisely, completion in the future
(see Section 2.2, footnotes; cf. Jahani 2008; also cf. Perry 2005: 214–215 for the complete inflectional forms of me-bi-yo-yam, and for an example of ovardan ‘bring’: na-me-bi-yor-and (neg-ipfv-fut?-bring-3pl) ‘They won’t bring (it)’ (the gloss “fut?” is ours); and Perry 2005: 336 for
another example which has been translated as occurring in the near future).
17 The absence of an abstract agreement feature in the gloss shows the non-finite status of this
form, as opposed to the finite forms of past tense, which are zero-marked for third person singular (see Examples 3 and 12b).
(16) a. man xona me-rav-am
        I   home ipfv-go-1sg
        ‘I’m on my way home’/‘I go home’/‘I’m going home (soon)’
        (Perry 2005: 211)
     b. man pagoh    ba xujand  me-rav-am
        I   tomorrow to Khujand ipfv-go-1sg
        ‘I am going to Khujand tomorrow’
        (Baizoyev and Hayward 2004: 77)
Then, a second extended semantic category to be mentioned for me- is irrealis
mood (also called counterfactuality). Cross-linguistically, the expression of irrealis by means of imperfective devices is quite common (cf. Lazard 2006 for a categorization of the “means of expressing the counterfactual”, “[o]n the basis of a
sample of a few dozen languages”). This extension from imperfective to irrealis
can generally be categorized within the interaction of aspect and modality (cf.
Timberlake 2007: 326 for the definition of the category of mood as “modality crystallized as morphology”). In Tajik, the irrealis use of imperfective aspect marker
occurs in the past tense, and it is contrasted to the subjunctive category, which
occurs in the present tense, “mainly” without an inflectional marker (for this
latter category, cf. Windfuhr and Perry 2009: 456). This contrast has been shown
(in 17a–b) for the irrealis, as opposed to the subjunctive (in 18a–c):
(17) a. agar man ū-ro      me-did-am,       ba vay   me-guft-am
        if   I   (s)he-acc ipfv-see.pst-1sg to (s)he ipfv-say.pst-1sg
        ‘If I saw him/her, I would tell him/her’
        (Ido 2005b: 54)
     b. koš                     man ū-ro      me-did-am
        would (that) / (I) wish I   (s)he-acc ipfv-see.pst-1sg
        ‘Would that I had seen her’ (lit. I wish I would see her)
        (Khojayori and Thompson 2009: 132)

18 At this point, a reference should be made to Haig (2019: 75), who argues for an aspect-neutral
analysis of the prefix mi- with present stems in Contemporary Persian. He believes that “these
erstwhile aspectual markers have become bleached of aspectual content”. His analysis, though
controversial, can essentially be relevant to Tajik, as well.

(18) a. agar ba šahr rav-ī, barodar-am-ro        me-bin-ī
        if   to city go-2sg brother-1sg.poss-acc ipfv-see-2sg
        ‘If you go to the city, you’ll see my brother’
        (Baizoyev and Hayward 2004: 166)
     b. agar tu     me-rav-ī,   man bo   tu     me-rav-am
        if   you.sg ipfv-go-2sg I   with you.sg ipfv-go-1sg
        ‘If you go, I’ll go with you’
        (Khojayori and Thompson 2009: 143)
     c. omad-am      ki   ū-ro      bin-am
        come.pst-1sg that (s)he-acc see-1sg
        ‘I came to see him’
        (Windfuhr and Perry 2009: 522)
Example (17a) is a counterfactual conditional in both of its clauses, and both
of the me-marked imperfective verbs are interpreted as expressing the irrealis
mood. The morphology of both verbs is past tense, and their reference time could
be the past, because both verbs denote events that could have occurred in the
past (before now) but can no longer be realized, as the time/chance is over.
Example (17b) follows the same morphology and semantics, being pragmatically understood as an unrealized wish. By contrast, the
conditional clause of (18a) is in the subjunctive mood, and its verb is morphologically unmarked for this category, being interpreted as a situation that can be met
in the future (since the morphology of the verb denotes present tense, which is
interpreted as non-past time in Tajik, including the present and future). Accordingly, the second clause in this example also refers to the future, and therefore
the me-marked verb receives a future interpretation. In the same way,
both me-marked verbs of (18b) are interpreted as referring to the future. The difference, however, is that in (18b) “[t]he speaker knows for certain
that the other person is going” (Khojayori and Thompson 2009: 143), in contrast
to (18a), which lacks such certainty on the speaker’s part. Finally,
the last example (18c) contains a purpose clause (as its second clause), and its
subjunctive verb denotes the possibility of the event’s realization (though this realization is less certain than with an indicative verb, which can be called realis), as opposed to the unrealized event (in 17b).

The development of such a shared marker for past imperfective and irrealis
has a counterpart in Classical Persian. In that case, the marker was not a
general imperfective one, but a morphological device for marking the past habitual: -ē, which was suffixed to the verb, mostly in the past tense, and was only
rarely used with the present tense (cf. Lenepveu-Hotz 2012, 2014 as recent accounts;
Lazard 1963; Windfuhr 1979). The remnants of this suffix in Persian are the forms
bāy-est-i (should-pst[3sg]-irr)19 and mi-bāy-est-i (ipfv-should-pst[3sg]-irr),
neither of which has been mentioned in Tajik grammars, so far as can be ascertained. Examples (19a–b) below are from Classical Persian, with the habitual and irrealis
functions of -ē, respectively. Example (20) illustrates bāy-est-i in Persian, which
is called “the frozen archaic counterfactual” by Windfuhr and Perry (2009: 490)
(also cf. Perry 2007: 1003). Phillott (1919: 539 f.) reports that this suffix “is still
used in conditional sentences by both Indians and Afghans in speaking”, and
Ioannesyan (1998: 151) provides two examples of bud-i (be.pst[3sg]-irr) from the
Dari dialect of Herat.
(19) a. ādat-i    ō  ān   bud         ki   ba Aila nišast-ē
        habit-gen he that be.pst[3sg] that to Aila sit.pst[3sg]-hab
        [. . .] va  šarāb bā   zan-ān   xwar-d-ē
                and wine  with woman.pl drink-pst[3sg]-hab
        ‘It was his habit to reside in Aila [. . .] and to drink wine with the women’
        (Tārix-i Sistān; in Lenepveu-Hotz 2014: 236)
     b. agar sajda       mar haq rā  bud-ē,          iblis takabbur
        if   prostration for God for be.pst[3sg]-irr Devil pride
        na-kard-ē
        neg-do.pst[3sg]-irr
        ‘If the prostration had been for God, the Devil would not have had a
        disdainful attitude’
        (Rauzat al-ahbāb; in Lenepveu-Hotz 2014: 239)

19 We suggest including the abstract agreement information [3sg] in the gloss, in comparison to
the Classical Persian usage of non-defective verbs in which the agreement marker appears before
the irrealis marker -ē, such as dān-ist-am-ē (know-pst-1sg-irr) ‘I had known’ (cf. Nātel-Khānlari
1986: II/322–325 for an attested example of this verb, and several examples of other verbs, including bāyistan).

(20) dowlat     bāy-est-i           barā-ye refāh-e
     government should-pst[3sg]-irr for-gen welfare-gen
     hāl-e            kārmand-ān  eqdām-āt-e    lāzem     be
     circumstance-gen employee-pl action-pl-gen necessary to
     amal   āvar-ad
     action bring-3sg
     ‘The government should take necessary actions for welfare circumstances
     of the employees’
     (Data corpus; in Akhlāghi 2008: 97)

3.4 Lexical restrictions
Lexically speaking, me- has generalized insofar as it is obligatorily used with all
imperfective verbs, except for three stative predicates. These predicates, which
are not included within the generalizational domain of me- are ‘be’, ‘have’,
and ‘should’. The following roots/stems are to be considered as such: ast (be.
prs[3sg]), hast- (be.prs), bud- (be.pst), dor- (have.prs), došt- (have.pst), boy-ad
(should-3sg), and bo-ist (should-pst[3sg]).20 Some of these forms are inflected
for person and number, but generally not for aspect and mood, though with some
exceptions. However, some of them resist the inflections of person and number,
being even mostly neutral to tense, but they reject taking me- to a lesser extent.
The following section will address all these types of morphological behavior, as
well as the marginal cases in which these verbs can take me-. The general idea is
that the verbs under investigation are the sources of restriction in the generalizational pattern of me-.

20 More precisely, the Tajik modal boistan should be translated as ‘must/should’. We continue
calling and glossing the forms made with it as ‘should’ only for the sake of brevity.

3.4.1 Budan ‘to be’
Following the general, cross-linguistic feature of stative verbs, i.e., that they are
not progressivized (see Section 3.3.1), we can assume that Tajik stative verbs,
also, do not embody a progressive meaning (cf. Perry 2005: 223, among others).
However, these verbs can express other imperfective interpretations (durative,
habitual, future, and irrealis) in appropriate contexts by means of the imperfective morphology, i.e., the prefix me- (see Example 12a for fahmidan ‘understand’
and tavonistan ‘to be able to’, and Example 13 for donistan ‘to know’; also cf. Perry
2005: 340–342 for examples of xostan ‘to want’). On the other hand, there are the
three stative verbs of Tajik that do not generally take the imperfective me-, which
shows that the restriction is of a more general nature than that of the incompatibility of statives with a progressive meaning. The forms of the verb ‘be’ constitute
the first class of such verbs. Examples (21a–c) represent different stems of this
verb in imperfective environments, without me- appearing before them:
(21) a. xohar-am        bist-sola      ast
        sister-1sg.poss twenty-of.year be.prs[3sg]
        ‘My sister is 20 years old’
        (Khojayori and Thompson 2009: 106)
     b. mo kambaǧal hast-em
        we poor     be.prs-1pl
        ‘We are poor’
        (Rzehak 1999: 10)
     c. šumo   dar boǧ  bud-ed?
        you.pl in  park be.pst-2pl
        ‘Were you in the park?’
        (Olson 1994: 58)

Some grammarians of Tajik seem to attribute the restriction (21a–c) to the stative
nature of the predicate. For example, Perry (2005) calls budan and doštan “stative
verbs”, and then provides a list of “dynamic-stative verbs”: donistan ‘to know’,
šinoxtan ‘to know/recognize’, nišastan ‘to sit’, šištan ‘to sit’, xobidan ‘to sleep’,
pūšidan ‘to wear’, jo(y) giriftan ‘to settle (down)/be located’, and istodan ‘to
stand’, (Perry 2005: 223), and he describes bud- as “[t]his past tense is by nature
durative and does not take me-” (Perry 2005: 205). He clarifies his term dynamic-stative verbs as “[they] may express either an action or a state, depending on
tense and context” (Perry 2005: 221), but he does not explain if this feature could
be the source of morphological contrast between the forms of ‘be’ (as in 21a–c)
and other stative predicates (as in 12a and the first verb in example 13). In fact,
the grammars do not convincingly account for not observing the same restriction
of ‘be’ for other stative verbs.
Another verb that complicates the issue further is boš-, a stative verb with the
same meaning of ‘be’, whose “stem has an imperfective or durative sense” (Perry
2005: 203). Surprisingly, this present tense verb (with no past counterpart) takes
me- obligatorily in its general imperfective use, as exemplified (22a–b):
(22) a. on   mošin-i padar-am        me-boš-ad
        that car-gen father-1sg.poss ipfv-be.prs-3sg
        ‘That’s my father’s car’
        (Khojayori and Thompson 2009: 106)
     b. tu     korgar me-boš-ī
        you.sg worker ipfv-be.prs-2sg
        ‘You are a worker’
        (Rzehak 1999: 18)

Thus, in our opinion, a conclusion, so far, could be that the restriction is lexical
but not necessarily due to the stative nature of the predicates. We further suggest
that the issue should be investigated theoretically even more, through examination of other possible hypotheses. The fact that this verb shows some idiosyncrasies (e.g., having several different tense-based stems, being used as an auxiliary
in several constructions, etc.) can be a clue to more discussions, and semantic
accounts could also serve as pertinent pathways.
On the other hand, the past stem of ‘be’ (bud-) can optionally take me- when
used in its irrealis meaning (which is not imperfective in its strict sense, as discussed in Section 3.3.2). This form is infrequent, but there is no report on its frequency rate.21 The following (23a–b) illustrate the usage:
(23) a. agar vay   zinda me-bud, . . .
        if   (s)he alive ipfv-be.pst[3sg]
        ‘If he were alive (today), . . .’
        (Perry 2005: 379)
     b. koš              ki   man dar tojikiston me-bud-am
        would / (I) wish that I   in  Tajikistan ipfv-be.pst-1sg
        ‘Would that I were in Tajikistan’ (lit. I wish I were in Tajikistan)
        (Khojayori and Thompson 2009: 132)

21 Windfuhr and Perry (2009: 460) claim that “[i]n Tajik, but not in Persian, me- in its counterfactual function may be added to Stem II: bud-am, etc. ~ me-bud-am, etc.”. We wonder if they
mean that these counterfactual me-marked forms are more frequent in Tajik than Persian, since
these forms are not uncommon in Persian at all, though they are infrequent (for Persian examples, cf. Yousef 2018: 270–271). However, Perry (2007: 1002) admits the existence of this morphological behavior in Persian: “it may occur optionally to form the (homomorphic) conditional
tense: agar (mi-)budam ‘if I were/had been’, etc.”.

3.4.2 Doštan ‘to have’
This verb, with its two stems (dor- and došt- for present and past, respectively) and
its various forms, makes up the second class of verbs that do not generally accept
me- in imperfective contexts. It resembles the verb ‘be’ in not accepting aspect and
mood morphology (called “irregular and partially defective” by Windfuhr and Perry
2009: 459, for this feature), and also in inflecting for agreement categories (person
and number). However, a difference between ‘be’ and ‘have’ is that the restriction
for me- is less severe with the latter than the former. In other words, me- appears
with more forms and meanings of ‘have’-verbs, compared to ‘be’-verbs, but generally, ‘have’-verbs still display more restrictions than typical Tajik verbs outside the
domain of the three exceptional ones (i.e., ‘be’, ‘have’, and ‘should’).22 The following (24a–c and 25a–c) display unmarked and me-marked instances, respectively:
(24) a. man zavja-i  xub  dor-am
        I   wife-gen good have-1sg
        ‘I have a good wife’
        (Ido 2005b: 57)
     b. ū     yak-čand farzand      došt
        (s)he one-some son/daughter have.pst[3sg]
        ‘He had several children’
        (Baizoyev and Hayward 2004: 404)
     c. on-ho   farq       dor-and
        that-pl difference have-3pl
        ‘They are different (lit.: They have a difference)’
        (Rzehak 1999: 20)


(25) a. vay   qalam-aš-ro         me-dor-ad
        (s)he pencil-3sg.poss-acc ipfv-hold-3sg
        ‘She is holding her pencil’
        (Khojayori and Thompson 2009: 107)
     b. man birinj dūst   me-dor-am
        I   rice   friend ipfv-hold-1sg
        ‘I like rice’ (lit.: I hold it as a friend)
        (Khojayori and Thompson 2009: 84)
     c. xūk nigoh me-došt-and
        pig look  ipfv-keep.pst-3pl
        ‘They used to keep pigs’ [lit.: They used to keep an eye on pigs]
        (Perry 2005: 207)
The following quotation offers an explanation of the examples above:
The meaning ‘have’ of this verb is derived from its basic meaning ‘keep, hold’. When used
in its primary sense (which implies an imperfective-durative state), this verb does not admit
the prefix mi-/me- with either stem [=present or past].
(Windfuhr and Perry 2009: 460)

22 Another peculiarity of ‘have’ is that it is not used in the present subjunctive form. Rather, the
periphrastic past subjunctive is used, as in boy-ad yagon rafiq došt-a boš-ed ‘You have to have
some companion’. However, its imperative form is constructed regularly with the present stem,
as in šarm dor-ed ‘Shame on you’ (lit.: Have shame) (Perry 2005: 207).

We can adopt the semantic distinction proposed in the quotation and apply it to
the examples (24a–c and 25a–c). The former group of examples implies a semantically (and historically) derived meaning, i.e., ‘to possess (physically or metaphorically)’, which is more frequent (called “primary sense” in the quotation above),
while the latter group of examples implies an original meaning, i.e., ‘to keep,
hold’ (also cf. Cheung 2007: 57–59 to follow the semantic pathway of its development; Brunner 1977: 28, for Middle Persian examples). Both groups express
stative situations in more or less the same way. A hypothesis to account for their
different morphological behavior (i.e., rejecting me- as opposed to accepting it)
could be that the unmarked instances (24a–c) represent a more grammaticalized
stage than the fully lexical verbs (in 25a–c). The examination of this hypothesis
requires further theoretical consideration, as well as empirical research, into the
history of Persian.
The grammars of Tajik summarize the distinction (unmarkedness vs. me-marking) by resorting to two factors: i) the stative nature of the unmarked cases,
and ii) the morphologically complex structure of me-marked cases. For example,
Perry (2005: 206–207) believes that “[s]ince it expresses intrinsically a durative
and incomplete state (‘being in possession of’), doštan does not take the prefix
me- on tenses formed from either stem”, and that “[w]hen doštan is part of a
complex or compound idiom such that the literal meaning is lost in the metaphor,
it takes me- like any ordinary verb”. Similarly, Ido (2005b: 57–58) expresses that
“the verb doštan ‘to have’ does not occur in the present imperfective form, i.e., it
does not take the prefix me-. However, compound verbs that have doštan as one
of their components, such as pinhon doštan ‘to conceal’, as well as the verb bozdoštan ‘to keep/detain/stop’, can occur in the present imperfective form” (with our
conversion of Cyrillic to APA for Tajik examples) (also cf. Khojayori and Thompson
2009: 83–84; Rzehak 1999: 18).
Finally, similar to the ‘be’-verbs of the previous section, doštan takes me- for
irrealis mood (26):
(26) agar zud-tar    me-raft-ed,     hamin       fursat-ro
     if   early-comp ipfv-go.pst-2pl this (same) opportunity-acc
     na-me-došt-ed
     neg-ipfv-have.pst-2pl
     ‘If you had gone earlier, you wouldn’t have had this opportunity’
     (Perry 2005: 207)

3.4.3 Boistan ‘should’
Along with the two stative verbs discussed above, we can introduce
the modal forms boy-ad (should-3sg), bo-ist (should-pst[3sg]), and the less frequent me-bo-ist (ipfv-should-pst[3sg]), all of which are invariably third person
singular and are therefore defective for agreement forms. The first two forms are
aspectually unmarked, and the last one is marked with me-. These forms always
express durative aspect, but not progressive (being stative), not future (probably as part of their inflectionally defective nature, or because of their modal meaning and
the related semantic restrictions, which are to be investigated further), and not
irrealis. Rather, it is the dependent main verb (which accompanies the modal)
that expresses the notions of realis and irrealis. Examples (27a) and (27b) below illustrate
boy-ad in the contexts of realis and irrealis, respectively. Example (28) presents
bo-ist with an irrealis main verb, and (29a) and (29b) are examples of
me-bo-ist with realis and irrealis verbs, respectively.
(27) a. man boy-ad     ba šumo   yak čiz-ro    gūy-am
        I   should-3sg to you.pl one thing-acc say-1sg
        ‘I must tell you one thing’
        (Ido 2005b: 70)
     b. on-ho   boy-ad     ba dušanbe  me-raft-and
        that-pl should-3sg to Dushanbe ipfv-go.pst-3pl
        ‘They had to go to Dushanbe’
        (Khojayori and Thompson 2009: 103)
(28) vay   bo-ist          xona me-raft
     (s)he should-pst[3sg] home ipfv-go.pst[3sg]
     ‘(S)he had to go home’
     (Perry 2005: 332)

(29) a. me-bo-ist            in   daraxt-ro az   bex  kan-d-a
        ipfv-should-pst[3sg] this tree-acc  from root pull-pst-ptcp
        bar-or-em
        up-bring-1pl
        ‘We ought to uproot this tree’
        (Windfuhr and Perry 2009: 491)
     b. dūst-ho-yaton      me-bo-ist            imrūz me-omad-and
        friend-pl-2pl.poss ipfv-should-pst[3sg] today ipfv-come.pst-3pl
        ‘Your friends should have come today’
        (Rzehak 1999: 52)
A conclusion drawn from these examples, which is important for the current discussion, is that the morphologically present form (i.e., boy-ad) essentially does
not receive me- (at least as far as mentioned by Tajik grammars concerning the
typical, everyday usage23). This modal form can be followed by main verbs of both
non-past and past tenses. However, the morphologically past forms of the modal
(i.e. bo-ist and me-bo-ist) can be used with or without me-, again with both tenses
(cf. Windfuhr and Perry 2009: 490 for the idea that the less frequent, me-marked
forms have “milder force”). In fact, the usage of boy-ad, bo-ist, and me-bo-ist with
(non-)past tenses, and (ir)realis moods, supports our claim that the modal forms
in these constructions are interpreted as durative. This is at least in line with
Perry (2005: 332), who calls me-bo-ist “the Durative Past”, and we are extending
the idea to all three forms of ‘should’ in Tajik.
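
The distributional generalizations of Section 3.4 can be condensed into a small lookup, given here only as an illustrative sketch in Python. The function name me_status, the four status labels, and the flags irrealis and keep_hold_sense are assumptions introduced for illustration and do not come from the grammars cited. The label "attested" is used for the irrealis use of došt- because the sources illustrate it (Example 26) without stating whether it is optional.

```python
# Illustrative sketch only: names and labels are hypothetical, and the rules
# merely paraphrase the descriptive statements of Section 3.4.
BE_STEMS = {"ast", "hast", "bud"}      # 'be' (present and past stems)
HAVE_STEMS = {"dor", "došt"}           # 'have' (present and past stems)
SHOULD_PRESENT = {"boy"}               # boy-ad
SHOULD_PAST = {"bo-ist"}               # bo-ist ~ me-bo-ist


def me_status(stem: str, irrealis: bool = False,
              keep_hold_sense: bool = False) -> str:
    """Return 'required', 'optional', 'attested', or 'excluded' for me-."""
    if stem in BE_STEMS:
        # Only the past stem bud- optionally takes me-, and only as irrealis.
        return "optional" if (stem == "bud" and irrealis) else "excluded"
    if stem in HAVE_STEMS:
        if keep_hold_sense:
            # 'keep, hold' and compound idioms behave like ordinary verbs.
            return "required"
        if stem == "došt" and irrealis:
            return "attested"          # cf. Example (26)
        return "excluded"
    if stem in SHOULD_PRESENT:
        return "excluded"              # as far as the grammars report
    if stem in SHOULD_PAST:
        return "optional"              # me-bo-ist is the less frequent variant
    # All other imperfective verbs, including boš- 'be', take me-.
    return "required"
```

For instance, me_status("boš") yields "required", whereas me_status("bud") yields "excluded" and me_status("bud", irrealis=True) yields "optional", mirroring the contrast between Examples (22), (21c), and (23).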

4 A periphrastic construction for the progressive aspect
In the following subsections, a syntactic construction in Tajik will be introduced,
which is functionally restricted to the progressive meaning. First, the construction, with its several varieties, will be described, and then its interaction with tense will be addressed theoretically, with a focus on
some parameters from grammaticalization theory.24

23 In the Persian of Iran, mi-bāy-ad is infrequently used as a literary/formal form. We wonder if
this form is used in the same style in Tajik (cf. Akhlāghi 2008: 97, for an example of this form in
the Persian of Iran).
24 In addition to the construction being discussed in this section, and its variants, there are, at
least, four other periphrastic constructions in Tajik: 1) perfect constructions (see Section 3.3.1,
particularly the footnotes); 2) past subjunctive, as in raft-a boš-am ‘I might have gone’ (as the
unmarked counterpart of construction (5) in Table 1); 3) passive constructions, as in kard-a me-šav-ad ‘It is being done’/‘It is [habitually] done’/‘It will be done’ (cf. Perry 2005: 247–253); and 4)
formal/literary future (see Section 3.3.2, especially the footnotes).

4.1 Description of the construction
With the general present perfect and past perfect constructions of Tajik in mind
(see Section 3.3.1, for a brief introduction), the periphrastic progressive construction is made of the non-finite form of the main verb plus the perfect forms of
istodan ‘to stand’ as the auxiliary (this non-finite form is usually called the past
participle in Tajik grammars). Therefore, there will be two main uses of this category: the progressive made of present perfect, and the progressive made of past
perfect (as instantiated in 30a–b and 31a–b below, respectively). In all these
examples, the event is reported by the speaker to be happening, in the same way
as in the progressive interpretation of the general imperfective (see Section 3.3.1).
Thus, the periphrastic construction which is being discussed in this section (the
istodan-construction) provides a structural alternative to the canonical construction (the me-marked forms). As far as can be ascertained from Tajik grammars,
no difference in meaning can be attested between this periphrastic construction
and the progressive interpretation of the general canonical construction (e.g., cf.
Baizoyev and Hayward 2004: 144, for the comparison of some pairs of examples).
But the issue can be viewed as open to further research, at least at the discoursal
level.25
(30) a. vay   radio-ro  gūš kard-a      ist-od-a       ast
        (s)he radio-acc ear do.pst-ptcp stand-pst-ptcp be.prs[3sg]
        ‘He’s listening to the radio’
        (Khojayori and Thompson 2009: 108)
     b. holo man dar dušanbe  zindagī kard-a      ist-od-a=am
        now  I   in  Dushanbe living  do.pst-ptcp stand-pst-ptcp=1sg
        ‘I am currently living in Dushanbe’
        (Baizoyev and Hayward 2004: 144)
(31) a. modar  dar xona xurok-i  šom    puxt-a
        mother in  home meal-gen dinner cook.pst-ptcp
        ist-od-a       bud,        ki   telefon zang    zad
        stand-pst-ptcp be.pst[3sg] that phone   ringing hit.pst[3sg]
        ‘Mother was cooking the dinner meal when the phone rang’
        (Rzehak 1999: 79)
     b. az gurusnagī murd-a       ist-od-a       bud-em
        of hunger    die.pst-ptcp stand-pst-ptcp be.pst-1pl
        ‘We were dying of hunger’
        (Perry 2005: 84)
Examples (30a) and (31a–b) serve more or less the same grammatical function as the second verb of example (13), all expressing a focalized
interpretation of the progressive: the subject(s) is/are listening, cooking, laughing, and metaphorically dying. In the same way, example (30b) is comparable to
(12a–b), all of which involve stative verbs and express a durative interpretation of
the progressive: living, understanding, and waiting.

25 For the full list of agreement forms of the periphrastic construction, as well as information
about the dialectal variants of the auxiliary, cf. Perry (2005: 224–225), among others.

However, there are some instances of this construction which are suspected
of having developed a habitual function, as in (32a–b) below. Theoretically, this
is possible, since it is a step towards further grammaticalization of progressives,
which can eventually lead to a general imperfective aspectual category (cf. Bybee
et al. 1994: 140–144). Practically, however, it seems unreasonable to accept the
realization of such an extensive aspectual change (habitual development) with
reference to a few examples, which can also be interpreted as progressive. Perry
(2005: 224) claims that “[t]he Present Progressive may sometimes stand for the
habitual aspect”, but finally, he describes the following example (32a) “[as] the
focus is on a limited current experience within the larger frame of the habitual
or iterated past-present-future”. This description, with the feature of a current
experience, seems to come closer to a progressive interpretation, departing from
a typical habitual meaning (contrary to his claim of the development of habitual
aspect). Therefore, we prefer to adopt a conservative stance and to interpret both
examples (32a–b) as progressive.
(32) a. mo dar institut  fan-ho-i       gunogun-ro  omūxt-a
        we in  institute subject-pl-gen various-acc learn.pst-ptcp
        ist-od-a=em
        stand-pst-ptcp=1pl
        ‘We are learning/learn various subjects at the institute’
        (Perry 2005: 224)
     b. az   rū-i   xabar-ho-i  puxta-e,     ki   har-rūza    ba
        from on-gen news-pl-gen skilled-indf that every-daily to
        man ras-id-a       ist-od-a       ast,        kor-i    mo
        I   reach-pst-ptcp stand-pst-ptcp be.prs[3sg] work-gen we
        on   qadar  xub  ne-st
        that amount good neg-be.prs[3sg]
        ‘According to the daily expert information that I am receiving, our work
        is not so good’
        (Ido 2005b: 56)
The third usage of the periphrastic construction, then, is the double perfect of
the second usage (i.e., past progressive). For the meaning that is mostly associated with it, it is called non-witnessed past progressive by Perry (2005: 233),
and evidential progressive pluperfect by Windfuhr and Perry (2009: 468). This
usage employs three non-finite forms (the main verb, the auxiliary istodan, and
a second auxiliary budan), followed by ast (for 3sg), or the agreement clitics (see
Section 3.1.1 for a short introduction to these clitics):

(33) vay   kitob xon-d-a       ist-od-a       bud-a       ast
     (s)he book  read-pst-ptcp stand-pst-ptcp be.pst-ptcp be.prs[3sg]
     ki   man dar-ro   taq-taq  kard-a=am
     that I   door-acc knocking do.pst-ptcp=1sg
     ‘He was evidently reading a book when I knocked at the door’
     (Windfuhr and Perry 2009: 464)
Furthermore, the categories of subjunctive and conjectural mood can combine
with present progressive, to form the fourth and fifth constructions with istodan.
The fourth usage, described as “rare” by Windfuhr and Perry (2009: 468), is functionally close to the progressive interpretation of construction no. 5 in Table 1.
Finally, the fifth usage is close in meaning to other conjectural forms (including
construction 6 [of Table 1]).
Table 2, below, provides an overview of all the uses of the periphrastic progressive construction, with xūrdan ‘to eat’ as the main verb (similar to Table 1).
In the examples provided, the poly-functional nature of some constructions may
sometimes make them neutral to tense, or assign several other meanings to them.
Table 2: Periphrastic progressive constructions with istodan.

Present                                       Past
--------------------------------------------  ------------------------------------------------
(1) xūr-d-a ist-od-a=am                       (2) xūr-d-a ist-od-a bud-am
    (eat-pst-ptcp stand-pst-ptcp=1sg)             (eat-pst-ptcp stand-pst-ptcp be.pst-1sg)
    ‘I am eating’                                 ‘I was eating’
                                              (3) xūr-d-a ist-od-a bud-a=am
                                                  (eat-pst-ptcp stand-pst-ptcp be.pst-ptcp=1sg)
                                                  ‘I have been eating’
(4) xūr-d-a ist-od-a boš-am
    (eat-pst-ptcp stand-pst-ptcp be.prs-1sg)
    ‘I may be/have been eating’
(5) xūr-d-a ist-od-agi-st-am
    (eat-pst-ptcp stand-pst-conj-be.prs-1sg)
    ‘I might be eating’
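
As a purely illustrative sketch, the five first-person-singular patterns of Table 2 can be assembled mechanically from the participle of the main verb. The function name and the numbering argument are hypothetical conveniences, not part of any cited grammar; only 1sg forms are generated, since the table does not give the other persons.

```python
# Illustrative sketch only: string templates for the 1sg forms of Table 2.
# The function name and its arguments are hypothetical.
def istodan_progressive_1sg(participle: str, construction: int) -> str:
    """Build construction (1)-(5) of Table 2 for a segmented participle."""
    templates = {
        1: "{v} ist-od-a=am",          # present progressive
        2: "{v} ist-od-a bud-am",      # past progressive
        3: "{v} ist-od-a bud-a=am",    # double perfect of (2)
        4: "{v} ist-od-a boš-am",      # subjunctive
        5: "{v} ist-od-agi-st-am",     # conjectural
    }
    if construction not in templates:
        raise ValueError("Table 2 lists constructions 1 to 5 only")
    return templates[construction].format(v=participle)


# The participle is supplied already segmented, as in the table.
forms = [istodan_progressive_1sg("xūr-d-a", n) for n in range(1, 6)]
```

Called with "xūr-d-a", the function reproduces the five forms of the table; with another participle, such as kard-a, it simply substitutes the main verb and makes no claim about which combinations are actually attested.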

As the last point to be mentioned in this section, there is a similar construction
in Tajik, which employs the auxiliary gaštan ‘to become’ instead of istodan (34).
Perry (2005: 225) describes it as being used “[c]olloquially, [. . .] usually with this
additional Perfect Progressive sense”:

(34) hamsoya-i    mo dar dušanbe  kor  kard-a
     neighbor-gen we in  Dushanbe work do.pst-ptcp
     gašt-a          ast
     become.pst-ptcp be.prs[3sg]
     ‘Our neighbor has been working in Dushanbe (and still is)’
     (Perry 2005: 225)

4.2 Theoretical considerations
The grammars of Tajik all mention the periphrastic construction with istodan,
assign the semantic content of progressive to it, and provide some examples, but
attempts at theoretical explanation are rare among these grammars. To name one
of such rare attempts, Perry (2005: 469–470) mentions that in karda istodaam
and karda istoda budam, “[the auxiliaries] are fully desemanticized, and need
not contain any notion of ‘standing’ or have a human or animate subject”. He
provides an example for his last point:
(35) ob    šud-a           ist-od-a       ast
     water become.pst-ptcp stand-pst-ptcp be.prs[3sg]
     ‘It is melting’
     (Perry 2005: 470)


Returning to the examples presented in the previous section, since progressive constructions are reports of ongoing events, the reference times of (30a–b)
and (31a–b) are present and past, respectively, coinciding with the event time,
because of the progressive nature of the whole construction (using Reichenbach’s
[1947] trichotomy). In other words, listening and living (in 30a–b) are happening
at the present moment, as a result of the formal, and probably functional, combination of progressive and present perfect, while cooking and dying (in 31a–b)
were happening at a moment in the past, again with a similar combination of
progressive and past perfect. In this case we suggest that the coincidence of event
time and reference time follows from a universal interpretation of the perfect
(called “perfect of persistent situation” by Comrie 1976: 60). As stated by Rothstein (2008: 114–115),
In the universal present perfect, the event time (E) introduced by the present perfect holds
throughout the entire PST [=perfect time span] without interruption. In other words: in the
universal present perfect, there is no point in time within PST at which (E) does not hold.

Thus, we can claim that the perfect, with its universal interpretation, provides
an interval during which the event is reported to be in progress. As mentioned in
the previous section, there are two main structural choices in Tajik: to locate the
whole interval (including the progressive time point) around the present moment,
or to place it all in the past. These two choices were shown to be expressed with
present perfect and past perfect (in 30a–b and 31a–b, respectively). From this
perspective, the periphrastic progressive construction is a further grammaticalization of the general perfect construction of Tajik, so as to serve an aspectual
purpose.
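
The reasoning above can be made concrete with a toy interval model, offered as a sketch under stated assumptions rather than as a formalization taken from Rothstein (2008) or Reichenbach (1947). Time is represented by numbers, the perfect time span and the event are closed intervals, and the names Interval, is_universal_perfect, and in_progress_at are hypothetical.

```python
# Illustrative sketch only: a toy model of the universal perfect reading.
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed stretch of time [start, end]; numeric time is an assumption."""
    start: float
    end: float

    def covers(self, other: "Interval") -> bool:
        return self.start <= other.start and other.end <= self.end

    def contains(self, moment: float) -> bool:
        return self.start <= moment <= self.end


def is_universal_perfect(event: Interval, pts: Interval) -> bool:
    # Universal reading: no point inside the perfect time span lacks the event.
    return event.covers(pts)


def in_progress_at(event: Interval, pts: Interval, reference: float) -> bool:
    # The event is reported as ongoing at a reference time inside the span.
    return is_universal_perfect(event, pts) and pts.contains(reference)


NOW = 0.0
# Present type (30a-b): the span reaches the present moment.
present = in_progress_at(Interval(-5, 1), Interval(-2, NOW), reference=NOW)
# Past type (31a-b): the whole span lies before the present moment.
past = in_progress_at(Interval(-11, -7), Interval(-10, -8), reference=-8.0)
# An event that stops inside the span does not yield the universal reading.
interrupted = in_progress_at(Interval(-5, -1), Interval(-2, NOW), reference=NOW)
```

In this model the two structural choices differ only in where the perfect time span is placed relative to the present moment; the requirement that the event cover the whole span is the same in both.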
Regarding the auxiliary of the progressive construction (istodan), the grammaticalization of a positional verb, with a general meaning of ‘be standing in a
specific place at a specific time’, is quite common in the evolution of progressive
aspect, cross-linguistically. Heine and Kuteva (2002: 280–282) provide a list of
several languages which employ the verb ‘stand’ in continuous / durative / progressive aspect, including Dutch, Bulgarian, Italian, Spanish, Tatar, etc., adding
that “[t]his pathway is part of a more general process whereby postural verbs (‘sit’,
‘stand’, ‘lie’) are grammaticalized to continuous and other aspectual markers”. In
many cases of grammaticalization, the lexical verb, which is the source of this
process, continues to be used as a regular heavy verb, as well. The case of istodan
in Tajik is also an example of this phenomenon:
(36) agar in   ki   rū-ba-rū-i       man ist-od-a       ast,
     if   this that face-to-face-gen I   stand-pst-ptcp be.prs[3sg]
     tu=ī,      ...
     you.sg=2sg
     ‘If this (person) standing before me is you, . . .’
     (Ido 2005b: 50)

5 Other aspectual constructions
A general construction that is highly productive in Tajik is the combination of the
non-finite form of the main verb (variously called past participle, verbal adjective,
or gerund in Tajik grammars) with a finite auxiliary. The construction introduced in
Section 4 is an instance of this combination, which is “reflecting the pervasive
participialization of Tajik” (Windfuhr and Perry 2009: 462). Other instances of the
combination include various auxiliaries that serve, more or less, a lexical aspectual
purpose, i.e., they express some notions of Aktionsart. Windfuhr and Perry (2009: 495)
view them as “a class of auxiliaries” that provide “an Aktionsart or adverbial nuance”;
and Ido (2005b: 70–71) mentions that “[t]he type of information that they encode is
typically aspectual, but may also be modal”, and that “[s]ome of them even encode the
semantic roles of the arguments of the predicates of which they are parts”, providing
some examples for each of these semantic options. Perry (2005: 467–473) lists eighteen
of these auxiliaries, with a short semantic elaboration and some examples for each one.
His list is given in (37), and Windfuhr and Perry (2009: 495) conclude that “[t]he
category may still be evolving and expanding”.
(37)

bar-omadan ‘to come up/out, exit’, burdan ‘to carry, take away’, gaštan ‘to
turn; move around, frequent’, giriftan ‘to take’, guzaštan ‘to pass (through)’,
didan ‘to see’, dodan ‘to give’, istodan ‘to stand’, mondan ‘to stay; to let,
put’, nišastan/šištan ‘to sit’, ovardan ‘to bring’, omadan ‘to come’, partoftan
‘to throw away, abandon’, raftan ‘to go’, sar-dodan ‘to let go, launch, start’,
tamom kardan ‘to finish’, firistodan ‘to send’, šudan ‘to become, happen’.

We start our discussion here with istodan to establish a link with Section 4, and
then, we will proceed to address a few other auxiliaries. Perry (2005: 226–227)
believes that “[p]resent tenses of istodan/ist- may be used as serial auxiliaries,
to indicate that an action once begun will continue”, and that “[o]ther tenses
of istodan/ist- may be used serially to refine the nuances of progressivity. Thus
[e.g.] the Simple Past characterizes an ongoing activity that stopped (typically,
was interrupted by) a new occurrence”. The examples (38–40) are all cited from
his work:
(38) to    omadan-i    šumo   man kor  kard-a      me-ist-am
     until arrival-gen you.pl I   work do.pst-ptcp ipfv-stand-1sg
     ‘Until your arrival/until you get here, I’ll carry on/keep on working’
     (Perry 2005: 226)
(39) mū-ho-i     abru-von-aš         čašm-on-aš-ro
     hair-pl-gen eyebrow-pl-3sg.poss eye-pl-3sg.poss-acc
     pūš-on-d-a          me-ist-od-and
     cover-caus-pst-ptcp ipfv-stand-pst-3pl
     ‘The hair of his eyebrows covered/concealed his eyes’
     (Perry 2005: 227)
(40) in   voqea    dar vaqt-i   gūr-on-id-a
     this incident in  time-gen bury-caus-pst-ptcp
     ist-od-an-i       Mirzo Nazrullo rūy  dod
     stand-pst-inf-gen Mirzo Nazrullo face give.pst[3sg]
     ‘This incident took place during the actual burial of Mirzo Nazrullo / while
     they were in the process of burying Mirzo Nazrullo’
     (Perry 2005: 255)


A point to be highlighted in the examples above, and in any similar cases, is that the
auxiliary seems to add the concept of duration (in a sense which can be some nuance of
progressivity, in Perry’s words), and this concept can pave the way for the development
of the focalized meaning (as defined in Section 3.3.1). In fact, in a general sense
(regardless of any specific language), further grammaticalization of the durative
aspectual meaning can eventually lead to the focalized aspectual meaning (cf. Bertinetto
et al. 2000). In this specific Tajik case, however, it cannot be concluded with certainty
that constructions of the kind in (38–40) developed diachronically prior to the perfect
construction presented in Section 4, although this conclusion is not far from plausible.
The line of development is open to further research, but at least the criterion
introduced by Perry (2005: 475) deserves mention:
the stative tenses of this verb (Perfect and Pluperfect) have been grammaticalized as auxiliaries in the compound Progressive tenses, while other tenses still express nuances of the
progressive aspect within a Conjunct verb matrix. One of the tokens of a shift from VP to
tense has been the attachment of the negative prefix to the first component of the phrase,
i.e., the Past Participle of the main verb: na-rafta istoda-and ‘they are not leaving’; this contrasts with the usual placement of the prefix on the finite verb in a Conjunct verb construction: [. . .] [oš xunuk šud-a na-ist-ad] ‘don’t let the food get cold’.

Other auxiliaries of (37) could be claimed to have experienced, more or less, the same
level of grammaticalization (as in 38–40), but they have not undergone further
grammaticalization of the kind experienced by istodan to serve a more grammaticalized
aspectual purpose. For example, šudan ‘to become’, which is the passive auxiliary in some
Iranian languages, including the Persian of Iran and Tajik, also marks the completive
aspect in Tajik, as shown in (41) (cf. Ido 2005a for detailed information on this
auxiliary). Similarly, dodan ‘to give’ and giriftan ‘to take’, as common light verbs,
also mark some type of completive and inceptive aspect, respectively, as in (43) for
dodan and (42) for giriftan. Furthermore, in some cases, the main verb can influence the
nuance of aspectual meaning that the auxiliary expresses (cf. Windfuhr and Perry 2009:
495–496 for some examples); and for all of these auxiliaries the use of the heavy verb
is common in the language. The exception is šudan, which has lost its original lexical
meaning, i.e., ‘to go’, common in Middle and Classical Persian (cf. Bubenik 2019: 202
and Nyberg 1974: II/188 for Middle Persian examples; Nātel-Khānlari 1986: II/214–216
and Estaji and Bubenik 2007: 42 for Classical Persian examples).
(41) man in   kitob-at-a        xon-d-a       šud-am
     I   this book-2sg.poss-acc read-pst-ptcp become.pst-1sg
     ‘I have read / finished reading this book of yours’
     (Ido 2005a: 1109)


(42) korkar-i xud-aton-ro      kard-an    gir-ed
     work-gen own-2pl.poss-acc do.pst-inf take-2pl
     ‘You get on with / begin your work!’
     (Windfuhr and Perry 2009: 494)
(43) maqola-ro   xon-d-a       dod-am
     article-acc read-pst-ptcp give.pst-1sg
     ‘I read the article (to someone who is semantically a beneficiary)’
     (Rastorgueva 1992: 86; translation from Ido 2005b: 71)

6 Cross-dialectal (/cross-linguistic) comparison
This section aims to provide a speculative chronology for the emergence of Tajik
aspectual devices, and thereby a better understanding of the evolution of its aspect
system. This is achieved by comparing the aspect systems of Tajik, Dari, Persian, and
Classical Persian. Whether one views Tajik, Dari, and Persian as varieties of the same
language (hence a cross-dialectal comparison) or considers them separate but closely
related languages (hence a cross-linguistic comparison), all of them can be claimed to
have evolved directly from Classical Persian, although their contact with other
languages could have been a determining factor in their evolution as well. What follows
compares some general aspectual features of Tajik, Dari, and Persian with Classical
Persian, to raise some speculations about the approximate age of each feature in the
relevant Modern language. We confine the discussion here to three aspect-marking
issues (as summarized in Table 3).26
Table 3: Aspectual features of Tajik: a comparative perspective.

Aspectual device       Tajik               Dari            Persian           Classical Persian
Perfective marker      –                   –               –                 bi- (sometimes)
Imperfective marker    me-                 mē-/mey-        mi-               hamē/mē
Progressive auxiliary  istodan ‘to stand’  raftan ‘to go’  dāštan ‘to have’  –

In Section 2.2, it was mentioned that there was a marker bi- in Classical Persian,
arguably with a perfective function, which was occasionally used with past
tense verbs. This function of the marker has not been inherited by Tajik, nor by
Standard Persian and Dari (cf. Yousef 2018 for Persian; Mitchell and Naser 2017
for Dari), though several regional dialects of Persian in Iran, as well as various
Western Iranian languages, continue to use it almost obligatorily – at least with
the morphologically simple verbs (for a list of these dialects and languages, cf.
Mofidi 2020, and the references therein).

26 For a more detailed comparison of tense-aspect-mood forms of Tajik and Persian, cf. Amonova (1991); Malekzadeh (2015).
The second feature to be compared here is the imperfective marker, which is
used, more or less, in the same way in all the aspect systems being discussed. The
use of this marker increased during the New Persian period (beginning in the 10th
century), and it has become near-obligatory in the Modern varieties. In fact, in
Early New Persian texts (mostly 10th and 11th centuries), the inherited form hamē
and the phonetically eroded form mē were both used. Later, the former gradually
disappeared in favor of the latter. Lazard (1968: 88) believes that in Arabic-script
texts the occurrence of hamē becomes quite rare in simple prose beginning in
the 12th century. Furthermore, as mentioned by Paul (2019: 608), “(ha)mē and
bi- were still prosodically independent particles in ENP [=Early New Persian] and
would be grammaticalized only in the course of Classical literature” (for more
information on imperfective forms in Classical Persian, cf. Ahmadi-Givi 2001;
Lazard 1963; Nātel-Khānlari 1986).
It is difficult, if not impossible, to draw a precise picture of the linguistic
situation and usage in Persian-speaking regions during the New Persian period.
Records of the situation in these regions are scant at best, and our information
about the remaining manuscripts is often uncertain with respect to the time and
place of writing, and sometimes even the author’s name. Furthermore, the
Arabic-Persian script does not represent some phonetic features (e.g., the
short vowels), which makes it difficult to trace language change and contact
patterns geographically. In general terms, Oranskij (2000: 281) believes that,
relying on historical and cultural evidence, we can hypothesize that the separation
of Central Asian Persian dialects from Iranian dialects of Persian took place
after the 16th century. Perry (1999: 155–156) mentions, more specifically, that
“[t]he continuum between the spoken Persian of Iran and of Central Asia was
interrupted definitely from the sixteenth century by a broad band of Turkic speech”,
and Perry (2005: 490) clarifies it as “[s]outhward migration of Uzbeks and Turkmens”.
In addition, the massive destruction of Marv in the late 18th and early 19th centuries
by the rulers of Bukhara had linguistic consequences of its own (cf. Oranskij 2000: 281).
If we abide by this chronological hypothesis, we can account for the similarity of the
perfective and imperfective features in Table 3. In fact, the spread of mē and the
disappearance of bi- could both have occurred before the 16th century.


On the other hand, as shown in Table 3, there are no occurrences of a grammaticalized
progressive auxiliary in Classical Persian (cf., e.g., Ido 2005a: 1110, who mentions
“the absence of the (gerund + auxiliary verb) construction in Classical Tajik-Persian
literary language”). This is plausible, since (ha)mē was still in the course of its
development in Classical Persian. What is interesting in the overall picture is that
Tajik, Dari, and Persian have all employed an auxiliary for the progressive aspect,
though from different sources. The very fact that such a divergence is observed could
indicate that these developments took place after these Persian-speaking peoples lost
contact with each other. If we continue to accept the above-mentioned hypothesis, this
could have happened after the 16th century.
Starting with Tajik, it has been proposed that the progressive construction,
along with other similar constructions (i.e., past participle plus an auxiliary, as
introduced in Section 5), evolved in the northern dialects of Tajik under the
influence of Turkic/Uzbek. Ido (2005b: 70) mentions that “[t]he use of such auxiliaries
is naturally particularly salient in northern dialects, the dialectal peculiarities
of which tend not to be dismissed as non-standard”, and Ido (2005a) discusses
the role of Tajik-Uzbek bilingualism as “the norm in much of this area”. Furthermore,
Perry (2005: 485) refers to the process of standardization of Tajik in the 1930s,
“when the speakers of Uzbekized Tajik of Bukhara and other Northern dialects
took on the task of planning a national Tajik Persian language”.27 Ido (2005a:
1109) mentions the corresponding Uzbek auxiliary verb (meaning ‘to stand’) in
the same gerund+auxiliary construction, which can tentatively express the aspectual
meaning of continuation. Johanson (2005) presents more examples and discussion in
favor of the same language-contact hypothesis for the progressive construction; and
Johanson (1998) discusses “[the] typological convergence between certain Turkic
languages and Persian” without mentioning specific cases (cf. also Johanson 1992: 62).
However, an alternative hypothesis for the origin of the Tajik periphrastic progressive
construction has been proposed by Korn (2020: 482). She suggests that “[t]he
predecessor of this pattern is seen in the Middle Persian “perfectum praesens” that
uses ‘stand’ as an auxiliary”, and she cites an example from Manichean Middle
Persian (44). In a footnote, she also refers to Durkin-Meisterernst (2014: 384 f.),
who points out the resultative meaning of this pattern. She also alludes to Jeremiás
(1993: 106 f.), who suggests that the Persian forms of the structure dādast-īm
contain a contracted form of ‘stand’. Furthermore, Korn provides examples from a
few Iranian languages which use the verb ‘stand’ (in some form) to express an
aspectual meaning: inflected iterative/durative auxiliary in Avestan; imperfective
participle in Khotanese; imperfective particle in Buddhist Sogdian; and present
tense suffix in Yaghnobi (Korn 2020: 481–482).

27 Perry (2005: 485–486) lists several Turkic and/or Uzbek grammatical influences on Tajik,
and refers to Soper (1987, 1996) for the verb systems of the two languages. Perry believes that
“[t]he background to this development stretches back at least thirteen centuries”, which makes
the hypothesis of language-external grammatical influence more plausible, as grammatical features
are generally resistant to borrowing and change only in long-standing intensive contact situations.
Cf. also Paul (2019), who provides several Tajik examples for some of these grammatical features.

(44) gyān . . . andar tan  ā’ōn āmixt   ud  passaxt    ud
     soul       in    body thus mix.pst and mingle.pst and
     bast     ēst-ēd    ...
     bind.pst stand-3sg
     ‘The soul . . . is (lit.: stands) so mixed, mingled and bound in the body . . .’
     (Andreas and Henning 1933: 299–300, fragment M 9 II r, 16–18)
In the case of Dari, we can refer to some grammars which point out that the periphrastic progressive construction is a “recent” development, employed in the
“spoken” variety, but they do not provide any chronology for its development
(e.g., cf. Neghat Saidi 2013: 67; and Yamin 2014: 111).28 The examples (45a–b) represent this construction:
(45) a. raft-a      mē-rav-om
        go.pst-ptcp ipfv-go-1sg
        ‘I am going’
        (Paul 2019: 609)
     b. mihmān-hā-rā xošāmadguyi kard-a      mē-raft
        guest-pl-acc welcoming   do.pst-ptcp ipfv-go.pst[3sg]
        ‘(S)he was welcoming the guests’
        (Fazilat et al. 2019: 11)
However, some grammars of Dari mention the usage of the periphrastic progressive with
istādan (like the one already introduced for Tajik), and assign it to the dialect of
Badakhshān (Baker 2017: 95; Neghat Saidi 2013: 69), or only indicate that it is used
“in some dialects” (Yamin 2014: 111).29

28 There is also another grammaticalized construction in Dari, made with the agentive adjectives of raftan, plus the copular ‘be’, expressing the same progressive meaning:
(i)  un-ā    nān   xor-d-a      rāyi  bud-an     ke   āmad-om
     that-pl bread eat-pst-ptcp going be.pst-3pl that come.pst-1sg
     ‘They were eating food when I came’
     (Baker 2017: 95)
(ii) dars   xān-d-a       ravān ast
     lesson read-pst-ptcp going be.prs[3sg]
     ‘(S)he is studying’
     (Fazilat et al. 2019: 11)

Finally, the periphrastic progressive construction of the Persian of Iran is
a rather recent development, being recorded for the first time in the late 19th
century from spoken varieties. Zhukovskij (1888: 376–377) reports having attested
it in a popular folk song in the late 1870s (Dehghan 1972: 201; for information
about more such old attestations, cf. Vafaeian 2018: 201–202). As described by
Windfuhr and Perry (2009: 462), “it is indicative only and cannot be negated. It
precedes the main verb and may be separated from the latter. Significantly, both
auxiliary and main verb are inflected, but may be separated”. There are several
works and analyses about its theoretical and empirical properties (cf. Davari and
Naghzguy-Kohan 2017; Nematollahi 2018; Taleghani 2010; Vafaeian 2018). Examples
(46a–b) illustrate this construction:
(46) a. xuna-mun       dār-e    sāxt-e         mi-š-e
        house-1pl.poss have-3sg build.pst-ptcp ipfv-become-3sg
        ‘Our house is being built’
        (Mahootian 1997: 223)
     b. xorus   dāšt          dāne  mi-čid             ke   ...
        rooster have.pst[3sg] grain ipfv-pick.pst[3sg] when
        ‘The rooster was picking up grains, when . . .’
        (Windfuhr and Perry 2009: 462)

7 Conclusion
By theoretical definition, the aspectual features of languages develop through
some degree of grammaticalization, out of more lexical or less grammatical linguistic
elements (cf. Hopper and Traugott 2003: 18; Traugott and Dasher 2001: 81).
From this perspective, and based on this definition, we can identify in Tajik i) an
imperfective prefix, ii) a progressive auxiliary, and iii) a series of less
grammaticalized auxiliaries, all of which express an aspectual meaning in the appropriate
structural context and form morphological or syntactic units (i.e., at the word
level or in syntactic constructions). The first grammaticalized element in the list
is a highly generalized marker me-, which is employed for a range of imperfective
functions, including progressive (both durative and focalized), habitual, future,
and irrealis, depending on the context and the lexical semantics of the verbs. This
marker appears obligatorily (as expected for typical inflectional elements) with
all Tajik imperfective verbs, except for the stative verbs budan ‘be’, doštan ‘have’,
and boistan ‘should’, which do not generally take this marker, although they accept it,
obligatorily or optionally, for some specific meanings or functions. The second
linguistic element developed in the aspect system of Tajik is the auxiliary
istodan ‘to stand’, which is used with the non-finite (participial) form of the main
verb, making perfect constructions that express progressive meaning (including
durative and focalized). As far as the grammars of Tajik tell us, this construction
is functionally an alternative to the prefix me- in its progressive use. Thirdly, there
are some auxiliaries, such as istodan ‘to stand’, šudan ‘to become’, giriftan ‘to take’,
dodan ‘to give’, etc., which express different notions of Aktionsart in combination
with non-finite forms of the main verbs. These auxiliaries are employed in various
aspecto-temporal constructions. Cross-dialectal (/cross-linguistic) comparison of
the aspect system of Tajik with those of Dari, Persian, and Classical Persian shows that
these constructions, as well as the periphrastic progressive construction (number
ii in the list of Tajik aspectual devices), are rather recent developments, either driven
by extra-linguistic factors (contact with foreign languages) or grammaticalized
intra-linguistically. These two devices (i.e., ii and iii in the list) are not, at least,
detected in Iranian and Classical Persian, showing their independent development
in Tajik, probably after its linguistic contact with Iranian varieties was
lost. However, the imperfective prefix (i) is observed similarly in Dari, Persian,
and Classical Persian as well, though with phonetic differences, and it is therefore
an old development (in fact, originating in a Middle Persian adverb, grammaticalized
during the Middle and New Persian periods).

29 We can also attest this construction in our fieldwork data from Afghanistan’s Badakhshan.

References
Ahmadi-Givi, Hassan. 2001. Dastur-e tārixi-e fe’l [The historical grammar of verb]. Tehran:
Ghatreh.
Ahmadi-Givi, Hassan & Hassan Anvari. 2011. Dastur-e zabān-e Fārsi, II [A grammar of
Persian, 2]. 4th edn. Tehran: Fātemi.
Akhlāghi, Faryār. 2008. Bāyestan, šodan va tavānestan: Se fe’l-e vajhi dar Fārsi-e emruz
[Should, become and be able to: Three modal verbs in Contemporary Persian]. Grammar 3.
82–132.
Amonova, Firuza. 1991. Soxani az tafāvot-hā-ye Fārsi-e Irān va Tājiki (Fārsi) [A talk about the
differences between the Persian of Iran and Tajik (Persian)]. Iranian Journal of Linguistics
15–16. 2–11.
Andreas, Friedrich C. & Walter B. Henning. 1933. Mitteliranische Manichaica aus
Chinesisch-Turkestan II. In Sitzungsberichte der preußischen Akademie der Wissenschaften,
292–363. Berlin: Verlag der Akademie. (=Walter B. Henning. 1977. Selected Papers I.
Acta Iranica 14. 191–260).
Andreas, Friedrich C. 1939. Iranische Dialektaufzeichnungen aus dem Nachlass von F. C.
Andreas. Bearbeitet und herausgegeben von A. Christensen, zusammen mit Kaj Barr und
Walter Henning (=Gött-A, 3. Folge, Nr. 11). Berlin: Weidmannsche Verlagsbuchhandlung.
Bahār, Mohammad-Taghi. 1942 [1994]. Sabk-šenāsi yā tārix-e taṭavvor-e nas̱r-e Fārsi [Stylistics,
or the history of development of Persian prose]. 7th edn. Tehran: Amir-Kabir.
Baizoyev, Azim & John Hayward. 2009. A beginner’s guide to Tajiki. London & New York:
Routledge Curzon.
Baker, Adam. 2017. A learner’s grammar of Dari.
https://www.iam-afghanistan.org/lcp/downloads/dari-grammar.pdf (accessed 10 May 2021).
Bertinetto, Pier Marco, Karen H. Ebert & Casper de Groot. 2000. The progressive in Europe. In
Östen Dahl (ed.), Tense and aspect in the languages of Europe, 517–558. Berlin & New
York: Mouton de Gruyter.
Brunner, Christopher J. 1977. A syntax of Western Middle Iranian. Delmar, New York: Caravan
Books.
Bubenik, Vit. 2019. Grammaticalization and degrammati(calizati)on in the development of
the Iranian verb system. In Lars Heltoft, Iván Igartua, Brian D. Joseph, Kirsten Jeppesen
Kragh & Lene Schøsler (eds.), Perspectives on language structure and language change,
193–204. Amsterdam & Philadelphia: John Benjamins.
Bybee, Joan L. & William Pagliuca. 1987. The evolution of future meaning. In Anna Giacalone
Ramat, Onofrio Carruba & Giuliano Bernini (eds.), Papers from the 7th international conference on
historical linguistics, 109–122. Amsterdam & Philadelphia: John Benjamins.
Bybee, Joan, Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect,
and modality in the languages of the world. Chicago & London: The University of Chicago
Press.
Cheung, Johnny. 2007. Etymological Dictionary of the Iranian Verb. Leiden & Boston: Brill.
Comrie, Bernard. 1976. Aspect. Cambridge: Cambridge University Press.
Dahl, Östen. 1985. Tense and aspect systems. New York: Basil Blackwell.
Davari, Shadi & Mehrdad Naghzguy-Kohan. 2017. The grammaticalization of progressive aspect
in Persian. In Kees Hengeveld, Heiko Narrog & Hella Olbertz (eds.), The grammaticalization
of tense, aspect, modality and evidentiality: A functional perspective, 164–189. Berlin &
Boston: De Gruyter Mouton.
De Haan, Ferdinand. 2011. Typology, tense, aspect and modality systems. In Jae Jung Song
(ed.), The Oxford handbook of linguistic typology, 445–464. Oxford: Oxford University
Press.
Dehghan, Iraj. 1972. Dāshtan as an auxiliary in Contemporary Persian. Archiv Orientální (Praha)
40. 198–205.
Durkin-Meisterernst, Desmond. 2014. Grammatik des Westmitteliranischen (Parthisch und
Mittelpersisch). Vienna: Österreichische Akademie der Wissenschaften.
Estaji, Azam & Vit Bubenik. 2007. On the development of the tense/aspect system in Early New
and New Persian. Diachronica 24(1). 31–55.
Farshidvard, Khosrow. 2008. Dastur-e moxtasar-e tārixi-e zabān-e Fārsi [Concise Historical
Grammar of Persian]. Tehran: Zavvār.
Fazilat, Mahmoud, Mohammad-Sarvar Mowlāei & Omme-Farveh Mousavi. 2019. Nemud-e
estemrāri dar Fārsi-e Dari-e Afqānestān [Progressive aspect in Afghan Persian (Dari)].
Grammar 14. 3–16.


Gharib, Abdol-Azim, Mohammad-Taghi Bahār, Badi’ozzamān Foruzānfar, Jalāloddin Homāyi &
Rashid Yāssemi (eds.). 1948 [1994]. Dastur-e zabān-e Fārsi (Panj-ostād) [Persian Grammar
(The five professors)]. Tehran: Jahāne Dānesh.
Haig, Geoffrey. 2019. Grammaticalization and inflectionalization in Iranian. In Heiko Narrog &
Bernd Heine (eds.), Grammaticalization from a typological perspective, 57–78. Oxford:
Oxford University Press.
Heine, Bernd & Tania Kuteva. 2002. World Lexicon of Grammaticalization. Cambridge:
Cambridge University Press.
Hopper, Paul J. & Elizabeth C. Traugott. 2003. Grammaticalization. Cambridge: Cambridge
University Press.
Ido, Shinji. 2005a. An aspect marking construction shared by two typologically different
languages. In James Cohen, Kara T. McAlister, Kellie Rolstad & Jeff MacSwan (eds.),
Proceedings of the 4th international symposium on bilingualism, 1105–1114. Somerville,
MA: Cascadilla Press.
Ido, Shinji. 2005b. Tajik. München: Lincom Europa.
Ido, Shinji. 2007. Bukharan Tajik. München: Lincom Europa.
Ioannesyan, Yu. A. 1998. Jāygāh-e guyeš-e Harāti dar miān-e guyeš-haye goruh-e zabāni-e
Fārsi-e Dari [The status of Herati dialect among the dialects of Dari Persian linguistic
group]. The Letter of Academy 16. 140–160. (Translated by Hossein Mostafavi-Gerow, from
Yu. A. Ioannesyan. 1995. Mesto geratskogo sredi dialektov dari-persidskogo jazykovogo
massiva. Peterburgskoe Vostokovedenie 7, Sankt-Peterburg. 224–241.)
Jahani, Carina. 2008. Expressions of future in Classical and Modern New Persian. In Simin
Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian Linguistics, 155–176.
Newcastle: Cambridge Scholars Publishing.
Jeremiás, Éva. 1993. On the genesis of the periphrastic progressive in Iranian languages. In
Wojciech Skalmowski & Alois van Tongerloo (eds.), Medioiranica. Proceedings of the
International Colloquium organized by the Katholieke Universiteit Leuven from the 21st to
the 23rd of May 1990, 99–116. Leuven: Peeters.
Johanson, Lars. 1992. Strukturelle Faktoren in türkischen Sprachkontakten. Sitzungsberichte der
Wissenschaftlichen Gesellschaft an der J. W. Goethe-Universität Frankfurt am Main 29. 5.
Johanson, Lars. 1998. Code-copying in Irano-Turkic. Language Sciences 20. 325–337.
Johanson, Lars. 2005. Bilateral code copying in Eastern Persian and South-Eastern Turkic. In
Éva Ágnes Csató, Bo Isaksson & Carina Jahani (eds.), Linguistic convergence and areal
diffusion: Case studies from Iranian, Semitic and Turkic, 205–214. London & New York:
Routledge Curzon.
Josephson, Judith. 2016. The construction hamē + verb in Middle Persian. In Éva Á. Csató, Lars
Johanson, András Róna-Tas & Bo Utas (eds.), Turks and Iranians: Interactions in language
and history, 49–64. Wiesbaden: Harrassowitz Verlag.
Jügel, Thomas. 2013. The Verbal Particle BE in Middle Persian. Münchener Studien zur
Sprachwissenschaft 67(1). 29–56.
Kalbāsi, Irān. 1995. Fārsi-e Irān va Tājikestān (Yek barrasi-e moqābeleyi) [Iran-Tajikistan Persian
(A contrastive survey)]. Tehran: Ministry of Foreign Affairs.
Khayyāmpur, Abdol-Rasul. 1954 [1996]. Dastur-e zabān-e Fārsi [Persian Grammar]. Tehran:
Ketāb-forushi-ye Tehran.
Khojayori, Nasrullo & Mikael Thompson. 2009. Tajiki reference grammar for beginners.
Georgetown University Press.


Korn, Agnes. 2020. Grammaticalization and reanalysis in Iranian. In Walter Bisang & Andrej
Malchukov (eds.), Grammaticalization Scenarios: Cross-linguistic Variation and Universal
Tendencies, Volume 1: Grammaticalization Scenarios from Europe and Asia, 465–498.
Berlin & Boston: De Gruyter Mouton.
Lazard, Gilbert. 1956. Caractères distinctifs de la langue tadjik. Bulletin de la Société
Linguistique 52(1). 117–186.
Lazard, Gilbert. 1963. La langue des plus anciens monuments de la prose persane. Paris:
Librairie C. Klincksieck.
Lazard, Gilbert. 1968. La dialectologie du judéo-persan. Studies in Bibliography and Booklore
8(2/4). 77–98.
Lazard, Gilbert. 2006. More on counterfactuality, and on categories in general. Linguistic
Typology 10. 61–66.
Lenepveu-Hotz, Agnès. 2012. Etude diachronique du système verbal persan (Xe-XVIe siècles):
d’un équilibre à l’autre? Paris: Ecole Pratique des Hautes Études PhD dissertation.
Lenepveu-Hotz, Agnès. 2014. The evolution of the Persian aspecto-modal suffix -ē, between the
10th and the 16th centuries. Journal of Historical Linguistics 4(2). 232–255.
MacKinnon, Colin. 1977. The New Persian preverb bi-. Journal of the American Oriental Society
97(1). 8–26.
Maggi, Mauro & Paola Orsatti. 2018. From Old to New Persian. In Anousha Sedighi & Pouneh
Shabani-Jadidi (eds.), The Oxford handbook of Persian linguistics, 7–51. Oxford: Oxford
University Press.
Mahmoodi-Bakhtiari, Behrooz. 2018. Morphology. In Anousha Sedighi & Pouneh
Shabani-Jadidi (eds.), The Oxford handbook of Persian linguistics, 273–299. Oxford:
Oxford University Press.
Mahootian, Shahrzad. 1997. Persian. London & New York: Routledge.
Mair, Christian. 2012. Progressive and continuous aspect. In Robert I. Binnick (ed.), The Oxford
handbook of tense and aspect, 803–827. Oxford: Oxford University Press.
Malekzādeh, Hekmat. 2015. Moqāyese-ye zamān-e fe’l-e Fārsi-e me’yār va Tājiki [Comparison of
verb tense in Standard Persian and Tajik]. Iranian Languages and Dialects 5. 131–157.
Mashkur, Mohammad-Javād. 1984. Dastur-nāme dar sarf va nahv-e zabān-e Fārsi [A Grammar
of Persian Morphology and Syntax]. Tehran: Mo’assese-ye Matbu’āti-ye Shargh.
Meier, Fritz. 1981. Aussprachefragen des älteren Neupersisch. Oriens 27–28. 70–176. (Reprinted
in Fritz Meier. 1992. Bausteine: Ausgewählte Aufsätze zur Islamwissenschaft. Vol. 2,
1057–1164. Edited by E. Glassen and G. Schubert. Stuttgart: Steiner.)
Mitchell, Rebecca & Djamal Naser. 2017. A grammar of Dari. Muenchen: Lincom GmbH.
Mofidi, Roohollah. 2020. Marāhel-e avvaliye-ye šeklgiri-e nešāne-ye nemud-e kāmel dar
maqta’i az Fārsi-e now: Rahyāfti az atlas-e zabāni-e konuni be atlas-e tārixi [The beginning
stages of a perfective aspect marker development in a period of New Persian: An approach
from current linguistic atlas to the historical atlas]. Persian Language and Iranian Dialects
5(2). 7–28.
Mofidi, Roohollah. 2021. Šavāhedi āmāri az naqš-haye vajhi-e be- dar Fārsi-e now: Motāle’eyi
dar-zamāni [Statistical evidence for mood functions of be- in New Persian: A diachronic
study]. Language Related Research 11(6). 481–514.
Nātel-Khānlari, Parviz. 1986. Tārix-e zabān-e Fārsi [The History of Persian]. Tehran: Ferdows.
Neghat-Saidi, Mohammad-Nasim. 2013. Dastur-e mo’āser-e zabān-e Dari [A contemporary
grammar of Dari]. Kabul: Amiri Publications.

Nematollahi, Narges. 2014. Development of the progressive construction in Modern Persian. In
Ozcelik Oner & Amber Kent (eds.), Proceedings of the 1st conference on Central Eurasian
languages and linguistics, 102–114. Bloomington: Center for the Languages of the Central
Asian Region.
Nyberg, Henrik S. 1931. Hilfsbuch des Pehlevi II: Glossar. Uppsala: Lundequistska & Leipzig:
Harrassowitz.
Nyberg, Henrik S. 1974. A manual of Pahlavi. Wiesbaden: Harrassowitz.
Olson, Randall B. 1994. A basic course in Tajik (Grammar and workbook).
http://talktajiktoday.com/documents/ABasicCourseInTajik.pdf (accessed 20 April 2021).
Oranskij, Iosif M. 2000. Moqaddame-ye feqholloqa-ye Irāni [Introduction to Iranian Philology].
Tehran: Payam Publications. (Translated by Karim Keshavarz, from Iosif M. Oranskij. 1960.
Vvedenie v iranskuju filologiju [Introduction to Iranian Philology]. Moscow: Izdatel’stvo
vostočnoj literatury.)
Paul, Ludwig. 2019. Persian. In Geoffrey Haig & Geoffrey Khan (eds.), The languages and
linguistics of Western Asia: An areal perspective, 569–624. Berlin & Boston: De Gruyter
Mouton.
Perry, John R. 1999. Comparative perspectives on language planning in Iran and Tajikistan. In
Yasir Suleiman (ed.), Language and society in the Middle East and North Africa: Studies in
variation and identity, 154–173. Richmond: Curzon.
Perry, John R. 2005. A Tajik Persian reference grammar. Leiden & Boston: Brill.
Perry, John R. 2007. Persian morphology. In Alan S. Kaye (ed.), Morphologies of Asia and Africa,
Vol. I, 975–1019. Winona Lake, Indiana: Eisenbrauns.
Phillott, Douglas C. 1919. Higher Persian grammar. Calcutta: The University Press.
Rastorgueva, Vera S. 1992. A short sketch of Tajik grammar. Translated by Herbert H. Paper.
Bloomington: Research Institute for Inner Asian Studies, Indiana University.
Rastorgueva, Vera S. 2000. Dastur-e zabān-e Fārsi-e Miāne [Grammar for Middle Persian].
Tehran: Society for the Appreciation of Cultural Works & Dignitaries. [Translated by
Valiollah Shadan, from Vera S. Rastorgueva. 1966. Srednepersidskij jazyk (Middle Persian
language). Moscow.]
Reichenbach, Hans. 1947. Elements of symbolic logic. New York & London: Free Press.
(Reprinted as Hans Reichenbach, 1966, Elements of Symbolic Logic. New York: Macmillan).
Rothstein, Björn. 2008. The perfect time span: On the present perfect in German, Swedish and
English. Amsterdam & Philadelphia: John Benjamins.
Rubinčik, Jurij A. 2012. A Literary Grammar of Contemporary Persian. Translated by M. Shafaghi.
Tehran: Institute for Humanities. [Translated by Maryam Shafaghi, from Jurij A. Rubinčik.
2001. Grammatika sovremennogo persidskogo literaturnogo jazyka (Grammar of Modern
Persian literary language). Moscow: Vostočnaja literatura RAN.]
Rzehak, Lutz. 1999. Tadschikische Studiengrammatik. Wiesbaden: Reichert Verlag.
Sedighiān, Mahin-dokht. 2004. Vižegihā-ye nahvi-e zabān-e Fārsi dar nasr-e qarn-e panjom va
šešom-e hejri [Syntactic characteristics of Persian prose texts in the 11th and 12th centuries
A.D.]. Tehran: The Academy of Persian Language & Literature.
Shari’at, Mohammad-Javād. 1985. Dastur-e zabān-e Fārsi [Persian Grammar]. Tehran: Asātir.
Soper, John D. 1987. Loan syntax in Turkic and Iranian: The verb systems of Tajik, Uzbek and
Qashqay. Ph.D dissertation, UCLA: University Microfilms International, Ann Arbor.
Soper, John D. 1996. Loan syntax in Turkic and Iranian: The verb systems of Tajik, Uzbek and
Qashqay. Revised and edited by A. J. E. Bodrogligeti. Eurasian Language Archives 2.
Bloomington, Indiana: Eurolingua.

Tabibzādeh, Omid. 2012. Dastur-e zabān-e Fārsi: Bar asās-e nazariye-ye goruhhā-ye xodgardān
dar dastur-e vābastegi [Persian grammar: A theory of autonomous phrases based on
dependency grammar]. Tehran: Nashr-e Markaz.
Taleghani, Azita H. 2010. Persian progressive tense: Serial verb construction or aspectual
complex predicate? Iranian Studies 43(5). 607–619.
Timberlake, Alan. 2007. Aspect, tense, mood. In Timothy Shopen (ed.), Language typology
and syntactic description, Volume III: Grammatical Categories and the Lexicon. 2nd edn.
Cambridge: Cambridge University Press.
Traugott, Elizabeth C. & Richard B. Dasher. 2001. Regularity in semantic change. Cambridge:
Cambridge University Press.
Vafaeian, Ghazaleh. 2018. Progressives in use and contact: A descriptive, areal and typological
study with special focus on selected Iranian languages. Stockholm: Stockholm University
PhD dissertation.
Windfuhr, Gernot & John Perry. 2009. Persian and Tajik. In Gernot Windfuhr (ed.), The Iranian
languages, 416–544. London & New York: Routledge.
Windfuhr, Gernot. 1979. Persian grammar: History and state of its study. The Hague: Mouton
Publishers.
Yamin, Mohammad-Hossein. 2014. Dastur-e moāser-e zabān-e Pārsi-e Dari [Contemporary
Grammar of Dari Persian]. Kabul: Meywand.
Yousef, Saeed. 2018. Persian: A comprehensive grammar. New York: Routledge.
Žukovskij, Valentin. 1888. Osobennoe značenie glagola dāštan v persidskom razgovornom
jazyke [Special meaning of the verb dāštan in spoken Persian]. 3rd edn. St. Petersburg:
Tipografija Imperatorskoj Akademii Nauk.

Justin M. Power

5 Tajik Sign Language in context
Abstract: Tajik Sign Language is a unique contributor to the linguistic diversity of
Tajikistan. Articulated and perceived in the gestural-visual modality and relatively
young in age, Tajik Sign Language differs in many ways from Tajik, from other
Tajik languages, and from all other languages spoken in Tajikistan. This chapter
takes a historical sociolinguistic approach to trace the emergence and early evolution of Tajik Sign Language from its origins in Russian Sign Language, which
had been imported to Tajik schools for the deaf beginning in the mid-twentieth
century. The chapter also reports the results of a lexical comparison of Russian
Sign Language, Tajik Sign Language, and another Central Asian signed language –
namely, Afghan Sign Language – to understand the linguistic effects of the divergent linguistic ecologies within which the two Central Asian signed languages
have emerged. The lexical comparison introduces a novel application of a computational methodology adapted to the features of signed languages and of a theoretically-informed quantitative model of historical change in the sign modality.

1 Introduction
Tajik Sign Language is the primary language of many deaf and hard-of-hearing
Tajiks. Although Tajik Sign Language represents just one of the many languages
contributing to the diverse linguistic ecology in Tajikistan, the signed language
occupies a unique place in Tajikistan’s linguistic landscape. Tajik Sign Language differs from Tajik (and from all other languages spoken in Tajikistan) in
the modality in which it is articulated and perceived: whereas Tajik is articulated
and perceived in the oral-aural modality, Tajik Sign Language is produced manually and perceived visually. The signed language also differs from Tajik in its age;
Tajik Sign Language is a relatively young language, having initially emerged in a
school for the deaf established during the Soviet period and within the signing
community that formed in connection with that school. The roots of Tajik Sign
Language may stretch back further in time – that is, to early 19th-century Russia
and perhaps even 18th-century Austria and France. The aim of this chapter is to
survey selected features of Tajik Sign Language that mark it as a unique contributor to Tajikistan’s linguistic diversity. Included in this survey are characteristics
of the signing community in Tajikistan, aspects of Tajik Sign Language’s history,
and the historical relationship of Tajik Sign Language to other signed languages.

Acknowledgements: The author thanks, first and foremost, the Tajik, Russian, and Afghan signers who shared their languages. This research was supported in part by a Fulbright Fellowship
to Tajikistan and by a John F. Richards Fellowship from the American Institute for Afghanistan
Studies.
https://doi.org/10.1515/9783110622799-005

On language names for Tajik Sign Language
I begin with a brief excursus about language names for Tajik Sign Language. Several
names are used to refer to the language called herein Tajik Sign Language. There
are at least two endonyms used by Tajik signers to refer to the language: (1), transcribed using the sign transcription system HamNoSys (Hanke 2004), means ‘sign’
or ‘sign language’, and (2) means ‘Russian’.
(1) [HamNoSys transcription not reproduced] ‘sign’
(2) [HamNoSys transcription not reproduced] ‘Russian’

Hence the English exonym, Tajik Sign Language, is not a direct translation of the
signs used by Tajik signers, some of whom feel there is little difference between their
language and Russian Sign Language; this perceived similarity between the signed
languages in Tajikistan and in Russia likely contributes to the continued use of the
sign in (2) meaning ‘Russian’. Because of Tajik Sign Language’s close association with
Russian Sign Language (cf. Secs. 2 and 3), the two languages are classified together
in the Ethnologue (Eberhard et al. 2021); that is, there is no separate identifier for
the signed language in Tajikistan. For similar reasons, the language is referred to as
“Russian-Tajik Sign Language” in the Glottolog (Hammarström et al. 2021).
In Tajik, the language has been referred to as Имову ишора Imovu išora,
‘gestures’ (cf. Kuvvatov and Rahmonov 2015), or simply ишора išora, ‘sign’. In
educational settings, the names išorai rusī, ‘Russian sign’, or išorai tojikī, ‘Tajik
sign’, are sometimes used when providing a direct sign translation for each
Russian or Tajik word in a phrase (cf. discussions about “Signed Russian” in Grenoble 1992: 321 and manually coded English in, e.g., Schick 2003). Importantly,
these last two names (i.e., those meaning ‘Russian sign’ and ‘Tajik sign’) are often
based on the incorrect understanding that a signed language is merely a gestural
representation of a spoken language. ‘Tajik sign’, when used in this way, is erroneously thought to be spoken or written Tajik in gestural form – just as one might
change fonts for written Tajik without changing the underlying language. Thus,
the name ‘Tajik sign’, as it is sometimes used in educational settings in Tajikistan,
does not parallel the language name Tajik Sign Language; the latter name implicitly recognizes the independent linguistic status of this language.

Evidently, conventions with respect to exonyms for this signed language are
far from settled. In the remainder of this chapter, Tajik Sign Language is used in order
to foreground the divergent lexical conventions and grammatical structures in
the language that developed in Tajikistan after a variant (or perhaps multiple
variants) of Russian Sign Language was introduced there in the first half of the
20th century. Since that period, Tajik Sign Language has, in large part, evolved
independently of Russian Sign Language without sustained interactions between
the signing communities in Tajikistan and in Russia.

2 The emergence and early evolution of Tajik
Sign Language
Signing communities with high proportions of deaf people are typologized in two
broad categories, namely, micro-communities and macro-communities (cf. Schembri 2010; Fenlon and Wilkinson 2015 for extended discussions of this typology).
Micro-communities may form when a high incidence of deafness occurs within an
isolated population. Because a relatively high proportion of community members
in any micro-community are deaf, and because these deaf community members
primarily communicate in the gestural-visual modality, many hearing members
of these communities may also acquire the community signed language. Numerous such communities have been identified in various parts of the world. Because
these communities are typically – but not always – located in rural areas, and
because both deaf and hearing members of these communities may contribute to
the emergence of the community’s signed language, signed languages that have
emerged in micro-communities have been referred to as “village sign languages”
or “shared sign languages” (cf. Zeshan and De Vos 2012; Nyst 2012). Although no
such ‘village’ or ‘shared’ signed languages have been reported to exist in Tajikistan, it is certainly possible that micro-communities of this type have formed in
Tajikistan in the past, given the country’s mountainous topography, numerous
rural villages, and high rates of consanguineous marriage among some populations (Bittles 2001; Saify and Saadat 2012).
The signed languages of macro-communities typically trace their origins to
the establishment of schools for the deaf, many of which began in the 18th and
19th centuries in Europe (Eriksson 1998; Woll et al. 2001; and Power et al. 2020).
These schools provided, in part, the conditions for a community of signers to
form and for signed languages to emerge within those communities. Because
these macro-communities have been stable, signers have been able to transmit,
across multiple generations, the languages that emerged in connection with the
schools for the deaf. National signed languages – such as American Sign Language, French Sign Language, Russian Sign Language, and Tajik Sign Language –
are examples of languages that have emerged in macro-communities (cf. Lane
1984; Williams and Fyodorova 1993; and Power 2020).

2.1 History of deaf institutions in Russia and Tajikistan
Because Tajik Sign Language is a macro-community signed language, there is a
close connection between the contingent historical features of the early deaf education system in Tajikistan, the formation of the Tajik Deaf community, and the
evolution of Tajik Sign Language. Further back in time, the signing community in
Tajikistan shared close connections with the signing community in Russia, including shared roots in the deaf education system which originated in Czarist Russia,
and which expanded during the Soviet period. Soviet educational policies and
practices had substantial effects on the lives of deaf Tajiks, on the signing community that formed during the 20th century, and on the evolutionary trajectory of
its signed language. As such, a background is warranted about the deaf education
systems in Czarist Russia, in the Soviet Union, and in post-Soviet Tajikistan. Specifically, three factors are relevant for understanding the formation of the Tajik Deaf
community and the evolution of Tajik Sign Language, namely, the group of educators who introduced a variety of Russian Sign Language into Tajikistan in the first
half of the 20th century; characteristics of the deaf population who encountered
that variety of Russian Sign Language; and features of the language transmission
process that may have impacted the evolution of Tajik Sign Language.
Much of the information in this section is based on interviews, conducted
between 2016 and 2018, with 37 deaf Tajiks in three regions of Tajikistan (cf. Power
2020: 66–68) – including Dushanbe (13 males, 12 females), Bokhtar (6 males,
2 females), and Kulob (2 males, 2 females). At the time of interview (last interview
in the case of multiple interviews), the interviewees were on average approximately 48 years old (SD=15) in Dushanbe, 30 years old in Bokhtar (SD=7), and
29 years old in Kulob (SD=2).

2.1.1 Deaf education in Czarist Russia and the USSR
Large-scale deaf education in Czarist Russia began in the early 19th century, when
the first school for the deaf was established in Pavlovsk, south of St. Petersburg
(Williams and Fyodorova 1993; Burch 2000). Between 1806 and 1810, two experienced deaf educators were hired to assist in setting up the Murzinka School,
as it was called. The first teacher at the school, Anselm Sigmund, was trained at
the school for the deaf in Vienna in the Habsburg Empire (Williams and Fyodorova 1993). The Russian Sign Language fingerspelling alphabet was likely adapted
at that time from the fingerspelling alphabet originating in Vienna (Power et al.
2020). Sigmund was replaced in 1810 by Jean-Baptiste Jauffret, who was sent
to St. Petersburg by the headmaster of the Paris National Institute for the Deaf
(Abramov 1993: 200). Later, other schools opened in several cities around the
Empire, including Warsaw in 1817, Moscow in 1860, and Kazan in 1886 (Burch
2000).
For much of the Czarist period following the establishment of deaf education,
the organization of deaf institutions continued within the framework of private
charity and limited state support (Shaw 2017). Although several organizations that
were headed by deaf individuals had begun to develop by the early 20th century,
they were far fewer and less influential than those that followed during the Soviet period (Shaw 2017; Burch 2000). The
Soviet Union restructured its special education system in 1931, centralizing deaf
education under the aegis of the Education Council; this reorganization provided
the financial and organizational stability for the system to expand (Csapo 1984).
According to Csapo (1984: 8), “at the beginning of the twentieth century, out of
45,000 children identified as deaf, 3,000 received education in special schools. By
1940–41, 219 schools had accommodated 23,000 deaf students.” Thus, the shift
from the private charitable framework of the Czarist era to the Soviet Union’s centrally-organized education system became the platform that enabled the expansion of deaf education throughout Russia proper and beyond.
Included among the count of deaf students reported above may have been
the first group of deaf scholars in Tajikistan. Based on interviews with deaf residents of Dushanbe and with educators working in deaf education, it seems that
the first school for the deaf in Tajikistan was founded around 1940. While this
suggested dating appears plausible given the data on school enrollment cited
above, other historical information may suggest an earlier or a later date for the
school’s establishment. The Soviet Union entered the Second World War in
1939, and the country was invaded by Germany in the summer of 1941. Based
on archival research, Shaw (2017: 109) reports that, amidst the turmoil caused by
the war efforts, and in particular the German invasion, the deaf education system
in the early 1940s shrank to a small fraction of its pre-war size: “. . . of the 28,100
deaf children in school in 1941, only 7,600 remained in school by 1943” (note the
difference in school attendees around 1941 compared to Csapo 1984 above). It is
possible that the Soviet education system continued to expand the network of
special education schools after entering the war in 1939 and before the invasion
in 1941; however, it seems more likely that the expansion of deaf education into
Tajikistan happened just prior to this period, or perhaps shortly after the war’s
end in the mid-1940s. All said, archival research in Tajikistan and in Russia is
needed to ascertain the exact date of the school’s founding.
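
The enrollment figures quoted above from Csapo (1984) and Shaw (2017) can be put on a common footing as percentages. The following Python sketch performs only that arithmetic; the function and variable names are illustrative and do not come from either source.

```python
def share_percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Csapo (1984: 8): early twentieth century, 3,000 of 45,000 deaf children in school.
share_early_1900s = share_percent(3_000, 45_000)

# Shaw (2017: 109): 28,100 deaf children in school in 1941, 7,600 by 1943.
share_remaining_1943 = share_percent(7_600, 28_100)

# Difference between the two sources' enrollment counts for about 1941.
source_gap_1941 = 28_100 - 23_000

print(f"In school, early 1900s: {share_early_1900s:.1f}%")
print(f"Still in school in 1943: {share_remaining_1943:.1f}% of the 1941 total")
print(f"Gap between Shaw and Csapo for c. 1941: {source_gap_1941} students")
```

On these figures, roughly a quarter of the 1941 enrollment remained by 1943, which is the scale of contraction that the dating argument above relies on.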
Effects of Soviet educational policy on deaf Tajiks
This section highlights three Soviet-era policies pertaining to deaf individuals that had notable effects on the lives of deaf Tajiks and on Tajik Sign Language:
the tripartite structure of the deaf education system, the
institutionalization of national Societies of the Deaf, and compulsory education.
In the early 1920s, the Soviet deaf education system was divided into three systems
corresponding to categorizations of hearing loss: the profoundly deaf, hard-of-hearing, and late-deafened (Burch 2000: 395). This tripartite structure, which
persists in the Tajik deaf education system today, has provided students with
differing levels of access to Tajik Sign Language. Students in the hard-of-hearing
and late-deafened systems receive less instruction through the medium of signed
language compared with students in the system for the profoundly deaf. These
educational policies with respect to Tajik Sign Language have apparently led to
sociolinguistic divisions among signers of Tajik Sign Language, with some signers
reporting greater within-group interaction versus across-group interaction; that is,
current and former students of the system for the profoundly deaf interact more
closely with each other than they do with current and former students of the system
for the hard-of-hearing and late-deafened, and vice versa. The social structure of
the signing community in Tajikistan and its effects on the structure, transmission,
and use of Tajik Sign Language are promising areas for future research.
The second Soviet-era policy that has had a noticeable impact on the lives of
deaf Tajiks pertains to the Tajik Society of the Deaf. National societies of the deaf
throughout the Soviet Union played important roles in the daily lives of deaf individuals. The All-Russian Society of the Deaf (commonly abbreviated VOG) became
a well-funded state institution with all-encompassing aims for the Russian deaf –
including, inter alia, the provision of housing, training, and employment, and
their political organization (Shaw 2017: 21–52). The VOG-model for societies of the
deaf was replicated in other parts of the Soviet Union. In Tajikistan, the model’s
structure and infrastructure persist in the form of the Tajik Society of the Deaf, which
has chapters in several cities in Tajikistan and membership in the thousands.1

1 The Society administers apartment buildings, where many deaf residents and their families
live; it owns a gymnasium and a theater for exercise and entertainment; it operates factories
with industrial machines that spin cotton, or where deaf employees sew clothing for sale; and it
owns land in at least one idyllic riverside location in Romit, east of Dushanbe, where deaf people
can picnic and swim. In sum, the Tajik Society of the Deaf has played an important role in the
formation of the Deaf community in Tajikistan by providing the contexts for deaf Tajiks to remain
in close contact throughout their lives.

The final Soviet-era educational policy highlighted herein was one that made
education compulsory for children throughout the Soviet Union (Shaw 2017:
110), including the deaf children of Tajikistan. It is likely that this policy led to
the recruitment of deaf individuals from many parts of the country who would
not otherwise have attended a school for the deaf. Hence this policy played an
important role in the cross-regional dispersal of Tajik Sign Language as students
returned to their homes during school breaks – and after finishing their studies.
Despite this compulsory education policy, however, access to the deaf education
system varied widely in the Soviet Union, both in terms of when educational services were first made available in an area and in the geographical distribution of
schools (Anderson et al. 1987). In Tajikistan, it is likely that deaf individuals living
in or near Dushanbe, where deaf educational services have been concentrated,
have had the best access to schools for the deaf and similar educational services.
In more remote parts of the country, one is likely to find lower rates of school
attendance by deaf people both in the past and in the present.

2.1.2 Deaf education in Tajikistan
As described in the previous section, there is some uncertainty about when the
first school for the deaf was established in Tajikistan. At some point, most likely
between the late 1930s and the mid-1940s, the first school for the deaf in Tajikistan was established roughly 20 kilometers south of Dushanbe in what was
then called the Leninsky district (now Rudaki district). For roughly 30 years, this
school was the only residential school for the deaf in the country. Because of its
central importance to the early emergence of Tajik Sign Language, the survey of
deaf education in Tajikistan below focuses on this school in the Leninsky district.
The Leninsky School: Aspects of its structure, faculty, and student body
The Leninsky School, the first school for the deaf in Tajikistan, was established by
a group of educators who came to Tajikistan from other parts of the Soviet Union.
It remains unclear whether this first group of educators came from one central
location, such as Moscow, or whether experienced educators were recruited from
multiple locations. Based on an interview with a deaf resident of Dushanbe,
Mr. Giasev, who is now in his seventies, and who attended the Leninsky school
starting in 1962 at the age of 12 or 13, the school’s faculty members in the 1960s
came from a mix of ethnic backgrounds; several were identified as having
Russian, Ukrainian, Jewish, Tatar, and German backgrounds. The group of faculty
members from outside of Tajikistan greatly outnumbered the school’s Tajik staff.
Only one or two Tajiks were employed as teachers at the time. Mr. Giasev remembers that the faculty members who were not Tajik provided these Tajik teachers
with training in signed language and pedagogy.
Educators at the Leninsky School during the Soviet period were likely all
hearing. The oral method of education, a pedagogical method that precluded
the participation of deaf teachers, predominated in deaf education throughout
the Soviet period (Pursglove and Komarova 2003). Indeed, not until recent years
were the first deaf Tajiks employed as teachers in Tajik schools for the deaf. The
roughly 15 to 20 faculty members at the Leninsky School in the 1960s are said
to have been skilled signers; in addition, according to Mr. Giasev, they were
signing what was perceived to be Russian Sign Language. Other deaf residents
of Dushanbe who attended the Leninsky School in later years also identified the
language used in the school as Russian Sign Language. Although these early educators may have been skilled signers, we can infer from their hearing status that
most of them, if not all of them, were likely to have been L2 learners of Russian
Sign Language. Thus, the early transmission of Russian Sign Language to Tajikistan likely occurred by L2 signers principally in classroom settings.
In 1962, the Leninsky School had an enrollment of approximately 100 students in grades 0 to 8 – eighth grade being the final grade for compulsory education and the highest grade offered in the school at the time. There were roughly
10 to 12 students in each grade, and grade levels were further divided into two
classes; hence class sizes were relatively small, with just 5 or 6 students per class.
The student body’s ethnic composition in the 1960s may have differed from the
school’s ethnic composition today. Many ethnic Russians resided in Tajikistan,
as did groups of other nationalities and ethnic backgrounds, before the collapse
of the Soviet Union and subsequent civil war in Tajikistan from the late 1980s
through the mid-1990s. Several deaf Tajiks who attended school before Tajik independence reported having had many ethnic Russian classmates. Today, there are
very few Russian students in Tajik schools for the deaf.
The Leninsky School was – and still is – residential and coeducational.
Mr. Giasev lived in the school’s dormitory because it was impractical to travel
home daily; his hometown, Regar (now Tursunzoda), is close to Tajikistan’s
western border with Uzbekistan, roughly 60 kilometers from Dushanbe. In
general, most interviewees who attended the Leninsky School during the Soviet
period had fond recollections of their time living in the school’s dormitory.
Because the school was coeducational, it is likely that many students were introduced
to a broader pool of potential deaf spouses than they might otherwise have
encountered. In this way, the Leninsky School likely contributed to the formation
of the Tajik Deaf community; certainly, the long-term relationships that formed at
the school have provided a context for the continued use of Tajik Sign Language.
When Tajik deaf partners had children, a new pathway for language transmission
became possible: the generational, vertical transmission of Tajik Sign Language
from deaf parents to their deaf and hearing children.
Other schools and educational services for the deaf
Educational services for deaf Tajiks were available outside of Dushanbe during
the Soviet period – though the second residential school for the deaf was not
built until 1975 near Khujand, the largest city in the north of Tajikistan. In addition to the school in Khujand, which remains in operation, and the two schools
in Rudaki district2 (i.e., those for the profoundly deaf and for the hard-of-hearing
and late-deafened), there is one non-residential government-operated school for
the deaf in Dushanbe. These schools are the largest in the country. Historically, preschool programs, starting from age 4 and going up to enrollment at age 7
in grade 0, were offered in some parts of the country – including in Dushanbe and
as far east as Khorog in the Pamir region – but it remains unclear how many children were enrolled in these programs. Among the deaf Tajiks interviewed between
2016 and 2018, only one resident of Dushanbe had attended a preschool program.
In addition to government-operated educational services for deaf Tajiks, there are
several NGO-operated preschool classes in Dushanbe and in neighboring towns
such as Vahdat, Hisor, and Faizobod.
There have been fewer educational services for deaf Tajiks outside of Dushanbe
and Khujand both during the Soviet period and now. In southern Tajikistan, a
small school with two classrooms was located in Bokhtar (formerly Qurghonteppa) until roughly the early 2000s. Later, an educational program came into
being on the premises of an existing government school in Bokhtar; the program
includes three classrooms for deaf children that are separate from those of their
hearing peers. In central Tajikistan, none of the interviewees who currently reside
in Kulob attended educational programs in that area; hence it remains unclear
whether any educational services were provided in that area historically. Among
the interviewees from Bokhtar and Kulob were several deaf individuals who did
not attend any deaf educational programs; others from those areas traveled to
Dushanbe to attend school.

2 The Leninsky school was divided into two sections in 1979 based on hearing status.

Deaf school attendance and its consequences for social ties among deaf Tajiks
The Tajik Society of the Deaf operates regional chapters in major Tajik cities, such as
Dushanbe, Khujand, Bokhtar, Kulob, and Khorog. In 2016, the Society’s leadership
estimated, based on governmental health statistics and its own membership records,
that there were as many as 28,000 deaf Tajiks in the country. In a country with a
total estimated population of approximately 9.1 million, this figure represents about
0.3% of the population, a proportion that is roughly in agreement with figures for
“functionally deaf” in the United States (cf. Mitchell 2006, who found 0.38% for this
category). According to the estimates of school attendance calculated below, only a
small fraction of the Tajik deaf population – if it is indeed as large as the Tajik Society
of the Deaf estimates – has attended a deaf educational institution or program.
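
The proportion cited in this paragraph follows directly from the two population figures. A minimal Python sketch of the calculation is given here for transparency; the variable and function names are illustrative only, and the inputs are the estimates quoted above rather than independently verified counts.

```python
def share_percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Figures quoted in the text.
estimated_deaf_population = 28_000   # upper estimate, Tajik Society of the Deaf (2016)
total_population = 9_100_000         # approximate total population of Tajikistan

tajik_share = share_percent(estimated_deaf_population, total_population)

# Comparison value for "functionally deaf" in the United States (Mitchell 2006).
us_share = 0.38

print(f"Tajikistan: {tajik_share:.2f}% of the population")
print(f"United States: {us_share:.2f}% of the population")
```

The computed share is about 0.31%, which the text rounds to 0.3%.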
Because available data on special school enrollments from the Soviet era do
not differentiate types of disability (Anderson et al. 1987), it is difficult to estimate
how many deaf Tajiks have been educated in schools for the deaf since roughly
1940. According to Evans and colleagues (2009: 224) in a national educational
policy review for the Organisation for Economic Co-operation and Development,
there were 251 students in the school for the profoundly deaf and 141 students in the
school for the hard-of-hearing and late-deafened in 2007; as well as 202 students in
the school for the deaf near Khujand. For purposes of estimation, we set the enrollment in 1940 at 100 (cf. the student population in 1962, according to Mr. Giasev);
and in the year 2020 at 250. If the average attendance at the school was 175 between
1940 and 2020, and if it takes roughly 9 years for students to finish their schooling
(grades 0 to 8), then in the course of 80 years an estimated 1,500 students have
attended the Leninsky School. Including the school near Khujand – where, according to these rough calculations, some 900 students have attended the school – we
can estimate that approximately 2,500 deaf individuals have attended a residential school for the deaf in the past 80 years. That estimate does not include students
who have attended only the Special School #8 in Dushanbe, which is not a residential school, or those who have attended other educational programs in the country.
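The arithmetic behind these rough estimates can be retraced in a few lines. The following Python sketch is only an illustration of the calculation described above: the function and variable names are illustrative, the average enrollment is assumed to be the midpoint of the 1940 and 2020 figures, and the figure of 900 students for the school near Khujand is taken directly from the text rather than recomputed.

```python
# Back-of-envelope retracing of the attendance estimate in the text.
# All names are illustrative; the inputs are the figures quoted above.

def students_over_period(avg_enrollment, years, years_per_student):
    """Approximate number of distinct students, assuming constant turnover."""
    return avg_enrollment * years / years_per_student

enrollment_1940 = 100          # assumed starting enrollment (from the text)
enrollment_2020 = 250          # assumed final enrollment (from the text)
avg_enrollment = (enrollment_1940 + enrollment_2020) / 2   # midpoint: 175

leninsky = students_over_period(avg_enrollment, 2020 - 1940, 9)
khujand = 900                  # taken from the text, not recomputed here
total = leninsky + khujand

deaf_population = 28_000       # estimate of the Tajik Society of the Deaf
national_population = 9_100_000

print(f"Leninsky School: about {leninsky:.0f} students")
print(f"Both residential schools: about {total:.0f} students")
print(f"Share of estimated deaf population: {total / deaf_population:.1%}")
print(f"Deaf share of national population: {deaf_population / national_population:.2%}")
```

Before rounding, the calculation gives about 1,556 students for the Leninsky School and about 2,456 for both residential schools, or just under 9% of the 28,000 deaf Tajiks estimated by the Society.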
Why have seemingly few deaf Tajiks attended residential or other schools for
the deaf? One aspect of the answer to this question may be that Tajikistan is an
extremely mountainous country, and many towns and villages do not have easy
access to paved roads. Many Tajik families with deaf children have limited access
to local deaf educational services, and it is likely that some of these families face
barriers in transporting their deaf family members to the residential schools in
Dushanbe and Khujand. A second factor that likely contributed to the relatively
low levels of school attendance by deaf Tajiks was the political tumult in the late
1980s and throughout the 1990s caused by the collapse of the Soviet Union, Tajik
independence, and the Tajik civil war. These conditions were certainly not conducive to high rates of school attendance among Tajik children generally.

Social and linguistic divisions have apparently developed among deaf Tajiks
based on deaf school attendance. A conventional sign in Tajik Sign Language
refers to signers who have not attended a school for the deaf (cf. Figure 1); the sign
apparently communicates a negative social and linguistic evaluation of these
deaf Tajiks. The direct translation of the sign in Russian certainly has pejorative
overtones; according to signers in Dushanbe, the sign in Tajik Sign Language can be
translated in Russian as rab ‘slave’. Indeed, the form of the sign is similar to
that of the sign in Russian Sign Language with the meaning rabstvo ‘slavery’. However, the
direct Russian translation of the sign does not indicate the sense
intended by signers of Tajik Sign Language. The English word “homebound”
seemingly captures one aspect of the intended meaning because signers who
have not attended a school for the deaf in Tajikistan are thought to have limited
social interactions outside of their homes. The sign is also used to indicate the
way a “homebound” signer communicates – that is, in an unconventional,
pantomimic way.

Figure 1: A signer from Dushanbe produces the sign meaning ‘homebound’, which can refer
both to signers who have not attended a school for the deaf and to the way in which these
signers typically communicate.

Padden and Humphries (2006: 335) discuss a potentially similar phenomenon in
the Deaf community in the United States: “[t]he label L-V (‘low-verbal’) is used
for educational unfortunates” and for “labeling the uneducated, the working
poor, and the chronically unemployed”. Although not all of these characterizations appropriately apply to the label in Tajik Sign Language, the social divisions
among deaf Tajiks and among deaf Americans share similarities insofar as they
pertain to deaf school attendance. In Tajikistan, deaf school attendance can have
important implications far beyond a deaf Tajik’s education, or lack thereof. Deaf
school attendance is one of the main gateways into a signing community, with its
attendant access to close and lifelong social ties.
Language transmission in the deaf education system
Because Tajik Sign Language has emerged in connection with schools for the
deaf, the experiences of students in the deaf education system are likely to shed
light on how the language has been transmitted among deaf Tajiks over time and,
relatedly, on the factors that have driven the language’s evolution. This section
highlights the pathways of language transmission that have depended to a large
extent on whether a deaf student was born in a family with other deaf signers.
Most deaf children – perhaps as many as 90% – are born in hearing families that
do not sign as their primary mode of communication (Mitchell and Karchmer
2004). Thus, the majority of deaf Tajiks likely had their first encounter with a
deaf signing community, and with visually-accessible language, on their first day
at a school for the deaf. The experiences of these students from non-signing families have differed markedly from the experiences of students with deaf family
members, many of whom became language models for their peers at school.
Among the group of students without deaf family members is one interviewee
from Kulob, who entered the Leninsky School around 1994 at the age of 8. He
recalled that close connections within peer groups formed quickly at the school
because he and his young peers could not yet communicate well with older students. He reported having difficulty understanding the signing of these older
students for around one or two years until his own fluency in the language had
improved. His experiences contribute nuance to observations in Reilly and Reilly
(2005) about aspects of the language acquisition process for young students in
a Thai residential school. In the Thai context, older students functioned as language models for the younger students in direct interactions. In this Kulobi signer’s experience, this type of direct interaction – that is, of vertical language transmission from older to younger students – evidently followed an initial period of
horizontal transmission among young peers. Based on descriptions of his own
limited fluency during the initial phase at the Leninsky School, we may infer that
the signed communication among this group of young peers included a variety
of homesign systems (Goldin-Meadow and Mylander 1990) – that is, the partly
idiosyncratic signed systems that can develop when deaf children in hearing families lack sufficient access to linguistic input.
The experiences of Tajik children with deaf family members – in particular,
those with family members who had previously attended a school for the deaf –
likely differed in important ways compared with the experiences of children from
non-signing families. One pair of siblings who were interviewed in Dushanbe are
the children of two deaf parents, both of whom had attended the Leninsky School;
these siblings’ older brother had also attended the Leninsky School. The younger
of these two siblings reported that he was the only fluent signer of Tajik Sign Language in his peer group upon entering the Leninsky School. According to this
signer, students in the wider student body who had been exposed to Tajik
Sign Language prior to entering school were generally few in number. During the early phase
in which his peers acquired Tajik Sign Language, he communicated with them
using pantomime and semi-conventional gestures common in wider Tajik society.
We may also assume that he used his signing skills, acquired natively at home,
in these interactions. Hence, he likely functioned as a language model within his
peer group. In addition, beginning from his first day at school, his fluency in Tajik
Sign Language, having been acquired from birth, likely gave him access to signed
interactions with both older and younger students – and with faculty and dormitory staff.
Within families that include multiple deaf children, the birth order of deaf
siblings likely has an impact on their early linguistic environment and hence on
the development of their language abilities before entering school. In an investigation of the signing skills among a family in Mexico that includes five deaf
siblings, German (2021) found that the eldest sibling produced more pantomimic
forms, whereas her younger siblings produced more complex grammatical structures. In Tajikistan, one signer who is originally from Hisor (located approx. 30
kilometers west of Dushanbe) was the first in her family of several deaf siblings to
attend the Leninsky School. She lived at the school’s dormitory during the school
week, and at home with her family on the weekends. After she gained fluency in
Tajik Sign Language, she began teaching her younger deaf sister while at home
on weekends. Another signer with a younger deaf sibling, both of whom live in
Vahdat (approx. 20 kilometers east of Dushanbe), entered the Leninsky School in
1973. Roughly six years later, his younger deaf brother enrolled at the school. The
experiences of the older and younger siblings in both aforementioned cases likely
differed qualitatively: the younger siblings had been exposed to Tajik Sign Language before entering school, an advantage that their older siblings had not had.
While all of the interviewees who grew up in or near Dushanbe attended
the Leninsky School for an extended period – typically through grade 8 – this
has not been the experience of many deaf Tajiks living in Bokhtar and Kulob.
For these deaf Tajiks, interactions with the broader Deaf signing community
were limited during childhood; some had few, if any, interactions with other deaf
Tajiks. For example, one signer from Bokhtar, who began attending the Leninsky School at the age of six, remained there for just three years; after grade 2 her
parents brought her back to Bokhtar. Another signer from Bokhtar entered the
local day school for the deaf in 2005 at the age of seven and was a student there
for three years before his parents sent him to the Special School #8 in Dushanbe
for grade 5. A third signer from Bokhtar never attended a school for the deaf or any
other school. He reported that his connections to other deaf signers in Bokhtar or
elsewhere are limited and that he does not understand signers of Tajik Sign Language when observing their conversations. Similarly, a signer from Kulob was not
enrolled in any school for the deaf or any other deaf education program. However,
in contrast to the experience of the Bokhtari signer described above, this signer
attended his local hearing school through grade 6. With no interpreting services
available, his level of access to learning in such a classroom environment may
have been severely limited. Again, in contrast to the Bokhtari signer’s experience,
this Kulobi signer is relatively well-connected to the local signing community in
Kulob, from whom, by his own report, he has learned Tajik Sign Language.
As the preceding brief survey shows, deaf Tajiks have had widely varying experiences with the deaf education system, with signing communities, and with Tajik
Sign Language. The experiences of these deaf Tajiks have depended to a great extent
on whether they had signing family members and hence exposure to Tajik Sign Language before entering school. Experiences varied, even within families with multiple deaf family members, depending on an individual’s birth order – in particular,
the experiences of the first deaf member of a family significantly differed from the
experiences of deaf children who had access to Tajik Sign Language from parents or
older siblings. For many deaf Tajiks living outside of Dushanbe, experiences with
the deaf education system and access to Tajik Sign Language have varied widely.

2.2 Deaf education and its impact on Tajik Sign Language
Based on the preceding sketch of the deaf education system in Tajikistan and
of the pathways of language transmission in schools for the deaf – and within
families – we may draw several inferences about the emergence and evolution of
Tajik Sign Language. Here the focus will be on three factors implicated in those
processes, namely, (i) the transmitters of Russian Sign Language (i.e., the Leninsky School’s faculty), (ii) the acquirers of that language (i.e., the student body,
with particular attention to its age distribution), and (iii) the transmission process
itself, specifically, the modes of language transmission, both within the Leninsky
School and later outside of that school in homes with deaf family members.
Hearing teachers as a founder population
As a small community of perhaps 15 to 20 individuals, the early group of hearing
educators who moved to Tajikistan in the 1930s or 1940s in order to establish
the Leninsky School represented a type of founder population (Templeton
1980; Mufwene 1996, 2001; Atkinson 2011). As such, this group was a small fraction of the larger population of signers of Russian Sign Language in Russia. The
full variation in lexical and grammatical structures present in that parent community was not likely reflected in the signed language that was imported to Tajikistan by this founding population. For example, relatively low-frequency linguistic variants in the parent community may have been present to a greater extent
in the founding population; or conversely, high-frequency variants present in
the signing community in Russia may have been represented to a comparatively
lower degree in the hearing teachers’ signing.
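The sampling logic of a founder effect can be illustrated with a small simulation. The sketch below is hypothetical throughout: the three variant labels and their frequencies in the parent community are invented for illustration, each founder is assumed to use exactly one variant, and the group size of 18 is simply a value within the range of 15 to 20 mentioned above. It is not a model of the actual teachers at the Leninsky School.

```python
import random

# Hypothetical frequencies of three competing variants in the parent community.
PARENT_FREQS = {"variant_A": 0.70, "variant_B": 0.25, "variant_C": 0.05}

def sample_founders(n_founders, rng):
    """Draw a founder group and return each variant's share within it.

    Simplifying assumption: every founder uses exactly one variant.
    """
    variants = list(PARENT_FREQS)
    weights = list(PARENT_FREQS.values())
    draws = rng.choices(variants, weights=weights, k=n_founders)
    return {v: draws.count(v) / n_founders for v in variants}

def simulate(n_founders=18, n_trials=10_000, seed=0):
    """Estimate how often the rare variant is lost or over-represented."""
    rng = random.Random(seed)
    rare = "variant_C"
    lost = 0          # rare variant absent from the founder group
    doubled = 0       # rare variant at least twice as frequent as in the parent
    for _ in range(n_trials):
        share = sample_founders(n_founders, rng)[rare]
        if share == 0:
            lost += 1
        elif share >= 2 * PARENT_FREQS[rare]:
            doubled += 1
    return lost / n_trials, doubled / n_trials

lost_share, doubled_share = simulate()
print(f"Rare variant absent in {lost_share:.0%} of founder groups")
print(f"Rare variant at least doubled in {doubled_share:.0%} of founder groups")
```

Under these assumptions, a variant used by 5% of the parent community is expected to be missing from roughly 40% of founder groups of this size and to be at least doubled in share in a little over a fifth of them. Either outcome is the kind of departure from the parent community's variation that the passage above describes.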
As hearing L2 learners of Russian Sign Language, the original teachers at
the Leninsky School may have acquired only part of the Russian Sign Language
lexicon and may not have acquired the full range of grammatical structures in
that language. In general, given the emphasis on oral approaches to deaf education in the Soviet Union (Pursglove and Komarova 2003), and given that these
teachers were L2 learners of Russian Sign Language, their proficiency in that language may have varied widely. Hence, differing groups of Tajik deaf children were
likely exposed in their classrooms to linguistic input with widely divergent and
inconsistent structural features, and to a limited range of Russian Sign Language
vocabulary. In addition, if, as with the faculty at the Leninsky School during Mr.
Giasev’s attendance there (cf. Sec. 2.1.2), the early group of educators came from
linguistically diverse backgrounds, there may have been considerable variation
in the educators’ L2 variety of Russian Sign Language due to differences in their
first languages (cf. Chen Pichler and Koulidobrova 2016 for a review of the literature on second language and second modality transfer). In sum, we infer that the
hearing teachers in Tajikistan did not provide consistent linguistic input for deaf
Tajik children; and that the variety of Russian Sign Language that was imported
into Tajikistan differed, perhaps considerably so, from the language that continued to evolve in Russia proper.
Age distribution of the early student body
The age distribution in a deaf population can have consequences for the formation of a signing community and for the evolution of the community’s language (cf. Power and Meier, in submission, for an examination of student demographics in the first school for the deaf in the United States). Based on research
about the emergence of Nicaraguan Sign Language, R. J. Senghas and colleagues
(2005) have theorized that adolescents/adults, and young children, play differing roles in the formation of a deaf community and in the emergence of a signed
language. Adolescents and adults are thought to provide a stable community
structure and visually-accessible linguistic input, while young children (roughly
under 10 years of age, cf. Senghas and Coppola 2001) are thought to drive the
emergent language’s grammatical elaboration (cf. also Polich 2001; Senghas
et al. 2004).
It is unclear what the age distribution of the student body was during the early
years of the Leninsky School. Today, students at the Leninsky School begin grade
0 at around age 7. However, as we have seen, Mr. Giasev was 12 or 13 when he first
entered the school in 1962. It is possible that there was more variation in the ages
at which students were admitted to the Leninsky School during the school’s early
period because, at least initially, none of the deaf Tajik children had had access to
deaf education before that time. If early on there were numerous young children
but few older students at the Leninsky School, we might expect that the formation
of the Tajik Deaf community was delayed until these younger students reached
adolescence and adulthood. Alternatively, if the student body included many
adolescents, we might expect that the Tajik Deaf community formed rapidly. The
Tajik Society of the Deaf likely played an important role in the early formation of
the Tajik Deaf community because membership in the Society was not contingent
on deaf school attendance.
Language evolution in the Leninsky School
What inferences can be drawn about the evolution of Tajik Sign Language based
on the sketches above as to the early signing community in Tajikistan? Theories
about the modes and outcomes of cultural transmission may help us understand
possible evolutionary scenarios. Cavalli-Sforza and Feldman (1981) distinguish
four types of cultural transmission, including vertical (parent to child), horizontal (among peers), and two types of oblique (nonparental adult to younger generation) transmission. According to this typology, the early transmission of Russian
Sign Language occurred along oblique pathways, namely, from hearing teachers to deaf Tajik students. Oblique transmission is thought to result in increased
rates of evolution in one-to-many transmission scenarios – that is, in scenarios
in which the cultural model is outnumbered – but to have the opposite effect in
many-to-one scenarios (cf. Lycett and Gowlett 2008). In one-to-many scenarios,
the younger generation outnumbers the older generation, potentially facilitating
the creation and uptake of cultural innovations. In many-to-one scenarios, strong
community norms may tend to discourage innovations and decrease in-group
variation.
Recall that, according to the interview with Mr. Giasev, there were relatively
few students per teacher some 20 to 30 years after the Leninsky School’s establishment.
The student body consisted of around 100 students, while there may have
been 15 to 20 teachers and other staff members. In Mr. Giasev’s class, there were
only 5 to 6 students and one teacher. If the hearing teachers were concerned with
maintaining norms in Russian Sign Language – that is, at least those norms of
which they were aware – we might expect that the early phase in the transmission of Russian Sign Language to deaf Tajiks tended to discourage innovations.
If, however, teachers were unconcerned with maintaining Russian Sign Language
conventions; or if, as L2 learners of the language, they were unaware of many
conventions in Russian Sign Language, there may have been a proliferation of
linguistic innovations among students. The putative variation and inconsistency
in their teachers’ signed variety may have further encouraged such innovations,
because deaf students at the Leninsky School may not have perceived a clear set
of conventions in their linguistic input.
In addition to the teacher-student ratio, the quality of interactions between
teachers and students likely played an important role in the early evolution of
Tajik Sign Language. In the classroom, interactions between hearing teachers
and deaf students can tend to be short in duration and mainly directed by the
teacher, with limited opportunities for deaf students to contribute (Erting 1988;
Singleton and Morgan 2006). This phenomenon of “teacher talk” may not exhibit
age-appropriate structural complexity, with teachers tending to produce simple
structures regardless of their students’ ages (Wood et al. 1991). If the hearing
faculty at the Leninsky School exhibited similar characteristics in their interactions with students, these students would have had little exposure to complex
grammatical structures in Russian Sign Language. With little access to complex
structures in their teachers’ signing, students at the Leninsky School would have
likely innovated their own grammatical conventions.
Both during and following the initial phase of the school’s establishment,
horizontal transmission among peers, which is thought to foster the rapid innovation of cultural conventions (Cavalli-Sforza and Feldman 1981), is likely to
have played an important role in the evolution of Tajik Sign Language. Recall
the Kulobi signer’s early experience at the Leninsky School, when for a period of
roughly one to two years he could not understand the signing of older students
and interacted mainly with his young peers. Because, apparently, there were no
native signers of Tajik Sign Language among this group of young students, there
was ample opportunity for the students to innovate new conventions, many of
which may have developed from the students’ existing homesign systems (cf.
Coppola 2020). During the early years of the Leninsky School, it is unlikely that
there were any native signers of Russian Sign Language among the student body.
Hence, like the Kulobi signer’s peer group, all of the groups formed early on at
the school were likely to have been fertile ground for the creation of linguistic
innovations.

While these putative new conventions were developing, these young students
were acquiring the emergent signed language from older students, from their
teachers, and, later, from native signers of the language among the student body;
hence, these students also began to use Tajik Sign Language with each other,
transmitting the language along horizontal pathways within their peer group: a
situation that may have encouraged further innovations in the language. The residential school’s dormitory was likely the locus of both horizontal transmission
among peers and vertical transmission from older to younger students, with the
older students occupying the role that parents play in typical vertical language transmission processes (cf. discussion in Singleton and Meier 2021: 28). Later, deaf children of former deaf students – such as the signers from Dushanbe mentioned
above – likely also functioned as language models for their peers.
In sum, all three of the major transmission processes in Cavalli-Sforza and
Feldman’s (1981) typology – including vertical, horizontal, and oblique processes –
were implicated in the emergence and early evolution of Tajik Sign Language.
However, the prominence of each of these transmission types likely varied over
time. Initially, there were, apparently, no deaf individuals living in Tajikistan
who had natively acquired Russian Sign Language; thus, the early transmission
of Russian Sign Language occurred along oblique (via hearing teachers) and
horizontal (among students) pathways. As we have seen, many of the conditions that were likely relevant at the school during the early years – for example,
the teachers’ putative lack of fluency in Russian Sign Language and the prominence of horizontal language transmission among students who were not themselves native signers – were conducive to the rapid divergence of the emergent
signed language from Russian Sign Language. Later, once a group of children
had natively acquired the emergent language, that group of native signers acted
as language models for other deaf students and eventually, in some cases, for
their own deaf children. In these ways, Tajik Sign Language has been transmitted
across generations to the contemporary community of signers.

2.3 Records of Tajik Sign Language
There are three main records of Tajik Sign Language: one video dictionary, one print dictionary, and the corpus analyzed in Power (2020). First, a
video dictionary and signed language course were created in 2008 in Dushanbe
through a collaboration between the Tajik Society of the Deaf, the Moscow Centre
for Deaf Studies and Bilingual Education, and the Finnish Association of the
Deaf. These materials include approximately 650 signs as well as scripted conversations. Second, Kuvvatov and Rahmonov (2015) published a collection of
commonly used signs. The collection includes 1,594 photographs of signs, translations in Tajik and Russian, and descriptions in Tajik of sign pronunciations.
Third, between 2016 and 2018 this author collected more than 80 hours of video
recordings – comprising interviews, elicited lexical and grammatical data, and
narrative responses to the Pear Story (Chafe 1980; cf. Power 2020).

3 Tajik Sign Language’s relationship to other signed languages
Because, like Tajik Sign Language, many of the world’s macro-community signed
languages have emerged in connection with schools for the deaf, historical connections among schools for the deaf have played a critical role in the distribution
of the world’s signed languages. Scholars have often used the extralinguistic histories of these deaf educational institutions, and of the signing communities that
have formed in connection with them, to inform their explorations of the linguistic relationships among signed languages and their classification of these languages into families (e.g., Woodward 1978; Fischer 2015). In addition to historical
connections among schools for the deaf, Padden (2011) argues that other cultural
and political factors may influence the distribution and relationships of signed
languages in a region. In the case of five Middle Eastern signed languages (cf. also
Al-Fityani and Padden 2010), Padden discusses how political borders and political relationships have influenced the possibilities for language relationships and
language contact among signers in that region. She also highlights how cultural
practices such as consanguineous marriage can define the social groups within
which signed languages emerge and are transmitted. According to Padden, the
patterns of relationships among the world’s signed languages are sensitive to
many of the same political and sociocultural factors that influence the patterns of
relationships and the patterns of contact among spoken languages.
Where does Tajik Sign Language fit in the landscape of signed language
relationships? In what sense is Tajik Sign Language a language of Central Asia?
What is the linguistic evidence for classifying Tajik Sign Language together with
Russian Sign Language (as in, e.g., Eberhard et al. 2021; as well as Hammarström et al. 2021)? In light of the background pertaining to the history of deaf
education in Tajikistan (cf. Sec. 2), it may be unsurprising to find close lexical
similarity between Tajik Sign Language and Russian Sign Language. In the
Russian Sign Language lexicon, the early teachers at the Leninsky School had
an established set of conventional signs which they used when interacting with
deaf Tajiks. Local Tajik signs that were brought to the school by students were,

248

Justin M. Power

evidently, often outcompeted by the conventional and institutional signs from
Russian Sign Language (cf. Yoel 2007; and Lanesman and Meir 2012 for explorations of the conditions leading to language shift among deaf migrants to Israel).
Based on a lexical comparison of 12 Tajik and 2 Russian signers, it was argued in
Power (2020) that much of Tajik Sign Language’s basic vocabulary has etymological origins in Russian Sign Language.
Here, this research extends the lexical comparison in Power (2020) by including data from another Central Asian signed language, namely, Afghan Sign Language. In the current comparison, the linguistic boundaries among signed languages in Central Asia are investigated to ascertain whether any basic vocabulary
signs are shared across signing communities in Tajikistan and Afghanistan. The
results of the lexical comparison show that there is little linguistic evidence of a
historical relationship between Afghan Sign Language and Tajik Sign Language –
and little evidence of contact among signers of these two languages. In contrast,
Russian Sign Language, according to the analysis presented here, has had the
more important historical influence on the vocabulary of Tajik Sign Language
by way of the deaf education system established during the Soviet period. Thus,
although the signing community in Tajikistan is, from a geographical perspective,
a Central Asian signing community, Tajik Sign Language is, within the linguistic
landscape of signed languages in the region, much more closely connected to
Russian Sign Language and hence to other signed languages of the
former Soviet Union.
Section 3.1 begins with a background on Afghan Sign Language. Following
this background, Section 3.2 presents the lexical comparison of Tajik Sign Language, Russian Sign Language, and Afghan Sign Language. Section 3.2.1
describes the data and methods used for the lexical comparison, and Section 3.2.2
reports on the comparison’s results. Section 3.3 discusses these results and their
importance for our understanding of Tajik Sign Language in the context of the
distribution of signed languages in Central Asia.

3.1 Afghan Sign Language: Historical background
Afghan Sign Language is the primary language of many deaf and hard-of-hearing
Afghans. The language is primarily used in Afghanistan, which is located south
of Tajikistan; the two countries share a border of approximately 1,357 kilometers.
Despite their close geographic proximity, the signing communities in Afghanistan and in Tajikistan have histories that differ in important ways. Although
both countries had political ties to the Soviet Union in the 20th century, those ties
developed later in Afghanistan (beginning in 1978) and were much less stable
(cf. Ewans 2002) when compared with the political integration of the Tajik Soviet
Socialist Republic in the Soviet Union. Perhaps because of the political instability
in Afghanistan, there does not appear to have been an expansion of the Soviet
deaf education system into that country during the Soviet-backed Communist
regime’s period in power (1978–1992). Thus, one of the most critical factors in
the emergence of Tajik Sign Language – namely, the expansion of the Soviet deaf
education system into Tajikistan in the late 1930s or early 1940s – apparently
played no part in the emergence of Afghan Sign Language.
The origins of Afghan Sign Language can be traced to Peshawar in northwestern Pakistan in the early 1990s.3 Prior to that time, deaf Afghans may not have
had the opportunity to gather in large numbers in Afghanistan; apparently, there
were no educational services for deaf Afghans in Afghanistan before the 1990s.
However, some of the deaf Afghans who would later work in Afghan schools for
the deaf – and who would play a part in forming the Afghan National Association
of the Deaf – had been long-term residents of northwestern Pakistan by the early
1990s and may have attended schools for the deaf there. Given cross-border ethnic
and linguistic ties among Afghan and Pakistani Pashtuns, it was not (and is not)
an uncommon situation for Afghans to live across the border in Pakistan. That
some of these Afghans were deaf and that they may have attended a Pakistani
school for the deaf is also not surprising. As such, although there apparently
were no deaf educational services in Afghanistan prior to the 1990s, some deaf
Afghans may have been Pakistan Sign Language signers at that time.
Over the course of the decade following the Soviet-backed coup in 1978 and
the military invasion of Afghanistan in 1979, the number of Afghans living in
Pakistan rapidly increased. Several million Afghans left Afghanistan. As many
as 3.2 million had fled to Pakistan by 1984 (Amstutz 1986: 223–225), and over 2
million were still in Pakistan in 1992 (Ghufran 2008: 121) – three years after the
Soviet Union’s withdrawal from Afghanistan. Many of the refugees from Pashto-speaking provinces along Afghanistan’s eastern border with Pakistan settled
in Peshawar, a Pashto-speaking city located roughly 50 kilometers from the
border crossing at the Khyber Pass. Numerous organizations began relief and
development projects among the refugees in Peshawar. In 1992, an organization
called Serve Afghanistan started a project in Peshawar that focused on vocational
training and education for roughly 60 deaf Afghans. Among the staff working
in that project were two Americans who were fluent signers of American Sign
Language. Initially, that language was used as the main language for training the
project’s deaf Afghan participants.

3 I was an employee of the NGO, Serve Afghanistan, from 2004 to 2006 and from 2010 to 2013. Details
in this section about the formation of the Afghan Deaf community and about the emergence of
Afghan Sign Language reflect information that I obtained while working in Jalalabad and Kabul
as well as information obtained through correspondence with Soo Choo Lee, who led Serve Afghanistan’s
project serving deaf and hard-of-hearing Afghans from 1994 to 2004.

In the mid-1990s, the vocational training project in Peshawar, including a
number of its deaf Afghan employees – but without the two fluent signers of
American Sign Language – relocated across the border to Jalalabad, Afghanistan. There the project shifted its focus to primary education by establishing
the first school for the deaf in Afghanistan in 1995. The school was called the
SHIP school (i.e., Serve’s Hearing Impaired Project; its name in Afghan Sign Language is an iconic representation of a boat or a ship). In 1995, the project also
produced the earliest published collection of signs that are identified as Afghan
signs. This collection of Afghan signs – which was published in the form of a
book with drawings of signs and translations of these signs in Dari, Pashto, and
English – includes a number of signs that appear to have etymological origins in
American Sign Language and in Pakistan Sign Language; there has not yet been
a systematic study of the signs in the 1995 collection (Serve’s Hearing Impaired
Project 1995). Hence, these two macro-community signed languages, American
Sign Language and Pakistan Sign Language, likely contributed to the linguistic ecology of the signing community that formed in Peshawar and Jalalabad in
the 1990s.
Later in the 1990s, deaf and hearing employees of the SHIP school in Jalalabad visited the Holy Land Institute for the Deaf in Jordan, an institute which has
served as a resource center for deaf education in the region (cf. Al-Fityani 2010:
20). These Afghans’ visit to the Institute in Jordan led to the adaptation of the
fingerspelling alphabet in Jordanian Sign Language for use in Afghan Sign Language. The decision to adapt an existing fingerspelling alphabet – that of Jordanian Sign Language – for use in Afghan Sign Language was facilitated by similarities across the alphabets in Jordan and in Afghanistan: the Jordanian Sign
Language fingerspelling alphabet represents the Arabic alphabet, while the Dari
and Pashto alphabets are both based on the Arabic alphabet. Additional handshapes were created in the Afghan Sign Language fingerspelling alphabet for
letters in Dari and Pashto that do not exist in Arabic.
Today, the majority of signers of Afghan Sign Language live in Jalalabad and
in the country’s capital city of Kabul, where the Afghan National Association of
the Deaf is located. There are three large schools for the deaf in those two cities,
two of which are in Kabul; these schools have roughly equal enrollments (approx.
between 200 and 300 students) and grade offerings (preschool to grade 12). None
of these three schools are residential; instead, they mainly draw students from
their local populations. Historically, all three schools have been coeducational,
except when coeducation was not permitted under the Taliban government
(1996–2001). There are also populations of signers of Afghan Sign Language in
other parts of the country – in particular, in urban areas in which educational
services for the deaf have been located, such as Mazar-i Sharif in the north, Herat
in the west, and Kandahar in the south. In their report on the role of Afghan Sign
Language in the Afghan deaf education system, Becker & Eichmann (2013: 7)
estimated that there were at least 30,000 profoundly deaf Afghans who signed
as their primary means of communication and that the majority of these deaf
Afghans had never attended a school for the deaf.
Linguistic boundary between Afghan Sign Language and Tajik Sign Language
How does the brief background in this section about deaf Afghans, and about the
emergence of Afghan Sign Language, affect our understanding of the linguistic
boundary between Tajik Sign Language and Afghan Sign Language? First, the two
languages have been influenced by differing macro-community signed languages
by way of the deaf education systems existing in each country. In Tajikistan,
Russian Sign Language was introduced to deaf Tajiks through the Soviet deaf education system. In Afghanistan, the deaf education system developed differently,
without any connection to the Soviet system and apparently without any contact
between Afghan signers and signers of Russian Sign Language. Initially, nongovernmental organizations played a comparatively larger role in deaf education in
Afghanistan. The historical differences between the deaf education systems in
the two countries have likely contributed to the linguistic boundary between Tajik
Sign Language and Afghan Sign Language. However, the two macro-community
signed languages that were introduced to deaf Tajiks and deaf Afghans by means
of their deaf education systems – namely, Russian Sign Language and American
Sign Language – have their own historical connections going back to the influence of French schools for the deaf in the early 19th century; deaf signers in Russia
and in the United States both came into contact with French educators of the deaf
during that period (cf. Lane 1984; and Abramov 1993). This historical connection
between Russian Sign Language and American Sign Language via French Sign
Language represents a potential pathway for the etymologies of signs in Tajik
Sign Language and Afghan Sign Language to overlap.
The second relevant factor, highlighted above, that pertains to the potential
relationship between Tajik Sign Language and Afghan Sign Language is the locations of the two signing communities. In Tajikistan, the two largest schools for
the deaf are located near the country’s two largest population centers, namely,
Dushanbe and Khujand, which are located in Tajikistan’s central western and
northwestern regions, respectively. It is likely that the largest concentrations
of signers of Tajik Sign Language in Tajikistan are found in those two cities. In
Afghanistan, the largest concentrations of signers of Afghan Sign Language have
lived in the central and eastern parts of that country, namely Kabul and Jalalabad; many of these Afghan signers previously lived in Peshawar, which is located
farther east of Jalalabad. In sum, the largest groups of Afghan signers since the
early 1990s have lived some distance from the Afghan-Tajik border; furthermore,
they have lived even farther from the relatively large signing communities in
Dushanbe and Khujand. Because the Tajik and Afghan signing communities
have been separated geographically throughout their histories, it is unlikely that
signers of the two languages have been in frequent contact.

3.2 Lexical comparison of Tajik Sign Language, Russian Sign
Language, and Afghan Sign Language
In light of the background presented above about the emergence of Tajik Sign
Language (cf. Sec. 2), and of Afghan Sign Language (cf. Sec. 3.1), we might not
expect to find evidence of an historical relationship between these two Central
Asian signed languages or to find evidence of language contact between signers
of the two signing communities. This section examines the linguistic evidence for
an historical relationship and for language contact by comparing lexical signs
for a set of basic vocabulary concepts across Tajik Sign Language, Russian Sign
Language, and Afghan Sign Language. It begins with a description of the data
and methods used to compare these three languages. In connection with the
description of methods, it briefly highlights relevant methodological problems
that confront historical comparisons of signed languages, and that are relevant
for understanding the results of the current lexical comparison.

3.2.1 Data and methods
Data: Signers
The data for Tajik Sign Language and Russian Sign Language in the current
study comprise a subset of the data reported in Power (2020), which were collected between 2016 and 2018. For Tajik Sign Language, four signers are included
in the current comparison, all of whom resided in Dushanbe, Tajikistan at the
time of data collection. In 2018, these signers (2 female) were between 34 and 49
years old (M=41.75, SD=6.6); all had attended the Leninsky School beginning at
age 6 or 7, and all had first-degree deaf family members. For Russian Sign Language, two signers are included (both female; ages 33 and 34 in 2019), both of whom
lived in Moscow, Russia at the time of data collection; data were collected by
Valeriya Dushkina and shared with this author by Vadim Kimmelman (University
of Bergen, Norway). Both Russian signers attended schools for the deaf from an
early age (cf. Power 2020, for a detailed description of the data collection procedure).
Data for Afghan Sign Language are aggregated from two separate datasets
that were originally collected for use in video dictionaries of Afghan Sign Language. One of these datasets was collected between 2009 and 2012 as part of a
European Commission-funded project directed partly by this author. The second
dataset was shared with this author by the Sedaqat Deaf Center in Kabul. The
combined dataset was reported in Power (2014). For the current comparison, only
those signers are included who are known to have attended a school for the deaf
(the school in Jalalabad) from an early age (N=5, all male, all approx. 20 years old
at the time of data collection). The data typically include, per concept, one sign
from one signer; that is, the data do not typically include signs from multiple
signers for each concept. For purposes of the current study, the aggregated signs
from all five Afghan signers are compared to individual signers of Tajik Sign Language and Russian Sign Language.
Methods: Concept list
For Tajik Sign Language and Russian Sign Language, a 231-item concept list
was used to collect basic vocabulary data; the list was developed in the Indo-European Cognate Relationships database project (cf. Bouckaert et al. 2012).
Signs were available for only 50 of these concepts in the aggregated Afghan Sign
Language dataset described above. Hence, the current comparison includes
only signs representing the 50 concepts that are available for Afghan Sign
Language. These 50 concepts included 18 concepts with nominal meanings (not
including body parts; e.g., ‘grass’, ‘night’), 12 with adjectival or adverbial meanings (e.g., ‘hard’, ‘near’), 7 with verbal meanings (e.g., ‘go’, ‘sit’), 5 color terms
(e.g., ‘black’, ‘green’), 5 body part terms (e.g., ‘head’, ‘mouth’), 2 interrogatives
(‘where’, ‘who’), and 1 pronoun (‘he’). The complete list of concepts included in
the study is provided in Table 1. For the Tajik and Russian signers, there was no
upper limit to the number of signs that a signer could produce for each concept
in an elicitation session; hence, the number of signs produced varied across
signers. The complete data set, including transcriptions, is included in the supplementary material.4

4 Supplementary material available at https://doi.org/10.6084/m9.figshare.15170700.

Table 1: Signs for basic vocabulary concepts, grouped into seven categories, that are included in the current comparison.

Category | Basic vocabulary concept
Noun | day, grass, man (male), man (person), meat, moon, mother, mountain, name, night, path, pig, rain, river, road, salt, sand, sea
Adjective | good, hard, long, many, narrow, near, new, not, old, other, right (correct), round
Verb | do, give, go, hear, say, sit, think
Color | black, green, red, white, yellow
Body part | head, heart, mouth, neck, nose
Interrogative | where, who
Pronoun | he

Woodward (1978, 2011) has argued that some concepts, such as body part
terms, should be excluded from concept lists in historical comparisons of signed
languages because body part terms tend to be represented in signed languages by
phonologically-similar indexical (i.e., pointing) forms. For example, the concept
meaning ‘nose’ is represented in many signed languages by contacting the nose
with the tip of the index finger. One should not conclude, based merely on their
phonological similarity, that signs representing the concept ‘nose’ in various
signed languages have an historical relationship. Rather, these signs’ phonological similarity may be influenced by tendencies in the gestural-visual modality that
are orthogonal to language history. According to this line of reasoning, by including body part terms (and, e.g., pronouns) in an historical comparison, we are likely
to overestimate the amount of historically-related vocabulary that is shared across
the languages being compared. This methodological problem was particularly
acute in past quantitative approaches that sought to classify signed languages in
families based on percentages of shared basic vocabulary (e.g., Woodward 2011;
Parkhurst and Parkhurst 2003); in those approaches, similarity percentages and,
thus, language classifications were highly sensitive to false positives.
The current study adopts a different approach to addressing the methodological problem described above – that is, the problem of overestimating the number
of historically-related signs in a comparison due to the prevalence of phonologically similar indexical forms. Section 3.2.2 reports the overall comparison results,
as well as the comparison results broken down by concept category – including
the category of body part signs. By analyzing concept categories separately, it is
easier to explore how the comparison’s results are impacted by signs for specific
concepts, or specific sets of concepts.
Methods: Methodological problems for lexical comparisons of signed languages
In addition to the problem related to indexical signs described above, numerous
other theoretical and methodological problems confront any historical comparative investigation of signed languages (cf. Power 2022, for a discussion of theoretical problems related to the transmission of signed languages). The following discussion describes
one of the most intractable challenges confronting the historical comparison
of signed languages, namely, the problem of differentiating inherited linguistic
material (e.g., cognate lexical signs and structural features) from linguistic material that is shared across signed languages due to other processes, such as borrowing or independent parallel development. Sign scholars have not yet developed the type of robust methodology that exists in spoken language historical
linguistics in the form of the comparative method for identifying shared cognate
vocabulary (cf. Power et al. 2019).
Consider first the comparative method in historical linguistics. In the context
of studying a set of languages that are assumed to be related – based on “diagnostic linguistic evidence” (Nichols 1996: 48) – scholars apply the comparative
method, inter alia, to identify vocabulary that has been inherited from a common
ancestor. In order to identify this inherited, or cognate, vocabulary, scholars search
for systematic correspondences, typically beginning with comparisons of semantically related lexical items (cf., e.g., Hale 2015). The existence of these shared
correspondences is most parsimoniously explained by invoking two notions:
inheritance from a common ancestor; and the regularity of sound change. That
is, the correspondences are regular and systematic because they are derived from
a system that existed in a common ancestral language and because diachronic
sound change can be regular. Because systematic correspondences are taken to be
sounds that are inherited from a common ancestor, they provide the best possible
evidence for identifying cognate vocabulary across related languages. To date,
there has been no published evidence of systematic correspondences across the
signs of any putatively related signed languages. Thus, in their historical comparisons of signed languages, sign scholars have lacked the type of evidence that is
thought to be the best possible evidence for identifying cognate vocabulary.
Why have sign scholars not yet identified systematic correspondences across
putatively related signed languages? Rankin (2003: 184) highlights two assumptions about language, and language change, underlying the comparative method
that do not clearly hold for signed languages, namely, the fundamental arbitrariness of the linguistic sign and the regularity of sound change. These two assumptions are related in the following way. If there exists only an arbitrary connection,
in the minds of speakers or signers, between the phonological form of a spoken
word, or of a manually produced sign, and the meaning of that word or sign, then
sound change can operate in a regular manner at an unconscious level (cf. Rankin
2003; Labov 1981, 2020). If, however, speakers or signers perceive a non-arbitrary
connection between a representation’s phonological form and its semantics, then
it becomes possible for speakers or signers to block regular sound change or to
introduce irregularity into sound change (cf. Joseph 1987; Malkiel 1994). According to this line of reasoning, we might expect an inverse relationship between
the proportion of non-arbitrary representations in a language and the regularity of sound change in that language: languages that have greater proportions
of non-arbitrary words or signs will exhibit less regularity in diachronic change.
Because iconic (i.e., non-arbitrary) and indexical representations are ubiquitous
throughout the lexicons of all known signed languages (cf. Perniss et al. 2010),
both assumptions about language change that underlie the comparative method –
that is, assumptions about the fundamental arbitrariness of the linguistic sign
and the regularity of sound change – may not hold for signed languages. If change
in signed languages is not regular in the way that sound change can be regular in
spoken language change, then inherited vocabulary that is shared across related
signed languages may not exhibit systematic correspondences.
One important consequence of the putative lack of systematic correspondences across related signed languages is that sign scholars have not had the
methodological tools for differentiating cognate vocabulary from vocabulary that
is similar in phonological form due to other processes. For example, due to the
prevalence of iconic and indexical representations in signed languages, many
semantically similar signs may have similar phonological forms, even across
historically unrelated signed languages. Guerra Currie and colleagues (2002)
found that 23% of the vocabulary sampled from Mexican Sign Language and Japanese Sign Language – two languages with no known historical connections via
deaf education and with no known contact across signing communities – was
articulated similarly (i.e., the signs shared two out of three major phonological
parameters). The authors speculate that there may be a relatively high minimum
level of similarity in the phonological forms of semantically-similar lexical signs
across all signed languages (cf. also Woll 1983). If true, the task of differentiating inherited vocabulary from vocabulary that is phonologically similar due to
other causes is challenging indeed. All methods that have been developed so far
in order to identify cognate vocabulary across related signed languages, being
alternatives to the identification of systematic correspondences, share this fundamental problem: the methods do not rigorously distinguish the causes of phonological similarity.
The theoretical and methodological problems confronting historical comparisons of signed languages are, ipso facto, relevant to understanding Tajik
Sign Language’s relationships to other signed languages. Power (2020) argues
that, given the field’s current methodological limitations, an historical comparative methodology for signed languages should aim to differentiate etymological
relatedness (cf. List 2016) – that is, shared history due to either inheritance or
language contact – from phonological similarity due to iconicity or chance. It
further argues that one step towards the development of such a methodology is
the articulation of a model of historical change, a model that reflects our current
understanding of change in signed languages and, as such, adds an historical
dimension to comparisons of the phonological forms of signs (cf. Power 2020:
43–55).
Methods: An inferential framework for studying the histories of signs
This section provides an abridged description of the inferential framework developed in Power (2020: 79–100) for making historical inferences about the etymological relationships of signs. The methodology’s principal aim is to make the
inference procedure transparent – and hence improvable – using a computer-assisted framework following List (2014). The main features of the methodology are
(i) the transcription of signs using a computer-readable notation system, (ii) the
optional translation (or standardization) of those transcriptions into classes of
symbols, (iii) the measurement of similarity of sign representations using a model
of historical sign change, and (iv) the inference of etymological relations among
signs using a clustering algorithm.
The first step in the methodology is transcription. A computer-readable sign
transcription system, HamNoSys (Hanke 2004), was used to transcribe 428 total
lexical signs (including 25 compounds, i.e., 453 total morphemes) in Tajik Sign
Language, Russian Sign Language, and Afghan Sign Language. Because HamNoSys is computer-readable (and compatible with Unicode), transcriptions in HamNoSys can be used with existing computational tools, such as PySign (Power and
List 2020), a Python library developed to manipulate HamNoSys sign transcriptions. The following features were transcribed for each sign: handshape, location, movement, and symmetry features; thereafter the number of hands used
to articulate each sign was coded. Orientation features for Tajik Sign Language
or Russian Sign Language were not transcribed. Parks (2011) found that, among
the six parameters that he tested, orientation values – specifically, changes in
orientation values – were the least reflective of expected groupings in a lexical
comparison of 50 participants from 13 countries.
The second step in the methodology involves optionally translating the sign
transcriptions into symbol classes. The approach used here follows the conceptual design of the class-based approach developed by List (2014; cf. also List et al.
2018, for an implementation of this approach in the historical analysis of spoken
language data). In historical linguistics, class-based approaches originally aimed
to create a model of well-known sound changes and to use this model in cognate
inference procedures (cf. List’s 2014 discussion of Dolgopolsky’s 1964 sound classes). Class-based approaches have also been used to standardize representations of words in order to facilitate crosslinguistic comparisons (cf. Holman et al.
2008). In the current methodology, classes were used to explicitly define categories for the transcription symbols. For example, two handshape symbols that
differ only in finger flexion were assigned to the same class in the current study.
For converting transcriptions into class-based representations, four translation
tables were developed using Python (cf. Power 2020: 239–243, for the classes that are
defined for the current comparison).
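The class-based translation step can be pictured as a simple lookup. The following Python sketch is illustrative only: the symbol names and class labels are hypothetical placeholders rather than actual HamNoSys symbols or the classes defined in Power (2020), and the sketch does not use PySign.

```python
# Hypothetical translation table: transcription symbol -> class label.
# The symbol names are invented placeholders, not real HamNoSys symbols.
HANDSHAPE_CLASSES = {
    "flat_hand": "FLAT",
    "flat_hand_bent": "FLAT",      # differs from flat_hand only in finger flexion
    "fist": "FIST",
    "index_finger": "INDEX",
    "index_finger_bent": "INDEX",  # differs from index_finger only in finger flexion
}


def to_classes(symbols, table):
    """Translate a sequence of transcription symbols into a set of class labels.

    Symbols missing from the table are kept unchanged so that no
    information is silently dropped (an assumption of this sketch).
    """
    return {table.get(symbol, symbol) for symbol in symbols}


# Two handshapes that differ only in flexion fall into the same class.
sign_a = ["flat_hand"]
sign_b = ["flat_hand_bent"]
same_class = to_classes(sign_a, HANDSHAPE_CLASSES) == to_classes(sign_b, HANDSHAPE_CLASSES)
```

Under these assumed classes, the two example handshapes are treated as identical in any later comparison, which is the effect that the class definitions are meant to achieve.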
The third step in the methodology is to quantify the similarity of the class
representations. There are numerous methods that can be used to measure similarity; one of the most common of these is the Jaccard index, or similarity coefficient (Jaccard 1912). The Jaccard index provides a measure, for two sets A and
B, of the magnitude of the set intersection divided by the magnitude of the set
union, or J(A, B) = |A ∩ B| / |A ∪ B|, resulting in a score between 0 and 1, where 1 is
equivalent to complete similarity. In the current comparison, each pairwise comparison of signs was first broken down into five independent, pairwise comparisons of sign features – in particular, of handshape, location, movement, and
symmetry features, and of the number of hands used to articulate a sign. The
Jaccard index measured the intersection of class symbols in each of these five
comparisons divided by the union of the class symbols, resulting in five scores
that were weighted (handshape=0.3, location=0.3, movement=0.3, symmetry=0.05, number of hands=0.05) to yield a preliminary overall score for the similarity of two signs. The weights used in the current study were loosely based on
theoretical differences in the phonological status of sign formational features – in
particular, of handshape, movement, and location features, which are thought
to constitute the primary formational parameters of a sign (cf. e.g., Stokoe et al.
1965; Sandler 2012). The methodology developed in Power (2020) allows for variable weightings of the formational features encoded in HamNoSys.
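The preliminary weighted score can be sketched in a few lines of Python. The weights are those reported above; the representation of a sign as a dictionary of feature sets, the class labels in the example, and the treatment of two empty feature sets as identical are assumptions made for illustration and are not taken from Power (2020).

```python
# Feature weights as reported in the text.
WEIGHTS = {
    "handshape": 0.3,
    "location": 0.3,
    "movement": 0.3,
    "symmetry": 0.05,
    "hands": 0.05,
}


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets of class symbols."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty feature sets count as identical
    return len(a & b) / len(a | b)


def sign_similarity(sign_a, sign_b, weights=WEIGHTS):
    """Weighted sum of the per-feature Jaccard scores of two signs."""
    return sum(
        weight * jaccard(sign_a[feature], sign_b[feature])
        for feature, weight in weights.items()
    )


# Hypothetical class representations of two signs that differ only in location.
sign_1 = {"handshape": {"INDEX"}, "location": {"NOSE"}, "movement": {"CONTACT"},
          "symmetry": set(), "hands": {"1"}}
sign_2 = {"handshape": {"INDEX"}, "location": {"CHIN"}, "movement": {"CONTACT"},
          "symmetry": set(), "hands": {"1"}}

score = sign_similarity(sign_1, sign_2)
```

In this example only the location comparison scores 0, so the two signs receive an overall similarity of approximately 0.7.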
The preliminary measure described above quantifies the similarity of two
signs; however, the measure is naïve to any tendencies of diachronic change in
signed languages (cf. e.g., Frishberg 1975). A similarity measure that is naïve in
this way will not differentiate between the following two cases: (i) two signs that
are phonologically different because they are not etymologically related, and (ii)
two signs that, although etymologically related, differ phonologically because of
historical change. In order to improve on the naïve similarity measure described
above, a model of sign change was developed that reflects our understanding
of the probability of certain diachronic changes in signed languages (cf. Power
2020: 43–54, 243–246). In the current comparison, that model was used to weight
the preliminary similarity measures such that differences between two signs that
are defined in the model as common diachronic changes were not scored as completely different. Instead, for any differences that are defined in the model, an
adjustment was made to the numerator of the Jaccard index calculation – that is,
the value of the intersection was increased based on the model, thereby increasing the overall similarity measurement of two signs. Similarity measures were
converted to distance measures by subtracting from 1. The preceding method for
calculating pairwise distances produced a distance matrix for each of the 50 concepts in the current comparison (cf. Power 2020: 89–92, for a detailed description of
the comparison methodology that has been sketched here).
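One way to picture the adjustment to the numerator, and the subsequent conversion to distances, is the following Python sketch. It is not the implementation in Power (2020): the change model, its partial-credit value, and the greedy pairing of unmatched symbols are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical model of common diachronic changes: an unordered pair of
# class symbols mapped to a partial-credit value between 0 and 1.
CHANGE_MODEL = {
    frozenset({"CHEEK", "CHIN"}): 0.5,
}


def adjusted_jaccard(a, b, model=CHANGE_MODEL):
    """Jaccard index whose numerator is increased for modelled changes."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty feature sets count as identical
    only_a, only_b = a - b, b - a
    credit = 0.0
    used = set()
    for x in sorted(only_a):
        # Pair x with the unused symbol of b that earns the most credit.
        best = max(
            (y for y in sorted(only_b) if y not in used),
            key=lambda y: model.get(frozenset({x, y}), 0.0),
            default=None,
        )
        if best is not None and model.get(frozenset({x, best}), 0.0) > 0:
            credit += model[frozenset({x, best})]
            used.add(best)
    return (len(a & b) + credit) / len(a | b)


def distance_matrix(items, similarity):
    """Symmetric matrix of distances, where distance = 1 - similarity."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = 1.0 - similarity(items[i], items[j])
    return matrix


# Hypothetical location classes of three signs for one concept.
locations = [{"CHEEK"}, {"CHIN"}, {"FOREHEAD"}]
matrix = distance_matrix(locations, adjusted_jaccard)
```

With this assumed model, the first two signs end up closer to each other than either is to the third, because the difference between their locations is treated as a plausible historical change rather than as a complete mismatch.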
In the fourth step of the methodology, 50 distance matrices (one matrix for
each concept), produced in the preceding step, were used to make inferences
about the etymological relationships of signs. The distance matrix for a concept
was used as input for a hierarchical clustering algorithm, such as UPGMA (Sokal
and Michener 1958). The clustering algorithm calculated a tree graph for each
concept, based on the relevant distance matrix. A distance threshold (0.5) was
defined to form clusters of signs, such that all signs belonging to a cluster had
distances in the relevant tree graph below the distance threshold (for an overview
of the use of clustering algorithms in historical linguistics, cf. List et al. 2018). All
signs within a cluster were inferred to share a common etymon. Because there
were varying numbers of responses for each concept, it was possible for a signer
to produce multiple signs that were inferred to share a common etymon. In such
cases (37 total), only one sign with a common etymon was included in the final
analysis for each signer.
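The clustering step can be illustrated with a small, dependency-free Python sketch of average-linkage (UPGMA) clustering that stops merging at the distance threshold. This is a simplified stand-in for the procedure described above, not the code used in the study, and the example distance matrix is invented.

```python
def upgma_clusters(matrix, threshold=0.5):
    """Group item indices by average-linkage (UPGMA) clustering.

    The closest pair of clusters is merged repeatedly until the smallest
    average distance between any two clusters reaches the threshold.
    """
    clusters = [[i] for i in range(len(matrix))]

    def average_distance(c1, c2):
        # Unweighted mean of all pairwise distances between two clusters.
        return sum(matrix[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        distance, a, b = min(
            (average_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if distance >= threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]


# Invented distances among three signs for a single concept.
example_matrix = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
]
clusters = upgma_clusters(example_matrix)
```

Here the first two signs fall into one cluster and would be inferred to share an etymon, while the third sign remains in a cluster of its own.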
In the final step of the comparison methodology, the clusters of signs inferred
in the previous step were used to calculate pairwise percentages of signs that
were inferred to share etymological origins. In the case of multiple responses for
one concept, the lowest number of responses in the pairwise comparison was
taken as the number of possible shared signs. For example, in the comparison of
signs for the concept ‘name’ in the responses of the first signer from Dushanbe,
and in Afghan Sign Language, the signer from Dushanbe produced one sign,
while two signs were available for Afghan Sign Language. The lowest number
of responses in this pairwise comparison was 1; hence, there was only one possible shared sign. In this example, the sign meaning ‘name’, produced by the
Dushanbe signer, was inferred to have etymological origins that are distinct from
either sign in Afghan Sign Language. Hence, the number of shared signs in this
comparison (the numerator) was 0, while the number of possible shared signs
(the denominator) was 1. Following this procedure, the percentage of signs in
each pairwise comparison that were inferred to share etymological origins was
calculated. This percentage represents the estimated proportion of vocabulary in the
lexicons of the two languages that shares etymological origins.
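The percentage calculation can be sketched as follows in Python. The data structure (a mapping from each concept to the set of inferred etymon labels) and the labels themselves are hypothetical; only the counting rule, in which the smaller number of responses sets the number of possible shared signs, follows the description above.

```python
def shared_percentage(etyma_a, etyma_b):
    """Percentage of signs inferred to share etymological origins.

    Each argument maps a concept to the set of etymon (cluster) labels
    attested for one signer or language.
    """
    shared = possible = 0
    for concept in etyma_a.keys() & etyma_b.keys():
        a, b = etyma_a[concept], etyma_b[concept]
        if not a or not b:
            continue  # assumption: skip concepts with no response on one side
        possible += min(len(a), len(b))  # denominator: possible shared signs
        shared += len(a & b)             # numerator: etyma attested on both sides
    return 100.0 * shared / possible if possible else 0.0


# Hypothetical cluster labels. As in the 'name' example above, one Tajik
# response is compared with two Afghan responses and none is shared.
tajik_signer = {"name": {"name-1"}, "nose": {"nose-1"}}
afghan = {"name": {"name-2", "name-3"}, "nose": {"nose-1"}}

percentage = shared_percentage(tajik_signer, afghan)
```

For 'name' the numerator is 0 and the denominator is 1; for 'nose' both are 1, so the two invented concepts together yield 1 shared sign out of 2 possible.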

3.2.2 Results
Figure 2 shows the aggregated results of the pairwise comparisons of Tajik signers
with Afghan Sign Language (leftmost bar), of Tajik signers with Russian signers
(center bar), and of Russian signers with Afghan Sign Language (rightmost bar).
Based on the comparison methodology outlined in the previous section, on
average 79.7% (N comparisons=8, SD=3.77) of the basic vocabulary produced
by the Tajik and Russian signers share etymological origins. In contrast to the
relatively high percentage in the comparison of Tajik and Russian signers, both
comparisons with Afghan Sign Language yielded lower percentages of signs with
shared origins: on average 37.9% (N comparisons=4, SD=3) of signs in the comparison of Tajik signers with Afghan Sign Language are inferred to share etymological origins, and an average of 36.5% (N comparisons=2, SD=0.1) of signs in the
comparison of Russian signers with Afghan Sign Language are inferred to share
common origins. As a baseline (not included in the figure), we can compare the

Figure 2: Percentage of signs with shared etymological origins, inferred following the
methodology described in Section 3.2.1. Error bars represent one standard deviation
above and below the mean. SL=Sign Language.

5 Tajik Sign Language in context

261

Tajik signers with each other – and the Russian signers with each other. When
comparing the four Tajik signers, an average of approximately 87.6% (N comparisons=6, SD=3.9) of the basic vocabulary produced by these Tajik signers shares
etymological origins; and approximately 78.6% of the basic vocabulary produced
by the two Russian signers shares common origins.
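
The numbers of comparisons reported above follow from the number of sources in each group. The sketch below enumerates the pairs, assuming four Tajik signers, two Russian signers, and a single source for Afghan Sign Language; the last assumption is inferred from the reported N values, and the signer labels are placeholders.

```python
from itertools import combinations, product

# Placeholder labels; only the group sizes matter here.
tajik = ["T1", "T2", "T3", "T4"]
russian = ["R1", "R2"]
afghan = ["AfSL"]  # assumed: one source for Afghan Sign Language


def between(group_a, group_b):
    """All pairings of one member from each group."""
    return list(product(group_a, group_b))


def within(group):
    """All unordered pairs inside a single group."""
    return list(combinations(group, 2))


counts = {
    "Tajik vs Russian": len(between(tajik, russian)),
    "Tajik vs Afghan SL": len(between(tajik, afghan)),
    "Russian vs Afghan SL": len(between(russian, afghan)),
    "Tajik vs Tajik": len(within(tajik)),
    "Russian vs Russian": len(within(russian)),
}
```

Under these assumptions the two Russian signers yield a single within-group comparison, which is consistent with no standard deviation being reported for the 78.6% figure.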
The average percentages in the two comparisons that include Afghan Sign
Language in Figure 2 differed significantly from the average percentage in the
comparison of Tajik and Russian signers according to Welch’s t-test: for the first
comparison (Tajik Sign Language and Afghan Sign Language versus Tajik Sign
Language and Russian Sign Language), t(7) = -20.7, p < 0.001; and for the second
comparison (Tajik Sign Language and Russian Sign Language versus Russian Sign
Language and Afghan Sign Language), t(7) = -32.4, p < 0.001. That the two comparisons including Afghan Sign Language (i.e., Afghan Sign Language compared
with Tajik Sign Language versus Russian Sign Language) show similar results is
likely due to the similarity between Tajik and Russian Sign Language. That is,
because basic vocabulary among the Tajik and Russian signers is so similar, the
results of their separate comparisons with Afghan Sign Language are inevitably
also similar.
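
Welch's t statistic and its approximate degrees of freedom can be recomputed from the summary statistics reported above. The following is a sketch under two assumptions: that the reported SDs are sample standard deviations, and that the rounded means and SDs are precise enough for a rough check. The `Summary` type and `welch_t` function are illustrative names, not part of any analysis code cited in the chapter.

```python
import math
from typing import NamedTuple, Tuple


class Summary(NamedTuple):
    mean: float
    sd: float  # assumed to be the sample standard deviation
    n: int     # number of pairwise comparisons


def welch_t(a: Summary, b: Summary) -> Tuple[float, float]:
    """Welch's t and Welch-Satterthwaite degrees of freedom."""
    var_a = a.sd ** 2 / a.n
    var_b = b.sd ** 2 / b.n
    t = (a.mean - b.mean) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (a.n - 1) + var_b ** 2 / (b.n - 1)
    )
    return t, df


# Rounded values as reported in the text.
tajik_russian = Summary(mean=79.7, sd=3.77, n=8)
tajik_afghan = Summary(mean=37.9, sd=3.0, n=4)
russian_afghan = Summary(mean=36.5, sd=0.1, n=2)

t1, df1 = welch_t(tajik_afghan, tajik_russian)
t2, df2 = welch_t(russian_afghan, tajik_russian)
```

With these rounded inputs the sketch gives roughly t = -20.8 with about 7.6 degrees of freedom for the first comparison and roughly t = -32.4 with about 7.0 degrees of freedom for the second, close to the reported t(7) values. The small discrepancy in the first case is what rounding of the published means and SDs would produce.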
The relatively high percentages in the comparison of Tajik and Russian
signers are expected because of the history of deaf education in Tajikistan outlined in Section 2. However, the percentages in the other two comparisons – that
is, in the comparisons of Tajik and Russian signers with Afghan Sign Language –
may be higher than expected, given that the signing community in Afghanistan
has apparently had little interaction with the signing communities in Tajikistan
and Russia. What are the main contributors to the unexpectedly high percentages
of signs that are inferred to share etymological origins in the comparisons that
include Afghan Sign Language?
Figure 3 provides a breakdown of the results based on the basic vocabulary
categories outlined in Table 1 in Section 3.2.1. Two comparisons are shown in the
figure across these categories: Tajik signers are compared with Russian signers
(light gray bar) and the aggregated results of the Tajik and Russian signers are
compared with Afghan Sign Language (dark gray bar). Recall from Section 3.2.1
that the basic vocabulary concepts included in the comparison comprised 18
nouns, 12 adjectives/adverbs, 7 verbs, 5 color terms, 5 body part terms, 2 interrogatives, and 1 pronoun. As such, the five comparison categories in Figure 3
represent aggregations of differing numbers of comparisons – for example, more
signs with nominal meanings were compared than signs for body part terms. Because
there were only 2 interrogatives and 1 pronoun among the basic vocabulary concepts, these categories are not included in the figure below. In the comparison of
Tajik and Russian signers with Afghan Sign Language, none of the interrogatives
and pronouns were inferred to share common origins; when comparing the Tajik
and the Russian signers, 87.5% (14 out of 16) of the signs in these two categories
were inferred to share common origins. Thus, the results for these two categories,
although not included in the figure, are consistent with the overall results.

Figure 3: Breakdown of lexical similarity by basic vocabulary category. Error bars represent
one standard deviation above and below the mean. SL=Sign Language.

The results in Figure 3 show high percentages of shared vocabulary across
all categories in the comparison of Tajik Sign Language and Russian Sign Language. The average percentages in the noun (M=78.5, SD=6.3), adjective/adverb
(M=78.9, SD=6.3), and verb (M=78.3, SD=5.8) categories are all close to 80%; while
all signs produced for color terms (i.e., M=100) by Tajik and Russian signers
were inferred to be etymologically related. All of the average percentages in
these four categories are above the average percentage in the body part category
(M=66.7, SD=16.9). Recall that body part terms, which are represented in many
signed languages using indexical signs, have been excluded from some historical comparisons to avoid inflating the number of signs that apparently share
historical relations (cf. Woodward 2011). In the current comparison, the average
percentage of signs inferred to share common origins among Tajik and Russian
signers is lowest in the body part category.
In contrast to the consistently high percentages across all categories in
the comparison of Tajik and Russian signers, the percentages of signs inferred
to share common origins differ sharply across categories in the comparison
that includes Afghan Sign Language. For example, the percentages of signs
with putative common origins in the noun (M=23.1, SD=8.9) and color term
(M=22.2, SD=21.4) categories are relatively low compared with the percentages in
the verb (M=74.1, SD=14.3) and body part (M=65.8, SD=9.2) categories. The latter
two categories show percentages of shared vocabulary that are similar to the percentages in the comparison of Tajik and Russian signers. Hence, in the current
comparison, the noun and color terms categories most clearly differentiate the
two sets of languages – that is, the set of languages that is expected to be historically related based on the extralinguistic history (Tajik Sign Language and
Russian Sign Language; cf. Sec. 2), and the group that is not expected to be historically related (any group that includes Afghan Sign Language; cf. 3.1); whereas
the verb and body part categories do not clearly differentiate these two groups.
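
One way to make this contrast concrete is to compute, for each category, the gap between the mean percentage in the Tajik–Russian comparison and the mean percentage in the comparison that includes Afghan Sign Language. The sketch below uses only the means quoted in the text. No adjective/adverb mean is quoted for the comparison with Afghan Sign Language, so that category is left out. The variable names are illustrative.

```python
# Mean percentages quoted in the text for the Tajik-Russian comparison.
tajik_russian = {
    "noun": 78.5,
    "adjective/adverb": 78.9,
    "verb": 78.3,
    "color term": 100.0,
    "body part": 66.7,
}

# Means quoted for the comparison that includes Afghan Sign Language.
with_afghan = {
    "noun": 23.1,
    "verb": 74.1,
    "color term": 22.2,
    "body part": 65.8,
}

# Gap in percentage points for every category with a value in both sets.
gaps = {
    category: round(tajik_russian[category] - with_afghan[category], 1)
    for category in with_afghan
}

# Categories ordered from most to least differentiating.
ranked = sorted(gaps, key=gaps.get, reverse=True)
```

On these figures the color term and noun categories show gaps of more than 50 percentage points, whereas the verb and body part categories differ by fewer than 5 points.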
The results reported in Figure 3 pertaining to the noun category may differ
from results in a previous study. Parkhurst and Parkhurst (2003) used three
concept lists – one with basic vocabulary, one with vocabulary hypothesized
to be non-iconic, and one with only nouns – in two separate comparisons that
included one set of putatively related signed languages and another set of putatively unrelated signed languages. They found that, when comparing results
across concept lists, their results based on the list of nouns were most similar for
the putatively related and unrelated signed languages; that is, the list of nouns
did not differentiate the two sets of languages as well as the other lists based
on the authors’ expectations about the historical relatedness of the languages in
their study. In contrast, in the current comparison, the results in the noun category clearly differentiate Tajik Sign Language and Russian Sign Language from
Afghan Sign Language in agreement with our expectations of the relationships
among these languages, based on the extralinguistic background.
Signs inferred to share etymological origins in the noun and color term categories
Which signs in particular were inferred to share common origins in the two categories that best differentiated the languages in the current comparison based
on our expectations – that is, in the noun and color term categories? In the noun
category, signs representing 8 concepts (‘person’, ‘sand’, ‘sea’, ‘moon’, ‘night’,
‘pig’, ‘rain’, ‘river’) are shared by at least one Tajik or Russian signer and Afghan
Sign Language. Of these 8 concepts, signs representing 5 concepts (‘moon’,
‘night’, ‘pig’, ‘rain’, ‘river’) are shared by at least one Tajik signer and Afghan Sign
Language, but not by either Russian signer; while signs representing the other
3 concepts (‘person’, ‘sand’, ‘sea’) are shared by at least one signer of all three
languages. Signs that are shared across Tajik Sign Language and Afghan Sign
Language, but not Russian Sign Language, may represent evidence of a shared
Central Asian signing tradition; while signs that are shared across all three languages may be evidence of an historical connection between Tajik Sign Language
and Afghan Sign Language via the shared roots, in French Sign Language, of
Russian Sign Language and American Sign Language.
Consider first the signs that are shared by at least one Tajik signer and Afghan
Sign Language, but not by either of the Russian signers. Recall (in Section 3.2.1)
that, currently, there is no available qualitative methodology by which we can
differentiate vocabulary that is phonologically similar due to shared etymological
origins from vocabulary that is similar due to, for example, chance or iconicity. Of
the 5 signs shared by at least one Tajik signer and Afghan Sign Language, 4 signs
have important phonological differences. Consider, for example, the signs representing the concept ‘moon’ in (3); the leftmost set of symbols in these transcriptions represents handshapes, the second set of symbols represents locations, and
the rightmost set of symbols represents movements. Two signers from Dushanbe
(cf. 3a and 3b) produced signs that include path movements and changes in aperture – that is, the sign’s initial handshape changes from an open to a closed position, or vice versa, while moving along a path. Signs meaning ‘moon’ that were
produced by the two Russian signers (not shown here) also include path movements and changes in aperture. In contrast, the sign meaning ‘moon’ in Afghan
Sign Language (cf. 3c) includes only a path movement and not a change in aperture. Hence, because of these differences in their movements, the inference that
these signs all share common etymological origins is principally due to the similarity of the signs’ handshapes and locations.
(3) a. ‘moon’ produced by one Tajik signer
    b. ‘moon’ produced by a second Tajik signer
    c. ‘moon’ in Afghan Sign Language

The signs in (3) all represent the crescent moon, and these signs all begin with
similar locations: in the ipsilateral side of neutral space at either chest height
(3a), or slightly above shoulder height (3b and 3c). However, the representations
differ in important ways. The handshapes produced by the two Tajik signers trace
the outline of a crescent moon, or part of a crescent moon (cf. 3b). The handshape
starts from one tip of the crescent moon with the fingertips of the index finger and
the thumb in contact with each other; the handshape then moves to the widest
part of the crescent by separating the index and thumb while executing a path
movement, downwards and to the side, in an arc; and the handshape continues to move to the moon’s second tip by closing the tips of the fingers together a
second time. In contrast, the handshape in Afghan Sign Language (cf. 3c) iconically represents the crescent moon by extending the index finger and thumb and
slightly flexing these fingers at the interphalangeal joints. In light of the phonological and iconic-semantic differences between the signs in (3), it seems likely
that the inference of etymological relatedness represents a false positive in this
set of signs; a similar conclusion is likely justified for 4 out of the 5 examples
cited above – that is, those examples representing potential evidence of a shared
Central Asian signing tradition.
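
The parameter-by-parameter reasoning applied to the signs in (3) can be illustrated with a small data structure. This is a hypothetical sketch: the `Sign` class and the plain-text parameter values paraphrase the prose descriptions above and do not reproduce the chapter's transcriptions.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Sign:
    label: str
    handshape: str
    location: str
    movement: FrozenSet[str]  # movement components, paraphrased from the text


def movement_difference(a: Sign, b: Sign) -> FrozenSet[str]:
    """Movement components present in one sign but not in the other."""
    return a.movement ^ b.movement


moon_3a = Sign(
    label="3a",
    handshape="index finger and thumb",
    location="ipsilateral neutral space, chest height",
    movement=frozenset({"path", "aperture change"}),
)
moon_3b = Sign(
    label="3b",
    handshape="index finger and thumb",
    location="ipsilateral neutral space, slightly above shoulder height",
    movement=frozenset({"path", "aperture change"}),
)
moon_3c = Sign(
    label="3c",
    handshape="index finger and thumb, extended and slightly flexed",
    location="ipsilateral neutral space, slightly above shoulder height",
    movement=frozenset({"path"}),
)

tajik_vs_afghan = movement_difference(moon_3a, moon_3c)
tajik_vs_tajik = movement_difference(moon_3a, moon_3b)
```

In this encoding the two Tajik signs have identical movement components, while each differs from the Afghan Sign Language sign by the change in aperture. This is the difference that makes the inference of relatedness depend on handshape and location alone.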
Consider next the set of signs (representing the concepts ‘person’, ‘sand’,
and ‘sea’) that are shared by at least one signer from all three languages. Recall
that this set of signs may be evidence of shared etymological histories for signs
in Tajik Sign Language and Afghan Sign Language that can be traced back, via
Russian Sign Language and American Sign Language, to those languages’
shared roots in French Sign Language. An etymological pathway of the type just
described appears plausible for the signs meaning ‘sand’ and ‘sea’, which are
represented by forms in both American Sign Language and French Sign Language
that are similar to the forms in the current comparison. For the sign meaning
‘person’, however, the form in American Sign Language differs from the forms in
the current comparison; hence, the etymological pathway described above seems
unlikely.
In the color term category, only 1 sign out of 5 is shared across Tajik and
Russian signers, as well as Afghan Sign Language. All of the signs meaning ‘red’
in the comparison were inferred to share one common etymon. The forms of these
signs include contact at the lips with the index finger. This form may be cross-linguistically common for signs meaning ‘red’: for example, the signs meaning ‘red’
in American Sign Language and French Sign Language share a similar form with
contact at the lips; but other, putatively unrelated signed languages, such as Japanese Sign Language and Turkish Sign Language, also have signs meaning ‘red’
that include contact at the lips with the index finger. Hence, while it is possible that the signs in the current comparison may share an etymological history
stretching back to French Sign Language, it is also plausible that the similarity in
the forms of the signs in the current comparison is due to an independent parallel development. That is, Afghan Sign Language may have developed a sign for
the concept ‘red’ that is similar to the sign meaning ‘red’ in Tajik Sign Language,
not because of a shared history, but instead because indexing the lips appears
to be a common strategy among signed languages for representing the concept
‘red’. None of the other Tajik and Russian signs that represent colors in the comparison (i.e., signs meaning ‘black’, ‘green’, ‘white’, and ‘yellow’) were inferred
to share common origins with signs in Afghan Sign Language. Thus, inferences
about the set of signs meaning ‘red’ may also represent a false positive.

3.3 Discussion
The results in the preceding section provide little evidence for a close historical
connection between Tajik Sign Language and Afghan Sign Language. Following
is a brief discussion of the results of the lexical comparison in light of two related
issues: (i) Tajik Sign Language and its relationships with other signed languages,
and (ii) the methodological issues, discussed at length above, pertaining to our
ability to understand the histories of signed languages. First, the results suggest
that close historical connections among signed languages may be reflected in
high percentages of shared vocabulary across multiple categories in the lexicon;
certainly, the results are consistently high across categories in the comparison of
Tajik and Russian signers. The overall percentage of shared vocabulary among
Tajik and Russian signers in the comparison is, on average, roughly 80%; and
the percentages of shared vocabulary across vocabulary categories are all above
roughly two-thirds. These results suggest that the Russian Sign Language lexicon
was apparently transmitted to Tajik signers as a whole – at least insofar as basic
vocabulary can be used to estimate the composition of the broader lexicon.
In contrast to the results of lexical comparisons of signed languages with
close historical connections, we may expect to find relatively low overall percentages of shared vocabulary in comparisons of signed languages with limited
or no historical connections. In the overall comparisons that included Afghan
Sign Language, the percentages (approx. 37% and 38%) of signs inferred to share
etymological origins are significantly lower than the overall percentage in the
comparison of Tajik and Russian signers (approx. 80%). Given the methodological problems for lexical comparisons of signed languages that were described in
Section 3.2.1, we should expect to find a nontrivial number of false positives in
lexical comparisons of any two signed languages (Guerra Currie et al. 2002; Woll
1983); this observation pertains to the current comparison as well. Thus, the exact
percentages reported here (i.e., 80% similarity of Tajik and Russian signers, and
approx. 37% similarity in comparisons between those signers and Afghan Sign
Language) should be interpreted with caution.
In addition to the lower overall percentages of shared vocabulary, we may
expect to find variable percentages of shared vocabulary across vocabulary categories in comparisons of signed languages that do not have close historical connections. Even though in the comparison including Afghan Sign Language the
percentages of shared vocabulary are high in the body part term (approx. 66%)
and verb categories (approx. 74%), the percentages are much lower in other categories. These variable results are consistent with the assumption that iconic
and indexical representations are not equally distributed throughout the lexicon
(cf. Woodward 1978; Parkhurst and Parkhurst 2003). That is, for signed languages
that do not share a close historical connection, we may expect to find relatively
high percentages of shared vocabulary in some vocabulary categories; however,
we should not expect to find high percentages across all vocabulary categories.
In the comparison with Afghan Sign Language, the relatively high percentage of
shared signs representing body part terms is unsurprising, given that the forms
of many of these signs are indexical; it is also expected that the percentage of
shared signs in the body part category may be among the highest percentages
of all categories in the comparison. In contrast, in the comparison of Tajik and
Russian signers, the average percentage of shared signs in the body part category
is lowest among the vocabulary categories. This result would be unexpected in a
comparison of signed languages that do not share a close historical connection.
In sum, when teachers of the deaf imported a variety of Russian Sign Language into Tajikistan, these teachers, apparently successfully, transmitted much
of the Russian Sign Language lexicon to deaf Tajiks: perhaps some 80% of the
basic vocabulary in that lexicon according to the current comparison. Although
not all signs in contemporary Tajik Sign Language have their origins in Russian
Sign Language (cf. Power 2020: 117–122), the lexical similarity of Tajik and Russian
signers, as measured by the methods presented above, is nearly identical to the
lexical similarity measured across the two Russian signers (on average, 79.7%
versus 78.6%). In contrast, despite the unexpectedly high percentage of vocabulary with shared etymological origins in the comparison of Tajik Sign Language
and Afghan Sign Language, the evidence does not convincingly show that
these shared signs represent a shared Central Asian signing tradition. Instead,
many of these signs are likely to be false positives that are indicative of intractable methodological issues in lexical comparisons of signed languages. Some of
these shared signs may indeed reflect a deeper shared history that can be traced
back via the histories of Russian Sign Language and American Sign Language to
French Sign Language. The divergent histories of the deaf education systems in
the two countries are likely the most important factors in accounting for the linguistic boundary between Tajik Sign Language and Afghan Sign Language.

4 Conclusion
Although Tajik Sign Language represents just one of the many languages contributing to Tajikistan’s linguistic diversity, this chapter has argued that the signed
language is unique among these languages. At less than a century old, Tajik Sign
Language is much younger than Tajik and all other spoken languages in Tajikistan. Because of its relative youth, it has proven possible to infer many details
surrounding the language’s emergence based on features of the early signing
community at the Leninsky School for the deaf; it was, principally, at that school
that a variety of Russian Sign Language was imported to Tajikistan by a group of
hearing teachers of the deaf. Section 2 described the history of deaf education in
Tajikistan and explored how Russian Sign Language was first transmitted from
teachers to students; and how, later, the language that evolved in Tajikistan was
transmitted among deaf Tajiks. As Section 2 has shown, the primary mode by
which Tajik Sign Language has been transmitted generationally has not been the
vertical language transmission of parents to children. Instead, as in other macro-signing communities around the world, Tajik Sign Language has been transmitted generationally in a variety of ways: obliquely from teachers to students
at schools for the deaf, horizontally among deaf peers, and vertically from older
students to younger students at these schools – and also vertically among deaf
family members, including from parents to their children as well as from older
siblings to younger siblings. In the variety of ways that the signed language has
been transmitted across generations, Tajik Sign Language is likely unique among
languages in Tajikistan.
Tajik Sign Language is also unique in the modality in which the language is
signed and perceived. Whereas Tajik Sign Language is articulated and perceived
in the gestural-visual modality, Tajik and all other spoken languages in Tajikistan are articulated and perceived in the oral-aural modality. This difference
in modality is not merely superficial: Section 3 highlighted several important
consequences that this modality difference has for language change and for theoretical investigations of the histories of signed languages; in particular, how
methods for investigating these histories are currently limited due to features
of signed languages that differ from the features of typical spoken languages,
such as the prevalence of iconic and indexical representations throughout
signed lexicons.
Despite these methodological challenges, the lexical comparison in Section 3
showed that the historical connection among the deaf education systems in
Russia and Tajikistan has had measurable linguistic effects, namely, close similarity in the basic vocabularies of Tajik Sign Language and Russian Sign Language. In contrast, Section 3 offered much weaker support for the claim that
any signs in Tajik Sign Language and Afghan Sign Language share
etymological histories. Based on the results in Section 3, Tajik Sign Language
belongs squarely in the complex of signed language relationships with roots in
Russian Sign Language. Thus, although Tajik Sign Language is a unique language
in its geographic context – that is, in Tajikistan – it may share lexical and grammatical features with several other signed languages that developed via schools
for the deaf in other parts of the Soviet Union.

References
Abramov, Igor. 1993. History of the deaf in Russia: Myths and realities. In Renate Fischer &
Harlan Lane (eds.), Looking back: A reader on the history of deaf communities and their
sign languages, 199–206. Hamburg: Signum-Verlag.
Al-Fityani, Kinda. 2010. Deaf people, modernity, and a contentious effort to unify Arab sign
languages. San Diego, CA: University of California, San Diego dissertation.
Al-Fityani, Kinda & Carol A. Padden. 2010. Sign language geography in the Arab world. In Diane
Brentari (ed.), Sign Languages, 433–450. Cambridge & New York: Cambridge University
Press.
Amstutz, J. Bruce. 1986. Afghanistan: The first five years of Soviet occupation. Washington, DC:
National Defense University Press.
Anderson, Barbara A., Brian D. Silver, & Victoria A. Velkoff. 1987. Education of the handicapped
in the USSR: Exploration of the statistical picture. Soviet Studies 39(3). 468–488.
Atkinson, Quentin D. 2011. Phonemic diversity supports a serial founder effect model of
language expansion from Africa. Science 332(6027). 346–349.
Becker, Claudia & Hannah Eichmann. 2013. Strategies to vitalize Afghan Sign Language and
to support vocational education for deaf students in Afghanistan. Berlin: Humboldt
University.
Bittles, Alan H. 2001. Consanguinity and its relevance to clinical genetics. Clinical Genetics
60(2). 89–98.
Bouckaert, Remco, Philippe Lemey, Michael Dunn, Simon J. Greenhill, Alexander
V. Alekseyenko, Alexei J. Drummond, Russell D. Gray, Marc A. Suchard, & Quentin D.
Atkinson. 2012. Mapping the origins and expansion of the Indo-European language
family. Science 337(6097). 957–960.
Burch, Susan. 2000. Transcending revolutions: The tsars, the Soviets and Deaf culture. Journal
of Social History 34(2). 393–401.
Cavalli-Sforza, Luigi L. & Marcus W. Feldman. 1981. Cultural transmission and evolution: A
quantitative approach. Princeton, NJ: Princeton University Press.
Chafe, Wallace L. 1980. The pear stories: Cognitive, cultural, and linguistic aspects of narrative
production. Norwood, NJ: Ablex.
Coppola, Marie. 2020. Gestures, homesign, sign language: Cultural and social factors
driving lexical conventionalization. In Olivier Le Guen, Josefina Safar, & Marie Coppola
(eds.), Emerging sign languages of the Americas, 349–390. Boston, Berlin & Nijmegen,
The Netherlands: De Gruyter Mouton & Ishara Press.
Csapo, Marg. 1984. Special education in the USSR: Trends and accomplishments. Remedial and
Special Education 5(2). 5–15.
Chen Pichler, Deborah & Helen Koulidobrova. 2016. Acquisition of sign language as a second
language. In Marc Marschark & Patricia E. Spencer (eds.), The Oxford handbook of deaf
studies in language, 218–230. Oxford & New York: Oxford University Press.
Dolgopolsky, Aaron B. 1964. A probabilistic hypothesis concerning the oldest relationships
among the language families in Northern Eurasia. In Vitalij V. Shevoroshkin & T. L. Markey
(eds.), Typology, relationship and time: A collection of papers on language change and
relationship by Soviet linguists, 27–50. Ann Arbor: Karoma Publisher.
Eberhard, David M., Gary F. Simons, & Charles D. Fennig (eds.). 2021. Ethnologue: Languages of
the world. Dallas, TX: SIL International. http://www.ethnologue.com (accessed 14 July 2021).
Eriksson, Per. 1998. The history of deaf people: A source book. Örebro, Sweden: Daufr.
Erting, Carol J. 1988. Acquiring linguistic and social identity: Interactions of deaf children
with a hearing teacher and a deaf adult. In Michael Strong (ed.), Language learning and
deafness, 192–219. Cambridge & New York: Cambridge University Press.
Evans, Peter, Diane Richler, Serge Ebersold, Mihaylo Milovanovitch, Denise Rosa, & Eluned
Roberts-Schweitzer. 2009. Reviews of national policies for education: Kazakhstan, Kyrgyz
Republic and Tajikistan. Paris: OECD Publishing.
Ewans, Martin. 2002. Afghanistan: A new history. 2nd edn. London & New York: Routledge
Curzon.
Fenlon, Jordan & Erin Wilkinson. 2015. Sign languages in the world. In Adam C. Schembri &
Ceil Lucas (eds.), Sociolinguistics and Deaf communities, 5–28. Cambridge & New York:
Cambridge University Press.
Fischer, Susan D. 2015. Sign languages in their historical contexts. In Claire Bowern & Bethwyn
Evans (eds.), The Routledge handbook of historical linguistics, 442–465. London &
New York: Routledge.
German, Austin. 2021. The emergence of segmentation in Zinacantec Family Homesign.
Presentation at the 20th meeting of the Texas Linguistics Society, University of Texas at
Austin, 5 March.
Ghufran, Nasreen. 2008. Afghans in Pakistan: A ‘protracted refugee situation’. Policy
Perspectives 5(2). 117–129.
Goldin-Meadow, Susan & Carolyn Mylander. 1990. Beyond the input given: The child’s role in
the acquisition of language. Language 66(2). 323–355.
Grenoble, Lenore. 1992. An overview of Russian Sign Language. Sign Language Studies 77.
321–338.
Guerra Currie, Anne-Marie P., Richard P. Meier, & Keith Walters. 2002. A crosslinguistic
examination of the lexicons of four signed languages. In Richard P. Meier, Kearsy Cormier,
& David Quinto-Pozos (eds.), Modality and structure in signed and spoken languages, 224–236.
Cambridge & New York: Cambridge University Press.
Hale, Mark. 2015. The comparative method: Theoretical issues. In Claire Bowern & Bethwyn
Evans (eds.), The Routledge handbook of historical linguistics, 146–160. London & New
York: Routledge.
Hammarström, Harald, Robert Forkel, Martin Haspelmath, & Sebastian Bank (eds.).
2021. Glottolog 4.4. Leipzig: Max Planck Institute for Evolutionary Anthropology.
http://glottolog.org (accessed 14 July 2021).
Hanke, Thomas. 2004. Hamnosys – representing sign language data in language resources
and language processing contexts. LREC 4. 1–6. Paris: European Language Resource
Association.
Holman, Eric W., Søren Wichmann, Cecil H. Brown, Viveka Velupillai, André Müller, & Dik Bakker.
2008. Explorations in automated language classification. Folia Linguistica 42(3–4).
331–354.
Jaccard, Paul. 1912. The distribution of the flora in the Alpine zone. New Phytologist 11(2).
37–50.
Joseph, Brian D. 1987. On the use of iconic elements in etymological investigation: Some case
studies from Greek. Diachronica 4(1–2). 1–26.
Kuvvatov, Sattor D. & Zikriyo P. Rahmonov. 2015. Имову ишора [Gesture]. Dushanbe, Tajikistan:
Maorif.
Labov, William. 1981. Resolving the Neogrammarian controversy. Language 57(2). 267–308.
Labov, William. 2020. The regularity of regular sound change. Language 96(1). 42–59.
Lane, Harlan. 1984. When the mind hears: A history of the deaf. New York: Random House.
Lanesman, Sara & Irit Meir. 2012. The survival of Algerian Jewish Sign Language alongside
Israeli Sign Language in Israel. In Ulrike Zeshan & Connie de Vos (eds.), Sign languages in
village communities: Anthropological and linguistic insights, 153–180. Boston, Berlin &
Nijmegen, The Netherlands: De Gruyter Mouton & Ishara Press.
List, Johann-Mattis. 2014. Sequence comparison in historical linguistics. Dissertations in
language and cognition. Düsseldorf: Düsseldorf University Press.
List, Johann-Mattis. 2016. Beyond cognacy: Historical relations between words and their
implication for phylogenetic reconstruction. Journal of Language Evolution 1(2). 119–136.
List, Johann-Mattis, Mary Walworth, Simon J. Greenhill, Tiago Tresoldi, & Robert Forkel. 2018.
Sequence comparison in computational historical linguistics. Journal of Language
Evolution 3(2). 130–144.
Lycett, Stephen J. & John A. J. Gowlett. 2008. On questions surrounding the Acheulean
‘tradition’. World Archaeology 40(3). 295–315.
Malkiel, Yakov. 1994. Regular sound development, phonosymbolic orchestration,
disambiguation of homonyms. In John J. Ohala, Leanna Hinton, & Johanna Nichols
(eds.), Sound symbolism, 207–221. Cambridge & New York: Cambridge University Press.
Mitchell, Ross E. 2006. How many deaf people are there in the United States? Estimates from
the survey of income and program participation. The Journal of Deaf Studies and Deaf
Education 11(1). 112–119.
Mitchell, Ross E. & Michael A. Karchmer. 2004. Chasing the mythical ten percent: Parental
hearing status of deaf and hard of hearing students in the United States. Sign Language
Studies 4(2). 138–163.
Mufwene, Salikoko S. 1996. The founder principle in creole genesis. Diachronica 13(1). 83–134.
Mufwene, Salikoko S. 2001. The ecology of language evolution. Cambridge & New York:
Cambridge University Press.
Nichols, Johanna. 1996. The comparative method as heuristic. In Mark Durie & Malcolm Ross
(eds.), The comparative method reviewed: Regularity and irregularity in language change,
39–71. Oxford & New York: Oxford University Press.
Nyst, Victoria. 2012. Shared sign languages. In Roland Pfau, Markus Steinbach, & Bencie Woll
(eds.), Sign language: An international handbook, 552–574. Berlin: Mouton de Gruyter.
Padden, Carol A. 2011. Sign language geography. In Gaurav Mathur & Donna Jo Napoli
(eds.), Deaf around the world: The impact of language, 19–37. Oxford & New York: Oxford
University Press.
Padden, Carol A. & Tom Humphries. 2006. Deaf people: A different center. In Lennard J. Davis
(ed.), The disability studies reader, 331–338. 2nd edn. London & New York: Routledge.
Parkhurst, Stephen & Dianne Parkhurst. 2003. Lexical comparisons of signed languages and
the effects of iconicity. Work Papers of the Summer Institute of Linguistics, University of
North Dakota Session 47. 1–17.
Parks, Jason. 2011. Sign language word list comparisons: Toward a replicable coding and
scoring methodology. Grand Forks, ND: University of North Dakota MA thesis.
Perniss, Pamela, Robin L. Thompson, & Gabriella Vigliocco. 2010. Iconicity as a general
property of language: Evidence from spoken and signed languages. Frontiers in
Psychology 1. 1–15.
Polich, Laura. 2001. Education of the deaf in Nicaragua. Journal of Deaf Studies and Deaf
Education 6(4). 315–326.


Power, Justin M. 2014. Handshapes in Afghan Sign Language. Grand Forks, ND: University of
North Dakota MA thesis.
Power, Justin M. 2020. The origins of Russian-Tajik Sign Language: Investigating the historical
sources and transmission of a signed language in Tajikistan. Austin, TX: University of
Texas at Austin dissertation.
Power, Justin M. 2022. Historical linguistics of signed languages: Progress and problems.
Frontiers in Psychology 13. 818753. doi: 10.3389/fpsyg.2022.818753
Power, Justin M. & Johann-Mattis List. 2020. PySl: Python library for the manipulation of sign
language data. Available at: https://github.com/lingpy/pysign
Power, Justin M., Guido W. Grimm, & Johann-Mattis List. 2020. Evolutionary dynamics in the
dispersal of sign languages. Royal Society Open Science 7. 1–15.
Power, Justin M., David Quinto-Pozos, & Danny Law. 2019. Can the comparative method be used
for signed language historical analyses? Presentation at the 13th Conference on Theoretical
Issues in Sign Language Research, Universität Hamburg, 26–28 September.
Power, Justin M. & Richard P. Meier. In submission. The early signing community in Hartford: A
quantitative view of its demographics and linguistic ecology from 1817–1867.
Pursglove, Michael & Anna Komarova. 2003. The changing world of the Russian Deaf
community. In Leila Monaghan, Constanze Schmaling, Karen Nakamura, & Graham H.
Turner (eds.), Many ways to be deaf: International variation in deaf communities, 249–259.
Washington, DC: Gallaudet University Press.
Rankin, Robert L. 2003. The comparative method. In Brian D. Joseph & Richard D. Janda
(eds.), The handbook of historical linguistics, 183–212. Malden, MA: Blackwell Publishing.
Reilly, Charles B. & Nipapon W. Reilly. 2005. The rising of lotus flowers: Self-education by deaf
children in Thai boarding schools. Washington DC: Gallaudet University Press.
Saify, Khyber & Mostafa Saadat. 2012. Consanguineous marriages in Afghanistan. Journal of
Biosocial Science 44(1). 73–81.
Sandler, Wendy. 2012. The phonological organization of sign languages. Language and
Linguistics Compass 6(3). 162–182.
Schembri, Adam C. 2010. Documenting sign languages. In Peter K. Austin (ed.), Language
documentation and description. Vol. 7, 105–143. London: SOAS.
Schick, Brenda. 2003. The development of American Sign Language and manually coded
English systems. In Marc Marschark & Patricia E. Spencer (eds.), Deaf studies, language,
and education, 219–231. New York: Oxford University Press.
Senghas, Ann & Marie Coppola. 2001. Children creating language: How Nicaraguan Sign
Language acquired a spatial grammar. Psychological Science 12(4). 323–328.
Senghas, Ann, Sotaro Kita, & Aslı Özyürek. 2004. Children creating core properties of language:
Evidence from an emerging sign language in Nicaragua. Science 305(5691). 1779–1782.
Senghas, Richard J., Ann Senghas, & Jennie E. Pyers. 2005. The emergence of Nicaraguan Sign
Language: Questions of development, acquisition, and evolution. In Sue T. Parker, Jonas
Langer & Constance Milbrath (eds.), Biology and knowledge revisited: From neurogenesis
to psychogenesis, 287–306. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Serve’s Hearing Impaired Project. 1995. Afghan Sign Language: Book one. Jalalabad,
Afghanistan: Serve.
Shaw, Claire L. 2017. Deaf in the USSR: Marginality, community, and Soviet identity, 1917–1991.
Ithaca & London: Cornell University Press.


Singleton, Jenny L. & Dianne D. Morgan. 2006. Natural signed language acquisition within the
social context of the classroom. In Brenda Schick, Marc Marschark, & Patricia E. Spencer
(eds.), Advances in sign language development by deaf children, 344–373. Oxford & New
York: Oxford University Press.
Singleton, Jenny L. & Richard P. Meier. 2021. Sign language acquisition in context. In Charlotte
Enns, Jonathan Henner & Lynn McQuarie (eds.), Discussing bilingualism in deaf children:
Essays in honor of Robert Hoffmeister, 17–34. New York: Routledge.
Sokal, Robert R. & Charles D. Michener. 1958. A statistical method for evaluating systematic
relationships. University of Kansas Scientific Bulletin 38. 1409–1438.
Stokoe, William C., Dorothy C. Casterline & Carl G. Croneberg. 1965. A dictionary of American
Sign Language on linguistic principles. Washington, DC: Gallaudet College Press.
Templeton, Alan R. 1980. The theory of speciation via the founder effect. Genetics 94(4).
1011–1038.
Williams, Howard G. & Polina Fyodorova. 1993. The origins of the St. Petersburg Institute for the
Deaf. In Renate Fischer & Harlan Lane (eds.), Looking back: A reader on the history of deaf
communities and their sign languages, 295–306. Hamburg: Signum-Verlag.
Woll, Bencie. 1983. The comparative study of different sign languages: Preliminary analyses.
In Filip Loncke, Penny Boyes-Braem, & Yvan Lebrun (eds.), Recent research on European
sign languages, 79–91. Lisse, the Netherlands: Swets and Zeitlinger.
Woll, Bencie, Rachel Sutton-Spence, & Frances Elton. 2001. Multilingualism: The global
approach to sign languages. In Ceil Lucas (ed.), The sociolinguistics of sign languages,
8–32. Cambridge & New York: Cambridge University Press.
Wood, Heather, David Wood, & Marian Kingsmill. 1991. Signed English in the classroom, II.
Structural and pragmatic aspects of teachers’ speech and sign. First Language 11(33).
301–325.
Woodward, James. 1978. Historical bases of American Sign Language. In Patricia Siple
(ed.), Understanding language through sign language research, 333–348. New York:
Academic Press, Inc.
Woodward, James. 2011. Some observations on research methodology in lexicostatistical
studies of sign languages. In Gaurav Mathur & Donna Jo Napoli (eds.), Deaf around the
world: The impact of language, 38–53. Oxford & New York: Oxford University Press.
Yoel, Judith. 2007. Evidence for first-language attrition of Russian Sign Language among
immigrants to Israel. In David Quinto-Pozos (ed.), Sign languages in contact, 153–191.
Washington, DC: Gallaudet University Press.
Zeshan, Ulrike & Connie de Vos (eds.). 2012. Sign languages in village communities:
Anthropological and linguistic insights. Boston, Berlin, & Nijmegen, The Netherlands:
De Gruyter Mouton & Ishara Press.

Leyli R. Dodykhudoeva

6 Tajik dialects of Badakhshan and
Shughnani: A comparative perspective
Abstract: The chapter analyses the interaction of modern Tajik and its dialects
with a group of so-called Pamir languages in the Mountainous Badakhshan
Autonomous Region of Tajikistan, where the population speaks various Iranian
vernaculars of Eastern and Western Iranian origin. In the early medieval period
this area was strongly influenced by the Persian language – the language of
administration, culture and science. Tajik is closely related to Persian; it evolved
along with modern Persian and Afghan Dari through the shared cultural background and language of the classical literature of the 10th–15th centuries. During
the second millennium, the Persian-Tajik language attained a high status
throughout Central Asia, and by the early 20th century it had become the official state
language of the Republic of Tajikistan. In modern Tajikistan the Western Iranian
Tajik language – the state language – is present in written and oral forms; various
heritage Eastern Iranian languages, the Yaghnobi language as well as the group
of Pamir languages are spoken there, none of which has a written tradition.
This study provides an overview of the sociolinguistic situation in the Mountainous Badakhshan Autonomous Region. It discusses the areal stratification
of the continuum of Tajik dialects and the interaction of these dialects with the
group of Pamir languages, in particular Shughnani. The research further highlights issues arising from these contacts, in particular concerning vocabulary and
word formation.

1 Introduction
This work analyses the interaction of modern Tajik – and some Tajik dialects –
with the Shughnani language1 in the Mountainous Badakhshan Autonomous
Region of Tajikistan, where the population speaks various Iranian vernaculars.
1 In this research, instead of the widely used form Shughni, we have adopted the form Shughnani,
as used by the people of Shughnan, who derive it from the name of their area. This form
derives from Tajik Šuǧnonī, from the Tajik toponym Shughnon (Šuǧnon); the local name of this
language is Sh Xuɣnʊni, Xuɣnʊn ziv (see TRS 2006; Karamšoev 1999 and the official site of the
Committee on Languages and Terminology under the aegis of the Government of the Republic of
Tajikistan – Kumitai zabon va istilohoti nazdi hukumati Jumhurii Tojikiston).
https://doi.org/10.1515/9783110622799-006


We know from historical and archaeological data (Gafurov 1989: 13–53, 54–82)
that until the medieval period this territory was inhabited by people speaking
Sogdian, Bactrian, and various other Eastern Iranian vernaculars. Later, the area
was strongly influenced by the Persian language – the language of administration, culture and science. Tajik is closely related to Persian; it evolved along with
modern Persian and Afghan Dari through the shared cultural background and language of the classical literature of the 10th–15th centuries (Rubinčik 1987: 115–116;
for more detail concerning the consolidation of modern Tajik and its delineation
from Persian and Dari, see Efimov et al. 1982: 5–12). During the second millennium, the Persian-Tajik language achieved a high and stable status throughout
Central Asia, and by the early 20th century it had become the official language of a
state, Tajikistan. Henceforth, this work will employ the term Persian-Tajik for the
period before the Soviet era.
Today in Tajikistan we encounter the Western Iranian Tajik language – the
state language – in written and oral forms, as well as various heritage Eastern
Iranian languages, the Yaghnobi language and the group of so-called Pamir languages, none of which has a written tradition.
This study discusses the genetic and areal division of Iranian languages in
the Mountainous Badakhshan Autonomous Region of the Republic of Tajikistan
(MBAR, Tajik Viloyati Muxtori Kūhistoni Badaxšon).2 We examine the areal stratification of the continuum of Tajik dialects and their interaction with Pamir languages, in particular Shughnani. The research further highlights issues arising
from the mutual contacts between western and eastern groups of Iranian vernaculars, and their mutual influence. The focus will be specifically on the Tajik
dialects of the Mountainous Badakhshan area of Tajikistan – Darvoz, Vanj, and
Ishkashim districts (see Map 1, identifying the areas where the population speaks
Tajik dialects). These dialects are then compared with Pamir languages – mainly
with Shughnani, Yazghulami, extinct Old Vanji, Ishkashimi and Wakhi, and also
other members of the Shughnani-Rushani group. This study of Pamir languages
reveals not only new linguistic data, but also various significant linguistic features reflecting the historical evolution of the people who speak these languages.
It also includes a brief description of the sociolinguistic situation in the Mountainous Badakhshan area of Tajikistan.

2 Mountainous Badakhshan Autonomous Region, widely known as Gorno-Badakhshan.


This study contains a synchronic analysis of the vocabulary of the Iranian
vernaculars spread throughout Tajik Badakhshan. To describe language processes, the focus will be mainly on lexical and word formation levels. The study
provides an overview of the vocabularies of Southern (mainly Badakhshani) and
South-Eastern Tajik dialects, comparing them with the Shughnani language and
other Pamir languages, thus revealing their common features.

Map 1: Map of MBAR indicating Iranian languages: Tajik and Pamir (Map courtesy of Yuri Koryakov).


2 Sociolinguistic situation
The Mountainous Badakhshan Autonomous Region of Tajikistan was established as a distinct region because a number of minority groups speaking various
Eastern Iranian vernaculars were living there.3 Due to the interaction of Western
and Eastern Iranian vernaculars in this area a diglossia developed over time.
Today, the Tajik language has a high level of prestige in the region; it is used in
all official situations and for documentation, denoting a level of education and
culture, and hence a higher social status.

2.1 Sociolinguistic situation in MBAR
We briefly describe the sociolinguistic situation in MBAR in order to show the
interaction between Tajik and Pamir vernaculars having different origins, status,
and functions (see also Dodykhudoeva 2004: 281). These vernaculars were
layered on different substrates and underwent significant typological restructuring. Over time, despite the differences in the origin of these languages, the
process of cohabitation in Badakhshan led to the creation of common features,
particularly as concerns vocabulary. The spatial relationship between Eastern
and Western Iranian languages is listed below (Tables 1 and 2), where subgroups
representing the current language situation in MBAR and its adjacent areas are
indicated.
Table 1: Iranian languages/dialects in MBAR: Eastern Iranian.

Eastern Iranian languages/dialects    Geographical presence/function
Shughnani-Rushani group4              Shughnan, Rushan
Yazghulami                            Yazghulam
Old Vanji (extinct)                   Vanj
Ishkashimi                            Ishkashim
Wakhi                                 Wakhan
                                      Languages of oral communication

3 Vanj and Darvoz districts were incorporated into the Gorno-Badakhshan Autonomous Region in
1937. They were previously part of the former Gharm region.
4 Genetically this group also includes Sarikoli, today spoken in China.


Table 2: Iranian languages/dialects in MBAR: Western Iranian.

Languages/dialects                Geographical presence/function
Tajik dialects                    Means of oral communication
  Darvozi                         Darvoz
  Vanji                           Vanj
  Badakhshani                     Ghoron, Ishkashim, Wakhan
Inter-Pamir Tajik                 ✶ used in inter-ethnic communication
(“Inter-Pamir Porsi (Forsi)”)     ✶ used in administrative centres in Khorogh and Ishkashim-Centre
                                  ✶ used in folklore, historical chronicles, religious treatises, scholarly
                                    texts, authored poetry and prose (written texts); local media
Persian-Tajik                     “sacred” language of religion
Standard Tajik                    State language, spreading in MBAR from Dushanbe (capital of Tajikistan)

All three Tajik-speaking districts in MBAR (see Maps 2–4) border the Panj
River. Darvoz and Vanj are adjacent districts located geographically in the northwest of MBAR, with the area of Darvoz extending some distance on both sides
of the Panj river in Tajikistan and Afghanistan. To the north of the Tajik area of
Darvoz lies Rasht district, including Vakhiyo Poyon region (see Map 2).
Vanj district is located geographically to the east of Darvoz; the area of Vanj
valley extends northeast away from the Panj river. Vanj district includes not only
Vanj valley but also Yazghulam valley, where people still speak one of the Pamir
languages, Yazghulami (see Map 3).
Ishkashim district – extending south – includes areas along the Panj river,
from Khas-Kharagh village to Lake Zorkul on the right bank. This district consists of three historical and administrative sub-districts (see Map 4): Ghoron,
Ishkashim, and Wakhan jamoats. All three historically included territories on the
left bank of the Panj river in Afghanistan. In the south-west of MBAR, the predominantly Tajik-speaking population of Ghoron and Ishkashim live in villages along
the Panj river, which serves as the border with Afghanistan. Ryn village and part
of Sumjin village, as well as villages along the Wakhan valley are mostly inhabited by people speaking dialects of the Eastern Iranian group, such as Ishkashimi
and Wakhi (see Map 4).

2.1.1 Geographical proximity
The Pamir languages are considered a geographical construct, since genetically
they are not a separate branch of the Eastern Iranian subgroup (for a recent analysis and genetic classification of Iranian languages, see Korn 2016: 51–66). Within
that subgroup, a group of North Pamir languages (Shughnani-Rushani group, Yazghulami and extinct Old Vanji) all have close genetic relations. North Pamir languages are grouped with other Pamir languages (Ishkashimi and Wakhi) because
in the process of convergence, they all came to constitute a community based on
geographical proximity. These relations of affinity in the structure and typology
of Pamir languages overlap with the original genetic distinctions (Dodyxudoev
1970: 23–24; Dodykhudoev 1972: 463). Consequently, there are many common
features between all Pamir languages of the region at all linguistic levels under
the influence of areal factors (Édel’man 1980: 21–22, 1986: 217; Édel’man and
Civ’jan 2005).

Map 2: Map of Darvoz district and Vakhiyo (Map courtesy of Yuri Koryakov).


Map 3: Map of Vanj district indicating Vanji Tajik dialects (Map courtesy of Yuri Koryakov).

Similar common features are observed historically in the Persian-Tajik vernaculars of this area, leading to convergence of the broader Pamir-Hindukush
ethnolinguistic region and its Eastern Badakhshani branch (Grjunberg and Steblin-Kamensky 1974: 277–278; Steblin-Kamensky 1982: 4).
The Tajik dialects of MBAR are included in the areal convergence, due to their
Eastern Iranian substrates, similar religious traditions, and long historical economic and cultural evolution in close contact with Pamir languages. This leads to
the similarity of these languages at all levels, especially in vocabulary and morphosyntax. Since the 20th century, the Tajik language has been represented in MBAR not
only by several local dialects, but also by its own literary form – the Tajik state
language, the language of education.


Map 4: Map of Ishkashim district indicating Tajik subdialects of Ghoron, Ishkashim
and Wakhan (Map courtesy of Yuri Koryakov).

2.1.2 Presence of the Tajik language and interaction with Pamir languages
As globalization spreads, the trend towards the homogenization of language
continues apace. In Tajikistan, the Pamir languages are directly influenced by
Tajik; this leads to a process of shift5 in these languages, and their replacement
by the dominant Tajik language.
This process is especially intensive in areas where there is considerable
contact between languages (Rozenfel’d 1981). In these situations, the already
fragmented zones of various minority Pamir languages, such as Wakhi and Ishkashimi, heavily mixed with Tajik dialects, are now steadily shrinking. As such,
the boundaries of the region of Eastern Iranian (Pamir) languages are narrowing.
5 Language shift is the process whereby, for various reasons, people abandon their native language in favour of another (the most widespread reasons for language shift are migration, resettlement, ethnic cleansing, etc.).


One further case in point is in the northwest, where the Qarategin, Darvoz,
and Vanj dialects of Tajik are currently located (Zarubin 1924: 79; Rozenfel’d
1956a: 273–280). Although the boundaries of this area are currently blurred due
to several ethnolinguistic similarities, historically the area included a geographically broader presence of Eastern Iranian (specifically Pamir) languages, which
have now been replaced by Tajik.
These Tajik dialects assimilated the Eastern Iranian vernaculars previously
present in the area. In consequence, Tajik dialects located in southeastern
Tajikistan are complex linguistic formations which include an Eastern Iranian
substrate. This applies particularly to the dialects of Vanj, Darvoz, Ghoron, Ishkashim and Wakhan in MBAR, as well as to the Dari speech varieties of Darvoz,
Ghoron, Ishkashim, Wakhan, and Zebak in Afghanistan.
We know that several centuries ago, in the region under study, people
from Vakhiyo migrated to the lower areas of Vanj. As a result, those who had
settled earlier had to move to the upper highlands of the Vanj valley, where they
remained in several rugged areas; there they became known as “vanji(i tupti)”,
the indigenous population of Vanj, changing their language from Eastern
Iranian Old Vanji to the Tajik dialect (Rozenfel’d 1964: 21, 141). Similarly, in
Darvoz we find remnants of the Shia and Ismaili population, who in the early
20th century reported that their forefathers once spoke some Eastern Iranian
vernaculars.
As a result of these processes, Tajik dialects are a unique repository for a
number of Eastern Iranian languages that have already disappeared from the
geographical map. Thus, Vanji Tajik still retains certain elements of the Old Vanji
dialect (vanjiwori), recorded at the beginning of the 20th century by I. I. Zarubin,
M. S. Andreev and later by A. Z. Rozenfel’d (1964: 3–4). Their research confirmed
the presence, in Tajik Vanji dialects, of Eastern Iranian elements in phonetics,
morphology, grammar and vocabulary. This unique speech variety, Tajik Vanji,
preserves individual lexical and some other linguistic relics, which are currently
becoming archaic and declining in use.
In the past, the area of the Pamir languages extended further to the north
and northwest. This broader presence is verified by toponymic data. Old Vanji, an
Eastern Iranian language (closely related to Yazghulami), was still known in the
Vanj valley in the 19th century, but was no longer spoken by the early 20th, and
survived only in some words and occasional phrases written down by scholars
during their fieldwork (Zarubin 1924; Rozenfel’d 1964: 141): this Eastern Iranian
vernacular was supplanted by Vanji Tajik.
The people of Yazghulam valley still speak the Eastern Iranian Yazghulami
language, close to Old Vanji. However, because of outside influence, the population converted to Sunni Islam in the 19th century and made the transition to bilingualism, strengthened by their resettlement in areas in central Tajikistan. Today,
in the lower part of the Yazghulam valley, in Xexak and Dašti Yazǧulom villages,
the population predominantly speaks Tajik.
In the southwest of MBAR, the boundaries of the region where Eastern
Iranian (Pamir) languages are spoken have also narrowed; here, Tajik dialects
of Badakhshan are now widespread. For example, the subdialect spoken in
Ghoron represents the speech of newcomers from Afghanistan who absorbed a
significant number of Eastern Iranian (particularly Shughnani) language features
(Rozenfel’d 1956a; Dodyxudoev 1975: 94). Here in Ghoron, according to tradition,
the Persian-speaking population was resettled from the Afghan side to work in
the ruby mines. These migrants supplanted the indigenous inhabitants throughout
the Ghoron area. Consequently, under their influence, the local population also
switched to Persian-Tajik dialects, like the group of Rogh Tajik dialects spoken by
another group of people who resettled in southern Tajik areas of Khatlon district
from the area of Rogh in Afghan Badakhshan. The same situation occurred in
Wakhan and Ishkashim, where the Tajik subdialects of Wakhan (Steblin-Kamensky 1999) and Ishkashim (Nazarova 1998: 10) took root.
As early as the 1920s, Alexei D’jakov reported most of the population in
Ghoron as speaking Badakhshani Tajik with only some remnants of a specific
variant of Shughnani. Today only Tajik is spoken in the area (D’jakov 1975: 169).
Similar information was documented by E. Hojibekov, who interviewed residents
in the 1990s. They told him that their ancestors knew Shughnani in parallel with
Tajik and used it as a secret language in their childhood. Hojibekov also noted
some Shughnani vocabulary preserved by the elderly there (2009: 60–61). This
information is confirmed by the Shughnani origin of most toponymic names in
Ghoron (Dodyxudoev 1975: 66). The presence of Shughnani influence, especially
in vocabulary, is also attested in textual records of the local Southern Tajik dialects (ŠJZT 1980, 5: 277).
The contraction of the zone of Pamir vernaculars is also typical of Ishkashim and
Wakhan. D’jakov confirms for Ishkashim that in the first half of the 20th century, in
the regional administrative centre of Ishkashim district, people spoke Ishkashimi
and Tajik (Porsi), adding that in Nyut village there were speakers of Ishkashimi
and “several families who speak the Badakhshani dialect of the Tajik language”
(D’jakov 1975: 169; see also: D’jakov 1931: 85–90). Today Ryn is the only village
in both Tajik and Afghan parts of Ishkashim to be completely inhabited by Ishkashimi speakers.
In the Wakhan valley, Anna Rozenfel’d documented the Tajik language mainly
in Udit and in a number of other villages: Daršai, Čiltok, Yamg (Rozenfel’d 1964).
Of the four villages which were considered Tajik-speaking by the local population, the only truly Tajik-speaking village was Udit, inhabited by immigrants from
Afghanistan who called themselves sayyids. Its inhabitants did not marry the
local Wakhi population, nor did they speak Wakhi. Inhabitants of the other three
villages were bilingual (Steblin-Kamensky 1970: 213). At present, many inhabitants of Dašt and Namadgut villages have also switched to Tajik.
Along with migration, we commonly observe the process whereby parents
favour the majority language, instead of passing on their own mother tongue to
their children. Such a sociolinguistic situation in the region leads us to consider
the Pamir languages, which are in proximity and in constant contact with Tajik
and its dialects, to be “endangered” languages as a result of rapid socioeconomic
change, beginning in the 1990s.
Historically in Badakhshan the term “Porsi” was used to denote a local form
of Tajik – a literary language, the language of oral and written communication,
and the language of interethnic communication. In the mid-20th century,
A. Z. Rozenfel’d coined the term “Inter-Pamir Porsi (Forsi)” to designate a
language used for interlingual communication between all the peoples inhabiting MBAR (Rozenfel’d 1975: 210). This term is generally used to describe the language of folklore, historical chronicles, religious treatises, scholarly texts, and
authored poetry (in oral or written form). On the functions of this language variety,
see Table 2.
In the early 1940s, Alexander Boldyrev, analysing the language used in literary works from Ghoron, Shughnan and Wakhan, described this same language
variety as Badakhshani Tajik, and confirmed that it was used as “a literary language” for the whole of Badakhshan (Boldyrev 1948: 277).
By the mid-20th century, this Badakhshani Tajik had already become closer to
literary and spoken standard Tajik, differing mainly in some phonetic and vocabulary features.
Today with education fully conducted in standard Tajik, most active generations switch to this form in oral and written communication, using it in most
areas as the language of education, administration, media and culture, as Standard Tajik is the state language, the language of governmental bodies, and that of
interaction between the capital and MBAR.
The role of Khorogh as a centre where Tajik is predominant results from the
presence there of several prestigious colleges and the University of Khorogh, all
of whose language of instruction is Tajik. This is reinforced by the local social
media, as well as by regional departments of broadcast networks and by the
regional branch of the Association of Writers (as local authors write poetry in
mother tongues, but prose mainly in Tajik). The first, and still the only, novel in
Shughnani was published in 2017 (Xudobaxš Xudobaxšov, Zindagi az naw ts͡a sʊd
sar ‘If life begins again’).


As a consequence of Persian-Tajik’s long-held status as the language of both
written literature and folklore, most genres of folklore in Wakhan were recorded
in Persian-Tajik (Steblin-Kamensky 1970: 213). This trend was mirrored in religious
Ismaili poetry – Sh maddo (from Persian madh), Persian-Tajik qasida, and qasoyed
‘panegyrics to holy venerated persons’ – which was composed in Persian-Tajik
and performed during Ismaili religious rituals. At the same time, there exists a
corpus of folklore poetic texts in Shughnani, ranging from short quatrains to full-length lyrical poems.
In general, because of these processes, it can be observed that multilingualism (as a rule, subordinate bilingualism) among the peoples of Tajikistan is not
universal, but rather quite a complex, multifaceted phenomenon with interference arising from the mutual interaction of Tajik with Shughnani (and with other
Pamir languages).
In everyday life, the Pamir folk speak their native language, but in interethnic contacts they speak Tajik or Shughnani. The prevalence of the latter language
has recently grown in the public domain, as it has begun to be used in literature and the creative arts. The number of Shughnani-speaking poets and writers
(e.g., Lidush Habib) has increased. In Tajikistan, Tajik and Russian, as well
as English, constitute the languages exerting the most influence on the Pamir
languages.

3 Dialectal profile of MBAR: Iranian vernaculars
of Badakhshan and their distribution
In Tajikistan, since the mid-20th century, comprehensive language data on local
Tajik dialects have been methodically collected by means of documentation projects and extensive fieldwork undertaken by large field expeditions. The resulting
analyses have demonstrated the distribution of various Tajik dialects.
A key milestone in this research was V. S. Rastorgueva’s (1964) seminal work
An experience of the comparative study of Tajik dialects. This study presented
the distribution of Tajik dialects in Tajikistan and adjacent regions (Rastorgueva
1964: 154–182). On further examination, these results were verified and described
in greater detail (Efimov et al. 1982: 51–52).
A. Z. Rozenfel’d based her dialectal work on an earlier approach, which
classified only three groups of Tajik dialects: Northern, Central, and South-Eastern (1964: 20; 1971: 35). The last of these combined the Southern and South-Eastern groups of dialects (identified by Rastorgueva) into a single South-Eastern group.


In Rastorgueva’s classification, which we draw on for the purposes of this
study, she specified particular zones of distribution (see Table 3):
1) Northern group dialects, located around cities in Tajikistan and adjacent
regions, and covering the subdialects of particular settlements: Asht, Chust,
Qassansai, Leninabad, Konibodom, Isfara, Bukhara, Samarkand, Ura-Tyube,
Shakhristan, Penjikent, Boysun, Derbent.
2) Central group dialects, located in Upper Zeravshan in the areas of Falgar,
Matcha, and along the valley of the Fan-Darya river.
3) Southern group dialects spread in Kulob (Khatlon), Qarategin (Rasht), the
Vakhiyo region of Qarategin, Vakhiyo-Poyon6 (i.e., lower Vakhiyo), and parts of
Varzob zone (Kulobi and Qarategini dialects), as well as in part of Hisor zone
(Viloyati). This group also includes Badakhshani vernaculars. For a full list of
Southern dialects, along with details of their classification, see (Rastorgueva
1964: 154–185; Južnye govory 2014: 252–255; Xorkašev 2014b: 231–233). Additionally, there exists a group of Rogh dialects in Tajikistan, some speakers of
which came from Afghanistan to Tajikistan, although most have remained on
the other side of the Panj river.
4) South-Eastern group dialects (Darvoz group), covering the area of Darvoz
close to the Panj river, and the Vanj valley of Tajikistan.
Outside of the four main dialectal groups of Tajik we find:
1) Upper Chirchik and South Fergana dialects, representing three varieties of
transitional dialects between Central and Northern groups. The phonetic
system (vocalism) of these dialects resembles that of the Central zone, and
the verb system and syntax are of the Northern type;
2) dialects of Vakhiyo-Bolo (the territory of the upper course of the Obikhingou
river), also known as Vakhiyo-of-Darvoz. These are transitional between the
Southern and South-Eastern groups. They demonstrate the Southern type
of vocalic system, while their verb system is of the South-Eastern type (Rastorgueva 1964: 162).
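The logic behind the two transitional groups can be sketched in a few lines of Python. This is only an illustration under a strong simplifying assumption: a dialect is described by just two features, the type of its vowel system and the type of its verb system, whereas the actual classification rests on many more isoglosses. The function name `classify` and the label format are hypothetical.

```python
def classify(vocalism: str, verb_system: str) -> str:
    """Label a dialect from its vowel-system type and its verb-system type.

    Simplified assumption: if both features point to the same group, the
    dialect belongs to that group; otherwise it is treated as transitional.
    """
    if vocalism == verb_system:
        return vocalism
    return f"transitional: {vocalism} to {verb_system}"


# Feature values restated from the description above.
upper_chirchik = classify(vocalism="Central", verb_system="Northern")
vakhiyo_bolo = classify(vocalism="Southern", verb_system="South-Eastern")
```

Under these assumptions, Vakhiyo-Bolo comes out as transitional between the Southern and South-Eastern groups, in line with the description above.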
Table 3 shows the division of the Tajik language into four groups of dialects.
Transitional dialects, as well as details on Northern and Central dialects, are not
included.

6 The territory of the lower course of the Obikhingou river.


Table 3: Tajik language dialects grouped into areas of distribution, showing classification
of Southern and South-Eastern vernaculars according to Efimov et al. (1982: 12), with some
additions.7

Groups of Tajik dialects
Northern
Central (Upper Zeravšon)
Southern | Qaroteginī, Vaxiyo Poyon | Varzob: Qaroteginī, Kūlobī; Hisor: Viloyatī | N. Kūlobī, S. Kūlobī, W. Kūlobī | Roǧī | Badaxšonī
South-Eastern (Darvoz) | Jorf village | Qurǧovat, Pošxarv villages | Sediya: Sangovi Daroz, Durobak, Patk(i)nou villages | Šekaī: Širǧovat, Yoged, Škev, Ravnou | Vanjī subdialects

This study focuses on only two groups of these dialects – the Badakhshani dialects and the Darvoz group, covering the area of Darvoz close to the Panj river, and
the Vanj valley of MBAR in Tajikistan.
Today, the classification of Tajik dialects is not limited to their division into
four large groups. Priority is given to description of as many varieties as possible,
so that the entire dialectal profile of Tajik language variants can be reconsidered
through the prism of these smaller units. For instance, the internal division of
the South-Eastern group has not yet been precisely established, in spite of the
fact that the Darvozi and Vanji groups of Tajik dialects have been described in
detail (Rozenfel’d 1956; 1964). The same applies to the Badakhshani Tajik dialect,
which has not been documented further since the research conducted by Rozenfel’d (1971); consequently, the bulk of our assumptions are based on her works.

7 The names of the dialects are based on the Tajik tradition of naming, see (Xorkašev 2014b:
231–233).

Only the Southern and South-Eastern groups of dialects are present in the Mountainous Badakhshan Autonomous Region and adjacent areas. In Table 4, we list the
main settlements of speakers of these groups of dialects, including the dialects of
Vakhiyo and Vakhiyo-Poyon, constituting the Southern group; and the transitional
dialects of Vakhiyo-Bolo. The table also includes the available list of place names
for the Afghan areas of Ghoron and Darvoz.
Table 4: Distribution of the Tajik language and its dialects in the territory of MBAR and its
adjacent regions (Rozenfel’d 1964: 19–21, 1971: 183; Steblin-Kamensky 1970; Ofaridaev
2002: 7, 8).

Tajik dialects | Geographical presence of sub-dialects (names of settlements)
Afghan Darvoz (on the left bank of the Panj river)✶ | Daršer, Zarnif, Čoštak, Moymay, Ubagi, Šikay, Ubagni Sifli, Folon, Darai Juy, Surbon Kaškdara, Xvoxon, Jirmob, Varkad, Dar Šikay, Gumai Sifli, Xasob, Šing, Daru, Bomar, Darai Xuf, Xonon, Darai Parina, Xoldasak, Manov, Havzi Šox, Zingire, Jomarči Bolo, Jomarči Poyon, Nisay, Xordara, Gumay
South-Eastern: Darvoz | Lower: Punišor, Nulvand, Sangevn, Višxarv, Žag, Xostav, Ziǧar, Šikev, Yoged, Širgovad
 | Upper: Qalayi Xum(b), Ruzvay, Dašti Luč, Xumbivarī, Zing, Širg, Xost, Madrasa, Zingi Roǧ, Daštak, Umarak, Gušun, Rubot, Jorf, Kevron, Visxarv, Birovg, Xurk, Ubaǧn, Šodag, Toǧmay, Kurgovad, Pošxarv
 | Saǧirdašt✶ area: Saǧirdašt, Kulumbai Bolo, Kulumbai Poyon, Saydon, Čuxkak, Kamčak, Safedoron, Qalai Husayn, Pastiroǧ, Sariparom, Xur, Vučun, Sabzixarf, Langaro, Puštaroǧ, Čorsun, Marǧak
South-Eastern: Vanj | Lower: Viskrog, Bičxarvak, Roharv, Bovid, Odešt, Dašti Roǧ, Langar, Gišxun, Gumayak, Vodxud, Baravn, Buniga, Varavz, Langarak, Laxš, Uzbai, Daštak, Pišixarv
 | Middle: Bunay, Potov, Sed, Texarv, Širgovad, Rav, Jovid, Baravn, Čihox, Ravgada, Poimazor
 | Upper: Texarv, including: (Doršir) M(i)dixarv, Gijovas, Sitarg, Murgitga, Van-van Bolo, Van-van Poyon, Ušxarvak, Puni Jangal, Gumast, Yazgo, Langar, Rovand, Garmčašma, Sungad
Southern: Badakhshani | Ǧoron (north to south): Xasxaraǧ, Andarob, Šoidara,✶ Devlox,✶ Sinib, Ǧorj, Žənd,✶8 Xosguni,✶ Kuyi Lal, Sist, Vogz, Šambede, Qozde, Baršor, Bəǧəš (Boldyrev 1948)
 | Iškošim: Malvoj, Nyud (Ishkashim centre), Sumjin9
 | Waxon: Dašt, Namadguti Poyon, Namadguti Bolo, Daršay, Šitxarv, Udit10
 | Šoxdara (Tavdem, Corj)11
Dialects of Afghan Ghoron✶ | Settlements along the left bank of the Panj river; settlements on the Ghoroni Bolo plateau away from the river bank: Budorbund, Čuksang (Ček), Šexbek, Gəlboǧ, Zeč, Andoj, Nawobod; Yifč, Wanud, Qozde; Uned, Dorumador, Safedsang, Sund, Žorvax, Tiršore, Bezlinj (Boldyrev 1948: 279)
Southern | Vaxiyo-Poyon (Vakhiyo of Qarategin)
Transitional: South to South-Eastern | Vaxiyo-Bolo (Vakhiyo of Darvoz)

✶ These dialects have not been subject to comprehensive research.

3.1 Main classification of the dialectal profile
In the classification of V. S. Rastorgueva (1964), a primary feature is the reflection of the historical vowel system; the group of back vowels ✶u, ✶ū, ✶ō was the
subject of detailed analysis. This group of vowels underlies the development
of the vowel system of each dialect cluster. It is represented in many words and
morphemes and gives clear isoglosses; it is also reinforced by differences in the
verb system, as well as in syntax and vocabulary. Based on these features, Tajik
dialects were divided into four distinct groups: Northern, Central, Southern, and
South-Eastern. Each group is divided into a set of dialectal units, which in turn
break down into smaller groups of dialects and subdialects. In many cases, the
isoglosses of individual dialectal phenomena do not coincide, so it is difficult to
distinguish the smaller subdialects. The boundaries between individual subdialects and dialectal units are not fixed, as dialects gradually transition from one to
another. Isoglosses of individual dialectal phenomena extend beyond the borders
of Tajikistan into Uzbekistan and Afghanistan – and as far as the northern part of
Iran (in Khorasan) (Rastorgueva 1964: 154–162; Efimov et al. 1982: 12).

8 Žənd is now known as Garmčašma.
9 A. Z. Rozenfel’d also included the villages of Avj and Yakhshivol in the Ishkashimi Tajik subdialect (Rozenfel’d 1971: 183).
10 A. Z. Rozenfel’d indicated that there are few Tajik speakers in the villages of Čiltok, Yamg and
Nižgar (1971: 183). She found that in 1968, the village of Namadgut was Wakhi-speaking (Steblin-Kamensky 1970: 209). At present, its population, especially in the lower part of the village,
speaks both Wakhi and Tajik.
11 Speakers of Munji, present until the mid-20th century as documented by A. Z. Rozenfel’d in
the villages of Tavdem and Corj (Shakhdara), and in the Andarstez suburb of Khorog (1971: 183),
have now switched to Shughnani. The Munji idiom is extinct.

Due to the fluidity of these boundaries, some dialectal groups were not classified, or were classified as transitional. In recent years, work on specifying them
has continued (see, e.g., Xorkašev 2014a, 2014b).
Herein, the focus is on two groups of Tajik dialects – Southern (in particular
Badakhshani), and South-Eastern (including dialects of the Panj area of Darvoz
and Vanj valley) – which are present in the territory of the Mountainous Badakhshan Autonomous Region (MBAR) in Tajikistan.
In the Southern type of Tajik vocalism, significant changes have occurred in
the quantitative characteristics of vowels in comparison with the Northern type
(which was the basis of standard Tajik in the 20th century until the 1990s) and
the literary language. For a diagram of the Southern type of Tajik vocalism, see
Figure 1 below.
[Vowel chart: Front, Central, Back; Close-mid, Open-mid, Open]

Figure 1: Southern Tajik vocal system.


The vowel system of the Southern Tajik group differs significantly in the development of the back vowels, which in modern dialects are represented
by reflexes of the historical short /✶u/, and long /✶ū/ and /✶ō/. The short
vowel /✶u/ was replaced by /ɯ/, and the long vowels /✶ū/ and /✶ō/ merged
into one stable phoneme /u/ (Rastorgueva 1964: 34–35, 157). In all phonetic
positions, the phoneme /ɯ/ contrasts with all other stable vowels as a
shorter sound (for more details on the specific quality of this vowel in Southern Tajik dialects, see Sokolova et al. 1952: 156–164; Rastorgueva 1964: 26). As
regards the other two unstable vowels (i, a), they are not shortened as much in
the open stressed syllable as in other Tajik dialects: in the stressed syllable
and in the closed unstressed syllable they hardly differ at all. Consequently,
according to the experimental research of V. S. Sokolova, the opposition between
the groups of stable and unstable vowels in Southern Tajik vocalism is blurred
to a certain extent, and the convergence of these groups is manifested in their
general opposition to the phoneme /ɯ/ (Sokolova et al. 1952: 156–158, 165–167
[statistical data]).
Today in the Southern group vocal system we find an inventory of six phonemes, where /u/, /e/, /o/ are stable, /i/ and /a/ are “neutral”, and /ɯ/ is significantly reduced. This last phoneme represents a close back unrounded vowel
(Nemenova 2013: 19, 25–28).
According to R. L. Nemenova (1963: 60–67; Efimov et al., 1982: 53) the vocal
inventory of the South-Eastern group of dialects of Darvoz (or Upper Panj) has
eight phonemes (Figure 2).12 In some of its features, it is similar to the vocalism
of the Northern Group, and in others it resembles the Southern group.
In the vocal system of the South-Eastern (Darvoz) group, the historical /✶u/,
/✶ū/ and /✶ō/ were replaced by /ɯ/, /ʉ/, /ɵ/ (Rastorgueva 1964: 157). Over time,
the main distinctive features of the Darvozi Tajik vocal system have evolved as
follows (Nemenova 1963: 60–67; Rastorgueva 1964: 27–28):

12 However, Rozenfel’d (1956: 200–201) suggested a different interpretation of the vocal inventory of Darvozi Tajik. Her system consists of seven phonemes: three stable (e, o, ū) and four unstable (i, a, u and ə). She considers ü a labialized variant of ū or u, and for the Yoged subdialect
she documents ů, having a majhul-type quality. She also observes a labialized variant – ü – in
Vanji Tajik, which corresponds to the literary ū. A similar vowel labialization was encountered
by Zarubin in Bartangi and Roshorvi. Apart from that, Rozenfel’d considers the reduced ə to be
a central close-mid unrounded phoneme that can replace the unstable i, a, u, in an unstressed
syllable.


[Vowel chart: Front, Central, Back; Close, Close-mid, Open-mid, Open]

Figure 2: Darvozi Tajik vocal system (Nemenova 1963: 60–67).

1) The historical short /✶u/ has transformed into /ɯ/ (as in Southern dialects):
(1) a) bəz
‘goat’
b) gəl
‘flower’
In this chapter, in further language data concerning Southern and South-Eastern
Tajik dialects, ɯ is denoted as ə (schwa), according to its usual romanization (see,
for instance, Rozenfel’d 1964, 1956). This symbol is written as ъ in Cyrillic-based
sources, Russian and Tajik (Nemenova 2013; ŠJZT 1980, 5).
2) The historical long /✶ū/ has changed its quality, by turning into a front rounded
stable vowel /ʉ/, represented by R. L. Nemenova as /ü/. She described its position
as slightly advanced back, and of near-close height:
(2) a) düd
‘smoke’
b) angür
‘grapes’


In R. L. Nemenova’s view, this is a separate phoneme in the main Darvozi dialects. However, according to research by A. Z. Rozenfel’d (1956: 200), in several
Darvozi subdialects (excluding Qalay Khumb and Vakhiyo Bolo), the sound [ü] or
[ɵ] in question represents a variant either of the phoneme of a majhul-type quality
/ɵ/, or alternatively of /u/; nevertheless, this sound is rare and is not represented
elsewhere in the present study.
3) The vowel /ɵ/ which has a majhul-type quality (derived from the Cl Pers, Middle
Pers ō wāwi majhūl) – corresponding to /ů/ in R. L. Nemenova’s research, and to
literary Tajik /ū/ – has been preserved, but differs somewhat in quality from the
equivalent phoneme of Northern dialects, being the sound of a formation more
advanced in terms of position (front-to-central mixed row, not back-mixed, as in
Northern dialects):
(3) a) rů̂z
‘day’
b) dů̂st
‘friend’
c) rů̂
‘face’
d) mů̂
‘hair’
4) The historical long /✶ā/ (modern Tajik o) before nasal consonants has changed into a rounded u, in R. L. Nemenova’s records /û/: nûn
‘bread’, xûna ‘house’; she considers this an autonomous phoneme in Darvozi Tajik
that does not match /ɵ/. This opinion is supported by V. S. Rastorgueva (1964:
27–28). A similar tendency for ā to be rounded before n or m, as far as a
rounded u, can be seen in Tehrani Persian (Pejsikov 1960: 19–20).
In this discussion of specific features of vocabulary, no distinction is made
between u-type vowels in Tajik dialects of Vanj and Darvoz (Nemenova’s ů, ü and
û), and instead we follow Rozenfel’d (1964, 1956) in representing them as in literary Tajik, as u or ū.
5) The historical long /✶ī/ and short /✶i/ have merged, forming a single phoneme /i/.
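The correspondences listed above can be collected into a small lookup table. The Python sketch below is only an illustration: the names `REFLEXES`, `reflex` and `merged` are hypothetical, the ASCII asterisk stands for the ✶ that marks historical forms in the text, and the data simply restate the context-free reflexes reported above. The context-dependent change of ✶ā before nasals is deliberately left out.

```python
# Reflexes of historical vowels, restated from the description above.
# All names are illustrative; "*" marks a historical (reconstructed) vowel.
REFLEXES = {
    "Southern": {"*u": "ɯ", "*ū": "u", "*ō": "u"},
    "South-Eastern (Darvoz)": {
        "*u": "ɯ", "*ū": "ʉ", "*ō": "ɵ", "*ī": "i", "*i": "i",
    },
}


def reflex(group, historical):
    """Return the modern reflex of a historical vowel in a dialect group."""
    return REFLEXES[group][historical]


def merged(group):
    """Return reflexes shared by more than one historical vowel (mergers)."""
    by_reflex = {}
    for hist, modern in REFLEXES[group].items():
        by_reflex.setdefault(modern, []).append(hist)
    return {m: hs for m, hs in by_reflex.items() if len(hs) > 1}
```

Read this way, the two groups agree on the reflex of short ✶u but differ in what has merged: the Southern group collapses ✶ū and ✶ō, while the Darvoz group keeps them apart and merges ✶ī with ✶i.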
The division between Tajik dialects, based on phonetic features, is also confirmed by data on the verb system, and on certain syntactic and lexical characteristics. In morphosyntax these characteristics are as follows: the prepositive
expression of possession, as in muallima kitobaš ‘teacher’s book’; various infinitive and participle constructions in Northern dialects; and the expression of
immediate future action using the archaic past participle: this precise feature,
present in the South-Eastern and Central dialects, is absent in Southern dialects.
This archaic form of the past participle is identical to the past-tense stem: raft
‘gone’, kard ‘done’, furūxt ‘sold’: xūnaša mebara bad furūxt ‘Take it home, then
sell it’ (Rastorgueva 1964: 157).
In the Badakhshani Tajik dialect, including the Ghoron and Nyut subdialects, V. S. Rastorgueva identifies the main characteristic features as follows: the Southern type of
vocalism; transition of o into majhul ū before nasal consonants; absence of pharyngeal consonants (the consonant system consists of 22 consonants); spirantization
of the consonant b in intervocalic position: kavud < kabud ‘blue, green’, xoṷ < xob
‘dream’, vudan < budan ‘to be’; the Southern type of verb system; low frequency of
gerund constructions; and specific vocabulary (Rastorgueva 1964: 162).

3.2 Specific symbols and rules for Tajik dialects and Shughnani
In the dialectological literature describing Tajik dialects, and Shughnani, it
is not customary to mark the final stressed -i of a word, cf. literary Tajik -ī. This
study generally follows the representation with short -i, but in morphological analysis it introduces -í for some sensitive cases when treating suffixes.13
It should be emphasized that in the course of several generations, the introduction of literary Tajik in school education has resulted in the erosion of many
dialectal features as children absorb rules of standard Tajik from an early age.

3.2.1 Specific consonants in Tajik dialects
In Southern and South-Eastern dialects, the consonant system is typical for the
Tajik language; nonetheless, some differences are observed:
a) the articulation of a voiced labial fricative consonant, which in literary Tajik
(and modern Persian) is labiodental (v), and in Dari bilabial (w). In Badakhshani
Tajik, v and w are used interchangeably and represent allophones [v] and [w].

13 In this chapter, language data refer to the Badakhshani Tajik dialect unless specifically indicated otherwise.


According to A. Z. Rozenfel’d (1971: 9) these constitute a single undelineated
phoneme v/w.
(4) a) TBV v/walǧang
‘source of the irrigation channel’
b) w/voskat
‘waistcoat’
c) lav/wz
‘speech’
d) w/vatan
‘motherland’
A similar process is observed in the transition of v (from b) into [w]:
(5) a) ow/v < ob
‘water’
b) w/vega < begoh (Sh vegā)
‘yesterday’
Tajik dialects of MBAR, in both Southern and South-Eastern groups, retain labial
articulation as supported by Pamir phonemes and by the pronunciation in everyday speech of native speakers of Pamir languages. We observe shifts in articulation from TBVD b to v, as in the set of words:
(6) a) čarv < čarb
‘clarified butter, ghee’
b) vekor < bekor
‘idler’
c) TV evol < ubol
‘sin’
d) angəšv/wona < T anguš(t)pona
‘thimble’
In Badakhshani and Vanji Tajik, the articulation of a voiced labial fricative consonant as bilabial (w) is common at the beginning of a word:


(7) a) wunjo
‘there’
b) wəlčak (Sh wulčak)
‘measure’
c) wəlanga, wulonga (Sh wilongā)
‘flame’
b) positional changes: here we see a difference in the Badakhshani consonants /k/
and /g/, which in standard Tajik are generally back-lingual, or hard. In Badakhshani
Tajik, as in the adjacent vernaculars of the Shughnani-Rushani group, these velar
consonants have back-lingual articulation mainly before back vowels. Before
front vowels, at the end of a syllable, and in absolute word-final position,
they tend to adopt a mid-lingual palatalized pronunciation [ḱ], [ǵ], which acoustically creates the impression of softness, as illustrated in the following examples:
(8) a) čig’ina
‘wooden sledge’
b) g’iriftan
‘to take’
c) nuk’
‘tip, beak’
d) g’arm
‘warm, hot’
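The positional rule illustrated in (8) can be approximated in a short Python sketch. It rests on several assumptions that are not stated in the sources: the set of front vowels (i, e, a) is inferred from the examples in (8), "end of the syllable" is approximated as "before any consonant", and the function name `palatalize` is hypothetical. It is meant as a reading aid rather than a phonological analysis.

```python
# Hypothetical sketch of the positional palatalization of /k/ and /g/.
# ASSUMPTION: the front-vowel set is inferred from examples (8a-d).
FRONT_VOWELS = set("iea")
VOWELS = set("aeiouəū")
VELARS = set("kg")


def palatalize(word):
    """Mark k/g with ’ before a front vowel, before a consonant, or word-finally."""
    out = []
    for i, ch in enumerate(word):
        out.append(ch)
        if ch not in VELARS:
            continue
        nxt = word[i + 1] if i + 1 < len(word) else None
        word_final = nxt is None
        before_front = nxt in FRONT_VOWELS
        # Approximation: a velar directly before a consonant closes a syllable.
        syllable_final = nxt is not None and nxt not in VOWELS
        if word_final or before_front or syllable_final:
            out.append("’")
    return "".join(out)
```

Applied to the unmarked forms čigina, giriftan, nuk and garm, the sketch yields the forms given in (8), while a velar before a back vowel, as in gov ‘cow’, is left unmarked.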
In these Tajik dialects, as in literary Tajik, there also exists a phoneme /q/ – a
uvular, occlusive, voiceless consonant, considered to have entered Tajik with
vocabulary borrowed from Arabic and/or Turkic languages. In this context, we
note a group of words, where the phoneme /q/ is used in regular correspondence
with k, which may suggest that this group has an Iranian origin:
(9) a) k/qal‘a
‘fortress’
b) k/qənǧola
‘betrothed’
c) k/qurut
‘kurut, dried cottage cheese’


d) -k/qati
postposition
In the Soviet period, literary Tajik adopted from Russian the midlingual voiceless
affricate ts͡ (which was often replaced in dialects by s); this is currently excluded
from the literary Tajik alphabet. In Badakhshani Tajik, in many cases, this consonant ts͡ functions in parallel with the equivalent in various adjacent Pamir languages and is therefore retained in pronunciation.
In addition, in the mid-20th century, various examples of borrowed vocabulary were documented, including not only ts͡, but also dz͡ (Rozenfel’d 1982):
(10) a) ts͡ərax(d)ək, (Sh ts͡iraxak)
‘spark (from fire)’
b) s/ts͡ətraxs
‘ritual dish of milk and butter cooked on the first night at the summer
pasture’
c) s/ts͡ətraxm
bot. ‘name of the incense plant, Helichrysum’
d) dz͡inga(k), (Sh dz͡ingak),
‘sacred part of the hearth’
and its homonym
e) dz͡ingak (Sh dz͡ingak)
‘short string of the rubob (musical instrument)’.
Additionally, an example of the use of Sh ç (x; on the use of this symbol see below)
was documented in Ghoron (Rozenfel’d 1971: 10).
Historically in the South-Eastern Tajik dialects of Darvoz, two types of pharyngeal consonants – lower pharyngeal h and upper pharyngeal ḥ – were present
(for more, see Rastorgueva 1964: 166). Their pronunciation gradually weakened
to the standard Tajik h. In earlier times, pharyngeal consonants had
a prothetic function in anlaut, i.e., at the beginning of the word. This
phenomenon is also characteristic of Badakhshani Tajik, Shughnani and other
Pamir languages (Rozenfel’d 1956: 201–203). However, see modern Badakhshani
Tajik examples for cardinal numerals such as aft ‘7’, ašt ‘8’, čor ‘4’, čil ‘40’, and
azor ‘1000’ (94a, 94b, 94c, 94d, 94e), where this prothetic consonant is absent;
the same is true for Shughnani ‘40’ and ‘1000’.


3.2.2 Specific Shughnani vowels and consonants
As in this section Tajik dialects are compared with the Shughnani language, for
greater clarity we provide a Shughnani vowel diagram with long (ī,14 e, ɛ, ā, o, ʊ,15
ū) and short (i, a, u) vowels (see Figure 3 below).
On the whole, the vocalism of the Shughnani language, as well as of Wakhi,
is heterogeneous, consisting of two interacting subsystems: a subsystem of vocalism of the indigenous vocabulary and a subsystem of vocalism of words borrowed
from standard Tajik (Grjunberg and Steblin-Kamensky 1976: 547). These distinctive features of vocalism are common to all Pamir languages that have no written
tradition and exist in conditions of diglossia and active bilingualism. The same
trend is also manifest in Tajik dialects, where younger generations of speakers
gradually shift to standard Tajik rules.
[Vowel chart: Front, Central, Back; Close-mid, Open-mid, Open]

Figure 3: Vowel system of the Shughnani language.

We also indicate below the Shughnani consonant system (in IPA symbols). To
clarify the use of symbols, see Table 5:

14 Three long Shughnani vowels are represented by a symbol with macron: ī, ā, ū; these all have
short pairs: i, a, u.
15 Usually romanized as ů.


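Table 5: Inventory of Shughnani consonants.

Places of articulation (columns): Labial, Dental, Alveolar, Alveolo-palatal, Velar, Uvular, Glottal
Manners of articulation (rows): Plosive, Nasal, Affricate, Fricative, Approximant, Rhotic
Affricate | Alveolar: ts͡ dz͡ | Alveolo-palatal: tʃ͡ dʒ͡
Fricative | Dental: θ ð | Alveolo-palatal: ʃ ʒ | Velar: x ɣ | Uvular: χ ʁ | Glottal: (h)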

In general, we follow the representation of symbols introduced in this volume
for the Tajik language. However, we have introduced some additional IPA symbols
for Shughnani: these are ɛ, ʊ, ī, ā, ū, and several consonants θ, ð, ts͡, dz͡, ɣ and x; as
the symbol x16 is used for other purposes, to denote this sound in Shughnani we
use ç (voiceless palatal fricative), a consonant proximate in quality.

4 Specific features in the morphology of Tajik dialects and Shughnani
Traditionally, the following basic parts of speech can be identified: noun, pronoun,
adverb, numeral, verb, preposition, postposition, particle, and conjunction. This
chapter will deal only with nominal entities. There is usually no morphological
distinction between noun, adjective, and adverb, so that the identification of
some words as belonging to one of these categories requires additional semantic
and syntactic data. In nominal forms, we mainly encounter cases of agglutination
(along with internal inflection in Shughnani).

4.1 Nominal categories
In this section, we will examine the following nominal categories: classificatory
gender (masculine and feminine gender expressed lexically in animate nouns),
number (singular and plural), and definiteness/indefiniteness.

16 Usually romanized in Pamir languages as x̌ ; see, for example Zarubin 1960.


4.1.1 Classificatory gender
As in all Western Iranian languages, there is no category of gender in Tajik, so
morphological markers of gender are absent. The same applies to the Tajik dialects of Badakhshan. However, in some cases natural gender is shown.
In many cases in Shughnani, there is a tendency for the category of gender to
be transformed into a classificatory system according to the principle of semantic classes. In this system, abstract nouns are masculine, while concrete nouns
are classified according to a specific cognitive pattern, generally attributing the
feminine gender to parts of the body, clothes, and tools; as well as to landscape
terms. Terms for inanimate objects and those for animals appear in the masculine
gender when referring to species in their entirety, while concrete objects are feminine, irrespective of natural gender.
As regards animate nouns, their gender can be expressed lexically. The following apply to humans:
(11) a) mardak ‘man’ – zanak ‘woman’ (Sh mard, čorik ‘man’ – ɣinik ‘woman’),
b) bača ‘young man, son’ – duxtar ‘girl, daughter’ (Sh ǧidā ‘young man,
son’ – ǧāts͡ ‘girl, daughter’).
In the case of animals, separate words are used for males and females:
(12) a) xorus ‘rooster’ – TV motkin ‘broody hen’, TB mokuk ‘broody hen, female
bird’, Tajik of Wakhan murǧ(i kərk) (Sh murǧ ‘rooster’ – kuruk ‘hen not
giving eggs’)
b) bəqa-gov ‘bull’, barza-gov ‘ox; bull’ – gov ‘cow’ (Sh çīj, gow ‘bull’ – žow ‘cow’)
c) navband, gəsola ‘calf’ – ǧunojin, fərǧom/nč ‘heifer’ (Sh šīg, nʊbānd ‘calf’
– farǧemts͡ ‘heifer’)
d) guspan(d) ‘ram, sheep’, mešak ‘mountain sheep’, mešaki nar ‘ram’ –
meš ‘ewe’ (Sh miɣīj ‘ram, sheep’ – maɣ ‘ewe, sheep’)
Alternatively, gender can be designated by a special word added to the name of
an animal, creating a compound. In Tajik, these words are nar ‘male animal’ and
moda, moča ‘female animal’, narkin ‘of masculine gender, male’ and modkin ‘of
feminine gender, female’ (cf. new Tajik term narmoda ‘androgyne’):
(13) a) narbəz ‘male goat’– mo(da)bəz ‘female goat’ (Sh buč ‘male goat’ – f. vaz
‘female goat’)
b) narkavg ‘male partridge’ – modkavg ‘female partridge’ (Sh narkawg
‘male partridge’ – f. kawg ‘female partridge’)


In the Shughnani language, which has grammatical gender, the special words
for ‘male’ (nīr) and ‘female’ (sitiredz͡) can be added to just one word in a pair to
designate its gender, because in the other word in the pair, gender is implied by
its lexical form:
(14) a) nīr-dz͡arīdz͡ ‘mountain partridge’ – f. dz͡arīdz͡ ‘female mountain partridge’
b) nīr-kiçɛpts͡ ‘male magpie’ – f. kiçɛpts͡ ‘female magpie’
The similar formation TV bobo rubiz ‘grandfather fox’ was documented (Rozenfel’d 1964: 21). This is a remnant of East Iranian Old Vanji rupč where the term for
‘grandfather’ was added to confirm gender in Tajik.
(15) a) bobo rubiz
‘Grandfather fox’
b) Old Vanji rupč
‘fox, vixen’
Cf. Sh rʊpts͡(ak) ‘vixen, fox’ which is feminine in gender, but can also be used
with additional words marking gender:
(16) a) nīr-rʊpts͡ak
‘fox’
b) sitiredz͡-rʊpts͡ak
‘vixen’
This last tendency in Shughnani and Rushani probably evolved under the influence of Badakhshani Tajik (Édel’man 1987: 288).
The same principle applies in Badakhshani Tajik when designating a young
male or female:
(17) a) bačamard (Sh id.)
‘young male’
b) bačazan (Sh id.)
‘young female’
Cf. the Caucasian Tat (Juuri) language, where one term is used for both father- and mother-in-law; however, where a distinction is needed, words identifying
gender are used, i.e.,

(18) a) merdxuəsuər
‘father-in-law’
b) zexuəsuər
‘mother-in-law’

(Nazarova 1985: 215)

In Tajik of Vanj it is assumed that the pair buča, biča(k) ‘kid, cub’ are old remnants of gender; these forms are reflexes of opposition between the feminine and
masculine genders that were designated separately in Old Vanji (Laškarbekov
2008: 96).
In the same way, this gender distribution is also confirmed in other Tajik dialects by a pair of words designating gender.
For example, in the accessible data for the Tajik of Qarategin and Vakhiyo we
encounter the opposition:
(19) a) narbu/iča
‘he-goat’
b) modbiča
‘she-goat’
In Shughnani gender distinction is preserved morphologically in a small number
of substantives, animate nouns and adjectives, and is marked by an ablaut. These
Shughnani pairs of gender opposition are stable:
(20) a) m. čuç ‘cock’ – f. čaç ‘hen’,
b) m. guj ‘he-goat’ – f. gij ‘she-goat’,
c) m. kud ‘dog’ – f. kid ‘bitch, puppy’,
d) m. bukul ‘calf’ – f. bakal ‘heifer one and a half years old’.
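The ablaut in (20) can be made explicit by comparing the two members of each pair segment by segment. The following Python sketch is only illustrative: the helper name `alternations` is hypothetical, and it assumes that the masculine and feminine forms have the same number of characters, which holds for the four pairs cited here but need not hold in general.

```python
# Masculine/feminine pairs from example (20); the helper name is illustrative.
PAIRS = [("čuç", "čaç"), ("guj", "gij"), ("kud", "kid"), ("bukul", "bakal")]


def alternations(masc, fem):
    """List the (masculine, feminine) segments that differ, position by position."""
    # Assumes both forms have the same number of characters, as in (20).
    return [(m, f) for m, f in zip(masc, fem) if m != f]


for masc, fem in PAIRS:
    print(masc, fem, alternations(masc, fem))
```

In these four pairs the consonants are identical, and the only difference is that masculine u corresponds to feminine a or i.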

Some other examples of the vestiges of archaic gender distinction are apparent
in the vocabulary of the Tajik dialect of Vanj. Comparison of the Vanji Tajik term
connected with the interjection used to call goats and the Old Vanji term for
‘she-goat’ leads us to infer that this is another remnant of archaic morphological
gender distinction:
(21) a) TV geč
interjection used to call goats
b) Old Vanji keč
‘she-goat’

(Laškarbekov 2008: 95–96)


This is confirmed by the Shughnani form:
c) geč-geč, Sh geč-a-geč
‘interjection used to call goats’
In another example, the distinctive feature of remnants of gender distinction in
the Tajik dialect of Vanj is supported by an Old Vanji interjection when shooing
dogs – a reflex of the feminine gender in the form of an i-umlaut vocalization
(Laškarbekov 2008: 96) – and also by a term of abuse in Vanji Tajik consisting of
the elements kud- ‘dog’ + padar ‘father’:
(22) a) TV gudapiyar
dog.father
‘curse (your) father’
b) OV ket
an interjection when shooing dogs
These examples can be compared with Sh m. kud ‘dog’ and f. kid ‘bitch, puppy’,
which are reflexes respectively of ✶kuta- and ✶kuti- (Sokolova 1967, §67).
In Old Vanji, ket is a reflex of the feminine gender with an i-umlaut vocalization (Laškarbekov 2008: 96), verifying the presence of remnants of gender distribution in Vanji Tajik. This perspective is partially confirmed by the following data
from Vakhiyo-Qarategini Tajik:
(23) kəčə/uk ‘puppy’ – kəč ‘bitch, female species’

(Xorkašev 2014a: 220, 287)

In modern Shughnani, elements of word formation include highly productive components m. -buts͡ (<✶putra-) – f. -bits͡ (<✶putri-) (Sokolova 1967, §67) ‘child, young species’:
(24) a) m. wārg-buts͡
‘lamb’
b) f. wārg-bits͡
‘ewe’
The same trend can also be seen in the following:
(25) a) bajgi
‘cub, baby animal’


b) TV bajgak
‘adolescent, teenager’
Cf. TV, OV bič ‘younger sister’ (from Ir. ✶-putri- or a later form evolved by analogy)
(Laškarbekov 2008: 96).
Suffixes in the Tajik dialects of Vanj (-(o)č, -(a)j, -ič, -ij), still productive at
the beginning of the last century, can be traced back to the Old Vanji feminine
suffix ending in ✶-či. With the support of these suffixes, nominal stems evolved in
Vanji Tajik (Laškarbekov 2008: 96). The following are examples:
(26) a) zam(i)č < ✶zam-či-, (Yaz zamč, Sh zimts͡)
‘a plot of arable land’
place names: Boǧi zamč, Gil zamč, Regi zamč
b) čapč < ✶zam-či-, čamb- ‘to bend’
‘old worn-out shoes’
cf. (Yaz čapč ‘a type of rawhide footwear’, Sh čapli ‘sandals, worn-out
shoes’,
Wakhi ‘id.’, Pashto čaplay ‘sandals’)
c) pač < ✶pači- (?) or ✶pasči-?
‘sheep dung’
(cf. Sh paxč ‘small cattle droppings’)
This suffix ✶-či, which was present in Old Vanji, was lost in the Vanji Tajik word:
(27) a) OV farč
‘sheep’, cf. (Sh farǧemc ‘heifer’)
b) TVD far
‘farrow cow, barren (about dairy cattle)’
In Vanji Tajik, words with the final voiceless suffix -č, which merged with the
stem, lost their apparent connection with gender at an early stage. However, this
suffix was still used in the formation of new feminine nouns based on the Old
Vanji models (Laškarbekov 2008: 96) as follows:
(28) zalič
‘a kind of bird’

(cf. Yaz žaražg, Sh zarīdz͡ ‘mountain partridge’)


Several words belonging to this group were preserved in the Old Vanji lexical lists
of the Russian linguists I. I. Zarubin and M. S. Andreev and were used in upper
Vanji dialects in earlier periods (Laškarbekov 2008: 91, 96):
(29) a) xaraj < ✶xvahar-či- ‘sister’
‘woman’
b) xaloč
‘old woman’
c) palič with metathesis from ✶lap(a)-ači ‘lip’ (Wakhi lafč)
‘lip’
(cf. Sh lafč ‘lip of an animal’)
d) sipoj < ✶spā-ka-
‘threshed wheat on the threshing floor after wind’ (cf. Yaz sapin- ‘to fill’,
Sh sipen-, ✶spāi-)
In Vanji Tajik, this suffix, -ič, is also used in place names for plots of arable land:
(30) a) Galabanič
the name of a plateau in Vanj used for grazing cattle, based on the
herdsman’s profession, from T gala-bon ‘herdsman’
b) Xargič
‘large field’, from T xar- ‘big’

4.1.2 Number
In Badakhshani Tajik, the singular is unmarked, as in other Tajik dialects.
The most productive and polyfunctional plural suffix is -(h)o:
(31) a) mardo, marako ‘men’
b) zanako ‘women’
In Shughnani, the plural suffix -(y)ēn is occasionally accompanied by
vowel alternation in the stem:
(32) a) šīg ‘calf, heifer’ – Pl. šagen
b) čīd ‘house’ – Pl. čaden

In Badakhshani Tajik, the suffix -o/u/ūn (Tajik -on) has a collective connotation;
it designates mostly human beings and parts of the body:
(33) a) kason ‘men’
b) čašmon ‘eyes’, lavon ‘lips’
It also marks names of birds and animals:
(34) a) mərǧon ‘birds, hens’
b) gərgun ‘wolves’
The suffix -o/un can be included in toponyms to mark plurality. The following
instances refer to place names in Vanj and Darvoz:
(35) a) Mazori Čilduxtar(ak)un
‘shrine of forty girls’
b) Qalandaron
‘(place of) wandering dervishes’
c) Safedoron
‘name of the village’, safedor ‘poplar’
This type of suffix is also documented in modern speech in the Tajik dialects of
Khatlon (un < on):
(36) kərtaporakənun
‘preparation of the bride’s clothes, dress’

(Xorkašev 2014a: 92)

In Shughnani, specific suffixes designating kinship are used in parallel with the
most common suffix -en; these kinship suffixes include -y/gʊn (yon < ✶ān, which
can also denote collectiveness), -jʊn, and -ār:
(37) a) xolayen – xolayʊn
‘aunts’
b) oçnoyen – oçnogʊn
‘close friends, mates’
c) nibosen – nibosjʊn
‘grandsons’
d) viroden – virodār
‘brothers’
Moreover, the archaic Shughnani suffixes -orj and -ɛrd͡z were preserved longer
in the Bajuw highlands, where they were used until the mid-20th century (Karamšoev 1963: 91):
(38) a) mijād-en, mijād-orj, mijād-ɛrdz͡
‘wives of brothers’
b) abīn-en, abīn-orj, abīn-ɛrdz͡
‘co-wives’
c) pats͡-en, pats͡-ɛrdz͡
‘sons’
Plural forms of nouns ending in -o grammatically denote multiple periods of time.
However, semantically, these forms can express a single approximate unspecified
period and can be used as adverbs:
(39) a) ruzo ‘approximately in these days (we will go)’
b) saaro ‘some time in the morning’
c) ǧoyato ‘for some time, for a period of time’, cf. TD ǧot, TQ ǧoti ‘time’
(Sh ǧot ‘id.’),
TVD amǧalo ‘now, recently’
We observe the same process in TQ:
(40) av/wulo ‘from the very beginning’
Similarly, the Shughnani suffix -jev and the Bajuwi suffix -yʊn both grammatically denote plurality when attached to words designating time, although
semantically they may also refer to collectivity in terms of a repeated time period:
(41) a) sāryʊn, sār-jev ‘some time in the morning, (every day) in the morning’
b) maðoryʊn, maðor-jev ‘in the afternoon(s), (every) afternoon’
c) zimistʊnyʊn, zimistʊn-jev ‘in winter(s)’ (cf. Rushani zimistōniōn)
(Karamšoev 1963: 91)

The suffix -o is widely used throughout the region. It may represent either the
common suffix -ho, or alternatively (y)on, or (y)un with n dropped at a later stage:
we know that the final -n is often dropped and the vowel o changes into u before
nasal consonants. Words with such endings could represent a contamination
of two suffixes at a later stage. This phenomenon is widespread in toponyms
throughout southern areas of Tajikistan (Murvatov 2013: 66–67).
With this in mind, we might conclude that in the region of Vanj the marker of
the plural -(y)o(n) is represented in toponyms such as:
(42) a) Čirniyo
b) Razo
c) Xambako

(Rozenfel’d 1964: 147–148)

In Badakhshani Tajik the suffix -ot (denoting plural number and collectivity) is
documented in a number of cases:
(43) kavgot(o)
partridge.pl.pl
‘partridges’, with the addition of a secondary -o
In Vanj valley, a number of place names with this suffix have been documented:
(44) a) Murǧot (Rozenfel’d 1964: 147–148)
b) Kuyi Kalot, Pasi kalot, Peškalot, Sari kalot (Ofaridaev 1991)
In this connection a comparison can also be made with place names having the
component qalot in Shughnan and Rushan regions (Dodyxudoev 1971: 57).
In Badakhshani Tajik, a term ending in the suffix -ot is used specifically with
reference to the Pamir highlands:
(45) Pomirot
Pamir.pl.pl
‘the Pamirs’

(Rozenfel’d 1971: 49)

4.1.2.1 Collective nouns and unspecified plurality
Plurality and a notion of collectiveness may be expressed by using alliterative
patterns of the components gala ‘herd, group of horses; many’, and dasta ‘group’.

The illustration below shows how plurality may be indicated in Southern and
South-Eastern dialects by the specific marker gala ‘group, flock, herd, group of
horses’.17
This component can refer to human beings as well as animals:
(46) a) TV ǧažd-gala, saǧər-gala, TVD čaǧər(a)-gala ‘small children, boys’
(Rozenfel’d 1964: 93, 1956b: 205)
b) TBVD bač(a)-gala ‘children (boys)’, TV duxtar-gala ‘a group of girls’
(Rozenfel’d 1964: 10)
c) TD xar-gala ‘a herd of donkeys’ (Yagn xar-gala ‘id.’)
(Rozenfel’d 1956b: 205)
In all Tajik Southern and South-Eastern dialects, the compound galagow/v ‘a
yoke (group) of threshing oxen’ is present; it is also present as a loanword in all
Pamir languages.
In some local dialects, there are different compounds containing this element,
e.g.
(47) galagurg, TK galagərg
‘a pack of wolves’; fig. ‘dangerous people’

(FGJZT 2012: 65)

The same marker is present in Shughnani:
(48) a) bač-galā
‘children, boys’
b) ǧāts͡-galā
‘children, girls’
Furthermore, in Shughnani, another component -xel is also used to denote ‘group’:
(49) a) vaz-xel ‘goats’
b) ǧāts͡-xel ‘girls’

17 Cf. T gala bačaho čillakbozī mekardand ‘a group of boys were playing “tip-cat” (in this game a
sharpened stick is beaten with another stick)’ (Rahimzoda B. Šaxčanor, Dušanbe 2010).

This tendency is especially reinforced by reduplication:
(50) a) TQ gala-gala ‘group-by-group, swarm, crowd; many’ (FGJZT 2012)
b) dəsto-dəsta ‘many, crowd’
Additionally, there exists the expression ya gala ‘many, crowd’.
In Vanji Tajik, a particular expression is used to denote unspecified plurality:
(51) yak xel
‘a group (of inhabitants)’

(Rozenfel’d 1964: 54)

In Badakhshani Tajik and Shughnani, alliterative patterns can be used to denote
plurality and a notion of collectiveness:
(52) a) čoy-poy, čoy-moy (identical in Shughnani)
‘all sorts of tea; everything for tea-drinking’;
b) Sh wiskūndak-piskūndak
‘all sorts of forks’
Additionally, specific components with similarity as their meaning are used in
Shughnani, for example miǧūnd ‘and such, similar, so on’:
(53) jogā-miǧūnd
‘crockery, pottery and such’
Another component to denote unspecified plurality is the postposition -adis ‘and
such’, used mostly in Bajuw and adjacent regions of Shughnan:
(54) a) garðā-adis
‘bread and so on’
b) somʊnā-adis
‘dress and so on’
The same idea of collective plurality is expressed in Badakhshani Tajik terms
denoting kinship, and close community, through the suffix -gí:
(55) a) momogí
‘grandmothers, women collectively associated with the grandmother’s
relatives (awlod)’
b) amsoyagí
‘neighbours’
(Rozenfel’d 1971: 27–28)

4.1.3 Definiteness/Indefiniteness
Singular nouns denote not only a separate object, but also objects in general – a
category of objects. One of the means of individualizing an object – distinguishing
a single object from a set of homogeneous objects – is the post-positive (enclitic)
indefinite or, more precisely, the excretory article:
1) this article indicates the singularity and specificity of the object
2) in complex sentences, the article is attached to the noun, followed by the
attributive subordinate clause related to it: in this case the function of the
article is purely excretory, as it highlights a separate object (or phenomenon)
in order to further reveal some of its features in the subordinate clause, and
thereby concretize it (Efimov et al. 1982: 110).
The numeral “one” (T yak) can also be used to denote indeterminacy and unspecified singularity, sometimes with the simultaneous attachment to the noun of a
post-positive article (Efimov et al. 1982: 111).
Western Iranian vernaculars express (in)definiteness mainly by the postpositive excretory article, while in Eastern Iranian vernaculars the dominant means of
expressing definiteness is the demonstrative and/or the numeral “one” (Gadilia
2019: 130). However, demonstrative pronouns are widely used in this function in
colloquial Tehrani Persian (Pejsikov 1960: 44).
4.1.3.1 Definiteness
In Badakhshani Tajik definiteness is expressed by the definite article -e/i:
(56) mardak-i  kale      budast
     man-iz    bald.art  be.prf.3.sg
     ‘he was the bald man’
(Rozenfel’d 1971: 13)

(57) puloy-i       ki    ba  u    doda       bədam,       na
     money.pl-art  that  to  him  give.ptcp  be.PPFV.1sg  no
     kam-aš        bist  sum    bud
     less-dem.3sg  20    ruble  be.pst.
     ‘the money that I gave him was no less than 20 rubles’ (Rozenfel’d 1971: 44)

Definiteness may also be expressed with the numeral one, depending on the context:
(58) yak  mardək   xat-a        xub   medona,            Kiyobekov
     one  man.suf  writing-acc  good  pref.know.prs.3sg  Kiyobekov
     ‘one man (this one) is very educated, Kiyobekov’
(Rozenfel’d 1971: 59)
In all Southern and South-Eastern dialects, as in literary Tajik, definiteness can
be expressed by the postposition -ro/a, -a, as it marks the direct object, e.g.,
(59) duǧ-a       kašidan   az    širi     xomukí
     yogurt-acc  pull.inf  from  milk-iz  sour
     ‘(this) yogurt is prepared from raw sour milk’
(Rozenfel’d 1971: 13)

(60) TV duxtar-a  bin,      čodar-a    tori    sar   karast
        girl-acc  look.imp  shawl-acc  top-iz  head  do.prf.3sg
        ‘look at the girl, she pulled the shawl over her head’
(Rozenfel’d 1964: 34)
In all Southern Tajik dialects, in order to perform the function of the definite
article, -ro/a is often used next to a word with a modifier expressed by a demonstrative pronoun:
(61) padaro-i      ino-ro    mešinosam
     father.pl-iz  they-acc  pref.know.prs.1sg
     ‘I know their fathers’
(Murvatov 2013: 68)

In addition, definiteness can be formally expressed by preposing forms of the demonstrative pronoun; (am)i(n) ‘this’ or (am)o/u(n) ‘that’ serve as definite articles:
(62) a) in mard ‘this (very) man’ – un zan ‘that (very) woman’
b) in ra ‘this road’ – un ra ‘that road’
In Shughnani definiteness is formally expressed by forms of the remote demonstrative pronoun, including their corresponding oblique and plural forms; these
effectively serve as definite articles:
(63) a) yu ǧiðā
‘that boy’
b) wev ǧāts͡en
‘those girls’

4.1.3.2 Indefiniteness
Indefiniteness is formally expressed, for the most part, by the cardinal numeral
ya(k) ‘one’, serving as an indefinite article:
(64) a) dar  yak  dašt
        in   one  plateau
        ‘in one plateau’
     b) yak  mardək   bud. . .
        one  man.suf  be.pst.3sg
        ‘there was one man. . .’
Such a construction can be used in traditional expressions, for example in a riddle:
(65) TB yak  čizi   meravad,         meravad,         soya    nadorad
        one  thing  pref.go.prs.3sg  pref.go.prs.3sg  shadow  neg.have.prs.3sg
        ‘Something that goes ahead, but doesn’t have a shadow’ (=roh ‘a road’)
(Rozenfel’d 1971: 45)
This numeral can also convey a single unspecified period of time:
(66) a) yak ruz
‘one day’
b) yak čand ruz
‘several days’
c) yak  sot   čaq-čaq    kənim
   one  hour  talk-talk  do.prs.1pl
   ‘for some time we will talk’
(Rozenfel’d 1971: 55)

Moreover, in Badakhshani and Vanji Tajik, the indefinite particle -e/-i is documented:
(67) sang-i     as    bolo   omad
     stone-art  from  above  come.pst.3sg
     ‘A stone fell from above’
(Rozenfel’d 1971: 13)

(68) TV kam-e    namak  ba    oš    bəpereš
        bit-art  salt   into  soup  be- imp.2sg
        ‘Put a bit of salt into the soup’
(Rozenfel’d 1964: 52)

The construction yak. . . e is commonly used in all Southern and South-Eastern
Tajik dialects, and is increasingly widespread, while the use of the indefinite
article -e/-i is fading (Murvatov 2013: 68). Since the mid-20th century this indefinite article has become more a feature of folklore narratives; data on colloquial
speech collected since the late 20th century verify this shift.
For instance, in colloquial speech, we find an example where indefiniteness
is marked by both an article and the numeral ‘one’:
(69) TB yak  mardak-i     ino   omadagist
        one  man.suf-art  they  come.prfII.3sg
        ‘one man of theirs came’
(Rozenfel’d 1971: 46)

Indefiniteness in Shughnani is formally expressed in the same way; the numeral
yīw ‘one’, usually in its reduced form yi, represents the indefinite article. Quite
often in folklore, fairy tales begin with the phrase:
(70) tar  yi   taraf,  tar  yi   çār. . .
     in   one  place   in   one  town
     ‘in a country, in a town’

4.2 Adjectives
In Southern and South-Eastern dialects of Tajik the comparative of qualitative
adjectives is marked by -tar:
(71) TV varavtar
‘younger (son/daughter)’
In addition, comparison may be differentiated by degree. In Vanji Tajik, we find
the suffix -ak, indicating a lesser degree of the property – or quality – of one
object in comparison with another. This is widely used by speakers in Vanj valley:
(72) xurd ‘small’ – xurdtar ‘smaller’ – xurdtarak ‘even smaller’

Comparative meanings can be formed by combining a preposition (az/ay ‘from’)
with the comparative form -tar:
(73) TV in   cašmon-aš       ay    man  tangtar-ay
        her  eye.pl-dem.3sg  from  me   narrow.comp-cop.3sg
        ‘her eyes are narrower than mine’
(Rozenfel’d 1964: 49)
The comparative can also be expressed by combining the preposition ay and
postposition -(ən)da with or without the comparative form -tar:
(74) TV (Vanj) ay    hama-nda  šərin-ay-da
               from  all-post  sweet-cop.3sg-fp
               ‘this one (a small child) is sweeter than anyone else’
(Xorkašev 2014a: 194)
Another means by which comparison may be differentiated by degree is reduplication:
(75) dəroz-dəroz
‘very long’
For instance, the supernatural being almasti is described as having sinahoui
daroz-daroz ‘very long breasts’ (see vocabulary under the entry dev [Rozenfel’d
1971: 60]).
Adjectives signifying intensification of quality are sometimes used in folklore
and riddles. Intensification of quality is formed by two adjectives separated by a
conjunction. One example is a Badakhshani Tajik riddle documented by Rozenfel’d, involving encoding of the notion of a road, where a form using two synonymous adjectives denotes the idea of distance:
(76) šutur-i   gardandaroz  meravad          dur-u-daroz
     camel-iz  neck.long    pref.go.prs.3sg  far-and-long
     ‘a camel with a long neck goes a very long way’ (roh = a road)
(Rozenfel’d 1971: 37)
In its form today, the first adjective is repeated twice:
(77) šutur-i   gardandaroz  dur-u-dur    meravad
     camel-iz  neck.long    far-and-far  pref.go.prs.3sg
     ‘a camel with a long neck goes a very long way’

Some words used as adverbs of time represent a type of word formation ending in
-tar (peštar ‘before, in earlier times’), and are also present with the addition of the
plural number suffix -o, such as TBV peštaro ‘in earlier (times)’.
The superlative expressed by -tarin occurs quite frequently in Badakhshani
Tajik, especially in ritual and folklore-oriented texts:
(78) nazdiktarin qaum
‘closest relatives’
In addition, when comparison concerns the highest degree of quality, we observe
widespread use of a descriptive model incorporating az/y hama ‘from all’ – and
sometimes also the comparative form with -tar:
(79) a) TVD az hama lum(b)tar
‘most solid, huge, most of all’,
b) TVD az hama lumb
‘most of all, very many’
The highest intensity of quality is also expressed by full or partial reduplication:
(80) səp-safed, safed-safed
‘very/most white’
Decreasing intensity can be represented through the suffixes -rang, -guna,
-(ča)tob/w, -lem:
(81) a) savz-rang, savz-guna, savz-čatob/w
‘less green’
b) TQ-V sərxlem(b)
‘less red’
A full set of literary Tajik expressions of colour is present in the speech of educated people in Badakhshan, such as:
(82) safedčatob, safedrang
‘less white’

To express intensity of quality or the elative, Badakhshani Tajik widely
uses the word sof, either as an adjective with the meaning ‘clean, clear, transparent; righteous’ or as an adverb meaning ‘completely, absolutely’:
(83) sof qoq
‘completely dry’
Examples of its adjectival function include:
(84) zaboni sof, toza
‘clear, pure language’
It is also used in Tajik dialects of Rogh and in some southern Kulabi idioms:
(85) rišaš sof siya
‘his beard is completely black’
(Murvatov 2013: 73)
The adverbial pur ‘(very) much, full’ performs a similar function in Darvozi and
Vanji Tajik (cf. Sh pur ‘a lot’):
(86) pur xunuk
‘very cold’

Many similar forms and expressions are used in Shughnani. Here, the marker
-di is used to denote comparison. When duplicated, it is used to differentiate a
higher, more intensive level: -di-di ‘much more’. A way of denoting even higher
intensity is -dar-di ‘some, yet more’, e.g.,
(87) a) luk-di
‘fuller (with water), larger, more ripe (about cereals)’
b) luk-didi, luk-dardi
‘even fuller, larger, riper’.
Additionally, in Shughnani, an increasing level of quality is expressed by the suffixes -nak and -(y)akí:
(88) kaltanakdí, kaltanakí
‘very big’

The same level of quality is also identified by full or partial duplication:
(89) tɛr-tɛr(ak), tɛr-kištɛr
‘very black’
Decreasing intensity can be represented in Shughnani by the components -rāng
and -gunā, functioning as suffixes:
(90) a) sāvdz͡-rāng
‘not very green’
b) dusgunā
‘not very big’
The elative – the absolute highest degree of quality – is expressed in Shughnani
by the adverb lap ‘very’, e.g.,
(91) lap zūr ǧāts͡
‘great(est) girl’
The superlative is indicated by expressions such as sar ‘top, over’, as fuk ‘than
all’, and bar fuk ‘upon all’, added to the comparative form, e.g.,
(92) a) sar zūr, as fuk zūr-di
‘best of all’
b) bar fuk zūr-di
‘most great, greatest’
To express intensity of quality, the word sof is as widely used in Shughnani as in
Badakhshani Tajik, both as an adjective, with the meaning ‘clean, clear, transparent; righteous’, and as an adverb meaning ‘completely, absolutely’. This denotes
an elative, high degree of quality:
(93) sof bewʊç
‘completely crazy’

5 Numerals and numerical words
In Badakhshani Tajik cardinal numerals differ from literary Tajik counterparts
mainly in their pronunciation, for example:
(94) a) aft ‘7’ < T haft
b) ašt ‘8’ < T hašt
c) čor ‘4’ < T čahor
d) čil ‘40’ < T čihil
e) azor ‘1000’ < T hazor
f) senzda ‘13’ < T sezdah

Some traditional forms of counting have been preserved in various vernaculars
of the region.
In some highland areas, rudiments of the vigesimal (base-twenty) system of
counting,18 based on counting in twenties starting from the number
twenty, can still be traced. In Vanj, this system was documented in the mid-20th
century by Rozenfel’d:
(95) men  se     bist,   doimo   kasal
     I    three  twenty  always  ill
     ‘I am three times twenty (=60 years old), all the time ill’
(Rozenfel’d 1964: 33)
In several local speech varieties in Badakhshan, counting in twenties was recorded
for diverse agricultural tasks, for instance counting sheaves of hay, where the
term yi bist denoted 20 sheaves and yi bor ‘a measure equivalent to 10 sheaves’. The
same term (yi) bor is used with the meaning of ‘load (that can be carried by a
man or animal), a bundle of brushwood’. Another traditional measure of counting in twenties is paymonayi soǧi ‘equalling 20 caps’ (Rozenfel’d 1982). This type
of counting has also been confirmed in Tajik in the Qarategin (Rasht) area and
in Khatlon (Murvatov 2013: 105), as well as in the northern part of Tajikistan and
adjacent regions (Rastorgueva 1964: 71).

18 A similar system of counting was documented in the Caucasian Tat language (Grjunberg
1963). For a related type of counting in Balochi, see Korn (2006: 201–212).

This system of counting in twenties beginning with “twenty” was also widespread in the Wakhi, Munji and Yazghulami languages (Paxalina 1975: 235;
Édel’man 1966: 33; 1975):
(96) “2x20 and 10 and 5”:
a) W   bu   bist-ə(t)   ðas-ə(t)  pandz͡
       two  twenty-and  ten-and   five
b) M   lu   wisto   los  pānj
       two  twenty  ten  five
c) Yaz bow  wast-a      ðʊs-a    penj
       two  twenty-and  ten-and  five
‘two-twenty (and) ten (and) five, 55’
In Tajikistan, as documented by Steblin-Kamensky (1999: 97) in the late 20th
century, this type of counting was nearly extinct, although “the older generation still remembers it”. However, it continued to be used for counting animals
by herdsmen and pastoral Wakhi people in the Tajikistan highlands, mainly in
Wakhan, as well as in Afghanistan, Pakistan, and China.
Elements of the vigesimal system were recorded by Sh.A. Badaxši (1960: 10, 29)
in Ishkashimi-Sanglechi:
(97) a) rā-wīšt
three-twenty
‘three-twenty, 60’
b) rā-wīšt-das
three-twenty-ten
‘three-twenty-ten, 70’
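The arithmetic behind the vigesimal expressions in (96) and (97) can be made explicit with a short sketch. The Python code below is an illustration only and is not taken from the cited descriptions: the function names `vigesimal_parts` and `vigesimal_expression`, and the restriction to the values 20 to 199, are assumptions introduced here. It splits a number into twenties, an optional ten, and units, as in 55 = 2 × 20 + 10 + 5.

```python
def vigesimal_parts(n: int) -> tuple[int, int, int]:
    """Split n into (twenties, tens, units) so that
    n == 20 * twenties + 10 * tens + units.

    Hypothetical helper for illustration; the range 20..199 is an assumption.
    """
    if not 20 <= n <= 199:
        raise ValueError("this sketch only covers 20..199")
    twenties, rest = divmod(n, 20)  # number of full twenties
    tens, units = divmod(rest, 10)  # rest < 20, so tens is 0 or 1
    return twenties, tens, units


def vigesimal_expression(n: int) -> str:
    """Render the decomposition in the style of the heading of (96)."""
    twenties, tens, units = vigesimal_parts(n)
    parts = [f"{twenties}x20"]
    if tens:
        parts.append("10")
    if units:
        parts.append(str(units))
    return " and ".join(parts)


print(vigesimal_expression(55))  # 2x20 and 10 and 5
print(vigesimal_expression(60))  # 3x20
print(vigesimal_expression(70))  # 3x20 and 10
```

The three calls correspond to the values glossed in (96), (97a) and (97b); the sketch models only the arithmetic, not the word forms of any of the languages cited.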
Today, this system of counting, which represented a specific areal feature, has
virtually disappeared.
Another traditional method of counting is still present in Shughnani. Occasionally, incomplete tens in Shughnani were expressed by subtraction from the next higher ten, using a descriptive
construction with kam ‘less’. This method was documented in the speech of older people
in Shughnani (Karamšoev 1963: 141; Edelman and Dodykhudoeva 2009: 797):
(98) ðu kam wūvd ðīs
two less seven ten
‘(he is) 68 years old (lit. two less than seven tens)’

A similar model is used in Badakhshani Tajik:
(99) du kam aftod
two less seventy
‘68 years old (lit. two less than seventy)’
Even today, the following can be heard from the elderly when they talk about
themselves, or their grandchildren:
(100) mu  senusol  se     kam   navad-ast
      my  age      three  less  ninety-COP.3sg
      ‘my age is 87 (lit. three minus ninety)’
Similar constructions were recorded in adjacent regions (TQ and TQ-V):
(101) mən  də   kam   šast   raftəm
      I    two  less  sixty  go.PRS.1sg
      ‘I am 58 (lit. two minus sixty)’
(Murvatov 2013: 105)

In modern Tajik, this construction is used to identify time before the hour:
(102) a) soat  ponzdah(to)   kam   čor
         hour  fifteen.NUMP  less  four
         ‘quarter to four (3:45)’
      b) dah  kam   du
         ten  less  two
         ‘ten to two’ (1:50)
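The subtractive pattern with kam ‘less’ in (98) to (102) reduces to one operation: a value is stated as a deficit from a round reference point. The following Python sketch is illustrative only; the function names `subtractive_parts` and `clock_before_hour`, the default base of ten, and the 12-hour clock are assumptions made here, not part of the cited sources.

```python
def subtractive_parts(n: int, base: int = 10) -> tuple[int, int]:
    """Return (deficit, upper) with n == upper - deficit, where upper is
    the smallest multiple of `base` that is not below n.

    Hypothetical helper; base 10 mirrors 'two less seventy' = 68.
    """
    if n <= 0 or base <= 0:
        raise ValueError("n and base must be positive")
    deficit = (-n) % base  # distance up to the next multiple of base
    return deficit, n + deficit


def clock_before_hour(minutes_less: int, hour: int) -> str:
    """Convert 'minutes_less less hour' into a clock time (12-hour clock assumed)."""
    if not (0 < minutes_less < 60 and 1 <= hour <= 12):
        raise ValueError("expected 1..59 minutes and an hour in 1..12")
    previous_hour = (hour - 2) % 12 + 1  # the hour before, wrapping 1 -> 12
    return f"{previous_hour}:{60 - minutes_less:02d}"


print(subtractive_parts(68))     # (2, 70)
print(subtractive_parts(87))     # (3, 90)
print(subtractive_parts(58))     # (2, 60)
print(clock_before_hour(15, 4))  # 3:45
print(clock_before_hour(10, 2))  # 1:50
```

The age examples and the clock examples differ only in the reference point: the next ten in the first case, the coming full hour in the second.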
A notable expression employing this traditional way of counting can be found in
Sotim Uluǧzoda’s novel The morning of our youth (Subhi javonii mo):
(103) Asp    yak  bist    qadam  ba  daryo  rafta    bud
      horse  one  twenty  step   to  river  go.PTCP  be.PPFV.3sg
      ‘The horse had gone about twenty paces into the river (when cold water
      reached my feet and got into my boots)’.
In this case, the expression yak bist qadam ‘about twenty paces’ draws on a traditional unit of counting, a means of conveying short or inexact distance.

In Shughnani, another similar archaic construction for counting was also
documented, in this case to convey age:
(104) as    aftod-um     ts͡avor  zoçt
      from  seventy-3sg  four    take.pst
      ‘I am 66 (lit. I’m 4 away from 70)’
(Karamšoev 1963: 141)

(105) wuz-um  as    sad      yīw  zoçt
      I-1sg   from  hundred  one  take.pst
      ‘I am 99 (lit. I’m 1 away from 100)’
Baxtibekov (1979: 21) documents the Shughnani and Tajik expressions denoting
99 years old by the more traditional method of adding 90 + 9:
(106) a) Sh wuz-um     now   ðīs-at   now   salā
            I-cop.1sg  nine  ten-and  nine  year
            ‘I am 99 (lit. I am nine-times-ten and nine, 99)’
      b) TB man  navad-u     nuh   sola  hastam
            I    ninety-and  nine  year  be.cop.1sg
            ‘I am 99 (lit. I am ninety and nine, 99)’
Some figures demonstrate sacred features connected with various taboos in the
everyday life of agricultural people. Such features are present in the following
figures:
(107) a) du ‘2’
b) čor ‘4’
c) aft ‘7’
This aspect can be illustrated in Badakhshani Tajik:
(108) afti šikastan
‘to break the prohibition on coming to the summer pasture on the seventh
day’
In all Tajik dialects of the region, the following numerals are used to mean ‘many’:
(109) a) čil ‘40’
b) azor ‘1000’
c) sad ‘100’

The numeral čil ‘40’ is used in this sense in some local place names in Badakhshan:
(110) a) Čildara
lit. ‘forty ravines’
b) Čilbed
lit. ‘forty willows’
It is also widely employed, in many cases, in a supernatural sense to denote a
group of revered elders, or to identify part of a season, or a consecrated forty-day
period of fasting, mourning, recreation, etc:
(111) Čiltan-i         pok
      forty.person-IZ  holy
      lit. ‘forty pure (holy) persons’

5.1 Numerical words
In Southern and South-Eastern Tajik dialects the numeric particle -ta/o (T -to) is
used:
(112) dahta     diga     bəgu,    mešava               bist
      ten.NUMP  another  say.imp  pref.become.prs.3sg  twenty
      ‘recite another 10 (quatrains), there will be 20’
(Rozenfel’d 1971: 48)
There exists an early construction found in TV yak. . . (di)ga(r):
(113) yak sahari ga
‘one more (lit. another) morning’

(Rozenfel’d 1964: 78)

A similar construction is also represented in Shughnani, i.e., yi-ga, (yi)-(di)ga
‘another one’:
(114) yi-ga mis yat
‘another one also came’

5.2 Expressions of quantity
To denote the concept of quantity, we detail below a set of terms used semantically to define a large, or small, amount or portion of an undefined number.
The following terms are used to indicate a large amount of something:
(115) TBVD pə/ur, T pur (Sh id.)
‘many, full, filled’
A term widely used in Shughnani is:
(116) a) lap
‘a lot, many, big’
b) Bj lumba
‘fat, huge, big; many’
This term is also employed in other Pamir languages, cf. Ishk lip ‘many, full’ and
W lup ‘large, big, adult, a lot’, and in TD ləmb ‘huge, big; many’ (✶4lap- ‘big, a lot,
very’) (Édel’man 2015: 74).
A small amount of something can be defined in the following terms:
(117) a) TBVD andak
‘a little bit’ (Sh id.)
b) Southern Tajik dialects andak-mundak
‘a tiny bit (of something)’
In Badakhshani Tajik, words used sometimes with a duplicated suffix belong to
this category, for instance:
(118) ila, ilak(ak)
‘a little, just a bit’
In Shughnani, the notion of a small amount is denoted by the expression yi lāv ‘a
bit, a little, portion, part’. This term is also present in Ishk uk-lav ‘a little; portion’,
and W yi lav ‘a piece’ (cf. also TVD lav ‘portion, piece, edge’, TK lafka ‘piece,
slice’).

5.3 Metrological vocabulary and words used for counting
Combinations of numerals with words of quantitative and measurable meaning
are widespread in Southern and South-Eastern Tajik dialects. These include classifiers used in the construction “Number + classifier + noun denoting the object
being counted”.
The following words are used as classifiers in Badakhshani Tajik.
For people, animals, and countable objects:
(119) a) nafar ‘person, man’ (Sh nafar)
b) sar ‘head’ (Sh sar)
c) tan ‘person’ (Sh tanā)
d) dona ‘grain’ (Sh dʊnā)
e) tuda, gala ‘group, flock (of animals)’ (Sh galā)19

The term sela ‘herd of animals, horses, flock of birds; many’ is widely used in
other Southern dialects, including with duplication, as in sela-sela ‘many, crowd’.
In Shughnani, there is also a group of broader somatic terms that are used
as numerative words, such as: tan ‘set of clothes’, or when denoting a portion of
food: ziv ‘tongue, one portion’, ǧɛv ‘mouthful, one portion’.
Traditional methods existed in the region to count the weight of grain and
liquids, and to measure length. The words for these methods are still preserved in
the memory of older people, and in folk sayings:
(120) a) pud/t (Sh put)
‘put, traditional measure for weighing (grain)’
b) man (Sh man)
‘man, traditional measure for weighing (grain)’
c) ser (Sh sɛr ‘measure of grain about 1 sir = 10 kilos’)
‘ser, traditional measure for weighing (grain)’
d) ambun (Sh ambʊn)
‘traditional measure for weighing (grain); unit for measuring land’
The most widespread term for measuring length in the region is gaz:
(121) a) gaz (Sh gāz)
b) čub-gaz (Sh čub-gāz, čūv-gāz)
c) gilim-gaz (Sh gilim-gāz)
‘a variable measure of length of material’

19 For examples of the word gala and some others, see also section 4.1.2.1 Collective nouns.

This can be compared with the Darvozi Tajik measure for local woven material:
(122) karbos
‘one piece of local woven material’

(TKD 1966, 1: 302)

5.4 Traditional units of measurement referring
to the human body
In traditional measurement, due to the prevalence of barter exchange, words
denoting everyday objects evolved in terms of their value in the social, cultural
and economic life of the region. The metrological vocabulary and examples
typical for Badakhshan are highlighted below, starting with vocabulary for measurement of length and distance referring to the human body.
One measure for the height and stature of a human being or animal, as
well as for a length of material, is qad ‘height’ (Sh qād). This term is used in
compound verbs and in expressions denoting the process of growing up (about
a child):
(123) a) qad kardan
‘to grow, ripen (about plants, cereals)’
b) qadu qomat kardan
‘to grow up (about a child)’

There is also the saying:
(124) qad-i      past,  unar-i    mast
      height-iz  low    skill-iz  capable
      ‘small in stature, but a jack of all trades’
This word is also used in the numerative (yak) qad:
(125) yak qad paridan, qad-qad paridan (Sh yi qād andīdow)
‘to shudder, cringe (out of fear)’

In TBDV qadi can be used in an adverbial sense ‘approximately; this much, so
much; along; similar’:
(126) qad-i       man  meduna
      similar-iz  me   pref.know.prs.3sg
      ‘(he) knows as much as me’
(Rozenfel’d 1971: 126)

Some other units of measurement are expressed by words for spans of various
sizes, small and large. Other units of measurement denoted by parts of the
body include:
(127) a) qəloč
‘the distance between fingertips, with arms outstretched’ (Sh qiloč,
Yazgh qaloč)
b) v/wajab, wu/əlčak
‘span, measure of length equal to the distance between the little finger
and thumb’
c) Sh wiðɛd, wajab
‘span’ (cf. wulčak ‘measure, measuring length with a twig’)
d) angəšt (Sh angiçt)
‘finger, measure of length equal to one finger’
e) olčin (Sh olčin, W arət)
‘arshin (length of the arm from elbow to the end of the index finger)’
(cf. Russian aršin ‘measure of length equal to 0.71 m’)
f) qadam (Sh qadam, Rush wiyow)
‘foot, measure of length for land, when building a house, etc.’

g) Sh pīð(joy) (Rush pay)
‘foot, measure of length used in a game’

5.5 Words for measurement connected with traditional
household terms
In traditional agricultural life, different types of containers for liquid or bulk
solids, such as vessels or sacks, were used as everyday units of measurement.

We come across terms of everyday clothing – or household utensils – used as
measures:
(128) a) toqi(n) (Sh toqi)
‘traditional cap used as a measure’
b) bun (Sh id.)
‘root’, numerative when counting plants
c) čumča, čubča (Sh čib)
‘spoon’
d) Sh tanob
‘rope’ (unit for measuring length)
Another group of everyday objects traditionally employed in households, such
as sacks or containers, were used also as measures:
(129) a) kulv/wor, TVD kudvol, qudvol (Sh kilwor)
‘sack, goatskin’
b) qap (Sh qāp ‘woolen sack’, cf. also Sh būjīn ‘sack, a lot’)
‘sack’, cf. T qob ‘case for small objects’, qopča ‘sack’
c) balovdun, TVD bo/əloǧdūn (W bəloǧund ‘tobacco box’, cf. Sh bi/ulmuč
‘a pinch of chewing tobacco’)
‘wooden bucket for bulk solids, used to measure grain’
(Steblin-Kamensky 1999: 108)
d) seri (Sh sɛrak ‘vessel to measure grain of 1 sir = 10 kilos’, sɛr ‘vessel to
measure grain of 10 kilos’)
‘vessel used to measure grain’
e) pemona (Sh pemʊnā ‘wooden vessel’)
‘yardstick, measuring vessel’
f) čen(ak)
‘measuring vessel’

g) TVD čenak
‘measure of gunpowder, enough for one shot’
h) čen
‘measuring device, period of time, season’


i) nišun (Sh niçʊnā ‘sign, mark’)
‘yardstick for measuring milk’
j) TD archaic expression ovozras, T čeni ovoz
‘measure of length equal to the distance a cry is heard in the mountains’

6 Word formation
Nouns occur as bare stems, as stems with productive derivational affixes, and as stems with affixes that were formed in the past and are identifiable only through historical analysis. The basic means of noun formation are derivation and compounding.
The inventory of derivational suffixes in Vanji Tajik is largely based on the
preceding language of the valley – Old Vanji. Consequently, this inventory is rather
close to that of the Yazghulami and Shughnani-Rushani group of languages, mirroring their dynamics of development.
There is also subsequent convergence of these elements between Badakhshani
Tajik and Shughnani, and to a lesser extent between Ishkashimi and Wakhi.

6.1 Derivation
Derivation by means of affixes is by suffixation in most cases; some of these suffixes are closely related in Tajik and Shughnani. Prefixes and suffixes, as a means
of nominal word formation, are not strictly distributed between parts of speech.
Some suffixes are specific to nouns, some to adjectives, and some attach to both.

6.1.1 Prefixation
Prefixation is not widely used; in Shughnani prefixes are mostly loaned from
Tajik.
Prefixes are rare and mainly form adjectives. The only exception is the prefix
(h)am- which is quite productive in deriving nouns:
(130) TDV amǧəl/amliǧ
‘peer, person of a similar age’


Other prefixes produce mainly adjectives from nouns (or from other adjectives).
These include the prefixes ba- ‘with’, bar- ‘on, along’, and be- or no- ‘without’ +
noun > adjective, as in:
(131) a) bauš (Sh bawuç)
‘clever’
b) bardam (Sh id.)
‘with health, healthy’
c) bekola (Sh bekolā)
‘without qalym (property or money given by a husband to his wife on
their marriage)’
d) nobakor (Sh id.)
‘incapable, idler’

6.1.2 Suffixation
The main suffixes of noun formation are as follows:
The productive suffix -ə/i/uk, -ak forms diminutives indicating affection, or
gives words new meaning by providing additional specification:
(132) a) nəmolak (Sh id.)
‘kerchief; handkerchief’
b) čišmuk
‘sieve hole’
c) čišmok (Sh cemak)
‘bead’, lit. ‘a small eye’;

-ok, -ak forms nouns and adjectives from the present stems of verbs:
(133) a) dargirok (Sh id.)
‘a person holding the door during wedding ceremonies’ (dar ‘door’ +
giriftan ‘to hold’ + ok)
b) lesak
‘stuffed calf’, lesidan ‘to lick’
(cf. Sh lesak ‘thick slop for animals’)


In Vanji Tajik, -vara forms abstract nouns, and is not productive:
(134) TV bačavara
‘having many children; relatives of the groom’;
Vanji Tajik -vor is a suffix denoting possession of some attribute:
(135) TV kučavor
‘family man’
-ina (Sh -inā) forms a range of subjects, and is not productive:
(136) ganjina (Sh ganjinā)
‘treasury, milk storage place’
-akí forms abstract nouns, and is not productive:
(137) TVD šikorakí
‘hunter’
-gar (Sh -gār) forms an agent noun, and is no longer productive:
(138) alowgar (Sh alowgār)
‘stoker’
-ga, -go(h) (Sh -gā) forms nouns with the meaning of place, and is no longer productive, although it is represented in numerous place names:
(139) yelga (Sh yelgā)
‘summer pasture’
-lox denotes the meaning of a place, and has long been unproductive:
(140) devlox, TB also dewlox (Sh dewlox)
‘summer pasture’


-zor denotes a place, where something is in abundance; it is well known from the
classical language and is not productive:
(141) alafzor (Sh id.)
‘meadow’
-sor denotes place, and indicates abundance; it was preserved from the classical
language of an earlier period, and is not productive:
(142) kabudsor
‘grove, thickets’ (Sh woman’s name)
-č, (-č/j <✶-ači) denotes place, and is not productive:
(143) a) TV kərč, TB kər
‘cave, pothole; deep, arched’
b) TQ qarčin
‘lawn bordered with trees’
c) Sh kurc
‘pit, ravine; deep’
d) Yaz qarči
‘shady lawn under fruit trees’
-bo/un (Sh -bʊn) is especially productive in Badakhshani Tajik:
(144) galabo/un (Sh galabʊn)
‘herdsman (for a herd of cattle)’
Suffixes of adjectives:
The suffix -č is no longer productive:
(145) a) ǧəramč
‘mixed (food)’
b) Sh ǧirafč
‘grated mulberry’
c) parğəč
‘cockeyed, skew’


d) Sh parğečā
‘small bit (of stone)’
Cf. Shughnani suffix -ej marking the geographic origin of a person, as well as
denotation of intention or purpose, e.g.,
(146) a) bajuwej
‘from Bajuw’
b) pečakej
‘(woolen thread) intended for artificial plaits’
-in indicates adjectives describing materials (that something is made of), and
adjectives derived from nouns and participles, and is not highly productive:
(147) pustin (Sh pʊstīnak)
‘fur, sheepskin (coat)’
-gin indicates in Vanji Tajik adjectives denoting source, and is archaic:
(148) genagin
‘made of nuts’
-gnik indicates negative qualities in Vanji Tajik and is archaic. It occurs only in
several words:
(149) bugnik (Sh bʊygin)
‘smelly’
Suffixes of nouns and adjectives:
-í forms abstract nouns and relative adjectives, and is strongly productive (Sh -í):
(150) a) panjtaní (Sh id.)
‘Ismaili’ (lit. denoting followers of five (holy) persons)
b) TV potaxsí (Sh poytāxcí)
‘gift’
c) sav/bzí (Sh sāvdz͡í)
‘greenery, green space’


-yor forms agent nouns and adjectives:
(151) dastyor (Sh id.)
‘helper’

6.2 Noun compounding
6.2.1 Copulative compounds with equal elements
We identify below several types of copulative compound formations.
One group is formed by juxtaposition only, e.g.,
(152) xeš-tab/vor, T xešutaborī (Sh xeç-tabor)
‘relatives’
The following is formed by juxtaposition with simple duplication, e.g.,
(153) čaq-čaq (kardan) (Sh id.)
‘to talk’
A different group is formed by an alliterative concord with substitution of the
initial consonant of the second part of the word, mainly with m- or p-, e.g.,
(154) kola-mola
‘dowry’
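The substitution pattern just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an analysis taken from the source: the function name `echo_compound`, the vowel inventory `VOWELS`, and the treatment of vowel-initial words (the new onset is simply prepended) are hypothetical, and initial consonants are treated as single characters.

```python
# Minimal sketch of echo-compound formation of the kola-mola type.
# Assumptions (not from the source): single-character initial consonants,
# and vowel-initial words simply receive the new onset.
VOWELS = set("aeiouəāēīōūʊɛ")  # hypothetical, simplified vowel inventory


def echo_compound(word: str, onset: str = "m") -> str:
    """Copy `word` and replace the initial consonant of the copy with `onset`."""
    if not word:
        raise ValueError("word must be non-empty")
    # Drop the first character of the copy only if it is a consonant.
    rest = word if word[0].lower() in VOWELS else word[1:]
    return f"{word}-{onset}{rest}"


print(echo_compound("kola"))  # kola-mola, as in example (154)
```

The `onset` parameter defaults to m-; passing `"p"` models the other substitution mentioned above, although the resulting forms are mechanical outputs and are not claimed to be attested.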
We also encounter juxtaposition with connective elements: -o, u ‘and’ (Sh at ‘id.’),
e.g.,
(155) bača-vu-kača (cf. Sh bač-kač)
‘children; family’
Juxtaposition with connective elements and duplication is also noteworthy, e.g.,
(156) šav/b-o-šav/b, šab-dar-šab (Sh çab-o-çab)
‘late at night before dawn’


In addition, we find juxtaposition with the connective element ba- e.g.,
(157) dar-ba-xayol (Sh id.)
‘deep in thought’

6.2.2 Determinative compounds
The following examples are formed by the construction “noun+noun”:
(158) a) jamoat-xona (Sh jamoat-xʊnā)
‘congregational place for Ismailis’
b) xamir-mo(ya) (Sh xamir-mo)
‘dough starter’ (xamir ‘dough’ + mo(ya) ‘dough starter’)
Another group of terms is formed by the construction “adjective+noun”:
(159) a) zard-gulak
‘name of a plant’ (zard ‘yellow’ + gul ‘flower’ + suffix -ak)
b) Sh zīrd-gālak
‘dandelion’
The following example represents a group of terms formed by the construction
“numeral+noun”:
(160) čorxona (Sh čor-xʊnā)
‘ceiling frame in the house, the central part of the Pamir ceiling’ (čor ‘4’ +
xona ‘house’)
There also exists a group of terms formed by the construction “Noun + Verb Stem
or Agent Noun”:
(161) a) dar-basta (Sh dar-bastā)
‘death of all family males’ (dar ‘door’ + past part. basta, bastan ‘to
close’)
b) alowparak (Sh alow-parak)
‘ritual of jumping over fire’ (alow/v ‘fire’ + pres. stem par, paridan ‘to
fly, jump’ + suf. -ak)


Two groups of compounds are formed by the combination of three stems.
One subgroup represents the combination “Noun + Noun + Noun”:
(162) podaxovga (Sh podaxobgā)
‘cattle sleeping place’ (poda ‘herd’ + xov ‘sleep’+ ga ‘place’)
The second subgroup features the combination “Noun + Noun/Adjective+Verb
Stem”:
(163) balodurkunak (Sh balodurkunak)
‘trouble-averting talisman’ (balo ‘trouble’ + dur ‘far’+ kun, kardan ‘to do’,
suf. -ak)

7 Issues of vocabulary and phraseology
7.1 Kinship terms
Kinship terminology today is tending to become uniform throughout Tajikistan.
We observe a tendency for archaic terms known from classical literature, which
have been preserved in Southern Tajik dialects, to be introduced back into standard Tajik. More generally, all dialects tend to absorb the mainstream forms currently in use.
In Badakhshan, kinship terminology is to some extent indigenous, but
has been subject to several waves of change (Table 6). In the early 20th century,
kinship terminology began to reflect changes in social structure. In this period,
the extended family, common in Badakhshan and elsewhere as a functioning
economic unit, began to decline. Previously, several families closely related by
paternal lineage lived together, often under one roof, and under the leadership
of the father and mother. Children treated their uncles and aunts as their own
parents. There were collective terms for the older generation of relatives, such as:
(164) momo
‘a group of older generation of women in one family’
Originally, Iranian terms were identical for both uncles and aunts from both sides
of the family. Distinguishing terms for uncles and aunts, and even for parents,
were absent.


Pisarchik at the time documented the situation where, in large families, the
old term for uncle (in Rushani and Bartangi) was the same as that used for father,
and was still used in remote areas of Badakhshan (1953: 181–183):
(165) dōd
‘father, uncle’

The author mentions that in Yazghulam valley, as in the Wakhi language, there
was still only a single unified term for all uncles, both paternal and maternal.
However, in Shughnan – and in Ishkashim, adjacent to the area of Tajik Badakhshani speech varieties – this term already no longer applied to the mother’s
brother (Pisarchik 1953: 181–183).
This change in the Shughnani term can be explained by the influence of
Tajik, which underwent a more rapid transformation; in Tajik dialects, with the
fragmentation of families, all four Arabic terms were already in use by the first
decades of the 20th century:
(166) a) amak
‘uncle’ (father’s brother)
b) xolak
‘uncle’ (mother’s brother)
c) amma
‘aunt’ (father’s sister)
d) xola
‘aunt’ (mother’s sister)
Today, these terms are used throughout Tajikistan; in Tajik dialects, particularly
in most of Ishkashim district, the term taǧo supplanted the earlier (modar)xolak.
However, in Tajik of Wakhan, as well as in Wakhi and Shughnani, the term xolak is
still in use.
In addition, in Shughnani, the unified term for cousins is still preserved,
although in Tajik dialects the specific descriptive terms ‘daughter of the mother’s
sister’, and ‘son of the mother’s sister’, are in use today.
(167) a) Sh pitiš
‘cousin (brother or sister) from both sides of the family’
b) xola-duxtar
‘daughter of the mother’s sister’


c) xola-bača
‘son of the mother’s sister’
In Shughnani, a single term to designate nephews and nieces has also been preserved:
(168) xɛr
‘sibling’s offspring’
Traces of earlier terminology and special treatment of kinship groups can be seen
in a variety of specific suffixes still used in kinship terminology in Shughnani.
These suffixes can be applied to broader groups of terms denoting particularly
close social relationships (for kinship terms with specific plural suffixes, see
section 4.1.2. Number).
Some Iranian terms are preserved in Tajik and Pamir languages; however,
their use is not extensive. Among kinship terms, the original Iranian words for
“father” and “mother” are not widely used in Tajik dialects of MBAR and the
vicinity, except for one term for “father” preserved only in Vanji Tajik; in Badakhshani Tajik, both terms have been supplanted by nursery terms, or “baby talk”
(Table 5). However, reflexes of some of these Iranian terms are preserved and used
in the Shughnani variety of Bajuw, in the forms of pid ‘father’ and mod ‘mother’,
when addressing and also when indicating the person, as well as in Shughnani
proper, to define a genre of lament:
(169) a) Bj pid
‘father’
b) Bj mod
‘mother’
c) Sh dargīl-modik
‘lament, sad song of a mother’ (cf. TB dargil ‘sad, melancholic’)
In Tajik dialects, the word pisar was generally replaced by another term bača
‘young man, son’, both of Iranian origin. However, in the Shughnani term for ‘son’
we see a reflection of an earlier Iranian form ✶putra-; this is also reflected in another
Shughnani form -buts͡ (Édel’man 2009: 117–118). The Iranian stem was preserved in
Badakhshani and other adjacent Tajik dialects in the meaning of ‘son of the spouse’:
(170) a) TBVD bača
‘young man, son’,


b) T pisar, Sh puts͡
‘son’
c) Sh -buts͡
diminutive ‘youngling’
d) TBVD pisandar(a), pisarandar, Sh puts͡ej
‘stepson’
We observe an interesting phenomenon in the case of the term for ‘daughter’.
The original Iranian term was preserved in Tajik dialects, but was replaced in
Shughnani by the euphemism rizīn, a reflex of a form connected with the notion
“to be born” (✶zan-); the cognate of this word is present in Tajik in the form of
farzand ‘offspring, progeny, son’ (Édel’man 2009: 118):
(171) a) duxtar
‘girl, daughter’
b) Sh rizīn
‘girl, daughter’
c) T farzand
‘child, offspring, son’

A noteworthy term for ‘daughter-in-law’, derived from a Proto-Iranian term, was
preserved in both Tajik dialects, and Shughnani:
(172) TBVD si/ənhor, Sh zinaǧ < ✶snušā- (cf. also Russian snoxa) ‘daughter-in-law’.

Two other terms for ‘father-in-law’ and ‘mother-in-law’ were preserved in both
Tajik and Shughnani; the Shughnani terms probably evolved through contamination of the indigenous terms by Western Iranian ones (Édel’man 2009: 124, 125):
(173) a) xusur (Sh xisur)
‘father-in-law’
b) xušdoman, TV xuš (Sh xīç)
‘mother-in-law’


Some other notable terms have descriptive formations, such as the Shughnani
term for ‘wife of the brother’, originating at an earlier stage from ‘sister’ (xohar)
with an additional element.
(174) xi/ayʊn < xohar ‘sister’
‘wife of the brother’, and also ‘husband’s/wife’s sister’

Terms for relatives by marriage “brother-in-law”, “sister-in-law”, etc. are represented by derivatives: descriptive terms derived from basic kinship words related
to the extended family.
In Badakhshani Tajik, the specific term, borrowed also by Ishkashimi, for
‘brother-in-law, wife’s brother’, contains an element indicating ‘father-in-law’
(xəsur) with the implication that the brother in question is related to the father-in-law (Rastorgueva and Édel’man 2007: 488).
Another related term for ‘brother-in-law, wife’s brother’ xəsur-bača is similarly based on the father-in-law’s designation as a reference term for his relative.
In Shughnani, the term for a wife’s brother (xisīrdz͡) is founded on this same
conceptual scheme, denoting the idea of offspring of the father-in-law; a similar
scheme also applies to ‘husband’s brother’ and ‘sister’s husband’:
(175) a) TB, Ishk xəsur-bəra
‘brother-in-law, wife’s brother’
b) xəsur-bača
‘father-in-law’s son, offspring’
c) Sh xisīrdz͡
‘brother-in-law, wife’s brother’

There also exists in Shughnani a descriptive form, defining the wife of a husband’s brother living in an extended family. This word is derived from ✶hamakata- ‘living together in the same house’ (Morgenstierne 1974: 44; Édel’man 2009:
126–127). In addition, a Wakhi term with the same meaning is based on the notion
of living together in one house (Steblin-Kamensky 1999: 83):
(176) a) Sh mijād
‘wife of a husband’s brother’
b) W andarč
‘wife of a husband’s brother’ (lit. ‘woman belonging to the household’).


Because of the current trend whereby Iranian terminology is re-introduced into
literary language, the Tajik Southern dialect term has now been promoted for use
in literary Tajik, along with the long-accepted terms kelin and arus:
(177) sinhor
‘daughter-in-law’
Table 6: Kinship terminology.
Sh

tāt, dod, tat, do/adí
Bj pid
nān,
Bj mod

TV-B

TRogh

Tajik

English

bob/
va,
piyar

bəb/vo

dad,
boba

padar

father

oča

nəna

modar

mother

nana/i, jiya (m)ə/
uma

put͡s

bača

b/vača

pisar

son

rizīn

duxtar

dəxtar

duxtar

daughter

virod

bərodar,
biyor20

biyor

barodar

brother

aka

elder brother

biyor(i
kalon)

yax

nibos

biyori xərd/ biyorak dodar(ak) dodar, nənik (up to dodar,
mayda
uka
5 years old) uka/o

younger
brother

xuwar

apa

xowar,
apa

amšir (a)

elder sister

xwarək,
xwari xərd/
mayda

xuwar,
xorak

xohar

xuwar,
xorak

nənik (up to xohar
5 years old)

nabosa,
nabera

nəvosa nuvosa

nəvera nəwasa

apa, xohari
kalon

nabera

nibɛs

younger
sister
grandchild
grandchild
(girl)

aberā

(n)abosa,
natija

h
abasa, nəvasa
nabasa

abosa, navesa
av/
bera

abera

bob

bo/əbo

bəb/ví

bob/
voí

bobo(kalon) grandfather

bobí

wo, bobí

great
grandchild

20 In the Tajik subdialects of Ghoron and Ishkashim the term lol – borrowed from Ishkashimi – is
used as a term of address.


Table 6 (continued)
Sh

mūm

bib/ví

TV-B

TRogh

Tajik

English

boboi
kata

father’s
father

bobí

mother’s
father

bibí, (a)
ví, muma

momai wu, bə(bí)
kata

modarkalon grandmother

Members of the family commonly address each other by using kinship terms.
In Badakhshani Tajik, older members of the family use their own designation as
a term of endearment and affection when addressing younger members of the
family; the same was documented for Shughnani (Edelman and Dodykhudoeva
2009: 818).
In Badakhshani Tajik, several terms, denoting a group of blood relatives, are
used to designate a family community consisting of 3–4 generations of male relatives on the paternal side, together with their wives and children.
To indicate such an extended family, the term awlod is used:
(178) Qadamšo av/wlod
‘the whole family, lineage’
General terms for blood relatives are as follows:
(179) a) TBVD xeš-tabor or qav/wm(iyot) (Sh xeç-tabor, qawmiyot)
‘relative’
b) pušt (Sh puçt)
‘generation’
c) TVDQ kunda
‘kinship group’
d) TBD ziryot (Sh ziryot)
‘child(ren), offspring’
In terms of vocabulary, the head of the family, in accordance with the patriarchal principle, is the “senior” man (kad/txudo, katanaki xona, kalontari xona, TV
xovand < xudovand). Along with this older man, the older woman of the household also heads the family community (kadbonu, TV kay(vo)nu ‘mistress of the
house’), with responsibility for her own sphere of gender-specific domestic activity. These elders control the personal life of the family members, especially in the
choice of a marriage partner.

7.1.1 Construction indicating intention to marry
A construction expressing the intention to marry, and set up a household, is based
on compound verbs, where the nominal part is represented by a verbal adjective
derived from gir (the present stem of the verb giriftan ‘to take’), and by the suffix
-ə/am. This construction is documented in several Southern Tajik dialects, such
as Southern Kulabi, Roghi and Badakhshani, and in Vanji Tajik.21
(180) a) girəm šudan
‘to intend to marry, to intend to start a family (for man or woman)’
b) girəm budan
‘to intend to marry’
In Roghi dialects, intention to marry can similarly be expressed by a construction
with the nominal part gərəm/girəm. This, in combination with
the auxiliary verbs budan, šudan, istodan, signifies “I’m going to (marry)”; it is
used exclusively to express the intention to get a wife or husband (Bogorad 1956:
157, 188):
(181) girəm bəd-ak na-gərift
taking be.pst.3sg-suf neg-take.pst.3sg
‘he intended to marry me, but didn’t marry’ (Bogorad 1956: 157)

(182) u mard-ək u zanək-a girəm
that man-suf that woman.suf-acc taking
‘that man is going to marry that woman’ (Bogorad 1956: 159)

(183) mə tə-ro megirəm
I you-acc pref.take.prs.1sg
‘I intend to marry you’ (Bogorad 1956: 161)

21 This construction can be compared with a similar construction in the Northern Tajik dialects of Chust and Qasansay in which the compound verbs giron raftan ‘to take away’ and giron
omadan ‘to bring’ were documented (Rastorgueva 1964: 53).


Later, this construction was recorded in Badakhshani Tajik. In this dialect, the
presence of the prefix bə- in the positive form (bəgiram), and its absence
in the negative form (giram), permits the assumption that these forms may originally have been a present participle of giriftan:
(184) u giram na-bud
he taking neg-be.pst.3sg
‘He didn’t intend to get married’ (Rozenfel’d 1971: 42)

(185) man bəgirəm budam
I taking be.pst.1sg
‘I intended to marry, to get married, to have a family’ (Rozenfel’d 1971: 42)
This construction was documented also in the South-Eastern Tajik dialect of Vanj:
(186) bača-i Akbar ba in duxtar bəgirim ši
son-iz Akbar with this girl taking become.pst.3sg
‘son of Akbar intended to marry this girl’ (Rozenfel’d 1964: 14)
In modern Tajik, and particularly in Vanji Tajik and in Vakhiyo Qarategini dialects, we come across similar constructions:
(187) a) TV zan giriftan
‘to get married, to have a family’
b) TQ-V zan dodan
‘to marry someone’
This can be represented by the following example:
(188) zan-ša munda-st, boz zan-i diga gərifta-st
wife-dem.3sg leave.prf-3sg again wife-iz different take.prf.3sg
‘(he) left his wife and married another (one)’ (Xorkašev 2014a: 69)

7.1.2 Construction indicating intention to start a family
We find different, and widely used, expressions in Badakhshani Tajik to convey
the intention to start a family; these are based on the term ‘master of the house’:


(189) kad/txudo
‘master, host’, later ‘family man, married, welcoming host’
These expressions also include:
(190) a) kad/txudo šudan
‘to intend to marry, to get married, to start a family’
b) kad/txudo kardan
‘to marry off (one’s son or daughter)’
(191) imsol bača-ša kad/txudo kərd
this.year son-dem.3sg family do.pst.3sg
‘this year he married off his son’ (Rozenfel’d 1971: 61)

Another later term derived from this source, but no longer used, is the following:
(192) tuy-i kad/txudoí
reception-iz matrimonial.suf
‘wedding reception’
A similar construction is present in Shughnani, with the same main component
katxuðoy and the same verbs ‘to do, make’ (čīdow) and ‘to become’ (sittow), e.g.,
katxuðoy čīdow ‘to marry off one’s son’, and katxuðoy sittow ‘to get married, to
start a family’:
(193) asīd-um xoyiç čūɣj idi xu puts͡ katxuðoy kinum
this.year-1sg wish do.prf that my son family make.prs.1sg
‘This year I wanted my son to be married (to start a family)’

7.2 Phraseology
7.2.1 Expressions denoting birth of a child
In Tajik of MBAR and other adjacent dialects, we find different compounds used figuratively to designate “boy” and “girl”. Throughout Badakhshan, on the occasion of a
child’s birth, the father is congratulated with ritual expressions which vary depending on the child’s gender. These are oblique terms, where reference to the child is
made according to the traditional distribution of activities by gender in society. The
ritual phrase greeting the father of a newborn son is based on the terms ‘shooter’ or ‘hunter’:


(194) a) tirandoz, TV tirdoz
‘shooter’
b) TDQ-V šikorakí
‘hunter’
c) TK šikorčí (Ishk šikorči)
‘hunter’22
The following expressions are customarily used in some Southern dialects when
blessing a hunter before his hunt; all have the same meaning:
(195) a) məborak-i šikorakí
blessing-iz hunter
b) məborak-i šikorí
blessing-iz hunter
c) šikor muborak šavad
hunt blessed become.prs.3sg
‘(let your) hunt be blessed’
In the case of the birth of a son, the father is greeted by the following formula:
(196) Muborak-i tirandoz
blessing-iz shooter
‘Congratulations on (your future) shooter’
In Tajik of Darvoz, particularly in Qalay Xumb and vicinity, this formula contained another term denoting the same activity:
(197) Muborak-i šikorakí
blessing-iz hunter
‘Congratulations on (your future) hunter’

(TKD 1976, 3: 66)

22 In addition, the Cl Persian šikārāndan, šikardan ‘to hunt, chase, kill’ is noteworthy here; this
was preserved in the TV šikorūndan ‘to hunt’ (Rozenfel’d 1964: 15, 113).
Moreover, in everyday life, the notion of “hunter” was usually designated as TDQ-V mergan
‘hunter, shooter’ (lit. sharpshooter) or TB pala(w)on, Ishk palawon ‘hunter’ (T pahlavon ‘hero, a
man of immeasurable strength and courage’). A senior hunter was called TDQ-V pir, or ustoi šikor
‘master hunter, master-of-hunt’.


In the Darvoz village of Pšixarv (Pošxarv), this expression was documented in
a more complete version, and was included in a short prayer of thankfulness,
solemnly expressing gratitude for a new life:
(198) Muboraki-i mehmon-i nav, muboraki-i šikorčí, naxčir.kuš, Allahu akbar
blessing.suf-iz guest-iz new blessing.suf-iz hunter goat.killer God Greatest
‘Congratulations on a newcomer, congratulations on (a future) hunter,
who’ll kill mountain goats. God is great’ (TKD 1976, 3: 66)
The same formula is used in Badakhshani Tajik when a daughter is born, but with
a different reference to traditional female activity:
(199) Muborak-i alwo.paz
blessing-iz sweetmeats.cook
‘Congratulations on (your future) sweetmeats cook’
In the Darvoz village of Pšixarv (Pošxarv), we also find a more complete version
with reference to a newborn girl:
(200) Muboraki-i mehmon-i nav, muboraki-i halwopaz-ak
blessing.suf-iz guest-iz new blessing.suf-iz sweetmeats.cook-dim
‘Congratulations on a newcomer, congratulations on a sweetmeats cook’ (TKD 1976, 3: 66)
Throughout Badakhshan, in the main Pamir-language speaking areas, including
Shughnan, Ishkashim and Wakhan, this formula was borrowed from Tajik as a
fixed compound. In Shughnani, we illustrate the formula here with examples
emphasizing a direct personal greeting to the father:
(201) muborak-i tīrandoz tu-rd
blessing-iz shooter you-post
‘Congratulations on your (future) shooter’
(202) muborak-i alwopadz͡ tu-rd
blessing-iz sweetmeats.cook you-post
‘Congratulations on your (future) sweetmeats cook’


In Shughnani the actual words for boy or girl – Tīrandoz and Alwopadz͡ – are used
not only to denote the gender of the child, but also as their proper names.
In modern Vakhiyo-Qarategini Tajik dialect (but not in adjacent dialects), the
ritual expressions for greeting the father on the birth of a newborn – for boys
muboraki šikor(ak)í, and for girls muboraki havlopaz – are used not only once, on the
occasion of a birth, but also in everyday colloquial speech, with reference to a son
or daughter (Xorkašev 2014a: 63, 73):
(203) dəgoniko yata-š məborak-i šikorakí, yata-š məborak-i havlopaz-ai
twins.suf.pl one-dem.3sg blessing-iz hunter one-dem.3sg blessing-iz sweetmeats.cook-cop.3sg
‘One of (the pair of newborn twins) is a “boy”, another is a “girl”’ (Xorkašev 2014a: 73)
In Tajik Vakhiyo-Qarategini dialects, there exists another pair of terms for newborn girls and boys – duǧí ‘girl’ from duǧ ‘milk serum’ and yuǧí ‘boy’ from yuǧ
‘yoke’:
(204) zan-ət či kard duǧí-yay yo yuǧí
wife-dem.2sg what do.pst girl-3sg or boy
‘Did your wife give birth to a girl or a boy?’ (Xorkašev 2014a: 73)

These terms refer to traditional gender-associated activities in the highlands –
working in dairy farms for women, and cultivating the land with oxen (under a
yoke) for men.
In the modern world, where the roles of men and women are much less clearly
marked, the equivalent Tajik formula makes no distinction between genders.
(205) Mehmon-i nav muborak bošad
guest-iz new blessed be.prs.3sg
‘Congratulations on the newcomer (i.e., birth of a child)’

7.2.2 Greetings for Nawruz
Throughout Badakhshan, the celebration of Nawruz has special significance.
People greet each other as follows:


(206) Šogun-i nav muborak! Šogun-i boor muborak!
good.omen-iz new blessed good.omen-iz spring blessed
‘Let New Year be blessed! Blessing for the spring New Year!’
And the traditional answer is:
(207) Xudo muborak!
God blessed
‘God bless!’
Another more widely used variant of the New Year greeting is simply:
(208) Šogun muborak!
good.omen blessed
‘Let New Year be blessed!’

7.2.3 Shughnani phraseological units based on Tajik
In this section, the phraseological units in Shughnani represent partial or full
calques from Tajik.
The examples given here are shortened versions of traditional ritualized formulas associated with ancient mourning ceremonies and rituals of repentance,
mentioned in the Bible and the Qoran. Their application throughout history
ranges from the performance of ancient mourning rituals, to the figurative use of
these formulas in cursing and swear words.
In Tajik we find a composite with a literal and figurative meaning:
(209) siyohrū(y)
black.face
‘black-faced’, lit. ‘(with) black (siyo) face (rū(y))’, ‘disgraced’ (TRS 2006)

In Badakhshani Tajik, the following expression is present, also with both a literal
and a figurative meaning:
(210) ru-siyo(h)
face-black
(lit. ‘(with) face (ru) black (siyoh)’, ‘bad mood, feeling depressed’,
‘shameless, disgusting, harmful’).


In Shughnani, an expression with a similar meaning is based on indigenous language elements for ‘face’ and ‘black colour’. The following phraseological unit
keeps the same order of elements as in Badakhshani Tajik, with similar literal
and additional figurative meanings:
(211) pīts͡ tɛr
face black (lit. ‘(with) black (tɛr) face (pīts͡)’)
‘shameless, disgusting, harmful’, ‘condemned’, ‘censured with harsh
expressions and actions’.
This phraseological unit is used as a substitute for a stronger curse: in exclamatory sentences, it is a reproach, associated with the ancient mourning custom of
showing remorse by blackening one’s face and sprinkling ashes on one’s head on
the occasion of the death of a loved one. So, the implied sense is that “I wish you
were dead, and (that) your mother will mourn for you!”:
(212) tu nān pīts͡ tɛr
your mother face black
‘Shameless! (lit. your mother’s face is black)’
In Shughnani, this expression can be compared to a compound verb including
the adjective ‘black’ and the verb ‘to do’: tɛr čīdow ‘to make something black, to
give a black colour’, with the figurative sense ‘to darken mood, upset’:
(213) mu kurtā tɛr kinum
my dress black do.prs.1sg
‘I colour my dress black’.
Based on this verb a phraseological unit is formed:
(214) xu pīts͡ tɛr čīdow
refl face black do.inf
‘To tell lies, to have no shame; to blacken, besmirch somebody’
The verb itself, in this phraseological unit, can appear in the shortened form xu
pīc tɛr-um:
(215) Čīz xu pīts͡ tɛr-um
why refl face black-cop.1sg
‘What a sin to conceal’ (Mirzoev, Karamova 2014: 33)


This same unit is also found with the full form of the verb xu pīc tɛr čīdow ‘to do’:
(216) Čīz xu pīts͡ tɛr kinum
why refl face black do.prs.1sg
‘Why blacken my own face (i.e., blacken myself)?’
However, in this case the expression is perceived with the shift of meaning ‘to
deny, conceal (one’s sin)’.
This construction can be treated as a complex noun predicate in which the
component pīc tɛr can be considered a noun phrase as in the example below (a),
with the potential to be rearranged into a compound word (composite pīc-tɛr) (b):
(217) a) dāð xu pīt͡s tɛr kinen
they.d refl face black do.prs.3pl
‘They’re lying (blackening their own face)...’ (Karamšoev 1991: 439)
b) dāð xu pīt͡s-tɛr kinen
they.d refl face-black do.prs.3pl
‘They’re lying (they blacken themselves)’
7.2.3.1 Transformation of Tajik constructions in Shughnani
The formation of a new phraseological unit in the Shughnani language is based on
the merging of core components (face + black, in our example), and their lexicalization and contraction into a single compound, which can be treated as a substantive
or adjective, and which acts as a grammatical modifier in the noun phrase. The
meaning of the unit varies depending on the lexical environment. Individual units
can retain their original meaning, or alternatively this meaning can be transformed.
In addition, the noun and adjective in the phraseological unit can preserve
their original grammatical meaning, or they can be combined into a compound, at
which stage they become either a composite substantive, or an adjective with a figurative meaning (in our example, ‘shameless’, ‘shameless person’, or ‘black faced’).
a) In the traditional attributive Shughnani construction, the modifier customarily comes before the modified element (the head). In our example, the first
component is an adjective “ADJ + N = ADJ” (tɛr-pīt͡s ‘with black face’), and the
second is a noun.
(218) tɛr-pīt͡s odam
black-face man
‘Dark-faced person’

b) In our example, the order of components in the noun phrase is traditional,
as the modifier – the resulting composite-adjective “N + ADJ = ADJ” (pīt͡s-tɛr
‘with black face’) – precedes the head noun:
(219) pīt͡s-tɛr odam
face-black man
‘Shameless man, liar (i.e., black-faced man)’
c) In the next example, the first component “N + ADJ = Noun” is treated as a
noun; the whole noun phrase is close to becoming a composite, where the
modifier noun precedes the head of the noun phrase:
(220) pīt͡s-tɛr odam
face-black man
‘Shameless man (i.e., with black face)’
The additional affix -i can be added to pīt͡s-tɛr odam (Alamšoev 2018: 373); this
addition usually marks a noun in Shughnani.23 In this case, the grammatical
meaning of the compound pīc-tɛri shifts, so the meaning is difficult to determine,
and varies depending on the context.
In the example below, on the one hand pīc-tɛri can be treated as a noun
“N + ADJ + SUF = N” so that the whole noun phrase represents an attributive construction (221a). Alternatively, it can be treated as an izafe construction, whereby
both components in the noun phrase are “N + N” (221b):
(221) a) pīt͡s-tɛr-i odam
face-black-suf man
‘Shameless man, liar’

23 An example of a composite borrowed from Tajik with the addition of -í can be seen in a noun
phrase with the modifier azor-ǰufí ‘quirky, diverse’ (hazor ‘thousand’, present stem juf- from juftan ‘to fold twice’):
azor-ǰuf-í odam
thousand-fold-suf man
‘quirky man’ (fig. ‘multifaceted man’) (Karamšoev 1988: 63)
Here, the substantive composite is adjectivized, even though, as a rule, the suffix -i in Shughnani
marks a noun.

b) pīt͡s-tɛr-i odam
face-black-iz man
‘Shameless man, liar’
This new expression, again, demonstrates the trend whereby in the Shughnani
language we find increasingly common use of constructions like those in Tajik.
However, in Tajik the dominant member (head) of the noun phrase takes first
place and is marked with an izafe, while in Shughnani juxtaposition without a
connecting element is more common:
(222) a) odam-i ganda
man-iz bad
‘bad person’
b) Sh odam-i gandā
man-iz bad
c) Sh gandā odam
bad man
‘bad person’
Returning to the construction discussed here, we can say that the modification
highlighted in example (221b) above can be treated as an izafe construction, where
the modifier precedes the head (odam), i.e., the noun phrase is a reverse izafe
construction, where the izafe marker -i is added to the modifier that is treated as
a noun, and not to the head as is typical in Tajik.
The Tajik idiomatic phrase, closest in terms of grammatical meaning to the
Shughnani phraseological unit, is the Tajik phrase below:
(223) siyohrūy odam
black.face man
‘black-faced man; disgraced person’
Here, we see the reverse order of the components in the modifier, but the same
order of components in the noun phrase, where the prepositive modifier is juxtaposed before the head (the word being defined) with no connection between
them. We know, however, that such attributive grammatical meaning is usually
expressed in standard Tajik by an izafe construction marking the dominant
member of the noun phrase:

(224) odam-i siyohrūy
man-iz black.face
‘black-faced man, disgraced person’
At present, both types of construction are quite frequently borrowed by Shughnani
as a whole unit, for example gandā odam and odam-i gandā. To conclude this
illustration of grammatical processes, the following assumptions can be made:
1) In the Shughnani language, the process of word formation has a tendency
to form composites, and to restructure their grammatical meaning, by transforming the composite and/or its components into substantives (or adjectives).
2) The language also has a tendency to create a noun phrase by means of a Tajik
izafe construction. In some cases, the reverse izafe (izafati maǧlūb) construction is present, with a reverse order of components.
3) This transformation may be interpreted as a shift in Shughnani towards
arrangement of the noun phrase along the lines of the Tajik model.

8 Lexical and areal aspects
As these languages were in close contact for centuries, they preserved many
indigenous forms (i.e., terms concerning kinship, the human body, agriculture,
and rituals) in their vocabulary; they also retained old Iranian vocabulary and
cultural terms that generally fell out of use in other areas. In local Tajik dialects,
Eastern Iranian vocabulary was in many cases used as a substrate layer. The features of this substrate can be identified mainly through the phonetic profile of the
vocabulary. This profile reflects some traces, such as l < ✶d, ǧ < ✶g, identifiable
only through historical comparative analysis.
Other particular features can be identified through the evolution of lexical
and semantic forms in various dialects of Badakhshan, representing the Eastern
Badakhshani branch of the Pamir-Hindukush ethnolinguistic region. In some
cases, it is also possible to identify a region-specific cultural worldview reflecting
traditional knowledge or contacts with other cultures, such as Arabic and Turkic.

8.1 Iranian
Numerous words related to traditional culture and shared religion have areal features (e.g., the names of seasons in the folk calendar, household utensils, types
of buildings, ritual dishes, clothes, etc.). Indigenous vocabulary faded long ago,
and forms contaminated by various Iranian and non-Iranian languages of the
region emerged: older terms are preserved mainly in ritual formulas, collocations
and phraseology. As already mentioned, in kinship terms, a series of old words of
Iranian origin for son, daughter, sister, brother, father-in-law, mother-in-law, etc.
were preserved (see section 7.1).
Some words widespread in the region were originally derived from Iranian
vocabulary but represent different groups of either Eastern or Western origin.
These words may be used alongside one another in a range of regional languages, or may be borrowed from one group and used by another.
The term (“co-wife”) highlighted below is an example of the early spread of
Western Iranian words to Eastern Iranian vernaculars and shows how a specific
word of Eastern Iranian origin was preserved in a number of regional idioms. This
notion is reflected in the languages of the region by two historical stems:
(225) a) TB amboǧ ‘wife of a polygamist in relation to another wife’ is a reflex of
the Iranian compound ✶ham-bāga- ‘co-sharing’ and relates to Cl Pers
anbāǧ ‘the wife of a polygamist in relation to another wife’.
This term was borrowed from the Badakhshani dialects of Dari-Tajik
by Eastern Iranian languages: W, Ishk, Yd amboǧ, Sang ambāǧ ‘id.’; the
term went through the process of adaptation in other adjacent dialects:
TVDQ boǧčun, TQK boxčun, TK boxčon (Morgenstierne 1938: 190, 380;
Steblin-Kamensky 1999: 82–83; Rastorgueva and Édel’man 2003: 52).
b) Another designation for the same notion of “co-wife” is reflected in
a number of Iranian languages of the region, specifically in the languages of the Northern Pamir group, e.g., in the form of Sh abīn and
Yaz aban. The form can be traced back to ✶ha(m)-patni- ‘mistress, one
of several mistresses’. This word is of the same origin as that found in
Northern and Central Tajik dialects pal/nonj, palon/č ‘id.’, borrowed
from Sogdian in the territory to the north of Badakhshan (cf. Yagn
pinonč, Édel’man 2020: 252–254).

8.1.1 Borrowings from Tajik to Shughnani
In the course of history, many words were borrowed from Persian as a language of
culture and religion, and later from Tajik. Until the early 20th century, numerous
terms for traditional culture, as well as cultural loan words from the Badakhshani
Tajik sub-dialects of Ghoron, and also Vanj, Darvoz and adjacent dialects, were
adopted in the Shughnani language. Today, for the most part, the Shughnani
language acquires vocabulary directly from standard Tajik; these words include
terms relating to modern technology, as well as words specifically reflecting the
profile of modern Tajik. The following example illustrates loan words connected
with traditional culture (dairy produce):
(226) TB čaxs (also with the form čaǧs, documented by Rozenfel’d) ‘strainer (for
milk), milk filter’; this cultural term was widespread in the region, and Sh
čāǧd͡z, Wakhi and Ishkashimi čaxs, ‘colander, strainer (for milk)’ were all
derived from Tajik dialects. Cf. also TQ čaǧču, čaxčūb ‘plunger, stirrer for
churning butter’ (čub ‘stick’), čaǧdeg ‘churn’ (deg ‘cauldron’).
All these examples can be traced back to ✶1čak- ‘drip, pour (drop by
drop)’, associated with milk processing: filtering, churning butter (Steblin-Kamensky 1999: 123; Rastorgueva and Édel’man 2003: 211, 213).

8.1.2 Borrowings from Eastern Iranian to Tajik dialects
Various loans from Eastern Iranian languages, including Shughnani, are present
in Badakhshani Tajik sub-dialects – especially in the area of Ghoron, and in Vanji
Tajik. Specific Shughnani consonants were preserved in the Tajik sub-dialect of
Ghoron. See for example, a word with the unexpected consonant ç:
(227) TB çədəmč
bot. ‘Onopordum acanthium, cotton thistle’

(Rozenfel’d 1971: 10)

Other instances are discussed in Section 3.2.
Words meaning ‘fortification, fortress’ are borrowings from an Eastern
Iranian language with transition of ✶d, ✶t to l:
(228) These include Cl Pers kalāt ‘fortification on the top of a mountain, village’,
Tajik qal’a ‘fortress’, Tajik kalot ‘fortification (on the top of a mountain),
village (on a mountain)’, TB (especially in Wakhan) qalot ‘a memorial
tower made of stones in the mountains’ (Rozenfel’d 1971), Yazg qəlā, qal’a
‘fortress’, Sh qalā ‘fortress, fortification’, and kalot ‘fortification towers,
towers on the top of mountains’, used in toponyms (Dodyxudoev 1975;
Steblin-Kamensky 1999: 322–323), Sh qalot ‘a tower made of stones in the
mountains’.

The following group is an example of loans from various Eastern Iranian languages to Tajik – and later to Pamir languages:
(229) a) An ancient Iranian form, ✶kata/āna-, denoting size, status, or age, and
preserved in Eastern Iranian vernaculars may have been borrowed
by Tajik dialects, “through an intermediary source, especially by
the Southern Tajik and Dari dialects on the territory of Afghanistan”
(Édel’man 2011: 350).
b) There also exists a group of areal words widespread in various
vernaculars of Central Asia, such as TBVD kat(t)a ‘big, adult, huge’,
TD katana, TB katanak ‘big, adult, senior, chief’.
In Shughnani, words from this group were borrowed from Tajik dialects, e.g., katanak ‘big, adult, senior, chief, old man’, katanaki ‘seniority’, kattā, kaltā ‘big, elder, adult, huge’, and the element kata- in
collocations ‘big, senior, huge’. Sh kaltanak ‘big, adult, senior, chief,
old man, old settler’ is presumably a later secondary form of a “hybrid
nature” (Édel’man 2011: 348).
c) We also find a specific group of words derived from the same stem,
borrowed by many Western Iranian languages, and widespread in
various vernaculars of Central Asia. These “forms with the transition of ✶-t- to -l- are characteristic of some Eastern Iranian languages”
(Édel’man 2011: 349): T kalon ‘large, huge, senior, adult’. Borrowings
from Tajik include late Sh kalʊn ‘head, chief, leader; senior’ and Yaz
kalon ‘senior in rank, chief’.
The following is another example of loans from various Eastern Iranian languages, presumably from Old Vanji into Vanji Tajik, and later to Pamir languages:
(230) The forms TV (am)ǧal ‘right now’, TD, Sh, Rush, Bart, W ǧal ‘now’ – possibly derived from ✶gātu- – represent an adverb of time and place.
Cf. ǧot, a variant from another Eastern Iranian language borrowed by the Tajik
dialects of Hissar: hamin ǧot ‘now, just now’, čiǧoti ‘when’, lit. ‘what time’ (see
also Tajik goh < ✶gāθu-, used both in the meanings of place (ziyoratgoh ‘place
of worship’) and time (on goh ‘then, at that time’); Laškarbekov 2008: 83; Rastorgueva and Édel’man 2007: 269).

8.2 Areal vocabulary
There remains a group of words whose origin has not been identified. These words are present in many dialects of the region, and are very similar
in form, indicating that they may have been borrowed from one another. One
instance is a landscape term widespread in Western and Eastern Iranian vernaculars; this could be a loan from Tajik into Eastern Iranian, or vice versa:
(231) nušur, nəšər, TV nišer, TDQ našar, nəšr(a), W nišɨr, Sh na/išar ‘shady side
(of the valley, place)’, Yaz nəsʊr, Sang nišərm ‘passage of the sun behind
the mountains (in winter)’ (Steblin-Kamensky 1999: 244–245).
Another noteworthy word used locally in the area is:
(232) TBVD (y)el(o) (Sh yel)
‘summer pasture, the herd that goes to the summer pasture’

8.3 Loan vocabulary
8.3.1 Arabic
As a consequence of the spread of Islam and centuries-long contact in the area,
there are many borrowed words representing common vocabulary of Arabic
origin. Specifically, Tajik Badakhshani dialects use words which were introduced
through the Persian-Tajik language. Most of these relate to the Ismaili tradition.
In local Tajik dialects and Pamir languages, the following cultural and religious term designating the soul and spirits of ancestors plays an important role:
(233) arv/wo
‘spirit(s) of dead ancestors’
This word is preserved in all Pamir languages, as well as in neighbouring Tajik
and Turkic dialects.
Several Arabic loan words have become archaisms in contemporary literary
Tajik, but were present in dialects of Badakhshan until recently, for example:
(234) xalifá (Sh xalīfā́)
‘khalifá, Ismaili religious administrator, deputy of pir’
(cf. Pers xalif(a), an old loan word from Arabic khalīfatun ‘khalif’).

For Ismailis of Badakhshan, this was an important term designating a local religious administrator who ministered to people in their everyday life, organizing
ceremonies of birth, weddings, and funerals. As a specific Ismaili term, this word
was preserved unchanged in Badakhshan, both in Tajik and in all Pamir
languages of the region. However, recently with the abolition of the institution of pirship, this
term has become outdated.

8.3.2 Turkic
Some words are borrowed by Tajik dialects of Badakhshan from Turkic, especially from Uzbek and Kyrgyz. Most of these words relate to cattle breeding, dairy
farming, and seasonal migration to summer pastures. They also include some
social terms, such as:
(235) a) ayloq (Sh id.)
‘summer pasture’
b) yanga (Sh yangā)
‘brother’s wife’
At the same time, some words made the transition from Iranian to Turkic vernaculars and back to Iranian, such as:
(236) Sogd ǧrtr’k
‘mule’.
This was borrowed by Turkic languages, and then loaned back to Tajik and
its dialects as xačir, Sh qačīr (✶kara- ‘donkey’)
(Édel’man 2011: 283, 285)
Another word that was considered Turkic has a stable Iranian provenance:
(237) TV k/qəngola, k/qənǧola
‘betrothed’

(Édel’man 2011: 222–223)

9 Conclusion
This overview of the language situation in MBAR, focused on Iranian languages,
confirms that Tajik dialects and the Pamir languages are undergoing a constant
process of convergence due to the interaction of Western and Eastern Iranian languages in the state of diglossia. Specific Tajik dialects of MBAR are included in
this areal convergence due to their complex linguistic formation – which includes
Eastern Iranian substrates (closely related to the Pamir languages) – and their
distinctive features that developed in the area. Among other factors influencing
convergence are the presence of similar religious traditions and the long historical
socioeconomic and cultural evolution of these languages and cultures in close
contact with one another. At the same time, today in MBAR, due to the use of Tajik
as a medium in education and in administration, the Tajik language is represented
not only by local dialects; it is widely present in the form of standard Tajik – the
state language of Tajikistan. As globalization influences language unification,
both Tajik dialects and Pamir languages have fallen under the influence of standard Tajik to the extent that they are being subsumed by this standardized form.
In this regard, a historical contraction of the region inhabited by Pamir language speakers can be observed. This is evinced in the northwest of MBAR, where
the Tajik dialects of Qarategin, Vakhiyo, Darvoz, and Vanj are currently located,
and in the southwest, particularly in Ghoron, where Badakhshani Tajik has supplanted Shughnani. This is also the case in Afghanistan, where the Dari speech
varieties of Ghoron, Ishkashim and Zebak have supplanted Pamir languages.
Moreover, this summary of the nominal system of Tajik dialects in comparison
with the Shughnani system identifies the main distinctive features of Tajik nominal
categories (noun and adjective), and typical characteristics of numerals, as well as
traditional metrological vocabulary and kinship terminology. In all these aspects,
especially in vocabulary and morphosyntax, as well as in the main derivational
models, we reveal regional shared features, distinctive from standard Tajik.
Furthermore, illustration of the vocabulary of benevolent and malevolent
expressions, Nawruz greetings etc. sheds light on social and personal relationships; these terms and their gender distribution further our understanding of
the economic life of the region, and the livelihoods of its population, reflected in
speech constructions and vocabulary.
This data enables us to identify the specific models of borrowing of these formulas and phrases by the Shughnani language. We also present the methods of
adaptation and restructuring, used by Shughnani, to transform the Tajik noun
phrase and composites according to Tajik models. In addition, the research demonstrates close convergence of these language varieties through examples of shared
Iranian elements in both borrowed and areal vocabulary.
Over the last century, in the spheres of phonetics, morphosyntax and vocabulary, specific changes in Tajik dialects and Pamir languages have taken place,
engendered by the transition to standard Tajik; these changes have implications
for norms of pronunciation. This comparative description of Western and Eastern
Iranian languages in the particular area of MBAR is a clarification of these issues,
highlighting the evolution of convergence and its gathering pace today.

Abbreviations for languages
Bart         Bartangi
Bj           Bajuwi
Cl Pers      Classical Persian
IE           Indo-European
Ishk         Ishkashimi
Middle Pers  Middle Persian
OV           Old Vanji
Pers         Persian
Proto-Ir     Proto-Iranian
Rush         Rushani
Sang         Sanglichi
Sh           Shughnani
Sogd         Sogdian
T            Tajik
B            Badakhshani Tajik dialect
D            Darvozi Tajik dialects
Q            Qarategini Tajik dialect (of modern Rasht district, previously Gharm)
K            Southern, Northern and Western Khulabi Tajik dialects (of Khatlon)
Matcha       Tajik dialect of Matcha
Q-V          Vakhiyo-Qarategini Tajik dialect (of Rasht)
Rogh         Roghi Tajik dialects
V            Vanji Tajik dialects
V-B          Vakhiyo-Bolo Tajik dialect
Uzb          Uzbek
W            Wakhi
Yagn         Yaghnobi
Yaz          Yazghulami
Yd           Yidgha

Abbreviations for sources
FGJZT  Farhangi gūišhoi janubii zaboni tojikī. 2012
FZT    Farhangi zaboni tojikī. 1969
ŠJZT   Ševai janubii zaboni tojikī. 1980
TRS    Tadžiksko-russkij slovar’. 2006

Abbreviations for glosses
acc   accusative
cop   copula, predicative link of the verb
comp  comparative
d     direct case (pronouns)
dem   clitic pronoun
dim   diminutive
fp    final particle
imp   imperative
iz    izafe
neg   negative form of the verb
nump  numerical postfix when denoting the number of objects
pl    plural
poss  possessive
post  postposition
ptcp  participle
ppfv  past perfect
pref  prefix
prf   perfect
prs   present tense
pst   past tense
refl  reflexive pronoun
sg    singular
suf   suffix
id.   idem

References
Alamšoev, Qurbon. 2015. Farhangi šikor dar Pomir [Dictionary of hunting in the Pamirs].
Dušanbe: Irfon. (in Tajik)
Alamšoev, Šervonšo M. 2018. Frazeologija šugnanskogo jazyka (strukturno-grammatičeskij i
semantičeskij aspekty) [Phraseology of the Shughnani language (structural, grammatical
and semantic aspects)]. Dušanbe: Academy of Sciences of Republic Tajikistan PhD
Dissertation. (in Russian)
Badaxši, Šah Abdullah. 1960. Dictionary of some languages and dialects of Afghanistan. Kabul:
Pashto Tolana. (in Pashto and Persian)
Baxtibekov, Tupči. 1979. Grammatikai zaboni Šuǧnonī [Grammar of the Shughnani Language].
Dušanbe: Doniš. (in Tajik)
Bogorad, Julija I. 1956. Rogskie govory tadžikskogo jazyka [The Rogh dialects of the Tajik
language]. In Trudy Instituta jazykoznanija Akademii Nauk SSSR [Proceedings of the
Institute of Linguistics, USSR Academy of Sciences] 6. 133–195. Moskva: Nauka. (in
Russian)

Bogorad, Julija I. 1963. Goronskij govor tadžikskogo jazyka [The Ghoron speech variety of the
Tajik language]. In Iranskij sbornik. K semidesjatiletiju prof. I.I. Zarubina, 44–59. Moskva:
Vostočnaja literatura. (in Russian)
Boldyrev, Alexander N. 1948. Badaxšanskij fol’klor [Folklore of Badakhshan]. Sovetskoe
vostokovedenie [Soviet Oriental Studies] 5. 275–295. Moskva & Leningrad: Izdatel’stvo
Akademii Nauk SSSR. (in Russian)
Dodykhudoev, Rahim Kh. 1972. Die Pamir-Sprachen (Zum Problem der Konvergenz).
Mitteilungen des Instituts für Orientforschung 17. 463–470. Berlin.
Dodykhudoeva, Leyli R. 2004. The Tajik language and Sociolinguistic situation in the
Mountainous Badakhshan, Tajikistan. In Iran and the Caucasus. Research papers from the
Caucasian Centre for Iranian Studies, Yerevan VIII (2). 281–288. Leiden-Boston: Brill.
Dodyxudoev, Rahim X. 1970. Pamirskie jazyki (k probleme konvergentsii) [Pamir languages
(on the issue of convergence)]. In Aktual’nye voprosy iranistiki i sravnitel’nogo
indoevropejskogo jazykoznanija. Tezisy dokladov [Relevant problems of Iranian studies
and comparative Indo-European linguistics. Abstracts of papers], 23–24. Moskva: Nauka.
(in Russian)
Dodyxudoev, Rahim X. 1971. Areal’no-istoričeskaja interpretatsija mikrotoponimii Pamira
[Areal-historical interpretation of the Pamir microtoponymy]. In Problemy kartografirovanija
v jazykoznanii i ètnografii [Mapping Techniques in linguistics and ethnography], 56–57.
Leningrad: Nauka. (in Russian)
Dodyxudoev, Rahim X. 1975. Pamirskaja mikrotoponimija (issledovanie i materialy) [Pamir
microtoponymy (research and materials)]. Dušanbe: Irfon. (in Russian)
D’jakov, Aleksej M. 1931. Jazyki sovetskogo Pamira [Languages of the Soviet Pamir]. In Kul’tura
i pis’mennost’ Vostoka. Kniga X [Culture and Writing of the East. Book X], 85–90. Moskva:
VTsK NA. (in Russian)
D’jakov, Aleksej M. 1975. Kratkaja xarakteristika étničeskogo sostava naselenija
Gorno-Badaxšanskoj avtonomnoj oblasti v pervoj četverti XX veka [Brief description of the
ethnic composition of the population of the Gorno-Badakhshan Autonomous Region in the
first quarter of the XX century]. In A.N. Zelinskij (ed.), Strany i narody Vostoka [Countries
and peoples of the East] 16. Pamir. Moskva: Nauka. (in Russian)
Edelman, Joy I. & Leyli R. Dodykhudoeva. 2009. The Pamir Languages. In Gernot Windfuhr (ed.),
The Iranian Languages, 787–824. (Routledge Language Family Series). London &
New York: Routledge.
Édel’man, Džoj (Joy) I. 1966. Jazguljamskij jazyk [Yazghulami language]. Moskva: Nauka,
Vostočnaja Literatura. (in Russian)
Édel’man, Džoj (Joy) I. 1968. Osnovnye voprosy lingvističeskoj geografii (na materiale
indoiranskix jazykov) [The main questions of linguistic geography (based on Indo-Iranian
languages)]. Moskva: Nauka. (in Russian)
Édel’man, Džoj (Joy) I. 1971. Jazguljamsko-russkij slovar’ [Yazghulami-Russian dictionary].
Moskva: Nauka, Vostočnaja Literatura. (in Russian)
Édel’man, Džoj (Joy) I. 1975. K genezisu vigezimal’noj sistemy čislitel’nyx [On the genesis of the
vigesimal number system]. Voprosy jazykoznanija 5. 30–37. (in Russian)
Édel’man, Džoj (Joy) I. 1980. K substratnomu naslediju Tsentral’noaziatskogo jazykovogo sojuza
[On the substrate heritage of the Central Asian language union]. Voprosy jazykoznanija 5.
21–32. (in Russian)

Édel’man, Džoj (Joy) I. 1986. Sravnitel’naja grammatika vostočnoiranskix jazykov. I. Fonologija
[Comparative grammar of the East-Iranian languages. I. Phonology]. Moskva: Nauka. (in
Russian)
Édel’man, Džoj (Joy) I. 1987. Šugnano-rušanskaja jazykovaja gruppa [Shughnani-Rushani
language group]. Osnovy iranskogo jazykoznanija IV [Fundamentals of the Iranian
linguistics. New Iranian languages], 236–347. Moskva: Nauka. (in Russian)
Édel’man, Džoj (Joy) I. 1990. Sravnitel’naja grammatika vostočnoiranskix jazykov. II Morfologija,
elementy sintaksisa [Comparative grammar of the East-Iranian languages. II Morphology.
Elements of syntax]. Moskva: Nauka. (in Russian)
Édel’man, Džoj (Joy) I. & Tat’jana V. Civ’jan. 2005. Osnovnye čerty Central’noaziatskogo
jazykovogo sojuza. In Jazykovye sojuzy Evrazii i ètnokul’turnoje vzaimodejstvie (istorija i
sovremennost’) [The main features of the Central Asian language union. Language unions
of Eurasia: ethnocultural interaction (History and the present)], 189–203. Moskva: Institut
jazykoznanija RAN. (in Russian)
Édel’man, Džoj (Joy) I. 2009. Sravnitel’naja grammatika vostočnoiranskix jazykov. III Leksika.
[Comparative grammar of the East-Iranian languages. III Vocabulary]. Moskva: Vostočnaja
literatura. (in Russian)
Édel’man, Džoj (Joy) I. 2011. Étimologičeskij slovar’ iranskix jazykov [Etymological Dictionary of
Iranian Languages] 4. Moskva: Vostočnaja literatura. (in Russian)
Édel’man, Džoj (Joy) I. 2015. Étimologičeskij slovar’ iranskix jazykov [Etymological Dictionary of
Iranian Languages] 5. Moskva: Vostočnaja literatura. (in Russian)
Édel’man, Džoj (Joy) I. 2020. Étimologičeskij slovar’ iranskix jazykov [Etymological Dictionary of
Iranian Languages] 6. Moskva: Vostočnaja literatura. (in Russian)
Efimov, Valentin A., Vera S. Rastorgueva & Elena N. Šarova. 1982. Persidskij, dari, tadžikskij
[Persian, Dari, Tajik]. In Osnovy iranskogo jazykoznanija. Novoiranskie jazyki: zapadnaja
gruppa, prikaspijskie jazyki [Fundamentals of Iranian linguistics. New Iranian languages:
Western group, languages of Caspian area], 5–230. Moskva: Nauka. (in Russian)
Farhangi gūišhoi janubii zaboni tojikī [Dictionary of Tajik language Southern vernaculars]. 2012.
Compiled by Mansur Mahmudov, Ǧaffor Jūraev & Bahrom Berdiev. Dušanbe: Doniš. (in Tajik)
Farhangi zaboni tojikī [Dictionary of Tajik language]. 1969. Compiled by Muhammadjon
Šukurov, Valentin Kapranov, Rahim Hošim, Nosirjon Ma’sumī. Moskva: Sovietskaja
Encyklopedija. (in Tajik)
Gadilia, Ketevani. 2019. A typological study of (in)definiteness in the Iranian languages.
In Alireza Korangy & B. Mahmoodi-Bakhtiari (eds.), Essays on the typology of Iranian
languages, 122–132. Berlin and Boston: De Gruyter.
Gafurov, Bobojon G. 1989. Tadžiki. Drevnejšaja, drevnjaja i srednevekovaja istorija [The Tajiks.
The ancient, old and mediaeval history] I. 2nd edn. Dušanbe: Irfon. (in Russian)
Grjunberg, Aleksandr L. 1963. Jazyk severoazerbajdžanskix tatov [The language of the North
Azerbaijani Tats]. Leningrad: Nauka. (in Russian)
Grjunberg, Aleksandr L. & Ivan M. Steblin-Kamensky. 1974. Étnolingvističeskaja xarakteristika
Vostočnogo Gindukuša [Ethnolinguistic characteristics of the Eastern Hindu Kush]. In
Problemy kartografirovanija v jazykoznanii i étnografii [Mapping Techniques in linguistics
and ethnography], 276–283. Leningrad: Nauka LO. (in Russian)
Grjunberg, Aleksandr L. & Ivan M. Steblin-Kamensky. 1976. Vaxanskij jazyk [Wakhi language].
Moskva: Nauka, Vostočnaja literatura. (in Russian)
Hojibekov, Elbon. 2009. Korburdi zaboni Šuǧnonī dar Ǧoroni Badaxšoni Tojikiston va sababhoi
az bayn raftani on: tahlili ètnolingvistī [The use of Shughnani language in Ghoron of
Ishkashim district of Badakhshan and the causes of its demise]. In Proceedings of the
Conference FEL XIII. Endangered Languages and History, 57–66. Khorog, Tajikistan, 24–26
September 2009. Dušanbe: Doniš. (in Tajik)
Karamšoev, Dodxudo. 1963. Badžuvskij dialekt šugnanskogo jazyka [The Bajuwi dialect of
the Shughnani language]. In Trudy Instituta jazyka i literatury Akademii Nauk Tadž. SSR
[Proceedings of the Institute of Languages and Literature of the Academy of Sciences
of Soviet Socialist Republic of Tajikistan] XVI. Dušanbe: Tajik Academy of Sciences. (in
Russian)
Karamšoev, Dodxudo. 1988, 1991, 1999. Šugnansko-russkij slovar’ [Shughni-Russian
Dictionary], 3 vols. Moskva: Vostočnaja literatura. (in Russian)
Južnye govory tadžikskogo jazyka [Southern vernaculars of the Tajik language]. 2014. Dušanbe:
Doniš. (in Russian)
Korn, Agnes. 2006. Counting Sheep and Camels in Balochi. In M.N. Bogoljubov (ed.),
Indoiranskoe jazykoznanie i tipologija jazykovyx situacij. Sbornik statej k 75-letiju
professora A. L. Grjunberga (1930–1995) [Indo-Iranian linguistics and typology of linguistic
situations. Collection of articles dedicated to the 75th anniversary of Professor A.L.
Grunberg (1930–1995)], 201–212. Moskva: Nauka.
Korn, Agnes. 2016. The languages, their histories and genetic classification: Iranian. In
Hans Henrich Hock, Elena Bashir (eds.), The Languages and Linguistics of South Asia: A
Comprehensive Guide, 51–66. (World of Linguistics 7). Berlin: De Gruyter Mouton.
Kumitai zabon va istilohoti nazdi hukumati Jumhurii Tojikiston [Committee on Languages and
Terminology under the aegis of the Government of the Republic of Tajikistan] http://www.kumitaizabon.tj:
www.kumitaizabon.tj/tg/category/забони-шуғнонӣ (15 August, 2021).
Laškarbekov, Boǧšo B. 2008. Starovandžskij jazyk [Old Vanji language]. In Valentin A.
Efimov (ed.), Osnovy iranskogo jazykoznanija. Sredneiranskie i novoiranskie jazyki
[Fundamentals of Iranian linguistics. Middle and New Iranian languages], 61–109. Moskva:
Vostočnaja literatura. (in Russian)
Mirzoev, Šonazar & Ibodat Karamova. 2014. Kratkij slovar’ frazeologičeskix edinic šugnanskogo
jazyka i ix ekvivalenty v russkom jazyke [A short dictionary of phraseological units of the
Shughnani language and their equivalents in Russian], Šodixon P. Yusufbekov (ed.). Xorog:
Akademija nauk Respubliki Tadžikistan, Institut Gumanitarnyx nauk. (in Russian)
Morgenstierne, George. 1938. Indo-Iranian frontier languages, II. Iranian Pamir languages.
Oslo: Instituttet for Sammenlignende Kulturforskning. Universitetsforlaget.
Morgenstierne, George. 1974. Etymological vocabulary of the Shughni group. Wiesbaden: R.
Reichert Verlag.
Murvatov, Jamol. 2013. Morfologija [Morphology]. In Ǧaffor Jūraev & Mansur Mahmudov (eds.),
Južnye govory tadžikskogo jazyka [Southern vernaculars of the Tajik language], 63–102.
Dušanbe: Doniš. (in Russian)
Nazarova, Evgenija M. 1985. Terminy rodstva i svoistva v tatskom jazyke [Family and kinship
terms in the Tat language]. In Problemy otraslevoj leksiki dagestanskix jazykov: Terminy
rodstva i svoistva [Problems of the branch vocabulary of the Dagestan languages: Family
and kinship terms], 209–216. Maxačkala: Dagestanskiy filial AN SSSR, ordena «Znak
pocheta» Institut istorii, jazyka i literatury im G. Cadasy. (in Russian)
Nazarova, Zarifa O. 1998. Sistema iškašimskogo glagola v sopostavlenii s badaxšansko-tadžikskoj
[Ishkashimi verb system in comparison with Badakhshani Tajik]. Moskva:
Akademija nauk Respubliki Tadžikistan, Pamirskij filial Akademii nauk Respubliki
Tadžikistan, Institut Gumanitarnyx nauk. (in Russian)
Nemenova, Rozalija L. 1963. Vokalizm tadžikskix govorov Darvaza. Iranskij sbornik. K 75-letiju
I.I. Zarubina [Vocalism of Tajik dialects of Darvoz. Iranian collection: To the 75th anniversary
of I.I. Zarubin], 60–67. Moskva: Vostočnaja literatura. (in Russian)
Ofaridaev, Nazri. 1991. Mikrotoponimija Vandža i Darvaza. Lingvističeskij analiz
[Microtoponymy of Vanj and Darvoz. Linguistic analysis], Rahim X. Dodyxudoev (ed.),
Dušanbe: Doniš. (in Russian)
Ofaridaev, Nazri. 2002. Lingvističeskoe issledovanie ojkonimii Gornogo Badaxšana [Linguistic
study of the oikonymy of Gorno-Badakhshan]. Dušanbe: Tajik State National University
Doctor Habilitatus Dissertation Synopsis. (in Russian)
Paxalina, Tat’jana N. 1975. Sravnitel’nyj obzor pamirskix jazykov [Comparative overview of Pamir
languages]. In Strany i narody Vostoka [Countries and peoples of the East] 16 “Pamir”,
222–250. Moskva: Nauka, Glavnaja redakcija vostočnoj literatury. (in Russian)
Pejsikov, Lazar’ S. 1960. Tegeranskij dialekt [Tehran dialect]. Moskva: IMO. (in Russian)
Pisarčik, Antonina K. 1953. O nekotoryx terminax rodstva u tadžikov [On some terms of kinship
among Tajiks]. In Sbornik Statej po istorii i filologii narodov Srednej Azii, posvjaščennyj
80-letiju so dnja roždenija A. A. Semenova [Collection of articles on the history and
philology of Central Asia, dedicated to the 80th anniversary of the birth of A. A. Semenov].
Trudy Instituta istorii, arxeologii, ètnografii imeni A. Doniša [Proceedings of the A. Donish
Institute of Archeology and Ethnography] XVII. 177–185. Dušanbe: Tajik Academy of
Sciences. (in Russian)
Rastorgueva, Vera S. 1952. Očerki po tadžikskoj dialektologii [Essays on Tajik dialectology] 2.
Moskva: Nauka. (in Russian)
Rastorgueva, Vera S. 1964. Opyt sravnitel’nogo izučenija tadžikskix govorov [An experience of
comparative study of Tajik dialects]. Moskva: Nauka. (in Russian)
Rastorgueva, Vera S. & D. (Joy) I. Édel’man. 2003. Étimologičeskij slovar’ iranskix jazykov
[Etymological Dictionary of Iranian Languages]. 2. Moskva: Vostočnaja literatura. (in
Russian)
Rastorgueva, Vera S. & D. (Joy) I. Édel’man. 2007. Étimologičeskij slovar’ iranskix jazykov
[Etymological Dictionary of Iranian Languages]. 3. Moskva: Vostočnaja literatura. (in
Russian)
Rozenfel’d, Anna Z. 1956a. K voprosu o pamirsko-tadžikskix jazykovyx otnošenijax (na materiale
vandžskix govorov) [On the issue of Pamir-Tajik linguistic relations (on the data of Vanj
vernaculars)]. In Trudy Instituta jazykoznanija Akademii Nauk SSSR [Proceedings of the
Institute of Linguistics, USSR Academy of Sciences] 6. 273–280. Moskva: Nauka. (in
Russian)
Rozenfel’d, Anna Z. 1956b. Darvazskie govory tadžikskogo jazyka [Darvoz vernaculars of the
Tajik language]. In Trudy Instituta jazykoznanija Akademii Nauk SSSR [Proceedings of
the Institute of Linguistics, USSR Academy of Sciences] 6. 196–272. Moskva: Nauka. (in
Russian)
Rozenfel’d, Anna Z. 1960. Govory Karategina [Qarategin vernaculars]. In Trudy Instituta jazyka
i literatury Akademii Nauk Tadž. SSR [Proceedings of the Institute of Languages and
Literature of the Academy of Sciences of Tajikistan] XCIII. Stalinabad: Tajik Academy of
Sciences. (in Russian)
Rozenfel’d, Anna Z. 1964. Vandžskie govory tadžikskogo jazyka [Vanj vernaculars in the Tajik
language]. Leningrad: Izdatel’stvo Leningradskogo universiteta. (in Russian)
Rozenfel’d, Anna Z. 1971. Badaxšanskie govory tadžikskogo jazyka [Badakhshani vernaculars of
the Tajik language]. Leningrad: Izdatel’stvo Leningradskogo universiteta. (in Russian)
Rozenfel’d, Anna Z. 1975. Materialy po jazyku i ètnografii pripamirskix tadžikov [Materials on
the language and ethnography of the Tajiks in regions bordering on Pamirs]. In Strany i
narody Vostoka [Countries and peoples of the East] 16 “Pamir”, 210–221. Moskva: Nauka,
Glavnaja redakcija vostočnoj literatury. (in Russian)
Rozenfel’d, Anna Z. 1981. K dialektologii Tadžikistana. Areal’noe rasprostranenie dialektnoj
leksiki po rajonam jugo-vostočnogo Tadžikistana [On the dialectology of Tajikistan. Areal
spread of vocabulary in the districts of South-Eastern Tajikistan]. In Iranskoe jazykoznanie.
Ežegodnik 1980 [Iranian linguistics. Yearbook 1980], 200–206. Moskva: Nauka, Glavnaja
redakcija vostočnoj literatury. (in Russian)
Rozenfel’d, Anna Z. 1982. Tadžiksko-russkij dialektnyj slovar’. Jugo-vostočnyj Tadžikistan
[Dictionary of Tajik-Russian dialects. South-Eastern Tajikistan]. Leningrad: Izdatel’stvo
Leningradskogo universiteta. (in Russian)
Rubinčik, Jurij A. 1987. O sootnošenii persidskogo jazyka i obixodno-razgovornogo jazyka
[On the relationship between the Persian language and everyday spoken language]. In
Iranskoe jazykoznanie. Ežegodnik 1982 [Iranian linguistics. Yearbook 1982], 115–123.
Moskva: Nauka, Glavnaja redakcija vostočnoj literatury. (in Russian)
Sajmiddinov Dodxudo, Sanavbar D. Xolmatova & S. Karimov (eds.). 2006. Tadžiksko-russkij
slovar’. Farhangi tojikī ba rusī. 2nd edn. Dušanbe: Institut jazyka i literatury Akademii Nauk
Tadžikistana. (in Russian and Tajik)
Ševai janubii zaboni tojikī. Materialho 1980. [Southern vernaculars of the Tajik language.
Language materials]. Compiled by Jamol Murvatov, Rozalija L. Nemenova & Mansur
Mahmudov (eds.). 5. Dušanbe: Doniš. (in Tajik).
Sokolova, Valentina S. 1953. Očerki po fonetike iranskix jazykov [Phonetics of the Iranian
languages] II. Moskva & Leningrad: Nauka. (in Russian)
Sokolova, Valentina S. 1967. Genetičeskie otnošenija jazguljamskogo jazyka i šugnanskoj
jazykovoj gruppy [Genetic relations between Yazghulami and Shughni language group].
Leningrad: Nauka. (in Russian)
Sokolova, Valentina S., Rozalija L. Nemenova & Julija I. Bogorad. 1952. Novye svedenija po
fonetike iranskix jazykov 1. Jugo-vostočnye govory tadžikskogo jazyka [New data on
the phonetics of the Iranian languages 1. South-Eastern dialects of the Tajik language].
In Trudy Instituta jazykoznanija Akademii Nauk SSSR [Proceedings of the Institute of
Linguistics, USSR Academy of Sciences], 154–192. Moskva: Nauka. (in Russian)
Steblin-Kamensky, Ivan M. 1970. Fol’klor Vaxana [Folklore of Wakhan]. In Fol’klor i ètnografija
[Folklore and ethnography], 212–217. Leningrad: Nauka. (in Russian)
Steblin-Kamensky, Ivan M. 1982. Očerki po istorii leksiki pamirskix jazykov. Nazvanija kul’turnyx
rastenij [Essays on the history of the vocabulary of the Pamir languages. Names of
cultivated plants]. Moskva: Nauka, Glavnaja redakcija vostočnoj literatury. (in Russian)
Steblin-Kamensky, Ivan M. 1999. Étimologičeskij slovar’ vaxanskogo jazyka. [Etymological
dictionary of the Wakhi language]. Sankt-Peterburg: Peterburgskoe Vostokovedenie. (in
Russian)
TKD 1 – Nurjanov N. Kh. 1966. Oxota [Hunting]. In Nikolaj A. Kisljakov & Antonina K. Pisarčik
(eds.), Tadžiki Karategina i Darvaza [Tajiks of Qarategin and Darvoz] 1, 291–310. Dušanbe:
Doniš. (in Russian)
TKD 3 – Rahimov M. R. 1976. Roždenie i vospitanie rebenka [Birth and raising of children]. In
Nikolaj A. Kisljakov & Antonina K. Pisarčik (eds.), Tadžiki Karategina i Darvaza [Tajiks of
Qarategin and Darvoz] 3, 58–94. Dušanbe: Doniš. (in Russian)
Uluǧzoda, Sotim. Subhi javonii mo [The morning of our youth]. https://zarowadk.ru/
s-ulughzoda-subhi-javonii-mo-3 (25 May, 2022). (in Tajik)
Xorkašev, Sahidod R. 2014a. Leksiko-semantičeskij i morfologičeskij analiz predmetnoj leksiki
v južnyx i jugo-vostočnyx govorax tadžikskogo jazyka [Lexico-semantic and morphological
analysis of the specific vocabulary in the Southern and South-Eastern dialects of the Tajik
language]. Dušanbe: Academy of Sciences of Tajikistan Doctor Habilitatus Dissertation. (in
Russian)
Xorkašev, Sahidod R. 2014b. Barrasii lingvistii gurūhhoi mavzūii tarkibi luǧati lahja [Linguistic
study of the structural elements in thematic groups of the dialect vocabulary]. Dušanbe:
Maorif. (in Tajik)
Zarubin, Ivan I. 1924. K spisku pamirskix jazykov [On the List of Pamir Languages]. In Doklady
Rossijskoj Akademii Nauk [Proceedings of the Russian Academy of Sciences]. Series V.
82–85. Petrograd. (in Russian)
Zarubin, Ivan I. 1960. Šugnanskie teksty i slovar’ [Shughnani texts and dictionary]. Moskva &
Leningrad: Nauka. (in Russian)

Dilia Hasanova

7 Linguistic landscape of Bukhara:
The ambiguous future of Tajik
Abstract: Using the concept of linguistic landscape (LL), this study illustrates and
examines the visibility of languages on public and private signs in Bukhara, Uzbekistan. Bukhara, once the capital of the Samanid empire and one of the great
trading cities along the Silk Road, is the fifth largest city in Uzbekistan. It is also
one of the major Tajik-speaking cities in Uzbekistan. Bukhara was chosen as a
research site because it offers a unique case for studying LL, as the city is home
to many languages, including Tajik, Uzbek, and Russian. Moreover, this study
utilizes qualitative methods to investigate how privileging of Russian (during the
Soviet time), Uzbek (when the language regained its power in the wake of independence), and English (as a result of globalization) is viewed by the local people,
and how the prestige of these languages has endangered the Tajik language, the
native language of the people of Bukhara. Finally, the study examines local people’s attitudes towards the use of Uzbek, Russian, English, and Tajik languages
on public and private signs in the city. By examining LL in Bukhara, this study
aims to contribute not only to the field of LL but also to the study of bilingualism.

1 Introduction
Bukhara, one of the 12 major cities of Uzbekistan, is the fifth largest city in the
country, with a multiethnic population of almost 250,000 (World Population
Review 2021). Present-day Uzbekistan is an independent country. It
was one of the republics of the Soviet Union until 1991 and declared its
independence from the USSR on August 31, 1991, the same day as Kyrgyzstan.
According to the World Factbook’s 2017 estimate, Uzbeks comprise 83.8%
of the country’s population, Tajiks 4.8%, Kazakhs 2.5%, Russians 2.3%, Karakalpaks
2.2%, Tatars 1.5%, and others 4.4%. The official language of the country is Uzbek,
a member of the Turkic group of languages.
Bukhara is located in the central-southern part of the country (Figure 1). The
majority of local people of Bukhara speak Tajik (Tojiki) as their mother tongue,
but are also highly proficient in Uzbek, the official language of the country. It is
worth noting that local people born before the collapse of the Soviet Union also
have high proficiency in Russian, especially in reading and writing, as Russian
was not only a required language in all educational institutions during the Soviet
era but also the language of power and prestige.

Figure 1: Map of Uzbekistan (Retrieved from University of Texas Libraries, The University of
Texas at Austin, www.lib.utexas.edu/maps).

https://doi.org/10.1515/9783110622799-007

The collapse of the Soviet Union 30 years ago drastically changed the linguistic landscape of Bukhara as well as the whole country. Uzbek, the official
language of the country, became the language of authority, government, and
power, while Russian lost its prestige, and Tajik, the mother tongue of the local
people, became a marginalized language with no official status or power. It is noteworthy that Tajik never had official status in Uzbekistan, but during the
Soviet time the status of Tajik in Bukhara and Samarkand was prominent, i.e.,
there were Tajik-medium schools in those cities, and local people were able
to find written literature in Tajik, including children’s books and novels, and
watch television programs in Tajik. However, since Uzbekistan’s new law on the official language was implemented in the early 1990s, the status of Tajik
in Tajik-speaking cities has regressed rapidly: the number of Tajik-medium
schools went down to zero, written literature disappeared from local book
stores, and television programs in Tajik were discontinued on national television channels.
Since the fall of the Iron Curtain, the linguistic situation in the newly
independent countries of the former USSR has been studied, and the theory of linguistic landscape (LL) has been widely used to examine the presence and visibility of languages on public and private signs and “to inform in-group and outgroup members of the linguistic characteristics, territorial limits, and language
boundaries of the region they have entered” (Landry and Bourhis 1997: 25). For
example, Muth (2012) used this theory to study LL in Moldova and Lithuania,
while Tussupbekova (2016) used it to examine LL in Kazakhstan. This theory
was also used to examine LL in Azerbaijan (Shibliyev 2014), Nagorno-Karabakh
(Muth 2018), Ukraine (Pavlenko 2010), and Uzbekistan (Hasanova 2019). Yet, the
study of linguistic landscape in multilingual cities in those countries remains
unexplored.
The current linguistic situation in Bukhara and the attitudes of local Bukharans, who come from different ethnic and language backgrounds, towards the
presence and use of different languages in the city make up the linguistic landscape
of Bukhara. Landry and Bourhis’s (1997) definition of a ‘linguistic landscape’ is
used here to describe the Bukharan context and to find out what factors influence
language choice within the linguistic landscape of the city. According to Landry
and Bourhis (1997: 25), linguistic landscape comprises the language of public
road signs, advertising billboards, street names, place names, commercial shop
signs, and public signs on government buildings.

2 Bukhara: Historical background
Bukhara, one of the most ancient cities in Central Asia, is the birthplace of
world-renowned historical figures such as Avicenna and Al-Bukhari and is the
homeland of people of various cultural, ethnic, and linguistic backgrounds.
Based on numerous archeological findings and remnants, the city is estimated to be at least 2500 years old. Geographically, Bukhara is located on fertile
land in the desert; hence it is also known as the Bukhara oasis.
Before the Turkic settlers started migrating into the region in the 6th century,
Bukhara was inhabited by Zoroastrians speaking Sogdian or East-Iranian languages
(Finke and Sancak 2012). According to Frye (1998: 7), “the rise of the city
of Bukhara to great prominence. . . dates from the Arab conquests and the coming
of Islam to Central Asia”. The Arab conquest of Central Asian regions, including
Bukhara, was the catalyst for Central Asian people to accept Islam. According to
The CIA World Factbook (2017), nowadays 88% of Uzbeks are Muslim, most
of whom are Sunni. Along with their religion, the Arabs brought their language to the
region, which “became the primary language for government, literature, and commerce” (About Uzbekistan 2020: 5). During the Islamic Golden Era (8th–14th cc.),
Bukhara became the intellectual capital of the Islamic world.
In 1220, the area was conquered by the Mongols led by Genghis Khan. With
the Mongol invasion (13th century), the Turkic languages started to vanish in this
region. In the early 16th century, the Shaybanids (Central Asia’s last great dynasty)
“introduced the name Uzbek into the region and again made Bukhara its political
center” (McChesney 1996; Burton 1997 in Finke and Sancak 2012: 51). The Mongol
invasion and the arrival of the Shaybanids drastically changed the linguistic landscape of the region, bringing the Turkic languages into the region. Even though
Turkic languages were spreading quickly around the region, “Iranian-speakers
were still in the majority, although the earlier Soghdian tongues had given place
to the west-Iranian Persian or Tajik (Frye 1997; Fragner 1998)” (Finke and Sancak
2012: 51).
With the faltering of the Mongol empire in the early fourteenth century in
the region, Tamerlane (Timur, 1336–1405) emerged (Tamerlane 2017). During Timur’s
reign Bukhara became a major cultural center in Central Asia, and it was during
this time that hundreds of mosques and madrassas (religious schools) were built
(Brief History of Bukhara 2021). After Timur’s death in 1405, the empire started to
collapse because of internal conflicts, and as a result different khanates/kingdoms emerged in the region. One of these was the kingdom
of Bukhara, which was ruled by the Shaybanids (1506–1598). During their reign
Bukhara flourished; significant advances were made in the arts, architecture, and literature (Shaybanids 2022).
Between the sixteenth century and the nineteenth century, Bukhara was
ruled by different emirs and khans. The last emir of Bukhara was Emir Alim Khan,
who ruled the city from 1910 to 1920, when he was overthrown by the Red
Army. In 1924, the Bukharan People’s Republic was abolished by the Russians and the Central Asian territory was divided into five Soviet Socialist republics, one of which was the Uzbek Soviet Socialist Republic.

2.1 Language use: The Soviet era
After the Soviet absorption of Uzbekistan, language use in Bukhara, as in
the whole country, changed: the Russian presence became an
obvious and unavoidable phenomenon. The Soviets made Russian the main language
of public signs, announcements, and advertisements, and rigorously implemented a new language policy aimed at unifying all Soviet
republics into one “uniform national culture” (Dietrich 2015: 465).
In the late 1930s “the Latin alphabet was abandoned in favour of the Cyrillic
script throughout Central Asia, while the teaching of the Russian language was
made compulsory in all non-Russian schools across the Soviet Union in 1938”
(Dietrich 2014: 466). Moreover, because of the new Soviet language policy, all
printed materials that were in Arabic were discarded and replaced with Russian
because they were believed to be anti-Soviet. There is no documentation available on local people’s attitudes towards these changes; however, it is
interesting to note that the implementation of the Soviet language policy, and
the Soviet ideology in general, were supported by prominent local writers and
poets such as Sadriddin Ayni (1878–1954) and Abdurauf Fitrat (1886–1938). These
members of the Soviet intelligentsia depicted national Soviet-Uzbek and Soviet-Tajik identities in
their writings, which were widely studied during the Soviet time in all Uzbek secondary schools and colleges.
The Soviet language policy had a significant impact on the linguistic landscape of the entire country, then called the Uzbek Soviet Socialist Republic. As a
result of the influx of Russian immigrants into the country and the widespread use
of the Russian language (especially in urban areas), local languages, including Uzbek and Tajik, lost their prominence in the country. It is worth noting that
even though speaking Russian and having a high proficiency in Russian would
increase one’s social status in the country at the time, local people continued
to use spoken Uzbek and Tajik not only at home but also in public
government offices, where written documents were mostly in Russian. The use of
Uzbek was particularly common in public government offices where no Russian
speakers were present; however, Tajik was never used as a language of formal
communication, even where all officials may have been native speakers of Tajik.

3 Research methodology
To examine the presence and use of Tajik in the linguistic landscape of Bukhara, this
study used qualitative methods and multiple data collection methods, including
in-person and online interviews, observations, and comprehensive photography. All in-person
interviews were conducted in Tajik, Uzbek, or Russian (the choice of the language
depended on interviewees’ mother tongue) in March 2020, and online interviews
were conducted in Tajik and Uzbek between June 2020 and February 2021. Fieldwork for the study was conducted in Bukhara city in March 2020. Selection of
sites was based on two major criteria: location and target audience. Among the
businesses that were chosen for the study were shops, pharmacies, local bazaars,
and hair salons, as well as official government offices, passport offices, and
walk-in clinics. The main target consumers and customers for all services and sites
were local Bukhara people.
To collect observational data, the focus was on the areas that were frequently
visited by the local people. For the sign data, only signs with clear, visible
writing were selected; signs whose writing was too small or unclear were not
included. On multilingual signs, each language was counted separately. Overall, 99 street signs scattered around central Bukhara were collected
and analyzed for this study.
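The counting rule for multilingual signs can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the sample signs, the names `SAMPLE_SIGNS`, `tally_languages`, and `language_shares`, and the choice of denominator are all hypothetical and are not taken from the study's data or procedure.

```python
from collections import Counter

# Hypothetical sample, not the study's data: each sign is represented by
# the set of languages that appear on it.
SAMPLE_SIGNS = [
    {"Russian", "Uzbek"},    # bilingual sign: one count for each language
    {"Russian"},
    {"English"},
    {"Uzbek", "English"},
]


def tally_languages(signs):
    """Count every language once for each sign on which it appears."""
    counts = Counter()
    for languages in signs:
        counts.update(languages)
    return counts


def language_shares(counts):
    """Express each count as a percentage of all language occurrences.

    Assumption: the denominator is the total number of language
    occurrences, not the number of signs; the chapter does not say which.
    """
    total = sum(counts.values())
    return {language: 100 * n / total for language, n in counts.items()}


if __name__ == "__main__":
    counts = tally_languages(SAMPLE_SIGNS)
    for language, share in sorted(language_shares(counts).items()):
        print(f"{language}: {counts[language]} occurrences ({share:.1f}%)")
```

A language that appears on no sign, such as Tajik in this sample, simply has a count of zero in the tally.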

3.1 Research findings and discussions
3.1.1 Language use: The post-Soviet era
The collapse of the USSR opened a new phase for language policy makers not
only in Uzbekistan but also in all fifteen republics of the former Soviet Union.
As elsewhere in the former Soviet republics, one of the significant changes that
happened in Uzbekistan was the proclamation of Uzbek as the sole national/state
language: as a result, Uzbek became the language of power, authority, and pride.
The language reform:
was launched in 1989 . . . and it illustrates how the Uzbek started to regain status and
positions lost under the rule of the USSR. For instance, the Supreme Soviet of Uzbekistan
adopted a law in October 1989, about two years before the disintegration, to promote the
status of the Uzbek language against Russian.
(Uzman 2009: 56–57)

Under the new policy, all government-level paperwork was supposed to be conducted
in Uzbek, and meetings and gatherings were also expected to be conducted in
the state language. To implement this new policy, government funding was
allocated to open language centers that would offer free Uzbek classes to government employees who did not have a good enough command of Uzbek to carry out their
responsibilities in the state language. In fact, according to Article 4 of the law of
the Republic of Uzbekistan “On Official Language” (last amended 1995):

In the Republic of Uzbekistan all citizens shall be provided with conditions to study the
official language, the respective relation to languages of nations and ethnic groups that live
in its territory shall be supported, and the conditions to develop such languages shall be
arranged as well. Teaching citizens the official language shall be free of charge
(“On Official Language”, 1989)

Another significant change with regard to language policy and reform in independent Uzbekistan was the “romanization” of the Uzbek alphabet. The Law of the
Republic of Uzbekistan “On Official Language” (last amended 1995) reinforced
the implementation of the new Uzbek alphabet, based on the Latin script,
starting September 1, 2005 (“On Official Language”, 1989). As a result, the education ministry had to act quickly to reproduce school textbooks and other printed
materials in the Latin alphabet. Public news and other programs frequently ran
speeches by prominent Uzbek politicians and linguists who talked about the role
of language in nation building. They specifically emphasized that “the state’s need
for the new identity for its nation, i.e., de-russification of its population, was of
utmost importance” (Language Education Policy Studies 2013).
Unfortunately, these rushed implementations led to linguistic
chaos, as the new policies and changes bombarded Uzbek people who were
not ready to embrace those changes all at once. The government’s effort to promote
Uzbek as the language of state, power, and pride, and to increase the presence of
Uzbek in the country’s linguistic landscape, caused a steady decline not only of
the Russian language but also of local languages, including Tajik in cities
such as Bukhara and Samarkand.
The decrease in the use of Russian was especially visible in all educational sectors
in the country, as the instructional hours of the Russian language in secondary
and higher educational institutions were reduced dramatically.
Russian [was] no longer an obligatory school subject in Uzbekistan. After the dissolution
of the USSR, there was for many years a tendency to reduce the amount of school hours for
Russian language training and to devote more time to the teaching of Western languages,
above all English. . . . . The number of schools in Uzbekistan with Russian as the sole
medium of instruction is fairly small. . . .constituting merely 1 percent of all schools in the
country.
(Schlyter 2012: 196)

As for Tajik schools, whereas during the Soviet time there were a handful of Tajik-medium schools in Bukhara and Samarkand (no official statistics exist to specify
the number of Tajik-medium schools), in post-Soviet Bukhara the number of
Tajik-medium schools dwindled to none (Botirova, personal communication 2017):
The number of schools with instruction in Tajik was reduced to one mixed Tajik-Uzbek
primary school, an official from the Samarkand city Education Department told EurasiaNet.
In Samarkand District, which includes the city’s suburbs (but not the city), there are four
purely Tajik-language schools and 19 mixed Tajik-Uzbek schools, according to the regional
Education Department.
(Sadykov 2013: 11)

In his article, Sadykov (2013: 15) further reports that “Prior to Uzbekistan’s independence, Tajik-language schools were supplied with textbooks printed in Tajikistan. After 1991, Uzbekistan started publishing textbooks that conformed to its
own curricula, leading to shortages in minority languages”.
Despite the government’s effort to promote the spread of the national language in
all regions of the country, Uzbek-Tajik bilingualism in Bukhara is still a common
phenomenon. Native Bukharans continue to use Tajik as the language of interpersonal communication at private events as well as in public offices. However,
when the conversation becomes formal, so does the language of communication,
i.e., there is an abrupt switch from Tajik into Uzbek (Observational notes, March 2020).
Observational notes also revealed that at large gatherings such as weddings,
funerals, birthday parties, etc. people always use Uzbek while giving speeches
or making announcements, even though most of the attendees may be Tajik
speakers. When asked why Tajik is not used to make announcements or to give
speeches, the respondents got confused and were not able to come up with a
clear answer. Those who felt confident to answer this question said that “the use
of Uzbek comes to [them] naturally as it is a national language” (personal communication, March 2020). Moreover, the respondents implied that using Uzbek
while giving speeches (congratulations on weddings, well wishes on birthdays,
etc.) makes their statements more formal, convincing, and relatable (interview
notes, March 2020).
It is noteworthy how people’s attitudes towards languages change depending
on their age. Observational notes and interview results revealed that those born
during the Soviet era showed more comfort and pride in using Tajik as their lingua franca, while the younger, post-Soviet generation showed more
pride and comfort in their fluency in Uzbek and in the status of Uzbek as the language of power and prestige. What both generations have in common, however,
is their high respect for and pride in the national language. Neither generation
sees Uzbek and Tajik as competing languages, as the place of Uzbek as the language of the government, power, and authority is a foregone conclusion to most.

3.1.2 Bukhara: Linguistic landscape
To examine the presence and use of different languages in private and public
signs in Bukhara, 99 signs scattered around the city were analyzed. Special attention
was given to the quality, visibility, and the target audience of the signs, i.e.,
all signs targeted local people; no signs that targeted tourists were considered for
this study. If the signs were in multiple languages, each language was counted
separately.
Data analysis revealed that in Bukhara city, out of 99 signs, 43% were in Russian,
35% were in English, 21% were in Uzbek, and none were in Tajik. Russian was
mostly used in shopping centers, services, salons, and advertisements. Although
some signs were in both Uzbek and Russian, the Russian text appeared in
primary position. For example, as shown in Figure 2, the same message appears
in Russian and Uzbek. The Russian text is written in Cyrillic in bold letters
on a red background, which makes it more visible than the Uzbek text, which
is written in Cyrillic on a blue background. Figure 3 shows a sign written only in
Russian, using Cyrillic mixed with the English alphabet. The first three letters of the word
“очков” (glasses) are written in Cyrillic ‘очк’; however, the last two Cyrillic letters
‘ов’ are replaced with English “off”. There is no Uzbek translation of the store name.

Figure 2: A sign by a local charity organization that is offering free food to people in need.
The writing with a red background is in Russian and the one with blue background is in Uzbek
written in Cyrillic [Translation: if you can – help; if you need – take]. Source: posted by Alisher
Ibragimov on Facebook group Buxoro va Buxoroliklar (December 4, 2020).

What can be inferred from these data is that even though Uzbek is in theory
the primary language of the country and Russian has lost its prestige since the
downfall of the Soviet Union, the presence and use of Russian in Bukhara are more visible
than those of Uzbek. When local people were asked about this phenomenon, they were
not surprised and did not show any displeasure. One of the respondents, a
history teacher with many years of secondary school teaching experience, mentioned that “it is not surprising. We were part of the Soviet Union for 70 years, and
we used Russian and considered Russian as the language of power. Moreover,
recently, a lot of Uzbeks are working in Russia and providing for their families as
a result of it. Knowing Russian has become an important asset again” (personal
communication, translated from Tajik, March 2020).

Figure 3: Private optical store Мир Очкoff ‘world of glasses’.

Eurasianet, a website based in the US that publishes political, economic,
and social news about post-Soviet republics, published an article in June 2019
describing the status of Russian in Uzbekistan and summarized what Aziza
Umarova, an Uzbek businesswoman, said:
In addition to downgrading Russian in the 1990s, Uzbekistan also on paper adopted the
Latin alphabet, instead of the Cyrillic used by several Slavic languages. In reality, the
switchover has been extremely patchy and inconsistent, and the volume of books available
in Latin script is still quite small. This failure to nourish Uzbek has damaged the language
and [has] led to the kind of crude form often seen on social media. (Eurasianet 2019: 11)

Figure 4: A big sign for the Bukhara Shopping Center. The sign BUXORO SAVDO MAJMUASI is
written in Uzbek in the modified Latin alphabet.

The analysis of big commercial signs and billboard advertisements highlights
the use of mainly Uzbek and English (Figure 4). It can be assumed that the lack
of Russian on big signs and billboards may be the city’s response to the promotion of Uzbek in the linguistic landscape of the city. However, if that is the
case, one cannot help but wonder: why use English on big signs that are so visible?
(Figure 5)
English-language business names and billboard signs are frequently seen in
the linguistic landscape of Bukhara. Interestingly, some English signs use English
words that are not commonly known to local people (Figures 6–8). For example,
the billboard sign “DREAM HOUSE” (Figure 6) is written only in English. While
the English word ‘house’ has entered the Uzbek lexicon because of new housing
market promotions and advertisements that frequently use descriptive words
such as ‘townhouse’ and ‘penthouse’, the majority of local people may be
unfamiliar with the word ‘DREAM’. What can be inferred from these data is that
the use of English signs targets upper-middle-class or upper-class people who may
have more than basic proficiency in English and may be well-travelled. It can also
be assumed that by using English signs, local businesses hope to associate themselves with foreign companies and show their quality and prestige.

Figure 5: A big billboard advertisement promoting “Andalus” sausage. The sign is written in Uzbek in
the modified Latin alphabet.

Figure 6: Billboard sign DREAM HOUSE and advertisement of a grocery store and bakery located in
central Bukhara.

Figure 7: Fashion House in the shopping district is a private store that sells clothing items. No Uzbek or
Russian translation of “fashion house” is provided.

Figure 8: Professional School for Stylists. The name of the business is in English. However, the services offered
by this school are written in Russian.

As for the use of Tajik signs (formal or informal) in Bukhara, the data revealed
that even though Tajik is widely spoken in both formal and informal contexts
in Bukhara, there are no signs in Tajik: not even small informal signs in suburban parts of Bukhara were found. When asked about the absence of Tajik signs,
respondents did not show any signs of disappointment; they believe that it is not
surprising to see no Tajik signs in a Tajik-speaking city because Tajik does
not hold any status in the city. Moreover, they continued, “not everybody living
in Bukhara speaks Tajik”. Hence, to show respect to the national language, to
target a larger pool of consumers, and to be as clear as possible about the nature of
the products or services, local businesses use Uzbek or Russian, or even English,
but not Tajik.
On October 21, 2019, Uzbekistan celebrated the thirtieth anniversary of
giving the Uzbek language the status of the sole national language of Uzbekistan. The president of Uzbekistan, Mr. Shavkat Mirziyoyev, attended
the ceremony and gave a very powerful speech on the role, significance, and
status of the Uzbek language. He especially emphasized “the need for further
increasing the authority of the state language in the life of the state and society,
improving the Law ‘On the State Language’ based on today’s requirements, and
identified the urgent tasks in this sphere” (ddsmfa.uz, 2021: 6). President Mirziyoyev further noted:
we should consider our attitude to the state language as a relation to our independence,
devotion and respect for it – as devotion and respect for the Motherland. This should
become the rule of our life. Each of us must begin this noble work with ourselves, our family
and team.
(ddsmfa.uz, 2021: 7)

President Mirziyoyev’s speech encouraged many new initiatives vis-à-vis the
use of the state language in the country. On July 17, 2021, subscribers of Telegram-kanal pvbux received a news article informing them about a new initiative called “we are responsible for cleansing our language”, along with Article 5
of the Law of the Republic of Uzbekistan on Advertisement, which promotes the use
of Uzbek in all advertisements and street signs. Hence, business owners began
removing all business signs that were in Russian and replacing them with signs in
Uzbek (see Figure 9).
According to Article 5, while the use of Uzbek in advertisements, street signs,
and services is encouraged, advertisers may also use other languages in the original if they wish.

Figure 9: Local businessmen removing business signs that are in Russian. Source: sent by Azim
Khasanov and downloaded from Telegram Kanal pvbux.

4 Conclusion
Using the concept of linguistic landscape, this study examined the presence of different languages on public and private signs in Bukhara city to see if the public and
private signs accurately reflect the linguistic diversity of the city. The study revealed
four main findings. First, the linguistic landscape of Bukhara does not do justice to
the ethnic diversity of the city. In other words, there were no signs in languages other
than Russian, Uzbek, and English. Second, the data analysis revealed a dominant
presence of Russian in the linguistic landscape of Bukhara. Even though Russian
has lost its power and authority in the wake of the language reforms and policies in independent Uzbekistan, its presence and use, even after 30 years, is still
widespread. The continued popularity and presence of Russian may be because
Russian still remains the language of business and interethnic communication.
Third, the presence of English in the linguistic landscape of Bukhara has become
prominent mainly because of the status of English as a global language and the
local people’s positive attitude towards this language. Fourth, even though it has
been almost three decades since Uzbek replaced Russian as the language of power
and authority, the presence of Uzbek in public and private signs in Bukhara is not
as widespread as expected. In fact, of the three languages that are present on the
street signs of Bukhara, Uzbek was the least popular one. Finally, the absence of
Tajik, the mother tongue of the local people, in the linguistic landscape of Bukhara
is mainly due to the fact that Tajik does not carry any power or authority in the city.
Moreover, native Bukharans, who speak Tajik natively, do not show any signs of
disappointment at the absence of Tajik as they firmly believe that Uzbek, as the
country’s official language, should be the main language of private and public
signs.
Tajik was, is, and will remain both the native language of local Bukharans and
the language of communication for the local people of Bukhara.
Unfortunately, in post-Soviet Uzbekistan, where attention is given to promoting
the sole official language of the country, Tajik is absent from the linguistic landscape of Bukhara. While Uzbekistan’s efforts
to promote Uzbek as the official language of the country are understandable,
its neglect of local languages has to be reconsidered, as local people’s
mother tongue plays an important role in building their unique identity. The
research findings presented in this study are preliminary and exploratory; hence
they are far from comprehensive or complete. More in-depth data collection and
analysis must be conducted in order to have a better understanding of the use
and presence of different languages in Bukhara.

References
About Uzbekistan. 2020. http://uzbek-travel.com/about-uzbekistan/history/arrival-of-islam/
(accessed 20 February 2021).
Brief history of Bukhara. 2020. https://central-asia.guide/uzbekistan/destinations-uz/
bukhara/history-of-bukhara/ (accessed 13 May 2021).
CIA World Factbook, Uzbekistan. 2017. https://www.cia.gov/library/publications/the-worldwfactbook/geos/uz.htm (accessed 15 January 2021).
Dietrich, Ayse. 2014. Soviet and post-soviet language policies in the Central Asian republics and
the status of Russian. https://www.semanticscholar.org/paper/SOVIET-AND-POST-SOVIETLANGUAGE-POLICIES-IN-THE-AND/1aa1569e09ea1c2d2d3546a373087f71abc56760#paperheader (accessed 5 June 2021).
Emir of Bukhara. Bukhara. 2016. https://www.wdl.org/en/item/5869/ (accessed 13 June 2021).
Frye, Richard. 1998. Early Bukhara. Dossier «Boukhara-la-Noble». https://journals.openedition.
org/asiecentrale/527#quotation (accessed 13 June 2021).
Hasanova, Dilia. 2019. Linguistic landscape of Uzbekistan: The rise and fall of Uzbek, Russian,
Tajik, and English. In Stan Brunn (ed.), The Changing World Language Map, 1–15. New
York: Springer. https://link.springer.com/referenceworkentry/10.1007/978-3-319-734002_45-1#citeas (accessed 20 December 2020).
Landry, Rodrigue & Bourhis, Richard. 1997. Linguistic landscape and ethnolinguistic vitality: An
empirical study. Journal of Language and Social Psychology 16(1). 23–49.
Language Education Policy Studies. 2013. Language education policy in Uzbekistan http://
www.languageeducationpolicy.org/lepbyworldregion/centraleurasiauzbekistan.html
(accessed 15 March 2020).
Law of the Republic of Uzbekistan “On Official Language” (last amended 1995). https://www.
justice.gov/sites/default/files/eoir/legacy/2013/11/08/Law_on_official_language.pdf
(accessed 15 June 2021).
Mirziyoyev, Shavkat. 2019. Uzbek language is a symbol of national identity and state
independence, a huge spiritual value for our people. http://www.ddsmfa.uz/en/
node/774/pdf (accessed 20 July 2021).
Muth, Sebastian. 2012. The Linguistic Landscapes of Chişinău and Vilnius: Linguistic
Landscape and the Representation of Minority Languages in Two Post-Soviet Capitals. In
Durk Gorter, Heiko F. Marten & Luk Van Mensel (eds.), Minority Languages in the Linguistic
Landscape, 204–224. (Palgrave Studies in Minority Languages and Communities). London:
Palgrave Macmillan.
Muth, Sebastian. 2014. War, language removal and self-identification in the linguistic
landscapes of Nagorno-Karabakh. Nationalities Papers 42(1). 63–87. doi:10.1080/00905992.2013.856394
(accessed 2 May 2021).
Pavlenko, Aneta. 2009. Language Conflict in Post-Soviet Linguistic Landscapes. Journal of
Slavic Linguistics 17(1–2). 247–274.
Sadykov, Murat. 2013. Uzbekistan: Tajik language under pressure in ancient Samarkand. http://
www.refworld.org/docid/553108bb4.html (accessed 5 January 2020).
Schlyter, Birgit. 2012. Language Policy and Language Development in Multilingual Uzbekistan.
In Harold F. Schiffman (ed.), Language Policy and Language Conflict in Afghanistan
and Its Neighbors: The Changing Politics of Language Choice. https://www.academia.
edu/24001324/Language_Policy_and_Language_Development_in_Multilingual_
Uzbekistan (accessed 1 May 2021).
Shibliyev, Javanshir. 2014. Linguistic Landscape Approach to Language Visibility in
Post-Soviet Baku. Bilig 71. 205–232. https://www.academia.edu/34162665/
Linguistic_Landscape_Approach_to_Language_Visibility_in_Post-Soviet_Baku_Javanshir_
Shibliyev-_bilig_71._Say%C4%B1_G%C3%BCz_2014 (accessed 18 June 2021).
Tamerlane Prominent people of Uzbekistan. Known as Amir Temur in Uzbekistan https://orexca.
com/p_tamerlane.shtml (accessed 10 July 2017).
The World Factbook. 2019. Uzbekistan. https://www.cia.gov/library/publications/the-worldfactbook/geos/uz.html (accessed 15 July 2021).
Tussupbekova, Madina. 2016. Linguistic Landscape in Kazakhstan: Public Signs in Astana.
International Education and Research Journal 2454–9916(2). 18–20. https://www.
researchgate.net/publication/297691068_Linguistic_Landscape_in_Kazakhstan_Public_
Signs_in_Astana (accessed 15 July 2021).
Uzbekistan: A second coming for the Russian language? 2019. https://eurasianet.org/
uzbekistan-a-second-coming-for-the-russian-language (accessed 20 June 2021).
Uzman, Mehmet. 2010. Romanisation in Uzbekistan Past and Present. Journal of the Royal
Asiatic Society 20(1). 49–60. Cambridge University Press. https://www.academia.
edu/522245/Romanisation_in_Uzbekistan_Past_and_Present (accessed 5 February 2020).
World Population Review of Cities in Uzbekistan. 2021. https://worldpopulationreview.com/
countries/uzbekistan-population (accessed 28 July 2021).

Mirzo Hassan Sulton

8 Terminology in Tajik
Abstract: In this article, Tajik terminology, history of its formation and its achievements are briefly discussed. Likewise, in 14 paragraphs, the principles for creating Tajik terms are presented. The important contribution of the Committee on
Terminology in Tajikistan in streamlining Tajik terms is emphasized.
The lexicon of any language consists of common words, terminologies (sets of
words and phrases representing scientific concepts), and nomenclatures (sets of
words and phrases expressing geographical, astronomical, botanical, zoological,
and mineralogical names).
Words are the main component of every language and, from a linguistic perspective,
they are the most important part of linguistic expressions. On the other hand,
terms, which make up terminologies, are special words, or more precisely, words
with special functions. A term is a word or phrase that has a special scientific
connotation and, within the scope of a certain science, accurately and concisely
expresses a specific concept. It is important to note that when a word acquires a
terminological meaning, all its other meanings are disregarded.
Everything connected with the definition and study of the concept of a term,
as well as its improvement and development in a language, not only connects linguistics with other branches of science but also relates to the material
and spiritual history of mankind.
A deep look into the basis of the philosophical definition of term reveals its
two features. The first is the materiality of terminology, namely that the products
of cognition are fixed in material form with the help of terms. The second is that
terms, along with other linguistic signs, promote the discovery of new knowledge
(Danilenko et al. 1987:135).
The logical definition of term is associated with its philosophical definition,
since any given term has a direct relationship with a concept, i.e., a concept is
expressed through a term. As such, Kondakov (1971:518) defines term as “a word
or phrase expressing the exact name of a strictly defined concept of science, technology, etc.”
The history of each science, which incorporates the history of terminology
used in that science, inevitably includes the history of concepts and terms. The
lexeme term, which has been in use in Russian and some other European languages, was in wide use in the Tajik of the Soviet period, from the 1930s to the
early 1980s due to Tajik speakers’ preferential attitude towards Russian.
https://doi.org/10.1515/9783110622799-008

The word term is derived from Latin terminus, which in turn derives from
Terminus, the deity of boundaries and boundary markers (Phillips 1973:471; Apresjan 2000:55) in the mythology of the ancient Romans. In Latin this word was
used in the sense of “border stone” or “border” and subsequently in reference to
“limit”, “end”, and “completion” (Šul’c 1905:669).
A.D. Xajutin, a linguist, assumes that the word terminus in Medieval Latin has
the meaning of “definition” or “expression” and that the word terme appears in
old French with the meaning of “word” as a result of an influence from the word
terminus (Xajutin 1972:2–3). This assumption is supported by the outstanding lexicographers E. Littre, O. Bloch, and W. von Wartburg – as well as P. Robert.
The word term gradually departed from its principal meaning, acquiring a
completely different terminological meaning. Term is rendered into Tajik as istiloh,
a loan word from Arabic. The word istiloh in Arabic comes from the verb iṣṭalaḥa
whose meanings are: 1) to improve, to correct; 2) to put up (with someone); 3) to
agree (with someone); to conspire, to agree (with something), to accept (something). Istiloh in Arabic means: 1) convention, general agreement; common usage;
conditionally; 2) special expression; term (Baranov 1984:442–443), while in Tajik
it is usually used specifically in the sense of term. However, it should be noted that
in Arabic the word muṣṭalaḥ(āt) is used for term.
Over the last few decades, deep and detailed studies of the concept of term
as a lexical unit of the language have been carried out in Tajikistan. For example,
this issue was considered by the author of the present article (Sulton 2003, 2008,
2011, 2019), as well as by Nurov (2009), Nazarzoda (2013), and Šokirov (2017),
among others.
If we summarize what is acceptable in the proposed points of view of various
scholars, the definition of “term” would look like this: a term is a word or phrase
that, within the scope of a certain science, expresses the exact concept and,
simultaneously with other linguistic units with which it is in interconnection,
creates an integral terminological system.
A term differs from a regular word or morpheme in the following ways: a
term 1) is a word/phrase with special functions; 2) has a single, special and specific meaning, while an ordinary word can have several meanings; 3) is directly
dependent on the concept it expresses (in contrast, not every word is associated
with the concept it expresses); and 4) has certain semantic boundaries.
A term, as a rule, is formed by imposing the content of a scientific concept
on ordinary words. As such, a term performs two functions: first, a term is an
expression of a scientific concept; second, a term is a reflection of a scientific
concept. However, it is not necessarily the case that the term completely coincides
with the content and meaning of a scientific concept. It is important that the term
expresses one of the main features of the concept, or expresses the concept’s main
semantic meaning. Moreover, sometimes a term may conflict with its nomenclatural meaning or have no links with it. Thus, the geographical term Karkaskūh/
Kargaskūh ‘vulture mountain’ (Rukopis’ Tumanskogo 1930: 7, 12a) refers not to a
mountain but to a desert near the city of Kirman.
Among those characteristics that terminologists attach to a term, in our
opinion, the following are of note: 1) unambiguity (a term expresses one specific scientific concept); 2) clarity (a term must be in accordance with grammar);
3) compactness (a term should be as brief as possible, laconic, and convenient for
pronunciation and writing); 4) compositivity (a term is amenable to word formation and word creation so that new terms expressing other concepts can be
derived from it); and 5) phonetic correspondence (a term that is chosen or created,
borrowed to express new concepts, should not be coarse, awkward or contradict
the sound and phonetic norms of the language) (Sulton 2011: 48–55). Nomenclature is also a component and integral part of the terminological system of a particular branch of science and should be studied in conjunction with terminology.
It is no secret that the set of terms of any language mirrors the thoughts and
views of each nation, forming the basis for new knowledge. In addition, the set
of terms creates the basis for language and intellection, application of new concepts, promotion of the dynamics of gradual development of language, as well as
keeping the language relevant in the contemporary world.
However, today we are faced with difficulties in using terms in Tajik terminography. In the era of rapid scientific and technological growth, terminography becomes especially significant for increasing the capacity of language. As
such, there is a need to effectively use all opportunities available for expressing
new and subtle scientific and technological concepts (definitions). By reviving traditional models of terminography, it is necessary to bring into action the
dynamic mechanism of Tajik and to ensure an increase of the language capacity,
both through utilizing native Tajik words and through reasonable borrowing of
terms from other languages.
Tajik terminology has a rich history that harks back to a distant past. The
greatest national scientists, Aburayhoni Berunī (973–1048), who authored Kitob-al-tafhim (1029), and Abualī ibni Sino (Avicenna) (980–1037), the author of Donišnoma
(1023–1037), are the founders of Tajik national terminology. The formation and further
development of Tajik scientific terminology was carried on later in the works of Nosiri
Xusrav (1004–1088), Umari Xayyom (1048–1131), Nasiriddini Tusī (1201–1274), and
Ismoili Jurjonī (1042–1136), among others. It is also worth noting that distant
ancestors (e.g. Muhammadi Xorazmī (780–850), Abulma”šari Balxī (787–886),
Ahmadi Farǧonī (798–861), Muhammad Zakariyoi Rozī (865–925), Abunasri
Forobī (870–950), Abulvafoi Buzjonī (940–998), Abusaidi Sijzī (950–1025), Abumahmudi Xujandī (940–1000), and Abulfazli Hiravī (died c. 999)) made
notable contributions to Arabic scientific terminology through their writings. Part
of the Arabic scientific terminology later found its way into Tajik and now comprises a foundational part of Tajik scientific terminology in such areas as astronomy, arithmetic and geometry, philosophy, geography, and medicine.
It is well known that the social, economic, and cultural transformations of
recent years have necessitated the introduction of thousands of new political,
social and cultural concepts. An overwhelming majority of such concepts arose in
connection with transformations in world society, and are not an intellectual
product of the Tajiks. Many of them do not yet have exact linguistic equivalents in
Tajik, although such equivalents are necessary to express the new concepts.
In addition, because of the rapid progress of science and technology in most
developed countries, it is often necessary to express new concepts and terms in
Tajik. One therefore needs to find Tajik equivalents; otherwise, foreign terms expressing
these concepts, together with their numerous derivatives, can penetrate the Tajik language,
making it dependent on other languages and depriving it of its independence.
This is not to deny the borrowing of terms from other languages. Over millennia, Tajik,
like other languages of the world, as a result of linguistic and cultural contacts
with other languages, has borrowed many words and terms, increasing its stock.
Today, however, it is a matter of concern that, along with borrowed terms, the
derivatives of these terms may penetrate Tajik, which will lead to the spread of
derivational and terminographic models which are alien to Tajik.
During the Soviet era, Russian derivational and terminographic models were
borrowed into Tajik, the consequences of which are still felt today. As a result of the
policy of Russification, the domains in which national languages (i.e. languages
spoken by non-Russian nationalities) were used were limited, and national languages in most Soviet republics deviated from their natural course of development, thereby losing their internal opportunities for enrichment. Tajik was not
an exception: new scientific concepts and terms were copied from Russian
or were translated interlinearly into Tajik. As a result, derivational and terminographic models that are alien to Tajik became widespread in it. Even though well-known Tajik scientists have repeatedly emphasized, in various conversations and
in their writings, the need to refrain from using artificial derivational and terminographic models, this issue has not yet received common approval.
Besides the influence of Russian on Tajik, the influence of English is also
increasing.
Today, in some scientific circles, there is a prevalent erroneous opinion that
Tajik words cannot properly capture new or international concepts. According
to generally approved terminological and terminographic rules, a term does not
need to correspond uniquely to a concept. There are many terms that have no
links to, or even contradict, their literal meanings. For example, paššaxona, a
compound noun which consists of pašša ‘mosquito’ and xona ‘house’, means not
a shelter for mosquitoes, but a canopy to protect against mosquitoes.
As noted above, most of the new concepts penetrating Tajik are a product of
a changing world, and not the intellectual products of the Tajik people; therefore
they do not have exact equivalents in Tajik. Hence, to express these concepts in
Tajik, one should use four generally approved terminological methods, namely
1) to search for equivalents in the native Tajik lexicon, allowing Tajiks to benefit
from their own lexicon, which is preserved not only in scientific and literary
works, but also in the living folk language; 2) to coin new terms; 3) to translate
foreign terms; and 4) to borrow foreign terms.
For example, the political and economic transformations that Tajikistan
underwent in its transition to capitalism necessitated Tajik equivalents of such
economic terms as market, business, and businessman. In the early 1990s, the Terminology Committee of Tajikistan recommended that they be bozor, bozargonī,
and bozargon, respectively, in Tajik. These terms were proposed following the first
method, the search for equivalent words in the native Tajik vocabulary. The word
bozor ‘market’ is certainly in wide use today as it has been in the past. The words
bozargon and bozargonī were also widely used in works of the ancestors of the
Tajik, including Ta”rix-i Buxoro, in which, for example, one finds such passages
as “[t]hey [i.e. the people of the Sharg settlement] in ancient times had a bazaar,
and every year in the middle of winter from distant vilayats (regions) people came
for ten days for commerce and trade” (Naršaxī 2012: 42) and “there [the Vardona
settlement] once a week a bazaar was held, where many goods were traded”
(Naršaxī 2012: 43).
Obviously, such euphonious primordial Tajik words as bozargon and bozargonī
were used in the said book and in other historical works to denote commerce and trade,
respectively. However, if we assign the meanings of businessman and business to
these words, they are undoubtedly able to express these meanings readily.
The second method is creation of terms. This method forms new terms with
strict adherence to historically formed word-formation and terminographic models
of Tajik. According to generally approved terminographic rules, a new term should
be created from current and meaningful elements, so that it would be possible to
form other terms on its basis. For example, for piecework in English, paymonkorī
has been created in Tajik. Other Tajik words created using the second method
include manfiatdor ‘beneficiary’, izofaandoz ‘excise’, pazirišgoh ‘acceptance house’,
me”yorhoi maqbul ‘prudential standards’, qarzi xudtajdid ‘revolving loan’, hissai
ištiroki afzaliyatnok ‘qualified participation interest’, sofkorii pul ‘money laundering’, amonati xudguzar ‘automatic deposit’, etc.
The third method is translation of foreign words. This method forms new
terms by translation from other languages, especially from Russian and English.
Terms such as iqtisodi bozorī ‘market economy’, nizomi idorai xudkor ‘automated
control system’, ašyoi kambahovu zudfarso ‘low value and quickly wearing items’,
majmaai kišovarziyu sanoatī ‘agroindustrial complex’, sabti dugonai hisobdorī
‘double entry’, sahmiyahoi nomī ‘registered shares’, tahlili xarojoti jorī ‘analysis
of current costs’, robitahoi iqtisodii xorijī ‘external economic relations’, nizomi
iqtisodī ‘economic system’, barnomai iqtisodī ‘economic program’, pešrafti iqtisodī ‘economic development’, and me”yorhoi iqtisodī ‘economic standards’ – and
more – are formed by translation from Russian.
For example, for English financing, mablaǧguzorī is preferable to mablaǧdihī
or tamvil. The dissonant term moliyakunonī, which is used in some publications,
is very far from the grace of the Tajik syllable. The use of the term tamvil in this
meaning has an advantage, because in economics and finance the word
refinancing is used alongside financing and, by analogy with the word
tamvil, may be translated as boztamvil, as in me”yori boztamvili Bonki millii Tojikiston ‘refinancing rate of the National Bank of Tajikistan’.
The fourth method is borrowing terms from other languages. That is, in the
absence of Tajik equivalents of foreign terms and of the possibility of creating new
native Tajik terms, terms are borrowed directly into Tajik from the original language or indirectly through a third language. Such terms include monitoring,
holding, transfer, forward, and forfeiting.
Tajik terminology rests largely on the rich word stock accumulated over the last millennium, and also on native word formation and borrowing. Tajik terminology developed greatly during the Soviet era,
when it acquired a regulated system under the influence of Russian.
The first manual on Tajik terminology was approved by the Central Committee
on New Alphabet and Terminology in 1936. The manual established basic principles of Tajik terminology that were to be observed until the 1970s. However, the
manual was compiled in conformity with the policy of its time and had little
scientific value, and as such proved to be a hindrance to the development of Tajik
and of national terminology in the 20th century.
In 1960, on the initiative of Muhammad Osimī, Chairman of the Terminology
Committee and former President of the Academy of Sciences, the draft of Basic
principles of the terminology of the Tajik language was presented to the other
members of the Committee; it defined the principles of current Tajik terminography. The manual Basic principles of the terminology of the Tajik language
was published in 1971 under the editorship of Muhammad Osimī and Nosirjon
Ma”sumī. It is worth noting that the manual was developed by Yakub Kalontarov
and Abdulqodir Maniyozov and that Akbar Turson also participated in its final
edition.
The valuable manual contains 15 principles, whose high scientific and practical value lasts to this day. However, the new realities that have arisen with the independence of Tajikistan and the formation of its national statehood call for the revision,
improvement, and serious re-evaluation of some principles of Tajik terminography.
The manual was published in the epoch dominated by Soviet ideology, which
was engaged in dissemination of Russian and propagation of Russian terminology – even when it formally propagandized equality of languages. As a result, the
use of Tajik in science and technology had limited possibilities and priority was
given to Russian terms in Tajik terminography.
Ten of the fifteen principles stated in the manual (principles
3, 4, 6, 7, 8, 9, 12, 13, 14, and 15) had direct or indirect ties to Russian. In particular, principle 3 advocates enriching the terminological stock of Tajik by
drawing on terms from Russian and other languages (Kalontarov 1971:22), while
principle 4 promotes exact translation of words and word combinations (calques)
from Russian. Principle 12 stipulates that some Greek-Latin word-formation elements available in Russian terminology remain unchanged in Tajik (Kalontarov
1971:51).
However, in independent Tajikistan (since 1991), Tajik has the status of “state language”. The moment and opportunity have come to define and establish
the principles and approaches of Tajik terminography anew, proceeding from the
original and historically developed principles of Tajik.
Accordingly, we provide below a number of new principles of Tajik terminography, which in our opinion better reflect the nature of Tajik and its terminography, so as to stimulate discussion among experts. As terminography is a discipline that demands knowledge of word formation, examples are provided with
each principle for ease of understanding.
Principle 1. Effective use of Tajik vocabulary retained in the scientific and literary
works of ancestors and in the present-day national language:
Example: zamin ‘earth’, osmon ‘sky’, sitora ‘star’, kūh ‘mountain’, dara ‘valley’, čašma
‘spring’, daryo ‘river’, sadaf ‘pearl’, nuqra ‘silver’, buzurgī ‘value’, safeda ‘proteins’, birinjī
‘bronze’, za”faron ‘saffron’, gahvorajunbon ‘praying mantis’, šudgor ‘ploughland’, hastī
‘genesis’, sulola ‘dynasty’ etc.

Principle 2. Terminologization of words in common use:
Examples: bozor ‘market’, bozori pul ‘money market’, bozori mol ‘commodity market’, bozori
kor ‘labour market’, bozori koǧazhoi qimatnok ‘securities market’, iqtisodi bozorī ‘market
economy’, devon ‘office’, devoni vaziron ‘the cabinet’, bozargonī ‘business’, bozargon ‘businessman’.


Principle 3. Formation of new terms with strict observance of the historical patterns
of Tajik word formation and compounding: There are 8 methods of Tajik word
formation: 1) affixation; 2) compounding; 3) conversion; 4) compounding involving the conversion of a phrase into a word; 5) semantic extension; 6) acronymization/abbreviation; 7) izofat; and 8) compounding involving prepositions.
Examples of 1 (prefixation): abarsaxt ‘ultrastrong’, abarsado ‘hypersound’, abarmard ‘superman’, andaryoft ‘cognition’, barsū ‘upwards’, beandoza ‘dimensionless’, bozdošt ‘detention’,
farosavt ‘ultrasound’, furusū ‘downwards’, hamčand ‘equivalent’, norost ‘indirect’, nopaziro
‘unacceptable’, po’d’zahr ‘antivenom’.
Examples of 1 (suffixation): guša ‘corner’, daha ‘decade’, oveza ‘lifting bar’, asbak ‘lock
pin’, xištak ‘diopter’, ravanda ‘running; planet’, xazanda ‘craw’, darozo ‘length’, žarfo
‘depth’, pahno ‘width’, gardon ‘rotating’, nuhgona ‘nonary figures, i.e., from 1 to 9’, jismonī
‘physical’, daryobor ‘area abounding in rivers’, rasadgoh ‘observatory’, tihigoh ‘flank’, sangiston ‘rocky terrain’, šūriston ‘saline land’, varziš ‘sport’, yakī ‘unit’, sardser ‘frigid climate’,
garmser ‘hot climate’.
Examples of 2 (compounding of two words): bunovar ‘abscess’, govmeš ‘buffalo’, dupaykar
‘twins’, sangpušt ‘turtle’, turšširin ‘sour sweet’, darozmiyona ‘medium altitude’, donišnoma
‘encyclopedia’, tandurust ‘healthy’, yoddošt ‘memory’, surxrag ‘artery’, sesū ‘triangle’,
hazorpo ‘centipede’, sarsom ‘meningitis’, mardumgiyoh ‘ginseng, mandrake’, sitorayob
‘astrolabe’, šaborūz ‘day and night’, šohbalut ‘chinquapin tree’, modaronbū ‘milfoil’,
bodresa ‘nozzle for spindle’.
Examples of 2 (compounding of three words): ušturgovpalang ‘giraffe’, sangberunoranda
‘stone deferent’, duvozdahanguštī ‘duodenum’.
Examples of 3: nimburid ‘demilune’, purī ‘plenilune’, rostpahlu ‘rectangle’, istoda ‘immovable’, oveza ‘lifting bar’, oramida ‘immobile’, baranda ‘bearing’, gardanda ‘rotating’, giranda
‘occulting’, ravanda ‘running; planet’, paranda ‘flying’, guzar ‘passage’, šumor ‘arithmetic’,
došt ‘possession’, xost ‘will’.
Examples of 4: murǧobī ‘duck’, būyimodaron ‘milfoil’, dorčinī ‘cassia cinnamon’, Nimasb
‘Centaurus’, kanorirūzī ‘dayend’, kanorišabī ‘night end’, kičuman ‘who looks like me’,
čumankidid ‘who saw similar to me’.
Examples of 5: Sutun ‘column’ in ancient geometry is used as a term to mean ustuvona ‘cylinder’: sutunirost ‘straight cylinder’, sutunikaž ‘inclined cylinder’. Tir ‘arrow’ as a term in
the meaning of mehvar ‘axle’: tirisutun ‘axis of cylinder’. Surxī ‘redness’ in the meaning of
šihob ‘meteor’; balandī ‘altitude’ in the meaning of zirva ‘apogee’; xirman ‘barnyard’ in the
meaning of hola ‘mirage’ as a term of astronomy. Kostanu afzudan is used in arithmetic
in the meaning of subtraction and addition, while in astronomy it is used in the meaning of
decrease and increase, e.g., of the Moon.
Examples of 6: α-zarra ‘α-particle’, β-šuo” ‘β-rays’, NBO Roǧun ‘Rogun HPP’, d.i.f. (doktori
ilmi filologiya) ‘Doctor of Philology’, km/s and m/s (kilometr/soniya va metr/soniya) ‘km/sec
and m/sec’, AMIT (Akademiyai millii ilmhoi Tojikiston) ‘National Academy of Sciences of
Tajikistan’, MU SMM (Majmaai umumii Sozmoni Milali Muttahid) ‘United Nations General
Assembly’.
Examples of 7: adadi avval ‘first number’, adadi murakkab ‘complex number’, adadi tom
‘perfect number’, burji gardon ‘rotatable constellation’, kasri dahī ‘decimal number’, Rohi
Kahkašon ‘the Milky Way’, xatti rost ‘straight line’, inqilobi zimistonī ‘winter solstice’,
é”tidoli bahorī ‘vernal equinox’, ixtilofi manzar ‘parallax’, falaki homil ‘deferent’, falaki
tadvir ‘epicycle’, quvvati yoddošt ‘memory might’, qutbi šimol ‘North Pole’, dardi dandon
‘toothache’.
Examples of 8: afzunī ba adad ‘increment by account’, doira bar šakl ‘figure escribed in
circle’, šakl bar doira ‘figure escribed near circle’, payvand ba pahno ‘adjunction on width’.

Principle 4. Formation of new terms based on words that have currency today.
Such terms tend to be productive and readily yield many derived terms, a capacity that terms
formed on Old and Middle Persian word-formation patterns lack.
Examples: sarmoya ‘capital’, sarmoyaguzor ‘investor’, sarmoyaguzorī ‘investment’, sarmoyador ‘capitalist’, sarmoyadorī ‘capitalism’, sarmoyafizoī ‘capitalization’, sarmoyai
ijtimoī ‘social capital’, sarmoyai qarzī ‘loan capital’, sarmoyai aslī ‘basic capital’, sarmoyai
é”lonšuda ‘stated capital’, sarmoyai insonī ‘human capital’, sarmoyai avvaliya ‘statutory
fund’, sarmoyai sahmī ‘capital stock’, sarmoyai tabiī ‘natural stock’, sarmoyaguzorii mustaqil ‘independent investment’, sarmoyaguzorii doxilī ‘domestic investments’, sarmoyaguzorii
xorijī ‘foreign investment’, sarmoyaguzorii mustaqim ‘direct investment’, sarmoyaguzorii
voqe”ī ‘real investment’, paymon ‘contract’, paymongar ‘contractor’, paymonkor ‘contractor’, paymonkorī ‘contract’, paymonšikan ‘breaking contract’, paymonšikanī ‘infringement
of a contract’, sahmiya ‘share’, sahmiyador ‘shareholder’, sahmiyadorī ‘share holdings’,
sahmiyafurūšī ‘sale of shares’, sahmiyai nomī ‘nominal share’, sahmiyai odī ‘common stock’,
sahmiyai imtiyozdor ‘preferred share’.

Principle 5. Literal translation of foreign terms: Achieving an exact and successful translation of a term is difficult, but vulgar or rough words and
expressions should be avoided in translating foreign terms.
Examples: zarinbarg ‘golden leaved’, xudnavis ‘selfrecorder’, adadi duraqama ‘two-digit
number’, jadvali zarb ‘multiplication table’, buzurgii mutlaq ‘absolute value’, vazni qiyosī
‘specific weight’, naqliyoti rohi ohan ‘railway transport’, sarvi hamešabahor ‘evergreen
cypress’, arziši izofai mutlaq ‘absolute surplus value’, arziši aslī ‘prime cost’, huquqi ijora
‘lease law’, ijoragir ‘lessee’.

Principle 6. Free translation:
Examples: haqqi ištiroki afzaliyatnok ‘qualifying holding’, me”yorhoi maqbul ‘prudential
standards’, iqomatnoma ‘residence permit’, bimanoma ‘insurance policy’, qarzi xudtajdid
‘revolving credit’.


Principle 7. Notional translation of a term or formation of new terms by transmission of a term’s meaning into other languages:
Examples: iqrori ixtiyorī ‘acknowledgement of guilt’, tabaqai zerxok ‘subsoil’.

Principle 8. Formation of new terms through hybridization of Tajik words and
expressions with terms from other languages:
Examples: duatoma ‘diatomic’, karbidi ohan ‘iron carbide’, dahanai vulqon ‘volcanic neck’, ohanu beton ‘reinforced concrete’.

Principle 9. Formation of new terms by joining Tajik lexemes to terms of Russian
or other languages:
Examples: bonkdorī ‘banking’, betonrez ‘concreter’, bombaandoz ‘bombardment aircraft’,
mošinron ‘driver’, pianinonavoz ‘piano player’, futbolboz ‘football player’.

Principle 10. Formation of compounds for designation of terms, which in Russian
and other languages are word combinations:
Examples: sadoafkan ‘acoustic radiation element’, sadoparda ‘vocal cord’, angištsang
‘hardcoal’, namaksang ‘rock salt’, maǧzparda ‘brain tunic’, harommaǧz ‘spinal cord’,
sarsatr ‘new paragraph’, sikkaxona ‘mint’, tirreša ‘main root’.

Principle 11. Rendering foreign terms into word combinations:
Examples: sitorai dumdor ‘comet’, kišti dubora ‘passage’, kišti ilova ‘undersow’, ta”siri
mutaqobila ‘interaction’, muhlati é”tibor ‘credit’.

Principle 12. Expression of one scientific notion by combining two or three borrowed, or Tajik, terms is subject to the grammatical rules of the Tajik language:
Examples: Dubbi Akbar, Xirsi Kalon, Haftdodaron ‘Ursa Major’; kusuf, giriftani Oftob
‘eclipse of the sun’, xusuf, giriftani Moh ‘eclipse of the moon’; harakat, junbiš ‘movement’;
pudina, hulbūy ‘spearmint’; tir, mehvar ‘axle’. Tajik names for zodiac constellations are also
formed in accordance with this principle: 1. Barra ‘Hamal’ ‘Aries’; 2. Gov ‘Savr’ ‘Taurus’; 3.
Dupaykar ‘Javzo’ ‘Gemini’; 4. Xarčang ‘Saraton’ ‘Cancer’; 5. Šer ‘Asad’ ‘Leo’; 6. Xūša/ Javonzan ‘Sunbula’ ‘Virgo’; 7. Tarozu ‘Mizon’ ‘Libra’; 8. Každum ‘Aqrab’ ‘Scorpius’; 9. Tirandoz/
Kamonvar ‘Qavs’ ‘Sagittarius’; 10. Buzǧola ‘Jady’ ‘Capricornus’; 11. Obrez ‘Dalv’ ‘Aquarius’;
and 12. Mohī ‘Hūt’ ‘Pisces’.

Principle 13. Terms borrowed from other languages are nativized (i.e., “Tajikized”) in pronunciation:
Examples: akademiya ‘academy’, kumita ‘committee’, aspirant ‘postgraduate’, doktorant
‘person working for doctor’s degree’, polkovnik ‘colonel’, sulfur ‘sulfur’, karbon ‘carbon’,
oksigen ‘oxygen’, hidrogen ‘hydrogen’, bonk ‘bank’, monitoring ‘monitoring’, holding ‘holding
company’, forvard ‘forward’.

Principle 14. Observance of the expressiveness, elegance, and subtlety of the Tajik language: A Tajik terminographer should have a good knowledge of Tajik and delicate
taste. He should be able to appreciate the grace of words, because, apart from some
rough terms created during the Soviet period and subsequent years, no use of
rough, obscure, or intricate terms is observed in the thousand-year history of
scientific and literary Tajik.
In Tajikistan, issues related to the formation and regulation of terms are dealt with
by the Terminology Committee, which has a history of more than 80 years. It was
established on June 11, 1933, by a resolution of the Central Executive Committee of
the Soviets of the Tajik SSR approving the Law on the Central Committee of the New
Tajik Alphabet. In 1935, by a decree of the Central Executive Committee of the
Tajik SSR, the first Terminology Committee (Central Committee of New Alphabet
and Terminology) was established. This Committee consisted of 4 divisions, one
of which was called the Dictionaries and Terminology Division. In 1936, the Committee
approved the Instruction Approval of Tajik language terminology, which
defined the basic principles of terminology of Tajik.
On June 22, 1960, by the Decree of the Council of Ministers of the Tajik SSR
(No. 276), the Terminology Committee under the Presidium of the Academy of
Sciences of the Tajik SSR was reestablished. The Committee developed the Regulations, which were approved on September 6, 1961, by the Presidium of the
Academy of Sciences. The Committee had its own publication (Terminological
Bulletin).
In 1979, the Committee was dissolved, and the Department of Terminology and Speech
Culture was established within the structure of the Institute of Language and
Literature named after Rudaki, the father of Persian literature. The Department was
entrusted with the development and regulation of terms.
On September 12, 1990, by the Order of the Council of Ministers of the Tajik
SSR (No. 314), the Terminology Committee was reestablished under the Presidium
of the Academy of Sciences of the Tajik SSR. This order increased the rights and
competencies of the Committee and obliged all ministries and departments, and
institutions and enterprises within the territory of Tajikistan to comply with all
its decisions and instructions on terminology and terminography. The famous academic Muhammadjon Šakurī was elected chairman of the Committee, which
consisted of such prominent Tajik scientists as Muhammad Osimī, Loiq Šeralī,
Habibullo Saidmurodov, Razzoq Ǧafforov, Šarofiddin Rustamov, Abduqodir
Maniyozov, Muso Dinorshoev, Jumaboy Azizqulov, Akbar Turson, Ǧaffor Ašurov,
Saidjafar Qodirī, and others. Mirzo Hassan Sulton, Sayfiddin Nazarzoda, Hassan
Yorzod, Saidahmad Qurbon, Ilhomjon Hojiev, Yusuf Akbarzoda, Salmon Jamolov,
Maqsud Hojimuhammad, and Ǧiyosiddin Qodirov were involved in the Committee as
well.
The activities of this Committee before the outbreak of the civil war were fruitful and efficient. During this time, social and political terminology and the terminology of other branches of science and technology gradually began to be formed
and regulated. The Terminology Committee compiled and proposed dictionaries
of terms for the majority of ministries and departments. A large number of dictionaries and books were published and glossaries were approved by the Committee.
After the settlement of the political situation and the strengthening of peace and
harmony in the country, the activities of the Terminology Committee improved
significantly. In October 2009, the Language and Terminology Committee under
the Government of the Republic of Tajikistan was established, which now deals
with the control, regulation, and use of Tajik terms.
Over the last decade, several dictionaries of terminology have been published,
such as the Polytechnical Russian-English-Tajik dictionary (2016) and the Russian-Tajik Explanatory dictionary on innovation and scientific-technological activities
(Nurov 2019). The Tajik-Russian dictionary on law terminology (Šokirov 2012) is of
note as well.

Literature
Apresjan, Yurij D. (ed.). 2000. New English-Russian Dictionary. Vol. 3. Moscow: Russkij jazyk.
Baranov X.K. 1984. Arabsko-russkij slovar’ [Arabic-Russian Dictionary]. Moscow: Russkij jazyk.
Danilenko V.P., V.M. Lejčik, M. Muravickaja & V. Perebijnis. 1987. Terminovedenie i
terminografija v indoevropejskix jazykax [Terminology and terminography in
Indo-European Languages]. Vladivostok: Publisher unknown.
Kalontarov Y.I. 1971. Osnovnye principy terminologii tadžikskogo jazyka. [Basic principles of the
terminology of Tajik language]. Dushanbe: Doniš.
Kondakov N.I. 1971. Logičeskij slovar’ [Logical dictionary]. Moscow: Nauka.
Naršaxī, Abubakr Muhammad Ibn Jafar. 2012. Ta”rixi Buxoro [The History of Bukhara]. Edited by
Golib Goibov, Karomatullo Olimov, & Nurmuhammad Amiršohī. Dushanbe: Payvand.
Nazarzoda, Sayfiddin. 2013. Istilohoti tojikī: ta”rix, garoyiš va durnamo [Terminology of Tajik
language, history, tendencies, perspectives]. Dushanbe: Irfon.
Nurov, Pirmahmad. 2009. Tadžikskaja naučno-texničeskaja terminologija [Tajik scientific
technical terminology]. Dushanbe: Doniš.
Nurov, Pirmahmad. 2019. Farhangi tafsirii rusī ba tojikī oid ba fa”oliyati innovatsionī va ilmiyu
texnologī [Russian-Tajik explanatory dictionary on innovation and scientific-technological
activities]. Dushanbe: Doniš.
Phillips, Robert S. (ed.). 1973. Funk & Wagnalls New Encyclopedia, vol. 23. New York: Funk &
Wagnalls, Inc.
Rukopis’ Tumanskogo [Manuscript of Tumanski]. 1930. Leningrad: Nauka.
Sulton, Mirzo Hassan. 2003. Istilohoti ilmii “Kitob-ut-tafhim” Aburayhoni Berunī [Scientific
terminology of “Kitab-al-tafhim” Abu Rayhan Biruni]. Dushanbe: Doniš.
Sulton, Mirzo Hassan. 2008. Stanovlenie i razvitie persidsko-tadžikskoj terminologii [Formation
and development of the Persian-Tajik Scientific Terminology]. Dushanbe: Doniš.
Sulton, Mirzo Hassan. 2011. Jazyk nauki i terminologija [The language of science and
terminology]. Dushanbe: Irfon.
Sulton, Mirzo Hassan. 2019. Istilohšinosī va istilohnigorii tojikī [Tajik terminology and
terminography]. Dushanbe: R-graph.
Šokirov, Tuǧral. 2012. Farhangi tojikī-rusii istilohoti huquq [Tajik-Russian dictionary on law
terminology]. Khujand: Nargis.
Šokirov, Tuǧral. 2017. Lingvističeskoe izučenie juridičeskix terminov [Linguistic studies of
juridical terms]. Khujand: Dabir.
Šul’c G. 1905. Latinsko-russkij slovar’ [Latin-Russian dictionary]. Saint Petersburg: Book Store
of K. Feldman.
Xajutin, A.D. 1972. Termin, terminologija, nomenklatura [Term, terminology, nomenclature].
Samarkand: Samarkand State University Publishers.

Index
Abdulqodir Muhiddinuf 22–24, 38, 49–50
Abdulvohid Munzim 49–50
Abduqodir Maniyozov 399
Abdurauf Fitrat / Abdurrauf Fitrat 13, 15, 24,
38–39, 46, 49–50, 61, 375
Abulfazli Hiravī 391
Abulma”šari Balxī 391
Abulvafoi Buzjonī 391
Abumahmudi Xujandī 391
Abunasri Forobī 391
Abusaidi Sijzī 391
Afghan Sign Language 229, 248–253, 257,
259–261, 263–268
Afghanistan 46, 48, 59, 184, 222, 229,
248–252, 261, 279, 283–285, 287, 291,
321, 358, 361
Ahmad Doniš 50
Ahmadi Farǧonī 391
Akbar Turson 399
Aktionsart 183, 215, 223
Al-Bukhari 373
Almosī 58, 69
American Sign Language 232, 250–251,
264–265, 267
Amir Olim Xon / Emir Alim Khan 4, 374
Amu Darya 3
Andijon 21
aorist 114–115
Arabic 14, 29–30, 39, 72, 79, 82, 84–87,
130–131, 140, 297, 338, 355, 359, 375,
390, 392
Arabic script 14, 50, 74, 84, 97, 168,
219, 250
Arabs 3–4
Ashgabad 14
Asht 287
Avestan 221
Avicenna 373, 391
Avliyoato 21
Azerbaijani / Azeri 6, 9, 34, 46
Bactrian 276
Badakhshan 184, 222, 275–278, 284–286,
289, 291, 301, 317, 320, 324, 327,
https://doi.org/10.1515/9783110622799-009

337–338, 346, 348–349, 355–356,
359, 360
Badakhshani Tajik 284–285, 289, 295,
297–298, 302, 306–307, 309, 311–312,
316–320, 322–323, 325–326, 330,
333, 339, 341, 343, 345, 348, 350–351,
356–357, 361
Bajuwi 308, 325, 339, 342
Balochi 128, 320
Bartang 27
Bartangi 292, 338, 358
Basmachi / basmači 20, 26
Baysundaryo 7
Behbudī, Mahmud 14, 17
Berunī 391
Boǧi šamol 5–6, 10, 16, 34
Bokhtar 232, 237–238, 241–242 see also
Qurǧonteppa
Bolsheviks 1–4, 6, 11–12, 20, 23, 33–34, 38
Boysun 287
Bukhara 3–4, 6–8, 12, 14, 21–28, 30, 34,
37–40, 46–56, 64, 73, 74, 96, 219–220,
287, 371–379, 381–382, 384–386
Bukharan Tajik 47, 50–51, 65, 78, 91, 176
Bulgarian 215
Buxoro axbori 22
China 46, 63, 278, 321
Chinese 63
Chust 287, 344
Classical Persian 117–118, 120, 123, 125,
127, 135, 167, 174, 183–184, 186–187,
197, 199–200, 203, 217–220, 223, 347,
356–357 see also Persian
conditional mood 113–117, 145, 170, 190,
202–203
conjectural mood 109, 114–116, 165,
173–176, 178, 186, 190, 213
counterfactual mood 115–116, 176, 201–203
Čūzī 58
Cyrillic script 184, 375
Dari 184–185, 193, 197, 203, 218–221, 223,
250, 275–276, 283, 295, 356, 358, 361


Darvoz / Darwoz 19, 29, 30, 276, 278–280,
283, 287–292, 294, 298, 307, 318, 327,
347, 348, 356, 361
Darvozi Tajik 292–294, 327
Dašti Qipčaq 20
Dašti Yazǧulom 284
Dehqonobod 69
deontic modality 109, 111–112, 117, 148, 150,
153–156, 162–163
Derbent 287
durative aspect 166, 183, 197–198, 204–205,
207–210, 212, 215, 217, 221–223
Dushanbe / Dušanbe 27, 45, 48, 51, 52,
54–59, 66, 68–69, 71, 76, 78, 97,
109–110, 169, 184, 209, 211, 214,
232–242, 246, 251–252, 259, 264, 279
Dutch 85, 215
dynamic modality 109, 111–112, 148,
154–156, 158–159, 161–163
epistemic modality 109, 111–112, 114, 116,
124, 129, 134, 136, 148–150, 152, 155, 164
Éronī 3–6, 9–10, 12–13, 16, 20, 30, 34
Faizobod 237
Falgar / Falǧar 19, 29–30, 287
Fan-Darya 287
Farǧona / Fergana 8, 21–22, 32, 287
Finnish 85
forsī 4, 9–10, 14, 17, 19, 29, 31, 37, 46, 279,
285 see also Persian
French 72, 93, 390
French Sign Language 232, 251, 264–265, 267
Ǧaffor Ašurov 399
Ǧarm / Garm 25, 27, 48, 52, 69–70
Genghis Khan 374
German 72
Ghoron / Ǧoron 279, 282–285, 289–290,
295, 298, 342, 356, 357, 361
Ǧiyosiddin Qodirov 400
Ǧižduvon 48, 54
Guliston 48, 70
Habibullo Saidmurodov 399
habitual aspect 111, 176, 183, 188, 196,
198–199, 201, 203–204, 210, 212, 222

Hassan Yorzod 400
Hazaragi 59
Hebrew script 14, 50
Herat 193, 203, 251
Hesār 109 see also Hisor
Hisor 48, 58, 67–69, 237, 241, 287–288
Hojī Muin 14–16, 28, 32, 35
Ilhomjon Hojiev 400
imperative mood 109, 113–116, 130, 162,
166–170, 172, 176
imperfective aspect 184–185, 188,
200–201, 212
Imrūz 57
indicative mood 109, 113–116, 126, 128–129,
131, 143–144, 148, 162, 165–167,
169–170, 172–176, 178, 185, 190–191,
197–198, 200, 202, 222
inferential mood 114–116
intentional mood 115–116, 133
Interior Salish 72
Iran 9, 46, 126, 152, 167, 178, 183, 187, 195,
210, 217, 219, 222, 291
irrealis mood 113, 167, 176, 183, 185, 188–190,
196, 199, 201–204, 206, 208–209, 223
Isfara 21, 282
Ishkashim / Iškošim 27, 276, 278–279,
282–284, 290, 338, 342, 348, 361
Ishkashimi 276, 278–280, 282, 284, 321,
325, 330, 341–342, 347, 356–357
Ishkashimi Tajik 290
Iskandar 64
Ismoili Jurjonī 391
Italian 215
Jalalabad 249–250, 252–253
Japanese 48, 67
Japanese Sign Language 256, 265
Jews 3–4, 7, 9, 11–12, 50, 61
Jordan 250
Jordanian Sign Language 250
Judeo-Persian 173–174 see also Persian
Jumaboy Azizqulov 399
Kabul 193, 249–250, 252–253
Kazakhstan 48, 373
Kazan 233


Khatlon 284, 287, 307, 320
Khorezm 6
Khorog / Khorogh 237–238, 279, 285, 290
Khotanese 221
Khujand / Xujand 19, 21, 24–25, 29–30, 35,
39–40, 47–49, 52, 54–56, 64–65, 70,
96, 201, 237–238, 251–252
Khujandi Tajik 55–56, 65, 67
Kofarnihon 7
Koni Bodom / Konibodom 21, 287
Kūlob / Kulob 27, 47–48, 52, 55–56, 232,
237–238, 240–242, 245, 287–288
Kūlobi Tajik 56, 287–288
Kunduz 46, 48
Kyrgyz 360
Kyrgyzstan 48, 371
Langar 27, 289
Latin script 377, 380
Leninabad 257 see also Khujand / Xujand
Leninsky 235
Leninsky School 235–238, 240–247,
252–268
light verb 122, 139, 140, 142, 145, 152, 191, 217
Lohūti (Abulqosim Ahmadzoda) 37, 50
Loiq Šeralī 399
M. K. Trojanovsky 16
Mahmudxoja Behbudī 14, 17–18
Maqsud Hojimuhammad 400
Marv / Merv 6, 34, 219
Masčoh 8, 29–30
Matcha / Matča 25, 287
Mexican Sign Language 256
Middle Persian 118, 174, 183, 187, 195, 208,
217, 220, 223, 294, 397 see also Persian
Mirzo Hassan Sulton 400
Moscow 16, 22, 36–37, 144, 146, 160, 233,
235, 246, 252
Mū”min Xoja 34
Muhammad Osimī 399
Muhammad Zakariyoi Rozī 391
Muhammadī 16
Muhammadi Xorazmī 391
Muhammadjon Šakurī 399
Mullo Nodiro 18
Munji 290, 321


Muso Dinorshoev 399
Muxtorī 16
Najot 13
Namangon 21
narrative mood 114, 115
Nasiridduni Tusī 391
Navrūzī 67
New Persian 61, 63–64, 73, 125, 174,
186–188, 219, 223 see also Persian
Nicaraguan Sign Language 243
Nisor Muhammad 28
Northern Tajik Chain Shift 61, 64–65, 71, 96
Nosiri Xusrav 391
Old Persian 168 see also Persian
Old Vanji 276, 278, 280, 283, 302–306,
330, 358
optative mood 113–116, 166, 170
Orenburg Cossacks 2
Ovozi tojik 8, 28–30, 32–35
Oyina 5, 14–16
Pakistan 249, 321
Pakistan Sign Language 250
Panj 19, 279, 288–292
Panjakent / Penjikent 64, 287
Pashto 249–250, 305
Pavlovsk 232
perfective aspect 183–188, 200–201, 212
Persian 1–25, 27–38, 46, 88, 110, 117–119,
122–123, 125–126, 128, 131–134,
151–152, 167, 173, 178, 183–184,
186–188, 190–191, 193–195, 200–201,
203, 206, 208, 210, 217–223, 275–276,
279, 281, 284, 286, 294–295, 312, 356,
359, 374, 399
Peshawar 249, 250, 252
Poland 6
porsī 19, 279, 284–285 see also Persian
presumptive mood 114–115, 173
progressive aspect 188, 196, 210, 215,
217, 220
Qarotegin / Qarategin / Karategin 19, 25,
29–30, 32, 283, 287, 290, 303–304,
320, 345, 349, 361


Qassansai / Qasansay 287, 344
Qulmunda 68–69
Qurǧonteppa 27, 237
Qutuluš 21
Qyzyl-Su 7
Rahвari doniş 88–89
Rasht 279, 287, 320, 362
Razzoq Ǧafforov 399
Red Army 2, 5, 21, 374
Regar 236
Rogh 284, 287, 342–343
Roghi Tajik 284, 287, 318, 342–344
Roshorvi 292
Rošorv 27
Rost 14, 16
Rudaki 235, 237, 399
Rushani 276, 278, 280, 297, 302, 308, 328,
330, 338, 358
Rušon / Rushan 27, 278, 309
Russia 2, 10–12, 18, 55, 229–234, 243,
251–252, 261, 268
Russian 11–12, 14, 19, 25, 33, 52, 61–62,
79–82, 87–91, 157, 230, 239, 247, 286,
293, 298, 328, 348, 371–372, 374–377,
379–385, 389, 392–395, 398, 400
Russian Sign Language 229–233, 236, 239,
242–248, 251–253, 257, 261–268
Sadoi Dušanbe 69, 87
Sadoi Turkiston 10, 15
Sadriddin Aynī 10, 14–19, 21, 23, 28–32,
36–37, 49–50, 52, 375
Safina 58
Šahrinav 48, 58
Saidahmad Qurbon 400
Saidjafar Qodirī 400
Saidrizo Alizoda / Sayyid Rizo Alizoda 5–9,
12–13, 16–17, 28, 33–34, 39, 49
Salmon Jamolov 400
Samarkand 1, 4–10, 12–16, 18–19, 21, 25–35,
39–40, 47–52, 54–56, 64, 73–74, 96–97,
287, 372, 377–378
Samarkandi Tajik 50–51, 53–56, 61, 74, 89
Sanglichi / Sanglechi 321, 356, 359
Sari Osjo 27
Sarikoli 278

Šarofiddin Rustamov 399
Sayfiddin Nazarzoda 400
Scottish Gaelic 70
Sedaqat Deaf Center 253
Shakhristan 287
Shughnan 275, 278, 309, 311, 338, 348
Shughni / Shughnani 59, 70, 275–278,
280, 284–286, 290, 295–308, 310–311,
313, 315, 318–319, 321, 323–343, 346,
348–361
Simon Ḥakham 46
Širinšoh Šohtemur 33, 34
Širobod 7
Sogdian / Soghdian 221, 276, 356, 360,
373–374
Soktare 50
Spanish 72, 215
speculative mood 114–116, 173
St. Petersburg 232–233
Stalin 3, 31, 33–34
Stalinabad 39, 50–52, 54, 97 see also
Dushanbe / Dušanbe
Standard Tajik Chain Shift 63–65
stative verb 122, 183–184, 188, 198, 204,
205, 209, 212, 223
Šū”lai inqilob 4–11, 13–14, 16–18, 20–21, 24
subjunctive mood 109, 113–116, 118–121,
123, 125–127, 131, 135, 140, 142–146,
148, 166, 169–170, 172–173, 185–187,
190, 197, 201–202, 207, 210, 213
Šuǧnon 27, 275 see also Shughnan
Šuǧnoni 275 see also Shughni / Shughnani
Surxandaryo 7
Syr Darya 3, 11
Tamerlane 374
Tanviri afkor 17
Tashkent 2, 4, 6, 15–16, 21–22, 27–28, 30,
32–34, 48, 97
Tatar 9, 215
Tehran 193
Tehrani Persian 294, 312 see also Persian
Timur 374
Timurid Persian 63 see also Persian
Transoxiana 2, 8, 14
Tsar Nicholas II 10
Turki 3, 7–8, 10, 12–16, 19, 21–24, 27, 31


Turkic 7, 16, 19–20, 23–24, 34–35, 38,
197, 219–220, 297, 355, 359–360, 371,
373–374
Turkish 9, 22
Turkish Sign Language 265
Turkiston 15
Turkiston 21
Tursunzoda 236
Umari Xayyom 391
Ūroteppa / Ura-Tyube 21, 29–30, 35, 287
Uzbek 6, 8–9, 22, 28–35, 37, 39, 47, 59,
61, 70, 78–79, 88, 220, 360, 371–372,
374–385
Uzbek SSR 49–50, 371, 374–375
Uzbekistan 25–27, 32–35, 37–40, 46–49, 53,
59, 236, 291, 371–378, 380, 384–385
Uzbekistan Arabic 59
Uzbeks 10, 17, 19–26, 34, 38–39, 371,
374, 380
Vahdat 237, 241
Vakhiyo-Qarategini Tajik 317, 322,
345, 347
Vanj 276, 278–279, 281, 283, 287–289, 291,
292, 294, 303–307, 309, 315, 320, 345,
356, 361
Vanji Tajik 281, 283, 292, 296, 303–306,
311, 314–315, 318, 330, 332, 334, 339,
344–345, 357, 358
Varzob 48, 54, 287, 288


Vaxon / Waxon 27, 290 see also Wakhan
Vienna 233
Wakhan 278–279, 282–286, 301, 321, 338,
348, 357
Wakhi 276, 278–280, 282, 285, 290, 299,
305–306, 321, 325, 328–330, 338, 341,
356–359
West Greenlandic 72
Xexak 284
Xinjiang 46
Xovar 57–58
Yaghnobi / Jaǧnobi 19, 221, 275–276,
310, 356
Yazghulam 278–279, 283–284, 338
Yazghulami 276, 278–280, 283, 305–306,
321, 330, 333, 356, 358–359
Yidgha 356
Yoged 288–289, 292
Yusuf Akbarzoda 400
Za partiju 23
Zarafšān 109 see also Zarafšon
Zarafšon 24
Zarafšon 8, 19, 25–27, 30, 40
Zebak 283, 361
Zehnī (Tūraqul Narsiqulov) 34–35, 46, 49
Zirakī 67, 69
Zorkul 279