
Name Origin: urdu means 'camp' in Turkish, a name applied to the lingua franca between Muslim garrisons and the civilian population of Delhi.

Classification: Indo-European, Indo-Iranian, Modern Indo-Aryan, Central.

Overview. Urdu is the result of the convergence of Indian and Islamic civilizations, of Indo-Aryan and Arabo-Persian, brought about by the creation of the Delhi Sultanate in the 13th century and, later, of the Mughal Empire. From the common trunk of Khari Boli (the speech of Delhi) were born, more or less at the same time, the sister languages of Urdu and Hindi. Both are very similar at the phonological and grammatical level but Urdu has been deeply influenced by Persian, and indirectly by Arabic, at the lexical and cultural levels.

Distribution: Urdu is spoken mainly in India and Pakistan. In India, it is widely distributed; states with more than one million speakers include Uttar Pradesh, Bihar, Jharkhand, West Bengal, Madhya Pradesh, Maharashtra, Andhra Pradesh, and Karnataka. Delhi, with close to a million speakers, is also an important centre for Urdu language and literature. In Pakistan, most Urdu speakers live in the eastern provinces of Sindh and Panjab.

    Within South Asia, Bangladesh and Nepal also have substantial numbers of Urdu speakers. Smaller minorities can be found in the Middle East (Oman, Bahrain, Qatar), Africa (South Africa, Mauritius), Europe (Germany, Norway) and Fiji.

Speakers. About 75 million in the following countries:

[Table of speakers by country not preserved; the only surviving entry is South Africa.]

Status. Urdu is the national language of Pakistan and one of its two official languages (the other is English). It is used there as a lingua franca, and it is also used by South Asian Muslim expatriates and by pilgrims travelling to Mecca. Urdu is also one of the 23 official languages of India and the official language of the state of Jammu and Kashmir.

Varieties. Urdu is very similar to Hindi, but there are substantial differences in their vocabularies, scripts, and religious and cultural backgrounds, so they are considered separate languages. Related languages are Hindustani, Khari Boli (Delhi speech), Braj (in the area of Agra), Avadhi (around Oudh), Bhojpuri (around Varanasi), Magahi (around Patna and in southern Bihar), and Rajasthani (in the state of Rajasthan).

    Dakhani ('southern') is a dialect of Urdu spoken in the Deccan Plateau of India, arising with the foundation of Daulatabad as a new capital of the Delhi Sultanate in the 14th century.

Oldest Documents. See Hindi.


Phonology

Urdu shares its sound system with Hindi. The only difference is the inclusion in Urdu of non-Indo-Aryan sounds derived from Arabic and Persian, such as the fricatives f, z, ʒ, x, ɣ, and the uvular and glottal stops q, ʔ.

Vowels (10). Urdu has a ten-vowel system composed of three lax and seven tense vowels. Lax vowels (ɪ, ʊ, ə) are phonetically short, and tense vowels (i, e, ɛ, u, o, ɔ, ɑ) are phonetically long. [ɪ] is slightly lower and more centralized than [i], and [ʊ] is slightly lower and more centralized than [u]. All vowels have nasal forms, and oral and nasal vowels are contrastive.
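The vowel inventory described above can be written down as a small data structure. The following is only an illustrative sketch: the symbols are the ones listed in the text, while the function names and the use of the combining tilde to mark nasal vowels are assumptions made for the example.

```python
# Vowel symbols exactly as listed in the text.
LAX = ["ɪ", "ʊ", "ə"]                          # phonetically short
TENSE = ["i", "e", "ɛ", "u", "o", "ɔ", "ɑ"]    # phonetically long

# Assumption: nasal vowels are written with the IPA combining tilde (U+0303).
NASAL_MARK = "\u0303"


def oral_vowels():
    """Return the ten oral vowels, lax first, then tense."""
    return LAX + TENSE


def nasal_vowels():
    """Return the nasal counterpart of every oral vowel."""
    return [vowel + NASAL_MARK for vowel in oral_vowels()]


def phonetic_length(vowel):
    """Classify a vowel (oral or nasal) as 'short' (lax) or 'long' (tense)."""
    base = vowel.replace(NASAL_MARK, "")
    if base in LAX:
        return "short"
    if base in TENSE:
        return "long"
    raise ValueError(f"not a vowel of the inventory: {vowel!r}")


if __name__ == "__main__":
    print(len(oral_vowels()), "oral vowels:", " ".join(oral_vowels()))
    print(len(nasal_vowels()), "nasal vowels:", " ".join(nasal_vowels()))
```

Because oral and nasal vowels are contrastive, the sketch keeps them as separate entries instead of treating nasalization as a predictable feature.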


Consonants (41). Urdu has 41 consonants in total, including 22 stops and affricates, 8 fricatives, 5 nasals, and 6 liquids/glides. The stops and nasals are articulated at five different places and are classified as labial, dental, retroflex, palatal and velar. The palatal stops are, in fact, affricates. Every series of stops includes voiceless and voiced consonants, each unaspirated and aspirated; this four-way contrast is unique to Indo-Aryan among Indo-European languages (Proto-Indo-European had only a three-way contrast).

    The retroflex consonants of Urdu, articulated immediately behind the alveolar crest, are not of Indo-European origin, though they were already present in Sanskrit. They are probably the result of Dravidian influence. Urdu also has a retroflex liquid (unaspirated and aspirated) not inherited from Sanskrit. [ʋ] is pronounced as v or w depending on context.
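The four-way stop contrast and the consonant totals can be checked with a short sketch. The IPA symbols chosen for each place of articulation are conventional transcriptions and are an assumption, since the text does not list them. Counting the borrowed stops q and ʔ together with the five native series is likewise only one reading that happens to be consistent with the stated total of 22.

```python
# Assumed base symbols (voiceless, voiced) for each place named in the text.
# The palatal pair is written as affricates, as the text describes.
PLACES = {
    "labial": ("p", "b"),
    "dental": ("t̪", "d̪"),
    "retroflex": ("ʈ", "ɖ"),
    "palatal": ("tʃ", "dʒ"),
    "velar": ("k", "ɡ"),
}

# Stops mentioned in the text as borrowed from Arabic and Persian.
BORROWED_STOPS = ["q", "ʔ"]

# Class totals exactly as stated in the text.
CLASS_COUNTS = {
    "stops and affricates": 22,
    "fricatives": 8,
    "nasals": 5,
    "liquids/glides": 6,
}


def stop_series(voiceless, voiced):
    """Build the four-way contrast for one place of articulation."""
    return [voiceless, voiceless + "ʰ", voiced, voiced + "ʱ"]


def stop_grid():
    """Map each place of articulation to its four stops."""
    return {place: stop_series(*pair) for place, pair in PLACES.items()}


if __name__ == "__main__":
    grid = stop_grid()
    for place, series in grid.items():
        print(f"{place:10s}", "  ".join(series))
    native = sum(len(series) for series in grid.values())
    print("native stops and affricates:", native)
    print("with borrowed q and ʔ:", native + len(BORROWED_STOPS))
    print("all consonants:", sum(CLASS_COUNTS.values()))
```

Five places with four stops each give 20 native stops and affricates; adding q and ʔ gives 22, and the four class totals sum to the 41 consonants stated in the text.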



Stress: usually falls on the penultimate syllable. It is not phonemic: words are not distinguished by stress alone.

Script and Orthography

Urdu, in contrast to Hindi, which uses Devanāgarī, is written in a modified form of the Perso-Arabic script. There is some redundancy in this script because several letters reflect the orthography of the numerous loanwords from Persian and Arabic. The first column shows the name of each letter, the third its usual transliteration, and the fourth its equivalent in the International Phonetic Alphabet where this differs from the transliteration.


*aspirated sounds are represented with special signs.

Morphology and Syntax

Urdu morphology and syntax are very similar to those of Hindi. One minor difference is that Urdu does not distinguish between singular and plural in demonstrative pronouns.


Lexicon

Urdu and Hindi share between 80% and 90% of their vocabulary, at least in informal speech. However, the same word can have different cultural overtones in each language. Arabic and Persian are major influences on the Urdu lexicon, whereas Hindi prefers Sanskrit loanwords. In Pakistan, Urdu is being influenced by Panjabi vocabulary.

Key Literary Works (forthcoming)

