Imla Shanaas: The Urdu spell-checker

Introduction

As the name suggests, Imla Shanaas is a spell-checker for modern Urdu as used in both India and Pakistan. The spell-checker incorporates recent developments in both technology and language:

  1. The dictionary comprises over 70,000 root words which, when expanded, can spell-check around 700,000 Urdu words.
  2. The words in the dictionary follow the latest spelling norms, ensuring full compliance with Urdu Imla.
  3. The dictionary is a judicious mix of vocabulary drawn from lexical databases and from corpora covering daily news, philosophy, poetry, literature, advertisements, general knowledge, current affairs, basic science, mathematical terms and encyclopedias, giving the widest possible spell-checking coverage.
  4. Suggestions are the heart of a spell-checker. Using suggestion heuristics together with the most common errors made by Urdu speakers, Imla Shanaas normally offers the correct word within its top three suggestions. An intelligent word-splitting algorithm ensures that compounding is handled safely. Airabs are also accounted for, and the spell-checker can handle every diacritic mark used in modern Urdu.
  5. Imla Shanaas can handle Unicode, UTF-8 and PASCII (the proprietary standard of C-DAC GIST).
  6. A floating keyboard allows the user to correct text within the text-box itself.
  7. The file can be saved in multiple formats, such as PASCII or Unicode, to suit the user's requirements.
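
The diacritic handling and word-splitting described in point 4 can be illustrated with a minimal Python sketch. This is not the Imla Shanaas implementation, which is not described here. The function names (strip_diacritics, is_known, split_compound) and the three-word LEXICON are hypothetical, chosen only for illustration. The sketch assumes that Urdu diacritics are encoded as Unicode combining marks and that a compound is two dictionary words written without a space.

```python
import unicodedata

# Hypothetical toy lexicon (book, house, knowledge); the real dictionary
# of Imla Shanaas is not reproduced in this text.
LEXICON = {"کتاب", "خانہ", "علم"}


def strip_diacritics(word):
    """Drop combining marks (zabar, zer, pesh and similar) from a word.

    NFC normalisation runs first so that letters with a canonical
    precomposed form (for example alef with madda) stay intact.
    """
    composed = unicodedata.normalize("NFC", word)
    return "".join(ch for ch in composed if unicodedata.category(ch) != "Mn")


def is_known(word, lexicon=LEXICON):
    """Diacritic-insensitive lookup: compare the bare form to the lexicon."""
    return strip_diacritics(word) in lexicon


def split_compound(word, lexicon=LEXICON):
    """Return the first split of `word` into two lexicon entries, or None."""
    bare = strip_diacritics(word)
    for i in range(1, len(bare)):
        left, right = bare[:i], bare[i:]
        if left in lexicon and right in lexicon:
            return left, right
    return None
```

With this sketch, a word typed with or without its diacritics matches the same dictionary entry, and two known words run together can be offered back to the user as a pair. A production spell-checker would also need to rank candidate splits and treat marks that change meaning more carefully than this blanket removal does.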

Click here for a demo of Imla Shanaas

Click here for an English demo of Imla Shanaas