Product Information

CHITRANKAN

Chitrankan archives Indian Language content in electronic form through OCR.

Brief Description

Until now, replicating a written or printed document digitally has meant photocopying or scanning it. The text in such a copy cannot be altered: its spelling, wording, font style and font size are fixed. Retyping the entire document in order to replicate it is extremely time-consuming.

To overcome these issues, C-DAC GIST has developed Chitrankan, the first OCR (Optical Character Recognition) system for Indian languages.

The OCR process involves:

Conversion of printed matter into an electronic image - the printed matter can be converted into an image using a scanner or a digital camera
Electronic image processing - the image is analyzed for noise and skew so that the text information can be identified. Once the text information is available, another algorithm reads and recognizes the printed matter
Storing the extracted text information as electronic data - the recognized text is converted to a standard format that can be opened in any word processing application, allowing the user to edit the text data
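
As an illustration of the image-processing stage, the following Python sketch thresholds a small grayscale image and then removes isolated specks of noise. It is not Chitrankan's actual algorithm, which is not described here. The function names, the threshold of 128 and the "no inked neighbour" rule are assumptions chosen for the example.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def binarize(gray: List[List[int]], threshold: int = 128) -> Image:
    """Map 0-255 grayscale pixels to ink (1) when darker than the threshold."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def remove_isolated_pixels(img: Image) -> Image:
    """Clear every ink pixel that has no ink pixel among its 8 neighbours."""
    rows, cols = len(img), len(img[0])
    cleaned = [row[:] for row in img]
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1:
                continue
            has_neighbour = any(
                img[r + dr][c + dc] == 1
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            )
            if not has_neighbour:
                cleaned[r][c] = 0  # treat a lone speck as scanner noise
    return cleaned


# Hypothetical 4x5 scan: one dark stroke in row 1 and a lone speck in row 3.
sample_gray = [
    [255, 255, 255, 255, 255],
    [255, 10, 20, 15, 255],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 30],
]
cleaned_sample = remove_isolated_pixels(binarize(sample_gray))
```

A real system would use more robust filtering, since a rule this simple would also erase genuine single-pixel marks such as small dots in a character.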

Chitrankan enables the user to take a book, magazine or other printed text in an Indian language, convert it directly into a computer file, and edit that file using a word processor. Once the data is in the form of electronic text, it can be searched, sorted and indexed.
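
To show what indexing and searching recognized text can look like, here is a minimal Python sketch of a word-to-page index. It assumes the recognized text is already available as Unicode strings. The page contents, `build_index` and `search` are hypothetical names used only for this example and are not part of Chitrankan.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(pages: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each whitespace-separated word to the set of pages containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for page_no, text in pages.items():
        for word in text.split():
            index[word].add(page_no)
    return index


def search(index: Dict[str, Set[int]], word: str) -> List[int]:
    """Return the sorted page numbers on which the word occurs."""
    return sorted(index.get(word, set()))


# Hypothetical recognized text for three pages.
sample_pages = {
    1: "भारत एक देश है",
    2: "हिंदी भारत की भाषा है",
    3: "मराठी भाषा",
}
index = build_index(sample_pages)
pages_with_word = search(index, "भारत")
```

Splitting on whitespace is the simplest possible tokenizer. A production index would also need to handle punctuation and spelling variants.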

Chitrankan saves the user the effort of typing an entire document.
Chitrankan scans a document to the screen, recognizing the text and pictures as separate objects. The scanned images can be stored and printed any number of times.

Chitrankan is user-friendly, with features to edit, move, resize or duplicate the scanned document, and it also provides a spell-check facility.

The potential of Chitrankan is enormous as it enables users to harness the power of computers to access printed documents in Indian Languages.

Software Advantage:

Recognizes Hindi and Marathi text, along with embedded English text
Skew detection and correction for input images up to ± 15°
Grabs images directly from the scanner for processing
Automatic Text and Picture region detection
Supports all TWAIN compatible scanners and digital cameras
Supports 256 grayscale/color .bmp/.tiff images, scanned at 300 dpi, as input for recognition
Ideal for font sizes between 10 pt and 36 pt, in all popular fonts
Saves scanned/modified images as .bmp files
Saves recognized text in ISCII format, or exports it as .rtf for editing with the GIST range of software
Uses advanced DSP (Digital Signal Processing) algorithms to remove "Noise" and "Back Page Reflection"
Enables printing of both the input image and the recognized text
Provides inbuilt Flip, Rotate and Negate options for the input image
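
Skew detection, listed above, is often explained with the projection-profile method: text lines produce the sharpest row histogram when the page is straight, so one can try each candidate angle and keep the one with the sharpest profile. The Python sketch below illustrates that general technique only. Chitrankan's own algorithm is not described here. The function names and the 1° step are assumptions, and the ± 15° search range simply mirrors the limit quoted in the list.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) coordinates of an ink pixel


def rotate(points: List[Point], angle_deg: float) -> List[Point]:
    """Rotate points about the origin by angle_deg (used here to simulate skew)."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in points]


def profile_score(points: List[Point], angle_deg: float) -> int:
    """Undo a trial skew of angle_deg and score how peaked the row histogram is."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    # Only the row (y) of each de-rotated point matters for the profile.
    rows = Counter(round(-x * sin_a + y * cos_a) for x, y in points)
    return sum(count * count for count in rows.values())


def estimate_skew(points: List[Point], max_angle: int = 15, step: int = 1) -> int:
    """Return the trial angle in [-max_angle, max_angle] with the sharpest profile."""
    candidates = range(-max_angle, max_angle + 1, step)
    return max(candidates, key=lambda angle: profile_score(points, angle))


# Hypothetical page: four straight text lines, then skewed by 5 degrees.
text_lines = [(float(x), float(y)) for y in (0, 10, 20, 30) for x in range(200)]
skewed_page = rotate(text_lines, 5)
estimated = estimate_skew(skewed_page)
corrected_page = rotate(skewed_page, -estimated)
```

Rotating the page by the negative of the estimated angle corrects the skew. A finer step size would give a more precise estimate at the cost of more trials.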

User Advantage:

Allows deletion of associated pictures from the image by using the ERASE option
Provides painting tools to join breaks in characters for better recognition results
Allows OCR to be applied to an image rotated by 180° or flipped
Applies OCR to an image containing text in reverse by using the INVERT option
Provides inbuilt spell checking facility
Provides editing tools such as cut, copy, paste, and find and replace for use on the recognized text
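
The rotate, flip and invert options mentioned above correspond to simple pixel operations. The Python sketch below shows them on a small binary image. It assumes that inverting (negating) means swapping ink and background pixels and that a flip mirrors the image left to right. The function names are illustrative and do not come from Chitrankan.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def negate(img: Image) -> Image:
    """Swap ink and background pixels."""
    return [[1 - pixel for pixel in row] for row in img]


def rotate_180(img: Image) -> Image:
    """Turn the image upside down by reversing the row order and each row."""
    return [row[::-1] for row in img[::-1]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


# Hypothetical 2x3 image.
sample = [
    [1, 0, 0],
    [1, 1, 0],
]
upside_down = rotate_180(sample)
restored = rotate_180(upside_down)  # rotating twice returns the original
```

Each operation is its own inverse, so applying it twice restores the input image.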

Application Areas:

Office Automation
Archival of Text Matter
DTP
Data Entry

System Requirements:

Minimum Configuration:
Pentium II with 64 MB RAM
Virtual memory requirement: 300 MB (swap file space on the hard disk)
Recommended Configuration:
Pentium III with 128 MB RAM and above
Virtual memory requirement: 400 MB
Operating Systems Supported:
Windows NT 4.0 (Service Pack 6 and above), Windows 9x and above, Windows 2000 and Windows XP

Contact Details

info.gist@cdac.in - More information on GIST products
sales.gist@cdac.in - Sales related information
support.gist@cdac.in - Support related information