ESNOLA based Bangla TTS


Over the last decade, speech synthesizer development has shifted significantly toward concatenative synthesis techniques. Several methodologies exist for concatenative synthesis, including TD-PSOLA, PSOLA, MBROLA and Epoch Synchronous Non Overlap Add (ESNOLA).

Concatenation Synthesis

In concatenation synthesis, speech is generated by combining splices of pre-recorded natural speech. To take care of context-dependency and information embedded in transition segments, the splices are selected such that they begin and end with comparatively steady states.


The ESNOLA-based system is a concatenative speech synthesis system that uses a new set of sub-phonemic signal units, called partnemes, as the smallest signal units for concatenation. The Epoch Synchronous Non Overlap Add (ESNOLA) algorithm was developed for concatenation and regeneration as well as for pitch and duration (prosodic) modification. The concatenation methodology provides adequate processing to match adjacent segments properly when they are joined. Because these basic signal segments are of a special type, the signal dictionary is very small, which makes implementation possible on low-cost, general-purpose electronic devices.

The phoneme string output by the Text Analyzer is assigned tokens based on the indexing of the segmented partneme voice signals. The pitch and amplitude of the segments have been normalized to support the implementation of prosody and intonation. The selected segments are concatenated at epoch positions to obtain the raw output signal. The steady state of the nucleus vowel segment is generated by linear interpolation, with appropriate weights, between the last period of the preceding segment and the first period of the succeeding segment. The generated signal requires some spectral smoothing at the points of concatenation to remove mismatches and other spectral disturbances.
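The steady-state generation step can be illustrated with a minimal sketch. It is not the C-DAC implementation; it assumes that each segment is a list of samples cut at epoch positions (so it starts and ends on a period boundary) and that pitch normalization has already given every period the same length. The function names, the `period_len` and `n_periods` parameters, and the evenly spaced weights are all hypothetical choices made for illustration.

```python
def interpolate_periods(last_period, first_period, n_periods):
    """Generate n_periods intermediate pitch periods by linear interpolation.

    Assumption: both periods have the same length (pitch-normalized).
    The weights move evenly from the preceding period to the succeeding one.
    """
    if len(last_period) != len(first_period):
        raise ValueError("periods must have equal length")
    generated = []
    for k in range(1, n_periods + 1):
        w = k / (n_periods + 1)  # weight of the succeeding segment's period
        generated.append(
            [(1 - w) * a + w * b for a, b in zip(last_period, first_period)]
        )
    return generated


def concatenate_with_steady_state(preceding, succeeding, period_len, n_periods):
    """Join two segments at their epoch boundaries, without overlap,
    inserting an interpolated steady state between them."""
    last_period = preceding[-period_len:]   # last period of preceding segment
    first_period = succeeding[:period_len]  # first period of succeeding segment
    output = list(preceding)
    for period in interpolate_periods(last_period, first_period, n_periods):
        output.extend(period)
    output.extend(succeeding)
    return output


# Toy usage: two-sample "periods" whose peak amplitude rises from 1.0 to 3.0.
preceding = [0.0, 1.0, 0.0, 1.0]
succeeding = [0.0, 3.0, 0.0, 3.0]
signal = concatenate_with_steady_state(preceding, succeeding, period_len=2, n_periods=3)
```

In this toy case the three generated periods peak at values between those of the neighbouring segments, so the amplitude changes gradually across the join. The spectral smoothing that the text describes at concatenation points is not covered by this sketch.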

Figure: Basic block diagram of a TTS system using the ESNOLA technique

The block diagram above shows the basic components of the ESNOLA technique as used to develop a text-to-speech synthesis system.
Based on this technique, a Bangla TTS system has been developed and named "BANGLA VAANI".

System Features:

Bangla Vaani


C-DAC, Kolkata is an active member of the DeitY-TTS Consortium (Phase II).
