Gujarati Language Segmentation Analysis

Segmentation for Handwritten Gujarati Text Documents: A Review

Kunal Shah
Babu Madhav Institute of Information Technology, Uka Tarsadia University, Maliba Campus, Gopal Vidhyanagar, Bardoli, Gujarat, India

ABSTRACT

Optical Character Recognition (OCR) is one of the most useful technologies on the market. It remains a very difficult task in information technology, although many researchers and experts have achieved solutions in some areas. This paper explains what OCR is, its applications, and its various techniques with examples. The main focus is the segmentation stage, particularly for handwritten Gujarati text. I try to show the difficulties of segmentation in the Gujarati language and also compare the approaches taken by various researchers.
Consonants can be connected with vowel extensions.

Figure 2: Diagram of Gujarati script

1.4 Problems in Gujarati text document segmentation

1. Line Segmentation
   1) Modifier overlapping. The lower modifiers of one line overlap with the upper modifiers of the line below. Example: Figure 3a
   2) Zigzag line/word/character. The text does not follow a straight line, which creates curvature in the lines. Example: Figure 3b
   3) Unusual line spacing. The spacing between two or more lines is irregular. Example: Figure 3c
2. Word Segmentation
   1) Unusual inter-word and intra-word spacing. The spacing between words is irregular, which makes word boundaries hard to locate. Example: Figure 4
3. Character Segmentation
   1) Upper region problems
      i. Unusual size of upper modifier. Example: Figure 5a
      ii. Merging of lower modifier with consonant. Example: Figure 5b
      iii. Touching of upper modifier with another upper modifier. Example: Figure no…
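
To make the modifier-overlapping problem concrete, the following is a minimal sketch of line segmentation by horizontal projection profile. This is a common baseline technique and not necessarily the method used in any of the works reviewed here. The function names, the `min_ink` parameter, and the toy 5x7 binary "pages" are all hypothetical and chosen only for illustration. The sketch assumes the page has already been binarized so that 1 means ink and 0 means background.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed)


def horizontal_projection(image: Image) -> List[int]:
    """Count the ink pixels in every row of the binary image."""
    return [sum(row) for row in image]


def segment_lines(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    A text line is a maximal run of rows whose ink count is >= min_ink.
    """
    profile = horizontal_projection(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                    # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))     # a gap row ends the line
            start = None
    if start is not None:                # line runs to the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy page: two text lines separated by one blank row.
clean_page = [
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 0],
]

# Same page, but a lower modifier of the first line reaches into the gap row.
overlapping_page = [row[:] for row in clean_page]
overlapping_page[2][1] = 1

print(segment_lines(clean_page))        # [(0, 2), (3, 5)]
print(segment_lines(overlapping_page))  # [(0, 5)]
```

On the clean page the blank row separates the two lines, but a single modifier pixel in the gap merges them into one. Raising `min_ink` to 2 recovers the split on this toy input, though on real handwriting a fixed threshold of this kind is unlikely to be reliable, and it does not address zigzag lines at all.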

From the above literature review, it can be concluded that many problems remain in the segmentation stage of OCR for handwritten Gujarati characters, mainly in the character segmentation phase. For better results, the characters must be written legibly and in a regular manner. The work that has been done on Gujarati segmentation is for printed text, not for handwritten text. Solving the above problems would increase the accuracy of the recognition phase and give better OCR results.

