There are two main techniques for converting written or printed text into digital format. The first is to create an image of the written or printed text, but images are large and require substantial storage, and text in image form cannot undergo further processing such as editing, searching, or copying. The second is to use an Optical Character Recognition (OCR) system. An OCR system reads documents and converts handwritten or printed text into digital text, which can then be processed to extract knowledge. A large amount of Urdu-language material is available in handwritten or printed form and needs to be converted into digital format for knowledge acquisition. The script's highly cursive style, complex structure, bidirectionality, and compound nature make it difficult to obtain accurate OCR results for Urdu. In this study, a supervised learning-based OCR system is proposed for Urdu written in the Nastalique script. Evaluated under a variety of experimental settings, the proposed system achieves recognition rates of 98.4% on training data and 97.3% on test data, reported as the highest recognition rate achieved by any Urdu OCR system. The proposed system is simple to implement, particularly in the software front end of an OCR system. The technique is useful for printed as well as handwritten text and should help in developing more accurate Urdu OCR software in the future.