
Who I am

Anand kumar M
Working as a Research Associate at CEN, Amrita Viswa Vidyapeetham, Coimbatore.

Native: Portonovo/Parangipettai, Cuddalore Dist.

Areas of Interest: Morphological Analyzer and Generator, Dependency Parsing, Statistical Machine Translation, Machine Learning, Support Vector Machines, Machine Learning for NLP.

Completed Projects

POS-Tagger for Tamil.
Morphological Analyzer for Tamil (Novel Method)
Morphological Analyzer for Malayalam
Morphological Analyzer for Telugu
Morphological Generator for Tamil (Novel Method)
Morphological Generator for Malayalam
Morphological Generator for Telugu
Statistical Machine Translation for English to Tamil (Currently Working)

Publications

International Journals

A Novel Data Driven Algorithm for Tamil Morphological Generator, International Journal of Computer Applications (IJCA) - Foundation of Computer Science, 6(12):52-56, 2010.

A Sequence Labeling Approach to Morphological Analyzer for Tamil Language, International Journal on Computer Science and Engineering (IJCSE), Vol. 02, No. 06, 2201-2208, 2010.


A Natural Language Processing Tools for Tamil Grammar Learning and Teaching, International Journal of Computer Applications (IJCA) - Foundation of Computer Science, October 2010.


“Tamil POS Tagging using Linear Programming”, International Journal of Recent Trends in Engineering, Vol. 1, No. 2, ISSN 1797-9617.

A Paradigm Based Morphological Analyzer for English to Kannada using a Machine Learning Approach, Research India Publication (RIP), October 2010.

International Conferences

“Tamil Part-of-Speech tagger based on SVMTool”, Proceedings of International Conference on Asian Language Processing 2008 (IALP 2008), Chiang Mai, Thailand.

“Morphological Analyzer for Agglutinative Languages Using Machine Learning Approaches”, Proceedings of International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2009), Kottayam, India.

“Chunker for Tamil”, Proceedings of International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2009), Kottayam, India.

“Postagger and Chunker for Tamil Language”, Proceedings of the 8th Tamil Internet Conference, Cologne, Germany.

“A Novel Approach for Tamil Morphological Analyzer”, Proceedings of the 8th Tamil Internet Conference 2009, Cologne, Germany.

“Chunker for Tamil using Machine Learning”, 7th International Conference on Natural Language Processing 2009 (ICON 2009), IIIT Hyderabad, India.


“Morphological generator for Tamil a new data driven approach”, 9th Tamil Internet Conference, Chemmozhi Maanaadu, Coimbatore, India.

“Grammar Teaching Tools for Tamil”, Technology for Education Conference (T4E), IIT Bombay, India.

“A Novel Approach to Morphological Generator for Tamil”, 2nd International Conference on Data Engineering and Management (ICDEM 2010), Trichy, India.

“Morphological Analyzer for Malayalam Using Machine Learning”, 2nd International Conference on Data Engineering and Management (ICDEM 2010), Trichy, India.

“Morphological analyzer for Telugu using Support Vector Machine”, International Conference on Advances in Information and Communication Technologies (ICT 2010), Kochi, India.

“A Novel Algorithm for Tamil Morphological generator”, 8th International Conference on Natural Language Processing 2010 (ICON 2010), IIT-Kharagpur, India.

Tuesday, August 17, 2010

Morphological Analyzer Using Machine Learning


Applying machine learning to the morphological analysis of complex agglutinative natural languages is a novel approach. Morphological analysis is concerned with retrieving the structure, the syntactic and morphological properties, or the meaning of a morphologically complex word. The morphological structure of an agglutinative language is unique, and capturing its complexity in a form that a machine can both analyze and generate is a challenging job. Morphological analyzers are generally built with rule-based approaches, but in rule-based approaches what works in the forward direction may not work in the backward direction. This new, state-of-the-art machine learning approach, based on sequence labeling and training by kernel methods, captures the non-linear relationships among the different morphological features of natural languages in a better and simpler way. The overall accuracy obtained for the morphologically rich agglutinative languages (Tamil, Malayalam, Telugu) was very encouraging.
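
To give a rough idea of what "sequence labeling" means here, the following is a minimal sketch and not the actual system. It assumes a character-level framing in which each character of a word is labelled B (begins a morpheme) or I (inside a morpheme), and each character is described by a small window of neighbouring characters. The function names, the B/I label set, the window size, and the romanized example segmentation are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def boundary_labels(morphemes: List[str]) -> List[Tuple[str, str]]:
    """Pair every character with 'B' (begins a morpheme) or 'I' (inside one)."""
    pairs: List[Tuple[str, str]] = []
    for morpheme in morphemes:
        for position, char in enumerate(morpheme):
            pairs.append((char, "B" if position == 0 else "I"))
    return pairs


def window_features(chars: List[str], index: int, size: int = 2) -> Dict[str, str]:
    """Describe one character by its neighbours within `size` positions."""
    features: Dict[str, str] = {}
    for offset in range(-size, size + 1):
        neighbour = index + offset
        inside = 0 <= neighbour < len(chars)
        # Positions that fall outside the word are filled with a padding symbol.
        features[f"c[{offset:+d}]"] = chars[neighbour] if inside else "<pad>"
    return features


# Illustrative segmentation (assumed romanization): padi + thth + aan
segments = ["padi", "thth", "aan"]
labelled = boundary_labels(segments)
chars = [char for char, _ in labelled]

# One (features, label) row per character, ready for a sequence labeler.
training_rows = [
    (window_features(chars, i), label) for i, (_, label) in enumerate(labelled)
]
```

Rows like these could then be vectorized and passed to a kernel-based classifier such as a support vector machine. That training step is left out here, since it depends on the toolkit and feature set actually used.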