
Who I am

Anand kumar M
Working as a Research Associate at CEN, Amrita Viswa Vidyapeetham, Coimbatore.

Native: Portonovo/Parangipettai, Cuddalore Dist.

Areas of Interest: Morphological Analyzer and Generator, Dependency Parsing, Statistical Machine Translation, Machine Learning, Support Vector Machines, Machine Learning for NLP.

Completed Projects

POS-Tagger for Tamil
Morphological Analyzer for Tamil (Novel Method)
Morphological Analyzer for Malayalam
Morphological Analyzer for Telugu
Morphological Generator for Tamil (Novel Method)
Morphological Generator for Malayalam
Morphological Generator for Telugu
Statistical Machine Translation for English to Tamil (in progress)


International Journals

A Novel Data Driven Algorithm for Tamil Morphological Generator, International Journal of Computer Applications (IJCA) - Foundation of Computer Science, 6(12):52-56, 2010.

A Sequence Labeling Approach to Morphological Analyzer for Tamil Language, International Journal on Computer Science and Engineering (IJCSE), Vol. 02, No. 06, 2201-2208, 2010.

A Natural Language Processing Tools for Tamil Grammar Learning and Teaching, International Journal of Computer Applications (IJCA) - Foundation of Computer Science, October 2010.

“Tamil POS Tagging using Linear Programming”, International Journal of Recent Trends in Engineering, Vol. 1, No. 2, ISSN 1797-9617.

A Paradigm Based Morphological Analyzer for English to Kannada using a Machine Learning Approach, Research India Publication (RIP), October 2010.

International Conferences

“Tamil Part-of-Speech tagger based on SVMTool”, Proceedings of International Conference on Asian Language Processing 2008 (IALP 2008), Chiang Mai, Thailand.

“Morphological Analyzer for Agglutinative Languages Using Machine Learning Approaches”, Proceedings of International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2009), Kottayam, India.

“Chunker for Tamil”, Proceedings of International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2009), Kottayam, India.

“Postagger and Chunker for Tamil Language”, Proceedings of the 8th Tamil Internet Conference, Cologne, Germany.

“A Novel Approach for Tamil Morphological Analyzer”, Proceedings of the 8th Tamil Internet Conference 2009, Cologne, Germany.

“Chunker for Tamil using Machine Learning”, 7th International Conference on Natural Language Processing 2009 (ICON 2009), IIIT Hyderabad, India.

“Morphological generator for Tamil a new data driven approach”, 9th Tamil Internet Conference, Chemmozhi Maanaadu, Coimbatore, India.

“Grammar Teaching Tools for Tamil”, Technology for Education Conference (T4E), IIT Bombay, India.

“A Novel Approach to Morphological Generator for Tamil”, 2nd International Conference on Data Engineering and Management (ICDEM 2010), Trichy, India.

“Morphological Analyzer for Malayalam Using Machine Learning”, 2nd International Conference on Data Engineering and Management (ICDEM 2010), Trichy, India.

“Morphological analyzer for Telugu using Support Vector Machine”, International Conference on Advances in Information and Communication Technologies (ICT 2010), Kochi, India.

“A Novel Algorithm for Tamil Morphological generator”, 8th International Conference on Natural Language Processing 2010 (ICON 2010), IIT-Kharagpur, India.

Saturday, December 4, 2010

Morphological Generator For Tamil (Novel Algorithm)

Tamil is a morphologically rich, agglutinative language. Because it is agglutinative, most word features are attached to the root word as suffixes. The morphological generator takes a lemma, a POS category and a morpho-lexical description as input and gives a word form as output. It is the reverse process of a morphological analyzer. In any natural language generation system, the morphological generator is an essential component of the post-processing stage. The morphological generator implemented here is based on a new algorithm, which is simple and efficient and does not require any rules or a morpheme dictionary. Nouns and verbs are classified into paradigms based on Dr. S. Rajendran’s paradigm classification. Tamil verbs are classified into 32 paradigms with 1884 inflected forms. Likewise, nouns are classified into 25 paradigms with 325 word forms. This approach requires only a minimal amount of data, so it can easily be applied to other less-resourced, morphologically rich languages.

A verb conjugation tool has also been developed using this morphological generator.
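
To make the input/output interface described above concrete, here is a minimal Python sketch of a paradigm-driven generator. It is not the published algorithm, only an illustration of the general idea under several assumptions: the function names (learn_paradigm, generate), the feature labels (ACC, LOC, PL), the paradigm name and the romanized noun forms are all chosen for this example, and the paradigm of a lemma is assumed to be known in advance. The sketch learns suffix changes from one sample word per paradigm and reuses them for other lemmas of the same paradigm, so no hand-written rules are involved.

```python
from os.path import commonprefix


def learn_paradigm(sample_lemma, sample_forms):
    """Derive one suffix-edit operation per feature label from a sample word.

    Each operation is (number of trailing characters to strip from the lemma,
    suffix to append), found by comparing the lemma with its inflected form.
    """
    ops = {}
    for features, form in sample_forms.items():
        shared = commonprefix([sample_lemma, form])
        ops[features] = (len(sample_lemma) - len(shared), form[len(shared):])
    return ops


def generate(lemma, pos, paradigm, features, table):
    """Build a word form from lemma, POS category and feature label."""
    strip, suffix = table[(pos, paradigm)][features]
    stem = lemma[:len(lemma) - strip]
    return stem + suffix


# Illustrative sample word for one hypothetical noun paradigm (romanized).
SAMPLE_FORMS = {
    "ACC": "maraththai",  # accusative
    "LOC": "maraththil",  # locative
    "PL": "marangkaL",    # plural
}
TABLE = {("N", "maram"): learn_paradigm("maram", SAMPLE_FORMS)}

print(generate("kulam", "N", "maram", "ACC", TABLE))  # kulaththai
```

A real system would need one sample word (or a small set) for each of the noun and verb paradigms, plus a way to assign a lemma to its paradigm; both are outside the scope of this sketch.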
