Automata for Santali language processing
Author | |
---|---|
Keywords | |
Abstract |
Santali is a language in the Munda or Kolarian family and has reached a higher stage of development than the other languages of its family. It has been written in Roman script for a long time, and this script has been adapted to express the peculiar pronunciation and phonetics of the Santali language. Automata can be used at all stages of natural language processing. They can efficiently characterize morphological and phonological rules. In order to develop a dependency parser for the Santali language, we have used automata for morphological analysis. This paper presents the morphology of the Santali language. We have used finite-state transducers (FSTs) for finite-state morphology and discuss an initial morphological analysis of nouns and verbs. This paper discusses Santali nouns in terms of gender, number and case. We show that verbs in Santali inflect for tense/aspect/mood (TAM), voice, and the person and number of the subject. Automata for the subject and object markers in the first, second and third person are shown. © 2017 IEEE. |
Year of Conference |
2017
|
Conference Name |
2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017
|
Volume |
2017-January
|
Number of Pages |
939-943
|
Publisher |
Institute of Electrical and Electronics Engineers Inc.
|
ISBN Number |
978-150906367-3 (ISBN)
|
DOI |
10.1109/ICACCI.2017.8125962
|
Conference Proceedings
|
|
Citations |
1
|