By Thomas Hanneforth (auth.), Cerstin Mahlow, Michael Piotrowski (eds.)

From the point of view of computational linguistics, morphological resources are the basis for all higher-level applications. This is especially true for languages with a rich morphology, such as German or Finnish. A morphology component should thus be capable of analyzing single word forms as well as whole corpora. For many practical applications, not only morphological analysis but also generation is required, i.e., the production of surface forms corresponding to specific categories. Apart from uses in computational linguistics, there are also numerous practical applications that either require morphological analysis and generation or that can greatly benefit from it, for example, in text processing, user interfaces, or information retrieval. These applications have specific requirements for morphological components, including requirements from software engineering, such as programming interfaces or robustness. In 1994, the First Morpholympics took place at the University of Erlangen-Nuremberg, a competition between several systems for the analysis and generation of German word forms. Eight systems participated in the First Morpholympics; the conference proceedings [1] thus give a very good overview of the state of the art in computational morphology for German as of 1994.

Writing full-scale dictionaries in LEXC may well be compared to having programmers write sophisticated applications in C without access to any of the modern high-level libraries. It is possible, but unless it is done in some principled way, one may easily end up with spaghetti code that is difficult to maintain. [...] i.e., modular. With this insight and as computers became more powerful, the initial calculus that was conceived for abstract objects like automata and transducers in TWOLC and LEXC was expanded and migrated into the lexical programming environment XFST documented by Beesley and Karttunen [6], where smaller lexical modules for various purposes can be tailored and combined using finite-state calculus operations.

(HFST Tools for Morphology. In: C. Mahlow and M. Piotrowski (Eds.): SFCM 2009, CCIS 41, pp. 28–47, 2009. © Springer-Verlag Berlin Heidelberg 2009)

$$L = (\, J_{Root}\; a\; k\; k\; u\; J_{N1b} \;\mid\; J_{N1b}\; J_{NounSg} \;\mid\; J_{NounSg}\; {+sg}{:}l\; {+all}{:}l\; \varepsilon{:}e\; J_{Ennd} \;\mid\; J_{Ennd}\; J_{\#} \,)$$

$$F = J_{Root}\; J_{Root} \;\mid\; J_{N1b}\; J_{N1b} \;\mid\; J_{NounSg}\; J_{NounSg} \;\mid\; J_{Ennd}\; J_{Ennd}$$

$$L \circ F = J_{Root}\; a\; k\; k\; u\; J_{N1b}\; J_{N1b}\; J_{NounSg}\; J_{NounSg}\; {+sg}{:}l\; {+all}{:}l\; \varepsilon{:}e\; J_{Ennd}\; J_{Ennd}\; J_{\#}$$

Fig. 1. Filtering a single path in HFST-LEXC with a morphotax filter

Finally, all the symbols in Γ are removed. While this is trivial, it introduces some indeterminism in the final transducer, which would otherwise have been introduced by building direct epsilon arcs.
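The filtering idea behind Fig. 1 can be illustrated with a small toy model in Python. This is only a sketch under stated assumptions, not the HFST implementation: entries are plain tuples rather than transducers, ε is modelled as the empty string, `Root` is assumed to be the start joiner and `#` the final joiner, and the names `ENTRIES`, `is_well_formed`, `strip_joiners`, and `filtered_paths` are hypothetical. Candidate paths are enumerated by brute force up to a fixed length, which a real finite-state composition would not need to do.

```python
from itertools import product

# Toy lexicon: (start joiner, list of input:output pairs, continuation joiner).
# The four entries mirror the disjuncts of L in Fig. 1; "" stands for epsilon.
ENTRIES = [
    ("Root", [("a", "a"), ("k", "k"), ("k", "k"), ("u", "u")], "N1b"),
    ("N1b", [], "NounSg"),
    ("NounSg", [("+sg", "l"), ("+all", "l"), ("", "e")], "Ennd"),
    ("Ennd", [], "#"),
]


def is_well_formed(path):
    """Toy stand-in for the filter F: every continuation joiner must be
    followed by an entry with the same start joiner. Assumption: a complete
    path starts at "Root" and ends at "#"."""
    if not path or path[0][0] != "Root" or path[-1][2] != "#":
        return False
    return all(prev[2] == nxt[0] for prev, nxt in zip(path, path[1:]))


def strip_joiners(path):
    """Drop the joiners and return the (analysis, surface) string pair."""
    pairs = [pair for _, entry_pairs, _ in path for pair in entry_pairs]
    analysis = "".join(upper for upper, _ in pairs)
    surface = "".join(lower for _, lower in pairs)
    return analysis, surface


def filtered_paths(entries, max_len):
    """Enumerate entry sequences up to max_len and keep those passing the filter."""
    for length in range(1, max_len + 1):
        for path in product(entries, repeat=length):
            if is_well_formed(path):
                yield strip_joiners(path)


if __name__ == "__main__":
    for analysis, surface in filtered_paths(ENTRIES, 4):
        print(analysis, "->", surface)
```

In this toy model the only sequence that survives the filter is the one chaining all four entries in order, which matches the single path shown for L ∘ F in the figure.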

This style may be helpful for small to medium-sized lexica, but only to a lesser extent for very large lexica.

+[ sur : schenk , cor : schenken ] ...

Listing 3. The instance notation for ‘lernen’, ‘erben’ (to inherit), ‘schenken’ (to make a gift)

The sequence notation can be used if feature structures differ in only one attribute value. Instead of specifying instances, the attribute to be added and the list of corresponding values are specified. For each value in the list an entry is added to the lexicon consisting of all the values specified by the last template in addition to the indicated attribute value pair.
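The expansion rule described for the sequence notation can be sketched in Python as follows. This is an illustration only and not the syntax or implementation of the lexicon tool itself: feature structures are modelled as plain dictionaries, the function name `expand_sequence` is hypothetical, and the template content `{"pos": "verb"}` is an invented placeholder. Only the attribute name `sur` and the stems are taken from the listing fragment above.

```python
def expand_sequence(template, attribute, values):
    """Return one lexicon entry per value: all attribute-value pairs of the
    last template plus the pair attribute=value. The template is not modified."""
    return [{**template, attribute: value} for value in values]


if __name__ == "__main__":
    # Hypothetical template; the real templates are not shown in the text.
    last_template = {"pos": "verb"}
    for entry in expand_sequence(last_template, "sur", ["lern", "erb", "schenk"]):
        print(entry)
```

Copying the template into each new entry keeps the entries independent of one another, so a later change to one entry does not leak into the others or into the template.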

