Identifying Abbreviations in Texts
This article discusses abbreviations, a growing phenomenon present in all languages. Many Slovenian grammarians have dealt with abbreviations; Matej Rode classified them in detail, and the most recent comprehensive classification appeared in the Slovenski pravopis (Slovenian Normative Guide) of 2001. Abbreviations are found in general, specialized, monolingual, and bilingual dictionaries. Because they arise quickly, they are difficult to collect, and printed dictionaries are published too infrequently to be updated regularly. In recent years an increasing number of online databases have appeared; these are freely accessible and updatable, and also allow users to enter new abbreviations. The databases are based on lexical corpora or online sources, and can also be compiled automatically using rules and algorithms. The article continues by presenting automatic abbreviation-recognition procedures in the ADAM databases as well as the Satev-Nikolov method. With the help of this method, an example is given of an algorithm for recognizing abbreviations in Slovenian texts.
