RUSSE: The first workshop on Russian semantic similarity / Panchenko A., Loukachevitch N.V., Ustalov D., Paperno D., Meyer C.M., Konstantinova N. // Komp'juternaja Lingvistika i Intellektual'nye Tehnologii. - 2015. - V. 2, l. 14. - P. 89-105.

ISSN:
22217932
Type:
Conference Paper
Abstract:
The paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference. Many comparative studies of semantic similarity exist, yet no analysis of such measures has ever been performed for the Russian language. Exploring this problem for Russian is all the more interesting because the language has features, such as rich morphology and free word order, that make it significantly different from English, German, and other well-studied languages. We attempt to bridge this gap by proposing a shared task on the semantic similarity of Russian nouns. Our key contribution is an evaluation methodology based on four novel benchmark datasets for the Russian language. Our analysis of the 105 submissions from 19 teams reveals that successful approaches for English, such as distributional and skip-gram models, are directly applicable to Russian as well. On the one hand, the best results in the contest were obtained by sophisticated supervised models that combine evidence from different sources. On the other hand, completely unsupervised approaches, such as a skip-gram model estimated on a large-scale corpus, were able to score among the top 5 systems.
Author keywords:
Co-hyponyms; Computational linguistics; Hypernyms; Lexical semantics; Semantic relatedness; Semantic relation extraction; Semantic relations; Semantic similarity measures; Synonyms
Index keywords:
no data
DOI:
no data
View in Scopus:
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84951826148&partnerID=40&md5=29caa593a0cefc459d9a12c83f96d505
Co-authors in MNS:
Other fields
Field Value
Affiliations TU Darmstadt, Darmstadt, Germany; Universite Catholique de Louvain, Louvain-la-Neuve, Belgium; Moscow State University, Moscow, Russian Federation; N. N. Krasovskii Institute of Mathematics and Mechanics, Ural Branch of the RAS, Russian Federation; NLPub, Yekaterinburg, Russian Federation; University of Trento, Rovereto, Italy; University of Wolverhampton, Wolverhampton, United Kingdom
Publisher Rossiiskii Gosudarstvennyi Gumanitarnyi Universitet
Conference name International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2015
Conference date 27 May 2015 through 30 May 2015
Conference code 117347
Language of Original Document English
Abbreviated Source Title Komp'ut. Lingvist. Intellekt. Tehnol.
Source Scopus