Expanding hierarchical contexts for constructing a semantic word network / Ustalov D.A. // Komp'juternaja Lingvistika i Intellektual'nye Tehnologii. - 2017. - V. 1, l. 16. - P. 360-372.

ISSN:
2221-7932
Type:
Conference Paper
Abstract:
A semantic word network (SWN) is a network that represents the semantic relations between individual words or their lexical senses. This paper proposes Watlink, an unsupervised method for inducing an SWN by constructing and expanding hierarchical contexts using both available dictionary resources and distributional semantics methods for is-a relations. The method has three steps: context construction, context expansion, and context disambiguation. It was evaluated on two datasets for the Russian language. The first is a well-known lexical ontology built by a group of expert lexicographers. The second, LRWC ("Lexical Relations from the Wisdom of the Crowd"), is a new crowdsourced resource that contains both positive and negative human judgements for subsumptions. Watlink outperformed the other relation extraction methods on both datasets according to recall and F1-score. Both the implementation of the Watlink method and the LRWC dataset are publicly available under libre licenses.
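To make the evaluation criteria concrete, the following is a minimal sketch of how precision, recall, and F1-score could be computed for extracted is-a pairs against a resource with both positive and negative judgements. The function name `evaluate`, the toy pairs, and the choice to ignore predictions that have no judgement are assumptions made for illustration; this record does not describe the exact protocol used in the paper.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]  # (hyponym, hypernym); hypothetical representation


def evaluate(predicted: Iterable[Pair],
             positives: Set[Pair],
             negatives: Set[Pair]) -> Dict[str, float]:
    """Score predicted is-a pairs against judged pairs.

    Assumption: predictions with no human judgement are ignored
    rather than counted as errors.
    """
    predicted_set = set(predicted)
    tp = len(predicted_set & positives)   # predicted and judged correct
    fp = len(predicted_set & negatives)   # predicted but judged incorrect
    fn = len(positives - predicted_set)   # judged correct but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data, invented for illustration only.
positives = {("cat", "animal"), ("dog", "animal"), ("oak", "tree")}
negatives = {("animal", "cat"), ("tree", "dog")}
predicted = [("cat", "animal"), ("oak", "tree"),
             ("animal", "cat"), ("rose", "flower")]

scores = evaluate(predicted, positives, negatives)
print(scores)
```

In this toy case two predictions are judged correct, one is judged incorrect, one positive pair is missed, and the unjudged pair ("rose", "flower") does not affect the scores.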
Author keywords:
Crowdsourcing; Hypernym; Hyponym; Lexical semantics; Russian; Semantic network; Subsumption
Index keywords:
no data
DOI:
no data
View in Scopus:
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85021788483&partnerID=40&md5=1b648631e49230b0be10a95aa9913a2d
Co-authors in МНС:
Other fields
Field Value
Affiliations Krasovskii Institute of Mathematics and Mechanics, Ural Federal University, Yekaterinburg, Russian Federation
Funding Details 16-37-00354, RFBR, Russian Foundation for Basic Research
Funding Text The reported study is funded by RFBR according to grant no. 16-37-00354. The author is grateful to Natalia Loukachevitch who provided machine-readable versions of the RuThes and RuWordNet datasets. The author would also like to thank Alexander Panchenko, Nikolay Arefyev, Andrey Sozykin, and three anonymous reviewers for useful comments on the present study.
Correspondence Address Ustalov, D.A.; Krasovskii Institute of Mathematics and Mechanics, Ural Federal University, Russian Federation; email: dau@imm.Uran.ru
Publisher Rossiiskii Gosudarstvennyi Gumanitarnyi Universitet
Language of Original Document English
Abbreviated Source Title Komp'ut. Lingvist. Intellekt. Tehnol.
Source Scopus