Computation of Similarity Between Two Pair of Sentence Using Word-Net
Keywords:
Semantic, Similarity, Verb, WordNet, Stem, Derivation Noun
Abstract
Data is now available in enormous quantities, but finding relevant and accurate data within it is a formidable task. A search is satisfying only when its results are accurate and exact. However, path-based (edge-counting), information-content, feature-based, and hybrid approaches do not provide satisfactory search results, and currently available algorithms are not efficient enough to return exact and accurate results. In this paper we implement a similarity computation that performs better than existing algorithms. The similarity between a pair of sentences is calculated from the WordNet noun IS-A relationship and verb relationships. The proposed algorithm is on par with the mean human similarity measure and also performs efficiently in sentence similarity computation.
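As a rough illustration of the path-based (edge-counting) idea over an IS-A hierarchy, the sketch below scores two concepts as 1 / (1 + number of IS-A edges between them). It is not the algorithm proposed in the paper: the tiny taxonomy `IS_A`, the helper names, and the best-match averaging used to lift word scores to a sentence score are all assumptions made for this example, and a real system would query WordNet instead of a hand-written dictionary.

```python
# Toy IS-A taxonomy (child -> parent). Hypothetical stand-in for the
# WordNet noun hierarchy; the concepts are chosen only for illustration.
IS_A = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(concept):
    """Return the chain of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = IS_A[concept]
    return path


def edge_distance(a, b):
    """Count IS-A edges between a and b via their lowest common ancestor."""
    up_a = path_to_root(a)
    up_b = path_to_root(b)
    steps_from_b = {c: i for i, c in enumerate(up_b)}
    for steps_from_a, c in enumerate(up_a):
        if c in steps_from_b:
            return steps_from_a + steps_from_b[c]
    raise ValueError("concepts share no common ancestor")


def path_similarity(a, b):
    """Edge-counting similarity: 1.0 for identical concepts, lower when farther apart."""
    return 1.0 / (1.0 + edge_distance(a, b))


def sentence_similarity(words_a, words_b):
    """Assumed aggregation: average each word's best match, in both directions."""
    def directed(src, dst):
        return sum(max(path_similarity(w, v) for v in dst) for w in src) / len(src)
    return 0.5 * (directed(words_a, words_b) + directed(words_b, words_a))


if __name__ == "__main__":
    print(path_similarity("dog", "cat"))
    print(path_similarity("dog", "car"))
    print(sentence_similarity(["dog", "car"], ["cat", "car"]))
```

In this toy hierarchy "dog" and "cat" are two edges apart (through "animal"), while "dog" and "car" are four edges apart (through "entity"), so the first pair scores higher. Information-content and hybrid measures replace the raw edge count with corpus-derived weights, which this sketch does not attempt.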
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.