A Deductive Word Sense Disambiguation Approach Based on Data Mining and Knowledge Extraction in Expert Systems

Authors

    Zahra Pourbahman, Department of Computer Science, Shahed University, Tehran, Iran
    Niloofar Rastin, Faculty of Computer Engineering, Iranian E-University, Tehran, Iran
    Mostafa Fakhrahmad *, Electronic and Computer Engineering, Shiraz University, Shiraz, Iran, fakhrahmad@shirazu.ac.ir
https://doi.org/10.61838/jaiai.1.3.3

Keywords:

Natural language processing, Expert system, Word sense disambiguation, Lexical ambiguity resolution, Forward chaining

Abstract

Word Sense Disambiguation (WSD) is the task of assigning the appropriate sense to an ambiguous word in its context. It is one of the most challenging problems underlying several Natural Language Processing (NLP) tasks, such as machine translation. This paper proposes a novel approach consisting of four main components. In the first, a mining process constructs a tree structure that represents knowledge about the conceptual relationships between each ambiguous word and its relevant context. In the second, a Knowledge Base (KB) is built from the chains derived from that tree structure. The third is an expert system that resolves lexical ambiguity using the forward chaining strategy. In the fourth, the KB is upgraded to improve its effectiveness in determining the correct senses of ambiguous words. The approach is evaluated on the TWA corpus, and the results demonstrate the effectiveness of the proposed expert system.
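
To make the forward chaining step concrete, the following is a minimal sketch of how an expert system might derive a sense from context words. It is an illustration under stated assumptions, not the authors' implementation: the `Rule` type, the `ctx:`/`topic:`/`sense:` fact labels, and the toy rules for the word "bass" are all hypothetical, and the paper's actual rules are mined from a tree structure rather than written by hand.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """A hypothetical IF-THEN rule: all premises must hold to add the conclusion."""
    premises: frozenset
    conclusion: str


def forward_chain(facts, rules):
    """Fire rules repeatedly until no new fact can be derived (a fixed point)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.premises <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                changed = True
    return known


# Toy, hand-written knowledge base (an assumption for illustration only).
RULES = [
    Rule(frozenset({"ctx:river"}), "topic:water"),
    Rule(frozenset({"ctx:caught"}), "topic:fishing"),
    Rule(frozenset({"topic:water", "topic:fishing"}), "sense:bass/fish"),
    Rule(frozenset({"ctx:guitar"}), "topic:music"),
    Rule(frozenset({"topic:music"}), "sense:bass/music"),
]


def disambiguate(context_words, rules=RULES):
    """Turn context words into initial facts, chain forward, and collect senses."""
    facts = {f"ctx:{word.lower()}" for word in context_words}
    derived = forward_chain(facts, rules)
    return sorted(fact for fact in derived if fact.startswith("sense:"))


print(disambiguate(["He", "caught", "a", "bass", "in", "the", "river"]))
# prints ['sense:bass/fish']
```

In this sketch the chain `ctx:river` → `topic:water` and `ctx:caught` → `topic:fishing` must both fire before the sense rule can, which is the data-driven behaviour that distinguishes forward chaining from starting at a candidate sense and working backwards. A context that matches no rule yields an empty list, so a real system would need a fallback such as a most-frequent-sense default.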

Published

2024-07-01

Submitted

2024-03-05

Revised

2024-04-20

Accepted

2024-06-03

How to Cite

Pourbahman, Z., Rastin, N., & Fakhrahmad, M. (2024). A Deductive Word Sense Disambiguation Approach Based on Data Mining and Knowledge Extraction in Expert Systems. Journal of Artificial Intelligence, Applications and Innovations, 1(3), 20-30. https://doi.org/10.61838/jaiai.1.3.3
