Global Attention on BiLSTMs with BPE for English to Telugu CLIR
Keywords:
Attention, Global Attention, Cross-language IR, Bidirectional LSTMs, Byte pair encoding, Preprocessing, NMT
Abstract
Effective Cross-Lingual Information Retrieval (CLIR) relies heavily on accurate translation of queries, which is typically accomplished through Neural Machine Translation (NMT). NMT is a widely used method for translating queries from one language to another, and in the present work it is used to translate queries from English into the Indian language Telugu. NMT requires a parallel corpus for training; however, English-Telugu is a resource-poor language pair, so the required amount of parallel data may not be available. With limited data, NMT struggles with problems such as Out-Of-Vocabulary (OOV) words. The Byte Pair Encoding (BPE) mechanism helps solve OOV problems in resource-poor languages by segmenting rare words into subword units and translating those units. Translation quality in NMT also suffers on named entities, a problem related to Named Entity Recognition (NER). The NER problems can be addressed using Bidirectional Long Short-Term Memories (BiLSTMs), which train the system over the dataset in both the forward and backward directions and thereby help in recognizing named entities. These NMT mechanisms are sufficient for sentences without long-range dependencies, but they face issues when long-range dependencies are present. Global attention, a mechanism that connects the encoder and decoder in NMT, addresses these challenges and proves beneficial in enhancing translation quality, particularly for source sentences with long-range dependencies. Bilingual Evaluation Understudy (BLEU) scores and other evaluation metrics show that global attention on BiLSTMs with BPE translates source sentences more effectively than the regular models.
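To make the BPE mechanism concrete, the following is a minimal sketch of BPE merge learning in the style of the subword-segmentation approach of Sennrich et al. (2016): the most frequent adjacent symbol pair is repeatedly merged into a new subword unit. The toy English vocabulary and the merge budget are illustrative assumptions, not the actual English-Telugu training data used in this work.

```python
# Minimal BPE merge learning sketch (Sennrich et al., 2016 style).
# Words are space-separated symbol sequences ending in an end-of-word marker.
from collections import Counter
import re

def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every occurrence of the pair into a single subword symbol."""
    bigram = re.escape(" ".join(pair))
    # Lookarounds keep the match aligned to symbol boundaries.
    pattern = re.compile(r"(?<!\S)" + bigram + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Toy vocabulary: word -> corpus frequency (illustrative assumption).
vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

num_merges = 10  # hypothetical merge budget; real systems use tens of thousands
for _ in range(num_merges):
    pairs = get_pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print("merged:", best)
print(vocab)
```

After enough merges, frequent words survive as single units while rare words remain decomposed into learned subword pieces, which is what allows the translation model to cope with OOV words.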
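The global attention mechanism can likewise be sketched in code. Below is a minimal PyTorch illustration of Luong-style dot-product global attention computed over the states of a BiLSTM encoder; the decoder attends to every source position when producing each target word. The layer names, dimensions, and the summing of forward and backward encoder states are illustrative assumptions, and the paper's actual architecture may differ.

```python
# Luong-style global attention over BiLSTM encoder states (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalAttention(nn.Module):
    """Dot-product global attention (Luong et al., 2015)."""
    def __init__(self, hidden_size):
        super().__init__()
        # Combines the attention context with the current decoder state.
        self.combine = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, decoder_state, encoder_states):
        # decoder_state: (batch, hidden); encoder_states: (batch, src_len, hidden)
        scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)  # distribution over ALL source positions
        context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
        attentional = torch.tanh(self.combine(torch.cat([context, decoder_state], dim=1)))
        return attentional, weights

# Toy usage: sum the BiLSTM's two directions so the encoder states match the
# decoder's hidden size (one of several common conventions, assumed here).
hidden = 64
encoder = nn.LSTM(input_size=32, hidden_size=hidden,
                  bidirectional=True, batch_first=True)
src = torch.randn(2, 7, 32)                               # (batch, src_len, emb)
enc_out, _ = encoder(src)                                 # (batch, src_len, 2*hidden)
enc_out = enc_out[:, :, :hidden] + enc_out[:, :, hidden:] # merge directions
dec_state = torch.randn(2, hidden)                        # current decoder state
attn = GlobalAttention(hidden)
out, weights = attn(dec_state, enc_out)
print(out.shape, weights.shape)  # torch.Size([2, 64]) torch.Size([2, 7])
```

Because the softmax is taken over every encoder position, a target word can draw on source context arbitrarily far away, which is why global attention helps with long-range dependencies.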
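Finally, translation quality of this kind is commonly scored with BLEU; a minimal sketch using the sacrebleu library is shown below. The hypothesis and reference strings are made-up placeholders, not outputs of the English-Telugu system evaluated in the paper.

```python
# Corpus-level BLEU sketch with sacrebleu (placeholder strings, not real outputs).
import sacrebleu

hypotheses = ["the model translates the query"]
references = [["the model translates the query correctly"]]  # one reference stream
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```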