Enhancing Real-Time Vision-Based Sign Language Interpretation: A Deep Learning Approach
Keywords:
Sign Language Interpretation, Deep Learning, Gated Recurrent Unit (GRU), Real-Time Vision-Based Systems

Abstract
Sign language interpretation via real-time vision-based systems presents a complex challenge due to the intricate nature of sign language gestures and the variability in human motion. Effective interpretation requires robust systems that can handle the nuances of visual data and translate them into comprehensible text or speech. This study explores the efficacy of various deep learning architectures in improving the accuracy and reliability of sign language interpretation, specifically the application of Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks alongside simpler Artificial Neural Networks (ANNs) with different activation functions such as the Rectified Linear Unit (ReLU) and Leaky ReLU (LReLU). Experiments show that LSTM and GRU are particularly effective for continuous frame data because of their ability to process temporal sequences, whereas simpler ANNs with targeted hyperparameter tuning are sufficient for static frames. The comparative analysis reveals that GRU outperforms LSTM in handling short sequences, and that there is negligible performance difference between ANNs using ReLU and LReLU for single-frame interpretation. The findings contribute to ongoing efforts to refine and enhance technological solutions for the deaf and mute communities, ensuring more accessible and effective communication tools. The research underscores the importance of clean input data and highlights specific preprocessing techniques that help focus on relevant data points, significantly boosting the performance of vision-based sign language interpretation systems.
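To make the architectural comparison concrete, the following is a minimal sketch, not the paper's exact models, of the three kinds of classifiers the abstract contrasts, written with TensorFlow/Keras. The input representation, layer widths, sequence length (30 frames), feature count (126 per frame), and class count (20 signs) are illustrative assumptions rather than values reported in the study.

```python
# Minimal sketch of the compared architectures, under assumed shapes:
# sequences of 30 frames with 126 features each (e.g., extracted landmarks)
# and 20 sign classes. These values are illustrative, not from the paper.
import tensorflow as tf

NUM_FRAMES, NUM_FEATURES, NUM_CLASSES = 30, 126, 20


def build_recurrent_model(cell: str = "gru") -> tf.keras.Model:
    """Sequence classifier for continuous signs; 'gru' or 'lstm' picks the cell type."""
    Recurrent = tf.keras.layers.GRU if cell == "gru" else tf.keras.layers.LSTM
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(NUM_FRAMES, NUM_FEATURES)),
        Recurrent(64, return_sequences=True),
        Recurrent(32),  # final hidden state summarizes the whole gesture
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


def build_static_ann(leaky: bool = False) -> tf.keras.Model:
    """Feed-forward ANN for single-frame (static) signs; toggles ReLU vs Leaky ReLU."""
    make_act = (lambda: tf.keras.layers.LeakyReLU(0.01)) if leaky else (lambda: tf.keras.layers.ReLU())
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(NUM_FEATURES,)),
        tf.keras.layers.Dense(128), make_act(),
        tf.keras.layers.Dense(64), make_act(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


if __name__ == "__main__":
    models = {
        "GRU": build_recurrent_model("gru"),
        "LSTM": build_recurrent_model("lstm"),
        "ANN-ReLU": build_static_ann(leaky=False),
        "ANN-LReLU": build_static_ann(leaky=True),
    }
    for name, model in models.items():
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        print(name, model.count_params(), "parameters")
```

As a general design note, a GRU layer uses fewer gates and therefore fewer parameters than a comparably sized LSTM layer, which is one plausible reason it can match or beat LSTM on the short gesture sequences the abstract describes.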