Enhancing Real-Time Vision-Based Sign Language Interpretation: A Deep Learning Approach

Authors

  • Irfanali J. Shaikh, Prasanna Shete

Keywords

Sign Language Interpretation, Deep Learning, Gated Recurrent Unit (GRU), Real-Time Vision-Based Systems

Abstract

Sign language interpretation via real-time vision-based systems presents a complex challenge due to the intricate nature of sign language gestures and the variability in human motion. Effective interpretation requires robust systems that can handle the nuances of visual data and translate them into comprehensible text or speech. This study explores the efficacy of various deep learning architectures in improving the accuracy and reliability of sign language interpretation; specifically, it examines Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks alongside simpler Artificial Neural Networks (ANNs) with different activation functions, such as the Rectified Linear Unit (ReLU) and Leaky ReLU (LReLU). Experiments show that LSTM and GRU networks are particularly effective for continuous frame data because of their ability to process temporal sequences, while simpler ANNs with targeted hyperparameter tuning suffice for static frames. A comparative analysis reveals that GRU outperforms LSTM in handling short sequences and that there is a negligible performance difference between ANNs using ReLU and LReLU for single-frame interpretation. The findings contribute to ongoing efforts to refine and enhance technological solutions for the deaf and mute communities, ensuring more accessible and effective communication tools. The research also underscores the importance of clean input data and highlights specific preprocessing techniques that help the models focus on relevant data points, significantly boosting the performance of vision-based sign language interpretation systems.
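Although only the abstract is reproduced here, its central comparison is straightforward to illustrate. The sketch below is not the authors' implementation; it simply builds the four model families the abstract contrasts, using Keras: GRU and LSTM classifiers over short frame sequences for continuous signing, and a small ANN with ReLU or Leaky ReLU activations for single static frames. The sequence length, feature dimension (e.g., flattened hand-landmark coordinates), and class count are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 26              # assumed: one class per sign in the vocabulary
SEQ_LEN, NUM_FEATS = 30, 126  # assumed: 30 frames x 126 landmark features

def recurrent_model(cell="gru"):
    """GRU or LSTM classifier for short continuous-frame sequences."""
    rnn = layers.GRU if cell == "gru" else layers.LSTM
    return keras.Sequential([
        keras.Input(shape=(SEQ_LEN, NUM_FEATS)),
        rnn(64, return_sequences=True),  # keep per-frame outputs for stacking
        rnn(64),                         # summarize the sequence
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

def static_model(activation=tf.nn.relu):
    """Simple ANN for single static frames (no temporal modelling)."""
    return keras.Sequential([
        keras.Input(shape=(NUM_FEATS,)),
        layers.Dense(128, activation=activation),
        layers.Dense(64, activation=activation),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

models = {
    "GRU": recurrent_model("gru"),
    "LSTM": recurrent_model("lstm"),
    "ANN-ReLU": static_model(tf.nn.relu),
    "ANN-LReLU": static_model(tf.nn.leaky_relu),
}
for name, model in models.items():
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    print(f"{name}: {model.count_params()} parameters")
```

One point the abstract's GRU-vs-LSTM finding hints at: a GRU cell has fewer gates (and thus fewer parameters) than an LSTM cell of the same width, which is often an advantage on short sequences where the LSTM's extra cell state adds capacity without benefit.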


Published

09.07.2024

How to Cite

Irfanali J. Shaikh, & Prasanna Shete. (2024). Enhancing Real-Time Vision-Based Sign Language Interpretation: A Deep Learning Approach. International Journal of Intelligent Systems and Applications in Engineering, 12(22s), 993–. Retrieved from https://www.ijisae.org/index.php/IJISAE/article/view/6582

Section

Research Article