Deep Learning for the Recognition of Isolated Signs in Libras

Gabriela Tavares Barreto

This work investigates deep learning approaches for the recognition of isolated signs in Brazilian Sign Language (Libras) using pose- and hand-based landmark sequences extracted with MediaPipe Holistic. Two neural architectures were evaluated: a bidirectional LSTM encoder and a compact Transformer encoder. Experiments were conducted on two Brazilian RGB video datasets, MINDS-Libras and V-Librasil, considering signer-dependent, signer-independent, cross-dataset transfer, and mixed-domain training scenarios. The results show that the Transformer model achieves higher performance when sufficient data is available, outperforming the LSTM in both signer-dependent and signer-independent evaluations on MINDS-Libras. However, both models exhibit limited generalization across domains, achieving near-zero transferability when trained on MINDS-Libras and tested directly on V-Librasil, which highlights substantial differences between the datasets despite their visual similarity. Mixed-dataset training partially mitigates these effects but still reveals strong dataset dependency. Overall, the findings reinforce the need for larger, more diverse, and standardized Libras datasets, and suggest that domain robustness remains a key challenge for automatic sign recognition in Brazilian Sign Language.
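The pipeline described above feeds per-frame landmark sequences into sequence models. A minimal sketch of the batching step is shown below, assuming the MediaPipe Holistic landmarks have already been extracted per frame; the feature size (33 pose + 21 landmarks per hand, 3 coordinates each) and the `pad_sequences` helper are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

# Hypothetical feature layout: 33 pose + 21 left-hand + 21 right-hand
# landmarks per frame, each with (x, y, z) coordinates.
N_FEATURES = (33 + 21 + 21) * 3  # 225

def pad_sequences(seqs, max_len):
    """Zero-pad each (T_i, N_FEATURES) sequence to a fixed (max_len, N_FEATURES)
    so a batch can be stacked for an LSTM or Transformer encoder."""
    batch = np.zeros((len(seqs), max_len, N_FEATURES), dtype=np.float32)
    for i, s in enumerate(seqs):
        t = min(len(s), max_len)   # truncate sequences longer than max_len
        batch[i, :t] = s[:t]
    return batch

# Two signs with different durations (40 and 60 frames).
seqs = [np.random.rand(40, N_FEATURES), np.random.rand(60, N_FEATURES)]
batch = pad_sequences(seqs, max_len=50)
print(batch.shape)  # (2, 50, 225)
```

Zero-padding with truncation is one common choice for videos of unequal length; masking the padded frames inside the model is usually needed so the encoder ignores them.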


2025/2 - POC2

Advisor: Pedro Olmo Stancioli Vaz de Melo

Keywords: Deep Learning, Libras, AI

PDF Available