USE CASE STUDY: Powering SDSU research team with Vertex AI experimentation workbench to unlock AI/ML for American Sign Language recognition.
*SDSU LABORATORY FOR LANGUAGE AND COGNITIVE NEUROSCIENCE
General Challenges for Automated Sign Recognition
The state of sign language AI is far behind the state of AI systems for spoken and written languages. This is due to several factors:
- a lack of adequate sign language datasets
- a lack of knowledge exchange between computational scientists and sign language linguistics experts
- a lack of a conventionalized written system for signed languages
- a reliance on language models built for spoken and written languages
All of these factors result in unreliable models. AI-assisted sign language recognition should be leveraged in a way that benefits research and the various stakeholders, in particular, sign language communities.
Current Challenges for SDSU
Analysis of signed language datasets is laborious and costly because it requires trained humans to watch vast amounts of video footage and manually label or annotate signs and their components. Even partial automation of these annotation processes, as has been possible for speech recognition, would significantly advance the researchers’ ability to examine signed languages.
The Laboratory for Language and Cognitive Neuroscience (LLCN) at San Diego State University (SDSU) wants to develop AI models trained to recognize handshape parts rather than whole signs (the whole-word approach is prevalent in current models). This mirrors automatic speech recognition, which breaks speech audio down into individual speech sounds. The team expects such models to be more successful because they should be more robust to natural variation in how signers articulate signs (e.g., dialects).
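The difference between whole-sign labels and part-level labels can be illustrated with a minimal Python sketch. It is not the LLCN annotation scheme: the `HandshapeParts` class, its three fields, and the `part_agreement` helper are hypothetical names chosen for illustration, and the example labels are simplified.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class HandshapeParts:
    """Hypothetical part-level annotation of one handshape (illustrative only)."""
    selected_fingers: FrozenSet[str]  # which fingers are selected
    spread: bool                      # are the selected fingers spread?
    bent: bool                        # are the selected fingers bent?


def part_agreement(a: HandshapeParts, b: HandshapeParts) -> float:
    """Return the fraction of the three part labels on which a and b agree."""
    matches = [
        a.selected_fingers == b.selected_fingers,
        a.spread == b.spread,
        a.bent == b.bent,
    ]
    return sum(matches) / len(matches)


# Two illustrative handshapes that differ only in whether the fingers are bent.
straight = HandshapeParts(frozenset({"index", "middle"}), spread=True, bent=False)
bent = HandshapeParts(frozenset({"index", "middle"}), spread=True, bent=True)

# A whole-label comparison treats these as simply different classes,
# while a part-level comparison shows that two of the three parts agree.
print(part_agreement(straight, bent))
```

Under this assumed representation, a model that confuses the two handshapes is still credited for the parts it identified correctly, which is one way a part-level approach can tolerate variation between signers.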
The SDSU team would like to use Google’s AI/ML tools to recognize and classify the most common American Sign Language (ASL) handshapes and handshape parts in order to boost linguistic research on ASL. The team wants to train models on large structured datasets (it has collected thousands of annotated video clips of people producing signs) and to carry out and manage experiments efficiently so that it can develop the best ML-based solution. SDSU plans to use Google Cloud Platform (GCP) tools to create a benchmark (an experimental workbench) for recognizing the most common handshapes and handshape parts in ASL.
In the first phase, SDSU wants to evaluate the capabilities of a baseline model for recognizing the handshapes and/or handshape parts.
SOLUTION DELIVERED BY F33
F33 followed its AI/ML Framework to deliver an AI/ML solution for SDSU. As part of this work, we helped SDSU formulate requirements, prepare datasets, and train its research team to use the solution.
The project yielded several significant outcomes. First, it produced a flexible and scalable laboratory for testing diverse machine learning (ML) approaches to American Sign Language (ASL) recognition. This specialized lab gives researchers and practitioners an environment in which to explore and refine ML models specific to ASL.
Another noteworthy result of the project was the establishment of a standardized and reproducible workflow. This workflow provides a systematic framework for conducting experiments and comparing different ML models designed for ASL recognition.
Furthermore, the project delivered a baseline model as a reference point for performance evaluation. This baseline model provided a benchmark against which other ML models could be compared. By having a standardized starting point, researchers and developers could gauge the effectiveness of their models and identify areas for improvement.
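As a rough illustration of how a baseline can serve as a reference point, the following Python sketch compares a candidate model with a baseline on per-class accuracy. It is a minimal sketch under stated assumptions, not the workflow delivered in the project: the function names, the handshape labels "A" and "B", and the predictions are all hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return {label: accuracy} computed over the examples of each true label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return {label: correct[label] / total[label] for label in total}


def compare_to_baseline(y_true, baseline_pred, candidate_pred):
    """Return {label: candidate accuracy minus baseline accuracy}."""
    base = per_class_accuracy(y_true, baseline_pred)
    cand = per_class_accuracy(y_true, candidate_pred)
    return {label: cand[label] - base[label] for label in base}


# Hypothetical labels and predictions, for illustration only.
y_true = ["A", "A", "B", "B"]
baseline_pred = ["A", "B", "B", "A"]
candidate_pred = ["A", "A", "B", "A"]

deltas = compare_to_baseline(y_true, baseline_pred, candidate_pred)
print(deltas)
```

A positive value means the candidate outperforms the baseline on that handshape class, while a value of zero or below points to a class where the candidate has not improved.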
In summary, the project produced a flexible and scalable lab for testing various ML approaches to ASL, along with a standardized workflow for experimentation and comparison of different ML models. Additionally, a baseline model was established to enable accurate performance assessments. These outcomes have significantly contributed to advancing ASL recognition using machine learning techniques.
– Zed Sehyr, PhD, SDSU
“An independent multi-channel training and recognition will support automatic annotation of signs and their parts, such as handshape, what fingers are selected, are they spread or bent, etc., and can aid or fully automate corpus annotation. Such corpora could be used to improve models of sign recognition and translation. Why is a language-based “sub-unit” based modeling approach important? For example, the meaning of a sign can change depending on which fingers are selected or whether they are bent or straight. There could also be subtle variation in the way a person articulates a sign (e.g., due to dialects/accents). Models informed by findings from sign linguistics research will lead to systems that more closely mirror real-world sign language usage.”
Learn more about F33 and discover how artificial intelligence can help your business. With F33, you can quickly deploy AI models that scale with your business needs and stay ahead of the competition.