Team Runarljod: Translating American Sign Language

written by

Elizabeth Campolongo

Monday, February 20, 2023

Congratulations to Team Runarljod for being a Top 5 Project of The Erdős Institute’s Fall 2022 Data Science Boot Camp with their project to translate American Sign Language (ASL)!

 

Team Runarljod was a six-person team composed of Chemical and Biochemical Engineering Ph.D. Candidate Akash Banerjee (Rutgers University), Mathematics Ph.D. Candidate Ethan Farber (Boston College), Mathematician Balin Fleming, Physics Ph.D. Candidate Lauren Keyes (The Ohio State University), Test and Automation Engineer Abdullateef Shodunke (Standard Motor Products), and Physics Ph.D. Candidate Rouzbeh Modarresi-Yazdi (McGill University). They used a Convolutional Neural Network (CNN) to translate ASL fingerspelling (spelling words and acronyms by hand using the ASL alphabet, which corresponds one-to-one to the English alphabet). Options for automatic ASL translation are relatively new and still limited; Team Runarljod looked to contribute to this burgeoning field with their fingerspelling translator. They combined datasets from Kaggle and GitHub into 26,500 training images (of varying quality), which they preprocessed using MediaPipe to identify and graph the hand. They then augmented the dataset by randomly rotating and reflecting the images before feeding them into the CNN for training. Overall, their model achieved 85% accuracy, which they believe could be improved through further fine-tuning.
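
To make the augmentation step concrete, here is a minimal sketch of randomly rotating and reflecting an image. It is not the team's code: it uses plain NumPy rather than their TensorFlow pipeline, and the function names, the 20-degree rotation limit, and the 50% reflection probability are all assumptions chosen for illustration.

```python
import numpy as np


def rotate_nearest(image, angle_deg):
    """Rotate an image about its center using nearest-neighbor sampling."""
    h, w = image.shape[:2]
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every output pixel, find the source pixel it came from.
    src_x = cos_t * (xs - cx) + sin_t * (ys - cy) + cx
    src_y = -sin_t * (xs - cx) + cos_t * (ys - cy) + cy
    src_xi = np.rint(src_x).astype(int)
    src_yi = np.rint(src_y).astype(int)
    valid = (src_xi >= 0) & (src_xi < w) & (src_yi >= 0) & (src_yi < h)
    out = np.zeros_like(image)  # pixels with no source inside the frame stay zero
    out[valid] = image[src_yi[valid], src_xi[valid]]
    return out


def augment(image, rng, max_angle_deg=20.0):
    """Apply a random rotation, then a random left-right reflection.

    max_angle_deg and the 0.5 reflection probability are illustrative
    assumptions, not values reported by the team.
    """
    angle = rng.uniform(-max_angle_deg, max_angle_deg)
    out = rotate_nearest(image, angle)
    if rng.random() < 0.5:
        out = out[:, ::-1]  # mirror the image horizontally
    return out


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in for a hand image
augmented = augment(image, rng)
print(augmented.shape)
```

Calling `augment` with a fresh random draw each time an image is used means the network rarely sees the same pixels twice, which is the usual motivation for this kind of augmentation.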

 

When deciding which project to develop, they first ruled out areas they didn’t want to work in. They were interested in image classification but wanted to try something new, so they settled on ASL translation as the most meaningful and interesting option on their shortlist. They knew early on that they wanted to use a CNN, and divided into smaller groups to try both a PyTorch and a TensorFlow approach. Balin explained that they planned to extend their initial model to handle video clips of fingerspelling. As such, they employed MediaPipe for hand recognition and graphing, and applied alterations to the images (rotation, contrast, etc.) to avoid biasing the model by pixel location. Ultimately, they determined that the TensorFlow model was the better direction for their project, and they regrouped to finalize it.

 

Given more time, Team Runarljod would have liked to explore video transcription in greater depth. They had researched the topic, read about optical flow and iterative attention techniques, and prepared a video dataset and a pipeline to process and classify videos; however, they did not have the time required to train the model. Rouzbeh added that they had significantly improved their static-image model through batch normalization, and he would have liked to explore model tuning more.
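
For readers unfamiliar with batch normalization, the sketch below shows the training-time computation on a batch of convolutional feature maps: each channel is shifted to zero mean and scaled to unit variance across the batch, then given a learnable scale and offset. This is a generic NumPy illustration, not the team's model; the array layout (batch, height, width, channels), the names `gamma` and `beta`, and the `eps` value are assumptions. Framework layers additionally track running statistics for use at inference time, which is omitted here.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for feature maps.

    x has shape (batch, height, width, channels); gamma and beta have
    shape (channels,) and are the learnable scale and offset.
    """
    # Per-channel statistics, computed over the batch and spatial axes.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps guards against division by zero
    return gamma * x_hat + beta


rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=3.0, size=(8, 4, 4, 3))  # toy activations
normalized = batch_norm(features, gamma=np.ones(3), beta=np.zeros(3))
print(normalized.mean(axis=(0, 1, 2)), normalized.var(axis=(0, 1, 2)))
```

Keeping each layer's inputs in a consistent range tends to stabilize and speed up training, which is in line with the improvement the team reported.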

 

They all agreed that their greatest challenge was organizing and coordinating a larger team across multiple time zones. Lauren remarked that it would have been helpful to have a project manager with a “bird’s-eye-view of what’s going on” with the project, who could delegate tasks. On the technical side, they were training a CNN on a massive dataset, which required intensive computing resources to train and adjust. Rouzbeh suggested that future boot camp participants keep this in mind when choosing their projects: be aware of RAM and hard drive limitations. He and Ethan stressed the importance of securing access to computing clusters early in the process, with Ethan advising that they “decide early on what kind of project experience you want to have…we had to just figure out how to deal with massive amounts of data, and if not for our access to a computing cluster, then [I] really don’t know what we would’ve done.”

 

The whole team really enjoyed the modeling process: “the two PNG files that we got, validation and testing results, I love those,” exclaimed Rouzbeh. Ethan recalled his satisfaction in troubleshooting a TensorFlow issue: “after reading a bunch of message boards and documentation, I realized” a one-line solution; “then everything worked like we wanted it to.” Lauren found it exciting to work with a “bunch of new people that I’d never met before.” She added that “it was fun to learn about new machine learning techniques” and to see the effect of the normalization techniques applied at the end. Abdullateef also enjoyed the collaborative effort, remarking on the joy of coming together as a team and meeting nearly every day in the last two weeks to wrap up the project.

 

Congratulations again to Team Runarljod for being a Top 5 Project of The Erdős Institute’s Fall 2022 Data Science Boot Camp!

TEAM

Geoguessers

Aashraya Jha


In the popular online game Geoguessr, the player is shown a random image from Google Street View and is tasked with guessing its location on the globe as accurately as possible. In this project, we seek to solve a simplified version of this problem using a strategy often employed by professional Geoguessr players: using man-made features (for example, traffic lights) to accurately guess a city.
