Team Da Vinci: Chick ID

written by

Elizabeth Campolongo

Thursday, February 16, 2023

Congratulations to Team Da Vinci for being a Top 5 Project of The Erdős Institute’s Spring 2022 Data Science Boot Camp with their project Chick ID: Bird Classification with a Convolutional Neural Network!

 

Team Da Vinci was a four-person team composed of Soumen Deb (University of Alberta) and Adam Kawash (Michigan State University), both Astrophysics Ph.D. Candidates at the time who have since graduated and now work as Sr. Data Scientists at Sun Life Financial and Michelin, respectively, along with Social Psychology Ph.D. Candidate Allison Londerée (The Ohio State University) and Ecology Ph.D. Student Moeka Ono (Texas A&M University). Together, they successfully created a classifier that identifies bird species from photographs using a Convolutional Neural Network (CNN). This has broad applications for conservationists and bird enthusiasts (birders) alike. Bird watching is a growing hobby in the United States, especially since the onset of the COVID-19 pandemic, and Team Da Vinci identified a burgeoning market for their work. Many birders pay for subscriptions to the relatively few apps available for bird identification, so Team Da Vinci adapted their model into an app, ChickID, which quickly and precisely (80%) identifies the species of a bird from a photo, adding competition to this relatively untapped market. They further highlight that, using their model, “a camera phone is the only other tool a citizen scientist would need to document rare or at-risk bird species spotted in their backyard.”

 

In selecting a project to develop over the course of the Data Science Boot Camp, Moeka suggested a project in forest ecology, in line with her Ph.D. research. She was initially looking at leaf classification, but then found a similar Kaggle project about birds. Since the team had already decided to pursue a computer vision project, this encouraged them to follow that direction. They also sourced images from a photographer on Long Island to increase the diversity of their images with photographs that contain foliage instead of just the bird. This led them to focus on birds from New York State. To do so, they scraped a list of birds from Wikipedia and cross-referenced it with their Kaggle dataset, reducing the dataset to just the 100 species of birds present in New York. They started with a random forest and a CNN model, initially expecting to compare the two, but the random forest was extremely slow to train. They switched their focus to the CNN, and Soumen ran it on his computer's GPU, which was faster. Data augmentation to increase the size of their training set did not sufficiently reduce their overfitting issue, so they switched to a popular pre-trained model, retrained the final 3 convolutional layers, and utilized both transfer learning and data augmentation, which produced better results (95% on the validation set).
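The cross-referencing step described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual code: the function names and the sample species lists are made up, and it assumes the two sources differ only in capitalization and spacing.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so 'Blue  jay' matches 'BLUE JAY'."""
    return " ".join(name.lower().split())


def filter_to_region(dataset_labels, regional_species):
    """Return the dataset labels whose normalized name is in the regional list."""
    regional = {normalize(s) for s in regional_species}
    return sorted(
        label for label in set(dataset_labels) if normalize(label) in regional
    )


# Hypothetical stand-ins for the Kaggle class labels and the scraped Wikipedia list.
kaggle_labels = ["AMERICAN ROBIN", "BALD EAGLE", "EMPEROR PENGUIN", "BLUE JAY"]
new_york_birds = ["American robin", "Blue  jay", "Bald eagle", "Wood duck"]

kept = filter_to_region(kaggle_labels, new_york_birds)
print(kept)  # ['AMERICAN ROBIN', 'BALD EAGLE', 'BLUE JAY']
```

In practice, species names often differ between sources in more than capitalization (hyphens, regional common names), so a real pipeline would likely need a manual check of the unmatched labels.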

 

Since Allison’s computer was struggling to run the models, she directed her focus to creating an interactive and intuitive app to showcase their classifier. Their app allows the user to upload an image for classification, or to feed the model an image from their test set. It also displays an image of the selected species so the user can make a secondary comparison. They highlight that, beyond its use for birders, their app fills a need for conservationists by opening a role for citizen scientists to document the bird species of their area, which helps in tracking species at risk from climate change. If given additional time to work on ChickID and the classifier itself, they would like access to improved hardware, since the complexity of the project was limited by their devices (due to the size of their data). This would allow them to compare results across different architectures, which Moeka was keen to try. After the project, Adam tried using the app but couldn’t get a good enough picture with his older phone, so he borrowed a friend’s, which worked because he could zoom in. Based on this experience, he thinks it would be helpful to add a feature that finds the bird in the image and zooms in (crops to just the bird). Finally, they would like to add more variety to their images for a more diverse training set, including images of varying quality (not only clean, high-quality shots) and different angles, so that the app performs better on real-life images.
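As a rough illustration of the crop-to-bird idea, the sketch below computes a padded crop region from a bounding box. It assumes some object detector (not part of the project as described) has already returned a box in pixel coordinates. The function name, the padding fraction, and the sample numbers are all hypothetical.

```python
def padded_crop_box(bbox, image_size, pad_fraction=0.1):
    """Expand a (left, top, right, bottom) box by pad_fraction of its width
    and height on each side, clamped to the image bounds."""
    left, top, right, bottom = bbox
    width, height = image_size
    pad_x = round((right - left) * pad_fraction)
    pad_y = round((bottom - top) * pad_fraction)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


# Hypothetical detector output for a bird in a 1000x800 photo.
box = padded_crop_box((400, 300, 600, 500), (1000, 800))
print(box)  # (380, 280, 620, 520)
```

The resulting tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed there directly before the cropped image is sent to the classifier.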

 

The whole team agreed that the success of their app in implementing their model was the most rewarding part of their project. Soumen and Adam tried it in real settings and shared it with friends, who also successfully tested it. Seeing the classification in action—not just black box accuracy scores—was extremely satisfying.

 

To future boot camp participants, Adam emphasized “when you’re picking your project team, I would suggest, seek out other people that seem excited and willing to put in the work to do this.” Though they felt the project was ambitious for the timeline, Soumen noted that they all wanted a challenge and were determined to train a neural network, remarking that “it is challenging, but it is doable.” Allison added that perseverance and guidance from their mentor, Kashif Bari, were also essential. “It’s really easy to be intimidated,” but “if you’re willing to put in the elbow grease to make it happen, just know that it can happen for you.”

 

Congratulations again to Team Da Vinci for being a Top 5 Project of The Erdős Institute’s Spring 2022 Data Science Boot Camp!

TEAM

smart search on arXiv

Xiaoyu Wang, Xin Su, Zeinab Elmi, Monalisa Dutta, Ketan Sand, Tantrik Mukerji


Standard search methods on arXiv.org are outdated and based on keyword matching. Modern chatbots such as ChatGPT appear to do better. In this project proposal, we'd like to do something similar to ChatGPT, if not better: namely, develop a chatbot specializing in arXiv.org that can handle questions like:
1. What are the recent research results on XYZ?
2. Are there common topics that both domain A and domain B are working on, but that researchers are too lazy to spot (since it means hopping over to a different domain and struggling with a different set of jargon)?
3. I want to research XYZ, could you provide a summary of the research results in the past month?
4. Could you provide a summary of the research results in arXiv:0123.45678v2?
Required skills for the project: web scraping, NLP, NLP fine-tuning methods such as RAG or whatever we could invent, deployment
