thom22
Registered: December 2020 City/Town/Province: Pine Brook Posts: 1
Butterflies are truly wonders of nature: they have enormous stamina yet are extremely fragile. The monarch butterfly's annual migration across North America is one of the most astonishing feats of the natural world. Each year, millions of individuals fly south from the United States and southern Canada to their wintering grounds in Mexico. In recent years, however, the monarch has come under increasing threat from climate change, deforestation, drought, habitat loss, and the loss of milkweed. These pressures have caused troubling population declines and disrupted migration routes. To address the problem, I knew that we first had to measure these trends accurately and find out how quickly the declines are occurring, which requires accurate counts of the butterflies. One obstacle I found is that monarch habitat overlaps with that of many other species, some of which look extremely similar. Examples include the viceroy butterfly (Limenitis archippus) and the soldier butterfly (Danaus eresimus). In some cases, the resemblance is due to Müllerian mimicry. Therefore, to gain accurate insights into population and migration trends, monarch butterflies need to be distinguished from these other species with similar phenotypes.
In tackling this problem, I knew that I couldn't simply go outside and count butterflies myself; I had to get my hands dirty and devise new technology to help citizen scientists (who are leading the charge in monarch butterfly conservation) identify the correct butterflies accurately and efficiently. I have been heavily involved in machine learning research throughout my high school career, writing, publishing, and presenting papers on various topics. Recently, for example, I have focused on using machine learning to detect building damage in imagery, in support of humanitarian relief after natural disasters. Alongside this interest, I have always had a passion for wildlife ecology. Since I was little, I have enjoyed visiting the Bronx Zoo and seeing animals from all across the world. My grandfather and I used to buy hundreds of figurines of different animal species, and this interest has never subsided. I therefore decided to combine my knowledge of machine learning and computer science research with my passion for wildlife to take decisive action for the conservation of a threatened species: the monarch butterfly.
After scouting the current state of the field and conducting a literature review, I found no available dataset that includes imagery of both monarch butterflies and the species' "look-alikes." Therefore, I decided to create one. I used Google Images queries to download labeled images of 6 different species. I then relied on many volunteers (family, friends, etc.) to add bounding-box labels to the images, so that we could eventually train convolutional neural networks on them. The dataset, called MonarchNet, contains labeled images of monarch butterflies (Danaus plexippus) and of species that have a similar physical appearance: viceroy butterflies (Limenitis archippus), red admirals (Vanessa atalanta), painted ladies (Vanessa cardui), queen butterflies (Danaus gilippus), and soldier butterflies (Danaus eresimus). After many hours of dataset curation and labeling, I trained a baseline convolutional neural network on it to evaluate the dataset and its usefulness for the monarch versus look-alike species classification problem. The input is a labeled butterfly image, with the species and bounding-box coordinates. The output is a digit from 0 to 5 (corresponding to the category of butterfly) representing the predicted species of the butterfly in the image. The model architecture is ResNet50, pretrained on ImageNet data. The optimization criterion is the cross-entropy loss function. I trained on a randomly selected 80% of the dataset with a batch size of 32, using the Adam optimizer with a learning rate of 0.01. The network was trained for 100 epochs on NVIDIA Tesla K80 GPUs (through the Google Colaboratory service). The result I achieved was a weighted F1 score of 0.824. The F1 score is the harmonic mean of precision and recall on a scale from 0 to 1, and the weighted version averages the per-species scores in proportion to how many images each species has. Hopefully, further research and a public release of the dataset will allow the scientific community to achieve an even better F1 score.
The Nicodemus Wilderness Project presented a wonderful opportunity to pursue this project because it hosts a diverse community of "Apprentice Ecologists": young people who are completing inspiring projects to protect and improve the environment. I wanted to engage with this community and present the research that I have been working so hard on. This project benefits the community and the wider world by providing better computational tools for accurately identifying these butterflies, which leads to better detection of population and migration trends and therefore to more targeted methods of mitigating the threats facing the species. Monarch butterflies matter to the ecosystems they inhabit because they are important pollinators. Preserving them not only sustains the impressive migration that they complete each year, but also benefits all of the environments that they inhabit and the other organisms that live there.
I was able to publish my preliminary dataset and early results in the Proceedings of Learning Meaningful Representations of Life, after presenting my work at the Learning Meaningful Representations of Life workshop (www.lmrl.org) at the Neural Information Processing Systems (NeurIPS) Conference, one of the premier scientific conferences for machine learning research. It was an honor to have researchers from across the world examine my work and to talk with them at the poster session. This presentation coincided with my presentations at other NeurIPS workshops that weekend, including the Tackling Climate Change with ML workshop and the AI for Earth Sciences workshop, where I presented my work on using machine learning and remote sensing to assess building damage from satellite imagery after natural disasters. All in all, it was a wonderful experience, and it motivated me to continue my work with monarch butterflies. In the future, I plan to finalize the dataset and conduct more experiments on it (with different deep-learning architectures, for example), so that I can publicly release it and allow the machine learning and computer vision communities to try to improve on my baseline model.
The Apprentice Ecologist Project enriched my life greatly by allowing me to combine two seemingly disparate passions of mine. It has inspired me to conduct further research projects that use machine learning and scientific research to raise awareness of, and develop solutions to, the pressing problems that face the world today, particularly in climate change and environmental conservation. It was a perfect project for the COVID-19 pandemic because it did not require large gatherings, and everything could be done right here on my computer and in my backyard. I am currently working on a new project that uses computer vision to train models to identify different African wildlife species caught on wildlife cameras. This research will hopefully be useful in the future for preventing poaching and for gaining more accurate insights into those populations. I will be happy to share this project with the Nicodemus Wilderness Project when it is finished or nearly finished.