Projects

Multi-label Ingredient Classification in Food Images

The project focuses on creating a model that identifies the ingredients of a dish from its image, motivated by the need to promote healthier eating habits amid rising obesity rates in the U.S. It treats this as a multi-label classification problem, in which the model predicts the presence or absence of each ingredient in a dish. The model operates on RGB images of food and uses pre-trained networks such as ResNet50, VGG16, InceptionV3, and EfficientNet for feature extraction. Several learning methods are explored for ingredient classification on top of these features, including multi-class logistic regression, decision trees, ensemble methods, SVMs, and an unsupervised learning algorithm developed by the team. The tool aims to help individuals make healthier dietary choices by providing fast and accurate recognition of meal ingredients.
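The multi-label setup can be illustrated with a minimal sketch, assuming the backbone features are already computed and each ingredient gets its own independent logistic unit. The ingredient vocabulary, the function names, and the 0.5 threshold are hypothetical choices for illustration, not the project's actual configuration.

```python
import math

# Hypothetical ingredient vocabulary, for illustration only.
INGREDIENTS = ["tomato", "cheese", "basil", "chicken"]

def to_multi_hot(present, vocabulary=INGREDIENTS):
    """Encode a set of ingredient names as a 0/1 vector over the vocabulary."""
    return [1 if name in present else 0 for name in vocabulary]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_ingredients(features, weights, biases, threshold=0.5):
    """Score each label independently with its own logistic unit.

    features: feature vector from a pre-trained backbone (assumed precomputed)
    weights:  one weight vector per ingredient
    biases:   one bias per ingredient
    """
    predictions = []
    for w, b in zip(weights, biases):
        logit = sum(wi * xi for wi, xi in zip(w, features)) + b
        # Unlike softmax, each label is thresholded on its own,
        # so any number of ingredients can be predicted at once.
        predictions.append(1 if sigmoid(logit) >= threshold else 0)
    return predictions
```

Because every label is decided separately, a dish can be assigned zero, one, or many ingredients, which is what distinguishes this from single-label classification.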

Maximizing MinBert for Multi-Task Learning

In this study, we investigate the efficacy of multi-task learning within a BERT fine-tuning framework, with the goal of optimizing performance on sentiment analysis, paraphrase detection, and semantic textual similarity (STS) simultaneously. We explore several model variants that incorporate round-robin training, cosine similarity in the STS head, additional neural-network layers in the task heads, lexicon encoding before the task-specific layers, SMART regularization, and a multi-layered transform. Overall, we found that the combination of round-robin training, cosine similarity in the STS head, multiple task-specific layers, and lexicon encoding yielded the best results.
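Two of these techniques can be sketched briefly, assuming sentence embeddings are already available as plain vectors. The task names and the rescaling of cosine similarity onto a 0 to 5 score range are assumptions made for illustration, not details taken from the study.

```python
import math
from itertools import cycle, islice

def round_robin(tasks, steps):
    """Return task names in a fixed rotation so each task is visited equally often."""
    return list(islice(cycle(tasks), steps))

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sts_score(u, v, max_score=5.0):
    """Map cosine similarity from [-1, 1] onto [0, max_score] (assumed scaling)."""
    return (cosine_similarity(u, v) + 1.0) / 2.0 * max_score

# Hypothetical task names: one batch per task, in rotation.
schedule = round_robin(["sentiment", "paraphrase", "sts"], steps=6)
```

Rotating through the tasks one batch at a time keeps any single task from dominating the shared encoder's updates, which is the usual motivation for round-robin scheduling.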

Pose Estimation for Pedestrians and Cyclists

This project aims to tackle the Waymo Challenge 2 task, which involves predicting 3D poses of pedestrians and cyclists using lidar frames and camera images, a critical aspect of autonomous driving safety. Initially, the team applied the AlphaPose multi-pose estimation model to Waymo's image data and observed low confidence in the keypoint predictions. The team attributes this to AlphaPose's design for multi-person scenes, which differ from the pedestrian and cyclist images in the Waymo dataset. Having familiarized themselves with the dataset and with pose estimation, the team plans to re-train the AlphaPose model, experimenting with changes to the model's architecture and with data augmentation techniques to improve its performance. The focus will then shift to adapting the model to Waymo's lidar data, starting with camera data for benchmarking before integrating lidar data for comprehensive testing in the Waymo environment.

Estimating an Optimal Solution to Backgammon

Backgammon, a game with a history of around 5,000 years, is played by two players who each move 15 pieces (white for one player, black for the other) across a board with 24 positions, with dice rolls determining the available moves. The game has about 18 quintillion possible legal positions, a state-space complexity on the order of 10^20, and an action space of approximately 633 that varies with the current game state. Its game-tree complexity, which indicates the number of paths to different states, is at least 10^144, far surpassing the number of atoms in the universe (10^80). Because of this vast state space, Backgammon remains unsolved, and researchers have focused on estimating optimal solutions rather than solving the game entirely. Initial strategies included one-step lookahead with simple heuristics, followed by advancements like Gerald Tesauro’s TD-Gammon, a significant step in estimating an optimal solution. More recent approaches involve simplified neural network reinforcement learning, as demonstrated by Tobias Vogt, who modified aspects of Tesauro's model, such as reducing the number of hidden layers.
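TD-Gammon trained a neural network with temporal-difference learning. The core update can be sketched in a few lines, assuming a linear value function as a simplified stand-in for the network. The function name, feature encoding, and hyperparameter values below are hypothetical.

```python
def td_lambda_update(weights, traces, features, next_value, reward,
                     alpha=0.1, lam=0.7, gamma=1.0):
    """One TD(lambda) step for a linear value estimate V(s) = w . x(s).

    weights:    current weight vector w
    traces:     eligibility traces, same length as weights
    features:   feature vector x(s) of the current position
    next_value: value estimate of the next position (0 at the end of a game)
    reward:     reward observed on this transition
    """
    value = sum(w * x for w, x in zip(weights, features))
    td_error = reward + gamma * next_value - value
    # For a linear model, the gradient of V(s) with respect to w is x(s).
    new_traces = [gamma * lam * e + x for e, x in zip(traces, features)]
    new_weights = [w + alpha * td_error * e for w, e in zip(weights, new_traces)]
    return new_weights, new_traces
```

The eligibility traces let the outcome of a game adjust the estimates of earlier positions, not only the most recent one.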

Maintaining Plasticity in Continual Reinforcement Learning

Plasticity in continual reinforcement learning (RL) is an emerging field of research. In this paper, we investigate methods to maintain neural network plasticity in a continual RL scenario within a simulated robotics environment, the MuJoCo Hopper environment. We systematically modify the environmental parameters of damping, friction, and gravity over training episodes in order to enhance the robustness of the agent and to evaluate its ability to adapt over time to the changing environment. We evaluate various algorithmic modifications of the baseline Proximal Policy Optimization (PPO) algorithm over 100 training episodes. Our findings suggest that, for specific environment adjustments and algorithmic modifications, we can outperform baseline PPO. Specifically, we show that augmenting PPO with ReDo (0.1 threshold), L2Init, and/or CReLUs in different scenarios may be a promising path to improving the agent's performance and flexibility, which is crucial for deploying robotics in real-world, dynamic conditions.
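The three modifications can be sketched in plain Python, assuming common formulations of each technique. This is an illustration rather than the code used in the paper: the function names and the regularization coefficient are hypothetical, and the ReDo part covers only the detection of dormant neurons, not their re-initialization.

```python
def crelu(xs):
    """Concatenated ReLU: keep positive and negative parts as separate outputs."""
    return [max(x, 0.0) for x in xs] + [max(-x, 0.0) for x in xs]

def dormant_neurons(activations, threshold=0.1):
    """Return indices of neurons whose normalized mean |activation| is <= threshold.

    activations: list of per-sample activation lists for one layer.
    Each neuron's mean absolute activation is divided by the layer average,
    so the score is relative to how active the layer is overall.
    """
    n_neurons = len(activations[0])
    mean_abs = [sum(abs(row[j]) for row in activations) / len(activations)
                for j in range(n_neurons)]
    layer_mean = sum(mean_abs) / n_neurons
    if layer_mean == 0.0:
        return list(range(n_neurons))  # the whole layer is inactive
    return [j for j, m in enumerate(mean_abs) if m / layer_mean <= threshold]

def l2_init_penalty(params, init_params, coeff=0.01):
    """L2 Init: penalize squared distance from the initial parameters."""
    return coeff * sum((p - p0) ** 2 for p, p0 in zip(params, init_params))
```

In a training loop, the penalty would be added to the PPO loss, and neurons flagged as dormant would be periodically re-initialized. CReLU doubles the width of a layer's output, so the following layer must be sized accordingly.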