I developed a simple animal classifier using PyTorch Lightning. The motivation for this project was a simple thought: "People sometimes compare each other to certain animals, like cats or dogs. What if I build an animal classifier and feed it images of humans? Would it be able to identify the most similar animal?" To test this, I obtained the Animal Face dataset from Kaggle. The original dataset had only three classes: Dog, Cat, and Wildlife. Since I wanted more categories, I manually relabeled all 16,130 images into the following eight categories: ["cat", "cheetah", "dog", "fox", "leopard", "lion", "tiger", "wolf"].
Next, I wrapped the dataset in a torch.utils.data.Dataset and a pl.LightningDataModule, which apply the transformations and pass the data to a dataloader. The transformations were chosen to maximize accuracy. I then implemented the complete training pipeline as a pl.LightningModule with a ResNet34 backbone, using the AdamW optimizer and custom training and validation steps. EarlyStopping was employed with a patience of 5, and the model was saved as a .ckpt file after each epoch.
Because the model is a relatively small ResNet-34 (34 layers), training took only about 40 minutes on the single GPU of my RTX-3060 laptop. During the initial attempt, the loss consistently plateaued at a high value, with the accuracy barely surpassing random guessing (12.5% for eight classes). This issue was resolved by switching from the Adam optimizer to AdamW. The likely cause was the way Adam handles weight decay: it folds the decay into the gradient as an L2 penalty, whereas AdamW applies it directly to the weights. After the switch, the loss converged and the model reached an accuracy of 99.9%.
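To make the Adam versus AdamW difference concrete, here is a minimal single-weight sketch of one update step in plain Python. The function name adam_step and all numeric values are illustrative assumptions, and the sketch only shows how the two update rules differ; it does not establish that this difference caused the plateau in the run described above.

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
              weight_decay=0.0, decoupled=False):
    """One optimizer update for a single scalar weight.

    decoupled=False: Adam with an L2 penalty (decay is added to the gradient).
    decoupled=True:  AdamW (decay is applied directly to the weight).
    """
    if not decoupled:
        # L2 penalty: the decay term passes through the adaptive scaling below.
        grad = grad + weight_decay * w
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias correction for the first moment
    v_hat = v / (1 - beta2 ** t)  # bias correction for the second moment
    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        # Decoupled decay: shrink the weight by lr * weight_decay * w.
        w_new -= lr * weight_decay * w
    return w_new, m, v


# Zero loss gradient isolates the effect of the decay term on a weight of 1.0.
w_l2, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.1, decoupled=False)
w_decoupled, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.1, decoupled=True)
```

In the coupled case the decay term is normalized like any other gradient, so on this first step the weight moves by roughly lr no matter how small weight_decay is. In the decoupled case the shrinkage is exactly lr * weight_decay * w.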
However, despite having a well-functioning animal classifier, my initial hypothesis turned out to be wrong. I expected that a deep neural network would capture abstract features of an image and that, given an image of a human with no specific "human" class available, it would output the most similar-looking animal. Instead, regardless of the animal a person resembled, the output was ALWAYS "dog". A classifier trained on eight classes has to assign every input to one of them, so an image unlike anything in the training data still receives a label, and for human faces this model consistently chose "dog". I realized my hypothesis was flawed: the result says more about this model and dataset than about how similar a person looks to any animal.
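The reason a human image always receives some animal label can be seen in a small plain-Python sketch of the final softmax and argmax step. The logits below are made up for illustration and are not outputs of the trained model; the helper names softmax and predict are likewise hypothetical.

```python
import math

CLASSES = ["cat", "cheetah", "dog", "fox", "leopard", "lion", "tiger", "wolf"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the highest-probability class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Made-up logits standing in for an image unlike anything in the training set.
human_logits = [0.2, -0.5, 0.9, 0.1, -0.3, 0.0, -0.4, 0.3]
label, confidence = predict(human_logits)
# The largest made-up logit sits at index 2, so label is "dog" here.
```

Because the probabilities always sum to 1 over the eight animal classes, there is no way for the model to answer "none of these"; even nearly flat logits produce a winner.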
Overall, this project was enjoyable. Although my initial hypothesis was incorrect, developing a model from scratch on a local GPU (I had previously only done so on Google Colaboratory) was a valuable learning experience. I gained practice in setting up CUDA, utilizing an external GPU, working with Conda, and using PyTorch Lightning. If I were to redo this project, now equipped with web development knowledge, I would consider turning it into a web application after addressing the fundamental issue.
Detailed hyperparameters and the code can be found below.
Technologies
PyTorch Lightning
Python