Week 7

This week was quite challenging for my computer vision research. I modified my current model by adding a convolutional layer and flattening the tensor after the convolutional layers. I also increased the batch size and the number of epochs in an attempt to improve the model’s performance. However, my code had numerous errors that took the entire week to debug. Thankfully, my mentor, Dr. Furst, and my advisor, Dr. Maxwell, provided valuable assistance during this process. We went through several sections of the code, focusing on data normalization and the numerical details of the convolutional neural network. Dr. Maxwell suggested that some code left over from the original fully connected network might still be present and would need to be modified to resolve the remaining errors. If my alternative approach doesn’t yield the desired results, I may have to revisit this.
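A common source of errors when convolutional layers are added in front of code inherited from a fully connected network is a mismatch between the flattened tensor size and the input size expected by the first linear layer. I don't know yet whether that is what went wrong in my code, but the arithmetic is easy to check by hand. The sketch below uses the standard output-size formula for convolution and pooling layers; the layer stack, the 64x64 input, and the 32 output channels are hypothetical values chosen only for illustration, not my actual model.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def flattened_features(input_size, channels_out, layers):
    """Number of features after flattening, assuming a square input.

    `layers` is a list of dicts holding keyword arguments for conv2d_output_size.
    """
    size = input_size
    for layer in layers:
        size = conv2d_output_size(size, **layer)
    return channels_out * size * size


# Hypothetical stack: three 3x3 convolutions (padding 1), each followed by 2x2 max pooling.
layers = []
for _ in range(3):
    layers.append({"kernel_size": 3, "padding": 1})  # convolution keeps the size
    layers.append({"kernel_size": 2, "stride": 2})   # pooling halves the size

n_features = flattened_features(input_size=64, channels_out=32, layers=layers)
print(n_features)  # 32 channels * 8 * 8 = 2048
```

With these made-up numbers, the first linear layer after the flatten step would need 2048 input features. A linear layer still sized for the old fully connected network would fail with a shape error at that point.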

During my meeting with Dr. Maxwell, I presented another version of the task that I developed based on the official PyTorch tutorial and the “Cats & Dogs Classification with Pytorch” implementation. The tutorials provided clear instructions on dataset loading, data transformation, loading image data, model building with data augmentation, and making predictions. While larger sample sizes often result in higher accuracy, this model runs rather slowly on my CPU. Since our goal is to achieve 80% accuracy on the original JPEG files, I adjusted the number of epochs so that the model consistently stays above 80% accuracy, balancing runtime against accuracy.

Encouragingly, I obtained promising results for both JPEG (original) and PNG (linear) files. However, when I tried to load EXR (log) files, I encountered an error: the “ImageFolder” class, which I used for loading image data, does not support EXR files. I therefore need to find a way to modify this class so that it can load EXR files. I searched online for existing solutions but haven’t found a suitable one yet. I did, however, come across a class called “FishLogImageSubDataset” in a previous student’s work. It looks promising, but I need to confirm that it can be used to load log data, so I have asked Dr. Maxwell for guidance.
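As a fallback, here is a minimal sketch of a dataset that reads the same one-folder-per-class layout as “ImageFolder” but takes the file extensions and the loading function as arguments. It is not the “FishLogImageSubDataset” class, which I haven’t examined yet. The class name `FolderDataset` and its parameters are my own placeholders, and unlike “ImageFolder” it only looks at files directly inside each class folder. It relies on the fact that PyTorch’s `DataLoader` accepts any map-style object with `__len__` and `__getitem__`.

```python
import os


class FolderDataset:
    """Map-style dataset over root/<class_name>/<file>, with a pluggable loader."""

    def __init__(self, root, loader, extensions=(".exr",), transform=None):
        self.loader = loader          # callable: file path -> image
        self.transform = transform    # optional callable applied to each image
        exts = tuple(e.lower() for e in extensions)

        # One class per subdirectory, indexed in alphabetical order.
        self.classes = sorted(
            d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
        )
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}

        # Collect (path, class_index) pairs for files with a matching extension.
        self.samples = []
        for name in self.classes:
            class_dir = os.path.join(root, name)
            for fname in sorted(os.listdir(class_dir)):
                if fname.lower().endswith(exts):
                    path = os.path.join(class_dir, fname)
                    self.samples.append((path, self.class_to_idx[name]))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, target = self.samples[index]
        image = self.loader(path)
        if self.transform is not None:
            image = self.transform(image)
        return image, target
```

The loader is left open on purpose. One candidate I still need to verify is OpenCV’s `cv2.imread(path, cv2.IMREAD_UNCHANGED)`, which as far as I know requires setting the `OPENCV_IO_ENABLE_OPENEXR` environment variable to `1` before importing `cv2` in recent builds. torchvision’s `DatasetFolder` also accepts `loader` and `extensions` arguments, so it may be the simpler route if it works with the rest of my pipeline.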

For the upcoming week, my plan is to integrate the “FishLogImageSubDataset” class into my CatOrDog project and train the model with log data. I’m hopeful that this will allow me to run a comprehensive performance comparison across the JPEG, PNG, and EXR versions of the data. As a computer vision scientist, I understand that research in this field involves continuous experimentation, facing failures, and achieving satisfying breakthroughs. Despite the challenges, I remain optimistic and determined, and I trust that my efforts will lead to promising outcomes.

Written on July 27, 2023