Week 4
This week (meeting held on 7/6/2023), I focused on expanding my skills in computer vision and deep learning. I started by learning how to use PyTorch to train a convolutional neural network (CNN) for image classification, following a helpful tutorial website that Dr. Maxwell shared.
After gaining a basic understanding of CNNs in PyTorch, I applied that knowledge to train a model on the FashionMNIST dataset, where the goal is to classify images of clothing into ten categories. This exercise let me practice implementing a CNN architecture and training it on real image data.
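For reference, here is a minimal sketch of the kind of model involved, patterned on the standard PyTorch image-classification tutorial. The layer sizes and names are illustrative, not the exact network I trained:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# FashionMNIST: 28x28 grayscale images of clothing in 10 classes
train_data = datasets.FashionMNIST(
    root="data", train=True, download=True, transform=transforms.ToTensor()
)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)

class SimpleCNN(nn.Module):
    def __init__(self, dropout_p=0.5):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)   # 28x28 -> 26x26
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)  # 26x26 -> 24x24
        self.dropout = nn.Dropout(dropout_p)
        self.fc1 = nn.Linear(64 * 12 * 12, 128)        # 12x12 after 2x2 max pooling
        self.fc2 = nn.Linear(128, 10)                  # one logit per clothing class

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)                         # 24x24 -> 12x12
        x = self.dropout(x)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)
```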
To evaluate how different aspects of the network architecture affect performance and training time, I ran an experiment. From the suggested dimensions of variation, I selected three to modify: the number of training epochs, the dropout rate of the Dropout layer, and the addition of a second dropout layer after the fully connected layer.
First, I increased the number of training epochs from 3 to 10. This change produced a substantial boost in performance, from 73% to 84%, which matches the intuition that more passes over the training data give the model more opportunity to learn.
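Since the epoch count is just the outer loop of the training routine, raising it is a one-line change, at the cost of training time that grows roughly linearly with it. A sketch of the loop, reusing the illustrative model and loader above and assuming cross-entropy loss with the Adam optimizer (my actual settings may have differed):

```python
model = SimpleCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

num_epochs = 10  # raised from 3; each epoch is one full pass over the training set
for epoch in range(num_epochs):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```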
To push performance further, I decreased the dropout rate from 0.5 to 0.3. The training figures showed no signs of overfitting, so weakening the regularization was safe, and the change raised performance from 84% to 86%.
Next, I added a second dropout layer after the fully connected layer. This adjustment produced no noticeable improvement, which is consistent with my understanding of dropout: it primarily mitigates overfitting, and this model was not overfitting to begin with.
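Concretely, the two dropout experiments amount to lowering the `p` argument of the existing `nn.Dropout` and inserting a second one between the fully connected layers. A sketch, reusing the illustrative model above:

```python
class SimpleCNNExtraDropout(SimpleCNN):
    def __init__(self):
        super().__init__(dropout_p=0.3)     # lowered from 0.5
        self.fc_dropout = nn.Dropout(0.3)   # new dropout after the fully connected layer

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = self.dropout(x)
        x = torch.flatten(x, 1)
        x = self.fc_dropout(F.relu(self.fc1(x)))  # drop FC activations too
        return self.fc2(x)
```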
I also attempted to add more convolution layers to the architecture, which proved harder than expected. Going from 1 to 2 convolution layers was straightforward, but I consistently hit errors when adding more. Consulting with Dr. Maxwell confirmed that it is uncommon to stack many convolution layers in a model this small.
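In hindsight, the likely cause (my assumption; I did not pin down the exact error at the time) is a shape problem: each unpadded convolution and each pooling step shrinks the 28x28 feature maps, so the flattened size feeding the fully connected layer changes with every added layer, and the maps soon shrink below the kernel size. A quick way to see this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.zeros(1, 1, 28, 28)        # one FashionMNIST-sized input
channels = [1, 32, 64, 128, 256]
try:
    for c_in, c_out in zip(channels, channels[1:]):
        x = F.max_pool2d(F.relu(nn.Conv2d(c_in, c_out, kernel_size=3)(x)), 2)
        print(x.shape)               # 13x13, then 5x5, then 1x1
except RuntimeError as e:
    # The fourth convolution fails: a 3x3 kernel cannot slide over a 1x1 map
    print("conv layer 4 failed:", e)
```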
Overall, I thoroughly enjoyed the coding process and the iterative improvements in performance through architectural modifications. It was a productive week, and I feel confident in my progress and understanding.
Looking ahead to next week, I have two main topics planned. First, I will select an object recognition task that comes with raw data, code, and pretrained models, and investigate fine-tuning techniques, particularly in the context of log image data, to see whether they can improve training outcomes. Second, I intend to explore rawpy, a package for reading and processing camera raw files, to expand my knowledge in this area of computer vision.
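As a starting point, rawpy's basic read-and-demosaic workflow is short; the file name below is a placeholder:

```python
import rawpy
import imageio

# Open a camera raw file (e.g. .nef, .cr2, .dng) and access the sensor data
with rawpy.imread("photo.nef") as raw:
    bayer = raw.raw_image.copy()  # unprocessed Bayer mosaic (copied before close)
    rgb = raw.postprocess()       # demosaic + white balance -> 8-bit RGB array
imageio.imwrite("photo.png", rgb)
```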