Slide 7 source links:
1. Diagram
2. Diagrams
3. Book
Slide 25:
1. GitHub
2. Blog
3. Flowcharts
Digital Takeaway:
Image Classification with HPCC GNN on Cloud
Summer 2021
New JIRA Tickets I opened:
I created a ticket to summarize the findings of my internship regarding the steps to process and classify images with HPCC GNN. In the future, anyone who wants to train the GNN model on their own images can refer to it.
With the MobileNetV2 model, 224x224x3 images, and 5 epochs, I ran varying numbers of Thor slaves with default CPU and memory to evaluate differences in accuracy and timing.
| # of Thor Slaves (with default CPU and memory) | End Time (Total Cluster Time) |
| --- | --- |
| 1 | *error-terminated |
| 2 | *error-terminated |
| 4 (default # of Thor slaves) | 28:12 (second trial: 28:13) |
| 8 | Failed (all three trials) |

Why the 8-slave runs failed: the first two trials failed because the gnncarina resource group on Azure must be set to selected networks, so each time I create a new AKS cluster I must add the network; the third trial failed because there was not enough memory.
Since only the 4-slave configuration ran successfully under default settings, I manually changed the CPU to 8 and the memory to 16G.
It is expected that more Thor slaves will mean a shorter running time.
| # of Thor Slaves (with 8 CPU and 16G memory) | End Time (Total Cluster Time) |
| --- | --- |
| 1 | 1:38:12 |
| 2 | 48:10 |

Note on the runs above: 1 vs. 2 epochs.
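As a quick check on the table above, the clock strings can be converted to seconds to compare the two runs. This is a small sketch with a helper name of my own; note the caveat that the runs used 1 vs. 2 epochs, so the ratio is not a clean speedup measurement.

```python
def to_seconds(t: str) -> int:
    """Convert a clock string like '48:10' or '1:38:12' to total seconds."""
    seconds = 0
    for part in t.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

one_slave = to_seconds("1:38:12")   # 5892 seconds
two_slaves = to_seconds("48:10")    # 2890 seconds
print(f"ratio: {one_slave / two_slaves:.2f}x")  # ratio: 2.04x
```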
As I continued working with the HPCC GNN model, I began creating a README.md file to document the steps so that later on, I could turn it into “instructions” for someone to recreate my project.
So far, the README.md file outlines how to set up Kubernetes, use an AKS cluster created by others, complete the Azure storage setup, create the storage account and shares, deploy HPCC, spray data, and change the number of Thor slaves.
Dataset size: 4,839 images
In the Jupyter notebook, I changed the base model name to try several popular image classification models.
List of models: https://keras.io/api/applications/
MobileNetV2
InceptionResNetV2 (this one initially failed, so I tested a different link for the same model and it worked)
NASNetMobile
EfficientNetV2
BiT
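Swapping models only required changing the base model name in the notebook. Below is a minimal sketch of what that swap looks like, assuming a standard Keras transfer-learning setup with a frozen pretrained base; the function name, the two-class head, and the layer choices are my own illustration, not the exact notebook code.

```python
import tensorflow as tf

def build_classifier(base_name: str, num_classes: int = 2,
                     weights: str = "imagenet") -> tf.keras.Model:
    """Build a frozen pretrained base with a small classification head.

    base_name is any callable in tf.keras.applications, e.g. "MobileNetV2"
    or "InceptionResNetV2" (see https://keras.io/api/applications/).
    """
    base_cls = getattr(tf.keras.applications, base_name)
    base = base_cls(input_shape=(224, 224, 3), include_top=False, weights=weights)
    base.trainable = False  # freeze pretrained weights for transfer learning
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage (train_ds is a hypothetical tf.data.Dataset of labeled images):
# model = build_classifier("MobileNetV2")
# model.fit(train_ds, epochs=5)
```

Trying a different architecture is then a one-word change to `base_name`.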
Ultimately, the MobileNetV2 model I originally used was the most time-efficient, at 37 to 48 seconds per epoch. All of the models reached 100% accuracy on the first epoch.
JIRA: HPCC-26359
Create storage account secret error on Mac
Running the create-secret.sh file with the command ./create-secret.sh produced the error “exactly one NAME is required, got 2”.
JIRA: ML-496
While experimenting with different models to see how changing the model affects accuracy and training time, I got this error message for all of the EfficientNet models: “TypeError: ‘numpy.float64’ object cannot be interpreted as an integer.”
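I did not pin down the exact line inside the EfficientNet code, but this error is easy to reproduce generically: somewhere a numpy.float64 value reaches a context that requires a Python int. The sketch below shows the failure mode and the usual explicit-cast workaround; the variable `n` is purely illustrative.

```python
import numpy as np

# A numpy.float64 in an integer-only context (such as a range bound)
# raises exactly the TypeError seen with the EfficientNet models:
n = np.float64(4.0)
try:
    list(range(n))
except TypeError as e:
    print(e)  # 'numpy.float64' object cannot be interpreted as an integer

# The usual workaround is an explicit cast before the integer context:
print(list(range(int(n))))  # [0, 1, 2, 3]
```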
Following Lili and Roger’s advice, I changed the code from UNSIGNED1 (which is 1 byte) to UNSIGNED4 (4 bytes). Previously, the model would not run with more than 255 images; today, I tested 256 images and it worked.
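The 255-image ceiling matches the range of a 1-byte unsigned integer exactly, which is why widening the field fixes it. Here is a quick numpy analogy: ECL's UNSIGNED1 and UNSIGNED4 correspond to uint8 and uint32 respectively. This snippet is an illustration, not the GNN bundle code itself.

```python
import numpy as np

# UNSIGNED1 (1 byte) has the same range as numpy's uint8:
print(np.iinfo(np.uint8).max)   # 255 -> why the image count capped at 255
# UNSIGNED4 (4 bytes) corresponds to uint32, with ample headroom:
print(np.iinfo(np.uint32).max)  # 4294967295

# A 1-byte counter wraps around past 255 (modulo-256 arithmetic):
count = np.uint8(255)
print(int(count + np.uint8(1)))  # 0
```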
The next step was to spray all 4,000+ images and run the model with the complete dataset.
With 4,839 images and 4 thor slaves, the model took 1 hour and 13 minutes to run 20 epochs, reaching 100% accuracy on the student images after 2 epochs. When I tested 5 epochs, the training used 31 minutes and 22 seconds. Next week, I will keep retraining the model to see if/how much the speed increases when continuously processing the same images.
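Converting the two timings above to per-epoch averages is simple arithmetic on the reported numbers. The longer run averages fewer seconds per epoch, which would be consistent with fixed startup costs being amortized over more epochs, though that is my own reading of the figures.

```python
# Per-epoch averages from the timings reported above (pure arithmetic).
run_20 = (1 * 60 + 13) * 60   # 1 h 13 min for 20 epochs -> 4380 s
run_5 = 31 * 60 + 22          # 31 min 22 s for 5 epochs -> 1882 s
print(run_20 / 20)  # 219.0 seconds per epoch over 20 epochs
print(run_5 / 5)    # 376.4 seconds per epoch over 5 epochs
```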
This was a huge step for the internship project and the HPCC GNN model, which can now process far more images for a variety of purposes.
The main priority this week has been communicating with other LexisNexis employees to fix the GNN model's 255-image constraint. Right now, we are examining the code itself rather than Azure/the cloud. While working on that, I have also been testing different models (besides the TensorFlow transfer-learning one) to see how each change is reflected in the accuracy percentages. Overall, the TensorFlow models run smoothly in Jupyter Notebook with consistently high accuracy rates.
With the HPCC GNN model's image-count limitation, I would only be able to train on 255 of the nearly 5,000 student/non-student images. Even so, I can still take the project past the proof-of-concept stage by running the full dataset successfully in Jupyter Notebook with the transfer-learning model. Ideally, though, the HPCC model will process the images on the cloud. Fixing the 255-image limitation will open the door to more practical applications of the HPCC GNN model, even beyond the scope of my internship project, which makes it the top priority for this week.
Based on the TensorFlow transfer-learning model, I created a condensed version that is shorter but serves the same purpose.
Since the first time I used this model, the accuracy has always reached 100% after the first epoch. I was curious whether adding a new image with a different background would lower the accuracy, but it didn't.
Then I ran the condensed model on the animal images first and on the student images afterward, so that it “retrains” the model. The animal images were processed more quickly (since there are significantly fewer of them), and the accuracy took 3 epochs to reach 100%. After that, I tested my student images again, and the model still reached 100% accuracy after 41 seconds in the first epoch.