DAY 5: DATASET

Using various converter programs, I turned WEBM videos into JPG images. The first program I tried successfully turned the video into 100 photos, but it couldn’t resize them. I tried the NCH Photo Resize software instead, but it didn’t output consistent dimensions. Then I used BatchPhoto, which resized successfully but requires a subscription to remove the watermark. Since BatchPhoto runs on Windows, I sent the sample photos from my Windows desktop to my Mac laptop (where I will train the model) so I could confirm it works before purchasing. The final, working solution was to use VLC, following this tutorial: https://www.raymond.cc/blog/extract-video-frames-to-images-using-vlc-media-player/. I retrieved 551 images from one person’s video. The next step is to figure out a consistent naming convention for student images vs. non-student images. A possibility is “s123450”. A sketch of how the resize-and-rename step could be scripted locally is below.
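
Since no single tool handled both resizing and renaming cleanly, a short local script could do both once VLC has exported the frames. This is only a minimal sketch: the folder names, the 250x250 target size, and the “s12345” prefix are placeholders, not settings from the actual project.

  # Minimal sketch: resize VLC-extracted frames to a fixed size and rename them
  # with a consistent "s<ID>_<frame>.jpg" pattern. Paths, the 250x250 size, and
  # the "s12345" ID are placeholders, not the project's actual values.
  from pathlib import Path
  from PIL import Image

  SRC_DIR = Path("frames_raw")        # folder of frames exported by VLC (assumption)
  DST_DIR = Path("frames_resized")    # output folder
  TARGET_SIZE = (250, 250)            # placeholder target dimensions
  STUDENT_ID = "s12345"               # placeholder naming prefix

  DST_DIR.mkdir(exist_ok=True)

  for i, src in enumerate(sorted(SRC_DIR.glob("*.jpg")), start=1):
      img = Image.open(src).convert("RGB")
      img = img.resize(TARGET_SIZE)                    # force consistent dimensions
      img.save(DST_DIR / f"{STUDENT_ID}_{i:04d}.jpg")  # e.g. s12345_0001.jpg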

I also read:

  • Chapter 12 of Hands-On Machine Learning (Using TensorFlow like NumPy)
    • Similarities between TensorFlow and NumPy tensor operations (e.g. math functions)
    • You can apply TensorFlow operations to NumPy arrays and vice versa
    • tf.Variable is used when you want a tensor whose values you can modify (tf.Tensor is immutable)
    • Custom Loss Functions
      • Train a regression model when the training set is noisy
        • Removing outliers helps, but the dataset is still noisy
        • Use the Huber loss
    • Saving models with custom components
      • You need to map the component names to the objects when loading the model
    • Losses are used to train the model; metrics are used to evaluate it (so they need to be easier to interpret)
    • To keep track of the number of true positives and false positives → keras.metrics.Precision (see the Keras sketch after this list)
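
To make these notes concrete, here is a minimal Keras sketch of the pieces above: TensorFlow/NumPy interop, a tf.Variable, a custom Huber-style loss, saving and reloading a model with a custom component, and the Precision metric. The toy data, shapes, and file name are placeholders rather than project code, and the exact save/load behavior can vary by TensorFlow version.

  import numpy as np
  import tensorflow as tf
  from tensorflow import keras

  # TensorFlow and NumPy interoperate: NumPy ops accept TF tensors and vice versa.
  t = tf.constant([1.0, 2.0])
  print(np.square(t), tf.square(np.array([3.0, 4.0])))

  # Unlike tf.Tensor, a tf.Variable can be modified in place.
  v = tf.Variable([1.0, 2.0])
  v.assign([5.0, 6.0])

  def huber_loss(y_true, y_pred, threshold=1.0):
      # Quadratic for small errors, linear for large ones -> robust to noisy labels.
      error = y_true - y_pred
      small = tf.abs(error) < threshold
      squared = tf.square(error) / 2
      linear = threshold * tf.abs(error) - threshold ** 2 / 2
      return tf.where(small, squared, linear)

  # Toy regression model trained with the custom loss (random placeholder data).
  reg = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])
  reg.compile(loss=huber_loss, optimizer="sgd")
  reg.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)

  # Saving a model with a custom component: when reloading, map the name to the object.
  reg.save("reg_model.h5")
  reg2 = keras.models.load_model("reg_model.h5",
                                 custom_objects={"huber_loss": huber_loss})

  # For a yes/no classifier (like dog vs. not dog), Precision tracks true vs. false positives.
  clf = keras.Sequential([keras.layers.Dense(1, activation="sigmoid", input_shape=(4,))])
  clf.compile(loss="binary_crossentropy", optimizer="sgd",
              metrics=[keras.metrics.Precision()])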

DAY 4: WEBCAM

During the daily meeting, we talked about the security measures necessary when working with data, images, and personally identifiable information in order to ensure the safety of all parties involved. There are precautions and procedures we must follow to stay compliant with data security requirements.

After the webcam was mounted on the surface of the robot, I tested the procedure for turning a video into still images. Using Google Meet, I set the camera input to the webcam and screen-recorded the window with Screencastify, which creates a video link via Google Drive.

The next step was to convert the WEBM video into multiple JPG files. The first website I found online only output one file, whereas I need 20-30 frames per second, all saved as JPG images.

So, I used a different website (EZGif), which allowed me to choose how many images to output from the ~6-second video.

Then, it was brought to my attention that using a third-party converter like EZGif may violate data privacy regulations, so I will now use a similar application that runs locally through a software download. A sketch of what a local conversion could look like is below.
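
As one possibility for a fully local conversion (this is only a sketch, not the tool I have settled on), OpenCV can read the recording and write each frame out as a JPG. The file names are placeholders, and reading WEBM depends on the codecs included in the OpenCV build.

  import cv2

  # Minimal sketch: extract every frame of a local WEBM recording as JPG images,
  # with no third-party web service involved. Paths are placeholders.
  cap = cv2.VideoCapture("recording.webm")
  count = 0
  while True:
      ok, frame = cap.read()              # ok becomes False when the video ends
      if not ok:
          break
      cv2.imwrite(f"frame_{count:04d}.jpg", frame)
      count += 1
  cap.release()
  print(f"wrote {count} frames")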

DAY 3: GNN

The goal for today was to try to spray sample animal images from a database by following the ECL GNN Tutorial.

When I first tested it, I got this error message in the landing zone:

Solution: change the target name to “imgdb::cw” instead of “animal_images”.

The next goal was to get to this page (screenshot taken from the tutorial video)

In order to do that, I tried running BWR_ImageGNN in VS Code (View → Command Palette → ECL Submit)

It showed an error message: “No ECL Launch configurations”

Solution: close VS Code and reopen the project folder itself (open “.”) so that the whole directory at that location is included in the workspace

(now it includes “.vscode”, and the launch configurations, like it should)

Then, I got this error message on VS Code

and this error message on ECL Watch

Possible solution (tested): change “imagedb::bmd” to “imgdb::cw”, because that is the file name I set for it in ECL Watch.

“Setuptest” works, but when doing the GNN training, I get this error message. I will investigate, then open a JIRA ticket if I can’t find a solution.

Update: the solution was to include a “~” symbol in front of the target name, and to give a file name and a file size.

Now the results are accurate: the output shows 1 for any image of a dog and 0 for an image of a different animal. Since I have finished testing the demo in my local environment, I can now replicate the process with my own images.

DAY 2: GNN + CAMERA

I continued following the “Machine Learning with HPCC GNN Tutorial” on Docker Desktop.

• In ECL, Tensors are expressed using a RECORD-oriented Tensor, with the shape giving the dimensions (e.g. [n,n] is a 2-dimensional Tensor, [n,n,n,n] is a 4-dimensional Tensor)

• The first dimension always starts at 0, leaving the record count unspecified (e.g. a set of 50 x 50 color images = [0,50,50,3], where 3 = the three RGB color channels)

Two steps for creating an ECL Tensor:

  1. Create the data for the TensData RECORD
  2. Create a Tensor from that data, which packs it into a block-oriented form called “slices” (a rough NumPy analogue of the resulting shape is sketched below)
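
The exact ECL calls are in the GNN documentation linked below; as a rough NumPy analogue of the idea (this is not the ECL GNN API), the image data ends up as a 4-dimensional array whose first dimension is the number of records:

  import numpy as np

  # Rough NumPy analogue of the [0,50,50,3] shape above (not the ECL GNN API):
  # N images, each 50 x 50 pixels with 3 RGB channels, i.e. shape (N, 50, 50, 3).
  images = np.random.randint(0, 256, size=(10, 50, 50, 3), dtype=np.uint8)  # placeholder pixels
  images = images.astype("float32") / 255.0    # scale pixel values to [0, 1] for training
  print(images.shape)                          # (10, 50, 50, 3)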

GNN Documentation: https://cdn.hpccsystems.com/pdf/ml/GNN.pdf

VS Code → GNNTutorial.zip → Images

ECL Watch → Landing Zone → images uploaded from local machine (.bmp and .csv) → select all images (.bmp) → BLOB → set up name and file size → spray → verify in “Logical Files” list under “Logical Name”

YES or NO: is the image a dog? (1 = YES, 0 = NO)

Next step: start GNN processing/Tensors

Resize the images for consistency

The demo sets epochs to 200 (for a small dataset)

For the new data/photo obtaining procedure, we still need to get a physical camera working on the security robot. There are essentially 2 options:

  1. Webcam: We could plug a webcam into a laptop, place the laptop on the robot temporarily, and have students stand in front of the webcam as I start and stop the video on the laptop.
  2. Amcrest Security Camera (already installed on the robot): We have to disassemble the surface of the robot and hook it up to a computer to check compatibility, and test the ease of starting/stopping the video. If this camera works, it will be a more permanent solution that can be turned into an automated process, and it won’t take up additional surface space like the webcam will.

DAY 1: PROCEDURE CHANGE

During my daily meeting with my mentor David, we talked about a change in how the facial recognition program will be applied to the robot in order to incorporate HPCC. Instead of using a program to augment the data (e.g. turn 1 photo of student A into 100 photos), we will try to take a 10-second video of each student moving their head around and turn the video into still JPG images. This will allow me to train the model with real data, instead of “manufactured” data that isn’t as representative of our real-world application on the school security robot. In case this plan falls through, plan B is to use a video from a phone. However, mounting a camera on the robot will be best.

In order to help this plan come to fruition, I talked to my team members about the video output format on the cameras we have. The first issue we ran into was that the original pan and tilt camera is an IP camera, which will be more difficult to turn into still images. It would be easier to use an analog camera, so now we are figuring out whether or not we can install an analog camera instead. This way, we can use a capture card to get the images from the video.

In the future, the robot can take a video of a new student, integrate the new images by retraining the model using an onboard version of HPCC Systems machine learning, and continuously improve the student/staff database through an automated process.