I continued reading the Hands-On Machine Learning book while taking notes on skills applicable to my internship project
- Chapter 4
- Gradient descent → generic optimization algorithm that iteratively tweaks model parameters to minimize a cost function (sketch below)
- Polynomial regression → add powers of each feature as new features, then fit a linear model on the extended feature set → lets a linear model fit nonlinear data (sketch below)
- Logistic regression → estimates the probability that an instance belongs to the positive class; predicts 1 if the estimated probability is over 50%, otherwise 0 (sketch below)
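
A minimal batch gradient descent sketch for linear regression, just to pin down the mechanics (the toy data, learning rate, and iteration count are values I made up):

```python
import numpy as np

# Toy data: y ≈ 4 + 3x plus Gaussian noise (made-up values)
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]  # prepend x0 = 1 for the bias term

eta = 0.1            # learning rate (illustrative)
n_iterations = 1000
theta = np.random.randn(2, 1)  # random initialization

for _ in range(n_iterations):
    # Gradient of the MSE cost function with respect to theta
    gradients = 2 / len(X_b) * X_b.T @ (X_b @ theta - y)
    theta -= eta * gradients  # step downhill

print(theta)  # should land near [[4], [3]]
```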
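
Polynomial regression with scikit-learn: extend the feature set with powers of the feature, then fit an ordinary linear model (quadratic toy data and degree 2 are assumptions for the example):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Quadratic toy data: y = 0.5x^2 + x + 2 plus noise
np.random.seed(42)
X = 6 * np.random.rand(100, 1) - 3
y = 0.5 * X**2 + X + 2 + np.random.randn(100, 1)

# Add x^2 as a new feature, then fit a plain linear model
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)  # columns: [x, x^2]

lin_reg = LinearRegression().fit(X_poly, y)
print(lin_reg.intercept_, lin_reg.coef_)  # roughly [2] and [[1, 0.5]]
```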
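
And a logistic regression sketch on the iris dataset (using the petal-width feature, as in the book's example):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X = iris.data[:, 3:]                 # petal width only
y = (iris.target == 2).astype(int)   # 1 = Iris virginica, 0 = everything else

log_reg = LogisticRegression().fit(X, y)

# predict_proba returns [P(class 0), P(class 1)] per instance;
# predict outputs 1 whenever P(class 1) is over 50%
print(log_reg.predict_proba([[1.7], [1.5]]))
print(log_reg.predict([[1.7], [1.5]]))
```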
- Chapter 5
- Decision function → computes a score per instance; predict the positive class (1) if the score is above a threshold, otherwise the negative class (0)
- Support vector machine → machine learning model capable of both linear and nonlinear classification and regression
- Large margin classification: fitting the widest possible street between the classes
- Soft margin: keeps the street as wide as possible while allowing some margin violations → more flexible and more robust to outliers (classification sketch after this chapter's notes)
- To use SVMs for regression instead of classification, reverse the objective: instead of fitting the widest possible street between two classes, try to fit as many instances as possible on the street (regression sketch below)
- Minimize the norm of the weight vector, ||w||, to get a large margin
- Quadratic programming → quadratic optimization problems with linear constraints
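
A soft-margin linear SVM classifier sketch with scikit-learn (C=1 is just an illustrative setting; lower C means a wider street but more margin violations):

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris()
X = iris.data[:, 2:]                 # petal length, petal width
y = (iris.target == 2).astype(int)   # Iris virginica vs. the rest

# SVMs are sensitive to feature scales, so standardize first;
# C trades margin width against margin violations
svm_clf = make_pipeline(StandardScaler(), LinearSVC(C=1, loss="hinge"))
svm_clf.fit(X, y)
print(svm_clf.predict([[5.5, 1.7]]))
```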
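
The regression counterpart: in LinearSVR, epsilon sets the width of the street, and instances on the street contribute no loss (the toy data and epsilon value are made up):

```python
import numpy as np
from sklearn.svm import LinearSVR

# Noisy linear toy data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (4 + 3 * X + np.random.randn(100, 1)).ravel()

# epsilon is the street's half-width; a wider street fits more instances on it
svm_reg = LinearSVR(epsilon=0.5)
svm_reg.fit(X, y)
print(svm_reg.predict([[1.0]]))
```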
- Chapter 11: Training Deep Neural Networks
- Deep networks are needed for complex tasks, like detecting hundreds of types of objects in images
- Vanishing/exploding gradients problem
- Gradients get smaller and smaller (vanishing) or bigger and bigger (exploding) as the algorithm works its way down to the lower layers
- Using He initialization along with ELU can reduce the danger of vanishing/exploding gradients at the start of training, but they can still come back during training (sketch below)
- Batch normalization addresses the issue (sketch below)
- Standardizes inputs, then rescales and offsets them
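
A Keras sketch of He initialization paired with ELU (the layer sizes and MNIST-style input shape are just placeholders):

```python
from tensorflow import keras

# He initialization suits ReLU-family activations like ELU: it keeps the
# variance of the activations roughly stable from layer to layer
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="elu", kernel_initializer="he_normal"),
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal"),
    keras.layers.Dense(10, activation="softmax"),
])
```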
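
Adding batch normalization in Keras is one extra layer per hidden layer; each BN layer standardizes its inputs, then rescales and offsets them with learned parameters:

```python
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="elu", kernel_initializer="he_normal"),
    keras.layers.BatchNormalization(),  # standardize, then rescale + offset
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal"),
    keras.layers.BatchNormalization(),
    keras.layers.Dense(10, activation="softmax"),
])
```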
- Transfer learning → reusing the lower layers of a model trained on a similar task
- Especially useful when you don't have much labeled training data
- Unsupervised pretraining
- Or train a first neural network on an auxiliary task for which it's easy to get labeled training data
- Facial recognition
- Gather a lot of pictures of random people and train a neural network to detect whether two pictures show the same person
- Reusing that network's lower layers lets you train a good face classifier with little data (sketch below)
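
A transfer-learning sketch in Keras for the facial-recognition example: reuse the lower layers of the same-person detector and put a new head on top (the model file name, head size, and layer split are hypothetical):

```python
from tensorflow import keras

# Hypothetical file: the network pretrained on the auxiliary
# "are these two pictures the same person?" task
base_model = keras.models.load_model("same_person_detector.h5")

# Reuse every layer except the old output layer, then add a new task head
# (assumes the pretrained model is Sequential; weights are shared, not copied)
model = keras.models.Sequential(base_model.layers[:-1])
model.add(keras.layers.Dense(10, activation="softmax"))  # e.g. 10 people to classify

# Freeze the reused layers first so early training can't wreck their weights
for layer in model.layers[:-1]:
    layer.trainable = False

model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd",
              metrics=["accuracy"])
# model.fit(X_small, y_small, epochs=4)  # works with little labeled data
```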