The Fundamentals of Machine Learning
1. The Machine Learning Landscape
What Is Machine Learning?
Why Use Machine Learning?
Examples of Applications
Types of Machine Learning Systems
Supervised/Unsupervised Learning
Batch and Online Learning
Instance-Based Versus Model-Based Learning
Main Challenges of Machine Learning
Insufficient Quantity of Training Data
Nonrepresentative Training Data
Poor-Quality Data
Irrelevant Features
Overfitting the Training Data
Underfitting the Training Data
Stepping Back
Testing and Validating
Hyperparameter Tuning and Model Selection
Data Mismatch
2. End-to-End Machine Learning Project
Working with Real Data
Look at the Big Picture
Frame the Problem
Select a Performance Measure
Check the Assumptions
Get the Data
Create the Workspace
Download the Data
Take a Quick Look at the Data Structure
Create a Test Set
Discover and Visualize the Data to Gain Insights
Visualizing Geographical Data
Looking for Correlations
Experimenting with Attribute Combinations
Prepare the Data for Machine Learning Algorithms
Data Cleaning
Handling Text and Categorical Attributes
Custom Transformers
Feature Scaling
Transformation Pipelines
Select and Train a Model
Training and Evaluating on the Training Set
Better Evaluation Using Cross-Validation
Fine-Tune Your Model
Grid Search
Randomized Search
Ensemble Methods
Analyze the Best Models and Their Errors
Evaluate Your System on the Test Set
Launch, Monitor, and Maintain Your System
Try It Out!
3. Classification
MNIST
Training a Binary Classifier
Performance Measures
Measuring Accuracy Using Cross-Validation
Confusion Matrix
Precision and Recall
Precision/Recall Trade-off
The ROC Curve
Multiclass Classification
Error Analysis
Multilabel Classification
Multioutput Classification
4. Training Models
Linear Regression
The Normal Equation
Computational Complexity
Gradient Descent
Batch Gradient Descent
Stochastic Gradient Descent
Mini-batch Gradient Descent
Polynomial Regression
Learning Curves
Regularized Linear Models
Lasso Regression
Elastic Net
Early Stopping
Logistic Regression
Training and Cost Function
Decision Boundaries
Softmax Regression
5. Support Vector Machines
Linear SVM Classification
Soft Margin Classification
Nonlinear SVM Classification
Polynomial Kernel
Similarity Features
Gaussian RBF Kernel
Computational Complexity
SVM Regression
Under the Hood
Decision Function and Predictions
Training Objective
Quadratic Programming
The Dual Problem
Kernelized SVMs
Online SVMs
6. Decision Trees
Training and Visualizing a Decision Tree
Making Predictions
Estimating Class Probabilities
The CART Training Algorithm
Computational Complexity
Gini Impurity or Entropy?
Regularization Hyperparameters
Regression
Instability
Exercises
7. Ensemble Learning and Random Forests
Voting Classifiers
Bagging and Pasting
Bagging and Pasting in Scikit-Learn
Out-of-Bag Evaluation
Random Patches and Random Subspaces
Random Forests
Extra-Trees
Feature Importance
Boosting
AdaBoost
Gradient Boosting
Stacking
8. Dimensionality Reduction
The Curse of Dimensionality
Main Approaches for Dimensionality Reduction
Projection
Manifold Learning
PCA
Preserving the Variance
Principal Components
Projecting Down to d Dimensions
Using Scikit-Learn
Explained Variance Ratio
Choosing the Right Number of Dimensions
PCA for Compression
Randomized PCA
Incremental PCA
Selecting a Kernel and Tuning Hyperparameters
LLE
Other Dimensionality Reduction Techniques
9. Unsupervised Learning Techniques
Clustering
K-Means
Limits of K-Means
Using Clustering for Image Segmentation
Using Clustering for Preprocessing
Using Clustering for Semi-Supervised Learning
DBSCAN
Other Clustering Algorithms
Gaussian Mixtures
Anomaly Detection Using Gaussian Mixtures
Selecting the Number of Clusters
Bayesian Gaussian Mixture Models
Other Algorithms for Anomaly and Novelty Detection