
Andrew Ng's (Wu Enda's) Machine Learning Mind Maps, 23 Days of Daily Check-ins - Building a Knowledge Framework, Reviewing, and Summarizing

2022-07-02 20:04:00 【AXYZdong】

Author: AXYZdong, automation major, engineering guy
A bit of thinking, a few ideas, a touch of rationality!
Set a small goal and work to make it a habit! Meet a better self in your best years!
First published on CSDN; original work by AXYZdong
The only blog update address: AXYZdong's blog
Bilibili homepage: AXYZdong's personal homepage

Contents

0. Preface
1. How to use the mind maps
2. Main content of the mind maps
3. Mind map text
4. About the "23 days of check-ins" in the title
5. References

0. Preface

Machine learning is one of the most exciting areas of information technology. This post follows Andrew Ng's machine learning course and uses ProcessOn, an online diagramming tool, to build mind maps of the material.

1. How to use the mind maps

Use the mind maps alongside Andrew Ng's machine learning videos to build a knowledge framework and to review and summarize what you have learned.

Browse all the mind maps online: Andrew Ng Machine Learning - Mind Maps (ProcessOn)

To browse the full maps, follow the AXYZdong official account and reply "machine learning" to get the password!

2. Main content of the mind maps

Introduction

Supervised learning:

Univariate linear regression (Linear Regression with One Variable)
Multivariate linear regression (Linear Regression with Multiple Variables)
Logistic regression (Logistic Regression)
Regularization (Regularization)
Neural networks: representation (Neural Networks: Representation)
Neural networks: learning (Neural Networks: Learning)
Support vector machines (Support Vector Machines)

Unsupervised learning:

Clustering (Clustering)
Dimensionality reduction (Dimensionality Reduction)
Anomaly detection (Anomaly Detection)

Special applications:

Recommender systems (Recommender Systems)
Large-scale machine learning (Large Scale Machine Learning)

Advice on building machine learning systems:

Advice for applying machine learning (Advice for Applying Machine Learning)
Machine learning system design (Machine Learning System Design)
Application example: photo OCR (Application Example: Photo OCR)

3. Mind map text

0. Introduction

The introduction covers the definition of machine learning, the main families of machine learning algorithms, and supervised versus unsupervised learning.

There is no single agreed definition of machine learning. The video cites the following two scholars' definitions.

Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
Tom Mitchell (1998). Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

▲ Part 0: Introduction (mind map image)

1. Univariate linear regression (Linear Regression with One Variable)

This part covers the model representation of univariate linear regression, the cost function, gradient descent, and using gradient descent to find the minimum of the cost function.
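As an illustration of how these pieces fit together, here is a minimal sketch of batch gradient descent for the hypothesis h(x) = theta0 + theta1 * x with the squared-error cost. The data, learning rate, and iteration count are assumptions chosen for the example, not values from the course.

```python
def cost(xs, ys, theta0, theta1):
    """Squared-error cost J = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical toy data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
final_cost = cost(xs, ys, theta0, theta1)
```

On this toy data the parameters approach theta0 = 1 and theta1 = 2, and the cost approaches 0.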

▲ Part 1: Linear Regression with One Variable (mind map image)

2. Multivariate linear regression (Linear Regression with Multiple Variables)

Multivariate linear regression extends the univariate case and follows the same line of thought: state the model hypothesis, construct the cost function, and find the minimum of the cost function.

Unlike the univariate case, multivariate linear regression may also require feature scaling. When the features have very different scales, gradient descent converges slowly, so the features are rescaled to a common range (similar in spirit to normalization).
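One common form of feature scaling is mean normalization, sketched below. The function name and the sample values are hypothetical; dividing by the range (max minus min) is one choice, and dividing by the standard deviation is another.

```python
def mean_normalize(column):
    """Rescale one feature to (x - mean) / (max - min)."""
    mean = sum(column) / len(column)
    spread = max(column) - min(column)
    if spread == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / spread for x in column]


# Hypothetical features on very different scales.
sizes = [1000.0, 1500.0, 2000.0]
bedrooms = [1.0, 3.0, 5.0]
scaled_sizes = mean_normalize(sizes)
scaled_bedrooms = mean_normalize(bedrooms)
```

After scaling, both features lie in the same range, here [-0.5, 0.5].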

Multivariate linear regression also offers a second way to find the parameters that minimize the cost function: besides gradient descent, you can use the normal equation. Choose between the two according to the number of features.
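For reference, the normal equation gives the minimizing parameters in closed form. Here X is assumed to be the m x (n+1) design matrix (one row per training example, with a leading column of ones) and y the vector of targets:

```latex
\theta = (X^{T} X)^{-1} X^{T} y
```

Because this requires solving with an (n+1) x (n+1) matrix, it becomes slow when the number of features n is very large; in that regime gradient descent is usually the more practical choice.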

▲ Part 2: Linear Regression with Multiple Variables (mind map image)

3. Logistic regression (Logistic Regression)

The word "regression" here is a customary name and does not mean what it means in linear regression. Logistic regression is in essence a classification method: the variable to be predicted takes discrete values.
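To make the "classification" point concrete, the sketch below passes a linear combination of the features through the sigmoid function and thresholds the result at 0.5. The parameter values are hypothetical, and x[0] is assumed to be the constant feature 1.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta, x, threshold=0.5):
    """Return (label, probability) for one example x, where x[0] == 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    probability = sigmoid(z)
    label = 1 if probability >= threshold else 0
    return label, probability


theta = [-3.0, 1.0, 1.0]  # hypothetical parameters
label_a, prob_a = predict(theta, [1.0, 2.0, 2.0])  # z = 1
label_b, prob_b = predict(theta, [1.0, 1.0, 1.0])  # z = -1
```

The model outputs a probability, but the prediction itself is a discrete class label.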

▲ Part 3: Logistic Regression (mind map image)

4. Regularization (Regularization)

Regularization was introduced mainly to address overfitting. This part covers regularized linear regression and regularized logistic regression. The idea is to keep all the features but add a regularization term that shrinks the parameters (the coefficients of the features).

A hypothesis may fit the training data better than other hypotheses yet fit data outside the training set poorly; in that case we say the hypothesis overfits. The main causes are noise and too little training data.
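As a worked formula, the regularized cost function for linear regression can be written as follows, where lambda is the regularization parameter and, by convention, the bias parameter theta_0 is not penalized:

```latex
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^{2} + \lambda\sum_{j=1}^{n}\theta_j^{2}\right]
```

A larger lambda pushes the parameters toward zero and gives a smoother hypothesis; if lambda is too large, the model underfits.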

▲ Part 4: Regularization (mind map image)

5. Neural networks: representation (Neural Networks: Representation)

A brief account of neural networks, covering non-linear hypotheses, the model representation of neural networks, intuition for how they work, and multi-class classification.

When there are too many features, an ordinary logistic regression model cannot handle them effectively, and this is where neural networks come in.
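A small sketch can give some intuition for the representation: single sigmoid units with hand-picked weights compute logical AND, NOR, and OR, and a two-layer network built from them computes XNOR, which no single unit can represent. The specific weight values are illustrative assumptions, chosen only to be large enough to saturate the sigmoid.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(weights, inputs):
    """One sigmoid unit; weights[0] is the bias weight."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return sigmoid(z)


# Hand-picked (illustrative) weights for three logical units.
AND = [-30.0, 20.0, 20.0]
NOR = [10.0, -20.0, -20.0]  # (NOT x1) AND (NOT x2)
OR = [-10.0, 20.0, 20.0]


def xnor(x1, x2):
    """Two-layer network: hidden layer (AND, NOR), output layer OR."""
    hidden = [neuron(AND, [x1, x2]), neuron(NOR, [x1, x2])]
    return neuron(OR, hidden)


truth_table = [round(xnor(a, b)) for a in (0, 1) for b in (0, 1)]
```

The output is close to 1 exactly when the two inputs are equal.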

▲ Part 5: Neural Networks: Representation (mind map image)

6. Neural networks: learning (Neural Networks: Learning)

This part covers the cost function of a neural network, minimizing that cost function with gradient descent, and using the backpropagation algorithm (Backpropagation Algorithm) to compute the gradients that gradient descent needs.

Numerical gradient checking (Numerical Gradient Checking) guards against subtle bugs in which the cost appears to decrease but the final result is not actually the optimal solution.
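The idea behind gradient checking can be sketched as follows: approximate each partial derivative with a two-sided difference and compare it with the analytically computed gradient. The cost function here is a hypothetical stand-in for a network's cost, and its hand-derived gradient stands in for what backpropagation would produce.

```python
def numerical_gradient(cost, theta, epsilon=1e-4):
    """Two-sided finite-difference estimate of the gradient of cost at theta."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += epsilon
        minus[i] -= epsilon
        grad.append((cost(plus) - cost(minus)) / (2 * epsilon))
    return grad


def toy_cost(theta):
    """Hypothetical cost: J = theta0^2 + 3 * theta0 * theta1."""
    return theta[0] ** 2 + 3 * theta[0] * theta[1]


def toy_gradient(theta):
    """Gradient of toy_cost derived by hand."""
    return [2 * theta[0] + 3 * theta[1], 3 * theta[0]]


theta = [1.0, 2.0]
approx = numerical_gradient(toy_cost, theta)
exact = toy_gradient(theta)
max_difference = max(abs(a - e) for a, e in zip(approx, exact))
```

If the two gradients differ noticeably, the analytic gradient code is suspect. Because the numerical estimate is slow, it is normally switched off once the check has passed.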

If all the initial parameters are 0, every activation unit in the second layer ends up with the same value. The parameters therefore have to be initialized randomly. The Python code (using NumPy, with eps a small positive constant that you choose) is as follows:

import numpy as np
Theta1 = np.random.rand(10, 11) * (2 * eps) - eps  # 10x11 matrix, uniform in [-eps, eps)

▲ Part 6: Neural Networks: Learning (mind map image)

7. Advice for applying machine learning (Advice for Applying Machine Learning)

When a trained model turns out to make large errors on unseen data, what should you do next? Use diagnostics to determine which remedies will actually help the algorithm.
A training set and a test set are used to evaluate whether the hypothesis overfits: the parameters obtained by minimizing the training-set cost function are plugged into the test-set cost function.
A cross-validation set helps with model selection. Diagnosing bias and variance: when an algorithm performs poorly, either the bias is high or the variance is high. In other words, the problem is either underfitting or overfitting.
A learning curve plots the training-set error and the cross-validation-set error as functions of the number of training examples (m).
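A minimal sketch of splitting a data set into training, cross-validation, and test sets is shown below. The 60/20/20 proportions and the helper name are assumptions for illustration; other ratios are also used in practice.

```python
import random


def split_dataset(examples, seed=0):
    """Shuffle and split into 60% train, 20% cross-validation, 20% test."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n = len(shuffled)
    n_train = n * 6 // 10
    n_cv = n * 2 // 10
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]
    return train, cv, test


train, cv, test = split_dataset(range(10))
```

Parameters are fitted on the training set, competing models are compared on the cross-validation set, and the test set is held back to estimate the generalization error of the chosen model.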

▲ Part 7: Advice for Applying Machine Learning (mind map image)

8. Machine learning system design (Machine Learning System Design)

This part covers error analysis, error metrics for skewed classes, the trade-off between precision and recall, and data for machine learning.
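For skewed classes, plain accuracy can be misleading, so precision, recall, and their harmonic mean (the F1 score) are used instead. The sketch below computes them from counts of true positives, false positives, and false negatives; the counts are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

Raising the classification threshold generally increases precision and lowers recall; the F1 score gives a single number for comparing such settings.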

▲ Part 8: Machine Learning System Design (mind map image)

9. Support vector machines (Support Vector Machines)

A support vector machine (Support Vector Machine) is in essence a modification of the logistic regression objective function, in which the terms containing log are replaced by a cost function.

A support vector machine separates the samples with the largest possible margin, which makes it robust; it is sometimes called a large margin classifier.

Kernel functions (Kernels) are introduced into the SVM to take the place of the corresponding inner product of high-dimensional vectors.
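One widely used kernel is the Gaussian kernel, which measures the similarity between an example and a landmark point. The sketch below is illustrative; the function name and the value of sigma are assumptions.

```python
import math


def gaussian_kernel(x, landmark, sigma=1.0):
    """Similarity exp(-||x - l||^2 / (2 * sigma^2)), in the range (0, 1]."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-squared_distance / (2 * sigma ** 2))


same_point = gaussian_kernel([1.0, 2.0], [1.0, 2.0])
far_point = gaussian_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0)
```

The similarity is 1 when the example coincides with the landmark and decays toward 0 as the distance grows; sigma controls how quickly it decays.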

▲ Part 9: Support Vector Machines (mind map image)

10. Clustering (Clustering)

Clustering is a form of unsupervised learning.

Key algorithm: K-means. K-means is the most popular clustering algorithm; it takes an unlabeled data set and groups the data into distinct clusters.
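The two alternating steps of K-means, assigning each point to its nearest centroid and then moving each centroid to the mean of its points, can be sketched as follows. The data and the fixed initial centroids are assumptions made for reproducibility; in practice the centroids are usually initialized from randomly chosen examples.

```python
def kmeans(points, centroids, iterations=10):
    """Plain K-means; returns the final centroids and clusters."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two obvious groups.
points = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
centroids, clusters = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
```

On this toy data the centroids settle at the centers of the two groups after the first iteration.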

▲ Part 10: Clustering (mind map image)

11. Dimensionality reduction (Dimensionality Reduction)

Dimensionality reduction is used mainly for data compression and data visualization, and it is also a form of unsupervised learning.

Key algorithm: principal component analysis, PCA (Principal Component Analysis).
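The core computation of PCA can be summarized in three steps, assuming the data x^(i) have already been mean-normalized and that U_reduce denotes the first k columns of U:

```latex
\Sigma = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} \left(x^{(i)}\right)^{T}, \qquad
[U, S, V] = \operatorname{svd}(\Sigma), \qquad
z^{(i)} = U_{\text{reduce}}^{T}\, x^{(i)}
```

Each n-dimensional example x^(i) is thereby projected onto the k directions of greatest variance, giving the k-dimensional representation z^(i).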

▲ Part 11: Dimensionality Reduction (mind map image)

12. Anomaly detection (Anomaly Detection)

This part covers the Gaussian distribution (Gaussian Distribution), using a Gaussian model for anomaly detection, and feature transformations that make the original data closer to a Gaussian distribution.

Key algorithm: anomaly detection based on the Gaussian distribution.
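A minimal sketch of the approach: fit an independent Gaussian to each feature of the normal examples, multiply the per-feature densities to get p(x), and flag an example as anomalous when p(x) falls below a threshold epsilon. The training data and the value of epsilon are hypothetical; in practice epsilon is tuned on a labeled cross-validation set.

```python
import math


def fit_gaussian(values):
    """Estimate the mean and variance of one feature."""
    m = len(values)
    mu = sum(values) / m
    sigma2 = sum((v - mu) ** 2 for v in values) / m
    return mu, sigma2


def density(x, params):
    """p(x) as a product of independent per-feature Gaussian densities."""
    p = 1.0
    for xj, (mu, sigma2) in zip(x, params):
        p *= math.exp(-(xj - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
    return p


# Hypothetical "normal" training examples with two features.
train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
params = [fit_gaussian(column) for column in zip(*train)]

epsilon = 0.01  # hypothetical threshold
typical_is_anomaly = density([2.0, 11.0], params) < epsilon
outlier_is_anomaly = density([9.0, 20.0], params) < epsilon
```

The example at the center of the training data gets a high density and is not flagged, while the distant example gets a density near zero and is flagged.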

▲ Part 12: Anomaly Detection (mind map image)

13. Recommender systems (Recommender Systems)

This part covers content-based recommender systems, collaborative filtering (Collaborative Filtering), vectorization via low-rank matrix factorization, and an implementation detail, mean normalization.

Key algorithm: collaborative filtering (Collaborative Filtering).
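The mean normalization step can be sketched as follows: subtract each item's mean rating, computed over the users who actually rated it, before learning. The matrix layout (rows are items, columns are users, None marks a missing rating) and the sample values are assumptions.

```python
def mean_normalize_ratings(ratings):
    """Subtract each item's mean rating, ignoring missing (None) entries."""
    means, normalized = [], []
    for row in ratings:
        rated = [r for r in row if r is not None]
        mu = sum(rated) / len(rated) if rated else 0.0
        means.append(mu)
        normalized.append([None if r is None else r - mu for r in row])
    return normalized, means


# Hypothetical ratings: 2 items (rows) by 3 users (columns).
ratings = [[5.0, 4.0, None], [None, 1.0, 3.0]]
normalized, means = mean_normalize_ratings(ratings)
```

The item mean is added back at prediction time, so a user with no ratings at all is predicted to give each item its average rating rather than zero.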

▲ Part 13: Recommender Systems (mind map image)

14. Large-scale machine learning (Large Scale Machine Learning)

Main topics: stochastic gradient descent (Stochastic Gradient Descent), mini-batch gradient descent (Mini-Batch Gradient Descent), convergence of stochastic gradient descent, online learning (Online Learning), and MapReduce and data parallelism (Map Reduce and Data Parallelism).
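To show how stochastic gradient descent differs from the batch version, the sketch below updates the parameters after every single example instead of after a full pass over the data. It reuses univariate linear regression as the model; the data, learning rate, epoch count, and seed are assumptions.

```python
import random


def sgd_linear_regression(xs, ys, alpha=0.05, epochs=1000, seed=0):
    """Fit h(x) = theta0 + theta1 * x, updating after each example."""
    rng = random.Random(seed)
    theta0, theta1 = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a random order
        for i in order:
            error = theta0 + theta1 * xs[i] - ys[i]
            theta0 -= alpha * error
            theta1 -= alpha * error * xs[i]
    return theta0, theta1


# Hypothetical toy data generated from y = 1 + 2x.
theta0, theta1 = sgd_linear_regression([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Mini-batch gradient descent sits between the two: it averages the gradient over a small batch of examples before each update.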

▲ Part 14: Large Scale Machine Learning (mind map image)

15. Application example: photo OCR (Application Example: Photo OCR)

The focus is on the steps of the photo OCR pipeline and the use of sliding windows (Sliding Windows).
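The sliding-window idea can be sketched as a generator that enumerates fixed-size boxes across an image at a given stride; each box would then be passed to a classifier (not shown). The function name, box format, and sizes are hypothetical.

```python
def sliding_windows(image_width, image_height, window_width, window_height, stride):
    """Yield (left, top, right, bottom) boxes that lie fully inside the image."""
    for top in range(0, image_height - window_height + 1, stride):
        for left in range(0, image_width - window_width + 1, stride):
            yield (left, top, left + window_width, top + window_height)


# Hypothetical 6x4 image scanned with a 2x2 window and stride 2.
boxes = list(sliding_windows(6, 4, 2, 2, stride=2))
```

A smaller stride gives finer coverage at the cost of more classifier evaluations.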

▲ Part 15: Application Example: Photo OCR (mind map image)

16. Summary (Conclusion)

▲ Part 16: Conclusion (mind map image)

4. About the "23 days of check-ins" in the title

23 consecutive days of check-ins on Blink

▲ 23 days of check-ins on Blink (image)

5. References

[1]: Andrew Ng machine learning series (with Chinese and English subtitles)
[2]: fengdu78, Coursera-ML-AndrewNg-Notes, (2018), GitHub repository, https://github.com/fengdu78/Coursera-ML-AndrewNg-Notes

That's all for this post.