
Understand machine learning in one article

2022-07-03 16:17:00 【Tread on the clouds with ice】

Table of contents

1. What is machine learning
2. Why machine learning is needed
3. The relationship between machine learning, artificial intelligence, and deep learning
4. Steps for developing machine learning applications
5. Machine learning algorithm classification
- 5.1 Supervised learning (supervised learning)
- 5.2 Unsupervised learning (unsupervised learning)
6. What should we do
7. How to learn

1. What is machine learning

People can summarize rules from a great deal of everyday experience. When we face a new problem, we apply those rules to analyze the situation and choose the best strategy.
Machine learning works in a similar way.

So machine learning can be defined as follows:

Definition of machine learning: machine learning automatically derives a model from data and uses that model to make predictions about unknown data.

2. Why machine learning is needed

Machine learning has attracted attention again in the 21st century. Behind this attention is a change in the whole environment: we have more and more data, and hardware keeps getting more powerful. There is an urgent need to free up human productivity by finding patterns in data automatically and by solving problems in more specialized fields. Machine learning is already widely used in data mining, computer vision, natural language processing, biometrics, search engines, medical diagnosis, credit card fraud detection, stock market analysis, DNA sequencing, speech and handwriting recognition, strategy games, and robotics.
The following is a rough list of application areas:

Data analysis - CRM, marketing analysis, audience research
Predictive analysis - stock market forecasting, market research, fraud prevention
Service personalization - recommendation engines, user modeling
Natural language processing - text generation, text analysis, text translation, chatbots
Sentiment analysis - audience research, customer service, handling, suggestions
Computer vision - image recognition, visual search, face recognition
Speech recognition - AI assistants, speech to text, automatic subtitles, and so on

3. The relationship between machine learning, artificial intelligence, and deep learning

Many people have heard these terms without knowing how they relate to each other. A picture shows it directly:
[Figure: the relationship between artificial intelligence, machine learning, and deep learning]
As the picture shows, AI covers both machine learning and deep learning. More precisely, machine learning is one approach to artificial intelligence, and deep learning is one method of machine learning.

Machine learning includes many different algorithms, such as linear regression, logistic regression, SVM, decision trees, and neural networks. Because machine learning with neural network algorithms is special, it has been given its own name: deep learning. So deep learning can also be understood as machine learning that uses neural network algorithms.
You may still have a question at this point: what is the definition of AI?

At a conference in the summer of 1956, the pioneers of artificial intelligence dreamed of using the computers that had just appeared to build complex machines with the same essential characteristics as human intelligence. This is what we now call strong artificial intelligence (General AI): an all-capable machine that has all of our senses (or even more), all of our reason, and thinks the way we do.
Strong AI still exists only in movies and science fiction, because we cannot build it yet. What we can build today is generally called weak artificial intelligence (Narrow AI): technology that performs a specific task as well as a person, or even better. Familiar examples of weak AI in practice include the AlphaGo family of Go-playing programs, face recognition, speech recognition, voice assistants, and self-driving cars.

How should you learn artificial intelligence? The field is very large and touches every industry, and the directions you can choose are diverse and differ from one another, including data mining, computer vision, natural language processing, and other areas. So is the material you need to learn completely different for each direction? No. The core of all of them is machine learning, and you cannot do anything without it. Whichever field you choose, you must lay a solid foundation. The first goal, therefore, is to master the major machine learning algorithms, along with how to apply them in practice.

4. Steps for developing machine learning applications

(1) Collect the data

There are many ways to collect sample data, such as writing a web crawler to extract data from websites, getting information from an RSS feed or an API, or receiving measurements sent by devices.

(2) Prepare the input data

After getting the data , You must also ensure that the data format meets the requirements .

(3) Analyze the input data

The main purpose of this step is to make sure the data set contains no garbage data. If you are using a trusted data source, you can skip this step.

(4) Train the algorithm

The machine learning algorithm really starts to learn at this step. If you use an unsupervised learning algorithm, there is no target variable, so there is no need to train the algorithm, and everything related to the algorithm happens in step (5).

(5) Test the algorithm

This step puts the knowledge learned in step (4) to actual use. We also need to evaluate how accurate the results are, and then retrain the algorithm as needed.

(6) Use the algorithm

Turn the algorithm into an application that performs a real task, to check whether the steps above work properly in a real environment. If you run into new data problems, repeat the steps above.

For a more formal view of the steps above, see the following flow chart:
[Figure: flow chart of the steps for developing a machine learning application]
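
To make the six steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy rows, the labels "small" and "large", the function names, and the choice of a nearest-centroid classifier as the algorithm being trained. A real project would collect data from a crawler, an API, or devices, and would usually rely on a library.

```python
# Hypothetical toy data: "height_cm,weight_kg,label" rows as raw strings.
RAW_ROWS = [
    "150,45,small", "152,48,small", "155,50,small", "158,52,small",
    "178,80,large", "180,82,large", "183,85,large", "185,90,large",
    "n/a,70,large",  # garbage row, removed in step 3
]

def collect():
    """Step 1: collect data (hard-coded here instead of a crawler or an API)."""
    return list(RAW_ROWS)

def prepare(rows):
    """Step 2: bring the raw text into a uniform (values, label) format."""
    parsed = []
    for row in rows:
        *values, label = row.split(",")
        parsed.append((values, label))
    return parsed

def analyze(parsed):
    """Step 3: drop rows whose feature values are not valid numbers."""
    clean = []
    for values, label in parsed:
        try:
            clean.append(([float(v) for v in values], label))
        except ValueError:
            continue
    return clean

def train(samples):
    """Step 4: learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, features):
    """Step 6: use the model by returning the label of the nearest centroid."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def evaluate(model, samples):
    """Step 5: test the model by measuring accuracy on held-out samples."""
    correct = sum(predict(model, f) == label for f, label in samples)
    return correct / len(samples)

data = analyze(prepare(collect()))
train_set, test_set = data[::2], data[1::2]  # simple alternating split
model = train(train_set)
accuracy = evaluate(model, test_set)
print("accuracy:", accuracy)
print("prediction:", predict(model, [160.0, 55.0]))
```

Holding part of the data back for step (5) is what makes the accuracy figure meaningful: the model is scored on samples it never saw during training.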

5. Machine learning algorithm classification


5.1 Supervised learning (supervised learning)

Definition: the input data consists of feature values and target values. The output of the learned function can be a continuous value (called regression) or one of a finite number of discrete values (called classification).
Supervised learning tasks therefore fall into two categories: classification and regression.
Both belong to supervised learning. They have different names only because their output values take different forms.

Classification: assigning things to categories, that is, predicting a discrete value.
For example, recognizing cats and dogs in pictures:
Features: pictures of cats and dogs; target: cat or dog (a category)
This is a classification problem.
Common classification algorithms: k-nearest neighbors, Bayesian classification, decision trees and random forests, logistic regression, neural networks, and so on
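
As a small illustration of classification, here is a sketch of the k-nearest neighbors idea in plain Python. The two numeric features (body weight and ear length) are hypothetical stand-ins for whatever features would be extracted from real pictures, and the function name `knn_predict` is made up for this example.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (body weight in kg, ear length in cm).
points = [(4.0, 6.5), (3.5, 6.0), (5.0, 7.0), (20.0, 12.0), (25.0, 11.0), (30.0, 13.0)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

print(knn_predict(points, labels, (4.5, 6.8)))    # query close to the cat samples
print(knn_predict(points, labels, (22.0, 12.5)))  # query close to the dog samples
```

The output is always one of the known categories, which is what makes this a classification problem rather than a regression problem.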

Regression: predicting a continuous, specific numeric value. For example, predicting house prices:
Features: attributes of the house; target: house price (continuous data)
This is a regression problem.
Common regression algorithms: linear regression, ridge regression, and so on
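
For regression, a minimal sketch of linear regression with a single feature, fitted by ordinary least squares in plain Python. The floor areas and prices are invented numbers chosen so that price is exactly 3 times area, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: floor area in square metres, price in thousands.
areas = [50.0, 60.0, 80.0, 100.0, 120.0]
prices = [150.0, 180.0, 240.0, 300.0, 360.0]  # exactly 3 * area

slope, intercept = fit_line(areas, prices)
predicted = slope * 90.0 + intercept  # predicted price for a 90 square metre house
print(slope, intercept, predicted)
```

Unlike a classifier, the model returns a continuous number, so it can produce a price for an area that never appeared in the training data.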

5.2 Unsupervised learning (unsupervised learning)

Definition: the input data consists of feature values only, with no target values.
In an unsupervised learning problem, the model learns without any guidance. Because there is no guidance, the only thing the model can learn is what the samples themselves contain, such as the similarities and patterns among samples, or the distribution of the population the samples come from.

Unsupervised learning has 3 main characteristics:

Unsupervised learning has no clear objective
Unsupervised learning does not need labeled data
The effect of unsupervised learning is hard to quantify

Unsupervised learning is mainly used for dimensionality reduction and clustering.
Clustering in unsupervised learning:
K-means, often called Lloyd's algorithm, is the most classic data clustering method and also a relatively easy model to understand.

The algorithm runs in 4 stages:
First, randomly choose K points in the feature space as the initial cluster centers.
Then, for each data point's feature vector, find the nearest of the K cluster centers and label the data point with that center.
Next, once every data point has been labeled, recompute each of the K cluster centers as the mean of all samples assigned to it.
Finally, compare the old and new cluster centers. If no data point's assigned cluster has changed since the previous pass, the iteration stops; otherwise, go back to step 2 and continue the loop.
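
The four stages above can be sketched in plain Python as follows. This is an illustrative version under a few assumptions: the initial centers are picked from the data points themselves (one common way to choose random points in the feature space), an empty cluster simply keeps its old center, and the names `kmeans`, `max_iter`, and `seed` are made up for this example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm; returns (centers, assignments)."""
    rng = random.Random(seed)
    # Stage 1: choose K initial centers (here: K distinct data points at random).
    centers = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iter):
        # Stage 2: label each point with the index of its nearest center.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Stage 4: stop when no point changed cluster since the previous pass.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Stage 3: move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, assignments

# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignments = kmeans(points, k=2)
print(centers)
print(assignments)
```

Which group gets index 0 and which gets index 1 depends on the random initialization, so only the grouping itself is meaningful, not the index values.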

Dimensionality reduction in unsupervised learning:

Principal component analysis (PCA) is the most commonly used dimensionality reduction method. The idea of PCA is to project data from a high-dimensional space onto a low-dimensional space so that the data is spread out as much as possible in the low-dimensional space, which preserves most of the information in the data.
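
The projection idea can be sketched in plain Python by finding only the first principal component, the single direction along which the data is most spread out. This is a simplified illustration, not a full PCA: it uses power iteration on the covariance matrix, assumes the starting vector of all ones is not orthogonal to the answer, and uses made-up names and made-up points that lie exactly on the line y = 2x.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the first principal component."""
    n, dims = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - mean[j] for j in range(dims)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dims)]
        for a in range(dims)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # which is the direction of maximum variance.
    vec = [1.0] * dims
    for _ in range(iterations):
        vec = [sum(cov[a][b] * vec[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return mean, vec

def project(rows, mean, direction):
    """Project each centered row onto the direction (2-D to 1-D here)."""
    return [
        sum((r[j] - mean[j]) * direction[j] for j in range(len(mean))) for r in rows
    ]

# Hypothetical 2-D points lying exactly on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 10.0)]
mean, direction = first_principal_component(data)
scores = project(data, mean, direction)
print(direction)
print(scores)
```

Because these points lie on a single line, one coordinate per point is enough to describe them without losing any information. With real data some spread is lost, and PCA keeps the directions that lose the least.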

6. What should we do

What do machine learning and data mining engineers at Internet companies actually work on?
Studying various algorithms and designing sophisticated models?
Applying deep learning and N-layer neural networks?
…
None of the above. In fact, most of the algorithm refinement for complex models is done by data scientists,
while most programmers mainly do the following:

Running data jobs: various map-reduce jobs, hive SQL, and data warehouse grunt work
Data cleaning, data cleaning, data cleaning
Analyzing the business, analyzing cases, and looking for features
Running models with common algorithms

So what matters most for us?
Learn to analyze problems
Master the basic idea of the algorithm , Learn to solve problems with corresponding algorithms
Learn to use simple libraries or frameworks to solve problems

7. How to learn

If you search Baidu for how to learn machine learning, many people recommend first reading every book related to university mathematics, such as advanced mathematics, linear algebra, probability theory and mathematical statistics, and Li Hang's Statistical Learning Methods, like these:

[Figure: covers of the recommended mathematics and statistics books]
Many people start out enthusiastic, then find it boring and hard to stick with, so personally I suggest not doing this. The best way to learn is to get started first, then learn while working through cases and fill in what you lack as you go. This is less boring, and it also deepens your understanding and your ability to apply what you learn.
As we have already said, we must be clear about what matters most to us.
What matters is to master some machine learning algorithms and related skills, and then get into a business area and solve problems there.

Recommended video tutorials:
I won't include external links here, and I'll only mention what I have watched myself. Open Bilibili and search for Ng machine learning; this is the classic original course. Another good one can be found by searching for Machine learning whiteboard derivation. There is also a machine learning course taught by Hu Haoji of Zhejiang University, which you can find by searching for Zhejiang University machine learning.

Recommended books ：
[Figure: covers of the recommended books]
The most classic is the first one, the "watermelon book" (the nickname of Zhou Zhihua's Machine Learning). After finishing the fundamentals, you can look for practical courses on machine learning frameworks. There is a lot to learn about frameworks too: the classic TensorFlow and the rising star PyTorch are both well worth studying, and those who have the energy can look into others such as Caffe, Theano, and Chainer. In fact, being fluent in one or two frameworks is enough. It is just like web development in Python: you can use the large, full-featured Django framework or the small, focused Flask. Don't agonize over the choice. Pick a framework whose complexity fits the time you have and start learning. Our goal is still to run into problems, solve them, and finally complete the task.