Understanding machine learning in one article
2022-07-03 16:17:00 【Tread on the clouds with ice】
1. What is machine learning
We sum up rules from a great deal of daily experience. When we face a new problem, we use those rules to analyze the situation and choose the best strategy.
Machine learning works in a similar way.
So machine learning can be defined as follows:
Definition of machine learning: machine learning automatically derives a model from data and uses that model to make predictions on unknown data.
2. Why machine learning is needed
In the 21st century, machine learning has attracted attention again. Behind this attention is a change in the whole environment: we have more and more data, and hardware is getting more and more powerful. There is an urgent need to free up human labor, find patterns in data automatically, and solve problems in more specialized fields. Machine learning is already widely used in data mining, computer vision, natural language processing, biometrics, search engines, medical diagnosis, credit card fraud detection, stock market analysis, DNA sequencing, speech and handwriting recognition, strategy games, and robotics.
The following is a rough list:
- Data analysis - CRM, marketing analysis, audience research
- Predictive analysis - stock market forecasting, market research, fraud prevention
- Service personalization - recommendation engines, user modeling
- Natural language processing - text generation, text analysis, text translation, chatbots
- Sentiment analysis - audience research, customer service, handling, suggestions
- Computer vision - image recognition, visual search, face recognition
- Speech recognition - AI assistants, speech-to-text, automatic subtitles, and so on.
3. The relationship between machine learning, artificial intelligence, and deep learning
I believe many people have heard these terms but are not sure how they relate to one another. A picture shows it directly:
[Figure: the relationship between artificial intelligence, machine learning, and deep learning]
As the picture shows, AI covers both machine learning and deep learning. More precisely, machine learning is one approach to artificial intelligence, and deep learning is one method of machine learning.
Machine learning includes many different algorithms, such as linear regression, logistic regression, SVM, decision trees, and neural networks. Because machine learning with neural network algorithms is a special case, it has been given its own name: deep learning. So deep learning can also be understood as machine learning that uses neural network algorithms, typically with many layers.
You may still have a question here: what is the definition of AI?
At a conference in the summer of 1956, the pioneers of artificial intelligence dreamed of using the computers that had just appeared to build complex machines with the same essential characteristics as human intelligence. This is what we now call strong artificial intelligence (General AI). Such an all-purpose machine would have all our senses (or even more), all our reason, and would think as we do. Strong AI still exists only in movies and science fiction, because we cannot build it yet. What we can achieve at present is generally called weak artificial intelligence (Narrow AI): technology that performs a specific task as well as a person, or even better. Examples of weak AI in practice include the AlphaGo series of Go programs, face recognition, speech recognition, small voice assistants, and driverless cars.
How should you learn artificial intelligence? The field is very broad and touches every industry. The possible directions are varied and each is different, including data mining, computer vision, natural language processing, and others. So is the material to learn completely different for each direction? No. The core of all of them is machine learning, and you cannot do anything without it. Whichever field you choose, you must lay a solid foundation. The first goal is therefore to work through the major machine learning algorithms and master how to apply them in practice.
4. Steps for developing machine learning applications
(1) Collect data
There are many ways to collect sample data, such as writing a web crawler to extract data from websites, getting information from an RSS feed or an API, or receiving measurements sent by a device.
(2) Prepare the input data
After getting the data, you must also make sure its format meets the requirements.
(3) Analyze the input data
The main purpose of this step is to make sure the data set contains no garbage data. If you are using a trusted data source, you can skip this step.
(4) Train the algorithm
The machine learning algorithm really starts learning at this step. If you use an unsupervised learning algorithm, there is no target variable, so there is no need to train the algorithm; everything related to the algorithm happens in step (5).
(5) Test the algorithm
This step puts to use the knowledge learned in step (4). We also need to evaluate how accurate the results are and retrain the algorithm as needed.
(6) Use the algorithm
Turn the algorithm into an application that performs real tasks, and check whether the steps above work properly in a real environment. If you run into new data problems, repeat the steps above.
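The six steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the hard-coded rows stand in for real collected data, and the "model" is a deliberately simple nearest-mean classifier chosen for brevity. None of the names or values come from the article.

```python
# Hypothetical end-to-end sketch of the six steps. The data and the
# nearest-mean "model" are illustrative assumptions, not a real project.

# (1) Collect data: a hard-coded list stands in for a crawler or an API.
raw_rows = [
    ("1.0", "small"), ("1.2", "small"), ("0.8", "small"), ("1.1", "small"),
    ("3.0", "large"), ("3.2", "large"), ("2.9", "large"), ("3.1", "large"),
    ("", "small"),  # a garbage row with a missing measurement
]

# (2) Prepare the input and (3) analyze it: convert text to numbers and
# drop rows that cannot be parsed.
def prepare(rows):
    clean = []
    for value, label in rows:
        try:
            clean.append((float(value), label))
        except ValueError:
            continue  # skip garbage data
    return clean

data = prepare(raw_rows)

# Hold out part of the data for testing (every fourth row here).
test_set = data[::4]
train_set = [row for i, row in enumerate(data) if i % 4 != 0]

# (4) Train: the "model" is just the mean feature value of each class.
def train(rows):
    sums, counts = {}, {}
    for value, label in rows:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# (6) Use: predict the class whose mean is closest to a new value.
def predict(model, value):
    return min(model, key=lambda label: abs(model[label] - value))

model = train(train_set)

# (5) Test: measure accuracy on the held-out rows.
correct = sum(predict(model, value) == label for value, label in test_set)
accuracy = correct / len(test_set)
print(accuracy, predict(model, 2.7))
```

In a real project each step is far larger, but the shape stays the same: clean the data, keep some of it aside, fit on the rest, and measure on what was kept aside before using the model.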
The steps above are fairly formal. They can also be shown as a flow chart:
[Figure: flow chart of the steps above]
5. Machine learning algorithm classification
5.1 Supervised learning (supervised learning)
Definition: the input data consists of input feature values and target values. The output of the learned function can be a continuous value (called regression) or one of a finite number of discrete values (called classification).
Supervised learning tasks fall into two categories: classification and regression.
Both classification and regression are supervised learning. They have different names only because their output values take different forms.
Classification: assigning things to categories; the prediction is discrete.
For example, recognizing cats and dogs in pictures:
Feature values: pictures of cats/dogs; target: cat/dog (a category)
This is a classification problem.
Common classification algorithms: k-nearest neighbors, Bayesian classification, decision trees and random forests, logistic regression, neural networks, and so on.
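As a small illustration of classification, here is a minimal sketch of the k-nearest neighbors idea. To keep it short, the pictures from the example are replaced by two made-up numeric features per animal; the feature names, the values, and the function name `knn_predict` are all assumptions for illustration only.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training samples by Euclidean distance to the query point
    # and keep the k closest ones.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical features: (weight in kg, ear length in cm), with known labels.
train = [
    ((4.0, 6.0), "cat"), ((3.5, 5.5), "cat"), ((4.5, 6.5), "cat"),
    ((20.0, 10.0), "dog"), ((25.0, 12.0), "dog"), ((18.0, 9.0), "dog"),
]

small_animal = knn_predict(train, (5.0, 6.0))
big_animal = knn_predict(train, (22.0, 11.0))
print(small_animal, big_animal)
```

The output is one of a fixed set of labels, which is what makes this a classification task rather than a regression task.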
Regression: predicting continuous, specific values.
For example, predicting house prices:
Feature values: attributes of the house; target: house price (continuous data)
This is a regression problem.
Common regression algorithms: linear regression, ridge regression, and so on.
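The house price example can be sketched with the simplest form of linear regression: one feature, fitted by ordinary least squares. The areas and prices below are invented for illustration (they lie exactly on a line so the result is easy to check), and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both up to the same factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: floor area in square meters and price in arbitrary
# units, generated from price = 2 * area + 10.
areas = [50.0, 70.0, 90.0, 110.0]
prices = [110.0, 150.0, 190.0, 230.0]

slope, intercept = fit_line(areas, prices)
predicted = slope * 100.0 + intercept  # predicted price for a 100 m^2 house
print(slope, intercept, predicted)
```

Unlike the classification case, the prediction here is a number on a continuous scale, so a new area that never appeared in the data still gets a sensible price.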
5.2 Unsupervised learning (unsupervised learning)
Definition: the input data consists of input feature values only, with no target values.
In an unsupervised learning problem, the model learns without any guidance. Because there is no guidance, the only thing the model can learn is what the samples themselves contain, such as similarities and patterns among samples, or the distribution of the population the samples come from.
Unsupervised learning has 3 main characteristics:
- It has no explicit target
- It does not need labeled data
- Its results are hard to quantify
Unsupervised learning is mainly used for dimensionality reduction and clustering.
Clustering with unsupervised learning:
K-means, often called Lloyd's algorithm, is the most classic data clustering method and also a relatively easy model to understand.
The algorithm runs in 4 stages:
- First, randomly choose K points in the feature space as the initial cluster centers.
- Then, for each data point's feature vector, find the nearest of the K cluster centers and assign the point to that center.
- Next, once every data point has been assigned, recompute each of the K cluster centers as the mean of all samples assigned to it.
- Finally, compare the old and new cluster centers. If no data point's assigned cluster has changed since the last iteration, stop; otherwise go back to the second stage and continue the loop.
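The four stages above can be written as a short pure-Python sketch. Two assumptions are made here that the description does not fix: the initial centers are drawn from the data points themselves (a common choice), and an empty cluster simply keeps its old center. The sample points and the function name `kmeans` are illustrative.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Cluster `points` (tuples of floats) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Stage 1: pick k distinct data points as the initial cluster centers.
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        # Stage 2: assign every point to the index of its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Stage 3: move each center to the mean of the points assigned to it.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                new_centers.append(centers[c])  # empty cluster keeps its center
                continue
            new_centers.append(
                tuple(sum(dim) / len(members) for dim in zip(*members)))
        # Stage 4: stop once the centers no longer move.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

On data this cleanly separated the result does not depend on the random start, but in general K-means can settle on different clusters for different initial centers, so it is often run several times.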
Dimensionality reduction with unsupervised learning:
PCA (principal component analysis) is the most commonly used dimensionality reduction method. The idea of PCA is to project data from a high-dimensional space onto a low-dimensional one so that the projected data is as spread out as possible (has the largest variance), which preserves most of the information in the data.
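To make the projection idea concrete, the sketch below finds only the first principal component, using power iteration on the covariance matrix, and projects 2-D points onto it. Power iteration is one simple way to get the dominant direction and is an assumption of this sketch; library implementations typically use a full eigendecomposition or SVD. The data is idealized (it lies exactly on the line y = 2x), and the function name is hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return the direction of largest variance and the 1-D projections."""
    n, d = len(rows), len(rows[0])
    # Center the data so that each feature has mean zero.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize. This
    # assumes the data is not constant and that the starting vector is not
    # orthogonal to the top eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered point onto the principal direction.
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

# Idealized 2-D data lying exactly on the line y = 2x.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projected = first_principal_component(rows)
print(direction, projected)
```

Here two coordinates per point are reduced to one without losing anything, because all the variation lies along a single direction. With real data some variance is lost, and PCA picks the direction that loses the least.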
6. What should we do
What does the work of a machine learning or data mining engineer at an Internet company consist of?
Studying all kinds of algorithms and designing sophisticated models?
Applying deep learning with N-layer neural networks?
…
None of these. In fact, most of the refinement of complex models and algorithms is done by data scientists,
while most programmers mainly do the following:
Running data jobs: all kinds of map-reduce, Hive SQL, and data warehouse grunt work
Data cleaning, data cleaning, data cleaning
Analyzing the business, analyzing cases, looking for features
Running models with common algorithms
So what matters most for us?
Learning to analyze problems
Mastering the basic ideas behind the algorithms and learning to solve problems with the right one
Learning to use simple libraries or frameworks to solve problems
7. How to learn
If you search Baidu for how to learn machine learning, many people recommend first reading every book related to college mathematics, such as advanced mathematics, linear algebra, probability theory and mathematical statistics, and Li Hang's Statistical Learning Methods.
Many people start out enthusiastic, then find it boring and hard to keep up. Personally, I suggest not doing this. The best way to learn is to get started first, then learn while working through cases and fill in whatever you lack. That way it is not boring, and it deepens both understanding and application.
As said above, we must be clear about what matters most to us: mastering some machine learning algorithms and related skills, and applying them to solve problems in a business area.
Recommended video tutorials:
There are no external links here; I will just mention what I have watched. On Bilibili you can search directly for "Ng machine learning", which is the classic original. Another good one can be found by searching "Machine learning whiteboard derivation". There is also the machine learning course taught by Hu Haoji of Zhejiang University; search "Zhejiang University machine learning" to find it.
Recommended books:
The most classic starting point is the Watermelon Book (Zhou Zhihua's Machine Learning). After finishing the basics, you can look for practical courses on a machine learning framework. There is also a lot to learn about frameworks: the classic TensorFlow and the newer PyTorch are both worth studying properly. Those with energy to spare can look at others such as Caffe, Theano, and Chainer, but being fluent in one or two frameworks is enough. It is like web development in Python: you can use the large, full-featured Django or the small, lean Flask. Do not agonize over the choice. Pick a framework whose complexity fits the time you have, and learn it first. The goal is still to run into problems, solve them, and finally complete the task.