Ordinary algorithm interviews are out! The machine learning algorithm interview is in - KDnuggets
2020-11-06 01:20:00 [On jdon]
If ordinary algorithms are must-know material for ordinary programmers, are the more practical machine learning algorithms must-know material too? Or are they required knowledge only for data scientists?
When preparing for a data science interview, it is essential to understand the various machine learning models well enough to give a brief description of each one on the spot. Here, we summarize the key points of each machine learning model to help you communicate these complex models clearly.
Linear regression
Linear regression uses the least squares method to find the "line of best fit". The least squares method finds the linear equation that minimizes the sum of squared residuals, where a residual equals the actual value minus the predicted value.
For instance, in the figure below, the red line is a better fit than the green line because it is closer to the points, so its residuals are smaller.
Image by author.
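To make this concrete, here is a minimal sketch of least-squares fitting with scikit-learn; the toy data values are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: one feature, five samples (values invented for illustration).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.1, 2.3, 2.9, 4.2, 4.8])

model = LinearRegression().fit(X, y)

# Residual = actual value minus predicted value.
residuals = y - model.predict(X)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("sum of squared residuals:", np.sum(residuals ** 2))
```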
Ridge regression
Ridge regression, also known as L2 regularization, is a regression technique that introduces a small amount of bias to reduce overfitting. It does this by minimizing the sum of squared residuals plus a penalty, where the penalty equals lambda times the slope squared. Lambda refers to the severity of the penalty.
Image by author.
Without the penalty, the line of best fit has a steeper slope, which means it is more sensitive to small changes in X. By introducing the penalty, the line of best fit becomes less sensitive to small changes in X. This is the idea behind ridge regression.
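As a minimal sketch (toy data invented), scikit-learn's Ridge exposes the penalty strength as alpha, which plays the role of lambda here; a larger alpha shrinks the slope:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.1, 2.3, 2.9, 4.2, 4.8])

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # alpha plays the role of lambda

# The penalized slope is pulled toward zero, making the line
# less sensitive to small changes in X.
print("OLS slope:  ", ols.coef_[0])
print("Ridge slope:", ridge.coef_[0])
```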
Lasso regression
Lasso regression, also known as L1 regularization, is similar to ridge regression. The only difference is that the penalty is calculated with the absolute value of the slope instead.
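A minimal sketch contrasting the two penalties, on invented toy data where the second feature is pure noise; the L1 (absolute value) penalty can shrink a coefficient all the way to zero, while L2 only shrinks it toward zero:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Toy data with two features; the second is pure noise (invented values).
X = np.array([[1.0, 0.3], [2.0, -0.2], [3.0, 0.1], [4.0, -0.4], [5.0, 0.2]])
y = np.array([1.1, 2.3, 2.9, 4.2, 4.8])

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

# Lasso's L1 penalty can zero out the noise coefficient entirely.
print("Lasso coefficients:", lasso.coef_)
print("Ridge coefficients:", ridge.coef_)
```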
Logistic regression
Logistic regression is a classification technique that also finds a "line of best fit". However, unlike linear regression, where the line of best fit is found using least squares, logistic regression finds the best fit (a logistic curve) using maximum likelihood, because the y value can only be 1 or 0. Watch the StatQuest video to learn how maximum likelihood is calculated.
Image by author.
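A minimal sketch with invented data; scikit-learn's LogisticRegression fits the curve by maximizing a (regularized) likelihood rather than by least squares:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: one feature, binary labels (values invented for illustration).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# The fitted logistic curve outputs a probability between 0 and 1.
print("P(y=1 | x=3.5):", clf.predict_proba([[3.5]])[0, 1])
print("predicted class:", clf.predict([[3.5]])[0])
```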
K-nearest neighbors
K-nearest neighbors is a classification technique that classifies a new sample by looking at the classifications of the nearest points, hence "K nearest". In the example below, if k = 1, the unclassified point would be classified as a blue point.
Image by author.
If the value of k is too low, the result can be skewed by outliers. However, if it is too high, classes with only a few samples may be overlooked.
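A minimal sketch with invented 2-D points, showing how the prediction is made for different values of k:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-D points in two classes (invented for illustration).
X = np.array([[1, 1], [1, 2], [2, 1],      # class 0 ("blue")
              [6, 6], [6, 7], [7, 6]])     # class 1 ("red")
y = np.array([0, 0, 0, 1, 1, 1])

new_point = [[2, 2]]

# With k = 1 the new point simply takes the class of its nearest neighbor.
for k in (1, 3, 5):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(f"k={k} ->", knn.predict(new_point)[0])
```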
Naive Bayes
The Naive Bayes classifier is a classification technique based on Bayes' theorem, which states the following equation:

P(y | X) = P(X | y) * P(y) / P(X)

Because of the naive assumption (hence the name) that the variables are independent given the class, P(X | y) can be rewritten as follows:

P(X | y) = P(x1 | y) * P(x2 | y) * ... * P(xn | y)

Also, because we are solving for y, P(X) is a constant, which means we can remove it from the equation and introduce proportionality:

P(y | X) ∝ P(y) * P(x1 | y) * ... * P(xn | y)

Therefore, the probability of each value of y is calculated as the product of the conditional probabilities of the xn given y.
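To ground the proportional form above, here is a minimal from-scratch sketch on an invented categorical dataset (no smoothing, so a zero count zeroes out the whole product; real implementations such as scikit-learn's naive Bayes classifiers handle this):

```python
# Tiny categorical dataset (invented): each row is (x1, x2), label y.
X = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "cool"), ("rainy", "hot")]
y = ["no", "yes", "yes", "no"]

def naive_bayes_score(sample, label):
    """P(y) times the product of P(x_i | y) -- the proportional form above."""
    idx = [i for i, lab in enumerate(y) if lab == label]
    prior = len(idx) / len(y)                       # P(y)
    likelihood = 1.0
    for feature_pos, value in enumerate(sample):
        matches = sum(1 for i in idx if X[i][feature_pos] == value)
        likelihood *= matches / len(idx)            # P(x_i | y)
    return prior * likelihood

sample = ("sunny", "cool")
scores = {label: naive_bayes_score(sample, label) for label in set(y)}
print(scores, "->", max(scores, key=scores.get))
```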
Support vector machine
A support vector machine is a classification technique that finds an optimal boundary, called a hyperplane, which is used to separate the different classes. The hyperplane is found by maximizing the margin between the classes.
Image by author.
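A minimal sketch with two invented, linearly separable clusters; a linear-kernel SVC finds the maximum-margin hyperplane, and a large C keeps the margin hard:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters (invented for illustration).
X = np.array([[1, 1], [1, 2], [2, 1],
              [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C on separable data approximates a hard-margin SVM.
svm = SVC(kernel="linear", C=1e6).fit(X, y)

# The support vectors are the points that define the margin.
print("support vectors:\n", svm.support_vectors_)
print("prediction for (3, 3):", svm.predict([[3, 3]])[0])
```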
Decision tree
A decision tree is essentially a series of conditional statements that determine the path a sample takes until it reaches the bottom of the tree. Decision trees are intuitive and easy to build, but on their own they are often inaccurate.
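A minimal sketch on invented data; export_text prints the fitted tree as exactly the kind of series of conditional statements described above:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data (invented): features = [age, income], label = bought (0/1).
X = [[25, 30], [35, 60], [45, 80], [20, 20], [50, 90], [30, 40]]
y = [0, 1, 1, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# Print the tree as a series of if/else conditions.
print(export_text(tree, feature_names=["age", "income"]))
```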
Random forests
Random forest is an ensemble technique, meaning it combines multiple models into one to improve its predictive power. Specifically, it builds thousands of smaller decision trees using bootstrapped datasets and random subsets of the variables (also known as bagging). With thousands of smaller decision trees, a random forest uses a "majority wins" model to determine the value of the target variable.
For example, if we created one decision tree, say the third one, it would predict 0. But if we relied on the majority vote of all 4 decision trees, the predicted value would be 1. This is the power of random forests.
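A minimal sketch on the same invented data as the decision-tree example, wiring up the pieces named above: bootstrapped samples, random feature subsets, and a majority vote:

```python
from sklearn.ensemble import RandomForestClassifier

# Same toy data as the decision-tree sketch (invented).
X = [[25, 30], [35, 60], [45, 80], [20, 20], [50, 90], [30, 40]]
y = [0, 1, 1, 0, 1, 0]

forest = RandomForestClassifier(
    n_estimators=1000,      # many smaller decision trees
    max_features="sqrt",    # random subset of variables at each split
    bootstrap=True,         # each tree sees a bootstrapped dataset
    random_state=0,
).fit(X, y)

# The forest's prediction is the majority vote of its trees.
print("prediction for [40, 70]:", forest.predict([[40, 70]])[0])
```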
AdaBoost
AdaBoost is a boosting algorithm, similar to random forests, but with a few important differences (a brief sketch follows the list below):
- Rather than a forest of trees, AdaBoost typically builds a forest of stumps (a stump is a tree with only one node and two leaves).
- Each stump's decision carries a different weight in the final decision. Stumps with a lower total error (higher accuracy) have a bigger say.
- The order in which the stumps are created matters, because each subsequent stump emphasizes the importance of the samples that the previous stump misclassified.
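A minimal sketch on invented data; scikit-learn's AdaBoostClassifier uses depth-1 trees (stumps) as its default base learner, and estimator_weights_ shows that different stumps get a different say in the final vote:

```python
from sklearn.ensemble import AdaBoostClassifier

# Toy data (invented for illustration).
X = [[25, 30], [35, 60], [45, 80], [20, 20], [50, 90], [30, 40]]
y = [0, 1, 1, 0, 1, 0]

# Default base learner is a depth-1 decision tree, i.e. a stump.
# Each successive stump is fit with more weight on the samples the
# previous stumps misclassified.
ada = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)

print("first five stump weights:", ada.estimator_weights_[:5])
print("prediction for [40, 70]:", ada.predict([[40, 70]])[0])
```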
Gradient Boost
Gradient Boost is similar to AdaBoost in that it builds multiple trees, each one built from the previous. Unlike AdaBoost, which builds stumps, Gradient Boost builds trees that usually have 8 to 32 leaves.
More importantly, Gradient Boost differs from AdaBoost in how the decision trees are built. Gradient Boost starts with an initial prediction, usually the mean. Then, a decision tree is built on the residuals of the samples. A new prediction is made by taking the initial prediction + the learning rate times the output of the residual tree, and the process is repeated.
XGBoost
XGBoost is essentially the same as Gradient Boost, but the main difference is how the residual trees are built. With XGBoost, the residual trees are built by calculating similarity scores between the leaves and the preceding nodes, to determine which variables are used as the roots and the nodes.
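A minimal usage sketch, assuming the third-party xgboost package is installed (pip install xgboost); the data and parameter values are invented for illustration:

```python
import numpy as np
from xgboost import XGBRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([1.3, 1.9, 3.2, 3.8, 5.1, 5.9])

model = XGBRegressor(
    n_estimators=100,     # number of residual trees
    learning_rate=0.1,
    max_depth=3,
    reg_lambda=1.0,       # regularization term used in the similarity scores
)
model.fit(X, y)

print("prediction for x=3.5:", model.predict(np.array([[3.5]]))[0])
```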