Classic algorithm interviews are out! The machine learning algorithm interview - KDnuggets
2020-11-06 01:20:00 [On jdon]
If classic algorithms are essential knowledge for the everyday programmer, what about the more practical machine learning algorithms? Are they essential knowledge for data scientists?
When preparing for a data science interview, it is essential to have a clear understanding of the various machine learning models and to be able to give a brief description of any of them on demand. Here, we summarize the various machine learning models by highlighting their main points, to help you communicate these complex models clearly.
Linear regression
Linear regression involves using the least squares method to find a "best fit line." The least squares method finds the linear equation that minimizes the sum of squared residuals, where a residual is equal to the actual value minus the predicted value.
For example, the red line below is a better fit than the green line because it is closer to the points, and thus the residuals are smaller.
Image created by the author.
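To make the idea concrete, here is a minimal NumPy sketch of least squares on made-up data points:

```python
import numpy as np

# Hypothetical toy data: one feature x and a target y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# Least squares picks the slope and intercept that minimize
# the sum of squared residuals (residual = actual - predicted)
slope, intercept = np.polyfit(x, y, deg=1)

residuals = y - (slope * x + intercept)
print(slope, intercept, (residuals ** 2).sum())
```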
Ridge regression
Ridge regression (also known as L2 regularization) is a regression technique that introduces a small amount of bias to reduce overfitting. It works by minimizing the sum of squared residuals plus a penalty, where the penalty equals lambda times the slope squared. Lambda represents the severity of the penalty.
Image created by the author.
Without the penalty, the slope of the best fit line is steeper, meaning it is more sensitive to small changes in X. By introducing the penalty, the best fit line becomes less sensitive to X. That is the idea behind Ridge regression.
Lasso regression
Lasso regression, also known as L1 regularization, is similar to Ridge regression. The only difference is that the penalty is calculated using the absolute value of the slope instead. A minimal sketch of both follows.
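As an illustration, here is a minimal scikit-learn sketch of both penalties; alpha plays the role of lambda, and the data points are made up:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Hypothetical toy data: one feature and a target
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# alpha is the lambda above: the larger it is, the harsher the penalty
ridge = Ridge(alpha=1.0).fit(X, y)   # L2: penalizes the squared slope
lasso = Lasso(alpha=0.1).fit(X, y)   # L1: penalizes the absolute slope

print(ridge.coef_, lasso.coef_)      # both slopes are shrunk toward zero
```

Increasing alpha shrinks the slope further, which is exactly what makes the fitted line less sensitive to X.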
Logistic regression
Logistic regression is a classification technique that also finds a "best fit line." However, unlike linear regression, where the best fit line is found using least squares, logistic regression finds the best fit (a logistic curve) using maximum likelihood, because the y value can only be 1 or 0. Watch StatQuest's video to learn how the maximum likelihood is calculated.
Image created by the author.
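As a minimal sketch, assuming scikit-learn (whose solver maximizes the likelihood internally) and made-up data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: y can only be 0 or 1
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Fit by maximum likelihood rather than least squares
model = LogisticRegression().fit(X, y)
print(model.predict_proba([[2.0]]))  # class probabilities from the logistic curve
```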
K-nearest neighbors
K-nearest neighbors is a classification technique that classifies a new sample by looking at the classes of the nearest classified points, hence "K-nearest." In the example below, if k = 1, the unclassified point would be classified as a blue point.
Image created by the author.
If the value of k is too small, outliers can sway the result. If it is too large, however, classes with only a few samples may be overlooked.
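A minimal scikit-learn sketch of the k = 1 case described above, with hypothetical points and labels:

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical points: three 'blue' and three 'red'
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = ['blue', 'blue', 'blue', 'red', 'red', 'red']

# With k = 1, a new sample takes the class of its single nearest neighbor
model = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(model.predict([[0.5, 0.5]]))  # -> ['blue']
```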
Naive Bayes
The Naive Bayes classifier is a classification technique inspired by Bayes' theorem, which states the following equation:

P(y|X) = P(X|y) * P(y) / P(X)

Because of the naive assumption (hence the name) that the variables are independent given the class, P(X|y) can be rewritten as follows:

P(X|y) = P(x1|y) * P(x2|y) * ... * P(xn|y)

Also, since we are solving for y, P(X) is a constant, which means we can remove it from the equation and introduce a proportionality:

P(y|X) ∝ P(y) * P(x1|y) * ... * P(xn|y)

Therefore, the probability of each value of y is calculated as the product of the conditional probabilities of the features x_n given y.
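As a sketch, scikit-learn's GaussianNB applies this same product of per-feature conditional probabilities; the data below is made up:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical two-feature samples from two classes
X = np.array([[1.0, 2.0], [1.2, 1.8], [4.0, 4.2], [3.8, 4.1]])
y = np.array([0, 0, 1, 1])

# GaussianNB multiplies the per-feature P(x_n | y) terms,
# i.e. the naive independence product described above
model = GaussianNB().fit(X, y)
print(model.predict_proba([[1.1, 1.9]]))  # posterior probability for each y
```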
Support vector machine
A support vector machine is a classification technique that finds an optimal boundary, called a hyperplane, which is used to separate the different classes. The hyperplane is found by maximizing the margin between the classes.
Image created by the author.
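A minimal sketch using scikit-learn's SVC with a linear kernel on made-up, linearly separable points:

```python
from sklearn.svm import SVC

# Hypothetical, linearly separable data
X = [[0, 0], [1, 1], [4, 4], [5, 5]]
y = [0, 0, 1, 1]

# A linear kernel finds the hyperplane that maximizes the margin
model = SVC(kernel='linear').fit(X, y)
print(model.support_vectors_)  # the points that define the margin
```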
Decision tree
A decision tree is essentially a series of conditional statements that determine the path a sample follows until it reaches the bottom. They are intuitive and easy to build, but tend to be inaccurate on their own.
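To see those conditional statements directly, a small scikit-learn tree can be printed as nested rules (toy data made up here):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data
X = [[0, 0], [1, 1], [1, 0], [0, 1]]
y = [0, 1, 1, 0]

model = DecisionTreeClassifier().fit(X, y)
print(export_text(model))  # the tree, rendered as nested if/else conditions
```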
Random forests
Random forest is an ensemble technique, meaning it combines multiple models into one to improve predictive power. Specifically, it builds thousands of smaller decision trees using bootstrapped datasets and random subsets of the variables (also known as bagging). With thousands of smaller decision trees, a random forest uses a "majority wins" model to determine the value of the target variable.
For example, if we created one decision tree, the third one, it would predict 0. But if we relied on all 4 decision trees, the predicted value would be 1. This is the power of random forests.
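A minimal scikit-learn sketch of the majority vote, again on hypothetical data:

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data
X = [[0, 0], [1, 1], [1, 0], [0, 1], [2, 2], [3, 3]]
y = [0, 1, 1, 0, 1, 1]

# Each tree is trained on a bootstrapped sample with a random subset
# of the features (bagging); at predict time the majority class wins
model = RandomForestClassifier(n_estimators=100).fit(X, y)
print(model.predict([[2, 2]]))
```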
AdaBoost
AdaBoost is a boosting algorithm, similar to random forests, but with a few important differences (a code sketch follows the list):
- Rather than a forest of trees, AdaBoost typically builds a forest of stumps (a stump is a tree with only one node and two leaves).
- Each stump's decision carries a different weight in the final decision; stumps with a lower total error (higher accuracy) have a greater say.
- The order in which the stumps are created matters, because each subsequent stump emphasizes the importance of the samples that the previous stump misclassified.
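A minimal sketch with scikit-learn, forcing each weak learner to be a stump via max_depth=1; the data is made up, and note that scikit-learn versions before 1.2 use the keyword base_estimator instead of estimator:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data
X = [[0, 0], [1, 1], [1, 0], [0, 1], [2, 2], [3, 3]]
y = [0, 1, 1, 0, 1, 1]

# max_depth=1 makes each weak learner a stump: one split, two leaves
stump = DecisionTreeClassifier(max_depth=1)
model = AdaBoostClassifier(estimator=stump, n_estimators=50).fit(X, y)

print(model.estimator_weights_[:5])  # more accurate stumps get a bigger say
```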
Gradient Boost
Gradient Boost is similar to AdaBoost in that it builds multiple trees, each of which is built from the previous one. Unlike AdaBoost, which builds stumps, Gradient Boost builds trees that usually have 8 to 32 leaves.
More importantly, Gradient Boost differs from AdaBoost in the way the decision trees are constructed. Gradient Boost starts with an initial prediction, usually the average. Then, a decision tree is built based on the residuals of the samples. A new prediction is made by taking the initial prediction plus the learning rate times the output of the residual tree, and the process is repeated.
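The loop below is a hand-rolled sketch of that process for regression with squared error, assuming scikit-learn's DecisionTreeRegressor and made-up data:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

learning_rate = 0.1

# The initial prediction is the average of the target
prediction = np.full_like(y, y.mean())

for _ in range(100):
    residuals = y - prediction                      # what is still unexplained
    tree = DecisionTreeRegressor(max_leaf_nodes=8)  # 8 to 32 leaves is typical
    tree.fit(X, residuals)
    # new prediction = old prediction + learning rate * residual tree output
    prediction += learning_rate * tree.predict(X)

print(prediction)  # should now sit close to y
```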
XGBoost
XGBoost is essentially the same as Gradient Boost, but the main difference is how the residual trees are built. With XGBoost, the residual trees are built by calculating similarity scores between the leaves and the preceding node to determine which variables are used as the root and the nodes.
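A minimal sketch, assuming the separate xgboost package is installed (pip install xgboost); the parameters and data here are only illustrative:

```python
import numpy as np
from xgboost import XGBRegressor

# Hypothetical toy data
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

# The same boosting recipe as above; internally the residual trees
# are grown using XGBoost's similarity-score-based splitting
model = XGBRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
model.fit(X, y)
print(model.predict(X))
```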