Understanding of maximum likelihood estimation, gradient descent, linear regression and logistic regression
2022-07-28 07:12:00 【The most beautiful wish must be the craziest】
Maximum likelihood
My understanding of using maximum likelihood to estimate conditional probabilities (posterior probabilities) and prior probabilities: suppose an experiment has two possible outcomes, A or B.
The experiment is run 50 times in total; A occurs 20 times and B occurs 30 times. Find the probability p of A.
The question is how to find a reasonable value for p.
Let L denote the probability of observing this particular sequence of 50 outcomes, given that the probability of A is p:
L(p) = p^(x1+x2+...+x50) · (1-p)^(50-(x1+x2+...+x50))
This is easy to understand: xi is 1 if A occurs in the i-th trial and 0 if it does not, so with 20 occurrences of A and 30 of B, x1+x2+...+x50=20.
Therefore:
L(p) = p^20 · (1-p)^30
When L is at its maximum, the observed sample is the most likely one to occur, and the corresponding probability of A is the most reasonable. If L is very small, the observed sample would be an extreme event under that p, so that value for the probability of A is unreasonable.
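As a quick numerical check of this idea, the sketch below evaluates the likelihood on a grid of candidate values of p and picks the one that makes the observed 20 A's and 30 B's most likely. The function name `likelihood` and the grid spacing of 0.01 are assumptions made for illustration only.

```python
def likelihood(p, m=20, n=50):
    """Probability of one specific sequence with m A's in n independent trials."""
    return p ** m * (1 - p) ** (n - m)

# Candidate values 0.01, 0.02, ..., 0.99 for the probability of A.
candidates = [k / 100 for k in range(1, 100)]

# The candidate under which the observed sample is most likely.
best_p = max(candidates, key=likelihood)

print(best_p)
# An extreme candidate such as p = 0.9 makes the observed sample far less likely.
print(likelihood(0.9) < likelihood(best_p))
```

A grid search is used here only because it mirrors the text's argument directly; it is not how the estimate is normally computed.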
When the number of trials is N and A occurs m times, the maximum likelihood estimate of the prior probability of A is m/N.
Likewise, when the experiment has multiple categories A, B, C, ..., set the probability of A to p and the combined probability of all other categories to (1-p); the same reasoning again gives m/N as the estimate of the prior probability of A.
Estimating a conditional probability by maximum likelihood works the same way, except that N is the total number of samples that satisfy the given condition and m is the number of times A occurs under that condition.
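The m/N result can be derived in a few lines. The sketch below assumes the N trials are independent and uses the standard trick of maximizing the logarithm of L, which has its maximum at the same p as L itself.

```latex
\begin{aligned}
L(p) &= p^{m}(1-p)^{N-m} \\
\ln L(p) &= m \ln p + (N-m)\ln(1-p) \\
\frac{d \ln L}{dp} &= \frac{m}{p} - \frac{N-m}{1-p} = 0 \\
m(1-p) &= (N-m)\,p \quad\Longrightarrow\quad \hat{p} = \frac{m}{N}
\end{aligned}
```

With N = 50 and m = 20 this gives p = 0.4.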
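In code, both estimates reduce to counting. The sketch below uses made-up records of the form (condition, outcome); the condition names "sunny" and "rainy" and the helper names are hypothetical and chosen only for illustration.

```python
# Hypothetical records: (condition, outcome). The values are invented for this example.
records = [
    ("sunny", "A"), ("sunny", "A"), ("sunny", "B"), ("sunny", "A"),
    ("rainy", "B"), ("rainy", "B"), ("rainy", "A"), ("rainy", "B"),
    ("rainy", "B"), ("sunny", "B"),
]

def mle_prior(records, outcome):
    """Estimate P(outcome) as m/N, counting over all samples."""
    m = sum(1 for _, o in records if o == outcome)
    return m / len(records)

def mle_conditional(records, outcome, condition):
    """Estimate P(outcome | condition) as m/N, counting only samples with that condition."""
    subset = [o for c, o in records if c == condition]
    m = sum(1 for o in subset if o == outcome)
    return m / len(subset)

print(mle_prior(records, "A"))
print(mle_conditional(records, "A", "sunny"))
print(mle_conditional(records, "A", "rainy"))
```

The only difference between the two functions is which samples are counted in N, which is exactly the distinction the text draws.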
Gradient descent
First, a note on the relationship between matrices and vectors:
a vector is just a special matrix with n rows and 1 column.

Let's look at the definition of the gradient from a mathematical point of view. In calculus, take the partial derivative ∂ of a multivariate function with respect to each of its parameters and write those partial derivatives together as a vector: that vector is the gradient. Writing it as a vector simply means writing it in matrix form. For example, for the function f(x,y), taking the partial derivatives with respect to x and y gives the gradient vector (∂f/∂x, ∂f/∂y)T, abbreviated grad f(x,y) or ∇f(x,y).

Geometrically, the gradient points in the direction in which the function increases fastest. Specifically, for the function f(x,y) at the point (x0,y0), the direction of the gradient vector (∂f/∂x0, ∂f/∂y0)T is the direction in which f(x,y) grows fastest. In other words, moving along the gradient vector makes it easier to find a maximum of the function. Conversely, in the opposite direction, -(∂f/∂x0, ∂f/∂y0)T, the function decreases fastest, so it is easier to find a minimum of the function. (Just remember this sentence and don't ask why, which only adds to your troubles; it cannot be explained in a few words. In short, the opposite direction of the gradient is the direction of fastest descent.)
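The claim can at least be checked numerically. The underlying reason is that the rate of change of f along a unit direction u is the dot product ∇f·u, which is largest when u points along the gradient and most negative when u points against it. The sketch below approximates the gradient with central differences and then compares small steps in 24 directions; the function f and the step sizes are assumptions chosen for illustration.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x ** 2 + 3 * y ** 2

def numerical_gradient(func, x, y, h=1e-6):
    """Approximate (df/dx, df/dy) at (x, y) with central differences."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy

x0, y0 = 1.0, 1.0
gx, gy = numerical_gradient(f, x0, y0)
norm = math.hypot(gx, gy)
ux, uy = gx / norm, gy / norm  # unit vector along the gradient

step = 1e-3

def change_along(angle_deg):
    """Change in f after a small step in the direction rotated angle_deg away from the gradient."""
    a = math.radians(angle_deg)
    dx = ux * math.cos(a) - uy * math.sin(a)
    dy = ux * math.sin(a) + uy * math.cos(a)
    return f(x0 + step * dx, y0 + step * dy) - f(x0, y0)

changes = {angle: change_along(angle) for angle in range(0, 360, 15)}
steepest_ascent = max(changes, key=changes.get)   # angle with the largest increase
steepest_descent = min(changes, key=changes.get)  # angle with the largest decrease

print(steepest_ascent, steepest_descent)
```

An angle of 0 degrees means "along the gradient" and 180 degrees means "directly against it", so the two printed angles show which directions win.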
Solution of gradient descent
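A minimal sketch of the procedure: start from an initial point and repeatedly take a small step against the gradient. The example function, the starting point, the learning rate of 0.1 and the fixed number of steps are all assumptions made for illustration; practical implementations usually add a stopping criterion.

```python
def grad_f(x, y):
    # Analytic gradient of the hypothetical function f(x, y) = (x - 1)^2 + 2 * (y + 2)^2,
    # whose minimum is at (1, -2).
    return 2 * (x - 1), 4 * (y + 2)

def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeat the update: point <- point - learning_rate * gradient(point)."""
    x, y = start
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y

x_min, y_min = gradient_descent(grad_f, start=(5.0, 5.0))
print(x_min, y_min)
```

If the learning rate is too large the iterates can overshoot and diverge; if it is too small, convergence is slow.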

Linear regression
Gradient descent from the perspective of linear regression
Suppose there are n samples X1={x11,x12,x13,...,x1i:y1}, X2={x21,x22,x23,...,x2i:y2}, ......, Xn={xn1,xn2,xn3,...,xni:yn}
For a linear regression model over these i-dimensional feature vectors
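As a sketch of how gradient descent fits such a model, the code below assumes the usual linear form h(x) = w1*x1 + ... + wi*xi + b and the mean squared error as the loss. The parameter names `weights` and `bias`, the learning rate, and the data are assumptions for illustration and are not taken from the original formulas.

```python
def predict(weights, bias, features):
    """Linear model: h(x) = w1*x1 + ... + wi*xi + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def fit_linear_regression(samples, targets, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the loss (1 / 2n) * sum((h(x) - y)^2)."""
    n = len(samples)
    dim = len(samples[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        # Prediction error for every sample under the current parameters.
        errors = [predict(weights, bias, x) - y for x, y in zip(samples, targets)]
        # Partial derivatives of the loss with respect to each weight and the bias.
        grad_w = [sum(e * x[j] for e, x in zip(errors, samples)) / n for j in range(dim)]
        grad_b = sum(errors) / n
        # Step against the gradient.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias

# Made-up, noise-free data generated from y = 2*x1 - x2 + 0.5.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
targets = [2 * x1 - x2 + 0.5 for x1, x2 in samples]

weights, bias = fit_linear_regression(samples, targets)
print(weights, bias)
```

Because the data contain no noise, the fitted parameters should approach the coefficients used to generate them.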


Logistic regression
The difference between logistic regression and linear regression is that the dependent variable y in linear regression is continuous, so the model predicts continuous values (such as house price, age, or temperature). Logistic regression addresses 0-1 classification problems (such as deciding whether a person is male or female) and is commonly used for binary classification.
Suppose our samples are {Xn:yn}, where yn is 0 or 1 and Xn is an i-dimensional feature vector:
X1={x11,x12,x13,...,x1i:y1}, X2={x21,x22,x23,...,x2i:y2}, ......, Xn={xn1,xn2,xn3,...,xni:yn}
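The pieces above come together here: a common formulation models P(y = 1 | x) as sigmoid(w·x + b), writes the likelihood of the 0/1 labels, and maximizes it by running gradient descent on the negative log-likelihood. The sketch below follows that standard formulation; the names, the learning rate, and the data are assumptions for illustration and may differ from the formulas in the original figures.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Model P(y = 1 | x) as sigmoid(w . x + b)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def fit_logistic_regression(samples, labels, learning_rate=0.5, epochs=5000):
    """Gradient descent on the average negative log-likelihood of the labels."""
    n = len(samples)
    dim = len(samples[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        # For this loss, the gradient has the same form as in linear regression:
        # (predicted probability - label) times the feature value.
        errors = [predict_proba(weights, bias, x) - y for x, y in zip(samples, labels)]
        grad_w = [sum(e * x[j] for e, x in zip(errors, samples)) / n for j in range(dim)]
        grad_b = sum(errors) / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias

# Made-up, linearly separable data: the label is 1 when x1 + x2 is large.
samples = [[0.0, 0.5], [0.5, 0.0], [1.0, 0.5], [0.5, 1.0],
           [2.0, 1.5], [1.5, 2.0], [2.5, 2.0], [2.0, 2.5]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic_regression(samples, labels)
predictions = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in samples]
print(predictions)
```

The 0.5 threshold used to turn probabilities into class labels is a conventional choice, not something the model dictates.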


