Understanding of maximum likelihood estimation, gradient descent, linear regression and logistic regression
2022-07-28 07:12:00 【The most beautiful wish must be the craziest】
Maximum likelihood
Understanding maximum likelihood estimation of a conditional probability (posterior probability) and of a prior probability: suppose an experiment has two possible outcomes, A or B.
In 50 trials in total, A occurs 20 times and B occurs 30 times. We want to find the probability p of A.
The question is: how do we find a reasonable value for p?
Let L denote the probability of observing exactly this outcome when A occurs with probability p and 50 trials are conducted.
L = p^x1 · (1-p)^(1-x1) · p^x2 · (1-p)^(1-x2) · ... · p^x50 · (1-p)^(1-x50)

This is easy to understand: let xk = 1 if trial k produces A and xk = 0 if it produces B. With 20 occurrences of A and 30 of B, x1 + x2 + ... + x50 = 20.
Therefore:

L = p^20 · (1-p)^30. Maximizing ln L = 20·ln p + 30·ln(1-p) by setting d(ln L)/dp = 20/p - 30/(1-p) = 0 gives p = 20/50 = 2/5.
When L is maximal, the observed samples are the most likely, and the corresponding value of p for A is the most reasonable. If L is very small, the observed samples would be a very unlikely outcome, so that value of p for A would be very unreasonable.
When the number of trials becomes N and the number of occurrences of A becomes m, the maximum likelihood estimate of the prior probability of A is m/N.
Similarly, when the experiment has several categories A, B, C, ..., we can set the probability of A to p and the combined probability of all other categories to (1-p); the same derivation shows that the prior probability of A is again m/N.
Estimating a conditional probability by maximum likelihood works in exactly the same way, except that N is the total number of samples under the given condition and m is the number of times A occurs under that condition.
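To make the example concrete, here is a minimal Python sketch (my own illustration; the grid search and function name are not from the original) that evaluates L(p) for the 50-trial, 20-occurrence example above and checks that it peaks at m/N:

```python
import numpy as np

# Numbers from the example above: 50 trials, A occurs 20 times, B occurs 30 times.
N, m = 50, 20

def likelihood(p):
    """L(p) = p^m * (1-p)^(N-m): probability of this outcome if P(A) = p."""
    return p**m * (1 - p)**(N - m)

# Evaluate L on a grid of candidate probabilities and pick the maximizer.
ps = np.linspace(0.01, 0.99, 99)
best_p = ps[np.argmax(likelihood(ps))]

print(f"p that maximizes L : {best_p:.2f}")   # ~0.40
print(f"closed-form MLE m/N: {m / N:.2f}")    # 0.40
```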
Gradient descent
First, a quick note on the relationship between matrices and vectors:
A vector is just a special matrix with n rows and 1 column.

Let's look at the definition of the gradient from a mathematical point of view. In calculus, we take the partial derivative of a multivariate function with respect to each of its parameters and write these partial derivatives as a vector; that vector is the gradient (the "vector expression" simply means writing them in matrix form). For example, for the function f(x,y), taking partial derivatives with respect to x and y gives the gradient vector (∂f/∂x, ∂f/∂y)T, abbreviated grad f(x,y) or ▽f(x,y).

Geometrically, the gradient points in the direction in which the function increases fastest. Specifically, for the function f(x,y), at a point (x0,y0) the direction of the gradient vector (∂f/∂x0, ∂f/∂y0)T is the direction in which f(x,y) increases fastest, so following it makes it easier to find a maximum of the function. Conversely, moving in the opposite direction of the gradient, i.e. along -(∂f/∂x0, ∂f/∂y0)T, the function decreases fastest, which makes it easier to find a minimum. (Just remember this sentence and don't ask why, it will only add to your troubles; it is short anyway: the opposite direction of the gradient is the direction of fastest descent.)
Solving with gradient descent
Starting from some initial parameter values, we repeatedly move the parameters a small step against the gradient of the objective (loss) function, θ := θ - α·▽J(θ), where α is the learning rate (step size), and stop when the updates become negligible; the point reached is a (local) minimum of J(θ).
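The original figure with the concrete solution steps is not reproduced above. As a minimal sketch of the idea, assuming a toy objective f(x, y) = x² + 2y² of my own choosing, the loop below simply keeps stepping against the gradient until it vanishes:

```python
import numpy as np

# Toy objective (my own example): f(x, y) = x^2 + 2*y^2, minimum at (0, 0).
def grad_f(w):
    x, y = w
    return np.array([2 * x, 4 * y])   # (df/dx, df/dy)

w = np.array([3.0, -2.0])   # arbitrary starting point
alpha = 0.1                 # learning rate (step size)

# Keep stepping in the opposite direction of the gradient until it vanishes.
for step in range(1000):
    g = grad_f(w)
    if np.linalg.norm(g) < 1e-8:
        break
    w = w - alpha * g

print(w)   # very close to the minimum (0, 0)
```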
Linear regression
Looking at gradient descent from the perspective of linear regression:
Suppose there are n samples X1 = {x11, x12, x13, ..., x1i : y1}, X2 = {x21, x22, x23, ..., x2i : y2}, ..., Xn = {xn1, xn2, xn3, ..., xni : yn}, each with i features and a label y.
For such i-dimensional feature vectors, the linear regression model is
h(X) = θ0 + θ1·x1 + θ2·x2 + ... + θi·xi
and the parameters θ are found by minimizing the mean squared error J(θ) = (1/2n)·Σk (h(Xk) - yk)^2 with gradient descent, i.e. each θj is repeatedly updated as θj := θj - α·∂J(θ)/∂θj.
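As a rough illustration (the synthetic data, true weights, and learning rate below are my own assumptions, not from the original), this sketch fits the linear model above by repeatedly stepping against the gradient of the squared-error loss:

```python
import numpy as np

# Synthetic data (my own example): n samples with i = 3 features, generated
# from known weights so the result of gradient descent can be checked.
rng = np.random.default_rng(0)
n, i = 200, 3
X = rng.normal(size=(n, i))
true_theta, true_b = np.array([2.0, -1.0, 0.5]), 1.5
y = X @ true_theta + true_b + 0.01 * rng.normal(size=n)

theta = np.zeros(i)   # weights θ1..θi
b = 0.0               # intercept θ0
alpha = 0.1           # learning rate

for epoch in range(2000):
    y_hat = X @ theta + b              # model prediction h(X)
    err = y_hat - y
    # Gradient of the loss J(θ) = (1/2n) Σ (h(Xk) - yk)^2
    theta -= alpha * (X.T @ err / n)   # step against the gradient
    b -= alpha * err.mean()

print(theta, b)   # ≈ [2.0, -1.0, 0.5] and 1.5
```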
Logistic regression
The difference between logistic regression and linear regression is that in linear regression the dependent variable y is a continuous value and the prediction is also continuous (e.g. house price, age, temperature), whereas logistic regression deals with 0-1 classification problems (e.g. telling male from female) and is most often used for binary classification.
Suppose our samples are {Xn : yn}, where yn is 0 or 1 and Xn is an i-dimensional feature vector:
X1 = {x11, x12, x13, ..., x1i : y1}, X2 = {x21, x22, x23, ..., x2i : y2}, ..., Xn = {xn1, xn2, xn3, ..., xni : yn}
Logistic regression passes the linear combination through the sigmoid function, h(X) = 1 / (1 + e^-(θ0 + θ1·x1 + ... + θi·xi)), and interprets h(X) as P(y = 1 | X). By the maximum likelihood idea above, the likelihood of the samples is L(θ) = Πk h(Xk)^yk · (1 - h(Xk))^(1-yk); maximizing its logarithm (equivalently, minimizing the negative log-likelihood by gradient descent) gives the parameters θ.
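Tying the three ideas together, here is a minimal sketch of logistic regression trained by gradient descent on the negative log-likelihood; the synthetic data, true weights, and hyperparameters are my own assumptions, not from the original:

```python
import numpy as np

# Synthetic binary labels (my own example): P(y = 1 | X) follows a sigmoid of
# a linear function of the features, so logistic regression can recover it.
rng = np.random.default_rng(1)
n, i = 500, 2
X = rng.normal(size=(n, i))
true_theta, true_b = np.array([1.5, -2.0]), 0.3

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

y = (rng.random(n) < sigmoid(X @ true_theta + true_b)).astype(float)  # yn is 0 or 1

theta, b, alpha = np.zeros(i), 0.0, 0.1

for epoch in range(5000):
    h = sigmoid(X @ theta + b)         # predicted P(y = 1 | X)
    # Gradient of the average negative log-likelihood
    # -(1/n) Σ [ yk·ln h(Xk) + (1 - yk)·ln(1 - h(Xk)) ]
    err = h - y
    theta -= alpha * (X.T @ err / n)
    b -= alpha * err.mean()

print(theta, b)   # roughly recovers [1.5, -2.0] and 0.3
```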