2021 Li Hongyi machine learning (1): basic concepts
2022-07-05 02:38:00 【Three ears 01】
Study notes on Li Hongyi's 2021 machine learning course on Bilibili, kept for later review.
1 Basic concepts
Machine learning is ultimately about finding a function.
1.1 Different function categories
- Regression: the output is a number
- Classification: the output is one of several given classes, like answering a multiple-choice question
- Structured Learning: the output is something structured (an image, an article); the machine learns to create
1.2 How to find the function (Training):
- First, write down a function with unknown parameters;
- Second, define a loss (a function of the parameters, e.g. MAE, the mean absolute error, or MSE, the mean squared error);
- Finally, optimize: find the parameters that minimize the loss, using gradient descent
1) Randomly choose an initial value $w^{0}$ for the parameter;
2) Compute $\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, then take a step down the gradient; the step size is $lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, where $lr$ is the learning rate;
3) Update the parameter, $w^{1} = w^{0} - lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, and repeat.
This method has a major drawback: it often ends up in a local minimum, whereas what we want is the global minimum.
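The three training steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the course: it assumes the simplest model $y = b + wx$, uses MSE as the loss, and the names `mse`, `gradient_descent`, the toy data, and the default `lr` and `steps` values are all chosen here for the example. The initial values are fixed rather than random so that the run is reproducible.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model y = b + w * x."""
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, w0=0.0, b0=0.0, lr=0.05, steps=2000):
    """Minimise the MSE loss by repeatedly stepping against the gradient."""
    w, b = w0, b0  # 1) initial values (fixed here; the notes pick them at random)
    n = len(xs)
    for _ in range(steps):
        errors = [b + w * x - y for x, y in zip(xs, ys)]
        # 2) partial derivatives of the MSE loss at the current (w, b)
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # 3) update: move by lr * gradient in the downhill direction
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated by y = 1 + 2x (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(w, b, mse(w, b, xs, ys))
```

For this convex loss there is only one minimum, so the sketch recovers $w \approx 2$ and $b \approx 1$; the local-minimum problem only shows up with more complicated loss surfaces.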
1.3 Model
A linear model has a serious limitation: it cannot represent polylines or curves. This limitation is called model bias, so the model needs to be improved.
How to improve: use piecewise linear curves.
With enough such pieces combined, a curve can be approximated.
1.3.1 sigmoid
A sigmoid function can be used to approximate the blue polyline:
$$y = c\,\frac{1}{1+e^{-\left(b+w x_{1}\right)}} = c\,\operatorname{sigmoid}(b+w x_{1})$$
Summing several sigmoids over all input features $x_{j}$ gives:
$$y = b+\sum_{i} c_{i} \operatorname{sigmoid}\left(b_{i}+\sum_{j} w_{i j} x_{j}\right)$$
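As a rough sketch of how the summed-sigmoid formula is evaluated, the function below computes $y$ for one input vector. The names `model`, `b_hidden` (the list of $b_i$), and the example parameter values are assumptions made for illustration; `c` holds the $c_i$ and `w` is a list of rows $w_{ij}$.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def model(x, b, c, b_hidden, w):
    """y = b + sum_i c[i] * sigmoid(b_hidden[i] + sum_j w[i][j] * x[j])."""
    y = b
    for c_i, b_i, w_i in zip(c, b_hidden, w):
        # inner sum over the input features x_j
        r_i = b_i + sum(w_ij * x_j for w_ij, x_j in zip(w_i, x))
        y += c_i * sigmoid(r_i)
    return y


# Two sigmoids, two input features; all inner weights are zero here,
# so each sigmoid evaluates to sigmoid(0) = 0.5.
x = [1.0, 2.0]
y = model(x, b=0.5, c=[1.0, -2.0], b_hidden=[0.0, 0.0],
          w=[[0.0, 0.0], [0.0, 0.0]])
print(y)
```

With these values the result is $0.5 + 1 \times 0.5 - 2 \times 0.5 = 0$, which is easy to check by hand against the formula.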
All unknown parameters in this model are collectively denoted $\theta$.
Computing the gradient over all the data for every update of $\theta$ is expensive when the dataset is large, so the data is split into small batches and $\theta$ is updated once per batch.
The number of updates in one pass over the data (one epoch) depends on the total amount of data and the batch size.
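The relationship between data size, batch size, and number of updates can be made concrete with a small sketch. The helper names `updates_per_epoch` and `make_batches` are made up for this example, and it assumes a final, smaller batch is kept when the data does not divide evenly.

```python
import math


def updates_per_epoch(num_examples, batch_size):
    """One update per batch, so an epoch has ceil(N / batch_size) updates."""
    return math.ceil(num_examples / batch_size)


def make_batches(data, batch_size):
    """Split data into consecutive batches; the last one may be smaller."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]


data = list(range(10))
batches = make_batches(data, batch_size=4)
print(batches)                        # three batches: sizes 4, 4 and 2
print(updates_per_epoch(10000, 10))   # 10,000 examples in batches of 10
```

So 10,000 examples with a batch size of 10 means 1,000 parameter updates per epoch, while a batch size of 100 would mean only 100.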
1.3.2 ReLU
The sigmoid used so far is a soft sigmoid, i.e. a smooth curve. Two ReLUs can in fact be combined into a hard sigmoid, i.e. a polyline.
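With ReLU, $\max(0, \cdot)$, in place of the sigmoid, the formula above becomes $y = b+\sum_{i} c_{i} \max\left(0,\, b_{i}+\sum_{j} w_{i j} x_{j}\right)$, where the sum needs twice as many terms because each hard sigmoid takes two ReLUs.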
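A small sketch shows how two ReLUs add up to a hard sigmoid. The particular choice below (a ramp from 0 to 1 between $x = 0$ and $x = 1$) is only one illustrative parameterisation, and the function names are assumptions for this example.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hard_sigmoid(x):
    """Flat at 0, then a ramp of slope 1, then flat at 1.

    The first ReLU starts the ramp at x = 0; subtracting the second,
    which switches on at x = 1, cancels the slope from there on.
    """
    return relu(x) - relu(x - 1.0)


for x in [-1.0, 0.0, 0.5, 1.0, 2.0]:
    print(x, hard_sigmoid(x))
```

Shifting and scaling the inputs and outputs of the two ReLUs moves the ramp and changes its slope and height, which is what the parameters $b_i$, $w_{ij}$ and $c_i$ do in the model.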
1.3.3 The sigmoid computation can be repeated several times
Stacking many such layers gives what is called a neural network, later renamed deep learning (= many hidden layers).
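Stacking layers can be sketched as feeding the output of one sigmoid layer into the next. This is an illustrative forward pass only, with no training; the names `layer`, `network`, `hidden_layers`, and the tiny zero-weight example are assumptions made here.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One hidden layer: a_i = sigmoid(b_i + sum_j w[i][j] * inputs[j])."""
    return [
        sigmoid(b_i + sum(w_ij * a_j for w_ij, a_j in zip(w_i, inputs)))
        for w_i, b_i in zip(weights, biases)
    ]


def network(x, hidden_layers, c, b):
    """Apply each hidden layer in turn, then the final weighted sum."""
    a = x
    for weights, biases in hidden_layers:
        a = layer(a, weights, biases)
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))


# Two hidden layers with two units each; zero weights keep the
# arithmetic easy to follow (every unit outputs sigmoid(0) = 0.5).
zeros = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
y = network([1.0, 2.0], hidden_layers=[zeros, zeros], c=[1.0, 1.0], b=0.0)
print(y)
```

With a single entry in `hidden_layers` this reduces to the one-layer sigmoid model from section 1.3.1; each extra entry adds one more hidden layer.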