2021 Li Hongyi machine learning (1): basic concepts
2022-07-05 02:38:00 【Three ears 01】
Study notes for Li Hongyi's 2021 machine learning course on Bilibili, kept for later reference.
1 Basic concepts
Machine learning ultimately comes down to finding a function.
1.1 Different function categories
- Regression: the output is a number
- Classification: the output is one of several given classes, like answering a multiple-choice question
- Structured learning: the output is something with structure (drawing a picture, writing an article), so the machine learns to create
1.2 How to find the function (training)
- First, write down a function with unknown parameters;
- Second, define the loss, a function of the parameters (for example MAE, the mean absolute error, or MSE, the mean squared error);
- Finally, optimize: find the parameters that minimize the loss, using gradient descent:
1) Pick an initial value $w^{0}$ for the parameter at random;
2) Compute $\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, then move against the gradient. The step size is $lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, where $lr$ is the learning rate;
3) Update the parameter, $w^{1} = w^{0} - lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, and repeat from step 2.
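The three training steps above can be sketched in plain Python. This is only a minimal sketch, assuming a one-feature linear model y = b + w·x, the MSE loss, and a made-up toy dataset. The names `mse_loss`, `gradients`, `train` and the values of `lr` and `steps` are illustrative choices, not taken from the course, and the initial values are fixed at 0 instead of random so that the run is reproducible.

```python
def mse_loss(w, b, xs, ys):
    """Mean squared error of the model y = b + w * x on the data."""
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b, xs, ys):
    """Partial derivatives of the MSE loss with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (b + w * x - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (b + w * x - y) for x, y in zip(xs, ys)) / n
    return dw, db


def train(xs, ys, w0=0.0, b0=0.0, lr=0.05, steps=2000):
    """Gradient descent: start at (w0, b0) and repeatedly step against the gradient."""
    w, b = w0, b0
    for _ in range(steps):
        dw, db = gradients(w, b, xs, ys)
        w -= lr * dw  # step size is lr times the partial derivative
        b -= lr * db
    return w, b


# Toy data generated from y = 1 + 2x (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train(xs, ys)
```

On this toy data the loss is convex, so the run recovers w close to 2 and b close to 1.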
This method has a major drawback: it often ends up in a local minimum, whereas what we want is the global minimum.
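The effect of the starting point can be seen on a small non-convex example. The sketch below assumes a hypothetical one-parameter loss L(w) = w⁴ − 3w² + w, chosen only because it has two minima; it is not a loss from the course.

```python
def loss(w):
    """A made-up non-convex loss with one local and one global minimum."""
    return w ** 4 - 3 * w ** 2 + w


def grad(w):
    """Derivative of the loss above."""
    return 4 * w ** 3 - 6 * w + 1


def gradient_descent(w0, lr=0.01, steps=1000):
    """Run plain gradient descent from the starting point w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


w_right = gradient_descent(2.0)   # starts to the right of both minima
w_left = gradient_descent(-2.0)   # starts to the left of both minima
```

Starting from 2.0 the run settles near w ≈ 1.13, which is only a local minimum; starting from -2.0 it settles near w ≈ -1.30, where the loss is lower. Which minimum is reached depends entirely on the initial value.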
1.3 Model
A linear model has a serious limitation: it cannot represent a broken line (a piecewise linear curve) or a smooth curve. This limitation is called model bias, so the model needs to be improved.
How to improve it: piecewise linear curves.
Adding up enough such pieces makes it possible to approximate a curve.
1.3.1 sigmoid
A sigmoid function $y = c \frac{1}{1+e^{-\left(b+w x_{1}\right)}} = c \operatorname{sigmoid}(b + w x_{1})$ can be used to approximate the blue broken line. Summing several of them, plus a constant, gives:
$$y = b + \sum_{i} c_{i} \operatorname{sigmoid}\left(b_{i} + \sum_{j} w_{ij} x_{j}\right)$$
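The sum-of-sigmoids formula can be written out directly. A minimal sketch in plain Python follows; the function name `model`, the argument name `b_hidden` (for the $b_i$) and the example numbers are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def model(x, b, c, b_hidden, W):
    """y = b + sum_i c[i] * sigmoid(b_hidden[i] + sum_j W[i][j] * x[j])."""
    y = b
    for c_i, b_i, w_i in zip(c, b_hidden, W):
        r_i = b_i + sum(w_ij * x_j for w_ij, x_j in zip(w_i, x))
        y += c_i * sigmoid(r_i)
    return y


# Two features, two sigmoids. With all inner weights and biases set to zero,
# every sigmoid outputs 0.5, so y = 0.5 + 2 * 0.5 + 4 * 0.5.
y = model(x=[1.0, 2.0], b=0.5, c=[2.0, 4.0],
          b_hidden=[0.0, 0.0], W=[[0.0, 0.0], [0.0, 0.0]])
```

Here `b`, `c`, `b_hidden` and `W` together are the unknown parameters that training has to find.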
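All the unknown parameters here are collectively denoted by $\theta$.
Computing the gradient over all the data at once for every update is expensive because the dataset is large, so the data is split into small batches and each update uses one batch.
The number of updates in one pass over the data (one epoch) depends on the total amount of data and the batch size: with $N$ examples and batch size $B$, there are $N / B$ updates per epoch.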
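How the update count follows from the data size and the batch size can be checked with a few lines of Python. This is a sketch under assumptions: one pass over all batches is called an epoch here, the helper names `make_batches` and `updates_per_epoch` are hypothetical, and the dataset of 10,000 integers is a stand-in for real training examples.

```python
import math
import random


def make_batches(data, batch_size, seed=0):
    """Shuffle a copy of the data and cut it into consecutive batches."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]


def updates_per_epoch(n_examples, batch_size):
    """One parameter update per batch; the last batch may be smaller."""
    return math.ceil(n_examples / batch_size)


data = list(range(10000))                 # stand-in for 10,000 training examples
batches = make_batches(data, batch_size=10)
n_updates = len(batches)                  # one update per batch in each epoch
```

With 10,000 examples and a batch size of 10 there are 1,000 updates per epoch; with 1,000 examples and a batch size of 100 there are only 10.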
1.3.2 ReLU
The function used so far is the soft sigmoid, which is a smooth curve. In fact, two ReLUs can be combined into a hard sigmoid, which is the broken line.
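With ReLU in place of sigmoid, the formula above becomes the following, where the sum runs over twice as many terms because each hard sigmoid takes two ReLUs:
$$y = b + \sum_{2i} c_{i} \max\left(0,\ b_{i} + \sum_{j} w_{ij} x_{j}\right)$$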
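The claim that two ReLUs make one hard sigmoid can be verified numerically. In the sketch below, the parameter names `start`, `end` and `height` are assumptions introduced to describe where the ramp begins, where it ends, and how high the plateau is.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hard_sigmoid(x, start=0.0, end=1.0, height=1.0):
    """Difference of two ReLUs: 0 left of `start`, a linear ramp between
    `start` and `end`, and constant `height` right of `end`."""
    slope = height / (end - start)
    return slope * relu(x - start) - slope * relu(x - end)


# Left plateau, middle of the ramp, right plateau.
values = [hard_sigmoid(x) for x in (-1.0, 0.5, 2.0)]
```

The first ReLU switches the ramp on at `start`; the second one, with the opposite sign, cancels the slope from `end` onward, which is what flattens the curve.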
1.3.3 The sigmoid computation can be repeated several times
Stacking many such layers gives what is called a neural network; it was later renamed deep learning (deep = many hidden layers).
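Repeating the computation means feeding the outputs of one layer into the next. The following is a minimal forward-pass sketch in plain Python, assuming sigmoid activations in every hidden layer and a final weighted sum; the names `layer` and `network` and the all-zero example weights are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One hidden layer: a_i = sigmoid(b_i + sum_j w_ij * x_j)."""
    return [sigmoid(b_i + sum(w * x for w, x in zip(w_i, inputs)))
            for w_i, b_i in zip(weights, biases)]


def network(x, hidden_layers, c, b):
    """Apply each hidden layer in turn, then combine the last activations."""
    a = x
    for weights, biases in hidden_layers:
        a = layer(a, weights, biases)
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))


# Two hidden layers with three units each. All weights and biases are zero,
# so every activation is 0.5 and the output is 0.25 + 3 * 0.5.
zeros_2 = [[0.0, 0.0] for _ in range(3)]       # first layer: 2 inputs -> 3 units
zeros_3 = [[0.0, 0.0, 0.0] for _ in range(3)]  # second layer: 3 inputs -> 3 units
hidden = [(zeros_2, [0.0] * 3), (zeros_3, [0.0] * 3)]
y = network([1.0, 2.0], hidden, c=[1.0, 1.0, 1.0], b=0.25)
```

Adding more `(weights, biases)` pairs to `hidden` makes the network deeper without changing any other code.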