2021 Li Hongyi machine learning (1): basic concepts
2022-07-05 02:38:00 【Three ears 01】
Study notes on Li Hongyi's 2021 machine learning course on Bilibili, kept for later reference.
1 Basic concepts
Machine learning is ultimately about finding a function.
1.1 Different types of functions
- Regression: the output is a number
- Classification: the output is one of several given classes, like answering a multiple-choice question
- Structured learning: the output is something structured (a picture, an article), so the machine learns to create
1.2 How to find the function (training):
- First, write down a function with unknown parameters;
- Second, define the loss, a function of the parameters (e.g. MAE, mean absolute error, or MSE, mean squared error);
- Finally, optimize: find the parameters that minimize the loss, using gradient descent:
1) Randomly pick an initial value $w^{0}$ for the parameter;
2) Compute $\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, then step down the gradient; the step size is $lr\times\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, where $lr$ is the learning rate;
3) Update the parameter, $w^{1}=w^{0}-lr\times\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, and repeat from step 2.
This method has a major drawback: it often ends up at a local minimum, whereas what we want is the global minimum.
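The three training steps above can be sketched in plain Python for a linear model $y = b + wx$ with an MSE loss. This is only a minimal illustration: the toy data, the function names (`mse`, `gradients`, `train`) and the hyperparameters are assumptions, not taken from the course, and the parameters start at zero instead of a random value so that the run is reproducible.

```python
def mse(w, b, xs, ys):
    """Loss: mean squared error of the model y = b + w * x."""
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b, xs, ys):
    """Partial derivatives of the MSE loss with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (b + w * x - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (b + w * x - y) for x, y in zip(xs, ys)) / n
    return dw, db


def train(xs, ys, lr=0.05, steps=2000):
    """Gradient descent: repeatedly move the parameters against the gradient."""
    # The notes pick the initial value at random; zeros are used here
    # only so that the result is reproducible.
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradients(w, b, xs, ys)
        w -= lr * dw  # step size is lr * dL/dw
        b -= lr * db
    return w, b


# Hypothetical toy data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train(xs, ys)
print(w, b, mse(w, b, xs, ys))
```

For this convex toy loss, gradient descent reaches the global minimum; the local-minimum concern only arises for losses with several valleys.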
1.3 Model
A linear model has a serious limitation: it cannot represent broken lines (piecewise linear relationships) or curves. This limitation is called model bias, so the model needs to be improved.
How to improve: piecewise linear curves
Summing many such pieces can approximate a curve.
1.3.1 sigmoid
The sigmoid function $y=c \frac{1}{1+e^{-\left(b+w x_{1}\right)}}=c \operatorname{sigmoid}\left(b+w x_{1}\right)$ can be used to fit the blue broken line, and summing several of them gives:
$$y=b+\sum_{i} c_{i} \operatorname{sigmoid}\left(b_{i}+\sum_{j} w_{i j} x_{j}\right)$$
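As a rough sketch, the formula above can be evaluated directly in plain Python. The names `model`, `b_hidden` and `W` are illustrative choices, not notation from the course: `W[i][j]` plays the role of $w_{ij}$, `b_hidden[i]` of $b_i$, and `c[i]` of $c_i$.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def model(x, b, c, b_hidden, W):
    """Compute y = b + sum_i c[i] * sigmoid(b_hidden[i] + sum_j W[i][j] * x[j])."""
    total = b
    for c_i, b_i, row in zip(c, b_hidden, W):
        z = b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x))
        total += c_i * sigmoid(z)
    return total


# With all weights and hidden biases at zero, every sigmoid outputs 0.5,
# so the result is b + 0.5 * (c[0] + c[1]).
y = model(x=[1.0, 2.0], b=1.0, c=[2.0, 4.0],
          b_hidden=[0.0, 0.0], W=[[0.0, 0.0], [0.0, 0.0]])
print(y)
```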
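All the unknown parameters in this model are collected into a single vector $\theta$.
Computing the loss over all the data for every gradient descent step is expensive when the dataset is large, so the data is split into small batches and $\theta$ is updated once per batch.
The number of updates in one pass over the data (one epoch) depends on the total amount of data and the batch size. For example, 10,000 examples with a batch size of 10 give 1,000 updates per epoch.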
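A small sketch of the batching idea, assuming the data fits in a Python list. The helper names `make_batches` and `updates_per_epoch` are made up for illustration. Each batch would drive one parameter update, so the number of batches equals the number of updates per pass over the data.

```python
import math
import random


def make_batches(data, batch_size, seed=0):
    """Shuffle the data and cut it into consecutive batches."""
    items = list(data)
    random.Random(seed).shuffle(items)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def updates_per_epoch(n_examples, batch_size):
    """One update per batch; the last batch may be smaller than batch_size."""
    return math.ceil(n_examples / batch_size)


batches = make_batches(range(10), batch_size=3)
print(len(batches))                 # number of updates for these 10 examples
print(updates_per_epoch(1000, 100))
```

Shuffling before splitting is a common choice so that the batches differ between passes; the notes themselves do not specify how the batches are formed.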
1.3.2 ReLU
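The sigmoid used so far is the soft sigmoid, i.e. a smooth curve. Alternatively, two ReLUs can be combined into a hard sigmoid, i.e. the broken line.
Replacing each sigmoid with a pair of ReLUs, the sigmoid formula above becomes:
$$y=b+\sum_{2 i} c_{i} \max \left(0, b_{i}+\sum_{j} w_{i j} x_{j}\right)$$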
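To illustrate how two ReLUs can form a hard sigmoid, here is a minimal sketch. The offsets of 0.5 are an arbitrary choice made for this example so that the ramp runs from 0 to 1; the course does not fix particular values.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hard_sigmoid(z):
    """Difference of two shifted ReLUs: flat at 0, a linear ramp, then flat at 1."""
    return relu(z + 0.5) - relu(z - 0.5)


for z in (-2.0, 0.0, 2.0):
    print(z, hard_sigmoid(z))
```

The first ReLU switches the ramp on and the second one cancels its slope, which is why each hard sigmoid costs two ReLU terms.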
1.3.3 The sigmoid computation can be repeated several times
Stacking many such layers gives what is called a neural network; it was later renamed deep learning, i.e. many hidden layers.
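Repeating the sigmoid step can be sketched as feeding the outputs of one layer into the next. The function names (`layer`, `network`) and the parameter layout below are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(x, W, b):
    """One hidden layer: a[i] = sigmoid(b[i] + sum_j W[i][j] * x[j])."""
    return [
        sigmoid(b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x)))
        for row, b_i in zip(W, b)
    ]


def network(x, hidden_layers, c, b_out):
    """Apply each hidden layer in turn, then take a weighted sum of the last outputs."""
    a = x
    for W, b in hidden_layers:
        a = layer(a, W, b)
    return b_out + sum(c_i * a_i for c_i, a_i in zip(c, a))


# Two hidden layers with two units each; all-zero weights keep the arithmetic simple:
# every activation is 0.5, so the output is b_out + 0.5 * (c[0] + c[1]).
zeros = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
y = network([1.0, 2.0], [zeros, zeros], c=[1.0, 3.0], b_out=0.5)
print(y)
```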