2021 Li Hongyi machine learning (1): basic concepts
2022-07-05 02:38:00 【Three ears 01】
These are my study notes for Li Hongyi's 2021 machine learning course on Bilibili, kept for later reuse.
1 Basic concepts
Machine learning is ultimately about finding a function.
1.1 Different function categories
- Regression: the output is a number
- Classification: the output is one of a fixed set of classes, like answering a multiple-choice question
- Structured Learning: the output is something structured (a picture, an article); the machine learns to create
1.2 How to find the function (Training)
- First, write down a function with unknown parameters;
- Second, define the loss, a function of the parameters (for example MAE, the mean absolute error, or MSE, the mean squared error);
- Finally, optimize: find the parameters that minimize the loss, using gradient descent:
1) Randomly choose an initial value $w^{0}$ for the parameter;
2) Compute $\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$ and move in the downhill direction; the step size is $lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, where $lr$ is the learning rate;
3) Update the parameter, $w^{1} = w^{0} - lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, and repeat.
This method has a major drawback: it usually ends at a local minimum, whereas what we want is the global minimum.
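The three training steps can be sketched in a few lines of Python. This is only an illustration, not code from the course: the one-parameter model `y = w * x`, the toy data, the choice of MSE as the loss, and the names `loss_gradient` and `gradient_descent` are all assumptions made for the example. A fixed starting value is used in place of a random one so that the run is reproducible.

```python
def loss_gradient(w, xs, ys):
    """Derivative of the MSE loss L(w) = mean((w*x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def gradient_descent(xs, ys, w0, lr=0.05, steps=200):
    """Repeat: evaluate dL/dw at the current w, then move against it by lr * gradient."""
    w = w0
    for _ in range(steps):
        w = w - lr * loss_gradient(w, xs, ys)
    return w


# Hypothetical toy data generated by y = 3x, so the loss is smallest at w = 3.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 6.0, 9.0]
w_final = gradient_descent(xs, ys, w0=0.0)
print(w_final)
```

This toy loss has a single minimum, so the local-minimum issue does not appear here; with a non-convex loss the result would depend on the starting value.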
1.3 Model
A linear model has a serious limitation: it cannot represent piecewise-linear lines or curves. This limitation is called model bias, so the model needs to be improved.
How to improve: Piecewise Linear Curves
Adding up enough such pieces, plus a constant, can approximate a curve.
1.3.1 sigmoid
Each blue piecewise-linear piece can be approximated by a sigmoid function:
$$y=c \frac{1}{1+e^{-\left(b+w x_{1}\right)}}=c \operatorname{sigmoid}\left(b+w x_{1}\right)$$
Summing several of them over all input features gives:
$$y=b+\sum_{i} c_{i} \operatorname{sigmoid}\left(b_{i}+\sum_{j} w_{i j} x_{j}\right)$$
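As a rough sketch, the sum-of-sigmoids model can be evaluated directly in plain Python. The function name `model`, the argument names, and all numeric values below are hypothetical and chosen only to show the shape of the computation: `c[i]`, `b_i[i]` and `w[i][j]` correspond to $c_i$, $b_i$ and $w_{ij}$ in the formula.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, b, c, b_i, w):
    """y = b + sum_i c[i] * sigmoid(b_i[i] + sum_j w[i][j] * x[j])."""
    y = b
    for i in range(len(c)):
        z = b_i[i] + sum(w[i][j] * x[j] for j in range(len(x)))
        y += c[i] * sigmoid(z)
    return y


# Hypothetical values: 2 input features, 3 sigmoid units.
x = [1.0, 2.0]
b = 0.5
c = [1.0, -2.0, 0.5]
b_i = [0.1, -0.3, 0.2]
w = [[0.4, -0.1], [0.2, 0.3], [-0.5, 0.6]]
print(model(x, b, c, b_i, w))
```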
All the unknown parameters in this model are collected into a single vector $\theta$.
Computing the gradient with respect to $\theta$ on all the data at once for every gradient descent step is expensive when the dataset is large, so the data is split into small batches and each update uses one batch.
The number of updates in one pass over the data (one epoch) depends on the total amount of data and the batch size.
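The relationship between data size, batch size and number of updates can be made concrete with a small sketch. One full pass over the data is commonly called an epoch, and each batch yields one parameter update. The helper names `updates_per_epoch` and `iterate_batches` and the figures (10,000 examples, batch size 10) are assumptions for illustration; shuffling the data before batching, which is usual in practice, is left out.

```python
import math


def updates_per_epoch(num_examples, batch_size):
    """One update per batch; the last batch may hold fewer than batch_size items."""
    return math.ceil(num_examples / batch_size)


def iterate_batches(data, batch_size):
    """Yield consecutive slices of `data`, each holding at most batch_size items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


data = list(range(10_000))  # stand-in for 10,000 training examples
batches = list(iterate_batches(data, batch_size=10))
n_updates = updates_per_epoch(len(data), 10)
print(n_updates, len(batches))
```

With the same 10,000 examples and a batch size of 100, one epoch would contain 100 updates instead.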
1.3.2 ReLU
The sigmoid above is a soft sigmoid, a smooth curve. A hard sigmoid, the piecewise-linear version, can instead be built by combining two ReLUs.
With ReLU, the sigmoid formula above becomes:
$$y=b+\sum_{2 i} c_{i} \max \left(0,\; b_{i}+\sum_{j} w_{i j} x_{j}\right)$$
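A minimal sketch of how two ReLUs combine into a hard sigmoid: the first ReLU starts the upward ramp, and the second, subtracted, cancels the slope so the output flattens out. The breakpoints at -0.5 and 0.5 and the function names are assumptions picked for the example, not values from the course.

```python
def relu(z):
    return max(0.0, z)


def hard_sigmoid(x):
    """Flat at 0, then slope 1 between -0.5 and 0.5, then flat at 1."""
    return relu(x + 0.5) - relu(x - 0.5)


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, hard_sigmoid(x))
```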
1.3.3 The sigmoid computation can be repeated several times
Stacking many such layers gives what is called a Neural Network, later renamed Deep Learning, meaning many hidden layers.
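Stacking layers can be sketched as feeding the output of one sigmoid layer into the next. This is an illustrative forward pass only, with no training; the names `layer` and `network`, the layer sizes, and every weight value are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One hidden layer: a_i = sigmoid(b_i + sum_j w[i][j] * inputs[j])."""
    return [
        sigmoid(b + sum(w_ij * x_j for w_ij, x_j in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def network(x, hidden_layers, c, b):
    """Apply each hidden layer in turn, then output y = b + sum_i c[i] * a[i]."""
    a = x
    for weights, biases in hidden_layers:
        a = layer(a, weights, biases)
    return b + sum(c_i * a_i for c_i, a_i in zip(c, a))


# Hypothetical sizes: 2 inputs -> 3 hidden units -> 2 hidden units -> 1 output.
hidden_layers = [
    ([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [0.0, 0.1, -0.1]),
    ([[0.3, -0.1, 0.2], [-0.4, 0.5, 0.1]], [0.05, -0.05]),
]
y = network([1.0, 2.0], hidden_layers, c=[1.0, -1.0], b=0.5)
print(y)
```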