2021 Li Hongyi machine learning (1): basic concepts
2022-07-05 02:38:00 【Three ears 01】
Study notes on Li Hongyi's 2021 machine learning course on Bilibili, kept for later reference.
1 Basic concepts
Machine learning is ultimately about finding a function.
1.1 Different function categories
- Regression: the output is a number
- Classification: the output is one of several given classes, like answering a multiple-choice question
- Structured Learning: the output is something structured (drawing a picture, writing an article); the machine learns to create
1.2 How to find the function (Training)
- First, write down a function with unknown parameters;
- Second, define the loss, which is a function of the parameters (for example MAE, the mean absolute error, or MSE, the mean squared error);
- Finally, optimize: find the parameters that minimize the loss, using gradient descent
1) Randomly pick an initial value $w^{0}$ for the parameter;
2) Compute $\left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, then step against the gradient; the step size is $lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, where $lr$ is the learning rate;
3) Update the parameter, $w^{1} = w^{0} - lr \times \left.\frac{\partial L}{\partial w}\right|_{w=w^{0}}$, and repeat.
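The three training steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the model is the linear function y = b + w * x, the loss is MSE, the toy data is made up, and the initial values are fixed at 0 rather than random so that the run is reproducible. The names `mse_loss`, `mse_gradients`, and `gradient_descent` are hypothetical and do not come from the course.

```python
def mse_loss(w, b, xs, ys):
    """Mean squared error of the linear model y = b + w * x."""
    n = len(xs)
    return sum((b + w * x - y) ** 2 for x, y in zip(xs, ys)) / n


def mse_gradients(w, b, xs, ys):
    """Partial derivatives of the MSE loss with respect to w and b."""
    n = len(xs)
    grad_w = sum(2 * (b + w * x - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (b + w * x - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def gradient_descent(xs, ys, w0=0.0, b0=0.0, lr=0.05, steps=2000):
    # 1) Start from an initial value (fixed here; the notes pick it randomly).
    w, b = w0, b0
    for _ in range(steps):
        # 2) Evaluate the gradient at the current parameters.
        grad_w, grad_b = mse_gradients(w, b, xs, ys)
        # 3) Update: move against the gradient, scaled by the learning rate.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(w, b, mse_loss(w, b, xs, ys))
```

On this toy data the learned w and b should end up close to 2 and 1, the values used to generate it.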
This method has a major drawback: it usually ends up in a local minimum, whereas what we want is the global minimum.
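The dependence on the starting point can be seen on a small one-parameter example. The loss below, L(w) = w^4 - 3w^2 + w, is a made-up non-convex function chosen only because it has two minima; it is not from the course, and the names `loss`, `grad`, and `descend` are hypothetical.

```python
def loss(w):
    """A made-up non-convex loss with two minima."""
    return w ** 4 - 3 * w ** 2 + w


def grad(w):
    """Derivative of the loss above: dL/dw = 4w^3 - 6w + 1."""
    return 4 * w ** 3 - 6 * w + 1


def descend(w0, lr=0.01, steps=1000):
    """Plain gradient descent starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


w_right = descend(2.0)    # settles in the shallower minimum, near w = 1.13
w_left = descend(-2.0)    # settles in the deeper minimum, near w = -1.30
print(w_right, loss(w_right))
print(w_left, loss(w_left))
```

Both runs follow the same update rule, yet only the run started at -2.0 reaches the lower of the two minima.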
1.3 Model
A linear model is very limited: it cannot represent piecewise-linear lines or curves. This limitation is called model bias, so the model needs to be improved.
How to improve: Piecewise Linear Curves
Summing many such piecewise linear pieces can approximate a curve.
1.3.1 sigmoid
A sigmoid function, $y = c\,\frac{1}{1+e^{-\left(b+w x_{1}\right)}} = c\,\operatorname{sigmoid}\left(b+w x_{1}\right)$, can be used to approximate the blue piecewise-linear segment. Summing several sigmoids over multiple input features $x_j$ gives:
$$y = b + \sum_{i} c_{i} \operatorname{sigmoid}\left(b_{i} + \sum_{j} w_{ij} x_{j}\right)$$
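A minimal sketch of evaluating this sum-of-sigmoids model in plain Python follows. The function names `sigmoid` and `predict`, the argument names, and the sample values are assumptions made for illustration; `W[i][j]` plays the role of $w_{ij}$, `b_vec[i]` of $b_i$, `c[i]` of $c_i$, and `b` of the outer bias.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(x, W, b_vec, c, b):
    """y = b + sum_i c[i] * sigmoid(b_vec[i] + sum_j W[i][j] * x[j])."""
    y = b
    for w_row, b_i, c_i in zip(W, b_vec, c):
        r_i = b_i + sum(w_ij * x_j for w_ij, x_j in zip(w_row, x))
        y += c_i * sigmoid(r_i)
    return y


# Two input features, three sigmoid units (values made up for illustration).
x = [1.0, 2.0]
W = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
b_vec = [0.0, -1.0, 0.5]
c = [1.0, -2.0, 0.5]
print(predict(x, W, b_vec, c, b=0.1))
```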
All the unknown parameters in this model ($b$, $c_i$, $b_i$, $w_{ij}$) are collectively denoted by $\theta$.
Computing the loss over all the data for every gradient descent update of $\theta$ is expensive when there is a lot of data, so the data is split into small batches and each update uses one batch.
The number of updates in one pass over the data (one epoch) depends on the total amount of data and the batch size.
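The relationship between data size, batch size, and number of updates can be checked with a small sketch. It assumes one parameter update per batch; the helper names `make_batches` and `updates_per_epoch` and the numbers (10,000 examples, batch size 10) are made up for illustration.

```python
import math


def make_batches(data, batch_size):
    """Split data into consecutive batches; the last one may be smaller."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]


def updates_per_epoch(n_examples, batch_size):
    """One update per batch, so count the batches in one pass over the data."""
    return math.ceil(n_examples / batch_size)


data = list(range(10000))          # stand-in for 10,000 training examples
batches = make_batches(data, 10)
print(len(batches))                # 1000 batches, so 1000 updates per epoch
print(updates_per_epoch(10000, 10))
```

With the same data and a batch size of 100, the same pass would give only 100 updates.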
1.3.2 ReLU
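The sigmoid used so far is the soft sigmoid, a smooth curve. In fact, two ReLUs can be combined into a hard sigmoid, which is a piecewise-linear line.
Each sigmoid is replaced by two ReLUs, so the sum runs over twice as many terms and the sigmoid formula above becomes:
$$y = b + \sum_{2i} c_{i} \max\left(0,\; b_{i} + \sum_{j} w_{ij} x_{j}\right)$$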
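To see how two ReLUs combine into a hard sigmoid, here is a minimal sketch. The function `hard_sigmoid` and its parameters `start`, `end`, and `height` are hypothetical names chosen for illustration: the output is 0 before `start`, rises linearly between `start` and `end`, and stays flat at `height` afterwards.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hard_sigmoid(x, start, end, height):
    """Hard sigmoid built from two ReLUs with opposite signs.

    The first ReLU starts the ramp at `start`; the second one, subtracted,
    cancels the slope from `end` onwards so the output stays at `height`.
    """
    slope = height / (end - start)
    return slope * relu(x - start) - slope * relu(x - end)


for x in (-1.0, 0.0, 1.0, 2.0, 5.0):
    print(x, hard_sigmoid(x, start=0.0, end=2.0, height=3.0))
```

Each of the two terms has the form $c \max(0, b + w x)$, which is why the ReLU version of the model needs twice as many terms as the sigmoid version.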
1.3.3 The sigmoid computation can be repeated several times
Stacking many such layers gives what is called a neural network; it was later renamed deep learning, meaning many hidden layers.
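Stacking can be sketched as feeding the output of one sigmoid layer into the next. This is a minimal illustration, not the course's code: the names `layer` and `network`, the layer sizes, and the all-zero weights in the example are assumptions chosen so the result is easy to check by hand.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z}), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def layer(inputs, W, b):
    """One hidden layer: a_i = sigmoid(b[i] + sum_j W[i][j] * inputs[j])."""
    return [
        sigmoid(b_i + sum(w * a for w, a in zip(row, inputs)))
        for row, b_i in zip(W, b)
    ]


def network(x, hidden_layers, c, b_out):
    """Apply each hidden layer in turn, then y = b_out + sum_i c[i] * a[i]."""
    a = x
    for W, b in hidden_layers:
        a = layer(a, W, b)
    return b_out + sum(c_i * a_i for c_i, a_i in zip(c, a))


# Two hidden layers (2 units, then 3 units) with all-zero weights:
# every unit outputs sigmoid(0) = 0.5, so y = 0 + 3 * 0.5.
hidden_layers = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
    ([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
]
y = network([1.0, 2.0], hidden_layers, c=[1.0, 1.0, 1.0], b_out=0.0)
print(y)  # 1.5
```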