Introduction to Neural Networks (Part 2)
2022-07-04 07:31:00 【Uncertainty!!】
Source of these notes: Neural Networks Demystified
Disclaimer: I am a beginner learning this material for the first time. These are study notes; if you find a mistake, please correct me!
Observation = Signal + Noise
The model should fit the signal rather than the noise
What is Noise in Machine Learning?
Humans are prone to making mistakes when collecting data, and data collection instruments may be unreliable, resulting in dataset errors. The errors are referred to as noise. Data noise in machine learning can cause problems since the algorithm interprets the noise as a pattern and can start generalizing from it. -- Excerpt from: What is Noise in Machine Learning
Machine learning noise detection and removal
PCA attempts to eliminate corrupted data from a signal or picture using preservative noise while maintaining the critical features -- Excerpt from: What is Noise in Machine Learning
I wrote an earlier note on PCA; see: Principal Component Analysis (PCA)
1.1 Overfitting
The overfitting phenomenon
In mathematical modeling, overfitting is “the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit to additional data or predict future observations reliably” -- Excerpt from: Overfitting
The green line represents the overfitted model (that is, a function). It matches the training data very well, but it depends on that data too heavily: when it predicts unseen data outside the training set, the deviation is large. An overfitted model lacks generalization ability
The black line represents the regularized model (an improvement on the overfitted model that generalizes better)
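The contrast between the two lines can be reproduced with a small experiment. The sketch below is not the network from the source series: it assumes a made-up linear signal with Gaussian noise and uses NumPy polynomial fits as a stand-in, with a degree-1 fit playing the role of the simple model and a degree-7 fit playing the role of the overfitted one. All names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def signal(x):
    # Hypothetical "true" relationship; the observations add noise on top of it.
    return 2.0 * x + 1.0

# Observation = Signal + Noise
x_train = np.linspace(0, 1, 8)
y_train = signal(x_train) + 0.2 * rng.standard_normal(8)
x_test = rng.uniform(0, 1, 50)
y_test = signal(x_test) + 0.2 * rng.standard_normal(50)

def fit_and_score(degree):
    """Fit a polynomial on the training points; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = fit_and_score(1)      # follows the signal
flexible_train, flexible_test = fit_and_score(7)  # passes through every training point

print("degree 1: train", simple_train, "test", simple_test)
print("degree 7: train", flexible_train, "test", flexible_test)
```

With 8 training points, a degree-7 polynomial can pass through all of them, so its training error is essentially zero, yet that says nothing about how it behaves on the unseen test points.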
Add more data to make the overfitting more obvious
After adding the data, we retrain the model
Plot the data after adding the new points
The newly trained model (a surface) is shown below
The black dots are the training set data
Some of the model's predictions now contradict reality, which is caused by overfitting
As shown in the figure below, when Hours Sleep is held at a fixed value, Test Score first decreases and then increases as Hours Study increases, which is clearly unrealistic
1.2 Testing
How do we detect whether a model is overfitted?
First, we split the dataset into a training set and a test set:
1. Training set
Your training data is a subset of your dataset that you use to teach a machine learning model to recognize patterns or perform your criteria. -- Excerpt from: What is Training Data?
2. Test set
Once your machine learning model is built (with your training data), you need unseen data to test your model. This data is called testing data, and you can use it to evaluate the performance and progress of your algorithms’ training and adjust or optimize it for improved results. -- Excerpt from: What is Testing Data?
Testing data has two main criteria. It should:
1. Represent the actual dataset
2. Be large enough to generate meaningful predictions
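The split described above can be sketched in a few lines of NumPy. The helper `train_test_split`, the 50/50 ratio, and the sample values are assumptions made for illustration (the columns stand for hours of sleep and hours of study, the targets for test scores); they are not taken from the source notes.

```python
import numpy as np

def train_test_split(X, y, test_fraction=0.5, seed=0):
    """Shuffle the rows, then split them into a training part and a test part."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Illustrative data: (hours sleep, hours study) -> test score out of 100.
X = np.array([[3, 5], [5, 1], [10, 2], [6, 1.5],
              [4, 5.5], [4.5, 1], [9, 2.5], [6, 2]], dtype=float)
y = np.array([[75], [82], [93], [70], [70], [89], [85], [75]], dtype=float)

trainX, trainY, testX, testY = train_test_split(X, y)

# Scale with statistics from the training set only, so that no information
# about the test set leaks into training.
x_max = trainX.max(axis=0)
trainX, testX = trainX / x_max, testX / x_max
trainY, testY = trainY / 100.0, testY / 100.0
```

Shuffling before splitting is one simple way to keep the test set representative of the whole dataset, which is the first criterion listed above.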
Further reading: contrastive datasets
Assume you need to clean a noisy dataset that includes big background patterns as noise that a data scientist isn’t interested in. Then, using an adaptive noise cancellation approach, this method offers a solution by eliminating the noisy signal. This technique employs two signals: one is the target signal, and the other is a noise-free background signal. -- Excerpt from: What is Noise in Machine Learning
The Fourier transform
Researches have already shown that our signal or data has a structure, we can remove noise from it directly. The Fourier Transform of the signal is used to translate the signal into the frequency domain in this process. -- Excerpt from: What is Noise in Machine Learning
The Fourier transform moves the signal into the frequency domain, where the components that correspond to noise can be identified and removed
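A minimal sketch of this idea, assuming a made-up 5 Hz sine wave with white noise and a simple low-pass cutoff chosen by hand (the function name `lowpass_denoise` and all parameter values are illustrative, not from the source):

```python
import numpy as np

fs = 500                      # sampling rate in Hz (assumed)
n = 500                       # one second of samples
t = np.arange(n) / fs
clean = np.sin(2 * np.pi * 5 * t)            # the "signal": a 5 Hz sine
rng = np.random.default_rng(0)
noisy = clean + 0.5 * rng.standard_normal(n)  # observation = signal + noise

def lowpass_denoise(x, fs, cutoff_hz):
    """Zero every frequency component above cutoff_hz and transform back."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0
    return np.fft.irfft(spectrum, n=len(x))

denoised = lowpass_denoise(noisy, fs, cutoff_hz=10)

err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print("MSE before:", err_noisy, "after:", err_denoised)
```

This only works because the signal here is concentrated at low frequencies while the noise is spread across all of them; a signal with a different structure would need a different filter.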
I wrote an earlier note on the Fourier transform; see: Fourier series, Fourier transform, and spectrum
The following figure comes from LaTeX Studio
Original data set
Training set and test set
We use the test set to check how well the model fits
1.3 Regularization
What is regularization?
Regularization is a process that changes the result answer to be “simpler”. -- Excerpt from: Regularization
Regularization adds a term to the cost function that penalizes overly complex models
Fixing overfitting with regularization
Modify the initialization function to add the regularization parameter lambda
Add the regularization term to the cost function J and to the gradients dJdW1 and dJdW2
The other functions are unchanged
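The three changes listed above could look roughly like the sketch below. The original code is not reproduced in these notes, so this is a reconstruction under assumptions: a 2-3-1 sigmoid network, a squared-error cost averaged over the examples, and an L2 penalty on the weights. The parameter is spelled `Lambda` because `lambda` is a reserved word in Python; `numerical_gradient` and the sample data are illustrative additions.

```python
import numpy as np

class NeuralNetwork:
    def __init__(self, Lambda=0.0, seed=0):
        # Assumed layer sizes: 2 inputs, 3 hidden units, 1 output.
        rng = np.random.default_rng(seed)
        self.W1 = rng.standard_normal((2, 3))
        self.W2 = rng.standard_normal((3, 1))
        self.Lambda = Lambda  # regularization strength (new)

    @staticmethod
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoidPrime(self, z):
        s = self.sigmoid(z)
        return s * (1.0 - s)

    def forward(self, X):
        self.z2 = X @ self.W1
        self.a2 = self.sigmoid(self.z2)
        self.z3 = self.a2 @ self.W2
        return self.sigmoid(self.z3)

    def costFunction(self, X, y):
        yHat = self.forward(X)
        data_term = 0.5 * np.sum((y - yHat) ** 2) / X.shape[0]
        # New: penalty that grows with the size of the weights.
        penalty = 0.5 * self.Lambda * (np.sum(self.W1 ** 2) + np.sum(self.W2 ** 2))
        return data_term + penalty

    def costFunctionPrime(self, X, y):
        yHat = self.forward(X)
        delta3 = -(y - yHat) * self.sigmoidPrime(self.z3)
        # New: the derivative of the penalty, Lambda * W, is added to each gradient.
        dJdW2 = self.a2.T @ delta3 / X.shape[0] + self.Lambda * self.W2
        delta2 = (delta3 @ self.W2.T) * self.sigmoidPrime(self.z2)
        dJdW1 = X.T @ delta2 / X.shape[0] + self.Lambda * self.W1
        return dJdW1, dJdW2

def numerical_gradient(net, X, y, eps=1e-5):
    """Central-difference estimate of the gradients, used to check the analytic ones."""
    grads = []
    for W in (net.W1, net.W2):
        g = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            old = W[idx]
            W[idx] = old + eps
            plus = net.costFunction(X, y)
            W[idx] = old - eps
            minus = net.costFunction(X, y)
            W[idx] = old  # restore the weight
            g[idx] = (plus - minus) / (2 * eps)
        grads.append(g)
    return grads

# Illustrative, already-normalized data.
X = np.array([[0.3, 1.0], [0.5, 0.2], [1.0, 0.4], [0.6, 0.3]])
y = np.array([[0.75], [0.82], [0.93], [0.70]])

net = NeuralNetwork(Lambda=1e-4)
dJdW1, dJdW2 = net.costFunctionPrime(X, y)
num1, num2 = numerical_gradient(net, X, y)
grad_error = max(np.abs(dJdW1 - num1).max(), np.abs(dJdW2 - num2).max())
print("largest gradient mismatch:", grad_error)
```

Comparing the analytic gradients with a numerical estimate is a cheap way to confirm that the penalty was added consistently to both J and its derivatives.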
Add the following to the trainer
With all modifications in place, we retrain the model
The figure below shows the model's error on the test set and on the training set
Our goal is to keep adjusting lambda so that the testing error gradually approaches the training error, which improves the model's generalization ability
Training error is the average error of the model on the training set; it measures how well the model fits the training set. A large training error indicates that the features of the training set have not been learned well enough; a training error that is too small indicates that the features of the training set have been over-learned, which easily leads to overfitting. -- Excerpt from: Model evaluation: training error and test error, overfitting and underfitting, confusion matrix
Test error is the average error of the model on the test set; it measures the model's generalization ability. In practice, the smaller the test error, the better. -- Excerpt from: Model evaluation: training error and test error, overfitting and underfitting, confusion matrix
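The effect of adjusting lambda on the two errors can be sketched with a smaller stand-in model. To keep the example short, the block below swaps the neural network for ridge-regularized polynomial regression, which has a closed-form solution; the data, the degree, and the list of candidate values are all assumptions made for illustration.

```python
import numpy as np

def poly_features(x, degree):
    """Columns x^0, x^1, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(A, y, Lambda):
    """Minimize ||A w - y||^2 + Lambda * ||w||^2 in closed form."""
    n_features = A.shape[1]
    return np.linalg.solve(A.T @ A + Lambda * np.eye(n_features), A.T @ y)

def mse(A, y, w):
    return float(np.mean((A @ w - y) ** 2))

def signal(x):
    # Hypothetical underlying relationship.
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
x_test = rng.uniform(0, 1, 10)
y_train = signal(x_train) + 0.2 * rng.standard_normal(10)
y_test = signal(x_test) + 0.2 * rng.standard_normal(10)

A_train = poly_features(x_train, 9)
A_test = poly_features(x_test, 9)

results = []
for Lambda in [1e-8, 1e-6, 1e-4, 1e-2, 1.0]:
    w = ridge_fit(A_train, y_train, Lambda)
    results.append((Lambda, mse(A_train, y_train, w), mse(A_test, y_test, w)))
    print(f"Lambda={Lambda:g}  train={results[-1][1]:.4f}  test={results[-1][2]:.4f}")

# Pick the candidate with the lowest test error.
best_Lambda = min(results, key=lambda r: r[2])[0]
```

A very small lambda lets the model chase the training points, while a very large one shrinks the weights so much that even the training set is fitted poorly; the useful values lie in between. In practice the selection is usually done on a separate validation set so that the test set remains untouched.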
The model no longer fits every data point perfectly, which shows that the overfitting has been repaired
The corresponding contour plot