Day 8 of DL
2022-07-28 02:55:00 【The sun is falling】
1. If a deep learning model already holds 10 million faces, how do you find a new face with a query?
This question is about applying deep learning algorithms in practice, and the key to it is how the data is indexed. Indexing is the last step in applying One Shot learning to face recognition, but it is the most important step when deploying in practice.
To answer the question, first describe the general approach of One Shot learning for face recognition. Put simply, each face is converted into a vector, and recognizing a new face means finding the stored vector that is closest (most similar) to the input face. This is usually done with a deep learning model trained with the triplet loss function. However, as the number of images grows, computing the distance to all 10 million vectors on every recognition is not a wise solution, because it makes the system much slower. To make queries cheaper, we need to consider how to index data in a real vector space. The main idea of these methods is to partition the data into a structure that is easy to query with new data (often something similar to a tree). When new data arrives, searching that structure finds the nearest vector quickly. There are several ways to do this, such as Locality Sensitive Hashing (LSH) and approximate nearest neighbor indexing with Annoy.
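The bucketing idea can be sketched with random-hyperplane LSH. This is a minimal illustration using only the Python standard library, not the implementation of any particular library; the function names (`make_hyperplanes`, `hash_vector`, `build_index`, `query`), the tiny 3-dimensional embeddings, and the choice of 8 hyperplanes are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals; the fixed seed keeps the sketch repeatable."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def hash_vector(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(embeddings, planes):
    """Group face ids by hash so that nearby vectors tend to share a bucket."""
    buckets = defaultdict(list)
    for face_id, vec in embeddings.items():
        buckets[hash_vector(vec, planes)].append(face_id)
    return buckets


def query(vec, embeddings, buckets, planes):
    """Compare the query only against faces in its own bucket."""
    candidates = buckets.get(hash_vector(vec, planes), [])
    if not candidates:
        return None  # approximate search: a near neighbor can be missed
    return min(candidates, key=lambda fid: math.dist(vec, embeddings[fid]))


# Hypothetical embeddings; real face embeddings have hundreds of dimensions.
embeddings = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
    "carol": [-1.0, -1.0, 0.0],
}
planes = make_hyperplanes(num_planes=8, dim=3)
buckets = build_index(embeddings, planes)
print(query([1.0, 0.0, 0.0], embeddings, buckets, planes))
```

Because only one bucket is scanned, the query cost depends on the bucket size rather than on the total number of faces. The price is that a true nearest neighbor sitting in a different bucket is missed, which is why practical systems typically combine several hash tables.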
2. For a classification problem, is the accuracy metric completely reliable? Which metrics do you usually use to evaluate your model?
For a classification problem there are many different evaluation methods. Accuracy is simply the number of correctly predicted data points divided by the total number of data points. That sounds reasonable, but in practice the number says little for imbalanced data. Suppose we are building a prediction model for cyber attacks, and attack requests account for 1/100000 of traffic. If the model predicts that every request is normal, the accuracy is 99.999%, yet the model detects no attacks at all, so this figure is not a reliable measure of a classification model. Accuracy tells us what percentage of the data is predicted correctly, but it does not say how each class is classified. Instead, we can use the confusion matrix. The confusion matrix shows how many data points actually belong to each class and which class they were predicted to belong to. For a binary problem it is a 2×2 table whose cells count the true positives, false positives, false negatives, and true negatives.
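The cyber attack example can be worked through in a few lines. This is a sketch using only the standard library; the helper names (`confusion_counts`, `accuracy`, `recall`) and the label convention (1 = attack, 0 = normal) are assumptions made for the illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all data points that were predicted correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# One attack among 100000 requests, and a model that always predicts "normal".
y_true = [1] + [0] * 99999
y_pred = [0] * 100000

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion counts:", tp, fp, fn, tn)
print("accuracy:", accuracy(tp, fp, fn, tn))
print("recall:", recall(tp, fn))
```

The accuracy is close to 1 while the recall on the attack class is 0, which is exactly the gap that the confusion matrix exposes and accuracy hides.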
To show how the true positive rate and the false positive rate change with each threshold that defines the classification, we also have the ROC chart. From the ROC curve we can tell whether the model works. The ideal ROC curve is the one closest to the top left corner (the orange line in the original figure), where the true positive rate is high and the false positive rate is low.
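How the ROC curve is traced from thresholds can be sketched as follows. The functions `roc_points` and `auc`, and the four sample scores, are hypothetical and written only for this illustration; the sketch assumes both classes are present in the labels.

```python
def roc_points(y_true, scores):
    """Sweep the threshold from high to low and record (FPR, TPR) at each step.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve using the trapezoid rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical model scores

curve = roc_points(y_true, scores)
print(curve)
print("AUC:", auc(curve))
```

A curve that hugs the top left corner has an area close to 1, while a model that ranks examples at random has an area of about 0.5.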
3. What is meant by an activation function? What is the saturation interval of an activation function?
The meaning of the activation function
Activation functions exist to break the linearity of a neural network. They can be loosely understood as a filter that decides whether information passes through a neuron. During training, the activation function plays an important role in regulating the slope of the derivative. Some activation functions, such as sigmoid, tanh, or ReLU, are discussed further in the following sections. What we need to understand is that the nonlinearity of these functions lets the neural network learn representations of more complex functions than it could with linear functions alone. Most activation functions are continuously differentiable.
Being continuous means that a small change in the input produces only a small change in the output; being differentiable means that every point in the domain has a derivative. As mentioned above, computing derivatives is very important: it is a decisive factor in whether our neurons can be trained. A few common activation functions are Sigmoid, Softmax, and ReLU.
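Why the nonlinearity matters can be seen in a toy network with one input and one hidden unit. This is a sketch with arbitrary, hypothetical weights (2.0, -1.0, -3.0, 0.5): without an activation the two layers collapse into a single linear map, and with ReLU they do not.

```python
def relu(x):
    """ReLU passes positive values through and blocks negative ones."""
    return max(0.0, x)


def two_layer(x, activation=None):
    """Two stacked layers with fixed, arbitrary weights chosen for illustration."""
    hidden = 2.0 * x - 1.0            # first layer
    if activation is not None:
        hidden = activation(hidden)   # optional nonlinearity
    return -3.0 * hidden + 0.5        # second layer


def collapsed(x):
    """The single layer that the activation-free network is equivalent to."""
    return -6.0 * x + 3.5


for x in (-1.5, 0.0, 1.0, 2.0):
    print(x, two_layer(x), collapsed(x), two_layer(x, relu))
```

Without an activation the outputs of `two_layer` and `collapsed` match for every input, so the extra layer adds no expressive power. With ReLU the output is constant for small inputs and sloped for larger ones, a shape that no single linear layer can produce.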
The saturation interval of the activation function

Tanh, Sigmoid, and ReLU are nonlinear activation functions that all have saturation intervals (ReLU saturates only for negative inputs). Put simply, the saturation interval of an activation function is a range of inputs over which the output barely changes even when the input changes. Saturation causes two problems. First, in the forward propagation of the neural network, the values of a layer gradually fall into the saturation range of the activation function, and many identical outputs gradually appear.
This sends the same data flow through the whole model, a phenomenon called covariate shift. Second, in backpropagation the derivative is zero, or very close to zero, in the saturation region, so the network can learn almost nothing. That is why we need to keep the range of values centered on a mean of 0.
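The vanishing derivative in the saturation region can be checked numerically. This sketch uses the standard closed-form derivatives of the three functions; the helper names and the sample inputs are chosen only for the illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


# Inputs near 0 sit in the responsive region; large |x| sits in the saturated one.
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x={x:6.1f}  sigmoid'={sigmoid_grad(x):.6f}  "
          f"tanh'={tanh_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```

The sigmoid and tanh derivatives peak at an input of 0 (0.25 and 1.0) and shrink toward 0 as the input moves away from it, so inputs centered on 0 keep the gradients usable.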