
Day 8 of DL

2022-07-28 02:55:00 【The sun is falling】

1. If a deep learning system already stores 10 million faces, how do you find a new face with a query?

This question is about applying deep learning algorithms in practice, and the key to it is how the data is indexed. Indexing is the last step in applying One Shot learning to face recognition, but it is the most important step for deployment in practice.
To answer the question, you should first describe the general method of face recognition with One Shot learning. Put simply, each face is converted into a vector, and recognizing a new face means finding the stored vector that is closest (most similar) to the vector of the input face. A deep learning model trained with the triplet loss function is usually used for this. However, as the number of images grows, computing the distance to all 10 million vectors on every recognition is not a wise solution, because it makes the system much slower. To make queries faster, we need to consider how to index data in a real-valued vector space. The main idea of these methods is to partition the data into a structure that is easy to query with new data (possibly similar to a tree). When a new vector arrives, searching that structure helps find the nearest vector quickly. There are several ways to do this, such as Locality Sensitive Hashing (LSH) and Approximate Nearest Neighbors (Annoy) indexing.
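The bucketing idea behind LSH can be sketched in a few lines. This is a minimal illustration, not a production index: it assumes random-hyperplane hashing with a single hash table, uses random 16-dimensional vectors in place of real face embeddings, and all function names (random_hyperplanes, hash_vector, build_index, query) are hypothetical. Random-hyperplane hashing groups vectors by angle, so ranking the candidates by Euclidean distance as done here is most meaningful when embeddings are roughly normalized.

```python
import math
import random

def random_hyperplanes(num_planes, dim, rng):
    # Each hyperplane is represented by a random Gaussian normal vector.
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def hash_vector(vec, planes):
    # One bit per hyperplane: which side of the plane the vector falls on.
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def build_index(embeddings, planes):
    # Map each bucket key to the ids of the embeddings hashed into it.
    index = {}
    for face_id, vec in embeddings.items():
        index.setdefault(hash_vector(vec, planes), []).append(face_id)
    return index

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query(vec, embeddings, index, planes):
    # Compare only against candidates in the same bucket;
    # fall back to a full scan if that bucket is empty.
    candidates = index.get(hash_vector(vec, planes)) or list(embeddings)
    return min(candidates, key=lambda fid: euclidean(vec, embeddings[fid]))

rng = random.Random(0)
# Stand-in data: 1000 random vectors instead of real face embeddings.
embeddings = {i: [rng.gauss(0.0, 1.0) for _ in range(16)] for i in range(1000)}
planes = random_hyperplanes(8, 16, rng)
index = build_index(embeddings, planes)

probe = embeddings[42]
nearest = query(probe, embeddings, index, planes)
bucket_size = len(index[hash_vector(probe, planes)])
print(nearest, bucket_size)
```

The query only measures distances inside one bucket instead of across the whole collection, which is where the speedup comes from. The price is that a true nearest neighbor sitting in a different bucket can be missed, which is why practical systems typically combine several hash tables or trees.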

2. For a classification problem, is the accuracy metric completely reliable? What metrics do you usually use to evaluate your model?

For a classification problem, there are many different evaluation methods. Accuracy simply divides the number of correctly predicted data points by the total number of data points. That sounds reasonable, but in practice this number says little on imbalanced data. Suppose we are building a model to predict cyber attacks, and attack requests account for 1/100000 of all requests. If the model predicts that every request is normal, its accuracy is 99.999%, yet it detects no attacks at all, so this number is usually unreliable for classification models. Accuracy tells us what percentage of the data is predicted correctly, but not how each class is classified. Instead, we can use the confusion matrix. The confusion matrix shows how many data points actually belong to one class and are predicted to belong to each class. Its form is as follows:
[Figure: confusion matrix]
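The cyber attack example can be reproduced with a short sketch. It assumes a binary problem with 1 meaning attack and 0 meaning normal, and a hypothetical model that always predicts normal; the helper names are illustrative only.

```python
def confusion_matrix(y_true, y_pred):
    # Binary confusion matrix counts, with 1 = attack (positive), 0 = normal (negative).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    # Fraction of all predictions that are correct.
    return (tp + tn) / (tp + fp + fn + tn)

def recall(tp, fn):
    # Fraction of actual positives that the model found.
    return tp / (tp + fn) if (tp + fn) else 0.0

# One attack among 100000 requests; the model predicts "normal" every time.
y_true = [1] + [0] * 99_999
y_pred = [0] * 100_000

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
rec = recall(tp, fn)
print(tp, fp, fn, tn, acc, rec)
```

Accuracy is almost perfect, but the confusion matrix shows the single attack landing in the false negative cell, and recall on the attack class is zero.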
Besides the confusion matrix, we also have the ROC curve, which shows how the true positive rate and the false positive rate change with each threshold that defines the classification. From the ROC curve we can tell whether the model works well. The ideal ROC curve is the one closest to the top-left corner (the orange line in the figure), where the true positive rate is higher and the false positive rate is lower.
[Figure: ROC curve]
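How the points of an ROC curve come from sweeping the threshold can be sketched as follows. The labels and scores are made-up toy values, and roc_points and auc are hypothetical helper names; the area under the curve is computed with the trapezoidal rule as one common summary of how close the curve gets to the top-left corner.

```python
def roc_points(y_true, scores):
    # Sweep thresholds from high to low; predict positive when score >= threshold.
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        # Each point is (false positive rate, true positive rate).
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    # Area under the curve by the trapezoidal rule.
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

# Toy data: 1 = positive class, scores are the model's predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points, area)
```

A model that ranks every positive above every negative would pass through the point (0, 1) and reach an area of 1, while random guessing stays near the diagonal with an area of about 0.5.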

3. What is the meaning of the activation function? What is the saturation interval of an activation function?

The meaning of the activation function

Activation functions were created to break the linearity of a neural network. They can be understood simply as a filter that decides whether information passes through a neuron. During neural network training, the activation function plays an important role in adjusting the slope of the derivative. Some activation functions, such as sigmoid, tanh, and ReLU, are discussed further in the following sections. What we need to understand, however, is that the nonlinearity of these functions lets a neural network learn representations of more complex functions than linear functions alone would allow. Most activation functions are continuous and differentiable.
They are continuous, meaning that a small change in the input produces a small change in the output, and differentiable, meaning that every point in the domain has a derivative. As mentioned above, computing the derivative is very important: it is a decisive factor in whether our neurons can be trained. A few common activation functions are Sigmoid, Softmax, and ReLU.
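Why a network needs a nonlinearity can be checked on a tiny hand-built example. The weights below are arbitrary illustrative values, not taken from any trained model: without an activation, two stacked linear layers collapse into a single linear layer, while inserting ReLU between them lets the same weights compute |x1 - x2|, which no single linear layer can represent.

```python
def matvec(W, x):
    # Matrix-vector product: one output per row of W.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    # Matrix-matrix product of A (m x k) and B (k x n).
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def relu(v):
    return [max(0.0, a) for a in v]

W1 = [[1.0, -1.0], [-1.0, 1.0]]  # first layer (illustrative weights)
W2 = [[1.0, 1.0]]                # second layer (illustrative weights)
x = [2.0, 1.0]

# Without an activation, two layers equal one layer with weights W2 @ W1.
linear_out = matvec(W2, matvec(W1, x))
collapsed = matvec(matmul(W2, W1), x)

# With ReLU in between, the same weights compute |x1 - x2|.
nonlinear_out = matvec(W2, relu(matvec(W1, x)))
print(linear_out, collapsed, nonlinear_out)
```

Here the purely linear stack outputs 0 for every input, because W2 times W1 is the zero matrix, while the ReLU version outputs the absolute difference of the two inputs.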

The saturation interval of the activation function

Nonlinear activation functions such as Tanh, Sigmoid, and ReLU all have saturation intervals. Put simply, the saturation interval of an activation function is the range of inputs over which the output of the function barely changes even when the input changes. Saturation causes two problems. First, in the forward propagation of a neural network, the values of a layer gradually fall into the saturation interval of the activation function, so more and more identical outputs appear.
This produces the same data flow throughout the model. This phenomenon is called covariate shift. Second, in back propagation, the derivative is zero in the saturation region, so the network learns almost nothing. That is why we need to keep the range of values centered at a mean of 0.
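The vanishing derivative in the saturation region is easy to see numerically. The sketch below uses the standard derivative formulas for sigmoid, tanh, and ReLU, and evaluates them at a few arbitrarily chosen inputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    # ReLU saturates only on the negative side, where the gradient is exactly 0.
    return 1.0 if x > 0 else 0.0

for x in (-10.0, 0.0, 10.0):
    print(x, sigmoid_grad(x), tanh_grad(x), relu_grad(x))
```

Near 0 the sigmoid and tanh derivatives are at their largest, while at inputs of magnitude 10 they are close to zero, so almost no gradient flows back through a saturated unit. This is consistent with the advice to keep layer inputs centered around 0.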

Original site

Copyright notice
This article was written by [The sun is falling]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/197/202207132250187069.html