Wake up wrist - neural network and deep learning (tensorflow application) updating

2022-06-11 23:04:00 【Wake wrist】

Introduction to machine learning and deep learning

At present, the share of research papers on traditional machine learning is indeed not very high. Some people complain that deep learning is just systems engineering with little mathematical substance. But there is no denying that deep learning is very easy to use. It greatly simplifies the overall algorithm analysis and learning process of traditional machine learning and, more importantly, on some general-domain tasks it reaches a level of accuracy that traditional machine learning algorithms cannot achieve.

Deep learning has been particularly popular in recent years, just like big data a few years ago. However, deep learning is mainly a branch of machine learning, so in this article let's talk about the difference between the algorithm workflows of machine learning and deep learning.

What is machine learning?

Simply put, machine learning is a way of turning unordered data into value. In a broad sense, machine learning is a method that gives machines the ability to learn, so that they can accomplish tasks that direct programming cannot. In a practical sense, machine learning is a method of using data to train a model and then using that model to make predictions.

"Training" and "prediction" are the two processes of machine learning, and the "model" is the intermediate output between them: training produces the model, and the model guides prediction.
In machine learning, the computer uses existing data (experience) to derive a model (for example, a rule about when someone is late) and then uses this model to predict the future (will they be late?).
Let's compare the process of machine learning with the way humans generalize from historical experience.

The value and importance of machine learning

We focus on the ability of these tools to solve practical problems and on machine learning in practice: extracting rules from data and using them to predict the future.

Automatic: a machine learning method can be regarded as an algorithm that automatically generates algorithms.
Fast: machine learning saves time. Compared with manual processing, it can analyze sample data and generate an algorithm more quickly.
Accurate: because it is automated, a machine learning method can use more data and run longer to produce more accurate decisions.
Scale: machine learning methods can provide solutions to problems that humans cannot solve on their own.

Examples of machine learning applications

Classification problems: image recognition, spam detection
Regression problems: stock price prediction, house price prediction
Ranking problems: click-through rate estimation, recommendation
Generation problems: image generation, image style transfer, image caption generation

Machine learning application process

Algorithm flow of machine learning

In fact, machine learning is the study of data science (which sounds a little boring). The main workflow of a machine learning algorithm is: 1) dataset preparation, 2) exploratory data analysis, 3) data preprocessing, 4) data splitting, 5) modeling with a machine learning algorithm, 6) selecting the machine learning task. The last step, of course, is to evaluate how the algorithm performs on real data.
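The workflow steps can be sketched in a few lines of plain Python. This is only a minimal illustration under assumptions that are not in the article: the toy dataset (y = 2x + 1 plus noise), the 80/20 split, and the choice of a least-squares line as the model are all hypothetical.

```python
import random

random.seed(0)

# 1) Dataset preparation: hypothetical toy data, y = 2x + 1 plus Gaussian noise
xs = [random.uniform(0.0, 10.0) for _ in range(200)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 0.5) for x in xs]

# 2) Exploratory analysis: a basic summary statistic
x_mean = sum(xs) / len(xs)
print("mean of x:", x_mean)

# 3) Preprocessing: center the feature around zero
xs_centered = [x - x_mean for x in xs]

# 4) Data splitting: 80% for training, 20% for testing
data = list(zip(xs_centered, ys))
random.shuffle(data)
train, test = data[:160], data[160:]

# 5) Modeling (the task chosen here is regression):
#    closed-form least-squares fit of y = slope * x + intercept
n = len(train)
mx = sum(x for x, _ in train) / n
my = sum(y for _, y in train) / n
slope = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
intercept = my - slope * mx

# Evaluation: mean squared error on the held-out test set
test_mse = sum((slope * x + intercept - y) ** 2 for x, y in test) / len(test)
print("slope:", slope, "test MSE:", test_mse)
```

The fitted slope should land close to the value 2 used to generate the data, and the test error should be on the order of the noise variance.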

Deep learning algorithms

They include: 1. Convolutional neural networks 2. Recurrent neural networks 3. Autoencoders 4. Sparse coding 5. Deep belief networks 6. Restricted Boltzmann machines

Neuron - Logistic regression model

Because neural networks are modeled on the human brain, before discussing the specific model we need to look at the characteristics of the human brain from a biological point of view.

According to biological research, the computing unit of the human brain is the neuron. It can respond to changes in its environment and then pass the information on to other neurons. The human brain contains about 86 billion neurons. They are interconnected to form an extremely complex nervous system, which is the material basis of human intelligence. So, following the biological structure of the human brain, we first need to build a model that simulates a human neuron.

Basic introduction to neurons

A neuron is the smallest unit of a neural network, and a neural network is formed by combining many neurons. With suitable settings, a single neuron also forms a logistic regression model.

The input signal comes from outside or from the output of other processing units. Mathematically it is expressed as a row vector $x=(x_1,x_2,\ldots,x_m)$, where $x_i$ is the excitation level of the $i$-th input and $m$ is the number of inputs.

The weights of the connections to node $k$ are expressed as a weight vector $W_k=(w_{k1},w_{k2},\ldots,w_{km})$, where $w_{ki}$ is the weight from node $i$ (or the $i$-th input) to node $k$, also called the connection strength between nodes $i$ and $k$.

The computation function processes each input signal to determine its strength (weighting), determines the combined effect of all input signals (summation), and then determines the output (transfer characteristic).

In other words, when a neuron receives input signals from $n$ other neurons, it adds up the received input values according to certain weights. The superimposed stimulus intensity $S$ can be expressed as:
$S = w_1x_1 + w_2x_2 + \cdots + w_nx_n = \sum_{i=1}^{n}{w_ix_i}$

This sum is not output directly. It is compared with the neuron's threshold and then passed through an activation function to produce the output. Conceptually this is called a perceptron, and its model can be expressed as:

$y = f\left(\sum_{i=1}^{n}{w_ix_i} - \theta\right)$

Here $\theta$ is the threshold, $f$ is the activation function, and $y$ is the final output.
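The perceptron formula can be tried out directly. The sketch below assumes a step activation (output 1 when the net input is non-negative), and the weights and threshold are illustrative values chosen so that the neuron behaves like a logical AND. None of these specifics come from the article.

```python
def step(s):
    """Assumed activation function: fire (1) when the net input is non-negative."""
    return 1 if s >= 0 else 0


def perceptron(x, w, theta, f=step):
    """Compute y = f(sum_i w_i * x_i - theta) for one neuron."""
    s = sum(wi * xi for wi, xi in zip(w, x))  # weighted sum of the inputs
    return f(s - theta)                       # compare with threshold, then activate


# Illustrative weights and threshold: the neuron fires only when both inputs are 1
w = [1.0, 1.0]
theta = 1.5
outputs = [perceptron([a, b], w, theta) for a in (0, 1) for b in (0, 1)]
print(outputs)  # inputs (0,0), (0,1), (1,0), (1,1) -> [0, 0, 0, 1]
```

Changing `theta` to 0.5 with the same weights would turn the same neuron into a logical OR, which shows how the threshold shifts the decision boundary.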

The goal of a neuron

The goal of a neuron is to adjust its weights based on a large number of input and output examples. Suppose we show a neuron a thousand example pictures of cats and non-cats, and for each example we indicate which features are present and how confident we are that they are there. Based on these thousand images, the neuron decides:

Which features are important and positive (for example, every cat has a tail, so the weight must be large and positive)
Which features are not important (for example, only a few pictures have 2 eyes, so the weight is very small)
Which features are important and negative (for example, every picture containing a horn is actually a picture of a unicorn rather than a cat, so the weight must be large and negative)

Neuron - A simple computation

A neural network is a group of neurons arranged in layers. Every neuron is a mathematical operation: it accepts inputs, multiplies each by its weight, and passes the sum through an activation function to other neurons. A neural network learns to classify inputs by adjusting its weights based on previous examples.

In short, a neuron multiplies the input values by their weights, adds them up, and then applies the activation function to the sum.

Logistic regression model for binary classification

The distribution function $F(x)$ of the logistic distribution is an S-shaped curve. The curve grows fastest near the center and slowly at both ends. As $x \to +\infty$, $F(x)$ approaches 1; as $x \to -\infty$, $F(x)$ approaches 0.

The binomial logistic regression model is a classification model represented by the conditional probability distribution $P(Y \mid X)$, in the form of a parameterized logistic distribution. Here the random variable $X$ takes real values and the random variable $Y$ takes the value 1 or 0.

Suppose a set of data points falls into two categories. What kind of model should we build to distinguish them?

Consider the linear regression model $z = W^Tx + b$. Its output is a real value, whereas the output label of a binary classification task is 0 or 1 (in binomial logistic regression we mark the positive class as 1 and the negative class as 0; the reasons will be mentioned later). So we need to convert the real value $z$ into a $0 / 1$ value. The ideal choice is the unit step function:
$y = \begin{cases} 0, & z < 0 \\ 0.5, & z = 0 \\ 1, & z > 0 \end{cases}$
But the unit step function is discontinuous. We want a substitute that approximates the unit step function to some extent and is monotonic and differentiable. The log-odds function (also called the sigmoid function or logistic function) is such a commonly used substitute:

$y = \frac{1}{1+e^{-z}}$

For a given input instance $x$, the distribution function above gives $P(Y = 1 \mid x)$ and $P(Y = 0 \mid x)$. Logistic regression compares the two conditional probabilities and assigns the instance $x$ to the class with the higher probability.
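The decision rule can be written out as a short sketch. It assumes the linear score $z = W^Tx + b$ is passed through the sigmoid to give $P(Y=1 \mid x)$, with $P(Y=0 \mid x)$ as its complement. The weights `w`, the bias `b`, and the sample inputs are hypothetical values used only for illustration.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Return (label, P(Y=1|x)) for a binomial logistic regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear score z = w.x + b
    p1 = sigmoid(z)                               # P(Y = 1 | x)
    p0 = 1.0 - p1                                 # P(Y = 0 | x)
    label = 1 if p1 > p0 else 0                   # pick the more probable class
    return label, p1


# Hypothetical parameters, not taken from the article
w, b = [2.0, -1.0], 0.5
print(predict([1.0, 0.5], w, b))   # z = 2.0, so class 1 is more likely
print(predict([-1.0, 1.0], w, b))  # z = -2.5, so class 0 is more likely
```

Comparing the two probabilities is equivalent to checking whether $z$ is positive, since the sigmoid crosses 0.5 exactly at $z = 0$.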

A neuron layer with multiple outputs: $W$ changes from a vector to a matrix, and the output $Wx$ becomes a vector.

In statistics, multiclass logistic regression is a classification method that generalizes logistic regression to multiclass problems. More formally, it is a model that predicts the probabilities of the different possible outcomes of a categorically distributed dependent variable.

The binomial logistic regression model is a binary classification model used for two-class problems. It can be extended to the multinomial logistic regression model for multi-class problems. Suppose the discrete random variable $Y$ takes values in the set $\{1,2,\ldots,K\}$. In its standard form, the multinomial logistic regression model is:

$P(Y=k \mid x) = \frac{\exp(w_k \cdot x)}{1+\sum_{j=1}^{K-1}\exp(w_j \cdot x)}, \quad k=1,2,\ldots,K-1$

$P(Y=K \mid x) = \frac{1}{1+\sum_{j=1}^{K-1}\exp(w_j \cdot x)}$

Multinomial logistic regression is also called softmax regression. It generalizes binomial logistic regression to multi-class classification.
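A minimal sketch of the multi-class case follows, assuming the common softmax parameterization with one weight row per class, so that $W$ is a matrix and the scores $Wx$ form a vector. The matrix `W` and the input `x` are hypothetical, and the bias term is left out for brevity.

```python
import math


def class_scores(W, x):
    """Multiply the weight matrix W (one row per class) by the input vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Turn a vector of scores into probabilities that sum to 1."""
    m = max(scores)                             # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 3-class model with 2 input features
W = [[1.0, 0.0],
     [0.0, 1.0],
     [-1.0, -1.0]]
x = [2.0, 1.0]

probs = softmax(class_scores(W, x))             # scores are [2.0, 1.0, -3.0]
predicted = probs.index(max(probs))             # index of the most probable class
print(probs, predicted)
```

As in the binary case, the instance is assigned to the class with the highest probability.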

Gradient descent and loss functions

What is gradient descent?

First, split "gradient descent" into "gradient" + "descent". The gradient can be understood as a derivative (in multiple dimensions, as partial derivatives), so together the term means "derivative descent". So what does derivative descent do? Here is the answer directly: gradient descent is used to find the value of the independent variable at which a function attains its minimum.

The function in this sentence is the loss function (cost/loss function), or more bluntly, the error function.

The loss function is a function whose arguments are the algorithm's parameters and whose value is the error. Gradient descent therefore finds the parameters that minimize the error.
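To make the idea concrete, here is a minimal sketch of gradient descent on a one-dimensional function. The function $f(x) = (x-3)^2$, the learning rate, and the number of steps are all illustrative assumptions. In a real model the variable would be the parameter vector and $f$ the loss function.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # move downhill along the derivative
    return x


# Illustrative function f(x) = (x - 3)^2 with derivative f'(x) = 2 * (x - 3);
# its minimum is at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 at every step, so the result ends up very close to 3.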

One class of machine learning algorithms produces a curve that fits the existing data so that future data can be predicted. We call this regression.

Another class of algorithms also produces a curve, but uses it to divide the points into two groups. We call this classification. In both cases the fitted curve does not coincide exactly with the existing points: there is an error between the fitted curve and the true values. We usually use the value of the loss function to measure this error, so the smaller the value of the loss function, the better the fit.

Put simply: the loss function represents the error between the predicted value and the actual value.
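As a small illustration, the sketch below uses mean squared error, which is one common choice of loss function (the article does not name a specific one). The actual values and the two sets of predictions are made-up numbers.

```python
def mse_loss(predicted, actual):
    """Mean squared error between predicted and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Made-up data: the first set of predictions is close to the actual values,
# the second is far off.
actual = [1.0, 2.0, 3.0]
good_fit = [1.1, 1.9, 3.2]
poor_fit = [2.0, 0.5, 4.5]

print(mse_loss(good_fit, actual))  # small loss: good fit
print(mse_loss(poor_fit, actual))  # large loss: poor fit
```

The closer the predictions are to the actual values, the smaller the loss, and a perfect fit gives a loss of zero.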

Introduction to declarative programming

The more declarative an interface is, the more work the lower layer has to do, or the more powerful the lower layer is. It can also mean a loss of efficiency. The more imperative an interface is, the more control the upper layer has over the lower layer: you can ask the lower layer to handle things in a particular way to meet your specific needs.

Actually, this pair of concepts should be called "declarative interface" and "imperative interface". It is probably because the distinction is mostly discussed for the "language" kind of interface that they are called "declarative programming" and "imperative programming".

Of course, you can also treat it as a programming philosophy: when building your own code, layer it for the sake of structural clarity, and make the interfaces between layers as declarative as possible. That way, one layer of your code naturally describes what you need from a human perspective, while another layer uses computer logic to implement those needs.
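The contrast can be seen even in a tiny example. The sketch below computes the squares of the even numbers in a list twice: once imperatively, spelling out each step, and once in a more declarative style that only describes the desired result. The task and the data are arbitrary illustrations.

```python
numbers = [3, 8, 1, 6, 4]

# Imperative style: tell the computer how to do it, step by step
evens_squared_imperative = []
for n in numbers:
    if n % 2 == 0:
        evens_squared_imperative.append(n * n)

# Declarative style: describe what is wanted and let the language handle the loop
evens_squared_declarative = [n * n for n in numbers if n % 2 == 0]

print(evens_squared_imperative)   # [64, 36, 16]
print(evens_squared_declarative)  # [64, 36, 16]
```

Both versions give the same result. The declarative one leaves more of the work to the layer below, which is the trade-off described above.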

Data processing and model building

TensorFlow is a Python neural network framework commonly used in deep learning. It is an open source software library for numerical computation using data flow graphs.

TensorFlow was originally developed by researchers and engineers of the Google Brain team (part of Google's Machine Intelligence research organization) for machine learning and deep neural network research, but the system is general enough to be widely used in other computing fields.

It is a second-generation AI learning system developed on the basis of DistBelief. On November 9, 2015, Google released the TensorFlow artificial intelligence system and announced that it was open source.

Installing TensorFlow with Anaconda

Anaconda official website: https://www.anaconda.com/

Go to the Anaconda website and download the Anaconda installer for your system. At the time of writing, the version on the website is "For Windows Python 3.9 • 64-Bit Graphical Installer • 594 MB".
Install it like ordinary software and accept the defaults, but make sure to tick the check box that adds Python 3.9 to the environment variables.

Anaconda configuration: open cmd and switch to a domestic (China) mirror source

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/

TensorFlow installation: version 1.15 is recommended. To install a different version, just change the version number in the tensorflow install command.

Open cmd and first create the environment that TensorFlow 1.15 needs (run python -V on the command line to check the Python version). TensorFlow 1.15 is only packaged for Python 3.7 and earlier, so the environment uses Python 3.7 rather than 3.9:

conda create -n tensorflow pip python=3.7

Then activate the TensorFlow environment:

activate tensorflow

Copyright notice
This article was written by [Wake wrist]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/162/202206112300319361.html
