[CV] Andrew Ng machine learning course notes | Chapter 8

2022-07-07 07:49:00 【Fannnnf】

Unless otherwise noted, each passage of text in this series explains the figure directly above it.
Machine Learning | Coursera
Andrew Ng machine learning series | bilibili

8 Representation of neural networks

8-1 Nonlinear hypothesis

For an image, if the gray value of every pixel (or some other per-pixel representation) is used as a feature, the feature vector becomes very large. Applying the earlier regression algorithms to it, especially with nonlinear combinations of those features, incurs a very high computational cost.
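
To get a feel for the scale, the feature count can be worked out directly. The sketch below assumes a hypothetical 50 x 50 grayscale image and a model that adds every pairwise product $x_i x_j$ (squares included) as a feature. Both the image size and the choice of quadratic terms are illustrative assumptions, not figures taken from these notes, and `quadratic_feature_count` is a made-up helper name.

```python
def quadratic_feature_count(n):
    """Number of distinct products x_i * x_j with i <= j among n features."""
    return n * (n + 1) // 2

# Hypothetical 50 x 50 grayscale image: one feature per pixel.
n_pixels = 50 * 50
n_quadratic = quadratic_feature_count(n_pixels)

print(n_pixels)     # 2500
print(n_quadratic)  # 3126250
```

Even at this small resolution the quadratic terms alone run into the millions, which is the motivation for a different kind of nonlinear hypothesis.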

8-2 Neurons and the brain

8-3 Forward propagation - Model representation I

[Figure]

The figure above shows an artificial neuron with a sigmoid activation function. In neural-network terminology, $g(z)=\frac{1}{1+e^{-z}}$ with $z=\theta^Tx$ is called the activation function.
A neural network is a group of neurons connected together. The first layer (Layer 1) is called the input layer, the second layer (Layer 2) is called the hidden layer, and the third layer (Layer 3) is called the output layer.
Use $a_i^{(j)}$ to denote the activation of unit $i$ in layer $j$. The activation is the value computed and output by a specific neuron.
Use $\Theta^{(j)}$ to denote the weight matrix (parameter matrix) that maps layer $j$ to layer $j+1$. It plays the role of the earlier $\theta$, which can be called either parameters or weights.
The formulas for $a_1^{(2)}$, $a_2^{(2)}$ and $a_3^{(2)}$ are written in the figure above.
Here $\Theta^{(1)}$ is a $3 \times 4$ matrix.
If the network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ is an $s_{j+1} \times (s_j+1)$ matrix.
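
The notation above can be sketched in a few lines of plain Python. The names `sigmoid`, `neuron_activation`, and `theta_shape` are made up for this illustration and the weights are arbitrary. It is a minimal sketch under those assumptions, not the course's reference code.

```python
import math

def sigmoid(z):
    """Sigmoid activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_activation(theta, x):
    """Activation of one neuron: g(theta^T x), where x[0] is the bias unit (1)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

def theta_shape(s_j, s_next):
    """Shape of Theta^(j): s_{j+1} rows, s_j + 1 columns (the +1 is the bias)."""
    return (s_next, s_j + 1)

# Arbitrary example: 3 inputs plus the bias unit x_0 = 1.
x = [1.0, 0.5, -1.0, 2.0]
theta = [0.0, 1.0, 1.0, 0.25]       # z = 0 + 0.5 - 1.0 + 0.5 = 0
print(neuron_activation(theta, x))  # 0.5, since g(0) = 0.5
print(theta_shape(3, 3))            # (3, 4)
```

The extra column in `theta_shape` comes from the bias unit, which is why a layer with three inputs and three hidden units needs a 3 x 4 matrix.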

8-4 Forward propagation - Model representation II

Vectorizing forward propagation:
[Figure]

Write $\Theta^{(1)}_{10}x_0+\Theta^{(1)}_{11}x_1+\Theta^{(1)}_{12}x_2+\Theta^{(1)}_{13}x_3$ as $z_1^{(2)}$,
so that $a_1^{(2)}=g(z_1^{(2)})$.
Extending this to the whole layer, the activations of the second layer are $a^{(2)}=g(z^{(2)})$, where $z^{(2)}=\Theta^{(1)}a^{(1)}$ and $a^{(1)}=x$. In addition, a bias unit $a^{(2)}_0=1$ must be added before the next layer is computed.
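
The layer-by-layer computation can be sketched as follows. This is a dependency-free illustration using plain Python lists (a numerical library would normally be used for the matrix product). The helper names `matvec` and `forward` and all weight values are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(theta, a):
    """Matrix-vector product: one row of theta per unit in the next layer."""
    return [sum(w * v for w, v in zip(row, a)) for row in theta]

def forward(thetas, x):
    """Forward propagation through every layer; returns the output activations."""
    a = list(x)
    for theta in thetas:
        a = [1.0] + a                  # add the bias unit a_0 = 1
        z = matvec(theta, a)           # z^(j+1) = Theta^(j) a^(j)
        a = [sigmoid(zi) for zi in z]  # a^(j+1) = g(z^(j+1))
    return a

# Arbitrary weights: 3 inputs -> 3 hidden units -> 1 output.
theta1 = [[0.1, 0.2, -0.3, 0.4],
          [0.0, -0.5, 0.5, 0.1],
          [0.3, 0.1, 0.1, -0.2]]       # 3 x 4
theta2 = [[-0.2, 0.6, -0.4, 0.3]]      # 1 x 4
h = forward([theta1, theta2], [1.0, 2.0, 3.0])
print(len(h), 0.0 < h[0] < 1.0)        # 1 True
```

Each pass through the loop is one application of $z^{(j+1)}=\Theta^{(j)}a^{(j)}$ followed by $g$, with the bias unit prepended first.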

8-5 Examples and understanding I

8-6 Examples and understanding II

[Figure]
The figure above shows a neural network that computes $x_1$ XNOR $x_2$.
From the first layer to the second, it computes $x_1$ AND $x_2$ to obtain $a_1^{(2)}$, and (NOT $x_1$) AND (NOT $x_2$) to obtain $a_2^{(2)}$.
Then, using $a_1^{(2)}$ and $a_2^{(2)}$ as inputs in place of $x_1$ and $x_2$, it computes $a_1^{(2)}$ OR $a_2^{(2)}$. The result is $x_1$ XNOR $x_2$.
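
A small sketch can check this construction against the XNOR truth table. The weight values below are assumptions chosen so that each sigmoid unit saturates near 0 or 1. They are not taken from these notes, and the names `unit` and `xnor` are made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit(weights, inputs):
    """One sigmoid unit; weights[0] multiplies the bias unit (1)."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], inputs))
    return sigmoid(z)

# Assumed weights: large magnitudes push the sigmoid close to 0 or 1.
W_AND = [-30.0, 20.0, 20.0]   # x1 AND x2
W_NOR = [10.0, -20.0, -20.0]  # (NOT x1) AND (NOT x2)
W_OR = [-10.0, 20.0, 20.0]    # a1 OR a2

def xnor(x1, x2):
    a1 = unit(W_AND, [x1, x2])  # hidden unit a_1^(2)
    a2 = unit(W_NOR, [x1, x2])  # hidden unit a_2^(2)
    h = unit(W_OR, [a1, a2])    # output unit
    return round(h)             # threshold at 0.5

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xnor(x1, x2))
```

The output is 1 exactly when the two inputs are equal, which a single sigmoid unit cannot represent on its own. The hidden layer is what makes it possible.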

8-7 Multiclass classification

[Figure]
There are four classes: pedestrian, car, motorcycle, and truck,
so the network has four output units.
Each label $y^{(i)}$ is a 4-dimensional vector, one of:
$\begin{bmatrix} 1\\ 0\\ 0\\ 0 \end{bmatrix} \text{ or } \begin{bmatrix} 0\\ 1\\ 0\\ 0 \end{bmatrix} \text{ or } \begin{bmatrix} 0\\ 0\\ 1\\ 0 \end{bmatrix} \text{ or } \begin{bmatrix} 0\\ 0\\ 0\\ 1 \end{bmatrix}$
representing pedestrian, car, motorcycle, and truck respectively.
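
This label encoding can be sketched as below. The helper names `one_hot` and `predict` are hypothetical. Picking the output unit with the largest activation is one common way to read off the predicted class and is an assumption here, not something these notes specify.

```python
CLASSES = ["pedestrian", "car", "motorcycle", "truck"]

def one_hot(label):
    """Encode a class name as a 4-dimensional 0/1 vector."""
    return [1 if name == label else 0 for name in CLASSES]

def predict(h):
    """Pick the class whose output unit has the largest activation."""
    best = max(range(len(h)), key=lambda k: h[k])
    return CLASSES[best]

print(one_hot("motorcycle"))            # [0, 0, 1, 0]
print(predict([0.1, 0.7, 0.15, 0.05]))  # car
```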