Supplementary Notes on Auto-Encoders
2022-07-04 12:24:00 【hello_ JeremyWang】
1. Asking More of the Auto-Encoder
In "PyTorch in Practice: Image Dimensionality Reduction and Clustering", I briefly introduced how an Auto-Encoder works. For the simplest Auto-Encoder, the only requirement is to minimize the reconstruction loss: the restored image or article should be as close as possible to the original.
Beyond that, can we ask more of an Auto-Encoder? The answer is yes. Let's take a look:
- Do more than just reduce the reconstruction loss
- Obtain an embedding that is easier to interpret
1.1 Requirement 1
Requirement 1 asks not only for a low reconstruction loss but also that the embedding we obtain be representative of the original image or text (much as the Sharingan represents the Uchiha clan). How can we get the machine to do this?
As the slide below shows, we build an additional classifier, a Discriminator, to measure how well an embedding matches its original image. Concretely, an Encoder with parameters $\theta$ compresses the image, and the resulting embedding is fed to the Discriminator together with an image; the Discriminator then decides whether the two match. For each $\theta$, we tune the Discriminator's parameters $\phi$ to make its training error as small as possible, and we call this minimum error $L_D^{*}$. Finally, we adjust $\theta$ so that $L_D^{*}$ is as small as possible.
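The procedure above can be summarized as a nested minimization. In this sketch, $L_D(\theta, \phi)$ is an assumed name for the Discriminator's training error when the Encoder has parameters $\theta$ and the Discriminator has parameters $\phi$; the text itself only names the minimum value $L_D^{*}$.

$$
L_D^{*}(\theta) = \min_{\phi} L_D(\theta, \phi), \qquad
\theta^{*} = \arg\min_{\theta} L_D^{*}(\theta) = \arg\min_{\theta} \min_{\phi} L_D(\theta, \phi)
$$

Read this way, the Encoder and the Discriminator cooperate: a small $L_D^{*}$ means the Discriminator can easily tell which embedding belongs to which image, which is taken as a sign that the embedding is representative.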
1.2 Requirement 2
Requirement 2 asks for an embedding that is easier to interpret. The embedding we usually get looks like an unstructured jumble, like the picture in the upper right corner of the slide below. We want to know what information each part of the embedding carries. For example, in speech training, the embedding may contain both information about the speaker (such as pronunciation habits) and information about the spoken content itself, and we would like to separate the two.
How is this done? A simple and natural idea is to train two Encoders: one dedicated to extracting the content of the speech, the other to extracting information about the speaker. What is this good for? For example, we can combine one speaker's information with the content of another speaker's utterance to achieve voice conversion.
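The voice-swapping idea can be sketched in PyTorch as follows. This is a minimal illustration of the data flow only, not the author's model: the single linear layers, the feature sizes, and the names `content_encoder`, `speaker_encoder`, `decoder`, and `convert_voice` are all assumptions made for the example, and the networks here are untrained.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
FEAT_DIM, CONTENT_DIM, SPEAKER_DIM = 80, 32, 16

# Two separate encoders and one decoder (single linear layers as stand-ins).
content_encoder = nn.Linear(FEAT_DIM, CONTENT_DIM)   # "what was said"
speaker_encoder = nn.Linear(FEAT_DIM, SPEAKER_DIM)   # "who said it"
decoder = nn.Linear(CONTENT_DIM + SPEAKER_DIM, FEAT_DIM)


def convert_voice(source, target):
    """Decode the content of `source` using the speaker embedding of `target`."""
    content = content_encoder(source)
    speaker = speaker_encoder(target)
    # Concatenate the two partial embeddings before decoding.
    return decoder(torch.cat([content, speaker], dim=-1))


# Random tensors stand in for acoustic features of two different speakers.
source_utterance = torch.randn(4, FEAT_DIM)
target_utterance = torch.randn(4, FEAT_DIM)
converted = convert_voice(source_utterance, target_utterance)
```

During ordinary training both encoders would see the same utterance and the decoder would reconstruct it; the swap is applied only at conversion time.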
How do we train such Encoders? One approach is adversarial training. As before, we create a binary Discriminator whose job is to take the part of the embedding that is supposed to represent the content and decide who said it. If the Encoder can fool the Discriminator so that it cannot tell who the speaker is, then that part of the embedding no longer contains speaker information.
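One possible form of this adversarial training is sketched below in PyTorch. It assumes two speakers, single linear layers, Adam optimizers, and a cross-entropy loss, none of which are specified in the text; the name `adversarial_step` is hypothetical. A full system would also add the reconstruction loss to the Encoder's objective, which is omitted here for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
FEAT_DIM, CONTENT_DIM, NUM_SPEAKERS = 80, 32, 2  # hypothetical sizes

content_encoder = nn.Linear(FEAT_DIM, CONTENT_DIM)
discriminator = nn.Linear(CONTENT_DIM, NUM_SPEAKERS)  # guesses the speaker
enc_opt = torch.optim.Adam(content_encoder.parameters(), lr=1e-3)
disc_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
cross_entropy = nn.CrossEntropyLoss()


def adversarial_step(features, speaker_ids):
    """Run one discriminator update followed by one encoder update."""
    # 1) Discriminator: learn to identify the speaker from the content embedding.
    #    detach() keeps this update from changing the encoder.
    embedding = content_encoder(features).detach()
    disc_loss = cross_entropy(discriminator(embedding), speaker_ids)
    disc_opt.zero_grad()
    disc_loss.backward()
    disc_opt.step()

    # 2) Encoder: fool the discriminator by maximizing its loss
    #    (implemented as minimizing the negated loss).
    logits = discriminator(content_encoder(features))
    enc_loss = -cross_entropy(logits, speaker_ids)
    enc_opt.zero_grad()
    enc_loss.backward()
    enc_opt.step()  # only the encoder's parameters are updated here
    return disc_loss.item(), enc_loss.item()
```

Negating the loss is the simplest way to express "fool the Discriminator", but it is unbounded; pushing the Discriminator's output toward a uniform guess is a common alternative design.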