Paper reading [Discriminative Latent Semantic Graph for Video Captioning]
2022-07-01 19:31:00 【hei_ hei_ hei_】
Discriminative Latent Semantic Graph for Video Captioning
Summary
- Published: ACM Multimedia 2021
- Code: D-LSG
- Idea: to strengthen object-level interactions and frame-level information (in practice, this is post-processing of the commonly used 2D-CNN, 3D-CNN, and R-CNN features), the authors split the model into three main parts. Enhanced Object Proposal: uses a graph to integrate spatial and temporal features into latent objects; Visual Knowledge: aggregates the above features into latent nodes, which are used to predict semantic words; Sentence Validation: uses a GAN to discriminate the reconstructed visual features.
Detailed design
- Core design: the way features are fused / aggregated (shown in the figure)

P.S. This feels a bit like attention.
1. Multiple Feature Extraction
- Standard practice: a 2D-CNN extracts appearance (frame-level) features $V^a$, a 3D-CNN extracts motion features $V^m$, and an R-CNN extracts region (object) features $R$.
2. Enhanced Object Proposal
- Aggregate the region features into the motion features and the appearance features. A GNN is used, with each region feature treated as a node.

A loose reading of the formula: $v^a$ is connected by an edge to every region feature, so it aggregates the features of all regions.
Here $\Psi$ and $\Phi$ are both linear functions followed by a Tanh activation. $\hat v_t^m$ is computed in a similar way.
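The formula itself was an image and is not reproduced in this text, so the following is only a minimal sketch of the reading given above, not the authors' implementation. It assumes softmax-normalized edge weights between each frame feature and every region feature, with $\Psi$ and $\Phi$ modeled as bias-free linear maps followed by Tanh. The function name `enhance_frames`, the weight matrices, and all shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def enhance_frames(V, R, W_psi, W_phi):
    """V: (T, d) frame features, R: (N, d) region features.

    Assumption: Psi and Phi are linear maps (bias omitted) followed by tanh,
    and edge weights are a softmax over all regions for each frame.
    """
    q = np.tanh(V @ W_psi)          # Psi(v_t), shape (T, h)
    k = np.tanh(R @ W_phi)          # Phi(r_i), shape (N, h)
    A = softmax(q @ k.T, axis=1)    # (T, N): one edge per frame/region pair
    return A @ R, A                 # each frame aggregates all region features

rng = np.random.default_rng(0)
T, N, d, h = 4, 6, 8, 5             # toy sizes, chosen only for illustration
V = rng.normal(size=(T, d))
R = rng.normal(size=(N, d))
W_psi = rng.normal(size=(d, h))
W_phi = rng.normal(size=(d, h))

V_hat, A = enhance_frames(V, R, W_psi, W_phi)
print(V_hat.shape, A.shape)
```

Because every frame has a weighted edge to every region, each row of `A` sums to 1, which is what makes the operation resemble attention.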
3. Visual Knowledge
- The main step is to introduce new nodes (latent nodes) into the graph. They aggregate the information above to generate K candidate object visual words and K motion visual words (computed in a similar way).
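As a rough illustration of how a fixed set of latent nodes can summarize a variable number of input features, here is a sketch under stated assumptions. The latent nodes `Q` are random here but would be learned in practice, and the scaled dot-product weighting is a choice made for this example rather than something taken from the paper. `latent_words` is a hypothetical name.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_words(X, Q):
    """X: (M, d) enhanced features, Q: (K, d) latent nodes.

    Each latent node takes a softmax-weighted average of all input features,
    so the output always has K rows regardless of M.
    """
    scores = Q @ X.T / np.sqrt(X.shape[1])   # (K, M), scaling is an assumption
    A = softmax(scores, axis=1)
    return A @ X                              # (K, d) candidate visual words

rng = np.random.default_rng(0)
K, d = 3, 8
Q = rng.normal(size=(K, d))                   # stand-in for learned parameters

for M in (5, 20):                             # different numbers of input features
    X = rng.normal(size=(M, d))
    P = latent_words(X, Q)
    print(M, P.shape)
```

The point of the design is that the number of visual words is fixed by K, not by how many frames or regions a video happens to have.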

4. Discriminative Language Validation
- The goal is to give the generated caption better semantic information (semantic concepts). The authors reconstruct $P^o$ and $P^m$ from the generated caption, then use a discriminator to distinguish the reconstructed visual features $\hat P^o, \hat P^m$ from the real features $P^o, P^m$.
- Concretely, the generated caption is passed through several 1D CNN + residual layers to obtain the sentence feature $S$, and then $P^o$ "aggregates" the features of $S$.

- The generated visual features $\hat P^o$ and the real visual features $P^o$ are scored; they are treated as a pair, similar to computing their similarity.


- The output score of the discriminator (it learns to give low scores to generated features and high scores to real features)

- Discriminator loss (the last term is a regularization term)

- Generator loss
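The loss formulas were images and are not reproduced in this text. The sketch below therefore assumes a Wasserstein-style objective with a gradient-penalty regularizer, which matches the description "low scores for generated, high scores for real, plus a regularization term" but may differ from the paper's exact form. The critic is a toy linear scorer and ignores the sentence feature $S$; all names and the weight `lam` are hypothetical.

```python
import numpy as np

def critic(x, w):
    # Toy linear critic: one scalar score per row of x.
    return x @ w

def critic_loss(real, fake, w, lam=10.0):
    # Wasserstein-style term: lower the fake scores, raise the real scores.
    wasserstein = critic(fake, w).mean() - critic(real, w).mean()
    # For a linear critic the gradient with respect to the input is w itself,
    # so a gradient penalty reduces to (||w|| - 1)^2.
    penalty = (np.linalg.norm(w) - 1.0) ** 2
    return wasserstein + lam * penalty

def generator_loss(fake, w):
    # The generator is rewarded when the critic scores its features highly.
    return -critic(fake, w).mean()

rng = np.random.default_rng(0)
n, d = 16, 8
real = rng.normal(loc=1.0, size=(n, d))   # stand-in for real features P
fake = rng.normal(loc=0.0, size=(n, d))   # stand-in for reconstructed features
w = np.ones(d) / np.sqrt(d)               # unit-norm weights, so the penalty is ~0

print(critic_loss(real, fake, w), generator_loss(fake, w))
```

With a unit-norm `w` the penalty vanishes and the critic loss is negative whenever real features score higher than generated ones on average, which is the behavior the discriminator is trained toward.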

Code