Paper reading: Discriminative Latent Semantic Graph for Video Captioning
2022-07-01 19:31:00 【hei_ hei_ hei_】
Discriminative Latent Semantic Graph for Video Captioning
Summary
- Published: ACM Multimedia 2021
- Code: D-LSG
- Idea: to strengthen object-level interactions and frame-level information (in practice, this is post-processing of the commonly used 2D-CNN, 3D-CNN, and R-CNN features), the authors split the model into three main parts. Enhanced Object Proposal: a graph integrates spatio-temporal features into latent object proposals. Visual Knowledge: the features above are aggregated into latent nodes and used to predict semantic words. Sentence Validation: a GAN-style model discriminates the reconstructed visual features.
Detailed design
- Core design: the way features are fused and aggregated (shown in the figure)

P.S. It feels a bit like attention.
1. Multiple Feature Extraction
- Standard practice: a 2D-CNN extracts appearance (frame-level) features $V^a$, a 3D-CNN extracts motion features $V^m$, and an R-CNN extracts region (object) features $R$.
2. Enhanced Object Proposal
- Aggregate the region features into the motion features and the appearance features, using a GNN in which each region feature is a node.

A loose reading of the formula: $v^a$ has an edge to every region feature, so it aggregates information from all of them.
Here $\Psi$ and $\Phi$ are both linear functions followed by a Tanh activation. $\hat v_t^m$ is computed in the same way.
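The aggregation described above can be sketched in NumPy. The post's formula image is not reproduced here, so only two things come from the text: $\Psi$ and $\Phi$ are linear maps followed by Tanh, and each frame feature connects to every region feature. The dot-product affinity, the softmax normalisation, the residual sum, and all function and variable names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def enhance_with_regions(v, R, W_psi, b_psi, W_phi, b_phi):
    """Aggregate region features R (N, d) into frame features v (T, d).

    Psi and Phi are linear maps followed by tanh, as the post states.
    The affinity, normalisation and residual sum are assumptions.
    """
    q = np.tanh(v @ W_psi + b_psi)   # (T, h): Psi(v_t)
    k = np.tanh(R @ W_phi + b_phi)   # (N, h): Phi(r_j)
    A = softmax(q @ k.T, axis=-1)    # (T, N): edge weights, each row sums to 1
    return v + A @ R, A              # (T, d): enhanced frame features

rng = np.random.default_rng(0)
T, N, d, h = 4, 6, 8, 5              # frames, regions, feature dim, hidden dim
V_a = rng.normal(size=(T, d))        # stand-in appearance features
R = rng.normal(size=(N, d))          # stand-in region features
W_psi, b_psi = rng.normal(size=(d, h)), np.zeros(h)
W_phi, b_phi = rng.normal(size=(d, h)), np.zeros(h)

V_a_hat, A = enhance_with_regions(V_a, R, W_psi, b_psi, W_phi, b_phi)
print(V_a_hat.shape, A.sum(axis=1))
```

Calling the same function with motion features in place of `V_a` would give the motion counterpart, which matches the note that $\hat v_t^m$ is computed in the same way.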
3. Visual Knowledge
- The main step is to introduce new nodes (latent nodes) into the graph. They aggregate the information above to generate $K$ candidate object visual words and $K$ motion visual words (computed in the same way).
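As a rough illustration of how $K$ latent nodes might pool the enhanced features into $K$ visual words, here is a minimal NumPy sketch. The post does not give the formula, so the scaled dot-product weighting is an assumption, and `latent_visual_words` and the random latent embeddings (which would be learned in a real model) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_visual_words(latent, feats):
    """Pool feats (M, d) into K visual words using latent nodes (K, d).

    Each latent node weights every input feature and takes the weighted
    average. The scaled dot-product weighting is an assumption.
    """
    scores = latent @ feats.T / np.sqrt(feats.shape[1])  # (K, M)
    A = softmax(scores, axis=-1)                         # rows sum to 1
    return A @ feats                                     # (K, d)

rng = np.random.default_rng(0)
K, M, d = 3, 10, 8
latent_nodes = rng.normal(size=(K, d))    # learnable in practice, random here
object_feats = rng.normal(size=(M, d))    # stand-in enhanced object features
motion_feats = rng.normal(size=(M, d))    # stand-in enhanced motion features

P_o = latent_visual_words(latent_nodes, object_feats)  # K object visual words
P_m = latent_visual_words(latent_nodes, motion_feats)  # K motion visual words
print(P_o.shape, P_m.shape)
```

Whether the object and motion branches share latent nodes is not stated in the post; sharing them here only keeps the sketch short.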

4. Discriminative Language Validation
- The goal is for the generated caption to carry better semantic information (semantic concepts). The authors use the generated caption to reconstruct $P^o$ and $P^m$, and a discriminator then distinguishes the reconstructed visual features $\hat P^o, \hat P^m$ from the real features $P^o, P^m$.
- In the implementation, the generated caption passes through several 1D CNN + residual layers to produce the sentence feature $S$, and $P^o$ then "aggregates" the features of $S$.

- The generated visual features $\hat P^o$ and the real visual features $P^o$ are scored as a pair, which is similar to computing their similarity.


- The output score of the discriminator (it learns to give low scores to generated features and high scores to real features)

- The discriminator loss (the last term is a regularization term)

- The generator loss
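The validation steps listed above can be put together in a small NumPy sketch. The formula images are not reproduced in this post, so several choices are assumptions: the attention-style reconstruction of $\hat P^o$ from $S$, the linear mean-pooling critic, the Wasserstein-style form of both losses, and the squared-weight-norm stand-in for the regularization term. All names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reconstruct_from_sentence(P, S):
    """Let visual words P (K, d) aggregate sentence features S (L, d)."""
    A = softmax(P @ S.T / np.sqrt(P.shape[1]), axis=-1)  # (K, L)
    return A @ S                                         # (K, d): P_hat

def critic(P, w):
    """Hypothetical linear critic: mean-pool the K words, project to a scalar."""
    return float(P.mean(axis=0) @ w)

def discriminator_loss(real_score, fake_score, w, lam=0.1):
    # Low scores for generated features, high scores for real ones.
    # The squared weight norm is a stand-in regulariser (assumption).
    return fake_score - real_score + lam * float(w @ w)

def generator_loss(fake_score):
    # The generator is rewarded when the critic scores its features highly.
    return -fake_score

rng = np.random.default_rng(0)
K, L, d = 3, 7, 8
P_o = rng.normal(size=(K, d))   # stand-in real object visual words
S = rng.normal(size=(L, d))     # stand-in sentence features from the caption
w = rng.normal(size=d)          # critic weights, learned in practice

P_o_hat = reconstruct_from_sentence(P_o, S)
real, fake = critic(P_o, w), critic(P_o_hat, w)
loss_d = discriminator_loss(real, fake, w)
loss_g = generator_loss(fake)
print(P_o_hat.shape, loss_d, loss_g)
```

The motion branch ($P^m$, $\hat P^m$) would follow the same pattern. The post describes scoring $\hat P^o$ and $P^o$ as a pair, which the independent critic calls here simplify.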

Code