Paper reading: Discriminative Latent Semantic Graph for Video Captioning
2022-07-01 19:31:00 【hei_ hei_ hei_】
Discriminative Latent Semantic Graph for Video Captioning
Summary
- Published: ACM Multimedia 2021
- Code: D-LSG
- Idea: To strengthen object-level interactions and frame-level information (in effect, post-processing the commonly used 2D-CNN, 3D-CNN, and R-CNN features), the authors split the model into three main parts. Enhanced Object Proposal: a graph integrates spatial and temporal features into latent object proposals. Visual Knowledge: latent nodes aggregate the features above and are used to predict semantic words. Sentence Validation: a GAN-style discriminator judges the visual features reconstructed from the caption.
Detailed design
- Core design: how the features are fused and aggregated (shown in the figure)
P.S. It has a slight flavor of attention.
1. Multiple Feature Extraction
- Standard practice: a 2D-CNN extracts appearance (frame-level) features $V^a$, a 3D-CNN extracts motion features $V^m$, and an R-CNN extracts region (object) features $R$.
2. Enhanced Object Proposal
- Aggregate the region features into the motion and appearance features. A GNN is used, with each region feature as a node.
A rough reading of the formula: $v^a$ is connected by an edge to every region feature, so it aggregates information from all of them.
Here $\Psi$ and $\Phi$ are each a linear function followed by a Tanh activation. $\hat v_t^m$ is computed in the same way.
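The aggregation described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's formula: the dot-product edge score, the softmax normalization, the residual addition, and all names (`linear_tanh`, `enhance_frame_feature`, `psi`, `phi`) are hypothetical. Only the "linear function followed by Tanh" form of $\Psi$ and $\Phi$, and the fact that one frame feature is connected to every region feature, come from the notes.

```python
import math
import random


def linear_tanh(weight, bias, x):
    """Compute tanh(W x + b); weight is a list of rows, bias a list."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weight, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_frame_feature(frame, regions, psi, phi):
    """Aggregate every region feature into one frame feature.

    psi and phi are (weight, bias) pairs. The dot-product edge score and
    the residual sum are assumptions made for this sketch.
    """
    query = linear_tanh(*psi, frame)
    keys = [linear_tanh(*phi, r) for r in regions]
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)  # one edge weight per region node
    aggregated = [sum(w * r[i] for w, r in zip(weights, regions))
                  for i in range(len(frame))]
    enhanced = [f + a for f, a in zip(frame, aggregated)]
    return enhanced, weights


def random_layer(out_dim, in_dim):
    """Hypothetical random initialisation, for the demo only."""
    weight = [[random.uniform(-1, 1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim


random.seed(0)
dim, hidden, num_regions = 4, 3, 5
psi, phi = random_layer(hidden, dim), random_layer(hidden, dim)
frame = [random.uniform(-1, 1) for _ in range(dim)]
regions = [[random.uniform(-1, 1) for _ in range(dim)]
           for _ in range(num_regions)]
enhanced, weights = enhance_frame_feature(frame, regions, psi, phi)
print(len(enhanced), round(sum(weights), 6))
```

Because every region contributes with a normalized weight, the result reads like single-query attention, which matches the "flavor of attention" remark earlier in these notes.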
3. Visual Knowledge
- The main step is adding new nodes (latent nodes) to the graph. They aggregate the information above to generate $K$ candidate object visual words and $K$ motion visual words (computed in a similar way).
4. Discriminative Language Validation
- To give the generated caption better semantic information (semantic concepts), the authors reconstruct $P^o$ and $P^m$ from the generated caption, then use a discriminator to distinguish the reconstructed visual features $\hat P^o, \hat P^m$ from the real features $P^o, P^m$.
- In practice, the generated caption is passed through several 1D-CNN + residual layers to obtain a sentence feature $S$, and $P^o$ then "aggregates" the features of $S$.
- The generated visual features $\hat P^o$ and the real visual features $P^o$ are scored as a pair, similar to computing their similarity.
- Output score of the discriminator (it learns to give low scores to generated features and high scores to real features).
- Discriminator loss (the second term is a regularization term).
- Generator loss.
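The two objectives listed above can be sketched as follows. The notes only say that the discriminator should score real features high and generated features low, with a regularization term added, so the Wasserstein-style form used here is an assumption rather than the paper's exact loss. The names `discriminator_loss`, `generator_loss`, `penalty`, and `lam` are hypothetical, and the regularizer is passed in as a precomputed number because computing something like a gradient penalty would require an autodiff framework.

```python
def discriminator_loss(real_scores, fake_scores, penalty=0.0, lam=10.0):
    """Assumed Wasserstein-style critic loss.

    Minimising it pushes scores of real features up and scores of
    reconstructed (generated) features down. `penalty` stands in for the
    regularization term and `lam` for its weight; both are assumptions.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real + lam * penalty


def generator_loss(fake_scores):
    """The generator tries to raise the critic's score on its own outputs."""
    return -sum(fake_scores) / len(fake_scores)


# Toy scores: the critic already prefers the real features here.
real = [0.9, 0.8]
fake = [0.1, 0.3]
print(discriminator_loss(real, fake), generator_loss(fake))
```

In a real training loop the two losses would be minimised alternately, with the scores coming from the discriminator applied to ($\hat P^o$, $P^o$) and ($\hat P^m$, $P^m$) pairs.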
Code