Paper reading: [Discriminative Latent Semantic Graph for Video Captioning]
2022-07-01 18:44:00 【hei_hei_hei_】
Discriminative Latent Semantic Graph for Video Captioning
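Overview
- Published: ACM MultiMedia 2021
- Code: D-LSG
- Idea: to strengthen object-level interactions and frame-level information (in practice, to make better use of the commonly used pre-extracted features: 2D-CNN, 3D-CNN, R-CNN), the authors split the work into three parts. Enhanced Object Proposal: uses a graph to fuse spatio-temporal features into latent objects. Visual Knowledge: aggregates those features into latent nodes, which are used to predict semantic words. Sentence Validation: uses a GAN to discriminate the reconstructed visual features.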
Detailed design
- Core design: how features are fused/aggregated (inside the graph)

PS: it feels a bit like attention
1. Multiple Feature Extraction
- Standard preprocessing: a 2D-CNN extracts the appearance (frame-level) features $V^a$, a 3D-CNN extracts the motion features $V^m$, and an R-CNN extracts the region (object) features $R$.
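2. Enhanced Object Proposal
- The region features are aggregated into the motion features and into the appearance features separately. A GNN is used, with every region feature treated as a node.

Reading it straight off the formula: $v^a$ has an edge to every region feature, so it aggregates all of the region features.
Here $\Psi$ and $\Phi$ are each a linear function followed by a Tanh activation. $\hat v_t^m$ is computed in the same way.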
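
The aggregation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the note does not reproduce the paper's formula, so the softmax-over-dot-products edge weights, the function name `aggregate_regions`, and all toy sizes are hypothetical. Only the "linear layer followed by Tanh" form of $\Psi$ and $\Phi$ comes from the text.

```python
import math
import random


def linear_tanh(x, weight):
    """Linear map followed by tanh, the form the note gives for Psi and Phi."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weight]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_regions(v, regions, w_psi, w_phi):
    """Pool every region feature into one message for the frame feature v.

    ASSUMPTION: the edge weight between v and region r is a softmax over
    dot(Psi(v), Phi(r)); the note does not give the exact formula.
    """
    query = linear_tanh(v, w_psi)
    keys = [linear_tanh(r, w_phi) for r in regions]
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    region_dim = len(regions[0])
    message = [sum(w * r[d] for w, r in zip(weights, regions))
               for d in range(region_dim)]
    return message, weights


def random_matrix(rows, cols, rng):
    """Toy stand-in for learned weights or extracted features."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
frame_dim, region_dim, hidden_dim, num_regions = 4, 6, 3, 5  # toy sizes
v_a = [rng.uniform(-1.0, 1.0) for _ in range(frame_dim)]  # one appearance feature
regions = random_matrix(num_regions, region_dim, rng)      # region features R
w_psi = random_matrix(hidden_dim, frame_dim, rng)          # weights of Psi
w_phi = random_matrix(hidden_dim, region_dim, rng)         # weights of Phi

message, weights = aggregate_regions(v_a, regions, w_psi, w_phi)
print(len(weights), round(sum(weights), 6), len(message))
```

Because every region contributes with a positive weight, each frame feature "sees" all objects, which matches the attention-like impression noted earlier. How the pooled message is merged back into $v^a$ is not stated in the note, so the sketch returns it separately.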
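3. Visual Knowledge
- The main step is to introduce new nodes (latent nodes) into the graph. They aggregate the information above to produce K candidate object visual words and K motion visual words (computed analogously).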
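
As a rough illustration of how K latent nodes could each pool the enhanced features into one visual word, here is a minimal sketch. The dot-product scoring, the name `latent_visual_words`, and the toy sizes are assumptions made for the example; the note only says that the latent nodes aggregate the features.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def latent_visual_words(latent_nodes, features):
    """Let each latent node pool all input features with softmax weights.

    ASSUMPTION: dot-product scores between a latent node and each feature;
    the note does not specify the scoring function.
    """
    words = []
    for node in latent_nodes:
        scores = [sum(n * f for n, f in zip(node, feat)) for feat in features]
        weights = softmax(scores)
        words.append([sum(w * feat[d] for w, feat in zip(weights, features))
                      for d in range(len(node))])
    return words


rng = random.Random(0)
num_frames, dim, K = 8, 4, 3  # toy sizes; K is the number of latent nodes
# Stand-in for the enhanced object features produced by the previous step.
enhanced = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(num_frames)]
# Latent node embeddings; these would be learned in a real model.
latent_nodes = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(K)]

object_words = latent_visual_words(latent_nodes, enhanced)
print(len(object_words), len(object_words[0]))
```

The motion visual words would come from the same routine applied to the enhanced motion features with a second set of latent nodes.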

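4. Discriminative Language Validation
- The goal is for the generated captions to carry better semantic information (semantic concepts). The authors reconstruct $P^o$ and $P^m$ from the generated captions, then use a discriminator to tell the reconstructed visual features $\hat P^o, \hat P^m$ apart from the real features $P^o, P^m$.
- In practice, the generated caption is passed through several 1D CNN layers with residual connections to obtain the sentence feature $S$, and $P^o$ then "aggregates" the features of $S$.

- The generated visual feature $\hat P^o$ and the real visual feature $P^o$ are scored as a pair, which is similar to computing their similarity.


- Output score of the discriminator (it learns to give generated features low scores and real features high scores)

- Discriminator loss (the second term is a regularization term)

- Generator loss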
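
The loss formulas themselves are not reproduced in this note. The sketch below shows one common objective that fits the description (real features scored high, generated features scored low, plus a regularization term): a Wasserstein-style critic loss with a gradient-norm penalty. Treat it as an assumption, not as the paper's exact loss. The linear critic, the function names, and the numbers are all hypothetical and chosen so the arithmetic is easy to follow.

```python
import math


def critic_score(w, x):
    """Toy linear critic D(x) = w . x (a real model would be a network)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mean(values):
    return sum(values) / len(values)


def critic_loss(w, real_batch, fake_batch, penalty_weight=10.0):
    """ASSUMED Wasserstein-style critic loss with a gradient penalty.

    First part: push scores of real features up and generated ones down.
    Second part: regularizer keeping the input-gradient norm near 1. For a
    linear critic that gradient is w itself, so the penalty is (||w|| - 1)^2.
    """
    real = mean([critic_score(w, x) for x in real_batch])
    fake = mean([critic_score(w, x) for x in fake_batch])
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    return fake - real + penalty_weight * (grad_norm - 1.0) ** 2


def generator_loss(w, fake_batch):
    """The generator tries to raise the critic's score on its outputs."""
    return -mean([critic_score(w, x) for x in fake_batch])


w = [0.6, 0.8]                         # critic weights with norm 1
real_P = [[1.0, 1.0], [2.0, 0.0]]      # stand-ins for real features P^o
fake_P = [[0.0, 0.0], [-1.0, 0.5]]     # stand-ins for reconstructed features

d_loss = critic_loss(w, real_P, fake_P)
g_loss = generator_loss(w, fake_P)
print(round(d_loss, 6), round(g_loss, 6))
```

With these toy values the real features average a score of 1.3 and the generated ones -0.1, so the critic loss is about -1.4 and the generator loss about 0.1; the penalty vanishes because the critic weights have norm 1.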

Code