VPT Model Video Walkthrough
2022-06-27 10:32:00 [智源社区 (BAAI Community)]
Minecraft is one of the harder challenges an RL agent can face. Episodes are long, and the world is procedurally generated, complex, and huge. The action space is a keyboard and a mouse, which the agent must operate from the game's video input alone. OpenAI tackles this challenge with Video PreTraining (VPT): a small set of contractor-recorded gameplay with action labels is used to pseudo-label a giant corpus of scraped gameplay footage. The resulting pre-trained model is highly capable at basic game mechanics and fine-tunes far more effectively than a model trained from scratch. This is the first Minecraft agent to achieve the elusive goal of crafting a diamond pickaxe all by itself.
OUTLINE:
0:00 - Intro
3:50 - How to spend money most effectively?
8:20 - Getting a large dataset with labels
14:40 - Model architecture
19:20 - Experimental results and fine-tuning
25:40 - Reinforcement Learning to the Diamond Pickaxe
30:00 - Final comments and hardware
Blog: https://openai.com/blog/vpt/
Paper: https://arxiv.org/abs/2206.11795
Code & Model weights: https://github.com/openai/Video-Pre-Training
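The pseudo-labeling idea in the description above can be illustrated with a toy sketch. This is not OpenAI's implementation: the VPT paper trains a neural inverse dynamics model on video frames, whereas here each "frame" is a single integer position and the "model" is a lookup table learned from a handful of labeled transitions. All names (`fit_inverse_dynamics`, `pseudo_label`, `contractor_data`, `scraped_video`) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def fit_inverse_dynamics(labeled):
    """Learn a toy inverse dynamics model from labeled transitions.

    labeled: list of (frame_before, frame_after, action) tuples.
    Returns a dict mapping the observed change between frames to the
    action most often associated with that change.
    """
    votes = defaultdict(Counter)
    for before, after, action in labeled:
        votes[after - before][action] += 1
    return {delta: counts.most_common(1)[0][0] for delta, counts in votes.items()}


def pseudo_label(idm, frames, default="noop"):
    """Attach an inferred action to each frame of an unlabeled sequence.

    The last frame has no successor, so it receives no label.
    """
    return [
        (frames[i], idm.get(frames[i + 1] - frames[i], default))
        for i in range(len(frames) - 1)
    ]


# Small labeled set, standing in for the contractor data (toy values).
contractor_data = [(0, 1, "right"), (5, 4, "left"), (3, 3, "noop"), (2, 3, "right")]
idm = fit_inverse_dynamics(contractor_data)

# Large unlabeled corpus, standing in for scraped gameplay footage (toy values).
scraped_video = [10, 11, 12, 12, 11]
pairs = pseudo_label(idm, scraped_video)
print(pairs)
```

The resulting `(frame, action)` pairs are the kind of data a policy could then be trained on with ordinary behavioral cloning, which is the role the pseudo-labeled footage plays in the pre-training stage.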