
VPT Model Video Explanation

2022-06-27 10:54:00 【Zhiyuan community】

Minecraft is one of the harder challenges any RL agent could face. Episodes are long, and the world is procedurally generated, complex, and huge. Further, the action space is a keyboard and mouse, which the agent must operate given only the game's video input. OpenAI tackles this challenge using Video PreTraining (VPT), leveraging a small set of contractor data to pseudo-label a giant corpus of scraped gameplay footage. The pre-trained model is highly capable at basic game mechanics and can be fine-tuned far more effectively than a model trained from scratch. This is the first Minecraft agent that achieves the elusive goal of crafting a diamond pickaxe all by itself.
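The pseudo-labeling idea described above can be illustrated with a toy sketch. This is not OpenAI's implementation: the real system uses neural networks on video frames with keyboard and mouse actions, whereas here a "frame" is just an integer position on a line and the labeler is a majority-vote lookup table. All names (`fit_labeler`, `pseudo_label`, `clone_behavior`, `contractor_data`, `scraped_footage`) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def fit_labeler(labeled_transitions):
    """Learn frame-difference -> action by majority vote over the small labeled set."""
    votes = defaultdict(Counter)
    for frame, next_frame, action in labeled_transitions:
        votes[next_frame - frame][action] += 1
    return {delta: counts.most_common(1)[0][0] for delta, counts in votes.items()}

def pseudo_label(labeler, frames):
    """Infer an action for each consecutive frame pair of an unlabeled video."""
    pairs = zip(frames, frames[1:])
    return [(f, labeler[g - f]) for f, g in pairs if (g - f) in labeler]

def clone_behavior(labeled_frames):
    """Behavioral cloning in miniature: pick the most common action per frame."""
    votes = defaultdict(Counter)
    for frame, action in labeled_frames:
        votes[frame][action] += 1
    return {frame: counts.most_common(1)[0][0] for frame, counts in votes.items()}

# Toy stand-in for contractor data: (frame, next_frame, action) with true labels.
contractor_data = [(0, 1, "right"), (5, 4, "left"), (3, 3, "stay"), (2, 3, "right")]
# Toy stand-in for scraped footage: frames only, no action labels.
scraped_footage = [[0, 1, 2, 3, 3], [4, 3, 3], [0, 1, 2]]

labeler = fit_labeler(contractor_data)
pseudo_labeled = [pair for video in scraped_footage
                  for pair in pseudo_label(labeler, video)]
policy = clone_behavior(pseudo_labeled)
print(policy)
```

The design point the sketch tries to capture is that labeling a transition after the fact (seeing both the frame before and the frame after) is easier than choosing an action in advance, so a small labeled set can go a long way once it is used to label a much larger unlabeled corpus.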

OUTLINE:

0:00 - Intro

3:50 - How to spend money most effectively?

8:20 - Getting a large dataset with labels

14:40 - Model architecture

19:20 - Experimental results and fine-tuning

25:40 - Reinforcement Learning to the Diamond Pickaxe

30:00 - Final comments and hardware

Blog: https://openai.com/blog/vpt/

Paper: https://arxiv.org/abs/2206.11795

Code & Model weights: https://github.com/openai/Video-Pre-Training
