AI learns to play Tekken, the hardest fighting game to pick up, frame by frame: arcade game fans now have something new to try
2022-07-03 23:43:00 【Zhiyuan community】
By Bowen, from Aofei Temple
QbitAI | WeChat official account QbitAI
Has AI now started learning to input moves frame by frame in arcade games?
The King of Fighters '98, Street Fighter, Dead or Alive... all of these childhood memories have been played, and so has the 3D fighting game Tekken, which is said to take 5,000 matches just to get started:
Yes, that Tekken, the one that is so unfriendly to newcomers. Pull up any character's move list to get a feel for the complexity:
(Yes, the various frame-perfect "just frame" (JF) techniques are one of its hallmarks.)
△ Tekken TT2 move list
Yet this AI managed to clear the game on the highest difficulty not long after picking it up:
△ The AI is on the left
Behind this AI is an independent developer who is also a hardcore arcade game enthusiast.
The "gameplay recording" of the Tekken newcomer he trained has reached a popularity score of nearly 500 on Reddit:
Reinforcement learning and the training framework
Behind this Tekken-playing AI is a reinforcement learning interaction framework called DIAMBRA Arena.
DIAMBRA Arena provides multiple environments for reinforcement learning research and experimentation. They are episodic reinforcement learning tasks built from discrete actions (such as joystick moves and buttons) and from observations made up of screen pixels plus additional data (such as the characters' health bars).
In this framework, the agent sends an action to the environment. The environment processes it, transitions from its current state to a new state accordingly, and then returns an observation and a reward to the agent, which closes the interaction loop:
The code that implements this loop is also very simple:
import diambraArena

# Mandatory settings
settings = {}
settings["gameId"] = "doapp"  # Game selection
settings["romsPath"] = "/path/to/roms/"  # Path to roms folder

# Create the environment and get the first observation
env = diambraArena.make("TestEnv", settings)
observation = env.reset()

while True:
    # Pick a random action from the environment's action space
    actions = env.action_space.sample()

    # Advance the environment by one step
    observation, reward, done, info = env.step(actions)

    # Stop after the first finished episode
    if done:
        observation = env.reset()
        break

env.close()
The framework currently supports Linux, Windows, macOS and other mainstream operating systems.
This AI's "battlefield" is the early Tekken Tag Tournament, whose move inputs are of course no less complex than those of the later versions...
The developer chose Jin Kazama (Jin) and Yoshimitsu, two iconic characters, as the main characters to control.
The inputs are: the pixel values of the game screen, converted from RGB to grayscale and downscaled to 128 x 128 px; the stage number; the characters' health bars; and the side of the game screen the character is on.
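The input preparation described above could be sketched roughly as follows. This is only an illustration, not the developer's actual pipeline: the function names, the dictionary keys, the BT.601 grayscale weights and the nearest-neighbour downscaling are all assumptions, since the article only states that the frame is converted to grayscale and shrunk to 128 x 128.

```python
import numpy as np


def preprocess_frame(frame_rgb: np.ndarray, size: int = 128) -> np.ndarray:
    """Turn an (H, W, 3) uint8 RGB frame into a (size, size) grayscale frame."""
    # Weighted sum of the R, G, B channels (BT.601 weights, an assumption).
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame_rgb.astype(np.float32) @ weights

    # Nearest-neighbour downscale: keep evenly spaced rows and columns.
    height, width = gray.shape
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    small = gray[rows][:, cols]

    # Scale pixel values to roughly the [0, 1] range.
    return small / 255.0


def build_observation(frame_rgb, stage, own_health, opp_health, on_left_side):
    """Bundle the frame with the scalar inputs (key names are hypothetical)."""
    return {
        "frame": preprocess_frame(frame_rgb),
        "stage": stage,
        "own_health": own_health,
        "opp_health": opp_health,
        "on_left_side": on_left_side,
    }
```

For example, `build_observation(np.zeros((224, 384, 3), dtype=np.uint8), 1, 100, 100, True)` returns a dictionary whose `"frame"` entry has shape `(128, 128)`. The 224 x 384 frame size here is just a placeholder.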
The training reward is a function of health: damage dealt to the opponent's health yields a positive reward, while health lost by the agent's own character is penalized with a negative reward.
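One simple way to write such a health-based reward is shown below. The article does not give the exact function used in training, so this sketch assumes the plainest form: damage dealt minus damage taken over a single step. The function and parameter names are hypothetical.

```python
def health_reward(prev_own, prev_opp, own, opp):
    """Per-step reward: damage dealt to the opponent minus damage taken."""
    damage_dealt = prev_opp - opp  # positive when the opponent loses health
    damage_taken = prev_own - own  # positive when our character loses health
    return damage_dealt - damage_taken
```

With this form, a step in which neither health bar changes gives a reward of zero, so the agent is only rewarded or penalized when a hit actually lands.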
In addition, the AI's action rate is limited to 1/10 of the maximum rate, that is, it sends one action every 6 steps.
Because the framework uses a discrete action space, during training the agent can choose only one action at a time: either one of 9 movements (up, down, and so on) or an attack (impact, kick, punch).
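To make the "one action at a time" constraint concrete, here is a minimal sketch of a single discrete action index that covers both movements and attacks. The move and attack names and their ordering are illustrative assumptions, not the framework's actual action list.

```python
# Hypothetical action tables: 9 movements (including "no move") and a few attacks.
MOVES = [
    "no_move", "left", "up_left", "up", "up_right",
    "right", "down_right", "down", "down_left",
]
ATTACKS = ["punch", "kick", "swap"]


def decode_action(index):
    """Map one discrete action index to a (kind, name) pair."""
    if not 0 <= index < len(MOVES) + len(ATTACKS):
        raise ValueError(f"action index out of range: {index}")
    if index < len(MOVES):
        return ("move", MOVES[index])
    return ("attack", ATTACKS[index - len(MOVES)])
```

Because each step carries exactly one index, the agent can send either a movement or an attack, but never both in the same step. That is why inputs requiring a direction plus a button at once are out of reach in this setup.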
As a result, although combos are more effective in a real fight, the AI cannot press two inputs at the same time, so in actual matches it ends up relying heavily on two actions, Kick and character swap (swap):
Some veteran PvP fans in the comments said they would like to see top AI players fight it out against each other, and the developer agreed:
We are building a platform where programmers will submit their trained AIs to compete against each other, and we will broadcast the matches on our channel.
The AI tournament
The developer team has now formally started preparing this "AI Game Championship", in which the programmers and developers behind the agents act much like "coaches" or "contestants' parents". The final winner can receive 1,400 Swiss francs (about 9,261 RMB).
The "events" are not limited to Tekken. According to the developers, the underlying mechanics of these fighting games are similar, so only game-specific attributes such as combos and character health values need to be adjusted.
Accordingly, their DIAMBRA Arena framework provides a Python API for a variety of arcade video games that fully complies with the OpenAI Gym standard.
Dead or Alive, Street Fighter and many other popular arcade games are already included:
GitHub link :
https://github.com/diambra/diambraArena
Video link :
https://www.youtube.com/watch?v=9HAKEjhIfJY
Reference link :
[1]https://www.reddit.com/r/reinforcementlearning/comments/sq1s3f/deep_reinforcement_learning_algorithm_completing/
[2]https://www.reddit.com/r/MachineLearning/comments/sqra1n/p_deep_reinforcement_learning_algorithm/