Teaching an AI, frame by frame, to beat Tekken at its highest difficulty: arcade game fans now have something new to try
2022-07-03 23:43:00 【Zhiyuan community】
Bowen, reporting from Aofeisi
QbitAI | Official account QbitAI
Has AI now started learning to enter moves frame by frame to play arcade games?
The King of Fighters '98, Street Fighter, Dead or Alive... it has played through all of these childhood memories, and also Tekken, the 3D fighting game said to take 5,000 matches just to get started:

Yes, that Tekken, the one that is so unfriendly to newcomers. Pull up any character's move list to get a feel for the complexity:
(Yes, the various frame-precise "just frame" (JF) techniques are one of its hallmarks.)

△ Tekken Tag Tournament 2 move list
Yet this AI managed to clear the highest difficulty not long after picking up the game:

△ The AI is on the left
Behind this AI is an individual developer who is also a hardcore arcade game enthusiast.
The gameplay recording of the rookie "Tekken player" he trained has reached a popularity of nearly 500 on Reddit:

Reinforcement learning training framework
Behind this AI Tekken player is a reinforcement learning interaction framework called DIAMBRA Arena.
DIAMBRA Arena provides multiple environments for reinforcement learning research and experimentation. They are episodic reinforcement learning tasks built from discrete actions (such as joystick and button presses) and from on-screen pixels and data (such as the characters' health bars).
In this framework, the agent sends an action to the environment. The environment processes it, transitions from its current state to a new state accordingly, and then returns an observation and a reward to the agent, which closes the interaction loop:

The code to implement the above loop is also very simple:
import diambraArena

# Mandatory settings
settings = {}
settings["gameId"] = "doapp"  # Game selection
settings["romsPath"] = "/path/to/roms/"  # Path to roms folder

env = diambraArena.make("TestEnv", settings)
observation = env.reset()

while True:
    actions = env.action_space.sample()
    observation, reward, done, info = env.step(actions)
    if done:
        observation = env.reset()
        break

env.close()
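Running the snippet above requires the framework and game ROMs. To try the same reset/step contract without them, the sketch below uses a stand-in environment. `ToyFightEnv`, its four-action space, and its scoring rule are hypothetical and are not part of DIAMBRA Arena; only the shape of the loop mirrors the code above.

```python
import random


class ToyFightEnv:
    """Hypothetical stand-in for a Gym-style environment (not DIAMBRA Arena)."""

    N_ACTIONS = 4  # assumed size of the toy discrete action space

    def __init__(self, max_steps=50, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        return {"step": self.steps}

    def sample_action(self):
        """Pick a random discrete action, similar in spirit to action_space.sample()."""
        return self.rng.randrange(self.N_ACTIONS)

    def step(self, action):
        """Apply one action and return (observation, reward, done, info)."""
        self.steps += 1
        reward = 1.0 if action == 0 else 0.0  # toy rule: only action 0 scores
        done = self.steps >= self.max_steps
        return {"step": self.steps}, reward, done, {}


def run_episode(env):
    """Play one episode with random actions and return the summed reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(env.sample_action())
        total += reward
    return total


env = ToyFightEnv(max_steps=50, seed=0)
episode_return = run_episode(env)
print(episode_return)
```

Swapping the stand-in for a real environment should only change how `env` is created, since the loop relies on nothing beyond `reset` and `step`.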
The framework currently supports Linux, Windows, macOS and other mainstream operating systems.
This AI's "real battlefield" is the early Tekken Tag Tournament. Of course, its move inputs are no less complex than those of the later releases...

The developer chose two representative characters, Jin Kazama (Jin) and Yoshimitsu, as the main characters to control.
The inputs are: the pixel values of the game screen, converted from RGB to grayscale and downscaled to 128 x 128 px; the stage number; the characters' health bars; and the side of the screen the character is on.
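The preprocessing described above can be sketched in plain Python. This is an illustration under stated assumptions, not the developer's code: the luma weights, the nearest-neighbour resize, and the names `to_grayscale`, `resize_nearest` and `build_observation` are all hypothetical choices.

```python
def to_grayscale(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples) to grayscale values."""
    # ITU-R BT.601 luma weights: an assumed, commonly used RGB -> gray mapping.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in frame]


def resize_nearest(gray, out_h=128, out_w=128):
    """Downscale a 2-D image by nearest-neighbour sampling."""
    in_h, in_w = len(gray), len(gray[0])
    return [[gray[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def build_observation(frame, stage, own_health, opp_health, side):
    """Bundle the inputs listed in the article into one dictionary."""
    return {
        "frame": resize_nearest(to_grayscale(frame)),  # 128 x 128 grayscale
        "stage": stage,            # stage number
        "own_health": own_health,  # health bar of the controlled character
        "opp_health": opp_health,  # health bar of the opponent
        "side": side,              # which side of the screen the character is on
    }


# Synthetic 240 x 320 RGB frame standing in for a real game screen.
frame = [[(x % 256, y % 256, 0) for x in range(320)] for y in range(240)]
obs = build_observation(frame, stage=1, own_health=100, opp_health=100, side=0)
print(len(obs["frame"]), len(obs["frame"][0]))
```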
The training reward is a function of health: damage dealt to the opponent's health earns a positive reward, while health lost by the agent's own character incurs a negative penalty.
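A health-based reward of this kind might look like the sketch below. The article does not give the exact formula or any scaling, so the unweighted difference used here and the name `health_reward` are assumptions.

```python
def health_reward(prev_own, prev_opp, own, opp):
    """Reward for one step, computed from health before and after the step.

    Assumed form: damage dealt to the opponent minus damage taken by the agent.
    """
    damage_dealt = prev_opp - opp   # positive when the opponent lost health
    damage_taken = prev_own - own   # positive when the agent lost health
    return damage_dealt - damage_taken


# The agent hits the opponent for 10 and takes no damage.
print(health_reward(prev_own=100, prev_opp=100, own=100, opp=90))
# The agent takes 20 damage and deals none.
print(health_reward(prev_own=100, prev_opp=100, own=80, opp=100))
```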
Meanwhile, the AI's action rate is limited to 1/10 of the maximum rate, that is, it sends one action every 6 steps.
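One way to picture the "one action every 6 steps" limit is a schedule in which the policy is consulted only on every sixth step. The article does not say what happens on the steps in between, so sending a no-op there is an assumption, as are the names `throttle` and `NO_OP`.

```python
ACTION_INTERVAL = 6  # from the article: one action every 6 steps
NO_OP = 0            # hypothetical "do nothing" action index


def throttle(policy, n_steps, interval=ACTION_INTERVAL):
    """Return the action sent at each step when the policy may act only
    once per `interval` steps; the remaining steps receive NO_OP."""
    sent = []
    for t in range(n_steps):
        sent.append(policy(t) if t % interval == 0 else NO_OP)
    return sent


# A dummy policy that always wants to send action 5.
schedule = throttle(lambda t: 5, n_steps=12)
print(schedule)
```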
Because the framework uses a discrete action space, the agent can choose only one action at a time during training: either one of 9 movement actions (up, down, and so on) or one attack action (strike, kick, punch).
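A single discrete action space of this shape can be sketched as one flat index that decodes to either a movement or an attack. The article gives the count of 9 movements and names three attacks; the specific movement names and the index layout below are hypothetical, not the framework's actual encoding.

```python
# 9 movement actions: neutral plus 8 directions (names are assumed).
MOVES = ["neutral", "up", "up_right", "right", "down_right",
         "down", "down_left", "left", "up_left"]
# Attack names as listed in the article.
ATTACKS = ["strike", "kick", "punch"]

N_ACTIONS = len(MOVES) + len(ATTACKS)


def decode_action(index):
    """Map one flat index to exactly one movement or one attack."""
    if not 0 <= index < N_ACTIONS:
        raise ValueError(f"action index out of range: {index}")
    if index < len(MOVES):
        return ("move", MOVES[index])
    return ("attack", ATTACKS[index - len(MOVES)])


print(decode_action(1))
print(decode_action(10))
```

With a flat index like this, each step carries exactly one choice, so a direction and a button can never be pressed in the same step.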
As a result, although combos are more effective in actual combat, the AI cannot press two buttons at the same time, so in real matches it frequently falls back on two actions, kick (Kick) and character swap (swap):

Some veteran PVP fans in the comments said they would like to see top AI players like this one battle each other, and the developer himself agrees:
We are building a platform where programmers will submit their trained AIs to compete against each other, and we will broadcast the matches on our channel.

The AI tournament
The developer team has now formally begun preparing this "AI Game Championship". The programmers and developers behind the agents play the role of "coaches" or "contestants' parents", and the final winner receives 1,400 Swiss francs (about 9,261 yuan).

The "events" are not limited to Tekken. The developers say the underlying mechanics of these fighting games are similar, so only game-specific attributes such as combos and character health values need to be adjusted.
For that reason, their DIAMBRA Arena framework provides a Python API fully compliant with the OpenAI Gym standard for a variety of arcade video games.
Popular arcade games such as Dead or Alive and Street Fighter are already included:

GitHub link :
https://github.com/diambra/diambraArena
Video link :
https://www.youtube.com/watch?v=9HAKEjhIfJY
Reference link :
[1]https://www.reddit.com/r/reinforcementlearning/comments/sq1s3f/deep_reinforcement_learning_algorithm_completing/
[2]https://www.reddit.com/r/MachineLearning/comments/sqra1n/p_deep_reinforcement_learning_algorithm/