Teaching AI frame by frame to play Tekken, the hardest of them all: arcade game lovers are in for a treat
2022-07-03 23:43:00 【Zhiyuan community】
Bowen, reporting from Aofei Temple
Qubit | Official account: QbitAI
Have today's AIs started learning to input moves frame by frame to play arcade games?
The King of Fighters '98, Street Fighter, Dead or Alive... these are childhood memories for many players. And then there is Tekken, the 3D fighting game said to take 5,000 matches just to get started:
That's right, the same Tekken that is notoriously unfriendly to newcomers. Pull up any character's move list to get a feel for the complexity:
(Yes, just-frame (JF) techniques that demand frame-precise inputs are one of its hallmarks.)
△ Tekken TT2 move list
Yet this AI cleared the game on the highest difficulty not long after it started playing:
△ The AI is on the left
Behind this AI is an individual developer who is also a hardcore arcade game enthusiast.
The "gameplay recording" of the novice "blacksmith" he trained reached a popularity score of nearly 500 on Reddit:
Reinforcement learning training framework
Behind this AI blacksmith is a reinforcement learning interaction framework called DIAMBRA Arena.
DIAMBRA Arena provides multiple environments for reinforcement learning research and experimentation. Each one is an episodic reinforcement learning task built from discrete actions (such as joystick directions and buttons) and observations made up of screen pixels plus in-game data (such as the characters' health bars).
In this framework, the agent sends an action to the environment. The environment processes it, transitions from its current state to a new state accordingly, and then returns an observation and a reward to the agent, which closes the interaction loop:
The code that implements this loop is also very simple:
import diambraArena

# Mandatory settings
settings = {}
settings["gameId"] = "doapp" # Game selection
settings["romsPath"] = "/path/to/roms/" # Path to roms folder

env = diambraArena.make("TestEnv", settings)

observation = env.reset()

while True:

    actions = env.action_space.sample()

    observation, reward, done, info = env.step(actions)

    if done:
        observation = env.reset()
        break

env.close()
The framework currently supports mainstream operating systems, including Linux, Windows, and macOS.
The AI's "real battlefield" is the early Tekken Tag Tournament. Of course, the complexity of its move inputs is no lower than that of the later versions...
The developer chose two representative characters, Jin Kazama (Jin) and Yoshimitsu, as the main characters for the AI to control.
The inputs are: the game frame, converted from RGB to grayscale and downscaled to 128 x 128 px; the stage number (Stage); the characters' health bars; and the side of the screen the character is on.
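The preprocessing described here can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the developer's code: the function names, the BT.601 luma weights, the nearest-neighbour resize, and the dictionary keys are all hypothetical choices, since the article only says that the frame is converted to grayscale, shrunk to 128 x 128, and combined with the stage number, health bars, and side.

```python
def to_grayscale(frame):
    """Convert an H x W frame of (r, g, b) tuples into H x W luminance values."""
    # ITU-R BT.601 luma weights, a common (assumed) choice for this conversion.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def resize_nearest(gray, out_h=128, out_w=128):
    """Resize a 2-D list with nearest-neighbour sampling."""
    in_h, in_w = len(gray), len(gray[0])
    return [
        [gray[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def build_observation(frame, stage, own_health, opp_health, side):
    """Bundle the preprocessed frame with the extra game data."""
    return {
        "frame": resize_nearest(to_grayscale(frame)),
        "stage": stage,
        "own_health": own_health,
        "opp_health": opp_health,
        "side": side,  # hypothetical encoding: 0 = left, 1 = right
    }


# Usage with a dummy 256 x 256 all-white frame.
dummy = [[(255, 255, 255)] * 256 for _ in range(256)]
obs = build_observation(dummy, stage=1, own_health=1.0, opp_health=1.0, side=0)
print(len(obs["frame"]), len(obs["frame"][0]))
```

In practice an image library would do the conversion and resizing far faster; the pure-Python version is used here only to keep the sketch self-contained.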
The training reward is a function of health: damage dealt to the opponent's health yields a positive reward, while health lost by the character the agent controls is penalized with a negative reward.
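A health-based reward of this kind could look like the sketch below. The article does not give the exact formula or any scaling, so the function name, its arguments, and the plain "damage dealt minus damage taken" form are assumptions made for illustration.

```python
def health_reward(prev_own, prev_opp, own, opp):
    """Reward for one step: damage dealt to the opponent minus damage taken.

    All four arguments are health values before and after the step
    (hypothetical interface, not the framework's actual one).
    """
    damage_dealt = prev_opp - opp   # positive when the opponent lost health
    damage_taken = prev_own - own   # positive when our character lost health
    return damage_dealt - damage_taken


# The opponent loses 20 health, we lose none: positive reward.
print(health_reward(prev_own=100, prev_opp=100, own=100, opp=80))  # 20
# We lose 10 health, the opponent loses none: negative reward.
print(health_reward(prev_own=100, prev_opp=80, own=90, opp=80))    # -10
```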
At the same time, the AI's action rate is capped at 1/10 of the maximum rate, that is, it sends one action every 6 steps.
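One common way to limit an agent's action rate is an action-repeat wrapper that holds each chosen action for a fixed number of environment steps. The sketch below assumes that approach; the article does not say whether the action is held or whether idle steps are sent in between, and both `ActionRepeat` and the toy `CountingEnv` are hypothetical names that are not part of DIAMBRA Arena.

```python
class ActionRepeat:
    """Hold each chosen action for `repeat` consecutive environment steps."""

    def __init__(self, env, repeat=6):
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            observation, reward, done, info = self.env.step(action)
            total_reward += reward  # accumulate reward over the held steps
            if done:
                break               # stop early if the episode ended
        return observation, total_reward, done, info


class CountingEnv:
    """Toy stand-in environment: the episode ends after 20 low-level steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 20, {}


env = ActionRepeat(CountingEnv(), repeat=6)
env.reset()
obs, reward, done, info = env.step(0)
print(obs, reward, done)  # the agent decided once, the game advanced 6 steps
```

Because the wrapper exposes the same `reset` and `step` methods as the environment it wraps, the agent's training loop does not need to change.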
Because the framework uses a discrete action space, the agent can choose only one action at a time during training: either one of 9 movement actions (up, down, and so on) or an attack (hit, kick, punch).
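To make the constraint concrete, a single discrete action space can be modeled as one flat list in which every index is either a movement or an attack, never both. The names below are illustrative assumptions: the article only states that there are 9 movement actions and names three attacks, and the real environment exposes further actions (such as swapping characters) that are left out here.

```python
# Nine movement actions (names are hypothetical; the article gives only the count).
MOVES = ["no_move", "left", "up_left", "up", "up_right",
         "right", "down_right", "down", "down_left"]
# The attacks named in the article.
ATTACKS = ["hit", "kick", "punch"]

# One flat discrete space: each index selects exactly one movement OR one attack.
ACTIONS = [("move", m) for m in MOVES] + [("attack", a) for a in ATTACKS]


def decode(index):
    """Map a discrete action index to its (kind, name) pair."""
    return ACTIONS[index]


print(len(ACTIONS))  # 12
print(decode(3))     # ('move', 'up')
print(decode(10))    # ('attack', 'kick')
```

With this layout there is no index that means "down plus punch", which is why inputs that require a direction and a button on the same step are out of the agent's reach.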
As a result, although combos are more effective in actual fights, the AI cannot press two inputs at the same time, so in real matches it ends up relying heavily on two actions, kick (Kick) and character swap (swap):
Some veteran PvP fans in the comments said they would like to see top AI players battle each other, and the developer agreed:
We are building a platform where programmers will submit their trained AIs and pit them against each other, and we will broadcast the matches on our channel.
AI tournament
The developer team has now formally begun preparing this "AI game championship". The programmers and developers behind the AIs act much like "coaches" or "contestants' parents", and the final winner receives 1,400 Swiss francs (about 9,261 RMB).
The "events" are not limited to Tekken. According to the developers, the underlying mechanics of these fighting games are similar, so only game-specific attributes such as combo moves and character health values need to be adjusted.
For that reason, their DIAMBRA Arena framework provides a Python API for all kinds of arcade video games that fully complies with the OpenAI Gym standard.
Popular arcade games such as Dead or Alive and Street Fighter are already included:
GitHub link :
https://github.com/diambra/diambraArena
Video link :
https://www.youtube.com/watch?v=9HAKEjhIfJY
Reference link :
[1]https://www.reddit.com/r/reinforcementlearning/comments/sq1s3f/deep_reinforcement_learning_algorithm_completing/
[2]https://www.reddit.com/r/MachineLearning/comments/sqra1n/p_deep_reinforcement_learning_algorithm/