Why Honor of Kings is the Ideal Competition Arena for AI Reinforcement Learning
Developed by Tencent Games’ TiMi Studio Group, Honor of Kings (HOK) has been among the most popular multiplayer online battle arena (MOBA) games since 2015.
Although it is one of the most popular mobile games of all time, it is less well known that Honor of Kings is often used as a testbed for AI research in the games industry.
The second AI Arena Multi-agent Reinforcement Learning Competition in China, which ended in April, brought together groups of student developers who built reinforcement learning (RL)-based AI algorithms that can play HOK autonomously.
The winning team, composed of five students from Tsinghua University, said that putting the theoretical capabilities of the RL model into practice was not as easy as they had imagined.
“In the beginning, we couldn’t even set up the game environment, let alone train the AI agent to play games,” team member Chen Huayu said, adding that his fellow team members were already keen players of HOK.
Their victory came after five months of working through the source code and exploring the architecture in order to compete against 19 teams from top Chinese universities.
Rising to the Challenge with RL
Like human players, computer game agents become more intelligent as they experience new behaviors and work out the appropriate sequence of actions. The technology behind this is RL, a machine learning paradigm in which developers reward the behaviors they want the AI to exhibit, and the program trains itself by performing the actions needed to achieve the desired behavior or outcome.
In the past, board games like chess and Go were testbeds for deep RL algorithms. One of the most famous examples is AlphaGo, a computer program developed by Google’s subsidiary DeepMind Technologies. In 2016, AlphaGo played Go against the legendary Lee Sedol, winner of 18 world titles, and won 4-1.
Fast-forward six years, and the focus of game AI research has shifted from board games to more complex imperfect-information games and strategic video games.
As a 5v5 MOBA game, Honor of Kings is highly complex and challenging and requires extensive collaboration among players. This makes it an ideal environment for AI research and development.
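The reward-driven loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning on a made-up five-cell corridor, which is far simpler than HOK and is not the method used by any team in the competition; the environment, function names and parameter values are all assumptions chosen for illustration.

```python
import random

N_STATES = 5          # hypothetical corridor: positions 0..4
ACTIONS = (-1, +1)    # step left or step right
GOAL = N_STATES - 1   # reaching the last position ends an episode


def step(state, action):
    """Toy environment: reward 1.0 for reaching the goal, 0.0 otherwise."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[state][action_index] estimates the future reward of taking that action.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action now and then.
                a = rng.randrange(len(ACTIONS))
            else:
                # Exploit: take the best-known action, breaking ties at random.
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Preferred action (-1 or +1) in each non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(GOAL)]


policy = greedy_policy(train())
print(policy)
```

Nothing in the code tells the agent to walk right. It only receives a reward at the goal, and the preference for stepping right emerges from repeated trial and error.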
During this period, the team from Tsinghua was languishing in the rankings, as others pulled ahead. Sometimes, there were only minor improvements after days of training, or worse, the model would suddenly collapse, and the team had to start again.
“It’s a grinding, slow and boring process that was frustrating at first,” says Chen, adding that then something clicked.
“We thought a lot about what points might lead to mistakes, and made adjustments little by little to the algorithm. Suddenly things got better.” Chen had designed AI agents to learn through countless iterations. By encountering a scenario thousands of times, the agent was able to estimate the win rate of various options and finally choose the best one.
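The idea of replaying a scenario many times to compare options can be sketched as a simple Monte Carlo estimate. This is an illustration only: the article does not describe the team’s actual algorithm, and the option names, hidden win probabilities and function names below are hypothetical.

```python
import random


def estimate_win_rates(options, play, trials=2000):
    """Try each option `trials` times and record its observed win rate."""
    rates = {}
    for option in options:
        wins = sum(1 for _ in range(trials) if play(option))
        rates[option] = wins / trials
    return rates


def best_option(rates):
    """Return the option with the highest observed win rate."""
    return max(rates, key=rates.get)


# Hypothetical scenario: the true win probabilities are hidden from the
# agent, which only sees the outcome of each simulated attempt.
TRUE_WIN_PROB = {"retreat": 0.35, "push_lane": 0.50, "team_fight": 0.65}
rng = random.Random(0)


def simulated_play(option):
    """Stand-in for one game rollout: True means the attempt was won."""
    return rng.random() < TRUE_WIN_PROB[option]


rates = estimate_win_rates(TRUE_WIN_PROB, simulated_play)
print(best_option(rates))
```

With only a handful of trials the estimates would be noisy and the agent could pick the wrong option, which is one reason so many repetitions are needed.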
Solving Real-life Problems
The more complex the environment, the smarter the AI that can be trained in it. Will Yang, General Manager of Tencent AI Lab, explains that games provide an optimal R&D environment with clear goals and indicators that make it easier to test and iterate.
“Whether it is through the provision of data by human testers or the data generated by AI’s own battles, the data is obtained directly in the virtual world, without the need for additional sensors and processors,” Yang said, noting that the paradigm can also be applied to solve real-life problems more efficiently.
Yang added that if AI can learn to perceive, analyze, understand, reason, make decisions and act in real time like people in complex games such as HOK, that hints at its greater potential in solving problems in a wide range of fields including robotics, agriculture, transport, and energy.
Tencent AI Arena’s combined strength has allowed the competition to become a platform that brings together industry players, academia and research institutions.
The Winning Formula
Chen said that a clear division of labor, team members’ engineering capabilities, and fully automated deployment of agents are the keys to successful RL development in this competition.
“When something goes wrong with our agent, we know which part of the algorithm is at fault, and our team’s extensive engineering experience helps us find and solve problems faster.”
In the last two months of the competition, Chen’s team was able to fully automate the deployment of agents, even to the point of the AI being able to select the best-performing agent. This allowed them to be more efficient and conduct more experiments than other teams.
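Automatically picking the best-performing agent can be sketched as a round-robin evaluation among candidate versions. The team’s actual pipeline is not described in the article, so the checkpoint names, the `play_match` callback and the scoring rule below are assumptions made for illustration.

```python
from itertools import combinations


def select_best_agent(agents, play_match, games_per_pair=10):
    """Play every pair of agents against each other and return the one
    with the most wins, together with the full win tally."""
    wins = {name: 0 for name in agents}
    for a, b in combinations(agents, 2):
        for _ in range(games_per_pair):
            # play_match returns the name of the winning agent.
            wins[play_match(a, b)] += 1
    best = max(wins, key=wins.get)
    return best, wins


# Hypothetical stand-in for real matches: the agent with the higher
# hidden skill value always wins.
SKILL = {"ckpt_100": 1, "ckpt_200": 3, "ckpt_300": 2}


def toy_match(a, b):
    return a if SKILL[a] > SKILL[b] else b


best, tally = select_best_agent(list(SKILL), toy_match)
print(best, tally)
```

In a real setting each match would be a full game between two trained agents, so a routine like this would run unattended and feed its result straight into deployment.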
On top of that, each member was allocated specific tasks. Chen was in charge of designing algorithms, while one of his teammates kept records and tracked the experiments. One student managed neural network models, and another was responsible for engineering, testing and optimization.
The Next Generation of AI Pioneers
Chen led a new team to participate in the 31st FISU World University Games’ Digital Intelligence Competition in March. The AI Arena Multi-Agent Competition Track, organized by Tencent, is part of the tournament.
The competition is attractive to students and researchers for several reasons, says Chen.
“The use of advanced intelligent algorithms requires a lot of computing power, which is beyond what individual students and even many university laboratories can afford,” Chen said.
The large scale of HOK means that it would take years for an individual researcher to run the tests that have been undertaken, and the costs can be prohibitive. By participating in the challenge, teams have access to powerful computing resources and cloud services provided by Tencent AI Arena.
For these reasons, the competition has become a driving force behind a new ecosystem in which industry, academia, and research institutions collaborate with students from around the world, including Canada, the Netherlands, Australia, the United States, and China, including Hong Kong. It is not just an opportunity to compete against and connect with other leading universities, but also a chance to build relationships with other student developers.
Jackie Huang, general manager of Honor of Kings at TiMi Studio Group, says that “we use HOK in the field of AI and e-sports to build a youthful and energetic student digital intelligence competition exchange platform.” With students from different countries and regions taking part, the competitions help connect talent globally and promote the development of AI research in the game industry.
“HOK hopes to promote the symbiotic development model where education, competition and scientific research become the three links of AI industry development,” says Huang.