Sounds of Success for Tencent Ethereal Audio Lab

2021.11.24

Simeon Shang is the General Manager of the Tencent Ethereal Audio Lab, which is part of the video conferencing app Tencent Meeting, or VooV Meeting for global users. Since joining Tencent in 2019, Simeon has been leading a team developing real-time audio technology for Tencent Meeting and working on various efforts to help people with hearing difficulties. Before joining Tencent, he spent nearly 20 years in the audiovisual technology field, including at Motorola and Dolby Laboratories.

In the latest edition of Tencent Perspectives, Simeon shares some of the latest progress he and his team have made and his recent life at Tencent.

1. Why did you decide to join Tencent?

It was not a difficult decision for me. After working in audiovisual research and development for nearly 20 years, I was eager to directly participate in product development and iteration, directly interact with users, and make a real impact on the world with my research. 

In the past, I was more immersed in scientific research in the laboratory. Now I can provide technical support to tens of millions or even hundreds of millions of users every day and make their work and lives more convenient, which makes me excited about and proud of my work.

2. What goals did you set for yourself and your team in the past two years? What was the most memorable moment over that period?

In the first year after joining the team, my goal was to improve the audio-related technical solutions of Tencent Media Lab, learn about Tencent’s corporate culture and businesses, and actively establish contact with other teams, so that the technology we have developed can play a greater role on the existing platforms.

Since 2020, we have been thinking about how to apply telecommunication and audio-related technologies to more scenarios. The entire Ethereal voice module was designed based on these ideas. Since the official release of Tencent Meeting, powered by Ethereal voice technology, in late 2019, the number of users around the world has approached 200 million. When the pandemic broke out, we were able to meet our users’ needs for efficient and stable remote communication and collaborative office work.

What we have achieved at this stage has exceeded our expectations. I am very grateful to Tencent for giving my team and me the freedom to decide the direction and approach of our R&D work, and to devote ourselves to technical research that is valuable to the company, the industry and society. This has allowed me not only to realize the goals I set for my career, but also to help people in need.

3. Ethereal AI audio technology is an important achievement of the Ethereal Lab. Can you briefly tell us how this technology was developed?

Around six years ago, with the popularization of 4G networks and the improvement of cloud computing, deep learning and other technologies, we foresaw that audiovisual technology would play a bigger role in the fields of virtual conferencing and collaborative office work. 

Tencent has accumulated rich experience and strong technical capabilities in audiovisual telecommunication over nearly 20 years. Our audio technology supports social communication software with a large user base, consumer products such as videos and games, and industrial solutions such as cloud computing and AI industry services. The technology's potential across multiple industries and its huge user base are unique advantages for anyone engaged in R&D.

However, video conferencing still has pain points that urgently need to be solved so that users can hear better and more clearly during real-time virtual meetings. Ethereal Lab has continued to reduce noise by working on the voice signal itself and by upgrading the circuits, acoustics and algorithms.

Building on technologies for the perception, collection and reconstruction of the sound field, we have creatively solved many technical problems of real-time audio in the complex scenarios of conference rooms. In addition, using deep learning models, we have eliminated more than 200 types of non-stationary noise found in conference rooms, such as air conditioning, rotating fan blades, writing on a phone screen, keyboard tapping, cups being put down, and pages being turned.

4. Together with the internationally renowned manufacturer MED-EL, Ethereal Lab has developed the world's first hearing aid app that integrates hearing testing, assisted hearing and remote rehabilitation services. How did the two parties come together?

With the success of Tencent Meeting and our other products, we have seen the outstanding performance of Ethereal technology and have begun to think about whether it could be applied to a wider range of fields. Tencent has been working closely with public welfare organizations for many years, so we turned our attention to the needs of people who rely on hearing aids.

Although we have cutting-edge audio technology and successful experience in connecting people, we are not a medical equipment company. Last year, we cooperated with Nurotron, a leading cochlear implant manufacturer in China, and applied Ethereal AI technology to cochlear implants, which improved the speech clarity and intelligibility of cochlear implants by 40 percent.

We learned that it was difficult for many hearing-impaired users to see a doctor or receive follow-up checks during the pandemic, so the team began to consider how to meet their needs for rehabilitation and consultation without leaving home. At first, we developed a Weixin mini-program. After countless conversations with users, manufacturers and medical experts, and after analyzing the data we collected, we gradually upgraded and optimized the mini-program's functions and turned it into an app. In this process, our partners were impressed by our determination and capability, which laid the foundation for later cooperation.

In my opinion, as the head of a tech department, you must first of all have confidence, even belief, in your technology. Then you should put yourself in users’ shoes and feel what they feel. Finally, you must have the determination and perseverance to drive projects forward, and actively seek support from external partners or from internal resources within the company to achieve mutual benefits.

5. How will we be able to bring out the full potential of the social value of technology in the future?

Tencent has always adhered to the mission of Tech for Good, which it practices in all aspects of its business, technology and products. Taking our laboratory as an example, the company never gives us a fixed short-term commercialization target, but encourages everyone to think about how our technology can benefit more people and create greater social value. In such a relaxed and flexible environment, we can think about the real needs and application scenarios of vulnerable groups based on our understanding of the technology and our existing technical reserves, and develop products that serve different groups of people.

Secondly, Tencent has always advocated a win-win strategy of building the industry ecosystem together with our partners. For example, the company has set up the Tencent Technology for Good Plan to solve social pain points through technological innovation and coordination, together with professional volunteers, volunteer organizations and public welfare institutions. When we cooperated with MED-EL, both parties hoped first and foremost to realize our shared vision, and neither was overly concerned with its own interests.

I think audio technology has great potential to help build a barrier-free society. Tencent announced an upgrade of its overarching strategy in April this year, and put Promoting Sustainable Innovations for Social Value at the core. We are actively discussing and exploring the social value of audio technology with the laboratories under the newly established Sustainable Social Value Organization. For example, we are studying how audio technology can help the elderly, so as to better meet demands in settings such as nursing homes or elderly rehabilitation centers. Many elderly people suffer from hearing loss, and their quality of life can be greatly improved through technical means such as voice enhancement and noise reduction.

6. Can you tell me something about your day? What is the daily routine work and culture of Ethereal Lab?

Our laboratory advocates openness, innovation and transformation. Everyone on the team has their own main technical specialization and specific business scenarios to which they can contribute, and we encourage everyone to step out of their comfort zone, learn more about the outside world, and see what interesting developments in other industries can be integrated into our technology.

Only by keeping our vision and thinking open can we collaborate with cross-industry partners and innovate better. For example, many users ask for visual, image and text functions, which requires us to do multi-modal and cross-modal voice processing to provide a better experience.

I like listening to stories very much. I often listen to podcasts and my favorite one is called “People Fixing the World”. There are many problems in the world. Some people only see difficulties, while others think about how to solve them. I feel very encouraged and become more convinced that I can do a lot more to make the world a better place.