By Victor Erukhimov, founder of Avatar SDK.

Creating realistic avatars for social interaction in 3D media has been a longtime dream of many people. Avatars that you can recognize create emotional attachment, making interactions more efficient, fulfilling and fun.

However, this is not an easy problem to solve. There are technical challenges: How do you capture a person’s identity without scanning them in the kind of expensive photogrammetry rig used by Hollywood special effects companies? And there is a huge perceptual obstacle on this path: The more realistic faces are, the harder it is to make them likable.

Robotics professor Masahiro Mori named this phenomenon the “uncanny valley” in 1970: Models that look like human faces but are a bit off evoke fear and disgust in many observers. The effect is amplified by motion: A motionless corpse makes a less intense impression than a moving zombie.

The uncanny valley has become a well-known term among video game developers, who exploit it to create scary scenes with human-like NPCs that look and/or move strangely. And, of course, it is mentioned every time 3D artists attempt to create a hyperrealistic, realistic or not-so-realistic human model. If you search X.com for “uncanny valley,” you will find countless examples of good-looking art and hyperrealistic videos carrying that label.

What We Know About The Uncanny Valley

It remains unclear whether the uncanny valley effect is an inherited or acquired trait. Since we at Avatar SDK work on realistic 3D models of people, we did our best to find all research related to the uncanny valley, and we found a few surprising facts that I would like to share:

• Cartoonish avatars can fall into the uncanny valley, just like realistic ones. One of the reasons is a discrepancy between the level of detail in different parts of the model or between different layers, such as mesh and texture. If the mesh is sharp and the texture is blurred, many people looking at the model will feel uneasy. See, for example, the paper by Zoll et al.

• Unrealistic human avatars that resemble a real person do not necessarily fall into the uncanny valley, as a case study by Henriette C. Van Vugt et al. shows.

• Interestingly, the uncanny valley effect is weaker if an avatar is “mirroring” the movements of the subject observing the avatar, as shown by research from Elena Kokkinara and Rachel McDonnell. A potential explanation here is that people feel better about something that they can control.

• The more realistic the model, the more realistic movements people expect from it. A study by Jeremy Bailenson et al. found that a disparity between the realistic appearance of agents in VR and the realism of their behavior diminishes copresence, the degree to which an agent is perceived as a social entity. A human agent is expected to behave realistically, a teddy bear less so, and a blockhead, the least humanlike of the three, only achieves high levels of copresence when it demonstrates “unrealistic, random head movements.”

• The uncanny valley effect depends on the age, country and background of the subjects. For instance, it is noticeably weaker for gamers, as shown by Katja Zibrek et al. Possibly, those who have spent a lot of time looking at all kinds of NPCs are harder to unsettle with imperfect 3D models of people.

What It Will Take To Jump Over The Uncanny Valley

The technology for creating 3D avatars of people is getting more accessible every few months. My company’s research, for instance, has found that it is already possible to reconstruct a realistic 3D model of a person’s face from just one selfie, one that people who know the person in real life can recognize.

However, once the face is animated with technologies such as blend shapes, the spell is broken: It becomes immediately obvious that this is not the real person’s identity.
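
For readers unfamiliar with the technique, here is a minimal sketch of what blend-shape animation does, with illustrative names and data rather than any particular engine’s API:

```python
# A minimal sketch of blend-shape (morph target) animation.
# All names, shapes and values here are illustrative stand-ins.
import numpy as np

def blend(neutral, deltas, weights):
    """Combine a neutral face mesh with weighted blend-shape offsets.

    neutral: (V, 3) vertex positions of the neutral face.
    deltas:  (K, V, 3) per-vertex offsets, one per blend shape
             (e.g. "jaw_open", "smile_left").
    weights: (K,) activations in [0, 1], typically driven per frame
             by a face tracker.
    """
    # Contract the K axis: each shape's offset scaled by its weight.
    return neutral + np.tensordot(weights, deltas, axes=1)

# Example: 4 vertices, 2 blend shapes.
neutral = np.zeros((4, 3))
deltas = np.random.randn(2, 4, 3) * 0.01  # stand-in offsets
frame_weights = np.array([0.8, 0.2])      # e.g. mostly "jaw_open"
animated = blend(neutral, deltas, frame_weights)
```

Face trackers typically drive a few dozen such weights per frame. A fixed, generic set of shapes can only approximate how a specific person’s face actually moves, which is one reason the identity gets lost.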

There are a few promising ways to overcome this. All of them analyze the input video stream and use neural networks to synthesize a realistic facial model: for instance, pixel codec avatars, as presented in research by Shugao Ma et al. Other examples include methods based on Gaussian models (see Tobias Kirschstein et al.) and on NeRFs (for instance, Xiaowei Zhou et al.). However, there are still obstacles to placing these models into a 3D scene in a rendering engine and animating them realistically, including the body and hair.
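
To give a flavor of what these representations are, here is a minimal sketch of the core NeRF idea: a learned function maps a 3D point to color and density, and a pixel is formed by integrating along a camera ray. Everything below is illustrative; real systems query a trained network, not this stand-in field:

```python
# A hedged sketch of NeRF-style volume rendering along one ray.
import numpy as np

def radiance_field(points):
    """Stand-in for a trained network: (rgb, density) per 3D point."""
    density = np.exp(-np.linalg.norm(points, axis=-1))  # denser near origin
    rgb = np.full(points.shape, 0.5)                    # flat gray color
    return rgb, density

def render_ray(origin, direction, near=0.1, far=4.0, n_samples=64):
    """Volume-render one ray by alpha-compositing sampled points."""
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction
    rgb, sigma = radiance_field(points)
    delta = t[1] - t[0]
    alpha = 1.0 - np.exp(-sigma * delta)         # opacity of each segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans                      # contribution per sample
    return (weights[:, None] * rgb).sum(axis=0)  # final pixel color

pixel = render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
```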

Other methods reconstruct traditional 3D models that are ready for standard rendering engines such as Unity and Unreal Engine. Meta is developing its pixel codec avatars. My company has also recently released Avatar SDK Leap, an offline tool that converts a video of a person speaking, captured with an iPhone camera, into a 3D animation that can be imported into Unity or Unreal Engine. We use neural networks to synthesize the mesh and texture on each frame, which I believe results in an unprecedented level of realism. Currently, Leap is the only commercially available solution for neural-based animation, but other companies are following suit.
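
In the abstract, the per-frame pipeline looks something like the sketch below. The names and shapes are hypothetical, not Leap’s actual API; the point is simply that every frame of video yields a fresh mesh and texture instead of reusing a static model:

```python
# A hedged, illustrative sketch of per-frame neural avatar synthesis.
# The "networks" are stand-ins; a real system would load trained models.
import numpy as np

def synthesize_avatar(video_frames, mesh_net, texture_net):
    """Predict a mesh and a texture for every input frame, producing
    a per-frame animated asset rather than a single static model."""
    for frame in video_frames:
        vertices = mesh_net(frame)     # (V, 3) vertex positions
        texture = texture_net(frame)   # (H, W, 3) texture map
        yield vertices, texture

# Stand-in networks and a fake 3-frame clip so the sketch runs.
mesh_net = lambda f: np.zeros((5000, 3))
texture_net = lambda f: np.zeros((512, 512, 3))
clip = [np.zeros((1080, 1920, 3)) for _ in range(3)]

for vertices, texture in synthesize_avatar(clip, mesh_net, texture_net):
    # In a real pipeline, each pair would be exported in a format that
    # a rendering engine such as Unity or Unreal Engine can import.
    pass
```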

I believe that following this path will get avatars to the other side of the uncanny valley, but there is still a long way to go before we achieve realistic people in virtual reality. Hair modeling, for example, is still unsolved. Body movement indistinguishable from a real person’s is still a challenge even for AAA studios.

However, given the rapid progress in deep learning, along with a growing amount of data to train on, there is hope that we can solve these problems in the next few years. If this happens, realistic 3D avatars will soon be readily available to consumers. This will enable better AR/VR apps and remote collaboration experiences that transcend audio and video calls.

Although this tech is far from being on the rise right now, with Apple reportedly ending its smart glasses project and Meta, like just about every other major company, shifting its focus to AI, having a portable computer screen on your face is still a dream for many.

We have yet to hear news that we can call “A New Hope,” but we will be waiting.
