#7: Stefano Corazza on creating, auto-rigging & animating 3D characters with Mixamo, avoiding the uncanny valley & facial motion capture with a webcam

Believable characters can be one of the most challenging but most important things to get right within a virtual reality experience. Stefano Corazza and Mixamo.com have been working on a set of tools to make it easier to create & animate 3D characters within your virtual worlds.

Stefano talks about some of Mixamo’s services like Fuse to create characters, their auto-rigging functionality, their Decimator to optimize polygon count, and how they have over 10,000 different animations available in different packs. They have also developed Face Plus to motion capture the facial expressions and emotions of a character, which is the technology that High Fidelity is using to translate what you’re doing with your face onto a VR avatar.

Finally, he talks about the principle of the uncanny valley, and how Mixamo avoids it — as well as the importance of the expression of emotions within VR.

Mixamo also recently updated their Fuse product to make it easier to create 3D characters by making it possible to upload your own content that can be customized with their tools.

They have a couple of pricing options, ranging from pay-as-you-go to a $1499/year professional account where you get access to all of Mixamo’s services. Fuse is also available on Steam for $99.

Reddit discussion here.

Topics

  • 0:00 – Intro
  • 0:34 – Emotional importance of 3D characters – Face Plus facial animation, Fuse character creator
  • 1:56 – Optimizing high polygon count characters with Decimator & transforming Kinect scans into rigged 3D characters
  • 2:53 – 10,000 animations in pre-set packs and importing into a game engine like Unity
  • 3:33 – Face Plus motion capture of facial expressions, Unity plug-in & variety of applications of Face Plus
  • 4:30 – Mapping emotions to a face via abstraction process
  • 5:34 – How Mixamo avoids the Uncanny Valley via stylized characters
  • 6:47 – Creating efficient game characters
  • 7:11 – How people like Nonny de la Peña are using Mixamo
  • 7:48 – High Fidelity & their use of Face Plus & using facial animation in game play
  • 8:41 – Pricing structure for Mixamo
  • 9:10 – Future of Virtual Reality & importance of emotions
  • 10:02 – Representing yourself in virtual reality

Theme music: “Fatality” by Tigoolio

Rough Transcript

[00:00:05.432] Kent Bye: The Voices of VR Podcast.

[00:00:11.933] Stefano Corazza: My name is Stefano Corazza. I'm the founder and CEO of Mixamo. Mixamo relates to virtual reality because we help people create content. So 3D characters specifically, they are so important in the virtual reality experience. So they can create, they can rig, and they can animate characters on our online service and then bring them into any game engine and any virtual world where they want to experience VR.

[00:00:34.807] Kent Bye: I see. And so maybe talk about the emotional importance of avatars and what you do to kind of bring that emotional quality out.

[00:00:41.530] Stefano Corazza: Yeah, that's so important. Pretty much the majority of games are using avatars and characters because they convey so much emotion. And so we try to help creators of virtual reality experiences to create that compelling character and to convey as much emotion as possible. So we have worked really hard in the last couple of years to have a real-time facial animation solution. Most of the emotion is carried through the face of the character, so now with just a standard webcam, you're able to animate your character in real time without any technical knowledge. So that's Face Plus, one of our products. And then another product that has a lot of use for VR is Fuse. It's a character creator. You can create your own avatar, you can customize it, but you can also import from the outside. So it's one of the first, if not the first, open character creators where you can integrate your own piece of clothing or your own body or even your own body scan. So we are working with a few partners like Body Labs where you can use a Kinect camera to create a scan of yourself, and then you can import it into Fuse, add clothing, and customize it. We have over 200 ways of customizing the body shape, the hairstyle, the clothing, and so on, and then bring yourself into the game.

[00:01:56.482] Kent Bye: I see. Yeah, so being at the Silicon Valley Virtual Reality Conference here, I just got a Kinect 3D scan of my body, and he sent me this 10 megabyte file. Does that mean I could go and upload that to Fuse and then have it transformed from a very high polygon character into a more efficient one? Or maybe talk about what is happening with Fuse: what are you putting in, and what are you getting out?

[00:02:18.882] Stefano Corazza: Yeah, so basically we are opening up the platform to the world and we are giving out our standard template character. So if your scan follows some basic rules in the UV mapping, whatever the topology is, it doesn't matter, it can be imported into Fuse. And there basically all the assets are game ready, so they have an efficient polygon count, normal maps, and all this kind of stuff. Plus, we also offer services for decimating that, so if your scan is too high-poly, we have the Decimator that can reduce it so you can bring it onto mobile devices, into games, and so on.

Kent Bye: I see. And so it's all fully rigged up and ready to go? And maybe talk about, if you just throw that into Unity, what do you need to do to actually animate it and make it work?

Stefano Corazza: Yeah, that's a very good question. So we have about 10,000 animations that you can choose from, and some of them are actually already packaged. So we have a shooter pack, we have a locomotion pack, so you can select those pre-set groups of animations, and they already come with the logic for Unity. So they already have the state machine and the blend tree, so it's literally like a drag and drop into Unity and you have your own character moving around with all the blends between the idle and the walk and the run, all automatically set up for you.
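To make the polygon-reduction step Stefano describes a bit more concrete: the sketch below is not Mixamo's Decimator, just a minimal illustration of quadric decimation using the open-source Open3D library, with a hypothetical scan file name and an arbitrary triangle budget.

```python
# A minimal sketch of what a polygon-reduction ("decimation") step does,
# using the open-source Open3D library. This is NOT Mixamo's Decimator;
# the file names and target triangle count are placeholders.
import open3d as o3d

# Load a (hypothetical) high-poly body scan exported as an OBJ file.
mesh = o3d.io.read_triangle_mesh("body_scan_highpoly.obj")
print(f"Original triangles: {len(mesh.triangles)}")

# Quadric-error decimation collapses edges while trying to preserve
# the overall shape, bringing the mesh down to a game-friendly budget.
decimated = mesh.simplify_quadric_decimation(target_number_of_triangles=20000)
decimated.compute_vertex_normals()  # recompute normals after simplification
print(f"Decimated triangles: {len(decimated.triangles)}")

o3d.io.write_triangle_mesh("body_scan_gameready.obj", decimated)
```

Whatever tool performs it, the idea is the same: reduce the triangle count to fit the budget of the target device while preserving the silhouette as much as possible.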

[00:03:33.107] Kent Bye: I see. And the facial capture, is that something where you just need a normal camera, and then it tracks different points on your face? Or how does that motion capture actually work?

[00:03:42.810] Stefano Corazza: So Face Plus only uses a standard webcam and is based on machine learning. So we have motion captured a lot of people in different facial expressions, and we have learned from the data what it means if someone is happy or sad or angry and so on. And so we are recognizing these facial emotions on the video stream and then we are applying them in real time to any character. We also have created, since you mentioned Unity, a plugin specifically for Unity. So people in Unity can either use it to create content or as a runtime as well. The runtime is now in closed beta, but we have a few customers that are making games, TV, live shows, and even virtual worlds. Some of them are showing here. They're using Face Plus as a runtime component.

[00:04:29.937] Kent Bye: I see. And so with the face, there's a lot of different muscles moving at the same time, and an emotion may be a combination of, you know, dozens of different movements. And so, how do you map a face to those different emotions? Do you say, just show me happy, and then it sort of happens? Or maybe talk about that process a bit.

[00:04:47.796] Stefano Corazza: Yeah, so this is a very good question. The mapping between the human and the character is super important. And so in the past, people have tried to transfer the motion one by one. So let's say, move up the corner of your mouth one inch and then let's try to do the same on the character. This kind of stuff doesn't work unless your character is really very close to your facial shape. And so if I want to animate a fish or a car or a cartoon character, this stuff doesn't work. So we created a high-level abstraction for that. So as long as you can define what the happy shape is for your character, then we can map your happy expression as a human onto it. And then once the link is established, you can basically animate those characters in a way that is meaningful for the character.
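As a rough illustration of the abstraction Stefano describes (a hypothetical sketch, not Mixamo's actual API): instead of retargeting individual facial landmarks one by one, the tracker outputs high-level expression weights, and each character supplies its own blendshape pose for every expression.

```python
# Hypothetical sketch of high-level expression retargeting: the tracker
# reports how strongly each named expression is active (0.0 to 1.0), and
# every character defines its own blendshape pose for that expression.
# None of these names come from Mixamo's actual API.

# Tracked expression weights for the current video frame (from a webcam tracker).
human_expressions = {"happy": 0.8, "surprised": 0.1, "angry": 0.0}

# Per-character definition of what each expression means, in terms of that
# character's own blendshapes. A fish or a car can define these however it likes.
cartoon_fish_poses = {
    "happy":     {"mouth_corner_up": 1.0, "eyes_squint": 0.4, "tail_wag": 0.6},
    "surprised": {"jaw_open": 1.0, "eyes_wide": 1.0},
    "angry":     {"brow_down": 1.0, "mouth_frown": 0.8},
}

def retarget(expressions: dict, character_poses: dict) -> dict:
    """Blend the character's expression poses by the tracked weights."""
    blendshape_weights: dict = {}
    for expression, weight in expressions.items():
        if weight <= 0.0:
            continue  # skip inactive expressions
        for blendshape, amount in character_poses.get(expression, {}).items():
            blendshape_weights[blendshape] = (
                blendshape_weights.get(blendshape, 0.0) + weight * amount
            )
    # Clamp to the usual 0..1 blendshape range before handing off to the renderer.
    return {name: min(1.0, value) for name, value in blendshape_weights.items()}

# Prints the combined blendshape weights the character should show this frame.
print(retarget(human_expressions, cartoon_fish_poses))
```

The point is that the human-to-character link is defined once per expression, so the same tracked data can drive a realistic human, a fish, or a car without any per-landmark correspondence.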

[00:05:33.893] Kent Bye: I see. Can you describe what the principle of the Uncanny Valley is and how that kind of relates to the work that you're doing?

[00:05:40.638] Stefano Corazza: So the Uncanny Valley is that phenomenon where if you try to achieve photorealism but you don't actually get there fully, then the result is pretty terrible. So a human in a virtual reality environment that is somewhat photorealistic, but not exactly, will read like a sick human or a zombie or a corpse, and it will be weird and it will be creepy. And so you have two options. Either you decide that you're not going to go for photorealism and stay on the cartoony side, like all Pixar movies do, for example, and you make your character stylized so it's clearly not a real human, right? Or you have to spend, you know, $100 million and make Avatar, where they actually get there, right? But you don't know that you got there until you do. So, in the meantime, you may have run out of money, right? So the Uncanny Valley is a very deadly place to be. And so we try to be stylized, and especially if you want to have a lot of characters in a game running in real time, and maybe on a mobile device, then you need to be aware that photorealism is not an option. So you need to stylize your content in a way that looks cool.

[00:06:47.773] Kent Bye: I see, and so you're deliberately not crossing over the edge of the Uncanny Valley is what I'm hearing.

[00:06:51.836] Stefano Corazza: Yeah, we're not looking at subsurface scattering to render the skin perfectly and that kind of stuff at the moment. We are creating game characters that have to be able to run efficiently on any game engine and possibly any device. And so once you're aware of that, then you can have the art be super compelling, but without crossing that line.

[00:07:14.149] Kent Bye: Nonny de la Peña mentioned Mixamo as being a very crucial component of a lot of the work that she does in immersive journalism. I'm just curious if you're familiar with her work, and how you've seen her use what you're offering.

[00:07:28.265] Stefano Corazza: I briefly actually touched base with her here at the conference. Every time we have users and customers coming up with stories like this, we are super excited because we have over 200,000 users at the moment, but we only probably know 1,000 of them. And so it's fascinating to discover all these different uses of Mixamo. We are very proud of that.

[00:07:50.024] Kent Bye: Are there any other projects out there where you'd say, hey, they're a real example of what you can do with our product?

[00:07:57.027] Stefano Corazza: Yeah, so just to stay here at the conference, High Fidelity has been experimenting with our Face Plus solution. It was very interesting to see a completely different challenge. And then talking about Face Plus, we have seen some usage in game development where they try to integrate facial animation into the gameplay, tracking the blinking of your eyes to trigger some actions, so that was also very interesting. And we have even seen live shows in Japan doing real-time animation of manga characters using Face Plus. So of all the products that we have, it's probably the one that has been used in the broadest spectrum of applications. We're always learning from users and trying to make it better for all those applications that we didn't envision at the beginning.

[00:08:42.317] Kent Bye: I see. And so what is the licensing structure if someone wanted to use Face Plus in their project?

[00:08:47.859] Stefano Corazza: So we have one flat subscription that is $1,500 a year, and that gives you access to all the services that Mixamo is providing. So Fuse to create characters, Face Plus for facial animation, the 10,000 animations on our site, the rigging service and the decimation service, plus premium support. So all this basically comes in a single yearly license package.

[00:09:11.542] Kent Bye: And so finally, what do you see as the ultimate potential of virtual reality?

[00:09:16.583] Stefano Corazza: So I think now we have the toy to play with that we didn't have before, at least a cheap enough toy like the Oculus that we didn't have years ago. And then I think now is the time to think about UX and, you know, the user experience. In the end, emotions are the most important thing. So how can we create experiences that create those emotions that are meaningful, which is why we want to have VR? We don't want to take every action we do and convert it into a VR experience, because reading a Word document probably doesn't make sense in VR, but we can create new experiences of travel, of flying, of interaction between humans, and basically go beyond what is possible and really use it for what it can deliver.

[00:10:00.323] Kent Bye: I see. And it seems like there's a social component, so how do you see Mixamo kind of playing into that?

[00:10:05.850] Stefano Corazza: We are basically trying to help people represent themselves in virtual reality. The first thing you do when you pop an Oculus on is look at your arms, right? And if it's not tracking them, you don't see them. So there's some excitement in having yourself be represented in any form in a virtual world, and that's why Second Life was so successful at the beginning, because people had another identity. And so we want to help people create those fantastic characters the way they want, so they can represent themselves in the virtual reality world the way they want.

[00:10:39.186] Kent Bye: Great, well thanks so much. Thank you.
