Sylvio Drouin is the Executive Vice President of Unity Labs, which for the past year has been doing advanced research into VR authoring tools for both developers and consumers, as well as graphics research. I caught up with Sylvio at Unity’s VR/AR Vision Summit, where we talked about some of their research projects, including VR authoring tools within VR, Project Carte Blanche to bring authoring tools to consumers, integrating motion capture and facial capture technology into Unity, and the future of smart assets that use AI and machine learning.
A big theme of Unity Labs’ work is making assets smarter so that development can eventually become more intent-based and controlled with voice input. The smarter the assets become, the more streamlined and user-friendly the VR authoring tools can be. So 3D assets are going to carry more metadata and, eventually, more sophisticated integration of AI, deep learning, and machine learning, enabling intent-based content creation with a very simple and minimal UI.
Unity announced at the Vision Summit that they have 5 million registered developers, and they’re hoping to expand their content creation tools into the wider consumer market in 2017 with their Carte Blanche project. They’re planning on leveraging the Unity Asset Store to allow an even larger demographic of VR users to create content without having to write any code.
Sylvio also talked about some of the new storytelling tools that they’re integrating that will allow people to create timeline sequences that are similar to film editing software. Enabling people to tell their stories with VR technology is something that has been motivating Sylvio for a long time, and so you can expect to see a lot more tools for capturing human performances in VR using webcams, Kinects, VR input devices, and other hardware input solutions yet to be announced.
It’s still an open question whether the metaverse will start out as a closed, walled garden of apps or be more open and interconnected like the Internet. Sylvio’s suspicion is that it’ll eventually be an open and interconnected world that is more similar to the Internet than to the fragmented game console market, but that either way Unity will have a key role to play. Given that around 90% of the consumer VR experiences released so far use Unity, they’re in a really great position to continue expanding their reach from existing developers into the wider consumer market starting sometime in 2017.
Theme music: “Fatality” by Tigoolio
[00:00:05.452] Kent Bye: The Voices of VR Podcast.
[00:00:11.977] Sylvio Drouin: My name is Sylvio Drouin. I'm a VP at Unity. I run the advanced research via Unity Labs. We started labs about a year ago. We have a branch in Paris and San Francisco. Currently, we work on VR authoring from a developer angle, from a consumer angle, and we also work on advanced graphic research to bring the graphic pipeline to the next level.
[00:00:35.017] Kent Bye: OK, great. So yeah, in terms of virtual reality, talk a bit about what is this innovation lab doing specifically with VR?
[00:00:42.957] Sylvio Drouin: As Dio mentioned in the keynote today and there's a fundamental, you know, we were looking initially because there's so many research projects that we could tackle and we have a certain amount of resources that we have access to, but we came up with a fundamental principle outline that gives us a framework to decide which project we're going to work on. And this principle is that every time that we make assets smarter and more beautiful, we peel one layer of complexity from the authoring pipeline. So ultimately, in 20 years from now, you'll be able to create very rich, complex VR and non-VR experiences simply by gesture and your voice. And at this point, the authoring UI in Unity today is quite complex, and you can see that you have to master hundreds of potential parameters and windows and components. At that stage, 20 years from now, just with voice and intent, you'll be able to create content and the user interface will be practically nonexistent. So that's one of the fundamental core principles that help us to define what project we're going to start working on today.
[00:01:49.368] Kent Bye: Yeah, and I think that when you look that far out into the future, you know, we look at the mobile phone market, we have this application ecosystem with app stores. And then there's also the sort of wider, more open internet where there's web applications that can replicate what the apps can do. I see that at this moment, the technology for virtual reality really requires it to be an application to be able to hit the frame rates and be able to have the performance. But eventually, I could imagine a world that is less of an app ecosystem and more of an open web metaverse model. So, you know, when you look at this kind of long-term strategy as to the future of the metaverse, is virtual reality and immersive media going to be yet another kind of console war, with these factions and siloed content? Or is it going to be more like the web, which is cross-platform and has applications across all of them?
[00:02:42.875] Sylvio Drouin: Frankly, I would sincerely hope, and the actual metaverse is a discussion that is ongoing in Labs, and we obviously would love to be at the forefront of providing the technology and the components for the next 15, 20 years to build the next metaverse. It's something that we believe in, it's something that we are considering in terms of connecting different Unity experiences together. How do you transfer via an avatar system? How do you transfer from experience to experience? How do you provide a foundation to create economies that are transferable between worlds? I think that we want to first start connecting the different VR experiences and if we provide the tools to do this, someone will come up with one or multiple metaverses and one of them will be successful. But I totally believe that it should be an open world where people can come, create, play and learn. And that's one of the big issues that we're considering, is that the place where you come and create is the place where you come and play. So you actually create where you play. And this is abolishing the boundaries or removing the boundaries between having an editor where you create and a platform where you publish; you just do this in the same world. And that's the metaverse. And it comes back to the concept, again, of the smart connectable asset, where the smarter your assets are, the less UI you need to actually create an experience, and the closer you are to this metaverse concept.
[00:04:15.596] Kent Bye: Yeah, and it seems like in the short term, they announced today that Unity has somewhere around 5 million developers that are registered users on the Unity platform, and there's going to be tools released sometime later in 2016 so that you could start to create VR within VR. But you're also working on a project called Carte Blanche to potentially take these tools and start to enable them for people who have motion-tracked controllers with the Oculus Touch and the Vive, to be able to actually go in and, as consumers of VR, also start to participate in creating without having to know anything about programming. And so it seems like part of your strategy is also to start to expand your demographic and your market from that five million into even more and more. And so maybe you could talk about your strategy moving forward into starting to take these tool sets from Unity and make them available to people who don't have, say, programming backgrounds.
[00:05:15.092] Sylvio Drouin: But I think one thing that we saw with VR is that it attracts so many more content creators. The standard artist, like an artist that wants to do a museum exhibit, will not start working on the PlayStation 4. It's complicated, it requires knowledge, it's a lot of coding, it's not accessible. Apart from a few artists, they don't really release that on the App Store. But because of the fact that VR offers this multi-dimensional storytelling platform, artists from all over the world want to do it. So for us, it's a creation medium. It's a creation medium that enables so many more people to actually create content. So in this way, we said, all right, do we keep forcing our content creators to write code, or are we going to provide them with new tools to actually create those experiences? So that's why, in the short term, we're coming up with something called Director, which is a multi-layer timeline sequencer that will allow you to assemble, sequence, and create your own little Pixar movie. We are coming up with the Carte Blanche project, which will allow you to take assets from the Asset Store, assemble them, and create experiences as well, or your own asset, without ever writing a line of code. And then we can start thinking about the fact that, okay, we have 5 million registered developers, but why not have 45 million? How do we give access to content creation to many more people? And that's why all those projects exist right now in Labs.
[00:06:50.341] Kent Bye: And if you were to kind of crystallize the intention and mission statement of Unity Labs, what would it be?
[00:06:57.872] Sylvio Drouin: I would say that it is, again, to make content creation more accessible to more people. That's one of the missions of the lab. And obviously, there's always this underlying mission of providing the tools and the technology to create the most beautiful content. How do you render the most beautiful frame? And that's always that ongoing effort that will consistently be there. But that's on the core technology side. On the authoring side, because we're a content authoring company, so on the content authoring side it is really to make this accessible to more and more and more and more people to tell their own story. That's why we say that Unity could become the future of self-expression. That's how you would express yourself. If you imagine again, many years from now, in the metaverse, it will be the core platform where you will create content and interact with your friends via this content.
[00:07:53.958] Kent Bye: And what do you think are some of the biggest blockers for content creators, or some of the biggest open problems that you're trying to solve?
[00:08:01.247] Sylvio Drouin: In VR, there's going to be various stages to that. Right now, I think we're still at the level of understanding how people will spend more than 15 minutes in a VR headset without being sick. How do you engage them? It's like Palmer said today, I can sell a lot of devices, but if people are not engaged and don't come back, it doesn't make sense. So how do we provide them with first the technology and the tools to make sure that they're not sick? Second, how do you provide them with a storytelling language that is necessary to create a good VR experience? Because you cannot just take what we knew before and cram this into VR. It doesn't make sense and it does not work. So I think we're very much interested in those two aspects. And once we resolve those two aspects, then we'll start, we'll be able to open up a whole new world of possibilities on what are the tools that are going to be needed to go even further. But right now, it's those two fundamental problems that we need to solve.
[00:09:00.547] Kent Bye: Yeah, and it seems like, you know, when I asked Dio about being able to use the motion-tracked controllers to do some sort of low-fidelity motion capture, you know, being able to actually do a performance and then capture that, it sounds like that's something you guys are working on. And so what can you tell me about that? It seems like adding human expression and human movements and being able to capture that is going to give life to a lot of these characters, where you'd be able to bring them in from the Asset Store, but to actually really bring them alive, I could imagine being able to use this VR technology to actually give a performance that is then translated within VR.
[00:09:37.892] Sylvio Drouin: Obviously, we are working on what we call consumer motion capture, which is going to capture body and facial expression as well. This is an ongoing effort. That's not part of Labs, it's more like the main R&D team, but this is definitely going to make its way back. You know, there is a flux between Labs and R&D in the company, like we exchange a lot of ideas and projects. This is a project from R&D that will probably make its way into the Carte Blanche project so that you can easily grab an asset from the Asset Store and use your own expression and movement to drive and animate the asset. If you start to do a consumer authoring tool, nobody is going to buy a $90,000 motion capture rig. You want to be able to do this from a camera, a Kinect, or whatever device is going to come out soon.
[00:10:25.401] Kent Bye: And so when you talk about advanced rendering, you know, when I talked to Layla Mah, formerly of AMD and now at Insightful VR, she was working on digital light fields and talking about this sort of chicken and egg problem where, you know, when you talk about having like a 16 GPU multiprocess architecture, it's something that she could build, but she would need to have support from something like a game engine to be able to even drive these, you know, multiple GPU architectures. And so, when you think about digital light fields and the future of rendering, both in VR and AR, do you see that at some point we're going to come to a limit to what pixel-based rendering is going to be able to achieve? And do you see that eventually the industry is going to be moving more towards this digital light field, virtual retinal display type of approach?
[00:11:12.174] Sylvio Drouin: I mean, I've seen some light field solutions. Now, you know, it's extremely expensive and complicated. Now, what I can tell you is that, talking about Layla, I think Layla has a solution with Insightful VR. And I can tell you that, without revealing too much, Insightful VR will certainly collaborate and partner with Unity to make this happen, to make that level of quality happen at the lowest possible cost. I cannot reveal more than that right now, but it's in the works, yeah. And, you know, I'm very curious about what chipset manufacturers are doing right now. I don't see any form of consumer-grade light field solution in hardware anytime soon from what I know. So that's why we're looking at different solutions. You could say that we're looking at, you know, cloud-based solutions to take advantage of the cloud power to create the quality that we need.
[00:12:10.586] Kent Bye: So do you imagine a time where you have cloud-based rendering that's piped over, say, fiber optic and going to mobile VR headsets that don't have super fast processing?
[00:12:22.359] Sylvio Drouin: Yeah, exactly. That's what I think.
[00:12:27.944] Kent Bye: Great. So for you, how did you get into virtual reality?
[00:12:32.197] Sylvio Drouin: My god, I was working in VR, I've always been at the intersection of computer graphics and AI and advanced authoring tools. I wrote a research paper 20 years ago about VR and I've always been fascinated by the subject, being at the forefront of authoring interfaces and graphical tools. For me, it was natural: when I started Unity Labs last year, when I saw the state of the new generation of VR devices, I immediately saw the actual potential in terms of storytelling. And I saw the potential in how this would open up a complete new world, or it would open up content creation to a completely new audience. And I wanted to be part of it. I reoriented my entire career; for many years now, I've wanted to help people tell their own story via technology. So it was like a natural answer to that.
[00:13:31.745] Kent Bye: What do you think that virtual reality provides as a medium that is new and unique to telling stories?
[00:13:39.950] Sylvio Drouin: The way that I see it, and I think that you can ask me this, if we see each other every year, I'm going to give you a different answer every year, but my answer this year is that I believe that using a 2D interface on a flat surface, with a flat monitor, with a mouse that's running on a 2D surface, to do 3D requires immense translation effort. That's why we have extremely complex UI to do this translation. Now, if I let go of all this translation effort and I don't need all that complex UI, that's where I can unleash my creativity as an artist, as a content creator. So I believe that because of this, because we just remove those barriers, the types of stories that are going to be created are going to be crazy. It's just unbelievable, what we can't even predict now that's going to happen with VR. It's just that we've suddenly opened the door to content creators that were actually blocked by the same technological interface that we had created before. That's what I believe. That's why I believe it's a powerful medium.
[00:14:52.990] Kent Bye: Have you seen any specific examples of VR experiences that you feel like are really starting to figure out this new language of storytelling within VR?
[00:15:02.128] Sylvio Drouin: Yes, and it's Alex, and yes, it's an experience that I've seen, that I don't remember the name of, at Sundance, where you have to hold each other, you're two characters, and so you go there, you have to do it with another person. You both put on the headset, and you have a sensor on your feet and on your arm, and you have to hold each other's hands, because you're going through this maze, and it's like cramped, and one of you needs to hold the torch, and if you let go of the hands of the other person, the other person sees nothing, and they're lost. And so you have to hold each other and this holding, this touch feeling, this amazing feeling that you had of being with someone else holding hands in this virtual world was the most powerful experience I've ever tried. And I think that the storytelling language for VR will evolve a lot as we evolve the senses as well, as we have the sense of touch, haptics, smell, anything you can imagine that, you know, all the device manufacturers are working on in their own labs. The more senses we're going to bring to the table, the more complete the actual story will be.
[00:16:09.368] Kent Bye: Yeah, I was at Sundance as well, and I had a chance to see all 37 of the experiences there, and that was Real Virtuality: Immersive Explorers. So you had OptiTag controllers on both your hands and feet.
[00:16:20.505] Sylvio Drouin: So you tried it? Yeah. Okay.
[00:16:22.391] Kent Bye: Did you like it? Yeah, so for me it was probably the most amazing experience because for me that was really the first time that really invoked the full virtual body ownership illusion where you have one-to-one tracking of all of your limbs, both your hands and your feet, and in a VR where you were just untethered so you could just freely walk around without feeling any constraints from the wires, and you're able to really kind of get lost in walking around these different environments that you're teleporting between. And to me, there's the moment where I was on this platform kind of floating through this Blade Runner type world where I was just looking at everything, and, you know, the graphics and the art wasn't necessarily the best that I've ever seen, but it just felt that... It was beautiful. Yeah, it was beautiful and it just allowed me to walk and feel fully immersed with another person that also felt like they were fully there. And I do agree that that level of invoking this virtual body ownership illusion gives you this sense of presence with mixed reality experiences that really goes beyond anything I've experienced.
[00:17:19.024] Sylvio Drouin: So you know what happened to me in that experience? It's very funny because there was a bug in the experience. And my head was here, and my body, so my head was like a feet left of my body. And my body was upside down, so I was walking on the ceiling. So they asked me, they said, oh, do you want to restart and fix it? I said, no, you know what? I'm going to experience it like that, with my head, feet. And it was the most surreal experience ever, because I could actually not walk. My wife had to actually really hold my hands and drag me, because I was walking on the ceiling, so I had no clue where I was going, but it was so surreal that I decided to go through it like that, just to experience it, you know? And it was awesome.
[00:18:01.224] Kent Bye: Yeah, it's possible that you didn't fully experience the virtual body ownership illusion because of that, because your head was detached and a foot left or so, yeah.
[00:18:09.866] Sylvio Drouin: But it's made with Unity, so I've kept contact with them, and they're gonna have me try it again.
[00:18:16.457] Kent Bye: For a lot of the, right now at this point, you throw a lot of things within an environment and in order to do anything with anything, it actually takes quite a lot of programming in the background to be able to make sure that it has sort of a realistic feel. Right now it feels like a huge gap to be able to kind of just even mimic natural interactions and reality within VR and that, I would imagine that there could be tools to do that. But also when you think about highly dynamic, interactive characters with artificial intelligence, you're starting to talk about a layer of AI that starts to get into, perhaps that transcends what Unity is going to be focusing on. But I could see a future where AI is going to be quite an integral part of...
[00:19:01.165] Sylvio Drouin: Yes, the future you will see. Right now, a lot of the character animation is driven by mocap and very expensive mocap. Then you need to clean up the data and you end up having a series of 10 different custom or non-custom tools that you're using just to get the character to move right. And hopefully in the future that's something that we're looking at, not now but in the future. In fact we're looking at it now, because we are looking at cheap mocap systems that can generate very clean and accurate data. But we're also looking at how to use AI in the future to control those characters so that I don't have a team of 100 artists animating everything, you know. Again, it is exactly part of the principle that I outlined at the beginning of this interview: every time that you make your asset smarter, smarter means more intelligence in your asset. It means that the character will move by intent. If I say character smile and run, he'll move and he'll smile and run or she'll smile and run. That's what we mean when we say make assets smart. So the smarter your assets are, the fewer external tools you need, you see. And that's where we're gonna head. That's where we're gonna head in terms of AI. That's where we're gonna head in terms of not only simplifying our own authoring pipeline, but simplifying your entire authoring pipeline with all the tools that you're using now. Because right now, any form of interactive experience, it's kind of an artisan approach. Johan Andersson from Frostbite said it so well. He said when we do Need for Speed and when we do Star Wars Battlefront, there is so much sweat that goes into making an asset beautiful and behaving the right way. 200 people, 300 people, sometimes a thousand people work on making all those assets look so right for 2, 3, 4 years. We want to go beyond this.
We want to have enough intelligence in the assets so that we can drastically reduce the actual effort that it takes to create a beautiful experience.
[00:21:11.667] Kent Bye: And when I hear you say that, it makes me think of Google and their whole repository of being able to capture information from thousands and millions of users. Is something like machine learning something that you've been looking into in terms of using machine learning and applying that to VR?
[00:21:28.259] Sylvio Drouin: We have not been doing this directly, but I'm in conversation with friends of mine who have done research projects at Stanford and Princeton, who have worked on leveraging Google data to do intent-based modeling. So Google has a bunch of 3D models; they've modeled a lot of content. Let's say that they take 10,000 models of chairs, modeled in 3D, and they tag specific attributes to each one of those chairs. So what you do is that with machine learning you analyze all of those chairs and you analyze all the attributes, and then with voice you can say, OK, I want a chair of the 16th century, I want the chair to look more reddish, I want long legs on the chair, and then you start talking. And the chair is going to morph in front of you. And that's all machine learning. That's all deep learning that has analyzed the 10,000 chairs, the 10,000 geometries, and the hundreds of thousands of attributes that were assigned to those chairs, and then can come up with the right morph based on what you say. So that's something that we see today in research projects. And being able to leverage a data set like Google has is the key to doing something like this, because machine learning is all about the data. No data, no insight.
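Editor's sketch: the attribute-driven morphing described above can be illustrated with a toy example. This is only a minimal sketch under loud assumptions: the three tagged chairs, the attribute names, and the `morph` helper are all hypothetical, and a simple similarity-weighted blend stands in for the deep learning model mentioned in the interview. It is not Unity's, Google's, or the researchers' method.

```python
# Illustrative sketch only: the data, attribute names, and morph() helper are
# hypothetical, and the weighted blend stands in for a trained model.
from typing import Dict, List

# Each tagged model pairs attribute scores in [0, 1] with shape parameters.
# Shape parameters here are [leg_length, redness], both in [0, 1].
TAGGED_CHAIRS = [
    {"tags": {"long_legs": 1.0, "reddish": 0.0}, "shape": [1.0, 0.0]},
    {"tags": {"long_legs": 0.0, "reddish": 1.0}, "shape": [0.0, 1.0]},
    {"tags": {"long_legs": 0.5, "reddish": 0.5}, "shape": [0.5, 0.5]},
]


def morph(request: Dict[str, float], models: List[dict]) -> List[float]:
    """Blend shape parameters, weighting each model by how well its tags match."""
    if not request:
        raise ValueError("request must name at least one attribute")
    weights = []
    for model in models:
        # Mean absolute gap between requested and tagged attribute scores.
        gap = sum(abs(model["tags"].get(name, 0.0) - score)
                  for name, score in request.items()) / len(request)
        weights.append(max(1.0 - gap, 0.0))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no model matches the request")
    size = len(models[0]["shape"])
    return [sum(w * m["shape"][i] for w, m in zip(weights, models)) / total
            for i in range(size)]


# A spoken request such as "long legs" would first be parsed into scores.
long_legged = morph({"long_legs": 1.0}, TAGGED_CHAIRS)
```

Here the request pulls the blended leg length toward the chairs tagged as long-legged. A real system would presumably learn the mapping from attributes to geometry from a large tagged data set rather than interpolating a handful of hand-made vectors.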
[00:22:52.529] Kent Bye: Are there any dates that you can tell us when there are going to be some of these VR creation tools or other projects that you're working on?
[00:22:59.792] Sylvio Drouin: The VR scene editing for developers that you've seen today at the keynote is before the end of this year. And the Carte Blanche project, hopefully we want to have something in the hands of consumers sometime in 2017, at least as a first test.
[00:23:14.479] Kent Bye: Great. And finally, what do you think is the ultimate potential of virtual reality and what it might be able to enable?
[00:23:22.361] Sylvio Drouin: It's a big question and it's a good question. If I had asked you 10 years, 15 years ago what the potential of the internet was, it was to connect people. So I believe that VR is just an even deeper way to connect people. It goes beyond websites and chat rooms and it connects people in a much deeper way, which will create applications that we don't even know about now. That's what I believe.
[00:23:49.880] Kent Bye: And how is Unity going to be a part of that?
[00:23:52.142] Sylvio Drouin: It's going to be the foundation of all of this. Why not? We are there now. So we're going to power the graphic, we're going to power the content creation tool, and we're going to power the components that are going to be used to build this new connectable world.
[00:24:06.718] Kent Bye: The only thing I would say why not would be sort of this whole debate I talked about like the closed app ecosystem versus the open metaverse model and which way things are going to, if it's going to be a hybrid or if it's going to be both, you know, like I could see it being an open world and a closed world. And I, I tend to want to have an open world, but right now the models, the business models seem to be leading towards a closed model.
[00:24:27.897] Sylvio Drouin: But I'm a big proponent of the open world, obviously. Now, who's going to own, is it Facebook who's going to own the metaverse? Is it Google? Is it something else? Is it going to be an open world? Is it going to be, you know, I would rather think that it's going to be like the internet at some point. Once we get all the technology together to do the actual metaverse, it will be open. There's no choice but for it to be open. You'll have obviously the Apples of this world that are going to try to control this, and maybe there's going to be five different metaverses and you'll go into a certain one that's going to be closed and another one that's going to be open. But if you look at the landscape and the way that the technology will actually evolve, it tends toward being open. It tends toward being just like the new internet. Great.
[00:25:18.357] Kent Bye: Well, thank you so much. Thank you. And thank you for listening. If you'd like to support the Voices of VR podcast, then please consider becoming a patron at patreon.com slash Voices of VR.