At Unity’s Unite keynote in November, Otoy’s Jules Urbach announced that their Octane Renderer is going to be built into Unity to bake light field scenes. This also sets up the potential for real-time ray tracing of light fields using application-specific integrated circuits from PowerVR, which Urbach says could render up to 6 billion rays per second at 120 watts. Combining this PowerVR ASIC with foveated rendering and Otoy’s Octane Renderer built into Unity provides a technological roadmap for producing a photorealistic quality that will be the equivalent of beaming the Matrix into your eyes.
I had a chance to catch up with Jules at CES, where we talked about the Unity integration, the open standards work Otoy is doing, overcoming the uncanny valley, the future of the decentralized metaverse, and some of the deeper philosophical thoughts about the Metaverse that drive Otoy’s work toward creating a virtual reality visual fidelity that is indistinguishable from reality.
LISTEN TO THE VOICES OF VR PODCAST
Here’s Otoy’s Unity Integration Announcement
Here’s the opening title sequence from Westworld that uses Otoy’s Octane Renderer:
[00:00:05.452] Kent Bye: The Voices of VR Podcast. My name is Kent Bye, and welcome to the Voices of VR Podcast. So I first met Jules Urbach at SIGGRAPH 2016, and the thing that really impressed me about Jules is that he communicated with such a density of information that far surpassed anybody else that I had talked to over the past two and a half years. And he's somebody that's really thinking deeply about the future of where virtual reality is going in order to get to the point of actually creating a visual fidelity that is indistinguishable from reality. He's been working with digital light fields and the Octane Renderer with Otoy, and he's also picked up a couple of Academy Awards for their work on Avatar and Benjamin Button. So when I went to CES this year, I had a chance to sit down with Jules to really do a deep dive into some of the deeper philosophical motivations that drive him and the work that he does at Otoy: what it means to be able to create a virtual reality that is visually indistinguishable from an actual reality. So we'll be covering all that and more on today's episode of the Voices of VR podcast. But first, a quick word from our sponsor. Today's episode is brought to you by the Silicon Valley Virtual Reality Conference and Expo. SVVR is the can't-miss virtual reality event of the year. It brings together the full diversity of the virtual reality ecosystem, and I often tell people if they can only go to one VR conference, then be sure to make it SVVR. You'll just have a ton of networking opportunities and a huge expo floor that shows a wide range of all the different VR industries. SVVR 2017 is happening March 29th to 31st, so go to vrexpo.com to sign up today. So this interview with Jules happened at the Consumer Electronics Show that was happening in Las Vegas from January 5th to 8th, 2017. So with that, let's go ahead and dive right in.
[00:02:06.182] Jules Urbach: So I'm Jules Urbach, I'm the CEO and co-founder of Otoy Inc., and Otoy is well known in the computer graphics industry and the real-time rendering space for our products, including Octane Render, which is an unbiased CGI path tracer. Most recently we've had a pretty exciting announcement regarding Octane, which is that it will be bundled with Unity in 2017. So Unity is going to have all the cinematic tools that have traditionally been reserved for things like Pixar's RenderMan. And this will be baked into the Unity Editor, so you can actually bring in cinematic assets at the highest quality. Some of Octane's work is in the shows and the movies that you see today. The opening of Westworld, for example, was rendered entirely in Octane for Cinema 4D. And we also do other things in the visual effects industry. Light Stage, for example, won an Academy Award for bringing digital humans into films such as Benjamin Button and Avengers. And most recently, if you saw a certain Star Wars movie, there were two characters in there that Light Stage helped create, and we were very proud of having a role in that development. Now, the exciting part about VR and mixed reality and all of this roadmap that we have ahead of us regarding Octane and our work in the space is that there's a huge pent-up demand for getting absolute photorealism that is navigable, that is game-like. That's something that has always been challenging even when you're doing games at 2K or 4K, but it's even more so when your constraints are doing this in a pair of glasses, going to 120 hertz, and beaming 4 megapixels into each eye in HDR. So the approach that Otoy has taken to solving these problems has been manifold. And the first part has been really to solve the content creation pipeline in the video game world, where we believe a lot of this kind of content needs to germinate, and Unity and Unreal are really the two areas where we're supporting both of these engines to provide that.
And adding Octane into those engines, with Unity having it built in, means that the quality of simple things that you currently do in Unity and Unreal, such as baking lighting and translating cinematic assets into game engines, is really all handled for you. But the most exciting development out of the other side of that is that we don't just have to render textures and lightmaps into a mesh; rather than rendering a skybox or a mesh, we can also generate a true hologram and have that be a proxy for a much more complicated model. And this is something that is familiar in some sense when you're baking lightmaps in Unity, but it's really taken to a whole other level when you can have things that are as complicated as a Star Wars movie or the opening of Westworld be translated into a format that is as easy to decode as video, but looks as real as the Matrix or the holodeck. And that always was my inspiration for working on both the visual effects pipeline and the real-time pipeline. And the touch points outside of those pieces on the content creation side have started to germinate really nicely in the last two years. It started with a really great and ongoing relationship with Oculus, with John Carmack in particular, who wanted to see the very first pieces of what we could do for jump-starting really high-quality content on the Gear VR. And those who've used a Gear VR might have noticed in Oculus 360 Photos there is a section called Render the Metaverse, and this was all generated with Octane, independently of what we currently are doing today, by using a format that Carmack had sort of specified. It's 18K, and it gives you this absolute perfect quality bubble of reality around you. And those stills have been on Oculus 360 Photos, they've been used in backgrounds for other app players, and most recently, Samsung Internet has supported the format. But there's a lot more that is being added in the publishing model that we're now supporting.
So, at the end of last year, we added the ability for Oculus Social, which is now sort of evolving into more of a social layer in Oculus' platform itself, to pull down more complex renders out of Octane, and those can include things like animations and movie theater environments that are absolutely peak fidelity but are relit by a shared surface. A lot of content partners really were moved by that, and it was really exciting to see Disney, who brought in Marvel and ESPN and ABC, and we had Star Trek, Warner Brothers, and a lot of others basically plant a flag and say: our website, Disney.com or ESPN.com, is going to be built as an ORBX environment, and the portal to the site is basically a shared screen. The content creation system for that started modestly enough with the Oculus Social layers, where you're in an environment that is static, but as we move towards six-DoF and into next year with true mobile positional tracking, you're going to have these environments really become room scale. And we want to add layers where those are the starting points for much more complex interactions. And our relationship with Unity is not just about supercharging the graphics for Unity content. It's also to come up with ways where Unity itself can be leveraged to publish components of what we see as an open metaverse. So inside of the current Oculus Social layers that we've added, you can, for example, load up a screen in an environment and it can be an Unreal Engine game running on the cloud. And we've provided that system to Unreal developers for a while. But the tools that we're going to be adding to both engines, and it's going to be something that's just, again, embedded in Unity, will allow you to publish your Unity game to a link, and it can be streamed or downloaded and do hybrid rendering at full native performance without having to publish an APK or an executable or one of the 29 different formats. Unity and Unreal both do this with the HTML5 backend.
But the performance for WebVR and HTML5 is really, it's five generations behind and we don't want that. We need every single bit of performance that we can get. And to do that, we wanted to have no compromises. So we have tested and we've actually been showing at the show an experimental version of the Oculus Social Environments that can actually pull in an embedded Unity project and run that live. And it's awesome. You can think of it as an MP4 with a game in it. And having that work at native speed is really cool. And it's something that can apply to other engines. And it's a great component for allowing you to have three different Unity projects. And before we worry about how they physically behave with each other, blending them together in an environment means they reflect off each other, they're lit perfectly. And the ability to render cinematically, physically correct light fields in Unity and Unreal, and even in the cinematic pipeline, means that all of this data is kind of in there. I mean, our light field rendering approach was never just about navigating around an object and getting really good reflections. It was also about providing enough information in that very quick to decode data set to provide relighting. And that becomes also very important when you're talking about things like ODG or HoloLens, where the world around you is meant to provide a sense of realism, where the object being rendered is not just being lit by other synthetic things, it's being lit by the scene itself. And I think we've actually come a really long way to making that work well on the ODG glasses just by having the cameras provide us a sense of the incoming light. That's something that we've been able to really do well. And we were showing with Unity how we can take a light stage capture, which is something we previously leveraged for films, bring that into Unity and have a perfectly lit human head that can be lit by Unity objects or the elements in the real world. 
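The relighting Urbach describes, where a captured object is lit by the real scene the headset cameras see, reduces in the simplest diffuse case to integrating the incoming radiance against Lambert's cosine law. Here is a toy sketch of that idea; this is my own illustration of the general technique, not Otoy's actual pipeline, and the function names are hypothetical:

```python
import math

def diffuse_shade(normal, env_samples, albedo=0.8):
    """Monte Carlo estimate of Lambertian relighting: sum incoming radiance
    weighted by the cosine between each light direction and the surface
    normal. `env_samples` is a list of (direction, radiance) pairs, e.g.
    sampled from the cameras' view of the room; directions are unit
    vectors drawn uniformly over the sphere."""
    nx, ny, nz = normal
    total = 0.0
    for (dx, dy, dz), radiance in env_samples:
        cos_theta = nx * dx + ny * dy + nz * dz
        if cos_theta > 0.0:  # only count light arriving from the front
            total += radiance * cos_theta
    # 4*pi/N steradians per uniform-sphere sample, times the albedo/pi BRDF
    return (albedo / math.pi) * (4.0 * math.pi / len(env_samples)) * total

# A light directly overhead contributes fully; one behind the surface, not at all:
overhead = diffuse_shade((0, 0, 1), [((0, 0, 1), 1.0)])
behind = diffuse_shade((0, 0, 1), [((0, 0, -1), 1.0)])
```

In a real system the environment samples would come from the pass-through cameras' estimate of incoming light, which is exactly the signal Urbach says the ODG glasses provide.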
And the thing that's probably trickiest about real mixed reality versus just AR is opacity masking. And that's something that people have thought, well, you know, we can maybe do an LCD shutter, or there's a lot of things to get done to make that happen. But what we discovered, almost accidentally, and it was a great discovery, and it's one we're going to leverage: the ODG glasses have such a high angular resolution. We sort of suggested they add two cameras, and they do now in these glasses, and they're high speed. We actually did a test at AWE with the prototype ODG glasses, where we blocked out the glasses, so they're not AR glasses, they're full VR, you don't see anything, but we turned on the camera pass-through. And because the lenses are such high resolution, you're essentially getting 6K for a 90-degree field of view, 30% of the people that tried it had no idea that they were seeing the camera reprojecting the world. And because they didn't know that, they were surprised when we had a bright light source and it was all of a sudden extinguished. That's not something you can do normally in the HoloLens or with normal AR. And it's not a complex idea. I mean, in fact, Michael Abrash at Oculus Connect 3 was talking about how that's an approach that they want to take in their five-year roadmap, and it makes a lot of sense. But I'm actually very optimistic that it's going to be very possible to do that on devices this year, and ODG is going to be our real first test of that. And I feel that's incredibly exciting. There's a lot of great movement happening in the Android space around VR and AR, but ODG has solved so many of the hard problems around optics and getting a really high-quality, high-resolution experience. There's a threshold that's been crossed where you don't see a screen door effect, where you can actually fully replace any screen today.
You don't have to go to an IMAX movie theater now to see exactly the same resolution you see in a digital 3D IMAX movie. You don't have to get an 8K display. You have that virtually projected for you. And you can start working on your desktop applications. You can really take things to a whole other level as far as having mixed reality replace all these other layers. And that's just sort of exciting on a procedural level. But from the content and media and game level, obviously, the quality of those graphics is really good. So anyone that's listening to this, if they have a chance to check out ODG's glasses at other shows, they should see the content that we've been showing now for two years in VR, and on the Gear VR in particular, and sort of see the difference in how it looks in these glasses. It's not equivalent. I mean, there's a narrower field of view, and it's not quite as immersive. But really, the fact that these glasses are so lightweight, I mean, I've personally worn them for hours, and the fact that you can bring in the real world, and they're totally self-contained, is really cool. And I think that's going to work really well for people that are content creators that want to be able to live a little bit more in VR and not have to go in and out of it. And mixed reality, I think, if it's ever going to work, does need to get to that magical sunglass form factor. And I had never tried the latest ODG glasses until last night, the R8s. They're four ounces, and they are so small that it really does feel like about the weight of a pair of glasses. And yet, it has more processing power than the Note 7 did. You know, it's basically the Qualcomm 835. It's incredible. I was running it for eight hours, didn't run out of battery, it never got hot, and it was running high-end Unity content. Not just our stuff, but really games that could run in mixed reality.
And to that end, being able to embed Unity games and cloud content and all these other pieces is really important for a platform like this. It doesn't necessarily rely on Google Play or Oculus Home. So we want the discovery of things on the web, and even through other layers, social media included, to provide links to these sorts of content experiences that can be pulled in. And even on the platforms where there is a great deal of effort being put into the app store, like Oculus Home, the most popular app on the Gear VR is Samsung Internet. And Samsung Internet is actually one of the partners that is basically going to load in a link to an ORBX file and bootstrap the ORBX Media Player to run it. So they've already started to introduce that concept by having some of our earlier ORBX scenes be able to load from Samsung Internet and featuring that. But I think that all of us, and I think John Carmack, and a lot of the guys at Unity, and Tim Sweeney, I mean, all of us as an industry really want to see an open metaverse as at least an alternative to the app store, just because it really will be hard to showcase and discover things without that. I mean, there are wonderful things that can come from an app store, but there are also a lot of models that need to be web-like. I mean, the web allows you to embed a YouTube video and have that be mixed with other things in a page. We need the equivalent of that in a volumetric sense for mixed reality, and especially for the kind that will actually be living in different layers and spaces. And I think that we're building the framework of those pieces now. And some of that work, I think, will lead to some really interesting open standards. We just announced on Twitter that we were invited to and have joined the JPEG Working Group. A lot of our work around compressing light fields and providing ORBX as a standard for how to render a scene was something that interested the JPEG Working Group.
And they're working on JPEG Pleno, which is awesome. It's a really amazing volumetric format, and they need technology like ours. And we're perfectly willing to open source the pieces that are in the data portion of our stack. I mean, ORBX is a great system for that. That also led to the MPEG Working Group taking interest and asking us to come up with something similar for MPEG. And ORBX containers are open source. There's not much magic going on there. It's like a zip file, but it's designed to embed and describe so many other complex media types. You can embed an MP4, you can embed a Unity project, you can embed all sorts of crazy things in there, and all of these things are designed to be linked together like the document object model is. So, in about two weeks, I'm going to go to MPEG-117, and we're going to present essentially subsets of ORBX that can be used to create an MP5 container spec. And similar to how MP4 provides profiles, and baselines, and levels for decoding complexity and bandwidth, we can provide the same sort of thing, so that you know that your ORBX scene, or if it's an MP5 scene, if it's just an open standard, has many different versions of an MP5 that can be pulled in based on the specifications of the hardware and the client. And that's going to become really important as we see, not necessarily fragmentation, because a lot of these devices are Android-based, but a lot of different profiles with compute and with bandwidth. So it's important to sort of get that going. And it's not our expertise, but with MPEG's help and the JPEG Working Group's help, I think there's a lot of value in that. And so we'll see how that goes as we get into this year.
[00:14:46.515] Kent Bye: Yeah, I had a chance to see some of the demos from Otoy using the Octane Renderer within the ODG glasses, and there was a scene where I was on a street looking at the trees, and I had to ask whether this was video or whether it was rendered. It was so indistinguishable from reality that I think it's getting to that point. And I think there's an interesting paradox with the uncanny valley, which is that photorealism is great for mixed reality if you're overlaying it, because you want it to match your expectations for what reality is. But yet, when you're in a social situation with other people, you may not be able to match the level of fidelity of human beings, with all the different facial expressions. You have a photorealistic environment, but we're not yet at the point of being able to replicate the full dynamic range of emotional expression and eye gaze and all those things. So there's a little bit of a fidelity mismatch there in terms of what we can do with the visual fidelity of the Octane Renderer. But I'm just curious to hear your thoughts on the uncanny valley, and, moving forward, where you see this potential: what does it mean to have a virtual reality that's indistinguishable from reality?
[00:15:57.610] Jules Urbach: I think about this constantly, and in fact it's really my personal inspiration, philosophically, behind Otoy. I think there's a lot of value in understanding socially what it means for reality to be mimicked in the virtual world. In fact, you have people as varied as Elon Musk question, well, you know, if VR eventually gets to the point where it is indistinguishable from reality, then we're probably living in a simulation. That is one of these philosophical conundrums that comes from that, but it's not that simple. I mean, we already live in a virtual world. We communicate with people now almost like with ESP. You know, it's crazy how messaging and social media have really changed a lot of how we think about our social connections. And I think it's going to become a lot more sensory with virtual reality. And I guess from a technical perspective, I think we're basically there except for human faces. So the scene you saw was actually done by a super talented friend of mine named JJ Palomo, who's actually a great CG artist.
He worked on a number of movies, and I tapped him to do this demo for Unity. All three of those scenes were released at the Unite conference. They're on Oculus Social, they will be on links that you can download from Samsung Internet shortly, and of course those links also work on the ODG glasses. And on the ODG glasses, the resolution of that render is so high that it starts to look like you're really there. And I feel that even if you're making a traditional movie, Christopher Nolan loves film, right? He really doesn't want to show that there's any CG. Yet you're essentially going to an IMAX format to fill your field of vision, and I think there's a lot of value in rendering even traditional films in CG to get you to the super resolution, to get no film grain, and to work in a form that is going to survive for decades or more to come. Because I think in the future you're going to want to see things have the equivalent of a light field kind of experience, even if it's designed to be viewed in a frame or in a certain layer. And I think the discipline of going there is really important, because a lot of people have looked to us and said, well, can you scan in this forest or this scene? And sure you could, but you're going to end up with a messy point cloud. What we want is a forest and a world that can look that good, that can be done by a team of two or three artists, and you have this experience, a sense of wonder, as to whether or not this thing was real. And because every tree is synthetic, you can go anywhere. There is no limit to that. Even if the camera and the story are happening from one perspective, it's almost like a vector version of an image. I mean, you just have all the data that's there.
And I think that means it'll survive into whatever the visual medium evolves into, because there's no limit to how deep and crazy this can go as far as how these glasses work, or what you beam into the optic nerve, or what you want to represent in a future Matrix or holodeck. So I think that's one approach that we've solved in some ways with Octane, because Octane is just the laws of physics applied. There are no shader tricks or monkeying around. That's one of the reasons why it's such an easy tool to use. You know, you can basically just move the sun around and everything just works and looks great. You're limited only by the assets that you can largely buy from places like Evermotion and the creativity you have as a cinematographer. So that's why a lot of this stuff looks so good. But the one area where we've always sort of hit a wall, and that's why, you know, Keloid doesn't have a lot of humans in it, is people. Now I think people from the neck down, easy; hair, easy; all of that is solved. But if you look at some of the stuff that was done in Rogue One, which we were involved in, or even other things where we've done our best to provide Light Stage data sets, the problem is the second that you try to dynamically alter what was captured. If we just do a straight-up capture, we can bring that into the virtual world and it looks great. And I suggest that approach for close-ups or newscasts where you just want to have a perfect representation, holographically, of what a person's doing. But when you want to drive that dynamically, it's harder. And that's something that is not a solved problem. I think Rogue One showed incredible promise. It was 80% of the way there, but there are cases where it's just, you know, Tarkin is a CG character, I'll just say, because hopefully I'm not ruining this for anyone. But there are places where you can tell, you can see it. And it's hard to say why. It's just that the uncanny valley is really tricky.
And I think it's because of microexpressions. It's because you almost need a human to drive that, or you need some sort of deep learning algorithm to really make it feel like a human's driving it, with a little bit of noise. And I can tell. I mean, my eyes and brain are attuned to it, and I think many people's are as well. If the lips don't move right, if there isn't a certain level of water or microexpression around the eyes, it gets strange. And that's hard when you're dealing with purely synthetic performances. So I think that when it comes to doing that out of whole cloth, it's tricky. And so replacing yourself with an avatar that's your face, that is not just a pure recording of you, is one of the last challenges, and it will get solved. I mean, we're not that far from it, but it's not easy and it's not there yet. However, if you just want a recording of a person, and they're recorded as-is, in motion or in still, we can do that. We showed that with Unity's demo in Unity, where we could do a straight-up capture of one of our employees in a really minimal version of the Light Stage, and that headshot can be brought in and added to a head, and it looks really good. And to that effect, you know, when you see Flash or Supergirl on TV, those heads are all Light Stage heads, and typically they're static, so it looks pretty good. You can hardly tell. But I do think that the expressions the face can show are so hard to do that it's almost half the problem right there. Everything else, I feel, is in the rear-view mirror at this point. And we'll see. You look at Facebook and how deep they are into the VR space, and you think about the name of the company. Your face is who you are. It's your identity. So much of your brain is tuned to how faces look and interact, and I think that's going to be the toughest nut to crack. But it will be done, and it'll be pretty incredible when we feel our identities, or synthetic identities, look completely real to us.
[00:21:07.097] Kent Bye: Yeah, I think one of the other things that you had mentioned there is having the Octane Renderer within Unity. If we look at previous cinematic pipelines like RenderMan, you actually have to render out each scene and then output it to a 2D format. But we're essentially talking about being able to somehow have photorealistic rendering in real time. Is that offloaded into the cloud somehow, or how is that actually being achieved?
[00:21:33.865] Jules Urbach: One of the things, so when we ship with Unity, it's not going to use the cloud. The free version is all going to run on your computer. The only thing you're going to need is a modern graphics card. I mean, modern like the last five years. And, you know, we currently work perfectly on NVIDIA cards. All NVIDIA cards work great. And we're working with AMD to get that same performance. Intel graphics don't work that great; it's just too slow. But we do have another option, which is you can take any GPU anywhere in your office, and that can be used to power any Octane Unity app or editor anywhere else. So you can run it on your MacBook that way. And that's something we're introducing pretty much in tandem with the release of Unity. But it doesn't need the cloud. And in fact, it shouldn't need the cloud, because what we want is for all of Octane to be there in the Unity editor. If you look at the demo that we did at Unite, and it's on our YouTube channel, you'll see that inside the Unity editor, you're seeing content quality that looks no different than what you see in a Star Wars film or anything else from RenderMan. It's there. It's just grainy. It's like a Polaroid exposure. It just takes time for all the noise to resolve. And you can accelerate that by adding more GPUs. There's no hard limit to getting rid of the noise problem. It just requires more cost and compute power. And that's why the cloud does help, because if you want to get rid of that noise, even though the scene looks almost done within seconds, it can sometimes take minutes or hours for the noise to be completely resolved. It's almost like you just let all the photons of light resolve, and that's why Octane looks so good, because there are no tricks.
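The "grainy Polaroid" behavior Urbach describes is the standard Monte Carlo convergence of an unbiased path tracer: the variance of a pixel estimate falls as 1/N in the number of samples, so the visible noise falls as 1/sqrt(N), which is why quadrupling the GPU count only roughly halves the grain. A small illustrative sketch of that scaling law (not Octane's code; the pixel model here is a stand-in):

```python
import random
import math

def noisy_pixel(samples: int) -> float:
    """Average `samples` random light-path contributions whose true mean
    is 0.5 -- a stand-in for one pixel of an unbiased path tracer."""
    total = sum(random.uniform(0.0, 1.0) for _ in range(samples))
    return total / samples

def rms_noise(samples: int, trials: int = 2000) -> float:
    """Root-mean-square error of the pixel estimate across many trials."""
    err2 = sum((noisy_pixel(samples) - 0.5) ** 2 for _ in range(trials))
    return math.sqrt(err2 / trials)

random.seed(1)
# 4x the samples (e.g. 4x the GPUs): the noise should roughly halve.
print(rms_noise(16), rms_noise(64))
```

This 1/sqrt(N) law is also why there is "no hard limit" to removing the noise: every extra GPU buys a predictable reduction, it just gets progressively more expensive per visible improvement.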
It is running a physics simulation, but the magic of doing it on a graphics card is that we get the same quality as RenderMan, and in fact, we're working closely with them. Disney's an investor. We're adopting Open Shading Language, largely because RenderMan has adopted it as a standard, and it's the standard way of representing materials in the real world; it's used also in Arnold, which is now owned by Autodesk. And so we've basically brought that pipeline into Unity, and we've given Unity, in the free version, the ability to see that in the editor. Now what we want to do in game mode is bake to light fields, so that if you want to have a version of what you saw, and there are limited dynamic layers, right, if you just need to basically relight it, that's something that even lightmaps on a mesh can do, but we can now take that to a whole other level. But there is another mode that we're working on, and it's something that, if you look at our R&D work over the last five years, we've been showing: a technology called Brigade. Brigade and Octane actually share a huge amount of the same code, but they have different endpoints. They're both meant to be physically correct renderers, but one is meant for the content creators; the other one is meant to provide no baking, no light fields or anything, just the equivalent of a game engine, without being able to change and edit the materials. And so eventually we're going to merge Brigade and Octane into Octane 4; that'll be built into Unity and you'll be able to use those things. But even before that happens, we want to introduce, as soon as we can, the ability for game mode inside of the Unity Editor to have a more compact and faster way of rendering with Octane. And a lot of our work has been around how to get that to happen, you know, with full dynamic capabilities in real time, and to get speed by not having to be able to change the material.
So you can change anything else, but the materials stay fixed, and we've seen we can get a lot more speed that way. So we're going to be testing that next year and then rolling that out into Octane 4. And to make that happen on consumer hardware in real time, we've actually been collaborating with a company called PowerVR. They're pretty well known; Imagination Technologies is their parent company, and those chips have been in iPhones for years. And they provided us with an experimental chip that is ray tracing hardware. It's almost like a video decoder, but for ray tracing. Is it like an ASIC? It's an ASIC. And they only have a few of them, and we got one. And I was told, and it was a rumor, that this thing can do 100 million rays a second at 2 watts, which is insane, because it takes hundreds of watts to do that on an NVIDIA card. But lo and behold, we did a test, and we got about 50 to 120 million rays on an old version of the chip at 2 watts. And we've just started testing a 10-watt chip, and it scales perfectly. It goes all the way to 120 watts. We'll get, at that point, about 6 billion rays a second, which means that in your game mode, with foveated rendering, you're not going to see any noise. And that means the world for us, because even if you're rendering that in the cloud, it's going to be really cheap. And we already can assume that we might have some server resources there, but it does mean that 120 watts will get you the Matrix beamed into your eyes, and we want to bring that into Unity and also into Unreal. We're not going to be built into Unreal Engine, but we do have a plugin that we've been developing even prior to Unity, and it's going to have features like baking and publishing as well. And a lot of our existing work with Epic is actually focused more on streaming. Like inside of the Oculus Social layers, you can pull down a lot of Unreal Engine projects, not because the Gear VR can't render them, but because they're too big.
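Urbach's figures are internally consistent: roughly 100 million rays per second at 2 watts is 50 million rays per watt, and under the linear power scaling he reports for the 10-watt part, that projects to exactly his quoted 6 billion rays per second at 120 watts. A back-of-the-envelope check of that arithmetic:

```python
# Figures quoted in the interview for the experimental PowerVR ray-tracing ASIC
measured_rays_per_sec = 100e6  # ~100 million rays/s (he measured 50-120M)
measured_watts = 2.0

rays_per_watt = measured_rays_per_sec / measured_watts  # 50 million rays per watt

def projected_rays_per_sec(watts):
    """Project throughput assuming the linear power scaling Urbach reports."""
    return rays_per_watt * watts

print(projected_rays_per_sec(120) / 1e9)  # -> 6.0 (billion rays/s at 120 W)
```

The linear-scaling assumption is Urbach's own claim about the chip family; real silicon rarely scales perfectly, so treat the 6-billion figure as a projection rather than a measurement.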
You know, the Kite demo is 25 gigs. So even Atom is a big demo. So we see a lot of the different transactional layers, of whether you want to have something in the cloud or local, not just being about the compute capabilities, but about the data sets themselves. And also, you know, sure, you can have a distributed metaverse, but you're also going to have servers and nodes in places and in the cloud that reference that. And I think there's a lot of great work being done there. It's sort of out of our scope of things, but look at Improbable and the work they're doing. It's awesome. And connecting that to us, to Unity, to Unreal, and our work to get at least a URL-based system for the metaverse, like we have for the web, is going to really make a lot of this come together in, I think, the next year.
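The power-scaling claim in this answer is worth sanity-checking arithmetically. A quick sketch, using the figures quoted in the interview and assuming (as Jules describes) that ray throughput scales linearly with power:

```python
# Back-of-the-envelope check of the ray tracing ASIC figures quoted above.
# Assumption: throughput scales linearly with power, per the interview.

measured_rays_per_sec = 100e6   # ~50-120 million rays/s observed at 2 W
measured_watts = 2.0

rays_per_watt = measured_rays_per_sec / measured_watts  # ~50 M rays/s per watt

target_watts = 120.0
projected_rays = rays_per_watt * target_watts

print(f"{projected_rays / 1e9:.0f} billion rays/s at {target_watts:.0f} W")
# Linear scaling from the 2 W measurement lands at ~6 billion rays/s at 120 W,
# matching the figure quoted in the interview.
```

The "hundreds of watts on an NVIDIA card" comparison follows from the same arithmetic: at roughly 50 million rays per second per watt, the ASIC's claimed efficiency is well over an order of magnitude beyond a contemporary GPU's.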
[00:26:28.735] Kent Bye: Yeah, and covering the AI space, I've talked to different people who are comparing the NVIDIA GPU versus specifically designed ASICs, and my sense is that the GPU is sort of a general tool for doing parallel processing across lots of different graphics and shaders. But it sounds like there's going to be a move towards designing these types of ASICs, though often for very specific use cases, not general use. And so do you foresee the future of virtual reality moving to this kind of hybrid use case, where you're using a little bit of a GPU along with maybe some custom-designed ASICs?
[00:27:04.611] Jules Urbach: Yes. Well, I mean, you know, look at most GPUs today, or look at even the Qualcomm chips. In one chip there's a CPU, a GPU, and a DSP to handle specific things like video decoding and computer vision. And currently the GPU is really powerful. I mean, you could write a really powerful video decoder, and we've done that. That's how we're able to do a light field codec that runs fast without having to wait for people to build video decoding hardware. But ASICs, when you have a domain that is solvable, are really powerful. So in the case of ray tracing, you brought up a really important point, which is that when we render stuff in Octane, or even in real time as we move Octane towards that goal, the ray tracing ASIC is only half of the work. So if we're to imagine a GPU that is equivalent to a 1080, maybe 250 watts, half of that would be the ray tracing ASIC, which would get you the 6 billion rays. The other half would still do shading, pixel shaders, and run kernels, because there's a lot of general work that needs to happen. And you're never going to get rid of a CPU either. There's always going to be at least one or two CPU cores handling the scheduling and traversal of things. And similarly, for things like video decoding, once the codec is well understood, then going to an FPGA, and then to an ASIC, really does make a lot of sense. And you see that with Bitcoin mining, right? Bitcoin started out as something you could do on a CPU, and then the GPU compute version came out, and all of a sudden that was the way to mine bitcoins. And we knew this because we'd worked on writing a codec on AMD GPUs a long time ago. There are certain functions on AMD GPUs that are five times faster than on NVIDIA, and because of that, Bitcoin mining was one of those workloads that leveraged those instructions, and a lot of people bought AMD cards to mine bitcoins. But eventually, you know, you had to move to FPGAs and then ASICs.
And so, you know, you're now at the point where you have really optimized, domain-specific hardware that can mine and hash bitcoins. And everything that can be pushed into that kind of known function can eventually be turned into something that's a really efficient ASIC. And ray tracing is definitely ripe for that, more so than I think people realize. I mean, until we tested PowerVR, we weren't sure of that either, but now we are, and I think that's going to be bolted on to existing GPU solutions. NVIDIA's and AMD's and even Intel's solutions could use a PowerVR ray tracing ASIC, and I think that's going to happen, and we're going to try to help connect all those dots together.
[00:29:07.325] Kent Bye: And do you foresee that the technological stack required for both virtual reality and augmented reality is at some point going to converge, and that we may just have single headsets, but still maybe a mobile version, a desktop version, and then maybe...
[00:29:24.833] Jules Urbach: I think you're right. I mean, the whole reason we're doing this show at CES and backing, really, betting a lot on ODG. And I have to tell you, we've also bet a lot on the Gear VR. Like, we're all in with John Carmack on the Gear VR. It is our platform of choice for VR. But, you know, there's no doubt in anyone's mind, and I think Michael Abrash was saying that, VR and AR are not going to merge eventually. They're merging now, because you already have so many interesting things happening at the beginning of this year, where you now have little pucks that can do position tracking. You don't need to have that attached or built into the device. You can just drop one onto a pair of ODG glasses, and all of a sudden it can be tracked just as well as the Vive or the Rift. And you can put that on your hand, and all of a sudden you have really great position tracking. And we built that into our Gear VR stack, so you can attach one of those to the Gear VR, and in our app, all the things we're doing in Oculus Social or Samsung Internet can use position tracking from a Lighthouse sensor. So that's one layer that doesn't require any sort of differentiation. And you're already seeing, basically, remoting. With the Vive, you now have a solution that is able to do really low-latency wireless beaming. I guess it's 802.11ad. I mean, essentially 6 gigabits per second is enough for you to get the equivalent of what the Vive and the CV1 are able to project. And similarly, the ODG glasses can do that as well. I mean, we've actually built even more efficient layers: you have a really powerful GPU on those ODG glasses, so let's do some reprojection locally. And I think that untethered VR has to happen. You're either going to get to the point where ODG glasses, when they're blocked out, have such good cameras that it just looks as good as what you'd expect a transparent layer to look like, and I think that's really important.
In ODG's case in particular, we do have the ability with the camera pass-through to really provide you both experiences and experiment with that. And I think the cameras are only going to get better, and the screens are only going to get better, to the point where even if there's low light, you can probably have a decent enough sampling of the environment with high-speed cameras to actually give you augmented vision. And I feel like that means that, at the very least, VR is full-screen mode for mixed reality. And I feel that's important. I think you want to be able to go in and out of VR mode without having to take the headset off, or relying on weird, wonky things that don't look real. And mixed reality also is powerful. A lot of the experiences we initially wanted to do only in VR, like the Batman animated series, now I want to make like The Indian in the Cupboard, or a toy chest thing, where you can watch the episode almost like a ball in front of you and look at it as a hologram. It almost feels better that way. I mean, you could do both. It's like what Zachary was showing with the social stuff, where you have a sphere and you can go in the sphere, and you can also take it out. And that's where a lot of the light field stuff gets really interesting, because not everything has to be in a bubble around your head or lived in. It can also just be a shrunk-down, god-mode version of these things, and experiences do kind of become really interesting that way. So I think mixed reality is going to happen, and it's going to happen through pass-through, and eventually I think the pure AR plays are just going to go the way of Google Glass, because it's really tough to justify that when you're missing a lot of the VR experiences. And I think ODG has picked just the right compromise to do that, and they could go further. I mean, their R9s are 50 degrees; they could probably go wider. If there's demand for that and people want it, they could do it.
But you already have a better field of view than the HoloLens, and a lot of our Gear VR content, while it's a little bit more, I'd say, cramped, looks really great. And I think the convergence is already there. And if you're not already starting on that, it's going to get hard to do later. And I think for the standalone Santa Cruz device that Oculus showed, you have enough cameras on there to do exactly that. And I think Michael Abrash is correct in saying that the best way to do all these things is to have really good camera capture and to fool the eye, just like VR is supposed to do, into thinking that the pass-through is just as good as looking through a pair of lenses or glasses.
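The claim earlier in this answer that 6 gigabits per second is "enough" for a Vive or CV1-class feed can be checked with simple arithmetic. A rough sketch, assuming the first-generation headsets' published display specs (2160×1200 combined across both eyes, 90 Hz, 8-bit RGB) and ignoring compression, blanking, and protocol overhead:

```python
# Rough check of whether a ~6 Gbit/s wireless link (e.g. 802.11ad/WiGig)
# can carry an uncompressed Vive/CV1-class video feed.
# Assumes first-generation headset display specs; no compression,
# blanking, or protocol overhead is modeled.

width, height = 2160, 1200   # combined resolution, both eyes
refresh_hz = 90
bits_per_pixel = 24          # 8-bit RGB

bits_per_second = width * height * refresh_hz * bits_per_pixel
gbps = bits_per_second / 1e9

print(f"~{gbps:.1f} Gbit/s uncompressed")
# ~5.6 Gbit/s raw, which just fits inside a 6 Gbit/s link,
# consistent with the figure quoted in the interview.
```

Under these assumptions the raw feed comes in just under the link rate, which is presumably why even light compression or local reprojection gives comfortable headroom.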
[00:32:51.278] Kent Bye: So what do you want to do in virtual reality then?
[00:32:54.666] Jules Urbach: Well, I think I wouldn't mind living in VR. I mean, I used to live in France, and it was a beautiful countryside, but I also have to work, and I can't live in France. And so having the places and environments that are your happy places actually be your office or your home is fantastic. I also want to work with people. My friends are spread all over the world, and we see how Facebook connects people together. I mean, I can keep in touch with hundreds of friends that I would never be able to without some sort of communication. And being able to do that in a more visceral, hands-on way is really great. And I think VR is going to be great for that. That's why mixed reality is important. Doing the equivalent of a Skype call or a Messenger call with somebody, whether it's purely an avatar that is made up, or them, or some mixture of the two, is really valuable. I mean, I think being able to emote and share things is really valuable. I think playing games is great, but I also want to experience things that I love. Like, I love comic books, and I love the Batman animated series, and I wanted to see that rendered holographically. I don't even know whether to call it VR or AR. But I love the idea of something that is beautifully rendered no matter what scale it's at, no matter where it's looked at from. And we've done that test with animated content, with Batman. We're also doing similar things with more realistic content, like you were seeing. And I want to provide those tools to everyone and see what comes of it. And I think the stories, the medium that you can create with that, is going to be a lot like film: until film was invented, people didn't understand the kind of emotional experiences you could have from seeing a film versus reading a book. And in some ways one doesn't replace the other. They both have their strengths and weaknesses.
But there is something very different that's going to happen with this spatial, volumetric, immersive, and shareable medium as a tool for artists. I mean, we're going to see that in these wonderful ways, and I want to see that happen. And then I want to have people really rethink their lives. The meaning of materialism, for example, is going to be completely different when, forget 3D printing, just having an object, you know, maybe with some haptics, can be represented and magically turned into anything, like liquid metal, like the Terminator, right? That's going to change the value of collecting. It's going to change the idea of spaces and things. I mean, it's essentially going to turn into a lucid dream, and who knows where that will go, but I think it's going to be a really good and powerful tool. I mean, it can also be used for crazy things, but mostly it's going to allow people to really share and build and innovate much more quickly. I know that's going to happen, and I'm really looking forward to it happening in the next five years.
[00:35:03.372] Kent Bye: Yeah, one of the things that I think is one of the biggest challenges for the metaverse, as we look at the lessons of the Internet and the World Wide Web, is that we have this centralized system in a lot of ways, whether with cloud servers or centralized DNS. And I just imagine people like Philip Rosedale working with some sort of distributed systems, to perhaps distribute the load and not make it reliant upon an individual to pay to keep everything hosted, or to pay for all the servers on AWS. So I'm just curious if you've thought about that a little bit: the future of the web, whether it's going to be more distributed, and, given how computationally intense these digital light fields are, whether there's going to be a way to even possibly do that in a distributed form.
[00:41:15.865] Kent Bye: What are some of the biggest open questions that you think are driving Otoy forward?
[00:44:21.700] Kent Bye: Awesome. And finally, what do you think is kind of the ultimate potential of virtual reality, and what might it be able to enable?
[00:44:29.292] Jules Urbach: Well, if I'm going to go out on a limb, if you're going to put me up to that, I would say, you know, Elon's point that maybe, a long time from now, we'll see if virtual reality and our reality can be completely seamlessly interchanged. I think that will happen. And I think it'll happen soon. And I think it's going to be crazy. I don't know if there's ever been anything in our lives that is going to be that disruptive, not in a bad way. I mean, people think, oh, if aliens encountered us, it would change our understanding of the universe. It wouldn't. I mean, we can understand what other cultures would be. What we don't understand, short of some sort of super AI, is what it would be like to essentially live in a completely non-physical existence and have that be normal. I mean, have that be what feels right. I mean, we don't have to go and hunt animals or grow crops, all of us, anymore. And I think you're going to get to the point where the physical value of things just goes to zero. And I don't know how that'll happen, but it may very well be that at the end of that we realize that this entire thing is a simulation, maybe not on a computer. Maybe there's something more exotic out there. We've sort of been tricked before, where we thought we had it all solved with Newton, and everything made sense. But no, you know, the theory of relativity shows us that time and space are like Silly Putty, and we may find out that reality itself is that way. And there's an article in The Atlantic that came out last year that I highly recommend, discussing how we as humans have evolved to not necessarily take in things as they really are.
I mean, our brains are designed to understand time in a certain sense. But it may just be that the real reality, the true nature of existence, is very different, and that we can't understand it because we're limited by a biological brain, just like a jellyfish is limited to what its existence is underwater, with the senses it has. And so that's something that I think we can start to get at with virtualized and totally collective experiences in the virtual world. And then of course there's AI, which I think is scary, because you don't know what kind, I mean, you think about what that can mean: a totally alien intelligence. But I also think that AI, if it's going to be anything that matters to us, good or bad, probably is going to have a human experience. And I think the thing that makes us most human isn't our brains or how smart we are, or what we compute. It's our experience as human beings with other humans. And if an AI has that experience, in virtual reality or real reality, that is what would make it human. And that might make it good or bad. It's really interesting to think about it from that perspective.
[00:46:32.749] Kent Bye: Yeah, I think the biggest thing holding back a virtual reality that's completely indistinguishable from reality is the haptics, being able to actually get that tactile feedback. So I think the visual and the auditory can take you to a certain point, but it's still going to be limited in terms of how far it can go, at least in the short term. Maybe in the long term we'll eventually get there. But with the haptics, and even the smell and the taste, those other senses may only account for anywhere from 7% to 10% of the level of immersion. You may be able to get to 90% just with the visuals and the sound. But if you can't actually feel and taste and smell things, I think it's not going to get to that point.
[00:47:13.188] Jules Urbach: I agree. And I think about that a lot myself. So, most of your touch senses are in your fingertips. I mean, your fingertips are so sensitive that if you draw a diagram of where all the nerves in your body are wired, a huge chunk of them are in your fingertips. And that means that if you had some sort of haptic glove that could be applied just to your fingertips, you could get a sense of touch on almost anything. And as far as things blocking you, I mean, you know, you have a rumble pack already for that. It's hard to say how good things would have to get before you have, like, sound waves that can block your movement. But I do think that the sense of touch is something that really needs to be applied to fingertips first, and a glove of some sort has always been on the roadmap for VR, right? I mean, that's something that people really do understand. Smell is something that I have seen done in the past through the equivalent of an inkjet printer, where you have base smells, and you have a digital library of what a banana smells like, or bacon, and it basically mixes them together, and it can be done in anything. At first there were cartridges, and then they actually had it fully synthesized in a solid-state system, just like audio is. It was wild. So I think that sense is really something that could be done. You know, there are people working on fans and heat and things like that, but definitely smell isn't out of the question.
And I guess if you really are going to get to the point where you want haptics to be done the way they are in the Holodeck, you know, then, as I was mentioning, there's utility fog, which is basically just nano-clay: tiny, tiny nano-beads that can be put together magnetically to form anything, to the point where when you touch them they feel real. And you can imagine the software that could make that work, like buckyballs. But, you know, that feels like it's a little bit further out. That is the ultimate matter synthesis system, and whether it's real matter or just looks like real matter, it just has to feel like real matter, because the visual parts are already there. If we 3D-printed something, and you put on the VR glasses and touched it, and it looked the way it was supposed to, it would be convincing. And that's where I see the ultimate challenge of haptics: how do you solve the digital clay problem? Because that might be what people want. And maybe just very fast 3D printing is enough. But I think self-assembling digital clay, utility fog, which has been theorized since the '70s, is doable at nanoscales, and we'll see. I mean, it's not my area of expertise, but that's how I'd imagine touch being done. And taste, a lot of taste is really driven by smell, so if you had swallowable utility fog that could be consumed, maybe you would have your piece of sushi be digestible, but the calories wouldn't matter, or something like that might be how you would do it. The concept of virtual reality is interesting because you're really only looking at the surface properties of things. Whether or not the molecules in this thing that's supposed to feel like wood are really wood doesn't matter. You don't even know what's in your own body. As long as it feels like the surface of it, it's good enough.
The holographic principle, on every sensory level, is kind of how I imagine the virtual world: if it feels like the real world, it can be more efficient by orders of magnitude, in computation and matter and energy.
[00:49:52.288] Kent Bye: Awesome. Well, thank you so much, Jules.
[00:49:53.548] Jules Urbach: My pleasure. It's always a pleasure to do this with you.