#413: Rendering the Metaverse with Otoy’s Volumetric Lightfield Streaming

Digital lightfields are a cutting-edge technology that can render photorealistic VR scenes, and Otoy has been a pioneer of the rendering and compression techniques needed to deal with the massive amounts of data required to create them. Their OctaneRender is a GPU-based, physically correct renderer that has been integrated into 24 special effects industry tools, with support for Unity and Unreal Engine on the way. They've also been pioneering cloud-based compression techniques that allow them to stream volumetric lightfield video to a mobile headset like the Gear VR, which they demonstrated for the first time at SIGGRAPH 2016.

Jules Urbach is the CEO and cofounder of OTOY, and I had a chance to sit down with him at SIGGRAPH to talk about the new capabilities that digital lightfield technologies present, some of the emerging file formats, the future of volumetric lightfield capture mixed with photogrammetry techniques, capturing an 8D reflectance field, and his thoughts on swapping out realities once we're able to realistically render out the metaverse.

LISTEN TO THE VOICES OF VR PODCAST

Otoy is building their technology stack on top of open standards so that they can convert lightfields with their Octane renderer into an interchange format like glTF, which can then be used in all of the industry-standard graphics processing tools. They also hope to eventually deliver their physically correct renders directly to the web using WebGL.
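To make the web-delivery idea concrete, here is a minimal sketch of what consuming a glTF scene in a WebGL page might look like, using the open-source three.js loader rather than Otoy's own viewer. The file path, import path, and renderer setup are illustrative assumptions, not anything from Otoy's pipeline.

```typescript
// Minimal sketch: loading a glTF asset into a WebGL scene with three.js.
import * as THREE from 'three';
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(70, window.innerWidth / window.innerHeight, 0.1, 1000);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

// Hypothetical URL for a scene exported to glTF from a DCC tool.
new GLTFLoader().load('scenes/forest.gltf', (gltf) => {
  scene.add(gltf.scene); // glTF bundles the meshes, materials, and scene hierarchy
  renderer.setAnimationLoop(() => renderer.render(scene, camera));
});
```

Because glTF carries the whole mesh and material bundle in one compact file, this is roughly the "JPEG of 3D" workflow the Khronos quote below is describing.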

In the Khronos Group press release about glTF momentum, Jules said, "OTOY believes glTF will become the industry standard for compact and efficient 3D mesh transmission, much as JPEG has been for images. To that end, glTF, in tandem with Open Shader Language, will become core components in the ORBX scene interchange format, and fully supported in over 24 content creation tools and game engines powered by OctaneRender."

Jules told me that they're working on OctaneRender support for Unity and Unreal Engine, so users will soon be able to start integrating digital lightfields within interactive gaming environments. This means that you'll be able to change the lighting conditions of whatever you shot once you get it into a game engine, which sets it apart from other volumetric capture approaches. The challenge is that there aren't any lightfield cameras commercially available yet, and Lytro's Immerge lightfield camera is not going to be within the price range of the average consumer.
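That relighting capability comes from capturing a reflectance field rather than a flat video: because light transport is linear, footage captured one light at a time can be recombined under any new lighting environment. Below is a conceptual sketch of that idea, in the spirit of the light-stage work Jules describes later in the interview, not Otoy's actual code; the types and function names are assumptions.

```typescript
// Conceptual sketch of reflectance-field relighting: light transport is linear,
// so a frame under any new lighting environment is a weighted sum of
// "one light at a time" basis captures. Types and names here are illustrative.

type ImageRGB = Float32Array; // flattened RGB pixel values

function relight(basisImages: ImageRGB[], lightWeights: number[]): ImageRGB {
  // lightWeights[i] is the intensity of light i in the new environment.
  const out = new Float32Array(basisImages[0].length);
  basisImages.forEach((img, i) => {
    const w = lightWeights[i];
    for (let p = 0; p < img.length; p++) {
      out[p] += w * img[p]; // accumulate the weighted contribution of light i
    }
  });
  return out;
}
```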

Last year, OTOY released a demonstration video of the first-ever light field capture for VR:

Jules says that this capture process takes about an hour, which means that it would primarily be used for static scenes, but they're working on much faster techniques. However, they're not interested in becoming a hardware manufacturer; they're creating 8D reflectance field capture prototypes in the hope that others will build the hardware needed to feed their cloud-based OctaneRender pipeline.

Jules says that compressed video is not a viable solution for delivering the pixel density that next-generation screens require, and that their cloud-based lightfield streaming can be decoded at up to 2,000 frames per second. Most 360 photos and videos are also limited to stereo cubemaps, which don't account for positional tracking. But lightfield cameras like Lytro's perform a volumetric capture that preserves parallax and could enable navigable, room-scale experiences.
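As a rough illustration of why a lightfield supports positional tracking where a stereo cubemap can't: a lightfield stores rays rather than a single pair of projected images, so a new eye position is rendered by re-sampling stored rays instead of re-rendering geometry. The sketch below uses the classic two-plane (u, v, s, t) parameterization from the light field rendering literature, not Otoy's proprietary encoding; the LightField interface, plane placement, and integer view grid are illustrative assumptions.

```typescript
// Simplified two-plane light field lookup: captured views sit at integer (u, v)
// grid points on the z = 0 plane; each ray within a view is indexed by
// focal-plane coordinates (s, t) at z = 1.

interface LightField {
  // Returns RGB for the ray leaving grid view (ui, vi) toward focal point (s, t).
  sampleView(ui: number, vi: number, s: number, t: number): [number, number, number];
}

type Vec3 = [number, number, number];

function sampleRay(lf: LightField, eye: Vec3, dir: Vec3): Vec3 {
  // Intersect the eye ray with the camera plane (z = 0) and the focal plane (z = 1).
  const tu = -eye[2] / dir[2];
  const u = eye[0] + tu * dir[0];
  const v = eye[1] + tu * dir[1];
  const tf = (1 - eye[2]) / dir[2];
  const s = eye[0] + tf * dir[0];
  const t = eye[1] + tf * dir[1];

  // Bilinearly blend the four captured views nearest to (u, v); this blending
  // across neighboring viewpoints is what produces smooth parallax as the head moves.
  const u0 = Math.floor(u), v0 = Math.floor(v);
  const fu = u - u0, fv = v - v0;
  const taps: Array<[number, number, number]> = [
    [u0, v0, (1 - fu) * (1 - fv)],
    [u0 + 1, v0, fu * (1 - fv)],
    [u0, v0 + 1, (1 - fu) * fv],
    [u0 + 1, v0 + 1, fu * fv],
  ];
  const rgb: Vec3 = [0, 0, 0];
  for (const [ui, vi, w] of taps) {
    const c = lf.sampleView(ui, vi, s, t);
    rgb[0] += w * c[0];
    rgb[1] += w * c[1];
    rgb[2] += w * c[2];
  }
  return rgb;
}
```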

Jules expects that the future of volumetric video will be a combination of super high-quality photogrammetry environment capture with a foveated lightfield video stream. He said that the third-place winner of the Render the Metaverse contest used this type of photogrammetry blending. If Riccardo Minervino's Fushimi Inari Forest scene were converted into a mesh, it would contain over a trillion triangles. He says that the OctaneRender output is much more efficient, so this "volumetric synthetic lightfield" can be rendered on a mobile VR headset.
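To put that comparison in perspective, here is a rough back-of-envelope estimate. It assumes about 36 bytes just to store a triangle's three vertex positions as 32-bit floats, and uses the figure of roughly 35 megabytes per cubic meter that Jules cites later in the interview for a baked lightfield volume:

$$
10^{12}\ \text{triangles} \times 36\ \tfrac{\text{bytes}}{\text{triangle}} \approx 36\ \text{TB of raw mesh data}
\quad \text{versus} \quad {\sim}35\ \text{MB per m}^3 \text{ of baked lightfield volume.}
$$

Even with instancing and aggressive mesh compression, the triangle representation stays many orders of magnitude heavier, which is why baking the rendered rays into a streamable volume is so attractive for mobile VR.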

Overall, Otoy has an impressive suite of digital lightfield technologies that are being integrated with nearly all of the industry-standard tools, with game engine integration on the way. Their holographic rendering yields the most photorealistic results that I've seen so far in VR, but the bottleneck to producing live-action volumetric video is the lack of any commercially available lightfield capture technology. Still, lightfields solve a lot of the open problems around the lack of positional tracking in 360 video, and so will inevitably become a key component of the future of storytelling in VR. And with the game engine integration of OctaneRender, we'll be able to move beyond passive narratives toward truly interactive storytelling experiences, and toward the ultimate potential of a photorealistic metaverse that's indistinguishable from reality.

Subscribe on iTunes

Donate to the Voices of VR Podcast Patreon

Music: Fatality & Summer Trip

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast. My name is Kent Bye, and welcome to The Voices of VR Podcast. On today's episode, we're going to be taking a deep, deep dive into digital light fields, which I think are really going to be the foundational components of the future of VR in so many different ways. Digital light fields represent a technology that is able to replicate some photorealistic looking scene. The only issue is that these files are huge and it actually takes a lot of either processing power or really innovative algorithms to be able to render out these scenes. And so it's really hard to do in real time. So, we'll be talking to Jules Urbach, the co-founder and CEO of OTOY, which has essentially cracked the code for how to process and deal with these digital light fields in order to stream them into mobile technologies. And they've really created a pipeline that's already fully integrated into all the different major tool sets of the existing special effects industry. And so in a lot of ways, they're building the foundation to be able to render the metaverse. So we'll be taking a deep dive into digital light fields and Otoy and all the things that they're doing and trying to explore a lot of these new concepts as I myself am trying to figure out in the process of this interview. So that's what we'll be covering on today's episode of the Voices of VR podcast. But first, a quick word from our sponsors. Today's episode is brought to you by The Virtual Reality Company. VRC is creating a lot of premier storytelling experiences and exploring this cross-section between art, story, and interactivity. They were responsible for creating the Martian VR experience, which was really the hottest ticket at Sundance, and a really smart balance between narrative and interactive. So if you'd like to watch a premier VR experience, then check out thevrcompany.com. Today's episode is also brought to you by The VR Society, which is a new organization made up of major Hollywood studios. The intention is to do consumer research, content production seminars, as well as give away awards to VR professionals. They're going to be hosting a big conference in the fall in Los Angeles to share ideas, experiences, and challenges with other VR professionals. To get more information, check out thevrsociety.com. So this interview with Jules happened at the SIGGRAPH conference in Anaheim, California, from July 24th to 28th. So with that, let's go ahead and dive right in.

[00:02:34.112] Jules Urbach: So I'm Jules Urbach. I'm the CEO and co-founder of Otoy. Otoy is what we call a holographic rendering company. And we're very focused on physically correct rendering in games and films and now VR. A lot of our technology has already been used to great effect in VR with Oculus and specifically with the Gear VR. Last year we did a contest with Oculus that John Carmack and I were judges on, where we invited people to use our tools to create these beautiful high-resolution stereo cubemaps, and that's something that went really well. And so we're looking to sort of build on those pieces now with adding positional tracking, animation pieces, and Otoy is actually, you know, it has one major product that's in users' hands already called Octane, and Octane is a plugin, a rendering plugin for 24 different tools ranging from, you know, After Effects to 3ds Max, Blender, Cinema 4D, and we're recently now working on integrating it in game engines like Unreal and Unity, but ultimately all of these tools have access to physically correct rendering. We have a cloud service that can generate not just Stereo Cubemaps, but also volumetric light fields that can now be streamed live over a connection. And we're adding that into the mobile VR devices once they get position tracking. So currently we're showing it at the show with the marker-based tracking, which is able to stream Stereo Cubemap videos, light field objects, and navigations. You can mix and match that with live game engine content with Unreal or Unity as a layer, and it's a really powerful mechanism of solving the issues of complexity and increasing resolution on wearable devices that are sure to continue in the next six to nine months as these devices come to market.

[00:04:04.604] Kent Bye: Great, so let's talk a little bit about digital light fields because as we see things we're having photons hit our eyeballs and so photons are like bouncing off of all these different objects in it. My impression is that in a light field rendered scene you may have different sampling points around the room so you can get different perspectives and that you're taking kind of like a light field capture, but then there's somehow you're able to kind of synthesize that into a whole scene and then represent it not through necessarily a mesh, but more of like vector photons flying through. I'm trying to figure out like metaphors and analogies for how to think about how this information is actually captured.

[00:04:43.210] Jules Urbach: Sure, so we're actually focused, I mean when we talk about capturing light fields in the real world, it actually still is about turning that into a CG asset, which is what we've been doing almost before we started on Octane. We've actually had a commercial service which is around today called Light Stage. It won an Academy Award in 2010 for doing realistic CG human capture. And ultimately that's the approach that we're taking, is we want to basically take real-world assets, specifically humans are the hardest thing to capture, and we turn that into what's called a reflectance field, which is like an 8D light field, and it includes not just the information about where every ray is coming from, but also the surface of the object that you're capturing, what happens when a ray of light hits it, where does it go. So that's basically more than just the ability to have a holographic image of something, it also means how does this thing get relit. And that's the thing that we're synthetically generating when we render what we call a holographic stream in Octane. It's an 8D light field, but there's also other pieces. It includes all the information about the surface, including the material properties, index of refraction, things like that, so that when we have a mixed reality stream, we can actually relight the stream with local lights that are captured by the camera. You can use that to composite. So it's really almost like having a really advanced physically correct renderer that trades off compute for data. And traditionally, people have actually done demos of light field renders where they just generate a lot of different views. And the thing is, if you were to do that with Octane, you can generate millions of cube maps, and you can generate a light field that can randomly access it. But that's a lot of work. So what we've done is we've come up with a way where you set a bounding volume, and the cloud service will run the ray tracer, but not in a naive way. It'll basically figure out where all the photons go, pack that up into a very small file, and that can then be decoded at 2,000 frames a second, and it's perfect. I mean, basically scene complexity, physically correct rendering, all of that is baked into that volume. And the latest thing is that even though we've been able to compress that to maybe 35 megabytes for a cubic meter, we want to be able to do videos, we want to do massive worlds. And so all of this led us to push that server side and then make sure that the minimum piece on the client was able to just essentially ray trace into that live stream. And we have that working now, and it works beautifully. So we've got a whole system around this, and the data format for how we store those rays actually has changed about seven or eight times, so it's something more like YouTube, where, you know, you can press something, you take an ingested video, and we're here taking a rendered asset from Octane, and we're coming up with different ways of improving that. We may re-render that, but at the end of the day, you get a link, a URL that you can stream into a JavaScript client or the Gear VR with different entry points into different apps, and I think that's the way that we're gonna see this thing disseminated everywhere.

[00:07:11.602] Kent Bye: So is it translated into a 3D mesh in some way then or is it still kind of in your proprietary format?

[00:07:17.626] Jules Urbach: There are things like we have this light field of the forest scene where it's like it's a trillion triangles and we can ray trace that with Octane because Octane itself is a ray tracer, but it's noisy, right? So the delta that we have with these baked volumetric light field type objects is that the scene complexity is completely I mean, just like a photo, you don't care how complex the photo is, it's just the same number of pixels and the same density. And our stuff is similar. So you essentially can bake in things like volume, fire, and smoke, and clouds, and even all those things. And just a forest of trees, it's trillions of triangles if you were to turn it into a mesh. And the interesting thing is you don't really need a mesh. I mean, you actually can reconstruct. It's almost like a point cloud. You can also pull in depth and material information from the volume. So you could build that and turn that into a 3D mesh. And that's something we're thinking about. But ultimately, inside of a game engine, you can actually take this volume and mix it and composite it with deferred rendering. And it's perfect. It just looks beautiful.

[00:08:09.852] Kent Bye: Now, the Khronos Group within the last couple weeks just announced a new format, glTF. And talking to different people here, it seems like right now there's a pretty set pipeline for how to do 360 video and output to an MP4 file. And there's a whole sort of tool set to do that. But when I ask people about, well, what about digital light fields? The answer seems to be like, well, that's an entirely new workflow that we have to still kind of figure out.

[00:08:34.928] Jules Urbach: I'll also add that I don't think that 360 stereo video has been solved at all. I mean, you're talking about stereo cubemaps, like the type that we have on the Gear VR that we're showing at the show. Those are 18K on the Gear VR. They're 72K by 6K on the ODG glasses, and the ODG glasses are 120 frames a second. So if you're going to deliver that, even with all our compression, it's still insane. And I don't think that video that is sent out as an MP4 is going to really cut it. So what we've done is, solving all these problems at once, we can basically load those stereo cube videos or anything on the cloud, and then as your eyes are moving, we pull down whatever the eye can see, and that works really well. That's something that is pretty important for this kind of absolute high quality, peak quality, as Carmack calls it, cubemap video format. And then I think for glTF, which is something, so we were involved in that as well, I mean, I talked to Carmack about our results using it, we're building all of Octane around glTF, we're using it as an interchange format, it's great, it's definitely more efficient than FBX, and you're right about light fields themselves not being a standard, but we almost think that you have to have a proxy system, you have to have something that defines a volume or a shape anyway, and glTF's a great system for that, it's gonna be something that we support in all our tools, and that we think is going to be adopted along with things like maybe open shader language to define an interchange format for at least 3D scenes and how to render them. How it's rendered, how it's stored is almost a separate task and a separate process, but I think glTF has a lot of potential and we're actually going to be doing our part to move that format forward and try to replace existing mesh formats with just glTF. Like Alembic is an example of something that is very heavy and glTF, once it adds morph targets, could replace that. So yeah, I mean we're excited by the progress that Khronos has made with that as well as things like SPIR-V which sort of does the same thing for compute.

[00:10:11.913] Kent Bye: And so do you imagine that, kind of like how we put out an MP4 file to see a 360 video, we'll be able to create a glTF file that could be a fully navigable, room-scale experience with digital light fields that are volumetric?

[00:10:26.455] Jules Urbach: Yeah, so without even the light field part, if you need, like for example on Steam you have these environments, I think Virtual Desktop can load in our cubemaps, it can also load in a mesh like an OBJ. And the thing is, there's a million different mesh formats, and basically glTF is the one format that I think can cover every feature we need, it's open and it's very efficient. So anytime you need a mesh or a 3D triangle mesh, glTF is going to support that. And it is, ultimately I see that as being, it's so small it can be transmitted, so if you need to have a polygonal object, you can do that. For other things like light fields and even volumetrics, like volumetrics we use OpenVDB, which is another open format that stores things like fire and smoke. glTF doesn't do that. And then separately from that, light fields are a whole other... I mean, Stanford came up with ways of capturing light fields and doing this stuff, but it's more about like, you know, is there one format for stereo? No, you just pick two camera views and a light field itself is just a bunch of different views. But the way that we store and render them is almost a black box, and it should be that way. Ultimately, when we take scene data from glTF and turn that into a Lightfield streamer object, it's really about the rendering process behind that. And I think that there's certainly some existing stuff in the MP4 spec that does multi-view, which you can kind of use that for Lightfield video. But this is an education that you need to have to be able to do really good volumetric streaming with all the pieces we want. It's a custom setup, but it's something we want to build on top of open standards. For example, everything we're doing for the Lightfield Viewer and Streamer runs on OpenGL ES 3.2 on the Gear VR, so there's nothing we need. I mean, we can almost send out a pixel shader, even potentially in WebGL, and decode everything on an open platform like the web. I mean, that's our goal, is that we get the playback and all these things to kind of sit on top of open formats. And if glTF can be consumed directly by like a web browser, which is where I think this is going, that's great. We still need mesh data to sort of be the shell or the skeleton of what we're doing and we're mixing that with other content. So there's lots of different pieces, but glTF is a huge step forward in my mind.

[00:12:14.823] Kent Bye: Yeah, and about a year ago, Otoy released a video showing a digital SLR camera on this sort of swinging rig in order to capture a digital light field of a static room. So it doesn't seem like there's a lot of movement that was happening. And so it's kind of like a still life capture of a static space. And then you're able to extrapolate from that digital SLR footage a digital light field. So maybe you could talk about that process and that specific use case for capturing a light field.

[00:12:42.254] Jules Urbach: That process showcases the simplicity of how light fields really work, which is that if you have one camera and you spin it around in a circle and basically change the axis, you're essentially going to get a light field from that. And it's 40 gigs of data that's evenly spaced. And, you know, the thing is, like, when I talk about light field rendering, the naive sense is, like, that's all you need. Like, we can take that data and that's your digital light field, but it's way too big. So what we've done is we basically it's a two-step encoder, we can take the data from the camera, encode it once to send up to the cloud, the cloud then processes it, basically through Octane, generates all the information that Octane, you know, recovers depth information, estimates normal, things like that, and then you basically render that the way that Octane normally does in our normal volumetric synthetic light fields, and then that's the process that we were doing. So, what we're kind of thinking about is that was an experiment that we did last year with Paul Debevec, who's now at Google working on, I think, computational photography. But our piece is we want to come up with an even simpler and cheaper system that can do two things. One, super high quality environment capture in a second. Like that took an hour to capture that thing on the tripod, so we have that in one form factor. The other one is, and this is an extension of Light Stage. We built, and we showed this a couple of months ago, a very inexpensive Light Stage portable camera rig that can capture humans in motion talking. It can also capture people. So it's basically I don't think you want to capture light fields. You know, you could, I suppose, like Lytro is doing, where it's like, everything's in 360, but it makes so much more sense to get super high quality environment capture, and then do a foveated light field video stream, especially for faces and people, that includes all of the light stage technology, which means more than just the light field, it includes the material properties, or how does this face get relit. In the demo we showed, we were able to take that performance, put it in something like Unity, and you can move the Unity light, and the lighting changes on that person. So that's really an important property of real world capture. And I think that there's devices that we want to sort of build prototypes of and let others commercialize and build that hardware. And then there's also partnerships. I mean, like, we think that Lytro's stuff is great. Like, we want to maybe figure out a way we can augment that with all the tools that we do. So it's not a zero-sum game. There's a lot of advancements that can be made in there. But ultimately, anything that captures a lot of different viewpoints is something that we want to enable you to send those images to our cloud service, and we'll turn that into first an Octane scene, and then we can render that out into a very quick, streamable, volumetric video.

[00:14:51.882] Kent Bye: And so is Otoy working on your own sort of live capture video then?

[00:14:57.715] Jules Urbach: At GTC, and I'm actually redoing this presentation tomorrow at the NVIDIA talk, we showed an 8D reflectance field capture. So we were able to capture a woman's performance, one of our employees, from basically ear to ear, and we were able to take that very same asset, and it was essentially more than just a light field, it had the reflectance field. So you can change the lighting on it, you can composite her in any game engine or scene, and that's the kind of capture that we want to perfect. And it's commercialized already. We have a service where you can rent these pieces of equipment. It was used to scan in Obama for the Smithsonian. It was the first portable light stage. And now we're getting into the point where it's going to be for video and inexpensive. And if you need to capture somebody's face or an arc, it's going to be good enough. And it'll be about $15,000 to $20,000, which is pretty cheap for something that does this part of the pipeline. And then the static light field capture system, we haven't unveiled that yet, but it'll be pretty amazing. And if people want to check out what the fidelity of that will be, they can set up an appointment at our booth, and we have a version on the Vive that shows a super high resolution light field capture and rendering that we're doing for static environments, and it looks amazing.

[00:15:56.807] Kent Bye: So if somebody wanted to go out and start capturing and working with light fields today, is there anything that's commercially available or out there that people could use to start to work with light fields within Octane?

[00:16:07.923] Jules Urbach: I think the simplest bet is there's actually a lot of really good photogrammetry tools, and what we want to do is, you can start with that. In fact, somebody who won the Render the Metaverse competition used photogrammetry to start on the art assets in Octane, and then they basically pulled that in, and that was the Japanese forest scene, which was, I think, one of the, it was the third place winner in the last month, so you can see that on the Gear VR. There's a lot of that that we want to enable, and so part of that is that was an off-the-shelf tool. What we wanted is we want to have you upload your images. It could be videos, it could be a bunch of different images you take, and we'll figure out how to build the mesh and the light field from that, and not have it be something that has to be on a tripod, spun around like what we did last year. So those are the kinds of things where we see experimentation being possible, but ultimately, there aren't really any good light field cameras that are out there. It's really just about taking multiple views. The closest thing that I think will solve that will be Lytro's cameras, and those generally are real light field cameras in a sense that I don't think anybody else is actually doing that commercially, and I think that's exciting, but those are also very expensive, so I feel like there's probably room for something that's smaller, and we're trying to work with partners that can enable a bunch of different camera lenses in there, but I would say that just taking a bunch of pictures and using existing photogrammetry tools is a good start, and we'll make that process even simpler by just taking those images and processing them on the cloud and turning them into a basic light field volume.

[00:17:21.075] Kent Bye: So I'm just curious to hear your thoughts of, you know, we have a whole visual effects industry here at SIGGRAPH. A lot of people have their existing pipelines and it seems to be, you know, 360 video is a very comfortable format for them. I'm just curious to hear your thoughts to see how you see the industry changing and how digital light fields are going to either have a kind of a slow or quick adoption growth.

[00:17:43.315] Jules Urbach: So I think if everybody could do light field rendering today the way that they can do a spherical panoramic video or cubemap video, we'd be done. But it's very hard. And light fields themselves, they sort of represent the idea of what we're trying to go after, which is a volumetric video stream. And the thing that stumps a lot of people, and this is why I think we've got a lot of momentum in what we're doing, is that a light field traditionally is just a lot of data to render, to capture, whatever. So it's like for us, we've come up with ways of making that problem a thousand times simpler computationally. And I think that our role in this industry is actually already pretty well spoken for. Octane has plugins in every app, like Nuke, After Effects. I mean, there's, in fact, Octane sells really well. We've seen our growth skyrocket, and I think VR is going to be where Octane really can help the visual effects industry just tap into everything. I mean, even rendering Stereo Cubemap videos in Octane is really easy. Like, we were just showing people at the show, like, we have a service that'll package everything up for the Gear VR. The next stage is that very same system, whether it's composited in Nuke or After Effects, is gonna generate a light field-like volume that you can then have the very simplest client basically stream through, and we're going to connect that to different apps and different technologies so that it becomes like video. That's hard and there's a whole backend cloud service that we've been building around that. Octane is sort of the tool of choice for those pieces, but the industry at large is definitely looking at that. You're seeing people attempting to do these kinds of things through photogrammetry, like Realities.io is pure photogrammetry. The Nozon guys are doing, I think, depth peeling or similar things to give you these parallax effects, but I mean ultimately what we're trying to do is give you the absolute same quality that Octane does when it's rendering live with noise, but give you that in a cached format that is completely navigable and can still be composited with live dynamic content. And that's hard, and if somebody else does that well, then we don't need to, but it's like there's such a void that Otoy's filling that we feel like that's something that we can really help with and get a lot of people on track.

[00:19:27.354] Kent Bye: Awesome. And finally, what do you see as kind of the ultimate potential of virtual reality and what it might be able to enable?

[00:19:35.044] Jules Urbach: So I have a lot of thoughts on this and this is a whole thing for me, but I do think the metaverse as people describe it is where it's going and the option, the option not necessarily, you know, people being forced, but the option of basically swapping your real life for something where everything feels and exists in at least the same quality and realism is going to happen at some point. I mean, whether it's going to be in five years or 50, it's hard to say. But the second that does happen, and I think we're actually doing our part. I know that we can render the holodeck, we just need more computing power. The second we prove that and we have all five senses hooked into that, then it probably, as Elon Musk was saying, it probably means this reality is already a simulation. We don't know who created it necessarily, but it's the fulfillment of probably the meaning of existence that we get to the bottom of whether or not we can simulate a reality that looks exactly like this one and then start riffing on it and doing better implementations. Like, oh, I want to have the idea of the Christian paradise. Well, why not? It can be simulated. It's easier to simulate than reality, physical reality. So there are all these different layers. And I do think that from it, you can take all the trends that exist today, social and knowledge and experiences. And basically, when you have your eyeballs hooked into that, and at least even when it starts looking completely real, that's already at the point where a lot of people are just going to move into that virtual space and not have to go on trips and not have to go to schools where a lecture is no different in VR than it is in real life. That's going to definitely happen in the next decade, for sure, if not sooner.

[00:20:53.231] Kent Bye: Awesome. Is there anything else that's left unsaid that you'd like to say?

[00:20:56.094] Jules Urbach: The space is growing really fast. We're excited. If people haven't seen the ODG glasses, which are the partner that we have in Mixed Reality, it's amazing. I mean, Tim Sweeney put them on. He said these are the best glasses he's ever seen. He just came by our booth today. It's amazing. People should definitely check that out if they're at SIGGRAPH in the next couple of days.

[00:21:11.568] Kent Bye: Awesome. Well, thank you so much.

[00:21:12.902] Jules Urbach: My pleasure.

[00:21:13.442] Kent Bye: Thank you. So that was Jules Urbach. He's the co-founder and CEO of Otoy, which is working on a lot of really cutting edge light field rendering technologies. So I have so many takeaways from this interview and so many that I'm just going to try to give the highlights. This is an interview where there were so many new concepts that I found myself writing down a lot of new terms and just trying to figure out what he was actually trying to say. And so this is a very dense interview and I'd recommend you listening to it again and hopefully some of these things that I'm giving you here will give you a little bit more context. So the things that I'm taking away from this interview is that, first of all, light field technologies are super cutting edge. They're really the most photorealistic types of technology to be able to see different scenes that are rendered into VR. And I think part of the reason is that they're actually capturing a reflectance field, which means that they're not only just doing a video capture of something, but they're able to actually change the lighting within a scene. So you're able to do a volumetric capture and then change the lighting that looks realistic. So having that reflectance field actually gives you the capability to take something, put it into a game engine at some point. So he's working on integrations with both Unreal and Unity, and so you'll be able to do this Octane rendering within these game engines and start to play around with the lighting, and so it just gives you a lot more flexibility for creating these surrealistic and beyond-normal captures, because a lot of the other existing 360 video technologies are just capturing the world and then you have to do a lot of compositing out. And so they're coming up with a lot of different new file formats to be able to display these images. So one of the other things is that there's still a black box component for what Otoy is doing. However, they're wanting to really build the technology stack on top of a lot of open standards, which I think is really interesting and key because they want to eventually be able to take a digital light field and render it in your web browser using WebGL technologies. The big part of what they're doing is the processing in order to create this data stream to be able to send it down. So it involves a lot of cloud computing, a lot of processing on the back end, and Otoy is not wanting to necessarily be bothered with creating hardware in order to actually capture some of these light fields, but they want to just focus on becoming a software company that is going to be able to ingest this huge amount of data and then process it down to these very efficient streams, so they're sending these light field videos down to you that can either be served to mobile VR or desktop VR or be rendered in real time at up to like 2,000 frames per second, which is pretty insane if you think about it. So in a lot of ways, if you want to think about what OTOY is, they are like the closest thing to Hooli in HBO's Silicon Valley, because they are coming up with the compression algorithms to be able to render out the metaverse. These light fields are so huge that it's not feasible to be able to actually process them in real time. In the process of going around at SIGGRAPH, there's a number of different types of technologies, like NVIDIA's Iray and other approaches where people are trying to handle and deal with digital light fields.
But the thing that Otoy has been able to do is that they've got these plugins for the Octane Renderer to be able to render these light fields into all the existing workflow and pipeline within the visual effects industry, which I think is a key difference from anything else that I've seen. I think that Otoy is years ahead of anybody else in this field. And at this point, there's not a lot of native light field capture technologies and cameras. And it sounds like Lytro is going to be this process of doing a camera to be able to capture a scene. But the thing that Jules is saying was like, well, I don't know, because that's a whole lot of additional data to be able to capture an entire scene. And so I think one of the things that Jules was saying is that what people are going to end up doing is kind of doing this mixed and compositing approach. So for example, you may do a photogrammetry capture of an environment that really isn't changing all that much. And then when you use something like the Lytro camera, you're probably going to be just focusing on the actors and the movement within a scene. So there's no real reason to have to use a light field camera to capture everything that you're going to be seeing. So I'm not quite sure exactly how the Lytro camera is going to be able to work, whether or not they have to have like multiple perspectives to be able to capture a full volumetric image. But one of the things that Jules was saying is that they've come up with their own kind of 8D light field capture that they're able to then send into Octane. So in other words, if you just think about, I have a camera pointing directly at my face, well, I'm not seeing my side, I'm not seeing behind me. And so I'm assuming that they've come up with a way with eight different perspectives to shoot a live actor. And then they're taking all that data and they're synthesizing it into one rendered file so that you get this volumetric 3D version of somebody acting. But it's really focused on that talent. It's not focused on capturing the entire scene. Again, they're going to be doing this compositing effect of looking at the environment, taking that, maybe it's using these photogrammetry techniques, converting it into a light field. And those light fields are not actually being rendered out. They're converting it into some sort of either point cloud or other type of data. So glTF is a new open standard and I've done some other interviews and investigation into what this actually means. But essentially it's like a bundle of the 3D assets, which I think get kind of boiled down to a 3D mesh, which is what a lot of the existing tools like Maya and Cinema 4D as well as Blender are creating these 3D assets, which are 3D meshes, which are essentially like triangles that are combined into these volumetric shapes. Well, glTF is just a way of having an open standard to be able to have a highly efficient way of compressing all those 3D shapes into one file format. It sounds like there's not going to be necessarily a light field specifically that's within that glTF format, but that somehow Octane rendering is going to be able to export into a glTF, so it's able to kind of create this mesh-like entity from a lot of these digital light fields.
So I'm still a little unclear on exactly how Otoy is going to be using glTF, but the big takeaway that I took is that they're using a series of different open standard technologies to be able to have a common interface that's non-proprietary, where they're doing this kind of black box processing in order to actually render out these very complicated and data-intensive types of light fields, and then they're able to put it into a format that's going to be accessible from everything from mobile VR to websites on up into these very high-end machines that are done for desktop and room-scale VR. So there's a lot of other specific details around this technology that are completely new to me, honestly, and so I'd have to learn and investigate them more to be able to really speak about them with any more informed insight from what I took away from Jules. But the big takeaway is that Otoy is pretty awesome and they're on the cutting edge of a lot of these technologies and they're really laying the groundwork to be a pretty huge player when it comes to the future of these photorealistic renderings through digital light field technologies. And it's worth really listening to a lot of these concepts that he's talking about within this interview and to try to understand how to start to work with them because I think that if you really want to be on the cutting edge of rendering out scenes that are completely amazing looking, then Otoy's certainly cracked the nut on that one. So I encourage you to check out some of the Render the Metaverse competition entries that John Carmack was co-judging with Jules, and they're really quite amazing and stunning. And I did get a chance to check out the ODG glasses and it was crystal clear. I mean it was probably one of the sharpest images that I've seen in any type of virtual or augmented reality headset, and the field of view was a little small, so it was kind of like looking through a window into another reality. But in this interview and announcement from Augmented World Expo, Jules is saying that it feels like this is jumping two or three generations ahead of the competition, saying that there's about four times the pixel density of other wearable devices. And so if you get a chance to check out the ODG glasses, certainly check it out. It was really cool to see the light field scene rendered within that. So that's all that I have for today. I'm in Los Angeles today. I'm going to be going to the VRLA. And if you enjoy the podcast and would like to take a moment and show your appreciation, then one thing you can do is write me up a testimonial and send it to Kent at VoicesofVR.com and I'm in the process of just trying to gather up a lot of different testimonials to be able to present to other potential sponsors and also kind of building up a case for what I've been able to achieve with the Voices of VR so that as I launch the Voices of AI, I can kind of hit the ground running with sponsors and start to really investigate the world of artificial intelligence like I've been in virtual reality. And if you'd like to send some direct financial support my way, then you can become a donor at patreon.com slash Voices of VR.
