#1479: Agile Lens 2023: Selling High-End Real Estate with VR, Unreal’s MetaHumans, Pixel Streaming, & AI Experiments

I’m diving into my backlog to publish three unpublished interviews with Alex Coulombe, Creative Director of Agile Lens Immersive Design. This second interview took place at Tribeca Immersive 2023. See more context in the rough transcript below.

Here’s a talk about the Four Seasons Lake Austin VR experience that Alex gave later in 2023 at Unreal Fest.

This is a listener-supported podcast through the Voices of VR Patreon.

Music: Fatality

Rough Transcript

[00:00:05.458] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to the Voices of VR Podcast. It's a podcast that looks at the future of spatial computing. You can support the podcast at patreon.com slash voicesofvr. So continuing my little backlog deep dive with previously unpublished interviews with Alex Coulombe, today's episode is a conversation that I had with Alex back at Tribeca Immersive 2023, where he was talking a lot about working with Unreal Engine and MetaHumans, but also all these different projects that he was working on, both A Christmas Carol and this real estate project that he was doing in Austin, Texas with MetaQuest Pros, with OptiTrack tracking, and essentially a very high-end photorealistic rendering of some of this real estate in Austin, Texas. Later that year, in December of 2023, I actually had a chance to try out the demo, and I'll be digging into my reactions to that in the next episode, a conversation I had with him at Filmgate International in Miami, Florida. But this conversation with Alex happened on Tuesday, June 13th, 2023, at the Tribeca Immersive Festival in New York City, New York. So with that, let's go ahead and dive right in.

[00:01:17.136] Alex Coulombe: My name is Alex Coulombe. I run an XR creative studio here in New York City called Agile Lens Immersive Design. My background is in architecture and theater, which are both pretty spatial and over the years have lent themselves very well to exploring these emerging technologies.

[00:01:31.883] Kent Bye: Great. Maybe you could give a bit more context as to your background and your journey into this space.

[00:01:35.925] Alex Coulombe: Sure. So I've always been interested in, let's call it, real-time technology. When I was studying architecture at Syracuse University, I was always looking for ways to better represent the experience of moving from one space to another, which really gets lost when you're showing someone a plan or a section or an elevation. I wanted to start giving someone a little bit more of a sense of being in a video game, for example, using the Source engine and things like that, to understand that sense of compression and then release, the feeling, for example, of coming off a train and opening up into Grand Central Terminal, which is something I was always trying to capture in my architectural designs. And so I worked for a few different architecture firms. My first job here in New York City 13 years ago was for Rafael Viñoly Architects, which is just a block away. They weren't particularly interested in how this technology could be used, and so I very quickly found myself gravitating towards any company that saw the potential of this for better communication of design intent. And so I landed at a company called Fisher Dachs Associates, Theatre Planning and Design, in 2012. That was a company that focused entirely on designing theaters, which felt kind of perfect for me. In my architectural thesis and every project I did in school, I was always trying to turn the design into a theater. I'd have a professor who'd say, oh, we're going to do restaurants this semester. And I'd be like, how about a dinner theater? And they'd be like, no. But I had a good time anyway. So I started at Fisher Dachs Associates in 2012, and we both know what happened in 2013. I had a little bit of time to start figuring out how comfortable they were with letting me use things like Unity and different real-time technology. And then all of a sudden, the Oculus Rift DK1 Kickstarter came out, and I thought, well, this will be incredible. We'll just be able to put people inside the theaters we're designing. We can say, this is what it looks like from row Q, seat C13. Here are five different design options for the balconies. Here's an orchestra on stage and a production of Henry V. And I just had this immediate vision of how that would be useful in our design process. And not only did that turn out to be true beyond my wildest expectations, but after a couple of years, my boss at Fisher Dachs Associates decided to co-found Agile Lens with me as a way to better focus on the potential of what this technology could do, at first in the architecture field, but it grew from there.

[00:03:43.314] Kent Bye: Great. And don't you also have another sort of real-time streaming startup, or is that part of Agile Lens as well?

[00:03:49.358] Alex Coulombe: Yeah, at the moment, pretty much everything is under the banner of Agile Lens. A couple of years ago, we found that a lot of the experiences we were building for our clients were in Unreal Engine, and I have a very nice partnership with Unreal Engine going on a few years now. We found that we always needed to spec pretty expensive hardware, both the headsets and the computers themselves, when we were building things for our clients, whether they wanted a VR experience or a more traditional streaming experience. And so we realized the potential of cloud computing and pixel streaming, also known as remote rendering. Anyone who's seen Google Stadia or GeForce Now, for example, will know what I'm talking about. This seemed like it had tremendous potential for us to better represent what our clients could see without them having to actually own the hardware themselves. So not only were we trying to make it so you could play these experiences in a browser, but then we pushed very hard on allowing you to access very high-end virtual reality experiences built in Unreal Engine from the cloud. To do that, at first, we were using NVIDIA CloudXR, which is a great application. We've used it for a number of experiences for enterprise use cases as well as our live theater work. But then recently I was poking all the right people at TensorWorks, who build the Pixel Streaming technologies over at Epic Games, and as you saw back in March, I got to announce that in Unreal Engine 5.2 there is now what's called WebXR Pixel Streaming, which allows you to bypass the friction that comes with the current method of using NVIDIA CloudXR, because you have to sideload that app into a headset. The Meta app store doesn't allow anything on App Lab or in the official store that has any kind of cloud computing capabilities. So it was always a pain to walk someone through sideloading our app if they had never done it before. But now with WebXR Pixel Streaming, it's the same as using Mozilla Hubs, basically. You open up the browser link on your Chromebook or in your Vive headset or your Meta headset, and then it's just going to prompt you: do you want to enter VR? And it's incredibly promising. I do want to make sure I couch people's expectations. The latency right now is pretty untenable; you'd want to kind of hold your head still and not move around too much. But the potential of reducing the friction for very high-end experiences, especially when it comes to things like Apple Vision Pro and what may or may not be a very gated App Store there, is huge. This will be a way to use WebXR as a portal to get all sorts of other experiences in there.
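For readers who want to picture the browser flow Alex describes, here is a minimal TypeScript sketch of the "open a link, get prompted to enter VR" step using the standard WebXR API. The Pixel Streaming signalling and video plumbing are out of scope here, so connectToPixelStream is a hypothetical placeholder, not Epic's actual frontend API.

```typescript
// Minimal sketch of the "open a link, get prompted to enter VR" flow.
// Assumes WebXR type definitions are available; the streaming handshake
// itself is represented by a hypothetical placeholder function.

async function enterStreamedVR(button: HTMLButtonElement): Promise<void> {
  if (!navigator.xr || !(await navigator.xr.isSessionSupported("immersive-vr"))) {
    button.textContent = "VR not available on this device";
    return;
  }

  button.textContent = "Enter VR";
  button.onclick = async () => {
    // This is the browser prompt Alex mentions: the user grants XR access.
    const session = await navigator.xr!.requestSession("immersive-vr", {
      optionalFeatures: ["local-floor"],
    });

    // Hypothetical: attach the remote-rendered stream to the XR session.
    await connectToPixelStream(session);

    session.addEventListener("end", () => {
      button.textContent = "Enter VR";
    });
  };
}

// Placeholder for the signalling/WebRTC handshake to the cloud renderer.
declare function connectToPixelStream(session: XRSession): Promise<void>;
```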

[00:06:04.949] Kent Bye: Yeah, before we dive into some of the other projects you've been working on, I did want to get a bit of your hot take on what's going to happen with Apple Vision Pro and Unreal, because we've had this lawsuit from Tim Sweeney and Epic Games against Apple. We have an exclusive partnership between Apple and Unity, and on top of that, we have sort of an antagonism between Apple and Khronos over their APIs. So it's very unlikely that they're going to adopt anything like OpenXR, which is a lot of how both Unity and Unreal Engine have been dealing with different immersive XR applications. I'm not sure what the exact pipeline is going to be for Unity, but they've basically found some way at least to get Unity projects onto the Apple Vision Pro. What's your early hot take as to whether or not you expect to see Unreal Engine on a platform like Apple Vision Pro, or if there are too many back-end legal disputes that make it a little bit too untenable with the existing architecture for Unreal Engine?

[00:06:57.674] Alex Coulombe: Right. And before I give my thoughts on this, I do want to make sure the whole public knows that even though Agile Lens and myself do have a partnership with Epic Games, I am not an employee of Epic Games. And so this is tricky, because I gave the XR keynote at Unreal Fest in New Orleans last October, and because of that there became kind of this assumption that I'm like the VR Unreal Engine guy. I know a fair amount about it, but anything I say is absolutely just the opinions of Alex Coulombe and Agile Lens, my employer, and not anything that Epic or Unreal Engine is saying officially. My thoughts are basically that Apple is going to continue to maintain this kind of standoffish position from Epic. From what I know about Epic and the philosophies that they have internally, I definitely foresee them doing whatever they can to create whatever SDKs they need in order to plug into that ecosystem. I don't think Apple can actively stop any of that happening. And we have giant companies like Disney that are using Unreal Engine all the time. And I think it only takes a couple of those companies to be like, hey, Apple, we really want to get our Vader Immortal experience or whatever onto an Apple Vision Pro. So I think right now it was definitely beneficial for Apple to be like, hey, we have this great partnership with Unity, look at how we're going to have this great path toward creating content together. But I don't see any reality where Unreal Engine isn't able to create first-party apps moving forward. I do want to mention also, in addition to the WebXR Pixel Streaming direction, NVIDIA CloudXR actually allows you, starting with CloudXR 4.0, to make Unity apps, and it's kind of a funny other backdoor, because you could build a Unity app that you then get onto an Apple Vision Pro, which, again, you could use to access any VR experience from a desktop computer.

[00:08:36.892] Kent Bye: Yeah, and while we're on the Unreal Engine topic, you've been doing some prototyping with a lot of the, I guess, higher-end features that came to Unreal Engine 5.x, and whatever the future versions are going to be, but that were not available in VR initially. You've been doing some tinkering and trying to find the right settings to get this higher level of fidelity in rendering, and the different types of features that let you have these really massive scenes within Unreal Engine, within a VR context. So I'd love to hear a little bit about some of your early experimentations with that, and whether it's on the pathway to being more widely available for more people without so much tinkering, or if you expect that it's still kind of an experimental thing to get it working within XR. So I'd love to hear about some of those different features and what you've been able to do with them so far.

[00:09:24.727] Alex Coulombe: Yeah, absolutely. So for anyone who isn't too familiar with the Unreal Engine ecosystem, the two major innovations that came with Unreal Engine 5.0 were called Lumen and Nanite. Lumen is basically a real-time global illumination system, so you can have tons of lights, and Nanite is the part that lets you have very complex scenes, and it can all run pretty well in real time. There was a demo called Valley of the Ancient, there was a tie-in with the newest Matrix movie, and there's a City Sample you can download that's a big, robust, thriving city that you can even modify in Houdini, and you have traffic patterns and MetaHumans walking around, and it all runs pretty well in real time on a 2D screen. That being said, none of those features have been tailored for VR right now. There are pathways toward optimizing that work for VR, but in many cases it is still going to be far more efficient for developers to do baked lighting and forward shading for VR, rather than the deferred rendering that the Lumen and Nanite system needs. That being said, we have been involved with some projects where it has worked very well. It just needs to be the right project. If the ability for lights to turn on and off and move around is very important, or if you have very high-polygon geometry that might be in the billions rather than the millions, Nanite is going to really save you there. At its base, VR is already very expensive. We know you're trying to render two, maybe three views if you've got a spectator view, at ideally 90 frames a second, and that's a lot. So at the moment, I'd say maybe only 5% of the VR projects out there are really going to benefit from Lumen and Nanite, but I see that changing in the relatively near future. I also want to give a quick shout-out to a gentleman on Discord named Praydog. He is developing this mod that I've been fortunate enough to do some alpha testing on, and it's an Unreal Engine-to-VR anything mod. You can take any experience built in Unreal Engine, you open up the executable, you launch his little injector, and it will make the experience work pretty darn well in VR, including ones that do have Lumen and Nanite built in. So I know he wants to release that to the public in the coming months. Keep an eye out for that.
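To make the "VR is already very expensive" point concrete, here is a rough back-of-the-envelope comparison. The resolutions are illustrative assumptions, not figures from any specific headset or project.

```typescript
// Rough pixel-throughput comparison: flat 1080p at 60 Hz vs. stereo VR at 90 Hz.
// Resolutions are illustrative; real headsets also render above panel resolution
// to compensate for lens distortion, which makes the gap even larger.

const flat = { width: 1920, height: 1080, views: 1, hz: 60 };
const vr = { width: 1920, height: 1920, views: 2, hz: 90 }; // assumed per-eye render target

const pixelsPerSecond = (c: typeof flat) => c.width * c.height * c.views * c.hz;

console.log(pixelsPerSecond(flat));                        // ~124 million pixels/s
console.log(pixelsPerSecond(vr));                          // ~664 million pixels/s
console.log(pixelsPerSecond(vr) / pixelsPerSecond(flat));  // roughly 5x the work
```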

[00:11:19.642] Kent Bye: Okay, yeah, I know that there are a number of different folks in the gaming community that have been doing these different ports and injectors to translate 2D games into immersive VR experiences. So yeah, let's maybe dive into some of the different projects that you've been doing at Agile Lens. With your background in architecture, I'm just curious if you've been designing any physical architecture or virtual architecture, and I know you've also been doing a lot of immersive location-based experiences with more of a theatrical edge, with integrations with Unreal Engine and whatnot. So you've got a lot of different types of projects. But I'd love to hear some of the highlights of the types of stuff that you've been working on at Agile Lens.

[00:11:54.815] Alex Coulombe: Yeah, the Venn diagram we're always looking for is something that is challenging the use of emerging technology, something exciting with architecture, and something exciting with theater. I'll give an example of when we can get all three of those things working together: a high-end hotel where we need a VR actor, almost in the role of a ractor from Neal Stephenson's The Diamond Age, to perform the role of a concierge. Something like that can be very exciting because then we get to push the limits of VR, do something with architecture, and do something with theater. A recent project that's been keeping us very busy, and I'll try to couch how much I'm allowed to talk about it, but if this interview comes out in a little while, we might be okay. I got contacted a few months ago by all sorts of different people who I knew from different industries. It was Vikas Reddy, it was Kim Baumann Larsen, it was Jeff Model, it was Matthew Bannister. It was all these folks who were saying, hey, we're involved with this project, or we know about this project, that is trying to create essentially the most photorealistic VR experience you've ever seen. Alex, is there anything you can do to help with this? And so myself and Agile Lens came on board around December, and we did a whole month of R&D. We were looking at different versions of Unreal. We were looking at different applications. Do we actually want to use Omniverse? Can we do what we're trying to do inside Unity? Can we do everything we want using a Varjo XR-3, or do we need to go to something more mobile? And so where we ended up was creating, and there are a ton of people involved with this, so it's Agile Lens, it's PureBlink who's handling a lot of the Unreal Engine work, it's Matthew Bannister and DBOX leading the whole creative design of this project, and the client for this project is Jonathan Coon. It's called the Four Seasons Private Residences Lake Austin, there are over a hundred consultants involved with this project, and Agile Lens' role has very much been to figure out the things that have never been done before with VR. A big challenge here was being able to have a large-scale, 80-foot-by-60-foot, roughly 5,000-square-foot arena where you can actually have local multiplayer, people all walking around together in what is meant to be a totally photorealistic representation of the Four Seasons Lake Austin private residences, which are not going to be built for five years. The prices of these residences go from $5 million to $50 million if you're getting the super penthouse. And it's a pretty big real estate challenge to convince people to put down millions of dollars years before this is built. So Jonathan's vision for this was very much that he wanted it to be as accurate as possible. I can say Agile Lens has been involved with architecture and real estate projects before where we've shown it to a client in VR and they say, oh, you know, this space actually looks a little tiny, can we just make it bigger? And it's like, okay, but that's not accurate. You're going to give people a false impression of this unbuilt real estate. So his goal for this was very much photorealism, lots of ray tracing, bouncing things off mirrors. We've been fortunate to have a lot of conversations with folks high up at Meta trying to make this work as well as possible. We are using MetaQuest Pros. We're using OptiTrack with custom active tags for all of the systems for moving around in there.
There are tours going on in this custom space on Lake Austin every day. It is something that requires a large staff on hand to get everything to work correctly. And it's been a fascinating challenge in everything from the networking, and getting Air Link to work well with five headsets all in the same space, to making sure that we are striking the right balance of photorealism and performance. We're actually using reprojection for this, so we'll target a frame rate of 36 to get to 72, or 45 to get to 90. A lot of these things I would have said at the beginning of the project, like, that's untenable, that's going to make people sick. We've had maybe hundreds of people at this point go through the experience. No one has gotten sick. I've been surprised by how well it works, and I want to give a lot of credit to Meta for this: their ability to use spacewarp and timewarp to smooth out an otherwise maybe painful experience has actually been quite impressive, and this experience is far more photoreal than anything we've ever worked on before.
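The reprojection numbers Alex quotes follow a simple half-rate pattern: the app renders every other frame and the runtime's spacewarp and timewarp synthesize the rest. A small sketch of that budget math, with illustrative numbers only:

```typescript
// Half-rate rendering with reprojection: the app targets displayHz / 2,
// and the runtime synthesizes the in-between frames from motion data.

function appFrameBudgetMs(displayHz: number, reprojectionFactor = 2): number {
  const appHz = displayHz / reprojectionFactor;  // e.g. 72 -> 36, 90 -> 45
  return 1000 / appHz;                           // time available per rendered frame
}

console.log(appFrameBudgetMs(72).toFixed(1)); // "27.8" ms instead of 13.9 ms at full rate
console.log(appFrameBudgetMs(90).toFixed(1)); // "22.2" ms instead of 11.1 ms at full rate
```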

[00:15:39.186] Kent Bye: Yeah, I remember during South by Southwest you were leaking out some cryptic images of a MetaQuest Pro with these OptiTrack trackers on it, so I'm assuming that this was that project.

[00:15:49.414] Alex Coulombe: Yeah, leaking is a strong word. I think I was hinting at some of the exciting reasons why I couldn't be at South by Southwest as much as I wanted to be.

[00:15:57.438] Kent Bye: Yeah, so why go with the OptiTrack? Why not use just the normal tracking systems within MetaQuest Pro?

[00:16:03.184] Alex Coulombe: Oh yeah, well, this could be a whole hour in itself. So we started with spatial anchors. Shared spatial anchors seemed like an exciting way to get everyone in an Oculus headset to scan the same room, have the same data, and move around. That turned out to be not accurate enough. It also doesn't work very well over Air Link. It actually lags behind the correct position, and there were all sorts of little issues we flagged in their plugin, which six months later are now all listed as known issues in the actual plugin. We then went to a company called Antilatency, which was used by Felix & Paul for The Infinite in San Francisco. This uses IR markers on the ceiling to give kind of a big, giant fiducial sense of where everyone is and give you a coordinate space. We tested this in an office space in Austin, and we were like, okay, great, it works. Then we moved it to the tent, which is where the arena-scale VR experience is. And because it is a tent, the whole space kind of moves. And if you've got these ceiling tiles that move in waves, and that's supposed to be your ground-truth coordinate system, that doesn't work so well. So ultimately, what we're doing with OptiTrack I think is pretty clever. We are essentially only asking OptiTrack to recalibrate all the users every so often. Because the spacewarp and timewarp black magic that Meta has is so good, we're only, to use my client's analogy here, letting OptiTrack put its hands on the steering wheel for like a second every time we're trying to figure out where everyone is. So as the host walks around guiding everyone through the experience, they actually see three headsets for each of the clients in the experience. They see where the headset is from the client's perspective, where the headset is going to go if they recalibrate them and use OptiTrack's perspective, and then they see a headset of what is going to look correct for them if they peek under the headset and everything looks aligned. Which sounds confusing, but it's really helpful for the host to know whose headset has drifted, for example, because that is the big problem with just using the MetaQuest ecosystem: the headset is going to drift. That's okay if it's just a few inches, but if you're at the point where someone is about to bump into someone else or hit a wall, that's when you want to do a light little recalibration. And something we've started to play with is actually automatically doing recalibrations using the eye tracking in the MetaQuest Pro. We have found that people generally can't even tell if they've shifted six inches or even a foot if it happens while they're blinking. So we have all these little clever ways to keep the experience comfortable, and OptiTrack has been the most successful system for us so far, especially because we have it up on trusses and not in the ceiling.
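Agile Lens hasn't published this system, so the following is only a conceptual sketch of the strategy described above: compare the headset's self-tracked pose against OptiTrack's ground truth, and only snap the tracking origin back when drift passes a threshold and the user is blinking. Every name and threshold here is a hypothetical stand-in.

```typescript
// Conceptual sketch of drift-gated recalibration, as described in the interview.
// Pose sources, thresholds, and the blink signal are hypothetical placeholders.

interface Vec3 { x: number; y: number; z: number; }

const distance = (a: Vec3, b: Vec3): number =>
  Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);

const DRIFT_THRESHOLD_M = 0.15; // recalibrate once drift passes ~15 cm (illustrative value)

function maybeRecalibrate(
  headsetPose: Vec3,      // where the headset's inside-out tracking thinks the head is
  optiTrackPose: Vec3,    // where the external OptiTrack rig says the head actually is
  isBlinking: boolean,    // eye-tracking signal used to hide the correction
  applyOffset: (offset: Vec3) => void, // shifts the player's tracking origin
): void {
  const drift = distance(headsetPose, optiTrackPose);
  if (drift > DRIFT_THRESHOLD_M && isBlinking) {
    applyOffset({
      x: optiTrackPose.x - headsetPose.x,
      y: optiTrackPose.y - headsetPose.y,
      z: optiTrackPose.z - headsetPose.z,
    });
  }
}
```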

[00:18:22.958] Kent Bye: Yeah, what I find really fascinating about the approach that Agile Lens is taking here, and with the industry in general, is that you have this intersection of architecture and theater and cutting-edge technology. I often refer back to Simon Wardley and his model of technology evolution, where it goes from a duct-tape prototype idea into an enterprise context, then into consumer applications, and eventually you get to mass ubiquity. Because you're in this enterprise space with these huge budgets for these real estate projects, you're able to prototype the technology that could then get into more of a consumer location-based entertainment context, which is still, I guess, technically in this enterprise sphere, but it's moving from real estate into the entertainment sphere. So with a lot of these underlying technologies, I can imagine a lot of use cases where theater folks would be really excited to get their hands on this.

[00:19:08.826] Alex Coulombe: 100%. And every time we do client work, we're always trying to build up our own war chest of things we want to use for our own projects. Because Agile Lens, at any given time, is usually split about 50-50 between something we're doing for a client, where someone says they need an XR SWAT team and we swoop in and try to help them, and then our own work, where we're trying to grab whatever we can and whatever we've learned from that to better inform our theater work or some crazy virtual architecture project, the other things that we're doing because that's what makes us passionate about the space.

[00:19:36.527] Kent Bye: Well, there are some different prototypes and demos that you're doing with pixel streaming and live performance, and also, I think, during the pandemic, working with folks like Brandon Powers to figure out other motion capture techniques that were maybe a little bit more off the shelf, experimenting with what exists at a consumer scale but can still produce high-quality motion capture. So I'd love to hear a little bit of elaboration on some of these prototype experiments you've been doing with motion capture, live performance, and these different prototypes you've done with pixel streaming.

[00:20:06.204] Alex Coulombe: Sure. So as you say, at the start of the pandemic, there were a lot of interesting opportunities on the theater side. I've been speaking at theater conferences for years, TCG, USITT, Opera America, and we've always had these demos of, hey, here's what's possible using things like Philip Rosedale's High Fidelity or VRChat or Rec Room or whatever. And we'd always get this kind of, oh, that's really cool, from theater companies. But it wasn't until the pandemic that some of them really wanted to explore this in a meaningful way. So Actors Theatre of Louisville was the first one who engaged us in a really meaningful way. We'd seen them at the TCG conference in Miami back in 2019. Brandon Powers was there doing some really cool stuff with Spatial and Magic Leap. We had this nice cornucopia of different demos for people to try. And they basically said, we want to do a production of A Christmas Carol this December, where we want to have these scrims put up, and the idea would be that we could have MetaHumans (they'd heard about MetaHumans), all the characters from A Christmas Carol, in particular the ghosts, projected into their space, and maybe those ghosts could be live, maybe they'd be videos. And I started to make suggestions like, well, what if your actor playing Scrooge was actually wearing an Xsens or a Rokoko suit under their costume and we could broadcast the whole experience live? And they toyed with that for a little bit and said, ah, no, you know, our actor Gregory Maupin, very talented, doesn't want to have to deal with the extra friction of technology. And I said, well, you know, there's this actor that we've been working with for a while. His name's Ari Tarr. He's amazing. He knows this technology inside and out. How do you feel about him being Scrooge in a separate VR production of the show? And they said, you know, go with God. We'll be there for creative direction and we'll try to do as much as we can for it, but their focus, of course, was on the physical in-person experience. So what we ended up with in December of that year were two experiences, which I was quite happy with, because the in-person experience was really tailored to that medium, and the VR experience was really tailored to its medium. In person, you had Gregory Maupin doing this incredible one-man performance of A Christmas Carol, and we had what ultimately were these videos rendered by Rob Lester, who was working with us, to get these beautiful key moments from A Christmas Carol projected around the space. But then we had all that mocap data that we'd recorded for all those characters, and so for the virtual reality production of A Christmas Carol, we could do things like change the scale of them. We could make the Ghost of Christmas Future 100 feet tall, and in VR, you'd really feel that. And then we let Ari, in an Xsens suit and using the Live Link Face app tied to a Technoprops helmet, perform and see, or have some sense of, where all the audience members were, because we basically turned his living room into a CAVE. And so he could kind of get a sense of looking around to see where everyone was. Now, over the years, we've continued to do this production of A Christmas Carol.
We essentially plan to do it every December forever and ever, and one thing that's been really fun to play with for this year is that Ari and other actors we've worked with are really eager to feel more engaged with the audience. As great as something like Live Link Face is for capturing all the nuances of a facial performance, it doesn't give you a real sense of feeling connected to that audience and feeding off their energy. So this year we've been playing a lot with the MetaQuest Pro and the Vive XR Elite and the face tracker attachment that came out for the Vive Focus 3, to see how good a fidelity of facial tracking we can get while the performer is inside of VR. So at Next Stage over in LA last week, we were demoing some of our experiments with that and letting people try a wide range of emotions while staying in VR, and I think the results are very promising. Meta right now isn't supporting MetaHumans directly, and it's very confusing that MetaHumans and Meta are totally separate ventures. But I've done a very janky thing to basically translate everything that the Oculus face tracking gives you for expressions to get it to work on MetaHumans. And now we're just having a great time thinking about how many live performers we're going to have this December. Are we going to use all the different technology we've used in the past? Because we've done a YouTube live stream. We've done pixel streaming in a browser. We've done CloudXR. We're hoping we'll be able to do WebXR Pixel Streaming. But it's always just a really fun adventure figuring out what will work and what needs to wait another year.
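The "janky thing" Alex mentions isn't public, but the general shape of such a translation is a remapping from headset face-tracking expression weights onto the ARKit-style curves a MetaHuman facial rig consumes over Live Link. This is a heavily simplified, hypothetical sketch; the real expression and curve names differ by SDK and version.

```typescript
// Hypothetical sketch: remap a few face-tracking expression weights (0..1)
// from a headset's face-tracking API onto ARKit-style curve names that a
// MetaHuman facial rig can consume over Live Link. Names are illustrative;
// the real expression sets differ in naming and count.

const expressionToMetaHumanCurve: Record<string, string> = {
  jawDrop: "JawOpen",
  lipCornerPullerLeft: "MouthSmileLeft",
  lipCornerPullerRight: "MouthSmileRight",
  eyesClosedLeft: "EyeBlinkLeft",
  eyesClosedRight: "EyeBlinkRight",
};

function remapExpressions(weights: Record<string, number>): Record<string, number> {
  const curves: Record<string, number> = {};
  for (const [expression, curveName] of Object.entries(expressionToMetaHumanCurve)) {
    const w = weights[expression];
    if (w !== undefined) curves[curveName] = Math.min(1, Math.max(0, w));
  }
  return curves; // would then be sent to Unreal as a Live Link subject frame
}
```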

[00:24:03.302] Kent Bye: Yeah, and I've also seen a lot of experimentations that you've been doing with Epic's MetaHumans. Maybe give a bit more context on MetaHumans from Epic and where you see that going in terms of virtual beings, or having these avatar representations that are a little bit more photorealistic. Meta has their Codec Avatars, and this is probably a little bit like the equivalent of a codec avatar, but a little bit more stylized and optimized for a game engine. So I'd love to hear some thoughts on what's happening with Epic's MetaHumans.

[00:24:31.994] Alex Coulombe: Yeah, so I've been building courses for Epic for a couple of years now. That was another big life change during the pandemic: maybe I can teach things for Epic on the side. And they started to ask me to develop courses. One of those courses was for MetaHumans, specifically in the architecture space, and then that grew out to also doing it for the virtual production space and games and other things like that. And what a lot of people saw at GDC with MetaHuman Animator is incredibly exciting when we think about the possibilities of crossing the uncanny valley. I've always found that in VR it is exponentially harder to get someone to feel good when you're looking at them. Something very stylized can work, but a lot of times there's just something uncomfortable, where even if it was a flat representation of that person, maybe it's okay, but in VR the eyes can feel very dead and the movement can feel very strange if not done right. So with MetaHuman Animator, this is some new technology Epic's developed internally that uses machine learning to do an incredible amount of processing over everything from what your tongue is doing to all the little micro expressions your face does, and this other layer that happens on top, which is not live at the moment. I'm sure that they'll make it live at some point, but it does create this really exciting opportunity for one of our goals at Agile Lens, which is to give really talented creators and theater artists other revenue streams. We're always looking at ways that you can have a live performance, but then also something on demand that those incredibly talented creatives could sell later. So there's the potential now of being able to say, here's the best of what we can do live. And we're playing with technologies like Move AI and Captury to see what kind of markerless mocap solutions are out there. But then there's the idea that we could have all that live data recorded and then start to process it and get something that feels even more realistic. The fact that I can stick my head into a MetaHuman's mouth right now and see everything the tongue is doing is incredible. And I think that's really going to revolutionize the lifelike connection you have inside of a VR experience. There are great companies we're also working with, like Scatter with Depthkit, and Soar, who are doing wonderful things with volumetric streaming. But MetaHumans still represent a level of, if not accuracy, if you're trying to represent a real person, then a level of detail and nuance and subtlety that is really hard to capture in any other form right now.

[00:26:38.058] Kent Bye: And so what's the input for metahumans? I know they probably have a lot of dials, but can you, say, feed it audio and have it just translate the audio into a performance? Or can you record someone's face from a camera and then get that translated into a performance? And I'm sure there's depth sensor cameras as well. So what are all the different inputs you can put into a metahuman in order to, on the other side, get a virtual human performance?

[00:26:59.363] Alex Coulombe: Yeah, anything and everything. And I'm glad you're mentioning that, because one of the other things we're exploring right now is how we start to play with the notion of liveness: recorded versus live versus something that has a little bit more of an AI or NPC quality. So we have an experience that's running right now in partnership with Infinite Reality in London, at the tech week over there. And this is an experience that has a MetaHuman that's been trained on almost like a chatbot kind of system. In a very quick, fairly natural way, you can talk to this MetaHuman. It's like a little car showroom with Vodafone, and you can say, hey, you know, beautiful car, what can you tell me about it? And they'll respond and their mouth will move and it feels pretty good. And then you can say things like, can I see the car in blue? And the car will actually change to blue, because you can hook up all these input events. So there's the idea that you could start to train characters that actually feel pretty good, and you could start to get closer to something like maybe a Sleep No More, where rather than a virtual reality theater experience that's only happening in one spot, you could start to fill up this world. At any given moment, maybe there are five live actors, and they're distributed or they're all on a central stage, but then there can still be these other characters, in various states of being prerecorded or live AI, that are also sprinkled throughout. I think that just starts to go a long way toward telling a more full and complete story that feels like a real built-out world.

[00:28:15.751] Kent Bye: So it sounds like it's bi-directional natural language processing, where you have the customer speak, that gets converted from speech to text and then goes into the AI large language model or whatever system they have, which can either drive an action, like changing the color of the car, or give back information, which you would then have to run through text-to-speech and do that speech synthesis. So you're taking the speech synthesis output and driving it into Epic's virtual human to give a live performance, is what it sounds like.

[00:28:40.856] Alex Coulombe: Yeah, and whatever's happening, the main plug-in at the core of this is something called Inworld AI. There's very natural body movement. It just feels like you and I talking right now, where it passes the smell test of, that's the way a human would probably behave if they were telling me this information.
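Kent's summary above is essentially the whole loop. As a hedged sketch of that pipeline (every function here is a hypothetical placeholder, not Inworld's actual API):

```typescript
// Conceptual pipeline for the conversational MetaHuman described above:
// speech-to-text -> language model / intent handling -> action or reply ->
// text-to-speech -> drive the character's lip sync and animation.
// Every declared function below is a hypothetical placeholder for a real service.

declare function speechToText(audio: ArrayBuffer): Promise<string>;
declare function queryCharacterBrain(utterance: string): Promise<{ reply: string; action?: string }>;
declare function textToSpeech(text: string): Promise<ArrayBuffer>;
declare function driveMetaHuman(speechAudio: ArrayBuffer, reply: string): Promise<void>;
declare function triggerSceneEvent(action: string): void; // e.g. "set_car_color:blue"

async function handleVisitorUtterance(micAudio: ArrayBuffer): Promise<void> {
  const text = await speechToText(micAudio);                 // "Can I see the car in blue?"
  const { reply, action } = await queryCharacterBrain(text); // constrained to the showroom domain
  if (action) triggerSceneEvent(action);                     // change the car's material, etc.
  const voice = await textToSpeech(reply);
  await driveMetaHuman(voice, reply);                        // lip sync and gesture from the audio
}
```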

[00:28:56.724] Kent Bye: Yeah, I just saw a demo from Niantic called Wol and talked to Keiichi Matsuda. That was using Inworld AI, and I actually had a chance to talk to them at Augmented World Expo, because, yeah, it seems to have a way of constraining knowledge so that you have this real natural language input, but it doesn't go off the rails like a lot of large language models. So I don't know if that's the same system that you're using there.

[00:29:17.102] Alex Coulombe: Yeah, I think so. And just shout out to Keiichi. I actually reached out to him way back in 2009 when I was doing my architectural thesis because of his early work with hyperreality and augmented reality. And we've stayed in touch ever since. Really excited about everything they're doing at Liquid City.

[00:29:31.584] Kent Bye: Well, you alluded to things like virtual production, and I know that you've also been involved in, say, the RealTime Conference, which has a lot of folks from the Hollywood industry, from film and video production, but also immersive productions and probably even theatrical productions. Maybe you could talk about the RealTime Conference and your involvement with that, and how it ties into all the stuff that you're doing, and into things like virtual production.

[00:29:54.975] Alex Coulombe: Sure. So yeah, the RealTime Economic Summit was a recent event hosted here in New York City at the Museum of the Moving Image. It was hosted by Jean-Michel Blottière, who I've known ever since Games for Change a few years ago, when he said he wanted to assemble a conference that would bring all these different industries together to look at how real-time technology could influence them, because I think what he identified correctly was that whether you're working in automotive or architecture or entertainment, there are a lot of similar problems we're all facing. And if we can help each other, it's very much a rising-tide-lifts-all-boats philosophy. So at the Museum of the Moving Image, in addition to your standard short talks from NVIDIA and Epic Games, among others, there were also these really quite wonderful working sessions where we would sit down in small groups, like 20 to 30 people, for anywhere from three to four hours, and we'd have some slides. I led a working session with Chris Nichols from Chaos and Samantha Anderson from Epic Games, and we talked about addressing the luxury market. There were other ones on open standards and interoperability, and on where real-time technology will be going over the next five years. But it was all rooted in this very practical framing: these are tools we're trying to use to solve real-world problems. Where are some examples and case studies of where things have been solved that we can all learn from? And what can we do now to push all that forward? So I know he plans on doing that conference again next May. The RealTime Conference is another, easier-to-access virtual event that I know will continue to happen. So I do recommend that anyone who's interested in virtual reality, or any kind of real-time technology, definitely check those out.

[00:31:27.318] Kent Bye: Yeah, and I'd love to hear a little bit of a trip report on the Noah Nelson and Catherine Hughes event. They have Everything Immersive as well as No Proscenium, and they had a whole Next Stage conference that was happening right at the end of AWE and right before the Apple Vision Pro announcement that happened a week ago. So, yeah, I wanted to be there, of course, but there was so much travel I was doing that I wasn't able to fit it in, especially coming here to Tribeca now. But I'd love to hear some of your experiences, what you were talking about, and what were some of the hot topics or talks that were really striking to you.

[00:31:58.041] Alex Coulombe: Yeah, this was an experience that I heard someone describe, I think very accurately, as both intoxicating and sobering, because it's something that a lot of us have wanted for a number of years. Noah and Catherine and a lot of the No Proscenium team started to plan this way back in 2019, and it was supposed to happen in March 2020. Kevin Leibson on the Agile Lens team and I were going to be there in 2020 to talk about our Magic Leap AR ghost-dating experience called Ghosted. It was a totally different time. Things were so innocent then. And then it got canceled because of the pandemic. Noah tried to make it happen again, I think last year, and it got canceled again. And now it finally happened, and most of the people there in 2023 were the same original people who wanted to be there in 2020. Honestly, what I found so refreshing about it was that it covered a lot of the issues that I've tried to push to the forefront of people's minds at conferences like TCG and USITT and Opera America, where it's like, we have all these tools, we have all this incredible technology, how can this help the theater industry not only survive, but thrive? There's a very big focus on immersive at Next Stage, and it was just a great gathering of creatives of all types, anyone from a playwright to a costume designer to people making escape rooms. Meow Wolf had an incredible presence there and kind of gave a performance on the stage, with their speaker freezing and everything and needing to be revived. And it was just a bunch of theater kids having a great time talking about what's going well, where the challenges are, and how we can help each other. There were what were called salons, which were a lot like the working sessions at the RealTime Economic Summit, where we'd gather in a group and go through some different prompts that I think Noah had designed, and we'd again try to work out what's going well, what isn't, and what we can do to help. And I really do feel like a lot of us came away from that conference feeling a sense of hope and a sense of invigoration, and that we're not alone and that we do want to help each other. And so I certainly tried to put myself forward for anyone who has some really exciting ideas that maybe aren't going to be as practical to do in the real immersive world. I've got a bunch of calls lined up over the next couple of weeks to discuss how some people there could start to take their ideas and use the tools that we're developing at Agile Lens to bring that vision to life in other ways. I'm also just very proud of Noah and everything he's put together at No Proscenium. And I just found out right before coming here that he is going to get to see Galactic Starcruiser, which I know he's wanted to see for years, and it's closing in a few months. So I'm really happy for him on that front.

[00:34:19.665] Kent Bye: I am really happy for him, too. Yeah, I know that's a really expensive hotel that was started by Disney, and the price point just ended up being way out of range for most people who would want to go there. So glad to hear that. Yeah, so it sounds like there are a lot of these intersections of different disciplines that Agile Lens is right at the heart of. And I think another hot topic that has come up again and again for the last couple of months, I've probably done at least a couple dozen interviews about artificial intelligence, machine learning, and generative AI, and you have the natural language processing of the virtual beings we alluded to a little bit with Inworld AI. But I'd love to hear if there have been any other explorations that you've been doing with the potentials of generative AI in the context of architecture, but also theater, with the overlap. So yeah, I'd love to hear if you've been tinkering around with all the different exciting AI tools that are out there these days.

[00:35:11.505] Alex Coulombe: Yeah, probably nothing that will be totally unfamiliar to your listeners. We've played with Midjourney and ChatGPT. We've tried to get different prompts that way. We've tried connecting ChatGPT to a MetaHuman. Right now, of course, the delay it takes to actually get a response is a little bit untenable. But we really are interested in thinking about how we can build out worlds and stories, and something that maybe feels much more open than it actually is. I think we quite like the idea of having narrative spines that will take you through a story, but then something that feels very believable as you navigate around that spine, and certainly a lot of these tools coming out are very exciting on that front. On the architecture side, that's also been quite exciting, and I've had many conversations, as you have, with Andreea Ion Cojocaru. We spoke on a panel on Roosevelt Island not too long ago about some of the work that's going on on that front, and some of the tools that are coming out in Unity, in Unreal, in other places, some of the Web3 platforms as well, that start to allow us to think about treating worlds either as something totally separate, their own universe, or something that is a little bit more like Somnium Space, for example, where it is contextual and the real estate there is going to be based on, oh, there's another building that's 60 feet in front of me and it's completely blocking my view to the virtual water. So there have always been these fun, exciting conversations, with Jessica Outlaw as well, about how you start to take architectural psychology and the rules and suggestions and guidelines that work in the real world and translate them in a meaningful way into the virtual world, without being totally skeuomorphic, because there are so many opportunities to do things in the virtual world that would never work in the real world. But you need to have a bit of a conceptual ladder there, easing people into the possibilities of what can be done. So certainly when we do our theater productions, that's always a great opportunity to play with what's possible and what isn't. We did a production here in New York City called The Orchard, Off-Broadway last year, starring Mikhail Baryshnikov and Jessica Hecht. It was a hybrid production that took place both at the Baryshnikov Arts Center and virtually. And unlike A Christmas Carol, this was a one-to-one experience where the virtual and the physical audience were having the same experience. But we had a digital twin of the Baryshnikov Arts Center, and as part of the virtual experience, we started from this notion of, here's this nice, brutalist piece of architecture, but then within the virtual version of the experience, we got to do all sorts of things that you would never do in the physical space, like making the walls all move around, or having a kaleidoscope effect, or allowing you to lay Baryshnikov on the table and play a game of Operation on his chest. So, you know, in many ways these start to feel a little bit like mini-games of what's possible with space and interaction within these separate worlds. But another really exciting challenge has been how you start to bridge the gap between traditional theater audiences, who expect something more passive, and, let's call it, the Fortnite generation, who want things that are much more interactive and are going to let them run around all the time.
My kids have done a lot of virtual reality theater, and it is totally their expectation that anytime they go to an immersive theater show now, they are going to be able to click a thousand things and run around at 100 miles an hour and check in with Deirdre Lyons or whoever, maybe for five seconds every five minutes to be like, oh, wait, what's actually happening? So the expectations, I think, especially for the younger generation of what digital entertainment looks like, even in an art form like theater, is constantly evolving now.

[00:38:20.205] Kent Bye: Yeah, one quick philosophical note and inspiration from conversations I've had with Andrea Cotico is this fighting against the bifurcation of the virtual and the real, and really focusing on the virtual and the physical. Because, as David Chalmers has argued in Reality+, virtual experiences are genuine experiences. So anyway, I just wanted to throw that out; I tend to prefer using physical rather than real. But aside from that, you mentioned that there are these generative tools coming into both Unreal Engine and Unity, and I'm sure all these other ways that people are integrating them through the various APIs and whatnot. But do you imagine that some of the stuff you're looking at would be more from an authoring perspective, for the things that you're creating, or from the users being able to be within the context and manifest different 3D objects, bringing the generative tools to the actual users so they're able to generate different scenes or modulations of these experiences? So yeah, I'd love to hear some of the nuances of some of the generative AI tools within Unity and Unreal Engine, since it sounds like Agile Lens is on the frontiers of all these tools, and how you expect to see some of these different possibilities with generative AI start to get integrated into toolsets like Unreal Engine or Unity.

[00:39:30.154] Alex Coulombe: Yeah, and it really does all go back to the way that I remember initially positioning virtual reality, for example, to our clients in architecture 10 years ago, where you want to tread a fine line with the agency you'll give someone else, and I mean this both in the sense of clients who are commissioning architecture and also a theater audience. So user-generated content starts to become so easy to create with generative AI tools, but a lot of it, at least for the foreseeable future, is going to be pretty bad. And so how do you make sure that what is being generated, and again, whether it's from an audience member or someone who's crafting a piece of architecture or crafting a theater experience, is meaningful and adding something to the experience and not just a gimmick? So starting to set up certain parameters there is kind of interesting. Before the AI craze started to happen, some of our pixel streaming experiments definitely involved trying to figure out how to make it as easy as possible for someone to generate their own theater experience. Like, let's start with creating a world. Here are some templates. Here are some things you can do to it with very light AI in there. And then, here are some characters you can start to bring in, and now you can start to set up cues for set changes and light changes. There's still a framework there that I think would actually go quite well with some of these AI tools, allowing especially someone who is very creative but not necessarily a quote-unquote artist to start to do some exciting things. And I've talked to a lot of people lately about the idea of creation versus curation. Someone who has very good taste might be able to use AI tools in a very meaningful way to craft very good things, because they can look at the four images that Midjourney comes up with and they have a sense of which of those is the best direction to go in, in a way that someone else might not. And I think there's some interesting semantic discussion about whether that person is an artist if they have very good taste but can't actually generate that art themselves. But we're still in the early days of all this, and we're just having a lot of fun playing with it all and testing it and seeing where it leads us.

[00:41:18.246] Kent Bye: It reminds me of the clip with Anderson Cooper and Rick Rubin, who talks about how his job as a producer is to have a good, refined sense of taste, know what he likes, and have strong convictions, and that artists appreciate his opinions on these things. So, yeah, it's possible to be a producer or curator, as you're saying. And I don't know, I think we'll see how these semantic debates continue to evolve around prompt engineering and to what degree that is an artistic expression. I think it's probably likely that it is, but we'll leave that to the larger public to settle out these nuances; I don't think it's anything that any one individual can decide. Yeah, I'm curious, there's so much that's going on, and we were able to cover quite a lot of the stuff that you've been working on. I'm sure there are a lot of projects that you're not at liberty to talk about, but I'm wondering if you could maybe talk generally about what are some of the technological directions that get you really excited about where the future is going and that are going to enable the type of experiences that you want to have.

[00:42:13.314] Alex Coulombe: Yeah, I constantly find myself thinking about a book called The Invention of Morel, the short version being that there's a guy stranded on an island, and he gets excited because he realizes he's not alone and sees there's a bunch of people there, and then over time he realizes that none of those people are alive, and it's basically like a volumetric playback machine of a week in these people's lives, playing back over and over again. There's something about the connection you can feel with something pre-recorded, combined with the power of seeing something live. I often talk about my time studying abroad in London in 2008, and the fact that I had a professor who got us into the first or second row of so many incredible shows across the West End and at the National Theatre was mind-blowing for me, and I'm always looking at ways to democratize that. I think that everything is actually moving in a great direction toward allowing people to have these very rarefied, cathartic, theatrical experiences now. Look at volumetric capture, the quality it's increasing at, the fact that it is becoming more alive. That's amazing if you want to get someone to actually look like what they look like and capture all the movement of their clothes. The direction that MetaHumans are going in terms of how lifelike you can make something, with these levels of detail where you can see all the subtle little micro expressions on their face. Motion capture is going in an incredible direction. I'm so excited to pretty soon, I think, not have to ask actors to put on a motion capture suit anymore, and instead be like, we're going to set up five iPhones and get everything we need from that. There's the direction that AI is going, and the direction that some of the tools are going for being able to very smoothly transition during a show between something prerecorded and something live, which is quite nice. We've already actually done that for Ari in A Christmas Carol, where for some of the more emotionally taxing scenes that he doesn't want to do over and over, we can actually just have him introduce the scene, and then very smoothly slide into the pre-recording of him doing that, and then go back into being live. And we've had interesting debates about whether that's unfair to the audience, that they think it's live but it isn't. But there's still a sense of stagecraft and showmanship to the whole thing. So I'm really excited about how all that is going, and about just starting to combine these tools in different ways. One experience we're working on that I can talk about: right after South by Southwest, we went to Tennessee to capture this production that the Gateway Chamber Orchestra did called La Pasión según San Marcos, which is kind of the Passion story told from St. Mark's perspective. And this was one of the most incredible live chamber orchestra pieces I've ever seen in my life. And maybe 200 people saw it, you know, in the middle of Tennessee. We were able to capture the show in 360, in stereo 180, and volumetrically using Depthkit. We captured Move AI moments and we have motion capture data.
We now have two terabytes of what I would call the essence of this incredible production, and now, through the magic of these technologies and VR and doing this in pixel streaming and on a regular computer, we can start to evangelize this work and spread it through education, with students seeing this and learning about some of the Latin roots of the music and the percussion instruments and how the piece evolved. Golijov, who composed the piece, is amazing, and he's on board with making all this work. And I am just so excited by the number of tools that are available to creators now and the ways they can more and more start to influence each other and work with each other, all in the spirit of making something that is going to touch people's hearts and move them. Because this is a 90-minute show, for example, that we are going to be boiling down into a 10-to-15-minute festival piece. And the fact that we not only have to do that, but have to decide which mediums we'll bring it into, is a really exciting challenge. And it feels like there isn't a great template for that yet. So to be figuring all that out is wonderful.

[00:45:45.392] Kent Bye: Yeah, that sounds really quite amazing. Definitely looking forward to seeing that. And in terms of the audio aspect, did you capture a lot of mono channels of the performance or any ambisonics? I'd love to hear what you were doing in terms of the audio capture for that performance.

[00:45:59.713] Alex Coulombe: Yeah, so this is one of those interesting debates where, for the most part, when we work with people who come from a traditional theater background, they make the transition into virtual reality very well. When we were doing Christmas Carol, for example, it was wonderful having our director, Robert Barry Fleming, sit down at his home in Tennessee, put on a VR headset, and watch Ari Tarr over in Oregon doing a rehearsal. And for him, it felt a lot like being in a black box during the early days of rehearsal. One thing that we've had a lot of debates about is audio. So when it comes to audio, especially for something like a chamber orchestra piece, they want the audio to be as crisp and clean and mixed as perfectly as possible. And I understand that, especially speaking to a conductor as talented as Gregory, who ran this piece. The tricky thing is helping them understand that it's not like you're going to have people sitting down and just watching the piece in VR. In that case, fine, a perfect audio mix is great. But we want people to be able to walk around the stage. And so the short answer to your question is there are audio tracks of all the mics that were around the stage. We have not had success yet in convincing anyone to give us all those audio tracks, because there is a concern that someone might be standing at a place on the stage where they're going to hear a flute player flub a note or something like that. And to us, that is all part of the sense of liveness. And spatial audio, as we both know, is so powerful. But at the moment, we just have a perfectly crisp, clean mixdown of the entire show, and we'd probably just make it a 3D audio source that lives in the middle of the stage, so there's a little bit of panning. But to do that in a proper binaural way with spatial audio would definitely be the dream.
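[Editor's note: to make the trade-off Alex describes concrete, here is a rough C++ sketch contrasting the two approaches: a single mixdown treated as one 3D source at center stage versus individual mic stems spatialized at their own positions. The attenuation curve, stage positions, and stereo-pan math are illustrative assumptions, not the project's actual audio pipeline.]

```cpp
// Hypothetical sketch: simple distance attenuation and stereo panning for
// stage audio sources, contrasting one centered mixdown with per-mic stems.
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

struct AudioSource {
    Vec3 position;   // where the source sits on the virtual stage
    float gain;      // source loudness before spatialization
};

struct StereoGain { float left, right; };

// Inverse-distance attenuation with a minimum distance so the gain does not
// blow up when the listener walks right through a source.
float Attenuate(const Vec3& listener, const Vec3& source, float minDistance) {
    const float dx = source.x - listener.x;
    const float dy = source.y - listener.y;
    const float dz = source.z - listener.z;
    const float d = std::sqrt(dx * dx + dy * dy + dz * dz);
    return minDistance / std::max(d, minDistance);
}

// Very crude constant-power pan based on the source's left/right offset.
StereoGain Pan(const Vec3& listener, const Vec3& source, float gain) {
    const float offset = source.x - listener.x;               // +x = to the right
    const float pan = std::tanh(offset * 0.2f);                // squash into [-1, 1]
    const float angle = (pan + 1.0f) * 0.25f * 3.14159265f;    // map to [0, pi/2]
    return { gain * std::cos(angle), gain * std::sin(angle) };
}

// Option A: one perfectly mixed track placed at center stage.
StereoGain CenteredMixdown(const Vec3& listener) {
    const AudioSource mix{ {0.0f, 0.0f, 0.0f}, 1.0f };
    const float g = Attenuate(listener, mix.position, 1.0f) * mix.gain;
    return Pan(listener, mix.position, g);
}

// Option B: every mic stem spatialized at its own stage position.
StereoGain PerMicStems(const Vec3& listener, const std::vector<AudioSource>& stems) {
    StereoGain total{ 0.0f, 0.0f };
    for (const AudioSource& s : stems) {
        const float g = Attenuate(listener, s.position, 1.0f) * s.gain;
        const StereoGain sg = Pan(listener, s.position, g);
        total.left += sg.left;
        total.right += sg.right;
    }
    return total;
}
```

[With Option B, a listener standing next to the flute hears the flute dominate, which is exactly the liveness Alex argues for; a proper binaural renderer or Ambisonics decoder would replace the crude pan function here.]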

[00:47:33.393] Kent Bye: Well, if they end up listening to this, I'm just going to advocate as an XR journalist that I would really like to see a spatialized audio version of this. I think it's really important to give that extra dimension of spatialized audio and not have just the visuals. The visuals are great, but I think the real power of that will be to actually move around and experience it as if you were there, and be able to do things you wouldn't be able to do if you were there, like walk around the stage. And so you get a whole unique type of audio experience that would be impossible in physical reality. So yeah, if you're listening, pass over the audio.

[00:48:04.571] Alex Coulombe: We call that the Kent Bye bump.

[00:48:06.612] Kent Bye: All right. Well, I'd love to hear from you what you think the ultimate potential of XR, spatial computing, and the future of immersive storytelling might be and what it might be able to enable.

[00:48:18.496] Alex Coulombe: For me, it's the power to change someone's value system. I don't want to change minds, and I don't want to play into the slightly tired notion of VR as empathy machine. But I just think back to, in particular, live, spatial, in-person experiences that completely rocked my world, and being able to bring that kind of experience to people from around the world. And as the prices of these headsets and these devices start to go down, as more people become familiar with them, I think there's just tremendous opportunity for giving people experiences that couldn't happen any other way. And, you know, it's kind of about making people feel things that they never thought they'd feel and bringing them experiences that they otherwise would never be able to access. That for me is still the driving core of everything I do.

[00:49:03.824] Kent Bye: And is there anything else that's left unsaid that you'd like to say to the broader immersive community?

[00:49:08.450] Alex Coulombe: I just want to give a quick shout out to something that happened a couple weeks ago, which to me was really indicative of the kind of community we have in the world of VR. And I'll try to give a short version of the story. I'm not really the person to tell it, but I just want it to be out there as much as possible. Someone had posted on Twitter that there was a four-year-old girl who was dying of cancer in the UK. She was about to have her birthday, and she really wanted to become a mermaid. She just wanted to have a VR experience that could do that. And this was something that a lot of us found out about on a Saturday. Her birthday party was going to be on a Monday. This was run by a company called VR Therapies, which is a wonderful, wonderful company. And so there were folks, my previous colleagues at Onboard XR, for example, who put together this beautiful mermaid experience using Mozilla Hubs. A bunch of folks from Agile Lens came together over the weekend, and we built this Unreal Engine experience. And Kevin Leibson got a bunch of his actor friends to record all this dialogue wishing Zainab a happy birthday and making her just feel as immersed as possible in this experience of being a mermaid. We set up games. We set up all these different interactions. We made it work standalone on Quest as well as on desktop. Once everyone had built these experiences, I think there were five or six of them in all, there were things like swimming with dolphins and just all sorts of things toward embodiment and feeling like a mermaid, all delivered basically in under 48 hours for that birthday party on Monday. To see what people created, no one being paid for anything, totally out of the goodness of their hearts, to make this beautiful little girl become a mermaid for a day and to leave behind these experiences that her brother and her family could continue to have: that, for me, is kind of the height of what this industry can do, and I loved every moment of it, and I'm so proud of everyone who was involved with it.

[00:50:47.829] Kent Bye: Yeah, that's a really beautiful story, and thanks for sharing it, and thanks for helping catalyze it. I know there were a number of different tweets and retweets that went out, and there's a whole dispatch that got written up as well, kind of giving a report of how that went, so yeah, really quite beautiful. Well, Alex, thanks again for taking the time. You've got your fingers in a lot of different parts of the XR industry and are really tying a lot of things together in this truly interdisciplinary way. So I just really appreciate you taking the time to give a little bit of your insights into all the different things that are happening, both at a low technical level and at the highest level of what this is all going to be and where it's all going. So, yeah, thanks again for taking the time to help unpack it all.

[00:51:28.479] Alex Coulombe: Thank you, Kent, for continuing to be a beacon for all of us and inspiring me, in particular, to try twice now to have my own podcast. I did it with Xrdad a few years ago. And the one that is ongoing right now is called The Unofficial Unreal Engine Podcast with Jacob Feldman from CoreWeave. And we have a lot of fun basically trying to emulate you and have good discussions about the ultimate potential of Unreal Engine. Awesome.

[00:51:52.059] Kent Bye: Thanks again for listening to the Voices of VR podcast, and I would like to invite you to join me on my Patreon. I've been doing the Voices of VR for over 10 years, and it's always been a little bit more of like a weird art project. I think of myself as like a knowledge artist, so I'm much more of an artist than a business person. But at the end of the day, I need to make this more of a sustainable venture. Just $5 or $10 a month would make a really big difference. I'm trying to reach $2,000 a month or $3,000 a month right now. I'm at $1,000 a month, which means that's my primary income. And I just need to get it to a sustainable level just to even continue this oral history art project that I've been doing for the last decade. And if you find value in it, then please do consider joining me on the Patreon at patreon.com slash voices of VR. Thanks for listening.
