#972: Augmenting Virtual Reality with Aardvark

Aardvark is an AR platform that lives within the context of virtual reality applications. Here’s how Aardvark is described on its Steam page:

Aardvark is a new kind of web browser that allows users to bring multiple interactive, 3D “gadgets” into any SteamVR app. It extends the open platform of the web into VR and lets anyone build a gadget and share it with the community. Aardvark gadgets are inherently multi-user so it is easy to collaborate at the Aardvark layer with the people you are in a VR experience with.

Aardvark was originally announced by Joe Ludwig on March 19, 2020, and had its first early-access release on Steam on December 19, 2020. I did a Pluto VR demo in December that integrated their telepresence app with Aardvark AR gadgets and Metachromium WebXR overlays, and I got a taste of how multiple applications will start to interact with each other within a spatial computing environment.

Aardvark and Metachromium are both overlaying objects and layers on top of virtual environments, but they are taking different approaches. Metachromium uses WebXR depth maps to composite the pixels on top of the existing virtual environments. Aardvark is tracking your head and hand poses, and attaching declarative web app objects to these reference points or the room center.

Ludwig describes Aardvark as his white paper for why he thinks his approach could be easier to scale in the long run. Metachromium runs WebXR apps at the display’s framerate, which carries a lot more overhead. In Aardvark, only the core app runs at framerate, while each gadget is a declarative web application built with the React framework that only runs JavaScript when the user takes an action. Ludwig is skeptical that JavaScript will be able to run within a 90 to 120 Hz render loop on top of pushing more and more pixels to displays in VR apps that are already pushing GPUs and CPUs to their limits, and Aardvark gadgets reflect this design philosophy.
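To make that contrast concrete, here is a minimal, self-contained TypeScript sketch of the event-driven model described above. All names here are hypothetical illustrations, not Aardvark's actual API: the gadget's script runs only when the user acts, while a compositor-style loop re-renders every frame from the current declaration.

```typescript
// Hypothetical sketch (not Aardvark's real API): gadget JavaScript runs
// only on user events, while a native-style compositor re-renders every
// frame from whatever the gadget has declared.

type Pose = { x: number; y: number; z: number };

// The gadget *declares* what to draw and where it is anchored.
interface Declaration {
  model: string;                     // e.g. a glTF model URL
  parent: "hand" | "head" | "room";  // anchor reference point
  color: string;
}

let declaration: Declaration = { model: "marker.glb", parent: "hand", color: "red" };
let scriptRuns = 0;

// Gadget script: runs only when the user does something, a few times a second.
function onUserPressedColorButton(color: string): void {
  scriptRuns++;
  declaration = { ...declaration, color };
}

// Compositor: runs at 90+ Hz in native code; it reads the declaration
// but invokes no gadget JavaScript.
function composeFrame(handPose: Pose): { worldPose: Pose; color: string } {
  return { worldPose: handPose, color: declaration.color };
}

// Simulate one second at 90 Hz with a single user interaction.
for (let frame = 0; frame < 90; frame++) {
  if (frame === 45) onUserPressedColorButton("blue");
  composeFrame({ x: frame * 0.001, y: 1.2, z: 0 });
}
console.log(scriptRuns); // 1: the gadget script ran once, not 90 times
```

Over the simulated second, the compositor runs 90 times while the gadget's JavaScript runs exactly once, which is the scaling argument in a nutshell.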

I had a chance to catch up with Aardvark creator Joe Ludwig on January 12, 2021 to get some more context on Aardvark, how it started, where it’s at now, and where it’s going in the future. Ludwig is still in the early phases of getting all of the component parts in place in order to bootstrap this new platform.


It’s still early days in fleshing out the flywheel of this communication medium’s feedback loop, but the potential is pretty significant. Ludwig says that Aardvark could be used to prototype the user interface design and functionality of augmented reality applications within the context of a virtual reality app.

There’s still a lot of missing information to fully manifest this vision, especially in not having any equivalent of a virtual positioning system to get the X, Y, & Z coordinates of a virtual world instance and its specific map and conditional states. Ludwig expects that this may eventually be provided through an OpenXR extension, but for now these AR gadgets will need to exist relative to the head or hand poses, or localized to the center of your play space.

When Aardvark was first started, Ludwig conceived of it as an overlay layer. And so it’s been surprising to him to discover that there’s been a lot of work in trying to get these spatialized gadgets to communicate with other gadgets, especially within a multiplayer context. The early experiments show the power and potential of a multiple-application AR ecosystem, but there isn’t a single killer app or utility that’s tightly focused on a specific use case or context. This leaves a lot of room for exploration and discovery starting with a backlog of ideas, but without a lot of clear direction as to what will be compelling or build momentum within a specific community.


This is a listener-supported podcast through the Voices of VR Patreon.

Music: Fatality

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to the Voices of VR Podcast. So back in December, I had a chance to do a demo from PlutoVR, which is a telepresence social app that is running in this multi-app ecosystem where you're able to do it as an overlay onto other applications. And in that demo, they pulled in different applications from Aardvark as well as applications from Metachromium. And so Metachromium and Aardvark are two ways to bring in different layers of augmentation into virtual environments. And so starting to think about how to do rapid prototyping of augmented reality and user interface design and different functions and features that might be useful within the context of a virtual environment, you can start to use something like either Metachromium or Aardvark. Today I'm going to be doing a deep dive into Aardvark with the creator Joe Ludwig, who happens to be working at Valve, although this is not officially connected to Valve in any way whatsoever. But I do think it's important because Joe has been working on things like SteamVR and is very familiar with OpenXR. His familiarity with all these different systems gave him the idea to actually create this additional layer of using web technologies like React to create these declarative augmented reality gadgets and widgets that are either attached to limbs of your body or into the room, that have these augmentation layers within virtual reality. And we talk about the evolution of that, where he's at and where he's going. It's still the very beginning of this. So it's like a new distribution platform that he's starting to get bootstrapped. And so if you are intrigued by this concept, then this is a great time to start to jump in and start to experiment and see what's even possible. So that's what we're covering on today's episode of the Voices of VR podcast. So this interview with Joe happened on Tuesday, January 12th, 2021.
So with that, let's go ahead and dive right in.

[00:01:54.672] Joe Ludwig: My name's Joe Ludwig. I've been working on VR for some years now; I guess I just passed eight years. And I've worked on a bunch of stuff: SteamVR, OpenXR, various other things. And what I've been working on recently in my spare time is something called Aardvark, which is a platform that allows overlay app developers to build apps that cooperate with each other a little better than the raw SteamVR overlays can, and allows them to mix together their visuals as well as their input in a way that is much more similar to a web browser than to a traditional downloaded app.

[00:02:31.545] Kent Bye: Yeah, maybe you could go back to the moment when you had the idea or the insight to be able to create this. Like, what was the spark for this Aardvark idea?

[00:02:40.709] Joe Ludwig: It came about in multiple stages. So the first part was, we needed to do some UI, and the UI needed to be spatial to some degree, and we'd been doing some work with React. It just came to me in the middle of the night one night, and I woke up and it wouldn't let me go back to sleep. It occurred to me that the scene graph that we needed to generate to do anything spatial with UI was somewhat similar to the component graph that would be generated in a React app, because I'd been doing some React development. So the first notion was to try to mix those two, basically. Other things that happened were kind of along that line; React 360 is in a similar vein. I don't know if React 360 is still around, but at that time, it was not really available for development. So I started working on that. And sometime after that, I started trying to bring up an implementation of that that I could share with everybody, an open source version. And that's what turned into Aardvark. Initially, I was thinking primarily about the scene graph and the visuals. It turned out that the interactivity is at least as large a component of Aardvark as the visuals are. And the other thing that sort of came in later was the notion that every gadget is a tab, so that they're isolated from each other, and they don't get in each other's way, and they don't have quite as much access to the system as an installed app would. So a little bit of sandboxing and protection of the user by isolating the apps in that way, as well as the other things you get from being a web browser, which is automatic updates of apps, synchronization between the apps and the servers that they come from, that sort of thing.

[00:04:17.227] Kent Bye: Yeah, I had a chance to do a demo with Pluto last December and publish an interview with them. And they were using a combination of Metachromium, Aardvark, and their own overlays. I know that there's OpenXR and then there's all these proposed standards for an overlay layer. So as an example, when you're in SteamVR and you push the menu button, then you have the menu that comes up. I imagine that there's a similar mechanism for having things show up in this overlay layer, but yet it's not always just an overlay because you said it's not layers, it's actually more like gadgets. So I'm trying to get a sense of the metaphors of how to make sense of this, because I understand a lot of things from, say, the 2D realm with the DOM and putting stuff on top of objects already, but in the scene graph, it's 3D. And then it's not necessarily like a plane. It's more of like, you're able to put these objects within a scene graph of its own, or like you're stacking them on top of each other. I'm just wondering how you start to make sense of the metaphors for what we know from the 2D and how that gets translated over into the 3D.

[00:05:23.359] Joe Ludwig: So there was a lot in there. One piece of that that I'd like to come back to, because now is not the right time to dig into it, is the difference between the approach that Metachromium and WebXR take versus the approach that Aardvark takes. But I think the answer to your question really is we don't know yet. We've had decades of WIMP interfaces, windows, icons, menus, pointers, which is the standard desktop metaphor. We've had 12, 13 years of mobile interfaces that are very much WIMP derived. They still have menus that pop up; you tap things with your finger instead of clicking on them, but they're definitely along the same lines. They're a little bit more tactile in that you slide things around with your finger instead of clicking, and they're less abstract in that way. Spatial UIs are something different, and we haven't spent nearly as much time with them. So I don't know that we really know what the answer is to exactly what the metaphors are. I have tried to make Aardvark as flexible as possible. So in Aardvark, you can grab things and move them around. But within a gadget, you can also just attach things to the user's left hand that they can't move around, or that they move around in some custom way. And those things are not hard-coded. So I suspect that as gadgets are developed, those metaphors will change. One thing that I've tried to do, I guess a few things I've tried to do, is get gadgets to be able to do consistent and easy things easily. So grab a thing, move it around, put it at a particular location in space, attached to your hand. Those things are all supported with essentially one line of code. If you want to do something a little more complex than that, I'd like those things to be possible, but because I don't know what they are, I can't really make them easy. So sometimes as you get further from that gadget approach, it does get a little bit more complex to actually implement them.
But all those things should be possible because of the sort of general create-a-scene-graph, create-an-interaction-graph approach that Aardvark takes. So I suspect that some of these metaphors we do know, like you grab a thing that acts as a tool, you move it around, and you use it as a tool. I suspect that that will translate more from the physical world than it does from the desktop interfaces, because you don't really pick something up with your mouse and use it as a tool on a desktop interface. Other things I think will come over from the desktop interface, like buttons. In Aardvark, as it stands at the moment, you push a button with your finger, it does a thing. What that thing does is up to the gadget, but it's essentially the same as a button on the screen would be. And there are other metaphors that I think are more unexplored, like gestures, for instance. One of the things that came out of the hackathon that we had back in September was this notion that if you and I do the equivalent of a high five in a shared VR space, then we have enough information to synchronize the coordinate system of your space and my space, and then we can share a spatial environment together. Basically we share a gadget layer inside of Aardvark where you can see my gadgets, I can see your gadgets, and we can interact with them together. And the gesture to get into that in the simple room gadget that exists today in Aardvark is we put our hands together in space and we pull the triggers. So that's a gesture, right? Gestures aren't really a thing in the desktop metaphor. They're kind of a thing in mobile, in that there are some things like swipe from the right side of the screen, swipe from the bottom of the screen. In some cases, two-finger or three-finger swipes do different things, or pinch-zoom in and out. That's a gesture, clearly. I suspect that gestures will be much more common in spatial interfaces, but I don't think we know what that basic language of gestures is yet.
That's a common theme, at least in how I think about these things. When OpenXR was starting up, there were several established SDKs that were in place that were allowing people to build VR applications. And there were a smaller number, but still even a couple of those SDKs in place, APIs in place that allow people to build AR applications. There are essentially none that are established in that way for this kind of spatial overlay cooperative system. And so we're exploring a lot of new ground here. And I don't think we know what the best practices are yet. It's all very experimental.

[00:09:36.073] Kent Bye: Well, I know with Pluto VR's interface, it uses something that's very similar to like the Steam VR menu interface, where you bring up a menu and it overlays. A lot of the primary interface kind of lives in that overlay, like a 2D plane that uses a lot of that overlay.

[00:09:51.919] Joe Ludwig: So that overlay is literally a SteamVR dashboard overlay. SteamVR provides this mechanism where, in the SteamVR system UI, you can plug in a 2D quad that you can interact with using the laser mouse. And that's what Pluto is doing. What Aardvark is attempting to do is basically make it so that that kind of cooperation between dashboard overlays and the dashboard itself can be extended to a lot more interactions and a lot more objects and gadgets running at the same time.

[00:10:20.559] Kent Bye: And there's also a 2D plane versus like that interface within SteamVR seems to be like constrained to like a 2D interface versus what you are saying with Aardvark is more of like a spatial. It's like a whole scene graph. If you want to have different objects that are in there, it does not limit it to a 2D plane.

[00:10:35.808] Joe Ludwig: Yeah. Multiple models that can be animated. They can be attached to different hands. They can move around as they need to. And there can also be 2D planes, because there's a lot of richness in generating a 2D plane from a webpage. So taking advantage of that I think is pretty valuable. It's not clear how much of the interactions that we do is going to end up being in 2D planes and how much will be in something that's more 3D and more sort of spatially native, but both I think are parts of where we're going to be headed.

[00:11:04.943] Kent Bye: Yeah, one of the things that really blew my mind when I was talking to Pluto VR was having the demo where there's a pen and I'm drawing, but yet they're pulling in a car and the car is actually occluding the pen that I'm seeing. I can't see it because it's actually interfacing and it was like... Oh, well, I never thought about how spatial apps would interface in the same scene graph and how you could have a light from one app and that light is illuminating other things in other apps. Because usually when I'm working on a computer, these apps are within their own context. They're in a 2D frame and they don't really interface with each other in the same type of way as they do in a spatial environment. And so like an example that Jared and Forrest were telling me was that imagine like Photoshop where you have the color picker, but yet the color picker is its own self-contained app that just happens to interface with the other features of your Photoshop. And so you could start to really break down these apps in a way that it's much more modular where you're pulling in component parts. If you don't like the color picker and the default, maybe you pull in your own custom color picker and you have them in there. this concept of spatial computing being these modular gadgets that they're all cooperating with each other in this ecosystem context that starts to break down the normal boundaries that we have in the context of the window. In spatial computing, they're all kind of like interacting with each other.

[00:12:29.041] Joe Ludwig: They potentially could end up interacting with each other. I think we would be in a better place if we end up in a world where they do. A lot of these things have been experimented with on the desktop, like OLE components. I mean, I don't know if that's the correct word, but Microsoft had a technology for embedding components into other apps in the late nineties that ended up not really going anywhere. They did a bunch for their own apps, you can embed an Excel spreadsheet in a Word doc, but that didn't work as well cross-vendor. And one of the reasons for that is just UX complexity. So it's possible that UX complexity will drive this out of spatial also, but I hope it doesn't, because I think it's a rich vein to mine to allow people to make their own experience better in small ways here and there. And the color picker is an example of that. The whiteboard example in Aardvark has a color picker, if you want to call it that, built into it. That color picker is four fixed colors in little cylinders. And if you touch the pen to that color, the pen's color changes. There is nothing in those color pickers that you couldn't put in another gadget. You could write a gadget that is like an artist's palette that has an array of colors on it, and you touch the pen to the right spot on that array of colors and you get an appropriate color from whatever it was you touched, and it would work without any changes to the whiteboard markers in the whiteboard example. So I think that there are a lot of examples of one tool acting on another gadget that would be very compelling in that sense. And color picker is one, keyboard is another, flashlight is potentially interesting. This would break underlying games a lot, but if you had the ability to say, I need stuff to be brighter, so I'm going to put a headlamp on. And maybe it doesn't work in the underlying app because you want the headcrabs in the dark to be scary.
And so you don't allow that. But maybe it does work on the gadgets that are overlaid on the app. And I don't think we know yet what that full set of tools is. And one of the things I'm hoping people will start to explore is building more of those tools that interact with not just the gadget that they've shipped with, but all the other gadgets too.

[00:14:44.099] Kent Bye: Yeah, because you're starting to get into the realm of like mods, where if you have like a flashlight on your head, like you said, and you're like illuminating a scene that should be dark, you're kind of like modding or hacking the existing behaviors. But this starts to get into the realm where I conceptually think of what Aardvark is, is that you're able to put an augmented reality layer on top of an existing like virtual reality game. So you have a virtual space that has a spatial context, but you're able to potentially use OpenXR, WebXR conceits to be able to potentially create these widgets and all these other things that could interface with that scene graph and potentially do like an AR layer on top of that. I don't know if that's how you also think about this. It's like this augmented reality layer on top of virtual environments.

[00:15:31.136] Joe Ludwig: So the reason Aardvark is called Aardvark is, well, for one, because it's a fun word to type and a fun word to say. But in the middle of Aardvark is the letters V-A-R, and Aardvark really is A-R in VR. That's literally the way that I was thinking about it at the very beginning. And the reason for that is that AR I think is very exciting. My model for where things are going to go long-term is that everything that you do where you're sitting in front of a computer and it has 100% of your focus, whether that's a laptop or a desktop or even a tablet, that is going to be a VR thing. If you want 100% focus on a thing, then you might as well black out the world. You might as well have infinite screen space. You might as well have more richness in expressing the third dimension and motion, things like that you get from VR. Everything that you do on your phone, where you're spending some attention on it, but not all your attention on it, which I guess could also be some tablet. I mean, sometimes it could be your laptop if you're in a meeting or whatever. Those will all be AR things. And the monitors will eventually just vanish entirely. Keyboards will probably stick around because they're really good at entering text, but monitors and other display technologies will all be subsumed into AR and VR. But AR is not here yet. VR is here. VR works really well. And all of the metaphors that we need to develop to make AR work long-term are things that can be developed in VR today. And if they're developed over synthetic environments like a game or an empty void or an application that exists primarily to be the background of you doing work with your AR environment, with your AR tools, then we can develop that today. There's absolutely nothing in the way. And it can be high res, it can be fast frame rates, it can be comfortable headsets, good sound, all that stuff is just working today.
So while we're waiting for AR displays to catch up to where they need to be for people to actually use them all day, we can do a lot of exploration of exactly what AR needs to do in VR. And at the same time, we'll make VR much more capable because we'll have the ability to listen to our favorite streaming service while we're playing, I was going to say Beat Saber, but some game that isn't Beat Saber, you know, a game where augmenting it with additional sound is not quite as jarring as playing music over Beat Saber. Or any of the other things that we might want to do with the real world, with our watches, with our phones, that we can now do in VR because we can augment all the VR apps at the same time and bring these tools with us from app to app. So we're at this point with AR displays where there are some out there. They're basically all in pilot projects. You know, they sell some number of units every year. I'm not sure I've found a pilot project that turned into a production project where there's some company where there are a thousand people working there with an AR display on all day long, but they are sort of proving out some of the compelling use cases. And eventually those displays will shrink to the point where they can actually be worn all day. And they'll be high enough res that really you can, with a straight face, say that if you black them out, they become VR displays. And while we're running up to that eventual day, we might as well start to learn the lessons we need to know about interaction and usability and just spatial computing in general with AR that runs in VR.

[00:18:54.693] Kent Bye: Yeah, maybe we could swing back to the differences between what Metachromium is doing with their kind of skinning a Chromium web browser and using WebXR versus the approach that you're taking with Aardvark. And if you're using WebXR, OpenXR, and what some of the differences between those two are, because it was a little bit confusing to me as to what was what when I was working with the demo with Pluto VR. But maybe you could describe what some of those differentiations are between those two.

[00:19:20.653] Joe Ludwig: So WebXR is essentially the same as any of the desktop PC VR SDKs, just ported to the web. So in WebXR, you use WebGL, you generate pixels that represent your eye buffers. You need to do that at a very fast frame rate, 90 frames a second, 120 frames a second, whatever it is. And every 11.11 milliseconds, you send those eye buffers into the WebXR API, and they get passed down to the runtime that comes with the hardware, and they come out the headset. In essence, by default, each WebXR application is expecting to be that base level of reality. It's not a layer on top of Beat Saber, it's Beat Saber. And because WebXR is built around this, what you get out of WebXR is eye buffers with pixels and potentially a depth map. What Metachromium attempts to do, as I understand it, is it takes multiple of these apps at the same time, uses their depth maps, and tries to blend them together as best it can. There are fundamental limitations to doing that kind of depth compositing, especially around transparent surfaces or translucent surfaces. The information is just not there to actually do those things effectively. So imagine you have two layers coming through from WebXR, and you have a layer that is both in front of and behind a tool that you're holding in your hand. There's only one pixel for that layer that is in front of and behind, let's say, your hammer. And so the information is not there to draw the thing that is both in front and behind. And I think that's one challenge with the WebXR approach. The other challenge I think is that VR is very demanding in terms of CPU and GPU performance. We as an industry are able to take whatever GPU you have and push it to the limit without too much trouble, mainly because of the frame rate, but also because the fill rates are getting pretty high. You know, modern HMDs are pushing that 2K by 2K panel size. And you combine those two things and you just have to push a ton of pixels.
And as the scenes get more complex, as you get more objects in the scene, the lighting gets more complex, you also have to do a ton of work on the CPU. The pixel pushing is not really that different between WebXR and native PC VR. The CPU work is dramatically different. There's a lot of overhead in JavaScript that is not there in native code. And I'm not convinced that that is going to go away. WebAssembly helps a bit, but WebAssembly still has 10 to 30% overhead, just from the fact that it's WebAssembly running exactly the same algorithms. And that varies a bit, you know; depending on how much memory access the WebAssembly does, it has better or worse performance. So I'm unconvinced that it makes sense to try to get JavaScript to run in 11.11 milliseconds, every 11.11 milliseconds, the entire time you're running whatever the thing is. And Aardvark in some ways is my white paper about that. It's my statement that I don't think that's the right approach. What I think is the right approach is that JavaScript works very well in a declarative environment already. When you open a web page, what you're looking at is some HTML and some CSS and some images that were generated by and manipulated by JavaScript. And that JavaScript does not run every time you need to generate a pixel. It's not like, because your monitor's refresh rate is 60 hertz, you run some more JavaScript to generate each frame. That's just not how it works. What the JavaScript does is it either declares in the first place or manipulates the declared HTML elements. And then those run through a layout engine that is written in C++ that chews on them, does it very quickly, you know, figures out how big all the boxes are, figures out how big the fonts are, renders all that stuff, does a bunch of it on the CPU, does a bunch of it on the GPU, does automatic compositing of video and other sources that all feed into that rectangle that is on your monitor.
And the JavaScript only runs when you click a thing or when you drag a thing, when you mouse over a thing. And so the JavaScript runs at the events that happen at a human time scale or an interaction time scale, where they're a few times a second instead of 90 times a second or 144 times a second. And the native code, the C++ code that does the smooth animation of the video or the smooth animation of the control sliding in over the course of several frames when you mouse over a thing, that's all in C++. You express your intent through these declarative approaches of HTML and CSS, and then the native code, the system of the web browser, actually does the work to render that to the user smoothly. So, Aardvark does a similar thing. In Aardvark, at no point do you take what would be the WebXR approach, which is ask the system where the hand is, load a model, draw the model out where the hand is. You don't do that. What you do is you say, draw this model at the hand. And you hand that down to Aardvark, and Aardvark says, oh, I'm drawing this model relative to the hand. And it's not necessarily at the hand. It could be a relative offset. It could be scaled up. It could be rotated. But the expression that you're making, the statement that you're making, is draw it on the hand. What that means is that 11.11 milliseconds later, when your hand moves a few millimeters to the left, Aardvark knows it's on the hand. It draws it on the hand. It uses the new hand position. So Aardvark needs to run at frame rate. None of the gadgets need to run at frame rate. And if you have a gadget that is slow, it doesn't matter because it doesn't have to run at frame rate in order for things to perform and still look right.
So between the performance implications of doing things in a declarative way and the visual fidelity implications of using scene graph to composite instead of using these depth buffers and pixel maps to composite, I think that Aardvark is taking an approach that is more scalable in a lot of ways, and we'll end up with higher quality and higher fidelity results in a lot of ways. But part of the reason that I'm building and working on it is to prove out that thesis. I don't think it's settled yet. Metachromium has very few users. Aardvark has very few users. Eventually, we'll find out what the answer is. Maybe it will be some combination of them. Maybe it will be one or the other. Maybe it will be neither of them. but they're both attempts to make a statement that this is how things should work and then prove it with code, like functioning code people can actually use.
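The "draw this model at the hand" idea from the exchange above can be sketched in a few lines of TypeScript. The types and function names here are invented for illustration and are not Aardvark's internals: the gadget declares a parent anchor and an offset once, and a frame-rate renderer resolves the world position from fresh tracking data each frame without running any gadget code.

```typescript
// Hypothetical sketch of scene-graph compositing: the gadget declares an
// anchor and offset; the native-style renderer resolves the world
// transform every frame from the current poses.

type Vec3 = [number, number, number];

interface SceneNode {
  parent: "hand" | "head" | "room"; // declared anchor reference point
  offset: Vec3;                     // relative offset, declared once
}

const add = (a: Vec3, b: Vec3): Vec3 => [a[0] + b[0], a[1] + b[1], a[2] + b[2]];

// The gadget declares this once; no gadget JavaScript runs afterward.
const hammer: SceneNode = { parent: "hand", offset: [0, 0.5, 0] };

// Called by the frame-rate renderer every 11.11 ms with fresh tracking data.
function resolveWorldPosition(
  node: SceneNode,
  poses: Record<"hand" | "head" | "room", Vec3>
): Vec3 {
  return add(poses[node.parent], node.offset);
}

// Two consecutive frames: the hand moved, but the declaration did not change.
const frame1 = resolveWorldPosition(hammer, {
  hand: [0.25, 1.0, -0.5], head: [0, 1.5, 0], room: [0, 0, 0],
});
const frame2 = resolveWorldPosition(hammer, {
  hand: [0.25, 1.0, -0.25], head: [0, 1.5, 0], room: [0, 0, 0],
});
```

Because the renderer, not the gadget, re-evaluates the anchor each frame, a slow gadget only delays its own updates, never the headset's frame rate.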

[00:25:50.530] Kent Bye: Yeah. And so you mentioned WebGL being able to draw things out, and a lot of things within WebXR use libraries like Three.js or Babylon.js to be a declarative interface, to not get so low-level as to actually be programming with WebGL directly, but to use these libraries to interface with those ways of creating what's essentially like drawing a 2D canvas and then painting pixels onto that with WebGL. That's my understanding, at least, of most of what that has been. So are you interfacing and drawing stuff through WebGL, or using one of these things like Three.js or Babylon.js to actually be drawing things within Aardvark VR or Aardvark XR?

[00:26:31.397] Joe Ludwig: I'm not currently. I usually just call it Aardvark. I ended up with the Aardvark XR domain, which I haven't put anything at yet, but the XR is just a differentiator. If I ever have a trademark, it'll probably have XR in it, but I just go with Aardvark. I'm not currently using WebXR at all, not currently using WebGL at all. It's possible that will change and that there will be one tightly optimized pile of JavaScript, or maybe WebAssembly, that will function as the Aardvark renderer. At the moment, that renderer is a little bit in JavaScript, and then the more performance-critical pieces are all in C++. So it's basically built into the browser in the same way that the layout engine for CSS and HTML is built into the browser.

[00:27:15.951] Kent Bye: So as people are making Aardvark gadgets, are they essentially writing these React web apps then?

[00:27:22.057] Joe Ludwig: Yeah, exactly. They're writing React web apps that may or may not ever have a rectangle of pixels visible. The whiteboard, for instance, doesn't use what in Aardvark parlance is called a panel, a 2D rectangle that is the web page output of the Aardvark gadget. It uses a bunch of models. And so you pick up the models, move the models around, draw a stroke on the whiteboard. It makes a new model, but it does actually use a little bit of Three.js code to generate the glTF model that it creates. And then it hands that glTF model over to Aardvark and says, draw this model here. So there's no WebXR in there at all. There is a little bit of Three.js. And over time, I expect that library base to develop a little more. But it's a custom renderer, if you can call it that. It's very straightforward and simple right now, because I've been focusing on the framework side of it. So the rendering is just glTF models with physically-based rendering, and it's pretty straightforward at the moment.
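As an illustration of the event-driven, declarative pattern Ludwig describes here, consider this hypothetical TypeScript sketch. None of these names are the real Aardvark API; the point is only that gadget JavaScript runs when the user acts, producing a new model node that the native side then draws at framerate.

```typescript
// Hypothetical sketch of a whiteboard-style gadget: declarative state (a list
// of stroke models) that only changes when the user acts, never per frame.
// `SceneNode`, `WhiteboardState`, and `drawStroke` are illustrative names,
// not the real Aardvark API.

interface SceneNode {
  type: "gltfModel";
  uri: string;     // e.g. a data URI for a generated glTF stroke model
  parent: "stage"; // anchored to the room center in this sketch
}

interface WhiteboardState {
  strokes: SceneNode[];
}

// Runs once per user action (a pen-up event), not once per rendered frame.
function drawStroke(state: WhiteboardState, modelUri: string): WhiteboardState {
  return {
    strokes: [
      ...state.strokes,
      { type: "gltfModel", uri: modelUri, parent: "stage" },
    ],
  };
}

let board: WhiteboardState = { strokes: [] };
board = drawStroke(board, "data:model/gltf+json;base64,...");
board = drawStroke(board, "data:model/gltf+json;base64,...");
// The native (C++) side re-renders the submitted scene graph at framerate;
// this JavaScript produced exactly two nodes, from two user actions.
```

The design choice this mirrors is the same as HTML/CSS layout: script describes *what* should exist, and a native engine handles *when* to draw it.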

[00:28:24.305] Kent Bye: Well, if somebody wanted to, say, use some of the creative coding APIs of Three.js to go into some VR experience, but then start to do spatial visualizations using some of the more spatially oriented Three.js code, can you write an Aardvark app that is able to essentially write and create all sorts of math art using Three.js within the scene graph? Or is it not possible to do that because you're not necessarily using WebGL in the same way?

[00:28:55.236] Joe Ludwig: I'm definitely not using WebGL in the same way. I don't know that I know enough about Three.js to really know the answer to that question. And I also don't know enough about A-Frame. But as I understand, A-Frame is a little more declarative in the way that you specify what it is you want to draw. It's possible that translations from these systems to something that's more declarative, like Aardvark is, would be feasible. I don't know enough about them to answer that right now, unfortunately. But what I can say, though, is that where WebGL deals with textures, and shaders, and constant buffers, and transforms, and things like that, the abstraction is higher level in our case. The abstraction is glTF model, panel, transform, origin, which is I want it to be on my hand, or I want it to be on my head, or I want it to be on the room. I want that to be the parent. And also these interactive elements, like interface entities, which are quite a bit higher level than WebGL is.
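The abstractions Ludwig lists here, glTF models, panels, transforms, origins on the hand, head, or room, and higher-level interface entities, can be sketched as a small TypeScript type hierarchy. These type names are assumptions made for illustration, not Aardvark's actual definitions:

```typescript
// Illustrative types for the abstraction level described in the interview.
// Not the real Aardvark type definitions.

type Origin = "left-hand" | "right-hand" | "head" | "stage";

type GadgetNode =
  | { kind: "origin"; origin: Origin; children: GadgetNode[] }
  | { kind: "transform"; translation: [number, number, number]; children: GadgetNode[] }
  | { kind: "gltfModel"; uri: string }
  | { kind: "panel"; widthMeters: number }          // 2D rectangle of web-page output
  | { kind: "interfaceEntity"; interface: string }; // e.g. a grabbable handle

// A watch-like gadget: a model and a panel parented to the left hand.
const watch: GadgetNode = {
  kind: "origin",
  origin: "left-hand",
  children: [
    { kind: "gltfModel", uri: "models/watch.glb" },
    {
      kind: "transform",
      translation: [0, 0.05, 0],
      children: [{ kind: "panel", widthMeters: 0.1 }],
    },
  ],
};
```

Compared to WebGL's textures, shaders, and constant buffers, everything here is a whole semantic object, which is what makes the scene graph cheap to submit and easy for a native compositor to render.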

[00:29:52.430] Kent Bye: You just said there that you can choose the origin point as to whether it's relative to your hand, your feet, your head, or the room itself. Are you able to get, say, coordinates of whatever VR app is running locally? Let's say it's a level in Half-Life: Alyx that has its own coordinate system, or a Unity app that may have a different coordinate system per level. To address things to a specific location and anchor objects there, like if you wanted to create a whole augmented reality experience that would be like a guided tour, you would kind of need the equivalent of a virtual GPS system, to put things based upon where you were at. And you would need some way to query that and get access to whatever that coordinate system was, if it's not immediately available. Because at this point, we don't really have that for anything. But is there some way to have an interface between whatever that canonical addressing would be for any given experience, so that Aardvark could attach to that?

[00:30:55.546] Joe Ludwig: That is a system that a lot of people want to build for a lot of different reasons. And Aardvark would be an immediate user of it if it existed. Because, as of a few weeks ago, Aardvark gadgets can now find out what app they're running in, but they don't really know what level they're in. They don't know what their transform is relative to that level. So if we're both in a room in VRChat, your Aardvark instance and mine would both know that we're in VRChat, but they wouldn't know where we are in the room. Cracking those apps open to the point where we can get that information out means finding out what map I'm in, and what instance of that map, because it might be that you and I are both playing Skyrim and we're in the same dungeon, but we're each in our own world; we're not actually in a shared environment. And then there's what the transform is relative to that environment, the transform from my room center to that environment. Those are important things to know. And I expect that those will be features that, as overlays become more common, runtimes will start to provide, and that apps will start to provide to runtimes. In a lot of ways, it's a rich presence question. Modern PC gaming platforms and modern consoles will allow applications to tell their social layer, the user is in this dungeon, or the user is doing this thing. And then they reflect that out to that user's friends, so the user's friends can chat with them and say, oh yeah, I did that part last night, it was great. And what we're talking about here is really just an extension of that, but it's not quite available yet. It's something that I'm hoping will become available. Maybe it'll be an OpenXR extension. Maybe runtimes will start to support it, but it's not quite there yet.
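The "rich presence plus transform" data Ludwig wishes runtimes exposed might look something like the following sketch. The shape is entirely hypothetical; no such runtime API or OpenXR extension exists as of this conversation:

```typescript
// Hypothetical payload an app could report to the runtime, combining rich
// presence (which app, map, and instance) with the transform from the user's
// local room center into that map's coordinates. All names are illustrative.

interface VirtualLocation {
  appId: string;      // e.g. "vrchat"
  mapId: string;      // which level or world
  instanceId: string; // which copy of that world (shared vs. solo)
  roomToMap: {        // transform from local room center into map coordinates
    translation: [number, number, number];
    yawRadians: number;
  };
}

// Two users share an environment only if app, map, AND instance all match;
// the same dungeon in two single-player sessions is not a shared space.
function inSharedEnvironment(a: VirtualLocation, b: VirtualLocation): boolean {
  return a.appId === b.appId && a.mapId === b.mapId && a.instanceId === b.instanceId;
}

const me: VirtualLocation = {
  appId: "skyrimvr",
  mapId: "bleak-falls-barrow",
  instanceId: "solo-1",
  roomToMap: { translation: [12, 0, -4], yawRadians: 1.57 },
};
const you: VirtualLocation = { ...me, instanceId: "solo-2" };
```

This matches the Skyrim example in the interview: same app and map, different instances, so an overlay must not treat the two users as co-present.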

[00:32:33.080] Kent Bye: Yeah, I feel like that would really start to unlock the AR potential, where you could start to put art into an existing game and attach it there. It'd be another way of modding without having any sort of official interface with the app, which I think is sort of the beauty of AR already: you don't always need permission to augment something that's already there in physical reality. And this is a way to potentially have a quick and easy way to do various different annotations on top of a space. So yeah.

[00:33:06.056] Joe Ludwig: Absolutely. You need to know where you are in Skyrim, let's say, in order to put a quest guide there, in order to put a little tombstone where you died last time, in order to put some other augmentation, to put a name over your friend's head. You need to know that your friend is there. You need to know where you are relative to the map. You need to know where your friend is relative to the map. And so having that information available would unlock a lot of applications that are not quite supported yet. We have an approximate version of that in the real world in the form of GPS, which is relatively low resolution, probably not enough to put a name over somebody's head, certainly enough to know what restaurant you're in. So we could start by trying to get that what-restaurant-am-I-in, what-street-am-I-on kind of level for VR. Or we could just go all the way, because it's a synthetic environment: it knows exactly where you are. It just has to tell somebody, and once it tells somebody, then somebody can share it with everybody else, and we'll know where you are, which would unlock all these applications. So we'll see how that comes to pass, but it's an exciting notion. We just have to figure out how to get that information out of the app, which currently is the only thing that knows it. The runtimes, like the Oculus runtime and the SteamVR runtime, don't know it, so they can't share it with anyone yet. There's been talk in various circles about doing some of the same things we do in the real world: we could run some computer vision algorithms over the virtual environment and try to figure out where it is. And that might work; certainly you could get some information that way. But it seems like, with relatively little overhead, the game could just tell somebody where the user is, and you'd have a much more accurate picture. It may not be the game that's interested in that, though. So we'll see how it plays out.
But that's not really what I'm excited about; maybe it's something OpenXR can help with. If there were a standard OpenXR extension for sharing that kind of information, then it would be straightforward for the runtime to share it with overlays.

[00:35:02.352] Kent Bye: Yeah. You said earlier that you could put objects relative to the world. What's the coordinate system that it's using in the world if it's not associated to your body?

[00:35:11.236] Joe Ludwig: The four origins that are supported at the moment are left hand, right hand, head, and stage. Stage is the room center, the center of your chaperone, your play space.

[00:35:19.857] Kent Bye: Oh, okay. So as you're walking around, then that would just be the center of your room. Yes. Okay.

[00:35:25.620] Joe Ludwig: Yes. And Aardvark is multi-user, because gadgets can be shared. This is a departure from the way web browsers typically work. If you and I are in a Pluto conversation together and we're both running Aardvark and I pull out a whiteboard, the whiteboard shows up for you and we can both interact with it. That's a capability afforded by a room gadget that the Pluto folks have built. So when we're in a Pluto call together, your Aardvark instance and my Aardvark instance recognize that through this room gadget, and they build a shared coordinate system. There's not currently a mechanism where you can just say, my origin is that coordinate system, but it kind of does it indirectly, through things being relative to your hands or things being relative to the stage. So those coordinates all kind of flow through those rooms. And once they do that, then we do have that shared coordinate system, and we can interact with the same gadget at the same time. Like, I can pick up a pen, draw something, let go of the pen, and you pick it up and draw something on the same whiteboard.
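The shared coordinate system Ludwig describes can be sketched as transform composition: each user supplies a transform from their own stage (room center) into the shared room, and a point in my frame maps into yours by applying my transform and then the inverse of yours. This is a 2D, yaw-only illustration, not Aardvark code:

```typescript
// Two Aardvark instances agree on a shared "room" frame via a room gadget.
// A point goes my-stage -> room -> your-stage. Purely illustrative math.

type Vec2 = [number, number];
interface StageToRoom { yaw: number; offset: Vec2 } // rotate, then translate

function toRoom(t: StageToRoom, p: Vec2): Vec2 {
  const [x, z] = p;
  const c = Math.cos(t.yaw), s = Math.sin(t.yaw);
  return [c * x - s * z + t.offset[0], s * x + c * z + t.offset[1]];
}

function fromRoom(t: StageToRoom, p: Vec2): Vec2 {
  // Inverse of toRoom: un-translate, then rotate by -yaw.
  const x = p[0] - t.offset[0], z = p[1] - t.offset[1];
  const c = Math.cos(-t.yaw), s = Math.sin(-t.yaw);
  return [c * x - s * z, s * x + c * z];
}

// My stage is rotated 180 degrees and offset within the shared room;
// your stage happens to coincide with the room frame.
const mine: StageToRoom = { yaw: Math.PI, offset: [2, 0] };
const yours: StageToRoom = { yaw: 0, offset: [0, 0] };

// A pen one meter in front of my room center, expressed in your stage frame:
const penInYourFrame = fromRoom(yours, toRoom(mine, [0, 1]));
```

Once both instances can express gadget positions in the shared frame, the same whiteboard stroke lands in the same place for both users.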

[00:36:25.908] Kent Bye: That's interesting. When I think about communications mediums and how they propagate, I sort of broke it down into like four different stages where there's a new technology that unlocks new affordances that weren't possible before. And then you have the artists and the creators who are pushing the limits of that technology by making stuff. And then you have the distribution phase where it has to some way get in the hands of people so they can actually try it. And then after they try it, then they're able to have that closed loop of giving feedback back into both the technology and the creators. And you sort of have this evolution and development of a communications medium because you're trying to express something or facilitate some sort of reaction. And I'm wondering if you've thought about now that this platform that you've been working on is made available and it's like creating all these new affordances, what you think the key thing is to be able to help take it to the next stage of getting more people involved in creating apps and then sharing those apps with each other and that creative spirit of that, creating something and having it be received that then drives the development of that in the future.

[00:37:30.295] Joe Ludwig: I think that that's a reasonable framework. I mean, it's the typical flywheel that you see with any new technology: you make a version of the thing, you get it in front of people, you get feedback from the people, and you use it to make the thing better. And that gets it in front of more people because it's more compelling. So that's essentially the same loop, just with slightly different labels. And I think that it's relatively easy to install Aardvark. Over the past few weeks and months, a lot of effort has gone into trying to make it so that it's just something you can run all the time, so you don't really have to worry about running it. It was a little bit in your face and in your way a couple months ago, but that's gotten a lot better. So if you're running it all the time, you can run a gadget just by downloading it, like a web browser handles this, right? You just put in a URL. Now Aardvark knows about the gadget and it just runs on your machine, just like a webpage would. Distribution is relatively straightforward in the literal sense. The thing that gets a little harder is how you get a specific gadget. Like, if you have a gadget that you like and I want to try out that gadget, how do you send it to me? You can email me a URL. Or we can go to a shared space, like we can jump on a Pluto call, for instance. And in that latter case, I can pull out my gadget scanner, I can point at your gadget, I can favorite your gadget, and then I can run it on my own. And I'm hoping that spreading gadgets essentially by word of mouth like that will let them spread. And then I intend in the short run to be very liberal with access to that global gadget list that everyone has when they bring up the list of gadgets. Much like when the web was brand new, there were websites that had literally every website listed on them. That's what Yahoo started as: it was a directory of all the websites.
And there were what's-new, what's-cool sorts of pages in Mozilla and Netscape that listed literally every new webpage that day. The web quickly scaled to the point where that didn't work anymore, but Aardvark has not gotten that far yet. So while Aardvark is still small, I think that encouraging people to check out cool new stuff is the most important thing. And the way to get that to happen is for there to be cool new stuff to check out. So my current focus is on generating tutorials to demystify gadget development a bit. There are a lot of developers out there who don't know anything about React. There are a lot of developers who don't necessarily understand the difference between this declarative approach and the render-loop-based approach that someone with WebXR would use. So getting people comfortable with those differences is my focus right now. We have one video that went up over the weekend. We'll have more coming up over the next few weeks that explain all the basics of how to get a gadget up and running. It's not hard, but it is a little weird. So getting people used to the weirdness and getting them over the hump so that they can make a thing that's interesting and useful to them. Then we can help them share that thing with other users, which will drive more people to use Aardvark, which will drive more people to make stuff that uses Aardvark, which will drive more people to use Aardvark, et cetera. So I'm hoping that that will come online in the next few weeks and we'll start to see more and more gadgets. The gadgets that are out there right now are mostly the examples or the ones that came out of the hackathon in September. Perhaps this spring we'll have another hackathon. Now that Aardvark has moved forward quite a bit in the last four months, hopefully we'll see another hackathon and we'll generate a bunch of new gadgets, or take some of the old ones and add some more features to them, et cetera. But bootstrapping that flywheel is a challenge.
So the first trick is to get some content out there, and I'm working on tutorials, documentation, and that sort of thing to help make that gadget development process as straightforward as possible.

[00:41:12.110] Kent Bye: Yeah. When I think about the VR streamers who attach things to their hands, I see a lot of Twitch streamers who want to see the Twitch chat, and there are already a lot of other apps out there that are able to overlay windows. But what do you imagine people will want to make that they could only make with, say, an Aardvark app, that wouldn't be available in other things?

[00:41:34.233] Joe Ludwig: I mean, Aardvark uses entirely open APIs, so there's nothing that you could build in Aardvark that you couldn't build directly on those same APIs yourself. The goal is to make it so that it's easier to build things in Aardvark, and so that you get some advantages from the Aardvark way of doing things, just because it makes your life easier. It's easier to deploy because you've just uploaded it to your web server. It's easier to make something interactive. It's easier to do the user interface because there are these consistent metaphors and this language of how to interact with things that is understood across all Aardvark gadgets. Twitch is an interesting example because there were half a dozen suggested gadgets that people could work on as the hackathon was getting ready to start that were all Twitch related. Some of them were audience gadgets, where an audience member would watch a stream and interact with the streamer. Some of them were streamer gadgets, where the streamer would see the chat or the other things coming in from the audience. And I think it would be interesting to get some of those developed, to get them actually out there and see if streamers will use them and see how useful they are. One thing that I think is interesting is that these rooms where we share gadget state with each other don't have to be symmetrical. So if the streamer is essentially the host of a room, they're sort of the person on the stage, and you could have smaller groups or even individual visibility into those rooms. So if you and I are watching some streamer play some game and we happen to be in the same room, we can see each other's stuff. We can see the streamer's stuff. The streamer can't see our stuff, because they'd be overwhelmed, obviously.
But there may be limited ways in which we can interact with their stuff, by applauding them or sending a chat message to them or making suggestions or voting for this thing or that thing, in the same way that people do those things in text chat channels right now with Twitch and other streaming platforms. So I think there's a lot of potential there for VR streaming, but also for other kinds of media that can be a little bit more interactive. There are a lot of things floating around like this now. YouTube has Premieres, where you watch the video with the followers who are watching it when it first premieres, and you get a chat channel. Amazon has watch parties, where you watch a movie or a TV show and you're in a shared chat channel with people. Bigscreen is basically all about that; they do a lot of interesting things with shared media experiences. Fortnite's done that recently with some of their concerts. I think a lot of those interactions could be even richer than they are. And Aardvark, or a system like Aardvark, could enable that by allowing the things that are interactive to be spatial, as opposed to just chat, just text, or just pictures on a webpage.

[00:44:20.957] Kent Bye: Yeah. I think interacting with the audience and other Twitch things is probably one of the areas where I'd imagine pretty high demand, especially if people are streaming from within VR and want to be interacting with the people who are watching them, with all the affordances that could offer. The other thing that I think of, at least, is trying to meet up with someone in a VRChat world. If it's a really big world, then knowing where they are relative to that world, if they've consented to sharing their location, would help. But then we get back to the GPS coordinate problem that we talked about earlier. I'd imagine in the future, if there's enough information made available, you could start to rapidly prototype and iterate on features that could eventually be built into the systems themselves, because they're useful enough. It's a way to have the audience augment or develop their own features for an experience, features that aren't quite developed yet within the application, and that could be a way of feeding those ideas back into the actual core so everybody's using it. But yeah, I don't know if you've had anything specifically for yourself that you want to see or want to make.

[00:45:24.469] Joe Ludwig: The biggest thing I want to make is an Aardvark debugger: the ability to see what gadgets are running, see what their scene graphs look like, and see how they're connected to each other. For visualizing that right now, I have a shoddy 2D webpage that kind of does it, but it's fundamentally spatial information, so seeing it spatially would be very valuable. The audience for that is pretty small right now, so I don't know if that's the next thing to work on. As you said, distribution is, I think, the next challenge. So making content and distributing content, those are the two big things right now. And part of distributing could mean that one of the next things to work on in Aardvark is figuring out how to make it automatic in VRChat, for instance. I don't know much about VRChat extensibility and the way VRChat environments can be marked up, but if that could be automatic, if a person running Aardvark joins a room in VRChat and they automatically get the shared Aardvark environment, that would be great. That would unlock a lot of potential there. And it's easy for anybody who's not running Aardvark to start running it, and once they are, they just jump into VRChat and away they go. And as you say, it could enable them to prototype things that aren't in VRChat yet. Or it might even be that this sort of approach, where things are much more composite than monolithic, is actually the way forward. It may be that VRChat itself would be better off if some of the things that are currently built into VRChat were really overlay things, like the avatars, for instance. If your avatar in VRChat were a thing that came from an overlay, VRChat itself might still want to draw it.
But if the selection of those avatars comes through some sort of open standard for avatar formats, and you can select that avatar in an overlay and use it anywhere you have an avatar, that's pretty powerful. So the leaking might happen in both directions: the blending of overlay apps down into base-level realities, and the blending of what's currently in base-level realities up into the overlays would be pretty useful too.

[00:47:28.872] Kent Bye: Yeah, a quick note about VRChat is that you can go onto the VRChat website and get an invitation link with a URL string that has instance identifiers and whatnot. So there are ways that they have built in to identify what instance you're going to. But I don't know how much of that information is generally just available as you go in there.

[00:47:49.842] Joe Ludwig: The question would be, every time you teleport or every time you locomote inside of the room, how do you sync up that user's position with the room origin, and have that on an ongoing basis? Knowing what instance they're in, what map they're in, that's a good start. Knowing what the transform is relative to that room is the next step.

[00:48:07.709] Kent Bye: Yeah. Well, to me, I think it's very exciting, but it's still early days. I think having more of those loops, getting momentum, getting people actually building stuff and finding out what the killer apps would be, is what gets people to say, hey, this is going to be worth downloading to go check out and actually make stuff with. Because I love the idea, and I think as things go forward you'll be able to get more nuanced information, because right now things are just attached to your body. But I'm excited about it, because it feels like a way to potentially start to rapidly prototype augmented reality types of experiences within VR, and to use the affordances of having all that precise data of the world. That's not fully unlocked yet, but once it is, then we'll be able to really take it to the next level in terms of creating a lot of these apps. So I guess for you, what are some of the next big steps or things that you want to move to next?

[00:49:06.913] Joe Ludwig: I want to get the documentation up to snuff. I want to try to get that flywheel going: getting a gadget that you can put together, getting a little more integration with base-level worlds, maybe VRChat, maybe Hubs. I don't know what the state of the Hubs scene is at the moment, so Hubs might not be the best place to spend time, but VRChat is actually doing quite well. Rec Room's doing quite well. With some of these, it would be good to be able to know where you are, what your space is. So I'll work on that problem and just, you know, keep turning the crank and make it better.

[00:49:36.601] Kent Bye: Great. And finally, what do you think the ultimate potential of virtual or augmented reality might be, and what it might be able to enable?

[00:49:47.285] Joe Ludwig: I think I've given you the same answer every time you've asked me that question, but I should go back and check. I think long-term what it does is basically take people who are at arbitrary distances and bring them together. And not necessarily just for socializing, but to do things together. So we talked about you and me being in Skyrim together. Skyrim happens to be a single-player game, but if it were a multiplayer game and we could cooperate, fighting monsters or whatever in Skyrim, we could do it in a way where it feels like we're really there. And so it could enable us to have these shared experiences, which might be experiences we could have in the real world, or might be experiences that are completely fantastical and not really viable in the real world. And I think that's very powerful. It allows people to connect in a way that more traditional pancake games do not, and connect in a way that they can't when they're remote from each other.

[00:50:40.188] Kent Bye: Hmm. Great. Is there anything else that's left unsaid that you'd like to say to the immersive community?

[00:50:45.892] Joe Ludwig: I guess, go to Steam, download Aardvark, play around with it, try to make a gadget, and jump on the Slack and let me know what you think.

[00:50:52.765] Kent Bye: Awesome. Well, Joe, I'm excited to see where this all goes. It feels like it's very early days. When I saw some of the demos from Pluto VR, it started to really open up my mind in terms of what it means to have an ecological approach to computing, and how these little gadgets are interacting with each other. We already have APIs on websites that are talking behind the scenes about different stuff, but to actually see it within a spatial context, with these different gadgets interacting with each other in different ways, opened up my mind to how everything is usually within this closed walled garden, a single app running at a time. I think this starts to open up new possibilities with multi-app spatial computing, in ways where we can start to see where the seeds of that are going to go. There were a lot of new insights that I had just by testing it out early. And I look forward to seeing where this all goes, because it's sort of a whole new area, and there are going to be things that in hindsight will make sense but that I didn't see and was surprised about. I think that's what's so exciting: it's the opportunity for people to really discover things that in hindsight will be obvious, but that we don't know yet and have yet to discover.

[00:51:57.860] Joe Ludwig: I think there will be a lot of things that are obvious in hindsight that we don't know yet.

[00:52:04.165] Kent Bye: Awesome. Well, thanks so much.

[00:52:05.786] Joe Ludwig: All right. Thank you.

[00:52:07.586] Kent Bye: So that was Joe Ludwig. He's the creator of Aardvark. So I have a number of different takeaways about this interview. First of all, I actually learned a lot about what Aardvark is, especially by contrasting it to something like Metachromium, which is using a Chrome browser and a lot of the affordances of WebXR and WebGL to basically overlay all of this stuff on top of an existing VR application, using the depth maps that are coming from WebXR to try to mash those together. Joe, by contrast, just wants to know where your hands and your head are, and to be able to attach objects there, so that it doesn't force JavaScript to run at the same 90 or 120 hertz framerate that you would expect of a native PC VR application. What he's able to do is take a more declarative approach. Just like the web browser has an HTML renderer, which takes all the information from HTML and CSS and goes through C++ code to actually render things at a high framerate, there's a higher-level abstraction where the JavaScript is either modifying or triggering different events within this layout system. And so it's a declarative layer, meaning that the JavaScript doesn't have to run at 90 or 120 hertz. He's able to do this layout engine where Aardvark itself is running at framerate, and you're able to attach these Aardvark objects to your hands. So as I listened to that, I really see that this is a little bit like abstracting out your body pose and your head pose, running that at framerate, to attach different objects onto your body within the context of a virtual environment. Because right now there is no way to get a virtual GPS from any of these virtual environments.
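The contrast recapped here, per-frame JavaScript versus event-driven declarative updates, can be made concrete with a toy simulation of one second at 90 Hz. The counters and frame indices are illustrative stand-ins for real work:

```typescript
// Toy comparison: a WebXR-style render loop runs application JavaScript on
// every frame, while a declarative gadget runs JavaScript only when an event
// arrives and lets native code redraw the unchanged scene graph at framerate.

let renderLoopJsCalls = 0;
let declarativeJsCalls = 0;

const frames = 90;                // one simulated second at 90 Hz
const userActions = [10, 40, 85]; // frame indices where the user does something

for (let frame = 0; frame < frames; frame++) {
  // Render-loop model: app JS runs every frame, input or not.
  renderLoopJsCalls++;

  // Declarative model: app JS runs only on events; the native renderer
  // draws the last-submitted scene graph on every other frame.
  if (userActions.includes(frame)) declarativeJsCalls++;
}
// renderLoopJsCalls is 90; declarativeJsCalls is 3.
```

This is the core of the scalability argument: with many overlay gadgets running at once, per-gadget JavaScript cost scales with user actions rather than with the display's refresh rate.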
You have the center of your room, which is relative to your local play space, but it's not something that is relative to the virtual worlds that you're in. I think to really get to this level of augmenting virtual reality, we're going to need a lot more information and context from the world itself, to be able to query where your location is and where these other objects are. Joe is saying maybe that'll eventually be an OpenXR extension. Things like Unity and Unreal Engine, I think, are in the process of implementing OpenXR. And so if there's an extension to OpenXR, then that location could be declared to whatever systems may want that type of information. Wherever your body is, which is essentially like a camera within these virtual spaces, that would give you GPS-like coordinates for where the camera is. And maybe from there, you're able to do some orientations for what the origin point is, do some translations, and put things relative to your body into the world. But right now, in the absence of having that GPS available, maybe there'll be some computer vision algorithms, but that's a lot of additional overhead when you're already constrained. I can't imagine that it's going to be feasible in the long term to run all these computer vision algorithms on top of everything else just to do augmented reality. But I think if you get the pure information of the virtual environment, you're able to do some prototyping for what types of things might be useful. As I look through the list of the different gadgets and widgets, a thing that struck me was that a lot of these things are eventually going to be potentially built into some of these applications.
So if you're thinking about VRChat, a lot of the things that you really want that you don't have already could potentially just be features that are built into VRChat, and it'd be hard to top that, just because they have all that information right there. I think the value is to look at some of the features of something like VRChat, say maybe the social graph that you have, when you want to go into a different application that doesn't have that social graph already built up. Maybe you want to jump into a Mozilla Hubs application, but you want to meet up with your friends. Well, because Mozilla Hubs is potentially just spread out all over the internet, something like an Aardvark panel could be a way to create these social graphs across different experiences, so that if you want to meet up with your friends, you're able to pop directly into some of these different virtual environments. That's one use case I can think of. Generally, things that are centralized within an application are going to be built into that application. But you have the potential to start to augment some of these existing applications with features that they're not willing to implement, or maybe you have an idea that is a little more experimental, where not enough people would merit developing it within the application itself. So you're able to sort of do these mods into these experiences. And with something like the decentralized web and WebXR and these different applications, there are these different layers that would be useful, like a social layer and a social graph, that maybe you don't want to recreate with every new context. And so maybe there's value in decentralizing that in some way and putting it into this type of Aardvark interface.
So that's one future trajectory moving forward: to see what the different affordances are within some of these different experiences, and then, if you take the decentralized approach, figure out what you have to do to architect that, use these different open standards, and start to implement a similar type of user interface. I think it's also going to be really interesting to prototype different user interaction designs, because as Joe said, we already have the WIMP interface: windows, icons, menus, and pointers. When you start to break out of that, you start to have other things like gesture interfaces, and maybe even voice activation at some point. Just think about how to start to put these little gadgets on your body and see what kind of things they can do within the world. I think of experiences like Star Wars: Tales from the Galaxy's Edge, where you have this little gadget on your wrist that you interact with, with things like a map that points to the different things you need to go find within that world. I think eventually we're going to have that, but again, we need a little bit more of that information from the world, like the GPS coordinates. There could be other applications that are generally useful across different contexts, and I think that's the challenge: figuring out which things you'd want to have and which are worth really building and experimenting with. Sometimes it may just be easier to build a Steam overlay application so you can sell it, because right now there are ways to distribute these gadgets, but no way to monetize them. So if you're going to be building some of this stuff, it's going to be an open-source version. So if people are
willing to contribute back to the commons to create these different types of applications, then I imagine taking a look at applications like OVRdrop or OVR Toolkit and asking whether there are things within those that start to get replicated. I think the other thing is to look at Twitch streamers and see what they're doing, because they're already at the bleeding edge of finding ways to create these layers of augmentation within their virtual reality: they want to be present in the virtual world, but also connected to their audience in different ways. I like what Joe was saying, that there could be new ways of having the audience participate with someone in a spatial environment, where it's not just a 2D text chat, but you're able to actually put objects into the world. One thing that helped me understand Aardvark is that it's a little bit different from WebGL. WebGL is a lot lower level when it comes to graphics, dealing with things like shaders, textures, constant buffers, and transforms. The abstraction within Aardvark is at the layer of a glTF model: you have panels, you have transforms, and you have different origin points, attached either to your head, your left or right hand, or the stage, which is essentially the world, the center of your room. And you can parent other things onto those as well. So you think about it as these watches or these things that are attached to your body, rather than, say, a WebXR approach where it's basically just painting pixels into the eye buffers and using depth maps. Because that approach doesn't necessarily know where your body is, you'd have to detect the body before putting things onto it.
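One way to picture the abstraction described above is as a small scene graph whose root is anchored to a tracked origin, with transforms, panels, and glTF models parented beneath it. The sketch below is plain TypeScript; the type and field names are my own for illustration and are not Aardvark's actual React components.

```typescript
// Illustrative data model for Aardvark's described abstraction: a gadget is a
// tree hanging off a tracked reference point. Names are assumptions.
type Origin = "head" | "left-hand" | "right-hand" | "stage";

type GadgetNode =
  | { kind: "transform"; translate: [number, number, number]; children: GadgetNode[] }
  | { kind: "panel"; url: string; widthMeters: number }  // a 2D web pane in space
  | { kind: "model"; gltfUrl: string };                  // a 3D glTF asset

interface Gadget {
  origin: Origin;   // the body part (or room center) the gadget is attached to
  root: GadgetNode;
}

// A wristwatch-style gadget: a small web panel offset slightly from the left hand.
const watch: Gadget = {
  origin: "left-hand",
  root: {
    kind: "transform",
    translate: [0, 0.05, 0],
    children: [{ kind: "panel", url: "https://example.com/watch-ui", widthMeters: 0.08 }],
  },
};
```

The key design point is that nothing here is per-frame JavaScript: the gadget declares where it lives relative to a tracked origin, and the compositor running at framerate keeps it attached.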
But when you want to actually attach things directly to your body, Aardvark seems like a good approach, because it's actually using your body as the origin point, and eventually probably your feet as well, but starting with your head and your two hands, or the center point of the room. So you can think of it as these different gadgets and watches that you're adding onto your body, implementing different features that aren't necessarily implemented within the world yet. Joe said he's not really using WebXR or WebGL much at all; it's more these React applications that have these panes attached to your body. So if you're familiar with writing React web applications, then you can start to think about all the different things that would be interesting to pull from the open web in order to have general situational awareness of whatever is happening within your virtual world. I see augmented reality as this ability to pull in additional layers of context into your world. As Joe said, it's sort of like the things that you'd be looking at your phone for. If something deserves your complete attention, then you might as well use full virtual reality, but this is for ambient information that adds additional context. So you can think about things like GPS and maps, being able to find people and identify things. Again, some of this could be better handled by just syncing up on a Discord chat and having a real-time conversation. There's also Pluto VR, which is starting to build this social telecommunications layer on top of all of this, so there are ways that you could essentially bring in the functionality of Discord, but with spatialized avatars. Pluto VR is starting to use both Aardvark and Metachromium in different ways.
Just being able to communicate with people does quite a lot in its own right. So then the question is: what other types of information would be useful to pull into these virtual environments? I think it comes down to really focusing on a specific use case and finding the feature you absolutely want to have. I imagine it's going to be something around Twitch streamers, with applications that enable and unlock a level of interactivity with their audience, because you do feel somewhat cut off from your audience when you're enveloped within VR while you're streaming. Beyond that, you can look at existing social applications like VRChat and ask: would it be useful to see where all of my friends are who are in publicly available instances that I'm able to join? There are a lot of times when a lot of your friends are in private instances and you don't quite know whether they can be joined or not. A list of friends who are in publicly available, open-to-join rooms would be a handy additional layer to filter down to. Again, this starts to get into aspects where some of this information could be available from the VRChat API, so you'd potentially be able to query that type of information. And eventually these kinds of features could be built into the applications themselves. So there's a risk in building applications to augment your existing applications: they may prove to be so useful that they just get built into the application itself.
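The joinable-friends idea above boils down to a simple filter over presence data. This is a sketch under assumptions: the record shape is invented for illustration, and VRChat's real API is unofficial and shaped differently.

```typescript
// Hypothetical presence record; the real VRChat API's responses differ.
interface FriendPresence {
  name: string;
  online: boolean;
  instanceType: "public" | "friends" | "invite-only";
}

/** Keep only friends who are online in publicly joinable instances. */
function joinableFriends(friends: FriendPresence[]): string[] {
  return friends
    .filter((f) => f.online && f.instanceType === "public")
    .map((f) => f.name);
}
```

A gadget panel could render this filtered list on your wrist, giving you the "who can I actually pop in on?" view across whatever app you happen to be in.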
So it's worth really thinking about this type of application for things that are outside of the open web, and then starting to think about these different tiers of functionality and how they could potentially be broken up. The challenge with something like Pluto VR is that they are starting to pull in these multi-app context environments, but you actually have to download all those things and bring them all together. With open-source software, there are ways to do packaging that automatically pulls in all the dependencies, and maybe there'll be some similar kind of package to automatically pull in all these different component parts. As we start to build out the more decentralized version of a lot of these things, distribution and packaging are going to be an issue, just to make sure that you have everything and don't have to download 20 different things to run one single application. So I think there's going to be value in that; it's just a matter of the user interface. The challenge, as Joe said, is that this has been attempted before: breaking up these different things and mashing them together. But the complexity that was introduced when you would, say, embed an Excel spreadsheet within a Word doc became so great that it just made sense to keep those things separated. So maybe spatial computing will find the same thing: it'll just be easier to bundle everything together and have everything that you absolutely need to do a specific task. But with the multi-app type of mindset, there are going to be a lot of ways in which these different component parts interact with each other. Just like on your mobile phone, you can swap out your keyboard: if you don't like your keyboard, you can use a different one.
So there could be a similar way in which there are modular components: if you want to do a very specific task, maybe you want shortcuts and gestures that perform different things, and you're able to implement that in Aardvark. Then there's the question of how Aardvark, which is essentially a React web application, communicates with the native VR application that you're running in that moment. Maybe there are going to be OpenXR extensions that allow them to communicate with each other. But think about how these apps are able to communicate with other apps, as well as with the baseline VR app that you're running, to pull information. That's part of the reason why Joe wants to eventually write a debugger: to be able to visualize how these different entities are talking to each other. When Joe first started this, he thought it was just going to be these layers that you overlay on top of each other. But some of the most interesting aspects have been how these things actually talk to each other, because that level of interaction ends up being part of what's compelling, which was surprising to him. So really thinking about these individual apps and gadgets and how they talk to each other within the context of a spatial computing environment is some of the work that he's going to be experimenting and tinkering with. And right now, when we think about the future of spatial computing, there's a lot of experimentation that can happen. Joe said that anything you could potentially want to do with AR, you can already start to prototype within virtual reality. So if you're interested in some of those human-computer interaction questions, then you could create these different toys. They're not essential things that you need.
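The app-to-app communication described above, the part Joe's debugger would visualize, can be pictured as gadgets publishing named events to a shared bus that other gadgets subscribe to. This is purely an illustrative sketch; Aardvark's actual inter-gadget channel may work quite differently.

```typescript
// Minimal publish/subscribe bus standing in for inter-gadget communication.
type Handler = (payload: unknown) => void;

class GadgetBus {
  private handlers = new Map<string, Handler[]>();

  /** Register a handler for a named topic. */
  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  /** Deliver a payload to every handler subscribed to the topic. */
  publish(topic: string, payload: unknown): void {
    for (const h of this.handlers.get(topic) ?? []) h(payload);
  }
}
```

A debugger could sit on a bus like this and draw an arrow for every `publish` that reaches a subscriber, which is exactly the kind of entity-to-entity conversation that turned out to be the surprising, interesting part.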
They're just things that are kind of fun to play around with. And maybe you'll stumble upon something that is a crucial utility, solving a very specific need that nothing else does. I don't know what that is, and I don't know if it's been discovered yet, but you can at least start to play around with different implementations: try to implement something in Aardvark, or implement it in Metachromium, and see what the differences are between the WebXR and WebGL approach versus Joe's approach, which he said is kind of like his white paper to say that he thinks it's the right one. So I think it's going to take people experimenting and building things in both applications. It's a bit of an open question. He said that there are not a lot of users of Metachromium, and not a lot of users of Aardvark; it's still very early days for both of these approaches. So there could be different ways of implementing both of them, and we'll just have to see which one starts to take off, which one starts to have momentum. It's still very early days, and there's lots of opportunity to experiment, but also potentially to discover some fundamental affordances that are completely obvious in hindsight, but that nobody's really thought of or considered at this point. So yeah, it's exciting to see where this goes. He's trying to bootstrap this whole loop: unlocking new affordances, which I think this actually does with a lot of interesting new ways to create these layers of augmentation within a virtual environment, producing gadgets, fully exploring and implementing all the different potentials of that, and being able to send a URL to someone to distribute it. And so he's got ways of centralizing that, and potentially even
ways to scan what gadgets you're using, clone them, download them yourself, and have people actually using them and providing feedback, both to the gadget creators and back to Joe, requesting the different features that you need to actually do what you want to do within this layer of augmentation. So, that's all that I have for today, and I just wanted to thank you for listening to the Voices of VR podcast. If you enjoy the podcast, then please do spread the word, tell your friends, and consider becoming a member of the Patreon. This is a listener-supported podcast, and so I do rely upon donations from people like yourself in order to continue to bring you this coverage. So you can become a member and donate today at patreon.com slash voicesofvr. Thanks for listening.
