#537: The Future of VR for Google is on the Open Web with WebVR & WebAR

Google’s mission is to organize the world’s information, and so they’ve been long-time advocates for the open web. At Google I/O last week, they announced that they’ll soon be shipping Google Chrome for Android with WebVR, and that they’re going to start putting out experimental builds for WebAR. During the WebVR talk at I/O, Google showed how to write a progressive web application with three.js that could be viewed on a desktop computer, on a mobile phone or tablet, or in a Google Cardboard or Daydream virtual reality headset. Google is pushing the hardest for platform-agnostic WebVR applications on the open web, since mental presence is their core strength.
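
To make that progressive, device-agnostic story concrete, here’s a minimal sketch (not Google’s actual sample code) of a three.js page written against the WebVR 1.1 API: it renders a plain 3D view everywhere and only offers an “Enter VR” button when a headset is actually reported. The scene contents and button handling are illustrative.

```javascript
// Minimal progressive WebVR sketch; assumes three.js is loaded on the page.
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(70, innerWidth / innerHeight, 0.1, 100);
const renderer = new THREE.WebGLRenderer();
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

scene.add(new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshNormalMaterial()
));
camera.position.z = 3;

// Baseline for desktop and mobile: an ordinary mono render loop.
(function renderMono() {
  renderer.render(scene, camera);
  requestAnimationFrame(renderMono);
})();

// Progressive enhancement: only surface VR if the WebVR 1.1 API exists
// and reports at least one connected display (Cardboard, Daydream, etc.).
if (navigator.getVRDisplays) {
  navigator.getVRDisplays().then((displays) => {
    if (displays.length === 0) return;
    const display = displays[0];
    const button = document.createElement('button');
    button.textContent = 'Enter VR';
    // requestPresent must be called from a user gesture.
    button.onclick = () => display.requestPresent([{ source: renderer.domElement }]);
    document.body.appendChild(button);
  });
}
```

A full app would go on to drive its loop with display.requestAnimationFrame and render one view per eye from VRFrameData while presenting; the point here is only that the same page degrades gracefully to a flat 3D view when no headset is present.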

I had a chance to catch up with primary WebVR spec author Brandon Jones at Google I/O to talk about why they’re holding off on shipping WebVR 1.1 and waiting until the latest WebVR “2.0” version is ready. It’ll ship publicly as just “WebVR,” but there has been such major refactoring to account for augmented reality that internally it’s referred to as the 2.0 spec. Mozilla will be shipping the 1.1 WebVR spec in Firefox in August, but Jones says that the Chrome team doesn’t want to have to maintain and support the 1.1 version, which is sure to be quickly deprecated.

Jones and I talk about the differences between WebVR and WebAR, the long process of developing the WebVR API over the last three years, VR’s relation to other exponential technologies, and the philosophy of being an immersive technology platform developer for billions of devices.

LISTEN TO THE VOICES OF VR PODCAST

Check out my previous interviews with Brandon in 2016 and in 2015.

Here’s the “Building Virtual Reality on the Web with WebVR” talk from Google I/O ’17:

The Google Expeditions team is also using A-Frame to rapidly prototype Expeditions experiences.

Subscribe on iTunes

Donate to the Voices of VR Podcast Patreon

Music: Fatality & Summer Trip

Rough Transcript

[00:00:05.412] Kent Bye: The Voices of VR Podcast. My name is Kent Bye, and welcome to the Voices of VR podcast. So Google I/O was last week, and they had all sorts of developers from across the world coming to hear the latest news announced at Google I/O. They announced a new standalone VR headset, as well as showing the latest demos of the Project Tango technology, which is their phone-based augmented reality, and which was super impressive. So I have some other interviews where I'll be diving into some of the other news from Google I/O. But for me, I think the most important and significant announcement that was made is that there's going to be a version of Chrome coming out later this year that will have WebVR implemented within it. So, on today's episode, I feature Brandon Jones, who has been working on WebVR for the past three years, and this is actually a story that I've been tracking since the very beginning of the Voices of VR podcast, when I did an interview with Vladimir Vukićević, as well as Tony Parisi, back in May of 2014. So Brandon started WebVR as a 20% project, and over the years it's grown to the point of getting a lot more momentum. It's got basically every major VR headset manufacturer involved in the process of WebVR, and there's been some huge support. The only problem is that no WebVR browsers have launched yet, nothing official. So if you want to see any WebVR content, you have to download special browsers, and you have to get special tokens in order to display WebVR content. So I talked to Brandon about the state of WebVR, where it's at, and when we can start to expect some of the browsers to ship, as well as the process that's involved in pushing out this new web technology. So I talked to Brandon about WebVR and WebAR, the differences between the two of them, and just the power of the open web as applied to these new immersive computing platforms. So that's what we'll be covering on today's episode of the Voices of VR podcast. But first, a quick word from our sponsor. Today's episode is brought to you by the Voices of VR Patreon campaign. The Voices of VR is a gift to you and the rest of the VR community. It's part of my superpower to go to all of these different events, to have all the different experiences and talk to all the different people, to capture the latest and greatest innovations that are happening in the VR community, and to share it with you so that you can be inspired to build the future that we all want to have with these new immersive technologies. So you can support me on this journey of capturing and sharing all this knowledge by providing your own gift. You can donate today at patreon.com/voicesofvr. So this interview with Brandon happened at the Google I/O conference on Thursday, May 18th, 2017, in Mountain View, California. So with that, let's go ahead and dive right in.

[00:02:57.178] Brandon Jones: I'm Brandon Jones, a software engineer at Google, and I am the primary spec editor on the WebVR spec. It's been a personal project of mine and a few other individuals' for about the last three years, and we just had some big announcements about it on the stage at Google I/O.

[00:03:13.341] Kent Bye: Yeah, we just came from the keynote on the second day of Google I/O, and some big announcements about both WebVR and WebAR coming to Chrome, which is available in Chromium builds right now, but it sounds like in the mainstream for Android later this year. Is that right?

[00:03:30.752] Brandon Jones: Right, so there were a couple of different announcements around the web and Chrome and whatnot. Probably one of the biggest is that we do have a version of Chrome that will work in VR, for Daydream VR, and that'll be coming out later this year on Android. Android is the first step, anyway. Then we also had some announcements around just generally re-emphasizing our support for WebVR as one of the core tenets of the web. We want to be a part of that platform, one of the bits of the foundation of the web. And then we also had some announcements around a relatively new tech called WebAR. And the announcement on stage was that we are putting out an experimental version of Chrome that incorporates some of these AR capabilities. And that's actually following the model that helped us get WebVR off the ground. When I started WebVR with Vladimir Vukićević from Mozilla a few years ago, we both created what we called experimental builds of our respective browsers that we released separate from the normal browser track and just put up on Google Drive or someplace to make them accessible, and allowed people to download those, try them out, and give us feedback about what was working and what didn't. You can actually still get some of those builds today. We've continued to keep them updated, a little bit less so now that we're trying to actually get them into the stable tracks of the browser. But that turned out to be a very successful way for us to really get a lot of feedback, to iterate quickly. The APIs look significantly different than they did when we first started because of the feedback that we got from the community. And so we're trying that again with WebAR, where we want to get things into developers' hands as soon as possible, even though they're not going to be part of the normal browser releases, but then use that feedback to iterate on what eventually will make it into the stable browsers.

[00:05:23.495] Kent Bye: Yeah, so there's a number of questions that I have about this. One is that, for right now, it seems like there is this kind of whitelist that you have in the background, like the origin trials, so that there are websites out there where you have these WebVR Chrome experiments. But yet, if anybody just goes out and starts to put out a WebVR experience, and they try to see it in a Daydream, they may find that it actually doesn't show up, or it's not working. So maybe you could talk a bit about this process of the whitelisting. You have these websites that are experiments where you can start to look at stuff. But at what point is that going to maybe start to be lifted, so that anybody who puts out a WebVR experience could have somebody look at it in either a Daydream or a Google Cardboard headset?

[00:06:05.377] Brandon Jones: Right, so describing the origin trial process as a whitelist is maybe not completely accurate, but there are some reasonable parallels. The biggest difference is that it's not Google going out and deciding who's allowed to have the API access and who's not. Developers can actually sign up for access to an origin trial token, and we give them out pretty much indiscriminately; we're not doing tests about what the website content is or anything like that. You request one and you'll almost certainly get it. And then by embedding that in the header of your site, you're able to activate this WebVR content. And that just means that the API shows up by default when people visit your origin. Now, this is not something that's unique to WebVR. It's actually a process that Chrome has been using for several other APIs, such as WebUSB, Web Bluetooth, and a couple of others that escape me right now. But the idea is, when we have an API that we don't feel is completely finalized, but we want to get it into a larger number of developers' hands, then very specifically we want to allow those developers to get as much user feedback as possible. Obviously, if you go out to your users and you say, you can use this feature as long as you download the special build and then go into about:flags and then flip this checkbox and then restart the browser, users don't do this unless they're, like, super dedicated. So origin trials allow developers to opt into an API that they know is unstable, provide some content to users, and then give us feedback about what is working out for their users and what isn't. So it's a system that several features have gone through; WebVR is just the latest in that. And we would like to get out of that origin trial mode as fast as possible, but we're not going to do that until we know that the spec is final. Now, the WebVR spec has been undergoing some pretty significant changes behind the scenes. We've got Microsoft, Mozilla, Oculus, Samsung, and Google all working on it right now. And through the process of these origin trials and several others, we've identified areas of the spec that could do a lot better. And so we started reworking that, and it kind of cascaded. It got to be a bit bigger of a refactor than we were initially anticipating, but we all feel that the spec is going to be much better as a result of it. It's going to be much more future-proof, and it's going to provide a much better foundation for the immersive web going forward. So I don't have an exact timeline on when that's going to wrap up. For my sanity, I hope it's relatively soon. But once we have what we feel is more or less the final version of that spec, and we've been able to do one or two more rounds of this origin trial with that version of the spec against developers with users, that's when we'll be able to comfortably lift the origin trial and say, you know, the spec is final, we're comfortable turning this on for everybody by default. So it's not quite as close as I would personally like, but it's definitely within sight, and we will get there as fast as we possibly can.
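
As a rough illustration of those mechanics, opting a site into the WebVR origin trial looks something like the following. The token value is a placeholder (each token is issued per origin and expires on a fixed date), and serving it as an Origin-Trial HTTP response header works as well as a meta tag.

```javascript
// Sketch: injecting an origin trial token via a meta tag. Equivalent to
// serving the HTTP response header "Origin-Trial: <token>".
const meta = document.createElement('meta');
meta.httpEquiv = 'origin-trial';
meta.content = 'YOUR_ORIGIN_TRIAL_TOKEN'; // placeholder, issued per origin
document.head.appendChild(meta);

// With a valid token, the API is exposed by default for visitors to this
// origin: no special builds, no about:flags.
if (navigator.getVRDisplays) {
  console.log('WebVR is available under the origin trial');
}
```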

[00:09:12.934] Kent Bye: Yeah, because I remember, it might have been at GDC last year, 2016, where we chatted, and I think that's when Microsoft had come in at the 11th hour and said, hey, we want to get in on this WebVR as well. And I think that likely opened up a lot of the use cases for augmented reality at that point. And so it seemed at that point that the 1.0 or 1.1 version of WebVR was ready to maybe start to be pushed out. But yet, with all these changes and refactoring, was that part of it? Was it looking at how you're going to accommodate augmented reality? And I'm just curious, as we're moving forward, if we're still on that 1.1 track, or if this refactor, this completely new 2.0 version, is what you're going to be pushing out and trying to launch with.

[00:09:56.752] Brandon Jones: So we do refer to it internally as 2.0, but that's a bit of a misnomer, because when we go to launch, we're not going to call it a 2.0 product. The public will never see a difference between WebVR 1 and WebVR 2 or anything like that. It's just going to be WebVR. Because once we get to a point where that's stabilized, it's just the basic feature set, and we'll start building extensions off of that. We refer to it as 2.0 internal to the group just so that we know what we're talking about and can disambiguate between the different variants of the API that we're working with. Now, your recollection about Microsoft coming in kind of at the 11th hour for what was then the 1.0 API is correct, but it probably does a disservice to Microsoft to describe it that way. They've actually been a great partner to work with, and one of their primary concerns has been making sure that the API is not necessarily catering to experiences like HoloLens or other AR-based experiences right now, but is forward-looking enough that those can be included gracefully in the future. So when we're talking about WebAR up on stage here, a large part of what we're doing is making sure that the WebVR spec is set up in such a way that WebAR can elegantly sit alongside it, and they can interoperate really well and complement each other rather than having to be these kind of separate pillars of the web. And of course, you know, it's not just Microsoft's hardware that this impacts. It's any sort of Tango devices or anything like that that use inside-out tracking. The standalone devices that we've been talking about here at I/O would also benefit from some of the changes that we've been making. You know, they're really good changes for just making sure that we're forward-looking and that this version of the API is not going to be obsolete in the next three generations of hardware. At least as far as we can tell. Maybe somebody's going to come up with some completely new concept for how this all works and everything that we've done will be for naught. But as far as we can tell, this is going to be fairly future-proof.

[00:12:00.040] Kent Bye: Well, just from standard software development practice, usually a whole-number update implies breaking APIs. And I could understand from your perspective that you wouldn't want to launch with a 1.1 version of WebVR and then soon after that completely break all the APIs. So that's one argument. The other argument is to go ahead and push it out, to have 1.1 out there if it's relatively stable, and then have everybody kind of recognize, hey, we'll just have to update everything. Because I think part of what's holding back WebVR is that there are no widely accessible browsers where it's very easy to kind of jump in, aside from maybe Samsung's browser or the PC-based VR browsers. But in terms of the mobile browser, especially with Daydream and Android devices, people want to be able to quickly slot their phone into their mobile headset without it kicking into the operating system, like on the Gear VR, for example; you just want to be able to go to a website and jump in. So I guess the question I have is, you know, why not push out a 1.1, and why wait for the 2.0?

[00:13:05.241] Brandon Jones: It is an excellent question, and I think that if we were talking about any environment other than the web, then the path that you're talking about, where we push out a 1.1 and then push out a 2.0 a little while later and just accept that the community is going to have to update to work with that, would probably be the correct route to go. This is what we've seen elsewhere: Oculus did this with their dev kits, Valve did, and Daydream has to a certain degree. The early Daydream SDKs and APIs were different than what we have now, and they were breaking changes. You just had to update your stuff. And that can work in the native world. But we're talking about the web. And the web is fairly unique in that it's been a very consistent platform almost since its inception. There are bits of the web that are still available today that came from the very first iterations of the browser. In some cases, they simply don't make sense anymore in the modern web, but they have to stick around, because the volume of web pages that we have out there is so massive that taking out even the oldest, crustiest, most deprecated APIs will break tens or hundreds of thousands or millions of pages. And as a platform, we simply can't afford that. So with newer APIs like WebVR, we're very cognizant of adding to the technical debt of the web. And if we launch something with a 1.0 or 1.1 API that we know from the get-go we intend to replace, what we've done is we've added a new chunk of technical debt to the web that is very, very difficult to then extract and remove. It's not impossible, we've done it before, but it is always painful when we have to remove something from the web that has been available by default up to that point, no matter what the time period. That's why we're being a little overly cautious. Believe me, it would be much better for my heart if we could just release something and get on with it. Because I totally sympathize with the developers that just want to use it, and they want it to be stable, and they want to move along. There's a fair number of them out there that will happily say, yeah, let me do this with 1.1, and then when you come out with 2.0, we'll update and move along. But there's also a lot of content on the web that people put out there and post up to GitHub and say, hey, look, this is cool, and then never, ever touch again. I'm as guilty of this as anybody. And we just want to discourage, especially as part of the initial push, having this long tail of content that doesn't work, where it's completely opaque to the user as to why. They go to one site that says, hey, I support WebVR, but it just blows apart on them because they're using a 2.0-compatible browser and this is 1.1 content. And so that's what the origin trial system is meant to encourage, by having developers opt into it, and then giving them a strict timeline after which the token will expire. We're telling them in no uncertain terms: you can use this, you can build content with it, you can give us feedback on it, but it will unequivocally stop working, and it will stop working at this date. And that gives us a control valve, so that if we have any of this deprecated content that gets left behind, we've really done everything in our power to discourage that. Because once we get to the point where it is done, it's final, it's a core part of the web, we want to be able to go out to users and in no uncertain terms say: it works, it just works. And you don't have to worry about all these weird technical terms that don't make any sense to the average user anyway.

[00:17:02.970] Kent Bye: Yeah, as a journalist, I feel like I'm often on the front lines of some of the semantic battles within the VR community as I go to these different companies and hear the different proprietary words they start to use. Just as an example, Microsoft calls everything mixed reality, and has mixed reality headsets when they really mean VR headsets, and then they have a mixed reality controller that's VR-only, which you would assume would be for HoloLens as well (it likely will be eventually), but then I get these messages from their PR people saying, oh, no, no, these mixed reality controllers are not for HoloLens and not for AR. And so you have this mixed reality spectrum, which comes from academia. On one end, you have augmented reality, which is adding virtual worlds into the real world, and on the other you have virtual reality, which is completely virtual. The reason why I'm bringing all this up is because in this keynote, and I very much appreciate this, Google is sticking with the commonly used terms of virtual reality and augmented reality, which is what everybody understands, both from academia and the wider community. Instead of calling it the mixed reality spectrum, Clay said, okay, let's call it the immersive computing spectrum. Okay, that's fine. But the question here that I'm getting at is, you have WebVR, and now you have WebAR. Are those two different APIs, or is it the same API? And is it actually kind of WebXR, where I think some people are using the X to imply that it's the full spectrum?

[00:18:28.394] Brandon Jones: Right, so let me start out by saying I have a good working relationship with individuals at Microsoft. I would like to continue having a good working relationship with them. So I will not be offering comments today on their naming scheme. Nor can I particularly comment on Google's choice of language to describe these, because I'm not in our marketing department. I don't decide on these things. But getting back to WebAR versus WebVR, or WebXR as a more umbrella term, it is an interesting discussion. Now, we haven't completely clarified where the line between these APIs is, or if there is a line. At the moment, we're treating them as separate but highly related things. I have a gut feeling that that's where we'll continue to go. So they are two different APIs, then? At the moment, they're two different APIs. Now, that's a little misleading because, last I checked anyway (I've had a lot of non-work-related stuff going on recently, so I'm a little out of date with the absolute latest and greatest from the rest of the team), WebAR was an API that built on top of WebVR. It's still using some of the same mechanisms for, say, positional tracking and whatnot, but then layering on additional functionality for things like room tracking, depth maps, and camera access. I can point you at the right individuals to speak to if you'd like more clarification on that, because I'm probably not the right one. But we will continue to evaluate, and it may make sense in the future to say, well, you know, this is just a minor extension to WebVR, so let's just make it extra functions on the WebVR spec, or it might make sense to continue branding it as something different. It's not completely clear right now. What is clear to me is that we've had several pushes for people to say, we shouldn't be calling this WebVR, we should be calling it WebXR or WebMR or something like that. And certainly there's an argument for that. I mean, we have OpenXR in development right now at Khronos, and they don't want to limit it to just VR. But my opinion has been from the beginning: we're definitely developing for VR devices right now. We're not exposing functionality for, you know, see-through devices or whatnot, although we're not preventing those from being used in the future. And if we get 10 years down the road and developers are cursing my name because "WebVR" is this completely inaccurate name, because it wasn't forward-looking enough and now everything's a spectrum and whatnot, I won't care, because it means people are still using the API. If the worst thing you can say about it is that the V just doesn't make sense anymore, I will be a happy man.
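
Since neither spec was final at the time, the sketch below is conceptual only: the pose-handling calls are real WebVR 1.1, while the hitTest extension and the placeObjectAt helper are hypothetical stand-ins for the kind of AR capability that would layer on top of the VR core.

```javascript
navigator.getVRDisplays().then((displays) => {
  if (displays.length === 0) return;
  const display = displays[0];

  // Core WebVR (real 1.1 API): pose tracking works the same on any device.
  const frameData = new VRFrameData();
  display.getFrameData(frameData);

  // Hypothetical AR layer: extra capabilities that would only exist on
  // see-through or Tango-class hardware, sitting alongside the VR core.
  if (display.hitTest) {
    const hits = display.hitTest(0.5, 0.5); // cast a ray through screen center
    if (hits.length > 0) placeObjectAt(hits[0]); // placeObjectAt: illustrative
  }
});
```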

[00:21:22.504] Kent Bye: Yeah, from my perspective, I feel kind of torn by the XR term, just because X could mean that it's just an abstracted way of saying it's a full spectrum from virtual to augmented reality. But anyway, these are semantic battles, and when it comes down to it, it comes down to having a common language across the entire industry, so that when people say a word, you know what they meant. Just like WebVR versus WebAR, or if it were WebXR. With WebXR, I would kind of assume that all the APIs are built in, that you could do anything across the full spectrum, including augmented reality. But saying that it's split kind of implies this confusion to me: okay, if I do WebVR, once these headsets go another five years and everybody's kind of having this blended reality, am I going to be able to use that same API to do scanning of a room and bring objects into this virtual experience?

[00:22:18.262] Brandon Jones: Yeah, those are good points, and it's very difficult to tell the future, right? But I think that as developers, we tend to be used to using different libraries and APIs that build on top of one another. I'm trying to think of what a good analog would be for it. In computing, we have file I/O APIs, and then we have higher-level networking APIs that oftentimes are built on top of those file I/O APIs, and then we have browsers that are built on top of the networking APIs, but we're not terribly concerned about making sure that everybody understands that a browser is a file I/O interaction system. I think that's something we'll see reflected again in virtual reality hardware. Yeah, there are going to be different modalities; there are going to be a lot of different things that this headset does and this one doesn't. But from the developer's point of view, there's the layering: I have this set of APIs that provide the basic virtual reality core, and then I have this other set of APIs that allow me to layer AR capabilities on top of that, and then I have this other set of APIs that allow me to layer on, I don't know, eye tracking, or full-body input, or neural interfacing, or wherever we end up in the future. Having that be a stack of related APIs, rather than concerning ourselves right now, today, when most of that is theoretical, with choosing a term that encompasses all of it to avoid confusion 10 years down the line. I don't really see the value in that, because I think once we get 10 years down the line, the idea of how these things all interrelate is going to be so baked into the developer consciousness that it's really not going to matter too much. And the distinction certainly is valuable today, because the devices that we have today are clearly virtual reality, or clearly augmented reality, or clearly virtual reality with a pass-through camera, however you decide to call these things, or tablet AR or whatever. There's such a clear division between the different device capabilities that it's probably more confusing than anything else to try and just wave our hand and say they're all the same. There's commonality between all of them, certainly. We want to address all those commonalities with WebVR as much as possible. But to try and pretend that we're building something right now that will handle them all without any modification between them is disingenuous. We want to handle the commonalities between all of them. And VR is kind of the term that we've settled on to represent that core, because if you can handle the VR feature set, then everything else is kind of building on top of that.

[00:25:13.450] Kent Bye: And I think that makes perfect sense. And I'm wondering if you could comment on the journey you've had over the last three years, going back to this being a 20% project that you were working on one day a week, starting to integrate some of these VR APIs into the browser. And now, all of a sudden, it's in the keynote here at Google I/O, and it's got enough momentum that it looks like it's going to happen, and it's going to start to actually drive a lot of these immersive experiences for the overall strategy for Google.

[00:25:45.518] Brandon Jones: So my first thought on that is that as developers, it's often very easy to mistake the first proof of concept of a product for the endpoint. I certainly was in that mindset a little bit when I was first putting WebVR together. I was just going, oh, well, I can work these APIs in and shove it into the browser, and yeah, we'll have to update it a little bit as the native APIs change, but that should be most of the work right there. And that was complete folly on my part, because the distance between that very first "it works" and "it just works" is massive. It's something that as developers we are horrible at predicting. We can predict the "it works" pretty well. We can look at it and say, I can get something that is putting stuff on screen in two weeks. Two weeks tends to be the magical developer number, right? And then we kind of think, yeah, then we'll have to clean it up and add some comments and maybe some documentation, and then we're done. But no. Getting from that point to the endpoint, where it can be released to a billion browsers and not break everything, and appropriately fall to the background when those VR capabilities aren't there, but surface itself when they are and invite people into that realm, and stay stable and provide a good frame rate and all of the rest of the stuff that goes into it, and have testing behind it and everything. It is a massive undertaking. And to say that I had no idea what I was getting into three years ago is definitely an understatement. But I'm very glad that I've been along for the journey. And it is frustrating to watch the years tick by and realize that we still haven't released, and to be so anxious to allow people to actually use this in the real world. But it is also amazing to see the teams that are behind this. It's definitely not just me. There are full teams behind this across multiple browsers, all of which care about making this an excellent experience. Nobody wants to put out something that people look at on the web and go, okay, the web just can't do this. We want to make sure that this is part of the fundamentals of VR moving forward. We're talking about a new compute platform here. The web kind of stumbled when it came to mobile. It took the web a little while to catch up to the reality that mobile was eating the desktop world. We don't want to make the same mistakes with VR. We want to be there from the get-go as much as possible. We want to be part of the foundation of what VR is. And we want to be there at the point where VR has its iPhone moment and really takes off, which we're still not at yet. We have amazing technology that's out, and we have even more amazing technology that's coming, but I still don't think we've had that moment where the general public looks at a product and goes, I need that. When we get to that moment, the web will be there.

[00:29:00.873] Kent Bye: Well, I'm here at Google I/O, and I had a chance to try out maybe five or six different experiments. One that I absolutely loved was Dance Tonite. I think that's going to be one of the things that really drives a lot of engagement and interaction with people, where you're able to record multiple versions of yourself, and it's like a music video. It's just fun, and an amazing sort of collaborative project. And then there were other experiences where I was going to the WebVR Experiments portal and sort of jumping in and out, and I just wanted to be able to go to a place and have all the experiences without having to jump in and out and switch these contexts. And I know that there are various security issues around being completely immersed and going from website to website. But is part of the WebVR 2.0 refactor figuring out how to do portals and links between different websites?

[00:29:52.426] Brandon Jones: Yes and no. It's certainly something that we're considering. It may not be something that's part of that core spec, and it's a little difficult to describe, because what you're really talking about at that point is more semantic information. What does this sphere represent in my scene? What does that portal represent? On the web, we generally have a very semantic breakdown of the document that you're looking at. We have very specific ideas of what a link is, what an image is, what a scrollable region is, and whatnot. Because VR is primarily 3D content, and because all of the major 3D APIs (OpenGL, Vulkan, all these things) are imperative rather than declarative, you're issuing commands that are fairly opaque, and then the GPU is just throwing pixels on the screen in relation to that. There aren't natural ways to associate declarative metadata with those elements. It's an open question as to whether or not we should try enforcing some of that as part of the initial API, partially because we just don't know what that should look like. I mean, I'm still not totally clear on what a link in VR looks like, and I don't know of anybody else who is either. There's been some really interesting experimentation around that: things that look like portals, things like doorways that you actually step through, those kinds of things. And they're really cool. But there's this question as to whether or not that's something that we want to push to propagate through the entire web. Maybe different sites have a different concept of what a link is, and maybe something's more appropriate in the context of your experience than it is in mine. And so out of an abundance of caution, it may be something that we hold back on for a little bit, in favor of saying we're going to let the community play with this, see where the de facto standards arise, and see what works and what doesn't, because it's very difficult for us. I like to pretend that I am one sometimes, but I'm definitely not a content creator. I'm a platform guy. I'm defining the core APIs. I'm not out there working in Unity or Unreal, or even three.js most of the time. And so it would be disingenuous of me to step up and say, I know how this should work in the context of all of your applications. It's much better for us to provide as much of the fundamentals as we know everybody will need, and that we know work generally the same across all platforms, and then let the community experiment with how the content needs to interact. Now, we are talking about different ways to ensure that you can call a function and say, OK, I want you to link over to this page now, and have some browser-assisted transition between the two and whatnot. But how that's represented visually and how you interact with it is going to be a little bit more up to the content creators initially. There is a strong argument, after we've primed the pump with a lot of community content, for looking at it and saying, oh, we're seeing these patterns emerge, this is a really good interaction model, and so we want to canonize it and bake it into the platform. But we have to see what those interaction models are first.
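
One mechanism that did exist in the WebVR 1.1 drafts is the browser-assisted handoff itself: a vrdisplayactivate event lets a destination page resume presenting without a fresh user gesture, while what the link looks like in-world stays entirely up to the content. A sketch, with canvas standing in for the page's WebGL canvas:

```javascript
// Destination page: if the user navigated here while already in the
// headset, the browser fires vrdisplayactivate, and requestPresent is
// permitted without a new user gesture.
window.addEventListener('vrdisplayactivate', (event) => {
  event.display.requestPresent([{ source: canvas }]);
});

// Origin page: however the "link" is drawn (a portal, a doorway), selecting
// it just navigates; the browser handles keeping the user presenting.
function onPortalSelected(url) {
  window.location.href = url;
}
```

Whether presentation actually survives the hop depended on the browser's implementation, which is exactly the kind of detail still being worked out at the time of this conversation.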

[00:33:14.815] Kent Bye: Yeah, it's almost like on a website you have a CSS style sheet that allows you to change the look and feel of it, and you need an equivalent style for how a portal looks in this world so that it doesn't break presence.

[00:33:27.923] Brandon Jones: Well, and it goes beyond just the visual styling. Right now we are in a very unfortunate spot where, across the board, both native and WebVR applications don't really have too much in the way of accessibility. You have some things that can lead to good accessibility, such as positional audio. That can be a huge boon for people with visual impairments, but then they're probably not making much use of the headset optics themselves and whatnot. But yeah, if you're in this purely imperative world where you're just issuing opaque commands and pixels are showing up on the screen, and the only thing that's really interpreting them is the human brain, you lose out on the ability to provide a lot of the accessibility that the web actually has baked into it. We definitely want to get to a point where we can expose more of that accessibility functionality to the VR realm. It's important. If we intend for this to be a platform that serves billions rather than millions, we need good accessibility controls. And I know I'm not the only one thinking about this. I know that there are people on all of the platforms that are talking about it, both at the web and the native level. But I have not personally gotten a clear idea as to how that should be exposed and how that should work. It's an important thing, though, and we need to go down that route. And that kind of goes hand in hand with things like, well, how are links styled? How do these common interactions work? What does a default movie player look like? All these different things are tied up in one another, and it's hard to get there without taking the first step of at least exposing this core API to start allowing people to experiment.

[00:35:14.290] Kent Bye: Do you think that over the long run, it's going to move away from that imperative opaqueness toward something that's a little bit more declarative? I can't help but think of the early days of the web with Tony Parisi and Mark Pesce, who created VRML, which was trying to be that declarative model. But there's A-Frame and other things like that that seem to be moving more towards a declarative model of putting together a scene, something that can maybe be somewhat equivalent to the DOM that we have on the web, something that could be searched and that has some sort of structure so that you could add those levels of accessibility.

[00:35:48.307] Brandon Jones: Right. A-Frame is actually a great example of the type of thing that we're hoping to encourage near term, which is a model that takes all of this imperative, base-level API and adds a declarative, semantic level on top of it. And that's not something that we have to completely define and bake into the browser core. A-Frame's moving 100 miles per hour. They're constantly adding new features. They're able to iterate really quickly. And they could not do that as effectively as they do today if they had to iterate on each one of those features as part of a core browser API. So, by giving them just enough to build on top of, and letting them go off and define their own declarative, semantic concept for how VR should be exposed, we're allowing them to do much more of that experimentation that I was talking about previously, where they get to iterate with the community and say: How should this work? How should we be describing these things? What should a link look like? All of that stuff. And then the browsers can crib from that afterwards and say, okay, A-Frame and React VR and all these different systems are doing really well. What has succeeded in that model? What hasn't? And what parts make sense to bring back into the core of the web? You see that a little bit right now in Web Components, where for the longest time the best that you could do was Angular or stuff like that. And then we decided, no, no, no, we really need a better component model for the web. And so we have ES6 modules and Web Components and all these different things that got pulled in. They are not full-scale replacements for something like Angular or React or Ember or whatever framework you want to use, but they are fundamental building blocks that those libraries can then build on to be more efficient, more expressive, and whatnot. And that's very much the general direction that I think VR capabilities should go, certainly at least in the near term, over the next few years.
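
As a small illustration of that layering, an A-Frame scene is just DOM, so the semantics that the raw WebGL path loses come back for free. You'd normally write these as HTML tags; here they're built in JavaScript, assuming the aframe script is already loaded on the page.

```javascript
// Build a declarative A-Frame scene out of DOM elements.
const scene = document.createElement('a-scene');

const box = document.createElement('a-box');
box.setAttribute('position', '-1 0.5 -3');
box.setAttribute('color', '#4CC3D9');
scene.appendChild(box);

const sky = document.createElement('a-sky');
sky.setAttribute('color', '#ECECEC');
scene.appendChild(sky);

document.body.appendChild(scene);

// Because the scene is DOM, it can be queried like any other document:
// the semantic layer the imperative WebGL commands don't provide.
console.log(document.querySelectorAll('a-box').length); // -> 1
```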

[00:38:01.043] Kent Bye: I also have a question about progressive streaming of content. I have a number of different VR experiences that I'll use to give people a metaphoric understanding. When I'm in VRChat and I go through a portal, I have this little loading screen, such that if I've never been to that world before, then I have to wait until all the assets come down. But when I go into something like Google Earth VR, I'll get a kind of low-resolution shape of the architecture of a space, and then as I look around and wait, more and more defined geometry comes in. So you have this streaming of the assets that are coming down. And I know that glTF is kind of the emerging standard for bringing in 3D content. I'm just curious if that's a dimension that's going to happen at the glTF level, having these 3D assets able to be progressively streamed, such that if you go to a website, you get this high-level architecture of it first, and then the images and all the text kind of pop in. Are we going to be able to have that similar type of user experience within WebVR?

[00:39:07.550] Brandon Jones: Streaming is hard, especially if you're talking about something that has a little bit more unpredictable structure. If you're talking about something like audio or video, these can all be streamed in relatively the same way. And of course, there are slightly different codecs that all have a slightly different opinion about how this should work; some are more efficient for some architectures than others and whatnot. But generally, you can say, okay, well, we've got this audio stream, and this is how we can buffer it up and stream it and look ahead and whatnot, and you can do a pretty good job. When you start to talk about something that's more complex and unstructured, like a full 3D world, the ideas as to what the appropriate model is for streaming become a lot less clear. Now, you do have more basic ideas as to how this should work at an engine level. Unity has kind of built-in methods for how it streams in geometry. Unreal Engine has built-in ideas for how it does that. They each have different artifacts and different benefits and costs and whatnot. And so yeah, some of this can be deferred to the A-Frame or the glTF level, saying, if you're using these tools, then this is the de facto way that streaming will happen. But it's very difficult to really define that at a platform level, certainly as early on as this. Because if I have, say, a racing game where I want to do driving, I'm going to have a very, very different streaming model than if I have a more open-world environment, because racing has the benefit of saying, well, I generally know that I'm going to be going forward down this set track. It starts to look a little bit more like a movie, actually, because you're just following the track, and you can stream things in because you know where the user is going to be 30 seconds from now and a minute from now and so on and so forth. But if I have an open world, like if I was trying to build the grounds of Google I/O and stream those in, I have to handle it in a very different way, because I could wander anywhere I want, and I can't necessarily say, oh, I know that the user is going to the amphitheater next, because I could choose to turn around and go to the concession stand instead. Because there's so much variability across the different experiences, it's difficult to come up with system-level primitives that are appropriate for something like streaming. But once again, there's an opportunity to look at all the tools that people are building, look at the environments that they're using, and say, hey, this has emerged as a commonality between a whole bunch of them, and it would be beneficial for these reasons if we baked that part of it down into the platform. And maybe that doesn't look like, well, point me at a URL and I'm going to stream in your environment. Maybe that looks a little bit more like texture loading mechanisms that can broadly use the same streaming mechanism across all sorts of environments, and that gets baked into the platform because there are good technical reasons for doing so. But once again, and I feel like this is becoming a theme, it's really going to be best for us to watch and see where the community goes, what kind of content people are trying to build, and where the pain points are. I should emphasize that: we really don't want to try and optimize the web too early.
We don't want to solve a problem that nobody will ever actually have. There have been some instances of that on the web before, and they just don't turn out well, or we have to change things up significantly to get to a point where it's solving an actual problem that people really have. So a lot of times it's best to see the type of content that people are creating, talk to developers, and ask: What are the pain points? What is preventing you from delivering the experience that you want? That's where you get the really valuable data, as in: if we had this primitive in the platform, we could do so much better.
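
For a flavor of the placeholder-then-refine pattern Kent describes in Google Earth VR, here's a hand-rolled three.js sketch. It assumes a scene object already exists, and loadModel is a hypothetical helper standing in for whatever glTF loader or engine-level streamer an app actually uses.

```javascript
// Show cheap proxy geometry on the very first frame...
let building = new THREE.Mesh(
  new THREE.BoxGeometry(10, 20, 10),
  new THREE.MeshBasicMaterial({ color: 0x888888, wireframe: true })
);
building.position.set(0, 10, -30);
scene.add(building);

// ...then stream the detailed asset in the background and swap it in.
// loadModel is illustrative; assume it resolves with a THREE.Object3D.
loadModel('building.gltf').then((detailed) => {
  detailed.position.copy(building.position);
  scene.remove(building);
  scene.add(detailed);
  building = detailed;
});
```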

[00:43:06.158] Kent Bye: So because you're a platform guy, you're really thinking about architecting this future of WebVR and WebAR at a very low level. And I know that on the web, there are a lot of other open standards that are going to be used in conjunction. And so I guess the higher-level question is whether there are other exponential technologies out there, whether it's the blockchain and cryptocurrencies, applying that to identity with this concept of self-sovereign identity, or the distributed web, so that we may have file systems that are not centralized but are more peer-to-peer. Are those completely separate from what you're looking at with WebVR? Or are there other exponential technologies that are starting to touch the API level of WebVR and WebAR?

[00:43:54.350] Brandon Jones: Right. So probably the best question to ask there from a platform level is: is it necessary for these two APIs to have knowledge of one another? So you're talking about different identity APIs, which I can't comment too much on directly. That's not an area where I do much research. People say blockchain and my eyes start to roll back in my head. I will fully admit that's an area of tech that I don't get, and it's probably mostly a matter of more reading on my part. But the relevant question is: is there something valuable that we can deliver to the community and to app developers by creating a close linkage of WebVR and blockchain or identity or whatnot? In most cases, the answer is no. Maybe there are some good natural interaction points, but for the most part, identity has tons of really good uses that have nothing to do with VR, and VR has lots of really good uses that don't need any concept of identity. And even when you're creating an experience that uses the two of them in tandem, you can probably validate the person's identity and handle that separately from your handling of the VR scene in general. Then there are just some application decisions about how to pull this information over and how to present it to the user. A really good example here is the Web Audio API. Audio has very clear, very interesting relationships to VR. And the Web Audio API specifically has functionality within it to do HRTF and localized audio and whatnot that existed before WebVR ever came around. So when WebVR came around, some of the people working on audio (Hongchan Choi at Google, who I've worked with a little bit on this) looked at it and said, hey, here's a great new application for our technology. So they took WebVR and plugged it into one of their experiences, took the head position and poses that were coming out of there, fed them into the existing systems for Web Audio, and got head-localized audio that tracked your headset's movement. This didn't require any changes on the audio API side, because it had the mechanism in there. It said, okay, give me this information that tells me where your head is, and I'll localize the sound for you. WebVR didn't have to know anything about the Web Audio API, because all it does is surface head poses and handle some of the rendering side. But because the appropriate endpoints existed at both ends, not specialized for either case, we were able to wire up the information between them and create something that was new and exciting. And if you can achieve that model throughout your platform over and over and over again, that's a great place to be. It would be a tragedy if Web Audio had created an HRTF model that explicitly took in a WebVR headset object and that was the only way it worked, because then it wouldn't be useful for anything else. We don't want to do that. As much as we can possibly avoid it, we want to try and get away from there. The only times that you really want to have that sort of explicit linking between two APIs is, one, if it's technically infeasible to do it any other way, or two, if there are security concerns involved, or maybe performance concerns or something like that. Otherwise, it's best to have these individual API pillars that can talk to each other using a common language but aren't strongly dependent on one another.
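
Here's a rough sketch of that wiring as it looked under WebVR 1.1 and the Web Audio API of the era (listener.setPosition has since been superseded by AudioParam-based properties, but the shape of the integration is the point): the app reads the head pose from one API and feeds it to the other, and neither spec references the other.

```javascript
const audioCtx = new AudioContext();
const frameData = new VRFrameData();

function updateListener(display) {
  display.getFrameData(frameData);
  const pose = frameData.pose;
  if (pose.position) {
    // WebVR surfaces a head position; Web Audio consumes a listener
    // position. Neither API knows the other exists; the app is the glue.
    audioCtx.listener.setPosition(
      pose.position[0], pose.position[1], pose.position[2]);
  }
  // A complete version would also derive forward/up vectors from
  // pose.orientation (a quaternion) and call listener.setOrientation.
  display.requestAnimationFrame(() => updateListener(display));
}

navigator.getVRDisplays().then((displays) => {
  if (displays.length > 0) updateListener(displays[0]);
});
```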

[00:47:50.259] Kent Bye: So what do you want to experience in either WebVR or WebAR?

[00:47:54.421] Brandon Jones: Oh my goodness. I don't know. I want to experience the unbridled creativity of the content creators out there. And that's probably the number one reason why I find this really interesting, why I'm personally pursuing it: because if I go to an app store for VR, or really anything, but especially for VR, I have a pretty good idea of what I'm going to find there. There are going to be some games, and that's great. I love gaming. I want to play a lot of VR games. I have fun with that. There are going to be some really interesting use cases like Google Earth or some stuff like that. There are going to be maybe some productivity things. We're low on those right now, but I think we'll get there in the future. But what I definitely know I'm not going to see are the really weird, one-off, Isaac Cohen-type experiments, the things where I'm going to stick my head in for 20 seconds and go, well, that was weird, and then walk away and never think about again. All the little things where I would look at it in any other environment and say, is this worth installing an app on my phone or my desktop system for? No, because I know I'm going to spend maybe a minute there and then walk away and never touch it again. But there is so much interesting stuff that you could do with that. I mean, think about the weird stuff that ends up on YouTube that's just fascinating. And then think about if you had to install a player application for every single channel on YouTube in order to consume that content. You would never see most of it. You'd pick like two or three channels that you care about a lot, and you'd have a very predictable, safe stream of content coming your way. But you wouldn't have that opportunity for somebody to send you a link and go, oh my gosh, you have got to see this hippo. This thing that the hippo does is just, wow. And I want to see the VR hippos. I want to see how creative people can be when they're not limited by asking: what's monetizable? What's installable? What can I convince users to go through that process for? I think it's going to be weird, probably a little scary. I am definitely going to rue the day when the 4chans and the Reddits of the world discover WebVR. I know that this is coming. I'm not looking forward to it. But there is so much awesome creativity on the web, by people who don't really have better ways to get their content out to people, by people who, for whatever reason, don't really have access to those store models or whatnot. And I want to enable that. I want to see what they come up with. And I have no idea what it's going to be. And that's exciting.

[00:50:58.289] Kent Bye: Do you have any favorite WebVR experiences that are out there right now?

[00:51:02.500] Brandon Jones: So my favorites at the moment are the ones that exemplify what WebVR should be. I really love Matterport, and it's kind of a boring, maybe "corporate" use of it, but the idea is that this is something that provides real value to people. I'm going real estate shopping, I'm looking through these houses, and I can view the photos as the baseline. I can load up the Matterport 3D model. And then I can say, you know, I'm really interested in this house, and I happen to have a headset, so hey, look, I can click a button and actually do a virtual reality walkthrough. This has real value (we're going to be talking about this a little bit later today at our session on stage six), but it's also something that they have a very hard time convincing people is worth installing an app for, because I'm not going to look at most houses more than once. Hopefully, I'm going to go through a whole bunch of houses, find the one that I like, and then ideally, I'm just going to live there, so I don't need the VR walkthrough of it anymore. So it exemplifies why WebVR works and what it's good and valuable for beyond the app store model. I also love Sketchfab, because once again, they've got so much content. They've got millions of models, and a lot of them are the type of stuff that would never make sense to put in a different environment. They've got weird stuff in there. They've got random drone scans. They've got crappy my-first-Maya-model stuff. They've got some adult content on there that would raise the eyebrows of your average app store. And yet on the web, if you make all of this accessible, I might not spend more than 30 seconds looking at any individual model, but they still have value. So for me to be able to pop between the different models and look at them in this new environment, really lean in and appreciate the detail that the artist put into it, and then walk away from it with no residue left on my device, that's really valuable. And once again, it exemplifies what WebVR should be good at. So those are kind of my favorite experiences to date. I've also really enjoyed a lot of the artistic stuff that I've seen. The LCD Soundsystem one: I love that experience. The whole concept behind it, where you can participate in so many different ways, whether it's just watching from afar, or actually being in the experience, or actually creating part of the experience. It's really cool, so I love seeing stuff like that. But those are usually small enough individual experiences that it's really hard to call them out; you go, oh yeah, I spent 30 seconds in this thing and it was amazing, but I don't remember the URL anymore.

[00:53:52.137] Kent Bye: Yeah, that LCD Soundsystem Dance Tonite experience, which premiered just last night, absolutely blew me away. It was so fun to be able to go in there and hear a song, and you're basically recording yourself up to 10 times. You're kind of metaphorically creating a measure of a song, and as you go through each of these sets of four beats, you're seeing an entirely new scene that someone's created. And just the variety of it: I'm like, oh my God, I don't even know the limits or possibilities of the different types of scenes you could have with such simple, small primitives of hands and head, and from that, just to see what you can do.

[00:54:30.215] Brandon Jones: It's amazing. I continue to be amazed at how human something as simple as maybe three spheres can be; that's all it really takes. Just to have that noise, that natural human jitteriness and non-linearity. It's so surprising to me how much of that comes through, even with the simplest of geometry. In the LCD Soundsystem thing, it's like, what, two cylinders and a cone, I think? And yet it's so human. And it's so spectacular. I love it.

[00:55:10.622] Kent Bye: Great. And finally, what do you think is kind of the ultimate potential of virtual and augmented reality, and what it might be able to enable?

[00:55:18.969] Brandon Jones: Right. I always like to jump forward to the world where we have, say, AR. Everybody has this concept of the Ray-Bans that are just your normal pair of sunglasses, but they're overlaying information onto the world around you. And I think that's kind of the optimal endpoint that a lot of people are working towards. Everybody who's working in VR today, they're not working towards the big clunky boxes that you put on your head. That's the intermediate step. And it's a great step to take, but everybody wants to get to that lightweight AR headset, because that's the world-changing thing, right? So, in that environment where you have your pair of Ray-Bans that are able to overlay information onto the world around you, a great example is: I'm walking along the street, and I see movie posters on a movie theater. I walk up to the one for the next Star Wars movie or something like that, and my glasses say, oh, hey, there's content here. Now, I don't want the world to just be blasting content out at me without my permission. That's terrifying. That's very dystopian. But I do want to be able to say, oh, hey, I'm interested in this movie, Star Wars 23. I'm going to punch some little permission on there, and now it can surface showtimes to me right there on the movie poster. And maybe I have some Fandango integration that lets me tap on a showtime and buy a ticket right there. And in the meantime, I've got a trailer playing on the poster itself, which only I can see, of course, because it's going through my glasses. And then it gives me an opt-in and says, hey, do you want to have the Star Wars experience follow you around today? And I can click on that, and now I can look up in the sky and there are Star Destroyers and TIE Fighters and X-Wings flying around, and I've got stormtroopers running through the street and everything. And then I can click a button and it all goes away, because now I'm going into my business meeting and I don't want Kylo Ren standing over me the whole time. This is such a cool experience; you can totally envision that this is where we're going to be 10 to 15 years down the line. And the critical part of it all is that it feels very much like the web. I came up to something very bespoke. It advertised itself to me. I discovered it in a very natural way. I opted into it really quickly. I was able to leave it behind. Nothing asked me to install anything. I'm just pulling up information. This kind of sounds an awful lot like how we use the web today, just broadcast out onto the world around us. And I don't see an environment where all of that is native, gives us the same kind of experience, and we feel comfortable about it. Certainly, I can see an environment where you walk up to the movie poster and it says, hey, click here to install the Star Wars app on your headset, and I'm just going to keep on walking. But if I can interact with it like I interact with the web today, then it becomes so much more compelling, and I can see it becoming a big part of my day-to-day interaction with the world around me. Because the web cares about your security, the web cares about permission, the web cares about your identity, the web cares about ephemerality, the web cares about being able to erase your history and being able to save your passwords and all of these different things. We need an environment like the web that gives us that safe space to experience the rest of the world in a VR scenario. So that's where I think it's all going.

[00:59:04.830] Kent Bye: Awesome. Anything else left unsaid that you'd like to say?

[00:59:08.051] Brandon Jones: That everybody should be listening to the Voices of VR podcast by Kent Bye, because it's one of the most awesome insights into the VR industry that you can possibly get. You've been doing this for years now, and you're just getting better at it. I love the content that you produce, and I want you to keep doing it for as long as you possibly can. Awesome.

[00:59:29.119] Kent Bye: Well, thank you so much. Thank you.

So that was Brandon Jones. He's one of the specification authors of WebVR at Google, and he's been working on it for the past three years. So I have a number of different takeaways from this interview. First of all, I'm just super excited to see where WebVR is going to go. I think that once it launches with these initial browsers and they lock down the API, it's just going to take off. It's already really taking off with the A-Frame community. So A-Frame was mentioned in the podcast, and I have an interview with the authors of A-Frame coming up in the next episode. But A-Frame is essentially a declarative language that sits on top of the WebVR specification. The Mozilla folks that are creating A-Frame want to get some browsers out there in the wild so that we can start using WebVR. In fact, they've already pushed it into their mainline, and it should be shipping in Firefox in August. So in August, Firefox is going to ship with the 1.1 WebVR specification, but with A-Frame there's a JavaScript file that's basically handling the nuances of interfacing with the WebVR spec. So whenever the 2.0 spec launches, and the Google Chrome for Android with support for WebVR launches, then everybody can update to the 2.0 spec, and we'll have this interim period where the 1.1 version carries a little bit of technical debt that isn't necessarily going to be supported in the long term. Now, the Mozilla folks are all for pushing forward and dealing with the consequences of maintaining that technical debt, because they want to get this out there with people using it. So I can definitely appreciate the caution about pushing it out too quickly and having to maintain some level of technical debt forever. But at the same time, it's really super annoying to try to see a number of the WebVR experiences that are out there right now. I spent the afternoon yesterday trying to see a number of different WebVR experiments, and you run into things like this: you go to the WebVR Experiments webpage, you bring it up on Daydream, you try to launch it, you put the phone in the headset, and then it just launches into the Daydream home. There might be some flags to set, different builds, or beta browsers to use, but I couldn't get it to work. They were showing some WebVR experiences at Google I/O, but I think they may have been pushing out special builds with origin trial tokens in order to actually run on the Daydream. Most of the apps right now I have to see on the Google Cardboard. So for anybody that's doing development with WebVR, I think probably the best thing is to use some of the PC-based HMDs, either the HTC Vive or the Oculus Rift. Going to webvr.info, you'll see all the different browsers that are compatible right now, so that you can download some of the experimental browsers and from there start to hop into the VR experiences. Part of the reason why I haven't explored a lot of WebVR is because each time you go in and out of an experience, you have to go out of VR and back into VR, and so there's no really good way at this point, at least that I've found, to go into VR and browse around all the different experiences. Once that gets ironed out, it's going to be a lot easier to navigate around.
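For a sense of what that spec looks like at the API level, and what A-Frame's JavaScript layer is smoothing over, here's a minimal TypeScript sketch of the WebVR 1.1 flow as drafted at the time. The canvas and button selectors are placeholders for whatever a page provides, and the type declarations cover only what the sketch uses, since WebVR was never part of the standard TypeScript lib.

```typescript
// A minimal sketch of the WebVR 1.1 flow: find a headset, enter presentation
// from a user gesture, and drive rendering off the display's own frame loop.
// These declarations cover just the parts of the 1.1 draft used here.
declare class VRFrameData {
  leftProjectionMatrix: Float32Array;
  leftViewMatrix: Float32Array;
  rightProjectionMatrix: Float32Array;
  rightViewMatrix: Float32Array;
}
interface VRDisplay {
  capabilities: { canPresent: boolean };
  requestPresent(layers: { source: HTMLCanvasElement }[]): Promise<void>;
  requestAnimationFrame(cb: () => void): number;
  getFrameData(out: VRFrameData): boolean;
  submitFrame(): void;
}

// Placeholder element ids; a real page would supply its own canvas and button.
const canvas = document.querySelector('#gl-canvas') as HTMLCanvasElement;
const button = document.querySelector('#enter-vr') as HTMLButtonElement;
const frameData = new VRFrameData();

(navigator as any).getVRDisplays().then((displays: VRDisplay[]) => {
  const display = displays.find(d => d.capabilities.canPresent);
  if (!display) return; // No headset found: fall back to a normal 2D view.

  // Browsers only allow requestPresent from a user gesture, hence the button.
  button.addEventListener('click', () => {
    display.requestPresent([{ source: canvas }]).then(() => {
      const onFrame = () => {
        display.requestAnimationFrame(onFrame); // Runs at the headset's rate.
        display.getFrameData(frameData);        // Per-eye view/projection.
        // ...draw the left and right eyes here using frameData...
        display.submitFrame();                  // Hand the frame to the HMD.
      };
      display.requestAnimationFrame(onFrame);
    });
  });
});
```

The design point worth noticing is that the render loop belongs to the display rather than the window, which is what lets a page render at the headset's native refresh rate instead of the monitor's.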
And for anybody that's interested in using WebVR, there are a number of different frameworks out there, including A-Frame (sketched below), Vizor, and React VR, which is coming from Facebook and is not as full-featured, I think, as A-Frame. A-Frame has been around the longest; it's probably the most full-featured, and it has the biggest community right now. So we'll be diving into that a little bit more tomorrow. To say a few more words about other takeaways from this podcast: it was really interesting to hear the differences between WebVR and WebAR. It sounds like WebVR is going to provide the foundation, a lot of the HMD-type immersive capabilities of the headsets. But with AR, there are going to be all sorts of additional things, like scanning the room, placing objects, detecting depth, and figuring out occlusion. All the different things that you really need in order to have an AR application are going to be built on top of the WebVR technology. And I think overall, Google as a company is really trying to organize all the world's information. Daydream as a platform, for me personally, is not the best for gaming, and there are other platforms that I think are going to be a better experience for virtual reality gaming. In the long run, Google is focusing on mental presence, education, learning, and being able to overlay information. So I think the strength of Google, in the long run, is that they're going to have these augmented reality applications where you're overlaying information about the world that they've gotten from scanning the internet, collecting all the information, and aggregating it in a way where you'll be able to take a picture with Google Lens and figure out what type of flower you're looking at. Or Google Expeditions, which is one of the educational initiatives that has been going on for two years now. They've been rapidly iterating, trying to figure out the best practices for immersive education. I have an interview with Jennifer Holland, and we'll be diving into that a lot more. But right now, internally, the way that Google is creating some of these Google Expeditions experiences is with A-Frame and WebVR. So internally, Google is starting to rapidly prototype some of these experiences with these different WebVR frameworks and then packaging them up and delivering them into these different applications, and I expect to see this continue. So as YouTube starts to go into volumetric video, what does it mean to be able to put volumetric WebVR experiences within a volumetric video that will eventually be on YouTube? YouTube hasn't announced anything like that yet, but I know that in the long trajectory of VR, we're moving towards volumetric video. There are going to be new interactive platforms, and these WebVR and WebAR technologies, I think, are going to be a key part of how we start to overlay information into these different experiences. And in fact, I think WebAR actually has many more compelling use cases than WebVR at the beginning, just because we're going to start to have a lot of these Tango-enabled phones, and there's just so much more information out there on the open web that you could pull into reality. And I think there are going to be a lot more use cases like those early applications of placing furniture within your room: you pull in different volumetric pieces of furniture, swap them in and out of your home, and see them in context.
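Since A-Frame keeps coming up, here's a rough illustration of the declarative style being described: a minimal sketch, assuming an aframe.js script tag is already loaded on the page. The scene contents are invented for illustration, but a-scene, a-box, a-plane, and a-sky are real A-Frame primitives.

```typescript
// A minimal A-Frame scene, injected from script. Assumes an aframe.js
// <script> tag is already on the page; A-Frame registers <a-scene> and its
// primitives as custom elements, so ordinary markup is all a scene needs.
document.body.insertAdjacentHTML('beforeend', `
  <a-scene>
    <!-- A box, a ground plane, and a sky: illustrative scene contents. -->
    <a-box position="-1 0.5 -3" rotation="0 45 0" color="#4CC3D9"></a-box>
    <a-plane position="0 0 -4" rotation="-90 0 0" width="8" height="8"
             color="#7BC8A4"></a-plane>
    <a-sky color="#ECECEC"></a-sky>
  </a-scene>
`);
```

Because a scene is just markup, it could equally be written directly in the page body; the library's JavaScript handles the WebVR handshake underneath and adds its own Enter VR button to the page.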
So to me, WebVR is super exciting, and I think it's going to take off and accelerate a lot faster than a lot of people recognize. And I think that the Daydream 3DoF controller makes a great volumetric pointer once you have the Chrome browser within Android and the ability to navigate these different experiences. I think that Daydream, with its 3DoF controller, is going to be one of the best platforms for exploring the open web. And I'm super excited to see where this all goes. So that's all that I have for today. I just wanted to thank you for listening to the Voices of VR podcast. And if you enjoy the podcast, then please do spread the word, tell your friends, and consider becoming a donor. Just a few dollars a month makes a huge difference. So donate today at patreon.com slash Voices of VR. Thanks for listening.
