Josh Carpenter is a researcher and user experience designer at Mozilla who is figuring out how to use virtual reality on the web with WebVR. Mozilla has been increasing the size of the team working on WebVR because they’re betting that immersive experiences like virtual and augmented reality will be a huge part of the future of computing. Mozilla wants to give web developers the tools and infrastructure they need to help build out the metaverse of interconnected virtual worlds.
Some of the lessons learned for user experience design in VR are that designing UI elements onto a curved surface works really well, and that text has to be fairly large, which reduces the information density that’s available to work with. They also found that leaning in to zoom with the DK2’s positional tracking is analogous to pinch-to-zoom on a multi-touch device in that it’s effective yet somewhat annoying, so they try to avoid relying on it too much.
Josh talks about Leap Motion integration into WebVR and warns against designing skeuomorphic interactions for your hands, suggesting instead that designers think about extending your reach and optimize the interaction design for what works best within VR. Josh also talks about the concept of progressive enhancement and how it applies to designing VR experiences that work across a range from mobile, to sitting at your desktop with positional tracking, all the way to room-scale tracking with two-hand interactions. For the web, an experience should work with the most basic input device, and then progressively enhance the experience if more input devices are detected.
Josh talks about the range of WebVR demos that were being shown at GDC ranging from games created in Flash, a 360-degree video, as well as Epic Games’ Couch Knights which was exported from Unreal Engine 4 using a WebGL export plugin.
The WebVR Web API specification is being designed so that you can experience the web through any of the virtual reality HMDs, and they’re also figuring out the user interaction paradigms that will allow people to be able to go to any destination in the world wide web-enabled Metaverse.
He talks about how Unity 5 now supports one-click WebGL export. Unity exports WebGL 1, and WebGL 2 is on the horizon with even more next-generation graphics capabilities. For example, Mozilla was showing off a demo at GDC of the level of graphics fidelity that’ll be possible with WebGL 2.
Josh also talks about what they’re doing to increase performance for real-time, immersive 3D experiences. There are a lot of optimizations that can be made to the browser if it’s known that the output will be a virtual reality display. It will take more development resources, and Mozilla has recently committed to growing the WebVR team, both to enable more VR on the web and to help create the Metaverse.
He also talks about the power of the web for ephemeral experiences, and some of the collaborative learning experiences that he really wants to have within VR powered by a browser. He also talks about how WebVR is pushing innovation on the audio front, and he cites Way to Go as an experience that pushes the Web Audio API to its performance limits.
Finally, Josh talks about the future plans for making WebVR easier for developers to use, making sure that it’s possible to have mobile VR experiences, and then creating web standards so that websites can flag that they should be experienced as fully immersive VR experiences. He also sees investing in virtual reality as a safe bet because immersive experiences are a key part of driving innovation in the future of computing.
Theme music: “Fatality” by Tigoolio
[00:00:05.452] Kent Bye: The Voices of VR Podcast.
[00:00:12.133] Josh Carpenter: My name is Josh Carpenter. I'm a researcher with Mozilla. And we're working to bring the web to virtual reality. And my role on the team is as interaction designer or user experience designer, trying to figure out what are the browsing conventions in virtual reality? How do we use it? What's it good for? And why would you want to combine the web with virtual reality? What's special about that combination?
[00:00:31.000] Kent Bye: Great. And so why don't you tell me a little bit about what you're showing here at GDC?
[00:00:35.202] Josh Carpenter: Sure. So at GDC this year, what we have is we've reached out to the community, and we've collected some of the coolest WebVR demos out there, kind of the early examples of what you can do when you combine the web and WebGL specifically with virtual reality. So we've bundled them together into a series of demos. And what we've done is actually wrapped them within kind of a prototype browsing interface. So as users actually experience these demos, you're not escaping into Windows and double-clicking on an app icon. You're just calling up a browser heads-up display, and you're clicking on a link. And that link takes you to a website, and that website is a world. So we're trying to think about, can the web and can browsing be the metaverse that we've been promised for so long? And how will that actually work? How will you type in a URL when you can't see a keyboard?
[00:01:16.786] Kent Bye: And so talk a bit about that user interaction, some of the prototypes, experiments, and then lessons learned by designing for virtual reality.
[00:01:24.824] Josh Carpenter: Sure, so one thing that's really interesting is that we've spent a lot of time trying to design our interfaces onto curved surfaces. So think about a curved surface that wraps around you as though you're on the inside of a cylinder, so that each point on it is equidistant from you. And it just turns out that's a lot better for legibility, it's a lot better for reach distance if you're using something like the Leap Motion control systems. It just works a lot better. So that was one thing. Another thing is text size. It's tremendously difficult to render text at small sizes on these early VR devices and have it be legible. So what that forces you to do is to make the text larger. When you make the text larger, information density drops pretty fast. So all of a sudden, you have to try and find ways to cram a lot of complexity into a very limited amount of real estate. Because the part of the eye that can actually read text is very, very small. It's like 10% in the middle of your field of vision. And you don't want people to have to move their head too aggressively to actually read a bunch of text. So we're playing around with things like what they can do is actually lean in, so we can actually have small text and we can know that thanks to, let's say, the DK2's positional tracking camera, they can actually lean in to read that text. As an interaction designer, that's quite fascinating. That's roughly analogous perhaps to the pinch and zoom of mobile computing, where pinch and zoom is handy but it's a little bit awkward. So you know, leaning in to read text is handy but it's also a little bit awkward, so you don't want to make the user do it too often. These are the sort of things that we're kind of playing around with right now.
But as an interaction designer, like being able to design an interface that takes up real estate, like real world real estate, that people can lean in and look behind, like having to think about what happens when people look behind their bookmarks is pretty wild. It's by far the most fun thing I've ever worked on.
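Editor's sketch: the cylinder idea Josh describes can be illustrated with a little trigonometry. The snippet below is only an illustration, not Mozilla's code. The helper `layoutOnCylinder`, the `PanelPose` type, and the coordinate convention (viewer at the origin, -z straight ahead, a panel faces +z at zero yaw) are all assumptions made for the example.

```typescript
// Hypothetical helper (not from the interview): place `count` UI panels on the
// inside of a cylinder centred on the viewer, so each one is equally far away.
interface PanelPose {
  x: number;          // metres to the viewer's right
  z: number;          // metres along the view axis; negative is in front
  yawRadians: number; // rotation about the vertical axis so the panel faces the viewer
}

function layoutOnCylinder(count: number, radius: number, arcRadians: number): PanelPose[] {
  const poses: PanelPose[] = [];
  for (let i = 0; i < count; i++) {
    // Spread the panels evenly across the arc, centred straight ahead.
    const t = count === 1 ? 0.5 : i / (count - 1);
    const angle = (t - 0.5) * arcRadians;
    poses.push({
      x: radius * Math.sin(angle),
      z: -radius * Math.cos(angle),
      // Assumed convention: a panel's front faces +z at zero yaw, so turning it
      // by -angle points its front back at the origin.
      yawRadians: -angle,
    });
  }
  return poses;
}

// Example: five bookmark tiles, two metres away, spread over a 90-degree arc.
const bookmarkTiles = layoutOnCylinder(5, 2, Math.PI / 2);
```

Because every tile sits at the same radius, text on each tile is rendered at the same apparent size, which is the legibility benefit described above.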
[00:02:54.308] Kent Bye: What's some of the limits of how many different items you can select for in a single cylindrical view around yourself? I mean, you have to hit some limit at some point where it just becomes too many. So what were some of the limits that you found?
[00:03:07.303] Josh Carpenter: Yeah, I think there's lots of tricks you can do. So for example, if you've got something like the equivalent of a menuing system, where you've got multiple levels of menus that are kind of accessible through some sort of hierarchy, maybe you can cram more information in that way. If you do things like have the targets be fairly small, but when you look at them, they increase in size, you can actually cram more information in that way. There's a lot of techniques we can play with. Although, right now, we're not too worried about that particular problem because, frankly, there's not that many web VR sites out there. So trying to cram a lot of information into a tight space, it tends to be more of a problem when you're talking about, let's say, a keyboard, or let's say an on-screen prompt, and less about trying to display large volumes of, let's say, links and content. But over time, and you can imagine something like a Yahoo of WebVR, where someone's actually gone out there, curated a ton of different websites, put them in different categories, and actually displays them using some really clever interaction system, some system that actually reveals more information gradually and easily. Because frankly, typing in URLs would be pretty difficult.
[00:04:03.277] Kent Bye: And so talk a bit about the Leap Motion integration and what having your hands within VR in the context of a web browsing, WebGL, and virtual reality experience is like.
[00:04:12.584] Josh Carpenter: Sure. What's so neat about the Leap is that virtual reality obviously creates this impression that you're actually in this world. And so when you look down and your body is not there, it breaks that illusion. It can be quite jarring, I think, as most people have probably found. And so Leap's quite interesting in that it actually enables us to actually bring a representation of your body into the world with you. And obviously, our hands are primary modes of interaction with the world around us. We have a lot of dexterity with them. So with the Leap, you can imagine future interactions and future systems of interaction where we actually use our hands directly to touch the world around us. I think some of the early challenges for designers are maybe not over-rotating on the skeuomorphism of that. Just because I can see my hands in the virtual world, maybe it's not good for me to actually turn virtual pages; that might be far too skeuomorphic. I might want to do something that's more analogous, like painting with light. I may want to do things with my hands that affect the world at a distance of like a thousand meters. Like maybe I'm not limited to just what I can actually physically reach within like I have a one, two meter span, but I actually want to project changes into the world. So I think what we're trying to play around with is taking the Leap Motion control systems and systems like it and figuring out I guess what makes sense from an interaction standpoint. One of the demos we have is called Rainbow Membrane. It's going to be on our site mozvr.com. Done by a local artist by the name of Cabbibo in association with Leap Motion. And with it, you put your hands in front of you, and you actually warp the fabric of reality. So you're not trying to do micro interactions. You're actually just trying to do very broad gestures. And the return on your time doing that is just fun and enjoyment in the world around you.
That feels like a pretty good fit for me for early motion control. And the only thing I'll say on that is, ultimately, the web is all about progressive enhancement, which is to say, everything that we design has to work from a common baseline. Like, let's say, a very simple model of looking and then maybe a single input, clicking. And then as we add things like motion control, as we add things like, let's say, positional tracking, these have to be optionals. The system should work without those niceties. So we kind of look at something like the Gear VR, for example, and we consider that a pretty good baseline for interaction. Everything should work with that. And then we add on features, and others can add on features in a progressive way, but it should always work with a pretty simple input system.
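Editor's sketch: progressive enhancement, as described here, means starting from a baseline that always works (look, plus a single click) and layering extras on only when the hardware reports them. The snippet below is a minimal illustration. The `Capabilities` shape and the feature names are made up for the example and are not part of the WebVR API.

```typescript
// Hypothetical capability report; the field names are assumptions for illustration.
interface Capabilities {
  hasPositionalTracking: boolean; // e.g. a DK2-style tracking camera
  hasHandTracking: boolean;       // e.g. a Leap Motion controller
}

type Feature = "look-and-click" | "lean-in-to-read" | "hand-gestures";

function enabledFeatures(caps: Capabilities): Feature[] {
  // Baseline: every experience must work with gaze plus one button.
  const features: Feature[] = ["look-and-click"];
  // Optional layers: added only when the matching input is available.
  if (caps.hasPositionalTracking) features.push("lean-in-to-read");
  if (caps.hasHandTracking) features.push("hand-gestures");
  return features;
}

// A phone-in-a-headset setup gets only the baseline; a desktop rig gets more.
const mobileFeatures = enabledFeatures({ hasPositionalTracking: false, hasHandTracking: false });
const desktopFeatures = enabledFeatures({ hasPositionalTracking: true, hasHandTracking: true });
```

The design point is that nothing in the optional layers is required to complete a task; they only make it nicer.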
[00:06:18.904] Josh Carpenter: Exactly right. That's exactly right.
[00:06:20.944] Kent Bye: And maybe you could tell me a little bit more about some of the demos that you have here that you're running.
[00:08:24.675] Kent Bye: Oh, wow. And what about 360 degree video? Is there any examples of that?
[00:08:28.298] Josh Carpenter: Yeah. So we've been working with a team, obviously with the eleVR team. We've got this amazing open source WebVR video player. And we're using their video player to take people into the Arctic with Polar Sea. It's a documentary that's being shot in the Arctic right now by a team called Deep Inc. in association with a French company by the name Arte. So essentially, it's a mono video equirectangular MP4 file. And we're just using a standard video tag in HTML, which is hardware accelerated. We're then taking that texture and projecting it onto a WebGL sphere around the user. That's a lot of stuff that average developers don't have to worry about. They can just use the eleVR player, put it into a WebVR scene, and they're good to go. And then some other examples we have are a little custom, extremely basic video player that enables people to actually, if they don't have a headset, they can load up in the browser. It works in every browser. And they can use their mouse to drag and look around. And then if they have a headset, they can actually put on the headset and actually look around with it. So that backwards compatibility story is pretty huge for the web. So if you're a video content producer, you can actually distribute through the web and know that your addressable market is in the billions. It's not just in the hundreds of thousands.
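Editor's sketch: projecting an equirectangular video onto a sphere comes down to mapping each viewing direction to a point in the video frame by longitude and latitude. In a real player the GPU does this through the sphere's texture coordinates; the function below only illustrates the mapping. The name `directionToEquirectUV` and the conventions used (-z is the centre of the frame, +x is to the right, v = 0 is the top row) are assumptions, and actual players may differ.

```typescript
// Hypothetical illustration of the equirectangular mapping, not the eleVR player's code.
interface UV {
  u: number; // 0..1 across the frame, left to right
  v: number; // 0..1 down the frame, top to bottom
}

function directionToEquirectUV(x: number, y: number, z: number): UV {
  const length = Math.hypot(x, y, z);
  if (length === 0) throw new Error("direction must be non-zero");
  // Longitude: 0 when looking straight ahead (-z), positive to the right (+x).
  const longitude = Math.atan2(x, -z);
  // Latitude: 0 at the horizon, +PI/2 straight up.
  const latitude = Math.asin(y / length);
  return {
    u: 0.5 + longitude / (2 * Math.PI),
    v: 0.5 - latitude / Math.PI,
  };
}

// Looking straight ahead samples the centre of the video frame.
const centre = directionToEquirectUV(0, 0, -1);
```

The mouse-drag fallback mentioned above fits the same picture: dragging just changes the viewing direction that is fed into this kind of lookup, with no headset required.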
[00:09:26.307] Kent Bye: So that's pretty huge for them. And yeah, how does something like some of these other mobile-related HMDs that are coming out, how do you see the WebVR playing in with those?
[00:09:38.073] Josh Carpenter: We want WebVR to work with everything, so developers don't have to worry about what kind of HMD the user has. It's just, you make a really great content, the browser takes care of worrying about what kind of HMD it is, and they just give you the tracking information, essentially. So the Web API that my colleague Vlad, in association with Brandon Jones over at Google, have designed, has a ton of flexibility built in. So as new devices, new input devices, and new HMDs come out, we don't have to modify the VR Web API. It's going to work with everything. So right now we've got DK2 compatibility, but down the line we're going to work with the manufacturer of these new headsets to actually ensure that their SDKs give the web API what the web API needs, and everything works together really seamlessly. And then with regards to Glass specifically and AR, right now I think most of us working in VR are kind of building the foundational skillsets they're going to take us into. Obviously, AR is lagging by a couple of years. Not lagging, that sounds very impatient for cutting-edge technology. But right now, we've got these amazing VR headsets, and we're all learning how to do really cool spatial design. We're all learning how to do 3D experiences. That's going to lead really seamlessly into AR down the line. I think everyone in VR is really excited about AR. And really, mixed reality, I think, is what we're talking about in a big way.
[00:10:46.081] Kent Bye: And so maybe you could talk a bit about the implications of Unity 5 and having a WebGL export now.
[00:12:31.224] Kent Bye: And in terms of latency, which is very important for virtual reality, Oculus has come out and saying their target was around 20 milliseconds. And I'm curious to see how that's improved in Mozilla's browsing experience and what can be done to continue to improve that.
[00:12:45.555] Josh Carpenter: Yeah, I think what we're going to do over the next year is tune performance. In June, we released these APIs that actually enabled all these things, enabled WebVR. And from this point forward, we're working on the ergonomics for developers. It's just easy to get started. We're working on getting the technology into our main Firefox releases. So it's in Nightly right now, which is an alpha channel. We want it to be in the desktop channel and in the primary release channel. So hundreds of millions of people have Firefox with WebVR enabled. And then we're working on performance. So latency is part of that. So right now, we haven't done measurements. We figure we're on 60 milliseconds. We want to get down to 20. There's a whole bunch of things in the browser architecture that we know we can do to optimize if we know the display context is virtual reality. So really, it's just engineering time and resources. But Mozilla recently committed to increasing the size of the web VR team. So we're actually pretty bullish on this. We really feel like it's something that's very exciting. The results have been really awesome. And frankly, the response in the community has been huge. We've had so many people coming up to us saying, like, this is really exciting. I want to work on this. I'm an HTML5 game developer. Like, holy crap, I can make web VR games now. We're expanding the size of the team. We're going to be taking this on really aggressively in the next year.
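Editor's sketch: one rough way to read the 60 ms and 20 ms figures is to express them in display frames. The arithmetic below assumes a 75 Hz refresh rate, which is the DK2's rated refresh but is not a number given in the interview, and the helper name `latencyInFrames` is made up for the example.

```typescript
// Hypothetical helper: express a latency figure as a number of display frames.
function latencyInFrames(latencyMs: number, refreshHz: number): number {
  const frameDurationMs = 1000 / refreshHz; // about 13.3 ms per frame at 75 Hz
  return latencyMs / frameDurationMs;
}

const assumedRefreshHz = 75; // assumption: DK2-class display
const estimatedNow = latencyInFrames(60, assumedRefreshHz); // the estimate quoted above
const target = latencyInFrames(20, assumedRefreshHz);       // the stated goal
console.log(estimatedNow.toFixed(1), target.toFixed(1));
```

Under that assumption the estimate is several frames behind the user's head motion while the target is between one and two, which gives a sense of how much pipeline work the browser team is describing.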
[00:13:50.475] Kent Bye: What was it that convinced the higher-ups in the decision-making structure of Mozilla to commit more resources to WebVR?
[00:13:57.811] Josh Carpenter: I think it's hard to look at virtual reality right now and not be struck by the momentum. I mean, the number of headsets that have been released by very large organizations. And then if you've tried these headsets, if you've tried these demos, it's impossible to not be struck by the impression that you're looking at something that's genuinely new. This isn't just, you know, 3D TVs. This is something genuinely new and genuinely amazing. This is a medium that is radically unlike any medium that we've had before in a very special way. So I think that the web needs to be where computing is. The web is where computing is. And if virtual reality and mixed and augmented reality are the future of computing, or even a portion of the future of computing, it's really important that the web is there, and the web is there early. What we don't want is a repeat of mobile, where on mobile, the web is actually really dominant. It's in every application we use for the most part, but it's not really used to design the front-end experiences. We don't want that. We want the web to be an awesome kick-ass option for designing the entirety of your virtual reality experience and actually being a really awesome channel of distribution for your virtual reality experience. So, you know, industry momentum, belief that this is the future of computing, we're part of it, and a belief in the importance of the web being everywhere are really what ultimately make us bullish on WebVR. And I guess also, we want to build a metaverse. Like, and, you know, if you look at the web and you squint your eyes, you see 90% of really what the metaverse would need to be. And so we feel like if we, you know, unleash all these amazing developers out there and these massive companies like Wikipedia and Reddit, these huge entities with all this content, if we unleash them on virtual reality, that's a pretty big game changer.
[00:15:34.916] Kent Bye: Yeah, and the other thing that I guess the web is unique in terms of consuming information is that it's much more ephemeral in the sense that you may watch a video on YouTube, but you don't necessarily have to download it on your computer and have it there forever. And so it seems like the web VR would be really great for ephemeral experiences that people may want to have, but not necessarily want to own forever on their computer. And I'm just curious if you had a sense of what type of ephemeral experiences like that that people may be wanting to experience with WebVR.
[00:16:05.315] Josh Carpenter: Yeah, one of my... Yes, definitely. One of the things I want to see is, I jokingly call them 4D GIFs. So if you go to GIF Sound, which I believe is a subreddit, people will take looping GIFs and they'll mash them up with audio from YouTube videos, to either comedic effect, but sometimes really quite stunning and beautiful effect. They'll take a GIF of a time-lapse of a storm swirling, they'll find the loop point in that GIF, so you have a seamlessly looping GIF that endlessly curls a wave or swirls a storm cloud, and then match it with some haunting Philip Glass melody. And you've got something that if you fullscreen it is really stunningly beautiful, but it's fuzzy and it's a GIF and it takes forever to load. In WebVR, it might be pretty interesting to have seamlessly looping stereoscopic video loops that may be only 10 seconds long, but they loop seamlessly to create this kind of endless effect. Put that together with positional audio and wrap it into something like maybe some new file format, like a 4D GIF kind of format. That gets really interesting. And if that's like a three megabyte file download, you can just be hitting next, next, next, and cycling through these. Or setting these as your home screen, overlaid with your Twitter data and all the day's news. There's so much stuff we can do. The thing that causes me the most anxiety in life is the whiteboards full of things that we really want to build, and then running around trying to convince students and developers and companies that, like, listen, someone's going to build this. You should build this first. You're awesome. Go build this. It's going to be amazing. So I think we're about to, in the next year, see things really be unleashed in a big, big, big way.
[00:17:30.113] Kent Bye: And what are some of those things that you really want to see in VR then?
[00:17:32.977] Josh Carpenter: For me, I'm really interested to see, I think collaborative learning is pretty cool. So I mentioned Polar Sea, so I'm in the Arctic flying on a drone, flying on a helicopter over glaciers that are sadly crumbling into the sea. But the way Polar Sea has been filmed, I'm there alone. It would be pretty interesting if I was there, if I loaded up that website at maybe like 10 a.m. every Monday of the week, the director joins me. And using a technology called WebRTC, which is an open real-time communication protocol for voice, video, and data, it might be pretty interesting if I can actually be talking with my classmates or with the explorer or with an astronaut in this virtual reality environment, all powered by the browser, with backwards compatibility. So if one of the students is home sick that day, and they're just on their desktop computer, that's fine. They can just drag their mouse to look around the scene. Or if they're on their mobile phone, they can also participate. So collaborative learning across every device, I think, is going to be huge. Just these collaborative experiences powered by the web in virtual reality is something I'm extremely, extremely excited for. I think there's a lot of low-hanging fruit there, for example.
[00:18:33.038] Kent Bye: And what's the state of positional audio and experiencing that in the browser? Are there open source plug-ins, or is that something that has to come through either Unreal Engine or Unity? And does that positional audio still translate to a WebGL experience?
[00:18:46.768] Josh Carpenter: Yeah, so we have the Web Audio API, which is a really robust API for doing audio. I'm not an audio guy, but one of the things that we're committed to doing at Mozilla is having best-in-class web audio performance, and one of the drivers of that has been WebVR. There's an amazing experience out there called Way to Go, and it's a 360 video experience that plays in HMDs or just in a 2D browser, where you're like a stick figure running through a video forest. So it's a mashup of hand-drawn animation with a black and white forest environment. It's stunningly beautiful, but it pushes the Web Audio API to the absolute limits. And it really revealed quickly where we can make improvements in performance. So we think, in the next sprint, let's just get Web Audio API performance to be kick-ass and to do all the things well that it is capable of doing. And then I think we need to evaluate where are the limits of that and figure out what do we need to actually make more immersive audio, knowing how important audio is to VR.
[00:19:37.450] Kent Bye: And finally, maybe you could talk about some of the next steps in terms of what is currently in the Firefox Nightly and where you see this going in the future with some of these prototype user interfaces that you're showing here at GDC.
[00:19:49.788] Josh Carpenter: Sure, yeah. So what we're going to do, we've got it in Nightly. You require an add-on, and you also require disabling a new technology called e10s, which breaks WebVR support. Those are a lot of steps for users to have to negotiate to get WebVR working. A lot of steps for developers to have to negotiate to get WebVR working. So we want to just make it really easy to get started. Just reduce the friction for developers and for users to get WebVR going in their browser. That's a pretty low-hanging fruit for us. Beyond that, we want to make sure that it also works in mobile. So we have Firefox for Android. It's a great mobile browser. We want to make sure WebVR works there really, really well. And from there, that's our gateway into the world of mobile virtual reality. So we're going to be working really hard on that as well. And then beyond that, in terms of the interaction design side of things, right now we're kind of in an internal prototyping stage. So if you go to mozvr.com, you're going to find all these amazing demos, but we've kind of taken out that pseudo-browser interface, and we're working internally on taking that to the next level. We really want to let people go anywhere, from anywhere, in virtual reality, on the web, without taking their headset off. That is like the simple one-line goal. But it turns out there's a lot of complexity involved in that. We need new standards. So a web developer can say, hey, I'm a modern virtual reality website, so put me in full screen automatically. Don't make a user click a button to go into full screen virtual reality. And we need to be able to distinguish what a classic website looks like. So if you go to a classic website, we know it's a classic website, and we display it in some backwards compatibility fashion, perhaps mapped onto a curved plane that wraps around the user.
So we're going to be working on these new platform pieces, all building up towards new prototypes of kind of an all-in-one browsing experience. And the modes of distribution are kind of to be determined. But we'll get it out there. Being Mozilla, being a not-for-profit, we're all about building neat stuff, and then just releasing the code as quick as we can, giving it away, and then hoping that people take it and do their own cool stuff with it. Because there's all these web developers, and they're rabid, and they just want to build cool stuff. So again, if we unleash that into the world of VR, we think it's going to be, again, to repeat myself from earlier, a game changer.
[00:21:36.835] Kent Bye: Awesome. And is there anything else that's left unsaid that you'd like to say?
[00:21:40.150] Josh Carpenter: Just that it's going to be a crazy year for all of us. I think anyone working in VR right now I think feels like, yeah, I think we made the right bet on how to devote our time. It's impossible that 3D and immersive is not a key part, if not the key part, of the future of computing. So regardless of how virtual reality plays out in the next year, even the next two years, we're all investing in an area of incredible importance, perhaps the most important area we could possibly be investing in. So it's deliriously, ridiculously, fantastically exciting. But I really cannot wait to see what comes out of this year. Awesome. Well, thank you so much. Thank you very much.