#760: Web-Based 3D with Microsoft’s Babylon.js JavaScript Framework & glTF

Back at Microsoft Build 2018, I had a chance to talk to two of the lead programmers of Babylon.js, which is a JavaScript framework for building 3D immersive environments with HTML5 and WebGL. It’s a framework that’s similar to three.js or React 360, but uses TypeScript and has more of a focus on enterprise applications. It’s developed and used internally at Microsoft, and Babylon.js 4.0 was released just ahead of Microsoft Build 2019. I conducted this interview on May 8, 2018 when Babylon 3.3 was the latest release, but the project leads David Rousset and David Catuhe provide a lot of context for the history and primary use cases for Babylon.js.

In this conversation I also talk with Saurabh Bhatia, who is a program manager on glTF and Babylon.js, and we talk about the many ways that Microsoft is using glTF across its products. There’s a pretty significant paradigm shift happening with spatial computing that can be seen in how many of Microsoft’s main products are starting to integrate support for open 3D file standards like glTF.

I’ll be at Microsoft Build 2019 this Monday and Tuesday covering all of the latest mixed reality news for the Voices of VR podcast.

LISTEN TO THIS EPISODE OF THE VOICES OF VR PODCAST

This is a listener-supported podcast through the Voices of VR Patreon.

Music: Fatality

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to the Voices of VR Podcast. So, there's a number of different web frameworks that are out there in order to do things on the web that get compiled down into WebGL and the Immersive Web. Three.js is a very popular library and three.js is used by something like A-Frame, which is like a declarative language that allows you to write this kind of markup and then it gets compiled down to WebGL. Another one of the frameworks that's out there is React 360, which Facebook is really putting out there. And another framework that is produced by Microsoft is actually called Babylon.js. So on May 5th, 2019, Babylon.js 4.0 was released. And last year at Microsoft Build in 2018, I had a chance to get a little bit more of the backstory and history of Babylon.js to be able to talk to the two co-founders of Babylon.js, as well as to hear the different integrations with glTF and how Microsoft in general is supporting a lot of glTF within their company and all the different applications. That's essentially like an open standard format for 3D objects. But also, you know, the fact that they're supporting something like Babylon.js to be a little bit more of an enterprise-grade framework to be able to have different version controls and releases and non-breaking changes and uses TypeScript to be able to write in this kind of higher order language that then gets compiled down into JavaScript. So it's kind of a different approach, and it's a framework that's maintained by Microsoft and used internally. And on the heels of Microsoft Build 2019, there's a brand new release of Babylon.js 4.0. Again, this interview was done a year ago, back in 2018, just to get a little bit more context. And I'm sure that when I go to Microsoft Build, hopefully I'll be able to catch up with them again to get a little bit more update as to what's the latest and greatest for Babylon.js.
So that's what we're covering on today's episode of the Voices of VR podcast. So this interview with Saurabh, David, and David happened on Tuesday, May 8th, 2018 at the Microsoft Build Conference in Seattle, Washington. So with that, let's go ahead and dive right in.

[00:02:17.580] Saurabh Bhatia: Cool, hi, I'm Saurabh Bhatia, I'm a program manager. I work on glTF and Babylon.js. So I work with the community out there to kind of promote the glTF spec, help do implementations in Babylon.js, and then help make sure Microsoft's adopting it across all of our products.

[00:02:35.230] David Rousset: Hello, my name is David. I'm working at Microsoft as a Program Manager too, and on web technology most of the time, and VR and Unity, but I've been co-creating Babylon.js with the other David and the other French guy.

[00:02:49.153] David Catuhe: And I am the other David and the other French guy. I am driving the Babylon.js project here at Microsoft. We created it with David like in 2013, and we are proud developer today that can work on the pet project we had in the past.

[00:03:05.047] Kent Bye: Great. Great. So glTF is like sort of the data format for 3D modeling and sort of the open spec there. And in terms of the developing for the web, a lot of people who have been doing WebVR have been using something like three.js, but maybe a framework on top of three.js, something like maybe A-Frame that is able to have a declarative language that then gets handed off to three.js, which then is eventually rendering down to WebGL. So I imagine that Babylon.js is kind of like the equivalent of that layer to be able to do what three.js is doing, to be able to take the JavaScript, translate that into a spatial 3D object. So maybe you could talk a bit about why you started it, and maybe some differentiating factors for how is it different than something else that's out there, like three.js.

[00:03:48.536] David Catuhe: So we started it mostly because we wanted to have a framework which was, I would say, enterprise-ready, meaning no breaking changes. So people can move to the next version without being afraid of any change that could be done on their own side. Second, we started it in JavaScript, but we wanted to use TypeScript. And thanks to TypeScript today, we can have all the wonder of ES6 if we want. It's just a flag to change. We can compile it to various targets easily. And to be fair with you, because we are in the open source world, when there is a PR, it's far easier to run a check thanks to TypeScript, because there is all this strong typing system. And you were right. Everything goes through Babylon.js, but there is no framework on top of it for WebVR. We also wanted to have Babylon.js as a one-stop shop. So you want to do 3D on the web, then we have physics, you have collisions, you have WebVR, you have tomorrow WebXR, all available inside the same framework. So glTF is also part of it. You want to load glTF, sure, but you don't have to go to a satellite repo to get it. We maintain everything. So we keep the coherency of everything. So on every single new release or version, we test everything across the board and we don't have segmentation. So that's the reason why we did it.

[00:05:05.282] Kent Bye: So my understanding of TypeScript is that it's a little bit of a declarative language that you're able to do something that's kind of equivalent to A-Frame where you're able to sort of define things. Is that the idea or maybe like what is TypeScript?

[00:05:16.394] David Rousset: TypeScript is really a language on top of JavaScript, adding types. It's really useful for people coming from other spaces like C Sharp, for instance, that weren't used to the paradigm of JavaScript, and also it's going to add productivity and quality on top of JavaScript, so a lot of people tend to enjoy that in the enterprise, and we enjoy that as Babylon.js developers, to be honest. We've been able to find bugs thanks to TypeScript that were complex to find. Compared to A-Frame, the declarative one, today we don't have a complete approach on that. We have a viewer that can use a declarative approach, like an HTML-like element, if you like, that's going to load the viewer and set some parameters to let you configure this viewer. But most of the stuff will be done in code itself using TypeScript or JavaScript based on what the developer would like to do.

[00:06:02.989] David Catuhe: It's clearly a superset of JavaScript. You can see it as JavaScript plus types. And less error-prone, easier to maintain, easier to deal with community because of all the types. So people can clearly be more confident on the code they are shipping as well.

[00:06:18.940] Kent Bye: Are these types able to be bundled? Is that the idea, is that you create a type and it's sort of a combination of different things? Or what is it able to do with the type?

[00:06:26.573] David Catuhe: They are only available at compilation time. So at the very end, it's JavaScript. So TypeScript compiles everything into JavaScript. So at the very end, if you take Babylon.js, the file itself that is the artifact that we are generating, it's purely JavaScript. So there is no more types and stuff like that. And you can use it like any other JavaScript engine. And then it includes all the call to WebGL, all the calls to WebVR, purely in JavaScript. But the way we build it, the way we compile it, is through TypeScript.
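
The point that types are only available at compilation time can be sketched in a few lines of TypeScript. This is a minimal illustration, not Babylon.js code: the `MeshOptions` interface and the `describeMesh` function are hypothetical names invented for this example.

```typescript
// Hypothetical shape for illustration only; this is not a Babylon.js type.
interface MeshOptions {
  name: string;
  diameter: number;
}

// The annotations are checked by the TypeScript compiler and then erased
// from the JavaScript that is emitted.
function describeMesh(options: MeshOptions): string {
  return `${options.name} (diameter ${options.diameter})`;
}

const sphere: MeshOptions = { name: "sphere", diameter: 2 };
const description = describeMesh(sphere);

// At runtime the value is a plain JavaScript object with no trace of MeshOptions.
const runtimeKind = typeof sphere;
const runtimeKeys = Object.keys(sphere);

// Uncommenting the next line fails at compile time, before any JavaScript is produced:
// describeMesh({ name: "box" });
```

The compiled output contains the function and the object literal but no interface, which is why the shipped artifact can be consumed like any other JavaScript file.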

[00:06:57.040] Kent Bye: Yeah, and maybe you could tie in how glTF is fitting into this Babylon.js ecosystem. You mentioned that you're kind of working on both, and how do you see that they're related?

[00:07:07.100] Saurabh Bhatia: Yeah, so I work, I guess, with the glTF loader that's now part of Babylon.js. And I've been working with the glTF spec for a while now. I think the key factor for us was, as we were writing down the spec, we had to, like, prove it out. Does this work or not? Right? So we've been working with the community, including the three.js guys, to, like, whenever we propose something, OK, let's do morph targets, how is it going to work across different engines? And then showing it out in the open source world, showing it in Babylon.js, showing it in three.js, and making sure it's working across all those engines has been a key component to proving out the glTF spec. So you can't really do the spec without having an engine that goes along with it that shows you how to utilize the spec. And that's how I see it fit in together.

[00:07:53.044] Kent Bye: Yeah, and I guess you had mentioned you don't have to do anything for the WebVR. For A-Frame, when talking to Diego Marcos, one of the things he said was there's WebVR 1.1, but at some point it's going to go to 2.0. Don't worry about it. We're going to take care of all that. At the framework level, you just have to update your A-Frame to be the latest. And then on the back end, they'll take care of all the nuances of the WebVR spec. So I'm just curious what people have to do in order to have WebVR integration, and if it's a similar thing where, as long as they just use Babylon.js, that going from 1.1 to 2.0 is going to be pretty seamless.

[00:08:27.781] David Catuhe: Exactly. Nothing to add. It's exactly what we will provide. So far, we have a one-liner in our code. Literally, it's scene.createDefaultVRExperience. That will take care of everything for you, including initializing WebVR 1.1, making sure we find a controller for you. Everything is done under the hood by Babylon.js. When WebXR, so the WebVR 2.0 that you mentioned, will be available, it's going to be entirely transparent. The people will just have to switch to the latest version of Babylon.js, and then they will automatically, transparently, without any effort, be able to use the new version of it.
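
The one-liner mentioned here can be sketched as follows. To keep the sketch runnable without a browser or the library installed, it assumes a hypothetical minimal interface, `VRCapableScene`, that stands in for the one method of a Babylon.js scene named in the interview; `enableVR` and the stub object are also invented for illustration.

```typescript
// Hypothetical minimal stand-in for the part of a Babylon.js scene used here.
interface VRCapableScene {
  createDefaultVRExperience(): unknown;
}

// Turn an existing 3D scene into a VR-capable one with the single call
// named in the interview.
function enableVR(scene: VRCapableScene): unknown {
  return scene.createDefaultVRExperience();
}

// Stub used only so this sketch runs without a browser or the real library.
const calls: string[] = [];
const stubScene: VRCapableScene = {
  createDefaultVRExperience() {
    calls.push("createDefaultVRExperience");
    return { kind: "stub-vr-helper" };
  },
};

const helper = enableVR(stubScene);
```

In a real page the argument would be an actual Babylon.js scene rather than the stub, and the rest of the application code would stay the same whether or not a headset is present.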

[00:09:04.237] David Rousset: Maybe I can add something compared to A-Frame. A-Frame is really great, and we know a lot of people really enjoyed it, and we were inspired by the simplicity of A-Frame, really what we try to do with the VR helper on our side. But A-Frame is really targeting VR-only experience. On our side, we really want to have a kind of smooth transition from 2D, even if it's really 2D screen, like touch, or mouse, or even pencil, up to VR controller, using the same control and the same code. So you can build your scene in 2D, test it, with UI elements, going to work on mouse, and we've been doing a VR session with David, showing a kind of Fruit Ninja game working on the mouse, which is not really fun, but it works, and then switching out in one or two lines of code to VR, because it's going to be transparent for us. Once it's working in 2D with mouse or touch, it will work in VR. So this is our philosophy, because we know that WebGL is available across all platforms, and WebVR is something more that can get provided to your user when they have a compatible device.

[00:10:04.413] Kent Bye: Yeah, and it seems like that the enterprise market may have specific considerations. I mean, there's so many different versions of Unity that if you want to create a project and you put it out, then you're going to have to maintain it and keep it up to date. But I imagine that if you were able to put something out with the open specifications, then that's just going to be a little bit easier for being able to have archivability of projects. I think one of the issues, if you create a project within Unity, you may have to keep it updated. And I think the idea with the open web is that there's better web standards so that you put it out there once. And that for some enterprise markets, they want to be able to have the ability to be able to see some of these models they're creating or some of these experiences down the road without having to worry about keeping things updated. I mean, you may have to update the Babylon.js. But overall, the idea of the open web standards is that you have this permanence that is a little bit lacking when it comes to the lack of backwards compatibility when it comes to something like Unity. It can be on the bleeding edge, but the trade-off is that it moves slower, but it's just more stable. So I'm just curious to hear how that actually plays out in Babylon.js.

[00:11:06.519] David Rousset: I would say it's partly true, because specifications also are moving sometimes on the web, and sometimes are breaking also stuff. So this is where frameworks are really interesting. So whether you're using A-Frame, three.js, or Babylon.js, they're going to handle this complexity for you. You're right, it's moving slowly, so we have less problems like, for instance, Unity sometimes would like to move faster and break something important. And you're right, I know people that have to modify their code from one Unity version to another, but they will have better features in exchange. On our side, this is really our motto compared to maybe competitors. People tend to come to us because we have less breaking changes compared to the others, but this is not only because of the W3C specification. I think it's also, like David was saying, enterprise-grade. We are working with a lot of big teams within Microsoft, so using like Remix 3D, PowerPoint, things like that. So we cannot afford ourselves to break stuff because we will be in trouble inside Microsoft, but also with external customers like Adobe or other people we are working with. They don't want us to break everything. So I think it's for both reasons. It's frameworks-wise and also specifications are, as you said, this is the beauty of the web. Normally, if I'm creating a page today, it should still run in 20 years. But this is not always completely true.

[00:12:25.299] Kent Bye: And recently, the WebVR kind of got rebranded to WebXR. So I think that there's going to be the desire to be able to maybe add some extensions to be able to do both VR as a baseline, but also adding different things to do AR. And when it comes to the ecosystem of Microsoft, there is the Microsoft Edge web browser that you can see in a Windows Mixed Reality, or what I would call a virtual reality headset. And then there's the HoloLens. But I don't know if there's a web browser for HoloLens, like how this sort of vision of the augmented reality for the web is going to be playing into something like these different devices that are out there, including the HoloLens.

[00:13:01.579] David Catuhe: So the good news is that since the latest version of Windows, you can use Edge with the Windows Mixed Reality headset like you mentioned, including the HoloLens. It's not AR, it's only VR, so you can't use the AR capabilities of the HoloLens yet, but that's why WebXR is here. The goal is to be able to have VR plus AR simultaneously under the same specification. So soon we will be able to at least ship something with WebXR. For VR, that's tough, stay with me. But in the future, that's why WebXR is here. I hope we will be able to also support AR.

[00:13:38.035] David Rousset: It has been designed that way. WebXR will first try to be on parity with WebVR, which sounds weird, but it's because we need to have some change in the design of WebXR to be able to enable AR capability in the future. So the first version we have in the company, most people will think that, okay, it's like WebVR, but it will be ready to move slowly and to switch to the new version using AR devices. That's why, again, sorry to insist on that, but I think people should use frameworks, not just Babylon, but frameworks in a general manner, because we will have some breaking changes between WebVR and WebXR, and as framework makers, we will take care of that for you.

[00:14:16.140] Kent Bye: Yeah, and just being here on the expo floor here at Microsoft Build, there's a number of different displays about the 3D ecosystem, glTF, and really trying to show how much Microsoft and these different products are integrating the spatial computing, whether it's the glTF models into Microsoft Word or PowerPoint. Maybe you could give us a survey for everything that is happening when it comes to this open standard of glTF and how it's being built into all these products and what that kind of means for the future of immersive computing.

[00:14:46.055] Saurabh Bhatia: Sure, so yeah, we have a lot of products that have come online that have now added the ability to display 3D objects. You mentioned a few, I think Office, like PowerPoint, Word, Outlook, Excel, they have support for inserting 3D objects in there and having some interesting new experiences that are now possible because of that. I'm afraid I'm going to miss out a few, there's quite a lot. There's Paint 3D, Remix 3D. I think the uber point we probably want to make is we want to make sure the content that is flowing within our ecosystem or even in and out of our ecosystem is compatible. It's open and interoperable. So that's really the goal and glTF was kind of the perfect vehicle for us to have the same asset flow across different applications, different devices, and even different experiences. So if you had checked out some of the demos there, it's like you've got 3D on a 2D surface. You have it working in a web browser, whether it's through Babylon.js or whatever WebGL engine you want. And then you also have it in mixed reality, wearing a headset and kind of looking at it in a spatial format. So glTF is the glue across all of that. I think the other part of it is we are not just using it in our app ecosystem. We are also contributing back to the open source community through various tools. It's all done under the Khronos Group repo, the glTF Khronos Group repo. So we want other developers to also start building apps like these and start unlocking the creativity with mixed reality.

[00:16:09.492] Kent Bye: And I'm curious to hear about the progressive web applications for Babylon.js, because I know that being web developers, they may want to deploy this as a mobile application, or for people to see it on their desktop, or if they happen to have the mixed reality, or virtual reality, augmented reality headsets, that they can hop in into any of those different contexts. So I'm just curious if Babylon.js has any progressive web applications to be able to compile things down to a native application.

[00:16:36.642] David Rousset: So it's web development, so we've got native support for web development, and what we added a while ago is support for an offline version using IndexedDB. So for instance, it could be useful to store assets, or when you are offline and you don't have access to a server, you can store details about your user inside the DB if you're building a line-of-business application. So we've got support for that out of the box inside Babylon.js compared to our competitors, for instance. Apart from that, any kind of engine would be PWA ready as soon as the browser is, meaning that you can, in the manifest and the service worker, specify that you want to download the various files, the assets, like could be glTF files, locally, so when you won't have access anymore to the network, that will be ready to be handled. For instance, during the session we've been doing yesterday, we were using PWA to render the game completely offline because I was downloading the controller models, the assets of the games, so it was ready to be rendered completely offline. PWA also enables you to remove the chrome of the browser, to really have a native-like experience. What we add on top of that at Microsoft is support to push that in the Microsoft Store. You can even call WinRT APIs if you like to, using what we name progressive enhancement or feature detection. So we are starting to do that. Some people are interested in building PWA because you can even put that in the Mixed Reality Portal, adding the 3D icons, and really be providing the same kind of experience as a native app in a way. But for us, native JavaScript is native on Windows, meaning that we don't consider that as a different target. If you are a C++, C Sharp, or JavaScript developer, you should have the same kind of features and experience when you are on a Windows platform.

[00:18:23.433] Kent Bye: And I'm curious to hear your perspective on the state of the browsers, because I know that there's been a sense of people waiting for Chrome to ship their version of that, and that they're the biggest browser. And to some extent, if you're going to create one of these experiences without the biggest browser, it's sort of held things back to some extent. Also, Apple and Safari, I don't know to what extent that they've been engaging with these different discussions or announcing any plans to integrate any of this for people who are on iOS and Safari. But we do have the Samsung internet. We have Oculus browser. We have Microsoft Edge, as well as Firefox.

[00:18:59.050] David Rousset: Android also, Chrome on Android got support for it. So you're right, Chrome on desktop is still not there. They have it, but behind a flag, so it's almost there. So we do hope it will be there soon. And I know that Google is really also saying that VR is important. They've got Daydream. They've got various initiatives, like really cool experiments on HTC Vive on the SteamVR store. So I really think it's going to come soon. So in the meantime, you've got support in Firefox, Edge, and a lot of different Android devices. You can still address a lot of devices today, even more tomorrow with Chrome support of WebVR. And in the meantime, we've got fallback support using device orientation. So you can have a cardboard-like experience to start with. So you can already push your code and put that in production. And the day that the user will switch to a more recent version of Chrome or will buy VR devices, it will be immediately compatible.
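
The fallback idea described here (use WebVR where the browser has it, otherwise fall back to device orientation) is a form of feature detection, and it can be sketched as below. This is not how Babylon.js decides internally; `pickCameraMode`, `CameraMode`, `NavigatorLike`, and `WindowLike` are hypothetical names, and the two interfaces are minimal stand-ins for the browser's `navigator` and `window` so that the sketch runs outside a browser.

```typescript
type CameraMode = "webvr" | "device-orientation" | "standard";

// Hypothetical minimal views of the browser globals.
interface NavigatorLike {
  getVRDisplays?: unknown; // defined by browsers that implement WebVR 1.1
}
interface WindowLike {
  DeviceOrientationEvent?: unknown; // defined where orientation sensors are exposed
}

// Prefer WebVR, then a cardboard-like mode driven by device orientation,
// then an ordinary mouse or touch camera.
function pickCameraMode(nav: NavigatorLike, win: WindowLike): CameraMode {
  if (typeof nav.getVRDisplays === "function") return "webvr";
  if (typeof win.DeviceOrientationEvent !== "undefined") return "device-orientation";
  return "standard";
}

// Simulated environments, standing in for real browsers.
const desktopWithWebVR = pickCameraMode({ getVRDisplays: () => [] }, {});
const phoneWithoutWebVR = pickCameraMode({}, { DeviceOrientationEvent: function () {} });
const plainDesktop = pickCameraMode({}, {});
```

Because the check runs on every page load, the same deployed code moves to the richer mode on its own once the user's browser or hardware supports it.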

[00:19:52.818] Kent Bye: Why is developing these technologies for the web important? Well, just Babylon.js, like all this stuff on the web. I know most people that are developing applications for AR or VR end up using something like Unity, especially for the Windows ecosystem and HoloLens. Unreal Engine doesn't quite have the support there yet, but it will probably come as soon as the ecosystem grows larger. But most people doing VR development these days are doing native apps, and there's a certain amount of performance hit on the web that isn't quite there yet. So you kind of lose things, but you gain things as well. So what are the trade-offs that you see, like why people should be paying attention to this?

[00:20:31.091] David Catuhe: It's interesting to use WebVR for a really good reason, I think, is that it's going to work everywhere with one single initial code. The code that you're going to write is going to be the same everywhere. WebVR is truly the only one that can clearly give you access to all browsers supporting WebVR, plus PWA, so you can even reach to native stores with one single code initially. So I guess it's important for web developers and also because if you're a web developer, you are not really motivated to learn C Sharp or Unity. So for an entire group of developers, which is a big one like the web developer group, it's important for them to be able to reach VR. So I guess my answer is, with one kind of code, you can reach everything out there. And if you're a web developer, you don't have to learn something new because it's already here.

[00:21:20.180] David Rousset: I'm a true believer in WebVR, so maybe I'm biased, but I think you can reach a lot of people using web technologies. And you have to think also we are not targeting the same kind of experience. We're not going to compete like with Unreal and building triple-A games in the browser. This is not the intent, you know. But there are some people that want just to show 3D models like on e-commerce, and they want just to see the same kind of, I don't know, I want to buy a beautiful bag, I don't want to install a Unity app on my phone to be able to see the same, you know, thing I'm currently viewing in my web page. If just one button could put that in my headset, have a quick view on that, and I will never go on the same site after that, it's really useful to have web technology for that, so it's really touching people and also integrating where web technologies are, because sometimes you are building like, I don't know, a document library and you don't want to create another app just for the VR experience. If you already have WebGL support and WebVR, you can mix that in a natural way, I would say, if your application is based on web technologies.

[00:22:22.648] David Catuhe: And by the way, if you think about the cost and how expensive it is, if you are using frameworks, that's not really expensive because you are one line away from being able to support VR when you are already doing 3D. So you mentioned experience like being in a store, stuff like that, or a virtual store. If you are already doing 3D to represent the object that you want to sell, you are literally one line away from being able to do VR. It's a no-brainer for me. Yes, let's add VR on it. If people are not supporting VR, fine. And if they have a VR headset, cool, that's a really good advantage to use.

[00:22:58.553] Kent Bye: Well, I guess for glTF, it's like this new open standard. In some ways, all of the different pipelines, there's existing pipelines, and it's just a matter of adding support for glTF. But for some developers, the 3D pipeline is a completely new pipeline. And I know that there's things like Google Blocks, there's things like Sketchfab, and there's things that people can be in VR creating some of these 3D assets. But I'm just curious to hear how you see this pipeline workflow and what kind of tools that Microsoft has done to maybe sort of lower that friction for people getting into immersive computing.

[00:23:29.607] Saurabh Bhatia: So I think, yeah, we've been spending a lot of time trying to solve that, not just Microsoft, kind of the rest of the community. I think we have to do this together. So I think this is still a question on GitHub, and a lot of people have a lot of opinions. If you are an artist and you're creating assets, like a brand new asset, are you directly doing an export from Max and Maya to glTF or are you going through something like COLLADA and that's your asset exchange format and then you transcode it over to glTF? I think it's going to end up being a combination of both. So there's an open source effort for doing COLLADA to glTF that kind of the Khronos Group has been behind and kind of improving that support for it. Definitely if you have existing assets, then that's kind of the pipeline forward. On our end, through actually the Babylon.js exporters, we've got two exporters for 3ds Max and Maya, where if you're creating new assets, you can actually hook up the material models in Max and Maya that match the glTF material models and directly export out to glTF. So I think it'll end up being a combination of both and we are working with the community to kind of develop the right tools and fund the right tools so that everyone can get on to the glTF pipeline.

[00:24:37.937] Kent Bye: What are some of the most exciting applications that people have been using Babylon.js that you've seen deployed out there so far?

[00:24:43.799] David Catuhe: It's a good question. It's interesting because we have very various kinds of customers. We worked with a hospital in Canada to display brain scans. We worked with game companies like Ubisoft. We were working with hobbyists and students that want to create games for game jams. We were working with National Geographic to do multimedia experiences for their TV shows. It's tough for me to pick one specifically, it depends on what you want to do with 3D. And that's also the beauty of it, I guess, because 3D is everywhere. That's weird to say that today, but really 3D is everywhere. We used to work with G-Powers for their own factory when they handle parts. So it's clearly something internal, intranet, really business-focused and centric. And with a small guy in the middle of nowhere that was doing an incredible game where you are eggs that have to shoot them. It's a fun game. People are using it in so various ways that I don't know how to pick one specifically.

[00:25:47.428] David Rousset: Difficult question. I think as a Microsoft team, I've been working for Microsoft for years. I can feel I'm very proud to see my little pet project, like you said, as being used by Office, for instance. It's like, you know, at the beginning of Microsoft, there was Windows and Office. And at the end, your little engine ended up being used by Office. I think I was pretty proud of that, if you had all those kind of experience. And as a framework maker, I think what's really cool is when people are building cool stuff on top of what you've been doing that you wouldn't have imagined, like the egg game is a good example, but also seeing people building stuff for healthcare. I was pretty proud to see that because at the beginning we were having fun building games, to be honest, and it turns out that it's being used for very serious stuff and important stuff, so I'm pretty proud of that.

[00:26:34.225] Kent Bye: And there's another dimension of the web that it can interface with other APIs and communicate in a certain level that it's a little bit easier using just the language that people are familiar with to be able to call out all these different services, whether it's Microsoft Cognitive Services or these different artificial intelligent object detection, natural language processing. I'm just curious what type of things that you see this fusion together of this open web of bringing different parts of what makes the web a unique medium differentiated from something that might be like in a Unity application.

[00:27:04.952] David Catuhe: Interestingly, JavaScript and Node.js did a lot around this world because yes, you have the front-end with JavaScript and the same code, literally the same code can also be used back-end and you mentioned AI, we can think about back-end server that can do high-quality computation and stuff like that. So the web is unique in this way that it's simultaneously the front-end and the back-end and I think that nowhere else we can find this kind of experiences and that's why we added a specific option in Babylon.js, so Babylon.js can also be run on the back-end, so when there is no WebGL, because people can use that for testing, for physics or collision simulations, so the same code can run back-end and front-end, and isn't that cool? I mean, it's really huge.

[00:27:51.718] David Rousset: And I really like your question because I'm a true believer again about what you said. Because I've seen some experiments using in VR, for instance, where we were in the museum, where you could use cognitive services to ask questions about what you were seeing around you, using computer vision to, for instance, detect a specific painting, having a bot experience, but being in VR, so in a very natural way. You could have been building this using native technologies, but you can mix all the different connections you have on the web inside the web page, which is natural, and build VR on top of that or not. It's really what I like about web technology is that you can mix a lot of different services since the beginning, to be honest. And now with all the true powerful stuff we've got, I'm really waiting for people to use our little baby to create what you just created. But they need to be aware that it's possible. Sometimes they just forget about it.

[00:28:43.770] Saurabh Bhatia: I have to think of a cool example. I was actually going to say, we are at the beginning. We are kind of the ground floor of building this out. It's kind of 1.0, 1.1 with WebVR. And once you actually have a stable ground floor, there's going to be some amazing experiences that get built up on top of it. So I think what we are seeing right now is just some initial experiences that are coming out. And I hope this continues to grow. And there's going to be some mind-blowing experience that we haven't seen yet. And I'm waiting for that. Yeah.

[00:29:14.900] Kent Bye: One technical bit about WebGL, because my understanding of WebGL is it kind of gets compiled down to what's essentially like a black box. You are painting on a 2D frame, like the 3D rendering of the pixels. And it seems like it'd be a lot better if you would actually have a 3D DOM that you can actually place objects in and be able to have a little bit more dynamic interaction. But that, to some extent, has a whole reimagining and a re-architecture of the web stack that's actually built with spatial computing in mind. I think when WebGL came out, it certainly wasn't in mind that this would eventually become like a spatial computing platform. So what do you see as the roadmap here moving forward? Are we going to have like a 3D DOM?

[00:29:53.606] David Catuhe: We already have one in Babylon.js, not at the driver level. So obviously, it still has to go through WebGL. But for 3.2 and 3.1, we shipped the viewer. And the viewer is literally exactly what you mentioned. You can, in a descriptive way, specify, okay, I want a sphere here, I want a box there, I want a camera there, a light here, and boom. We do that in, I would say, an A-Frame way, so it's inside the HTML, you have tags. And this tag can describe the scene that you want. They can also load glTF models. They can describe the behavior that you want, this camera should be full screen, this one should not, I want shadows there, so it's purely descriptive way of what you want, but still it has to be compiled, generate shaders and go through the WebGL stack, so it's not exactly what you may have dreamt of, but still it's on the way to be that.

[00:30:45.387] Kent Bye: So you can kind of have the equivalent of the CSS, or the Cascading Style Sheets, where you can take an ID and start to change the visual look just by it. This one, yeah, absolutely correct. Yep. That's all done in the JavaScript layer?

[00:30:57.415] David Catuhe: Yes. It's declared in the HTML page, but obviously then there is a JavaScript processing all of that to generate the WebGL objects.
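
[Editor's note: The flow described here, a declarative description in the page that a script turns into scene objects, can be sketched as follows. Every name in this block is hypothetical: the `SceneTag` shape, the tag names (`sphere`, `box`, `camera`, `light`), the space-separated `position` attribute, and the `buildScene` function are assumptions made for illustration and are not the actual Babylon.js viewer syntax. A real implementation would go on to create WebGL resources; this sketch stops at plain data.]

```typescript
// Hypothetical declarative node: a tag name plus string attributes,
// similar in spirit to what an HTML parser would hand to a script.
interface SceneTag {
  tag: string;
  attrs: { [name: string]: string };
}

// Plain-data scene objects that a renderer could consume later.
type SceneObject =
  | { kind: "mesh"; shape: "sphere" | "box"; id: string; position: number[] }
  | { kind: "camera"; id: string; position: number[] }
  | { kind: "light"; id: string; position: number[] };

// Parse "x y z" into three numbers; a missing attribute means the origin.
function parsePosition(value: string | undefined): number[] {
  if (value === undefined) {
    return [0, 0, 0];
  }
  const parts = value.trim().split(/\s+/).map(Number);
  if (parts.length !== 3 || parts.filter(isNaN).length > 0) {
    throw new Error("position must be three numbers, got: " + value);
  }
  return parts;
}

// Turn the declarative description into scene objects, one per tag.
function buildScene(tags: SceneTag[]): SceneObject[] {
  const objects: SceneObject[] = [];
  tags.forEach((t, index) => {
    const rawId = t.attrs["id"];
    const id = rawId === undefined ? "object" + index : rawId;
    const position = parsePosition(t.attrs["position"]);
    switch (t.tag) {
      case "sphere":
        objects.push({ kind: "mesh", shape: "sphere", id, position });
        break;
      case "box":
        objects.push({ kind: "mesh", shape: "box", id, position });
        break;
      case "camera":
        objects.push({ kind: "camera", id, position });
        break;
      case "light":
        objects.push({ kind: "light", id, position });
        break;
      default:
        throw new Error("unknown tag: " + t.tag);
    }
  });
  return objects;
}

// "I want a sphere here, a box there, a camera there, a light here."
const description: SceneTag[] = [
  { tag: "sphere", attrs: { id: "ball", position: "0 1 0" } },
  { tag: "box", attrs: { id: "crate", position: "2 0 0" } },
  { tag: "camera", attrs: { id: "cam", position: "0 2 -10" } },
  { tag: "light", attrs: { id: "sun" } },
];
const scene = buildScene(description);
```

[Keeping an `id` on each object is what would allow the CSS-like lookup Kent asks about: a script can find an object by ID and change its properties without touching the rest of the description.]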

[00:31:07.902] Kent Bye: Great. And so what's next for both Babylon.js and glTF?

[00:31:12.940] David Catuhe: So what's next? We have a rich roadmap already on the GitHub repo. If you want to go there, we are at the very beginning of 3.3. We have lots of stuff. I would say for your audience, the biggest is clearly the support of WebXR, making sure that we can support it in a transparent way. We are also doing a lot of thinking about should we have an editor, an intermediate page where from, you mentioned Maya or Autodesk, you can do some tweaking, changing some properties and then saving it again in glTF. We don't know, that's a discussion we have with the community so far, and we have tons of advanced features we want to support, like Cascaded Shadows, more support for even better handling, so I highly encourage people to just go to BabylonJS.com, go to the GitHub page and you can see the never-ending list of features we want to implement in 3.3.

[00:32:05.420] David Rousset: And yes, on our side, we started to think about also integrating nice control in the 3D space to enable interactivity, like you were mentioning having a DOM in 3D. This is something similar like having cool control rendered in 3D, inspired by what we're doing in the Mixed Reality Toolkit, for instance. I don't know if you know it, but trying to put that in WebGL, and then it will be available in WebVR. Again, trying to follow our philosophy, starting in 2D, let's say you want to have a cool 3D experience using the mouse, but then switching to VR, you will have the kind of fluent, maybe, integration. We're thinking about that for the upcoming weeks.

[00:32:44.119] Kent Bye: Yeah, I just heard about the Mixed Reality Toolkit, the MRTK. There's an equivalent VRTK, which is different for Unity, but I understand that this is presumably in the Windows stack. What language is MRTK, and how do you translate that into what you're doing with Babylon.js?

[00:32:58.787] David Rousset: So, yes, MRTK is for Unity, so it's in C# and they've got specific shaders for Unity. And what we're going to do, we have really good guys, like this guy on my left, David Catuhe, is really good in shaders and everything, I'm not very worried about the technical part. But what I'm really interested in in MRTK is the design language. Because we are developers, so we're really good at building technical stuff, but building UX is not part of our DNA. So I'm really interested in reading what they've been doing, like the effects, like the behavior of the control, and do that back in JavaScript and WebGL, and this is what we like to work on. So we cannot port the control like that, but we can port the behavior, which is, I think, the most important part.

[00:33:43.202] Kent Bye: And what's next for glTF here at Microsoft?

[00:33:44.763] Saurabh Bhatia: So there's a couple of extensions that have been ratified recently that are very important. There's a Draco compression for geometry compression from Google that's going to reduce the size of the geometry, especially as it's being transferred over the wire. So Babylon has support for it in Babylon 3.2, but that's something that others can start adopting in the glTF ecosystem now. We've also done, I believe it's called Materials Unlit, which is kind of, if you don't want to do PBR, then an option is to just say, I've captured all the lighting information there, just display the texture and don't do any additional calculations. So for like low-powered devices, that's a quick and easy way to just render things. There's a few more that I know are in the works. I think at SIGGRAPH, you're probably going to have a bigger kind of a roadmap meeting where we get into more details on what's next. And while we are kind of doing these extensions, the other effort that's really going on is making sure that everyone is supporting it correctly. So there's the glTF validator effort. So if you're like writing out glTF files, exporting it out, whether it's like Max, Maya, or even like Paint 3D, we want to make sure it's valid. And you use the glTF validator tool to ensure that. And on the same time, if you're writing an engine that's loading it up, like Babylon.js or three.js, then we've got a glTF asset generator project, which generates all these unit test models, which goes and checks out all the nooks and crannies of the spec and makes sure that you're correctly supporting loading up the spec. So as we add new capabilities, we also want to make sure that all the existing capabilities in the 2.0 spec are supported correctly by everyone.
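
[Editor's note: A glTF 2.0 file declares the extensions it relies on in two top-level arrays, `extensionsUsed` and `extensionsRequired`, and the two extensions mentioned above are registered as `KHR_draco_mesh_compression` and `KHR_materials_unlit`. The sketch below shows how a loader might decide whether it can open a file. The `GltfDocument` slice, the `ExtensionReport` type, and the `checkExtensions` function are hypothetical names for this illustration, and the version test is a deliberate simplification. This is not the glTF Validator, which checks far more than extension lists.]

```typescript
// Minimal slice of a glTF 2.0 document: only the fields this check reads.
interface GltfDocument {
  asset: { version: string };
  extensionsUsed?: string[];
  extensionsRequired?: string[];
}

interface ExtensionReport {
  loadable: boolean;         // false if any required extension is unsupported
  missingRequired: string[]; // required by the file, unknown to the loader
  ignoredOptional: string[]; // used but not required, unknown to the loader
}

// Compare what a file declares against what a loader supports.
function checkExtensions(doc: GltfDocument, supported: string[]): ExtensionReport {
  const used = doc.extensionsUsed || [];
  const required = doc.extensionsRequired || [];
  const unsupported = (name: string): boolean => supported.indexOf(name) === -1;

  const missingRequired = required.filter(unsupported);
  const ignoredOptional = used.filter(
    (name) => unsupported(name) && required.indexOf(name) === -1
  );
  // Simplification: accept any 2.x asset version.
  const versionOk = doc.asset.version.indexOf("2.") === 0;

  return {
    loadable: versionOk && missingRequired.length === 0,
    missingRequired,
    ignoredOptional,
  };
}

// A file whose geometry is Draco-compressed (so Draco is required)
// and whose materials are unlit (optional in this example).
const compressedModel: GltfDocument = {
  asset: { version: "2.0" },
  extensionsUsed: ["KHR_draco_mesh_compression", "KHR_materials_unlit"],
  extensionsRequired: ["KHR_draco_mesh_compression"],
};

// A loader that knows both extensions, and one that knows neither.
const fullReport = checkExtensions(compressedModel, [
  "KHR_draco_mesh_compression",
  "KHR_materials_unlit",
]);
const bareReport = checkExtensions(compressedModel, []);
```

[The split between required and merely used extensions is what lets a low-powered loader skip an optional feature while still refusing a file whose geometry it cannot decode.]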

[00:35:14.383] Kent Bye: So great. And finally, what do you each think is the ultimate potential of virtual and augmented reality, and what it might be able to enable?

[00:35:24.786] David Catuhe: Did you see Ready Player One? Yeah. But that in positive. No, I think that the world will evolve with more collaboration, with more space where you can collaborate in 3D. I am French. I am living here in the United States. And I barely see my family. So that's one of the space where I clearly see the VR space as a ultimate tool to meet with my family without having to spend ages in a plane, like feeling the emotional discussion and communication thanks to VR. Yeah, that's going to be something with the improvement of rendering and the improvement of network and the improvement of our headset, going to be something I would love too. We also see something in the healthcare. Clearly, doctors cannot be everywhere, but thanks to that, it's a virtual presence again. I am a little bit unclear on what could be AR in the future. I saw some, I would say, inspirational videos where you are in a supermarket and you are looking for eggs and you don't know where they are and then suddenly on the ground there is an arrow guiding you and stuff like that to help people have a better life. So that's gonna be something cool. So I would say everything cool, but not the bad part of the VR and AR.

[00:36:35.648] David Rousset: Yes, on my side, I really hope to have a better productivity, like you said, in a general manner, to have productivity, because I'm a hardcore gamer, I was a hardcore gamer, maybe I'm too old, but I really enjoy playing games, but I know there is a tremendous potential for something else, like teleportation, but the true teleportation, like you said, for your family, and I'm living in France, working with a team in the US, it's not that simple not to be able to interact with people in a more natural way. I would really enjoy to have like a virtual presence of myself. And so is the healthcare, because my little child is suffering from troubles. I would love to help people who are disconnected also from the real world. They could be reconnected thanks to the virtual world. I think it would be awesome. So I do hope it will happen someday. And for augmented reality, I still don't know. HoloLens is an amazing device, but it's really B2B oriented. So I would say as a final consumer, I'm waiting for products that can reach me and like you say, I'm not sure I need help in the supermarket, but I would love to have augmented reality experiences that could be useful for my day-to-day life.

[00:37:44.495] Saurabh Bhatia: I'm gonna have to start off with, like, just being connected with your family in new ways. I think that's really the thing. I guess, yeah, all of us have kind of, you guys came from France, I was brought up in India and now I'm living here, so most of my family's back there. So I think that relationship of, like, being able to connect back in a truly meaningful way, I think that resonates really well. I kind of feel like I take a broader view of, like, when we think about augmented or mixed reality, it won't be just one device. It could still be our smartphones, it could still be X, Y, Z, something else, something new. I don't know what it is, but there's all these crazy technologies that are being developed, awesome machine learning and computer vision algorithms that we are going to come up with some interesting new way of using it that we haven't yet thought of. So that's my true believer self. It's like, yeah, we haven't yet thought of the awesome application yet. There's some nice things that we are seeing value in right now. And we want to definitely push towards those. But as we go towards those, we'll uncover new things and unblock new tech that will open up a whole new set of things that were virtually impossible before. So yeah. Beautiful.

[00:38:53.040] Kent Bye: Awesome. Is there anything else that's left unsaid that you'd like to say?

[00:38:56.301] David Catuhe: Just try BabylonJS, BabylonJS.com.

[00:38:59.288] David Rousset: Thank you for coming. I do hope that people will be inspired by what they could do in web in general manner, whatever the frameworks. Sometimes people just forget about what you could do in WebGL because they think that WebGL is slow or JavaScript. But go to check what you can do with WebGL and WebVR and you could be surprised.

[00:39:17.833] Saurabh Bhatia: Yes, I'll second that. Go check out Babylon.js. Or even glTF.

[00:39:23.798] Kent Bye: Yep. Awesome. Well, thank you so much for joining me today on the podcast.

[00:39:27.721] David Rousset: Thank you very much. Goodbye. Thank you so much. Thank you.

[00:39:30.803] Kent Bye: So that's Saurabh Bhatia. He's a program manager at Microsoft working on glTF and Babylon.js, as well as David Rousset and David Catuhe, who are the lead programmers of Babylon.js. So I have a number of different takeaways about this interview. First of all, I'm excited to go to Microsoft Build just to see what the latest and greatest frameworks and features are for Babylon JS 4.0, which was just released on May 5th, 2019. Again, this was an interview that was done a year ago, and so I think you get a little bit of a sense of how this is done by an enterprise, it's supported by an enterprise. It's internally used within Microsoft on a number of different projects, and they're adding a lot more visualizations and inspectors, and I'm really excited to see where this goes. I'm personally excited to see if there's going to be some sort of content management framework that is going to be integrated into one of these different, either A-Frame or Babylon.js, you know, for people who want to have a little bit more of a WordPress type of experience in order to publish content into the immersive web. And that's not something that we've seen just yet. I mean, there's actually some applications of people that are doing it on their own, but just to see a little bit more of an open source content management framework to be able to start to manage a lot of these different things. So I'm excited to see where this goes in terms of Babylon JS and what type of applications it's used for. It sounds like they have a lot of really strong enterprise applications. It's written in TypeScript, and I don't know if it's going to be easy enough for individual projects or if it's going to be a little bit more like really suited for people who have an entire, like huge team with lots of different people working on these different projects. 
Sometimes when you get into these frameworks, it could get so complicated and convoluted that you really need like an entire team to be able to really deploy out some of these different projects. And so I don't tend to hear a lot about Babylon.js as much in the WebVR community, but it'll be interesting to see how this compares to other things like Three.js and A-Frame. And I expect to see that because it has this continued development and so much focus from Microsoft, then it might be a little bit of a dark horse when it comes to these different frameworks that people look to when starting to develop for online on the web. The other big framework, of course, is the React 360 by Facebook. If people are already developing on React, then whatever offerings that Facebook has there may be very well suited for serving those different use cases. So I'm excited to go to Microsoft Build and see what else is being announced. This is a little bit of just like a historical marker and to see like whatever the latest and greatest is. As I talk to other people within the WebVR community, Babylon.js comes up again and again. And I just wanted to get this out there ahead of Microsoft Build 2019 and hopefully get a little bit more updated information as I go and check out all the latest and greatest news at Microsoft Build, which is happening on Monday, Tuesday and Wednesday of this week. And I'll be there for the first two days covering that for the Voices of VR podcast. So, that's all that I have for today, and I just wanted to thank you for listening to the Voices of VR podcast, and if you enjoyed the podcast, then please do spread the word, tell your friends, and consider becoming a member of the Patreon. This is a listener-supported podcast, and so I do rely upon your donations in order to continue to bring you this coverage. So, you can become a member and donate today at patreon.com slash voicesofvr. Thanks for listening.
