#1306: Kickoff of Meta Connect Series with CNET’s Scott Stein Analyzing Meta’s XR & AI Announcements

I interviewed CNET writer Scott Stein at Meta Connect about his first impressions of the Quest 3, the Ray-Ban Meta smart glasses, AI, and Meta's strategy as it moves into mixed reality. See more context in the rough transcript below.

Here are links to all 12 interviews from my series covering Meta Connect 2023, looking at first impressions of the Quest 3 and Ray-Ban Meta smart glasses, mixed reality trends, AI, Meta Horizon Worlds builders, WebXR and alternative production pipelines like React Native, Apple Vision Pro buzz, VR filmmaking, unpacking the changes in Unity's fees, and digging into Qualcomm's new chips with the XR2 Gen 2 and AR1 Gen 1.

This is a listener-supported podcast through the Voices of VR Patreon.

Music: Fatality

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to the Voices of VR Podcast. It's a podcast that looks at the future of spatial computing. You can support the podcast at patreon.com slash Voices of VR. So this is the first episode of a 12-episode series that's going to be unpacking different conversations and announcements and insights from Meta Connect. So this is the 10th Connect, and I've had a chance to attend all 10 of them, going back to the early days of Connect 1, which was invite-only. So this is an invite-only event, not quite like the last one, Oculus Connect 6 back in 2019, where it was the broader XR industry and anybody could come. So it was a little bit more of an invite list, but it's always an amazing time to have an opportunity to connect with a broader community of XR developers. So I spent a lot of the first day actually just talking and connecting with a lot of the different developers that were there and trying to get a sense of, like, what's the buzz? What's the state of XR? What are some of the hot topics that are coming up? And so throughout the course of this series, I'll be unpacking a number of the key themes that were coming up. And so this first conversation is going to be with Scott Stein, who's a writer at CNET. He had a chance to have an early look at Quest 3 and had also done a number of different interviews with folks like Qualcomm and a couple of folks from Meta. And yeah, it was just an opportunity to reflect on the deeper trends in the announcements that we all had just heard on the first day of Connect on the 27th. So this is one of the interviews that I did on the very first day, and a good one to kick off the entire series, because we're just thinking deeper around all the different announcements that were made around both the Quest 3 as well as artificial intelligence and how Meta is going to be integrating AI across their family of different products, as well as how smart glasses are going to start to play into this other branch, you know, more from the form factor of glasses and adding more technology into that: things like computer vision and virtual assistants and some really sophisticated, specialized audio with, like, five microphones that are able to capture a sound field and then play it back in a stereo mix of just really amazing binaural audio that feels very spatialized. So we're going to be starting off with this conversation that I had with Scott Stein, kind of our first impressions and just reflecting on the broader implications of all the different announcements that we had heard over the course of the day, and reflecting on where things were going in the future. So that's what we're covering on today's episode of the Voices of VR podcast. So this interview with Scott happened on Wednesday, September 27th, 2023 at Meta Connect at Meta's headquarters in Menlo Park, California. So with that, let's go ahead and dive right in.

[00:02:46.094] Scott Stein: Hi, I'm Scott Stein. I'm a writer at CNET. And I've been covering the VR space and AR space for years. You know, I've been at CNET for about 14 years. And I moved from phones, wearables, into looking at all this stuff since the first iterations of Oculus.

[00:03:02.448] Kent Bye: Great. Maybe if you could give a bit more context as to your background and your journey into covering this space.

[00:03:06.790] Scott Stein: Oh, sure. So yeah, my background before that was theater. So way, way back, it was an MFA in theater, playwriting. It was creative stuff, thinking about early emergent technology, and then I entered into tech kind of through, you know, working in magazines and just finding that my interest in tech in general was kind of the through line. The background was a little bit in psychology, too. I'm interested in the ways that, sociologically, people adopt new tech. So I've always kind of gravitated to thinking about new platforms, new ideas. And for me, that's more interesting than which product you pick: what is a new platform like VR or mixed reality, what's it going to do for people, how is it going to transform how people interact.

[00:03:50.393] Kent Bye: So we're here at Meta Connect, and the keynote was today. And the big three topics were mixed reality, artificial intelligence, and smart glasses. And so I'd love to hear some of your initial take. I understand you had a chance to go and have an early look at some of the technology. And yeah, what are some of your first impressions of what we were able to learn about today?

[00:04:08.502] Scott Stein: Yeah, it's interesting because, like, as they were saying, it's like Meta and Oculus have been doing this for a while, so they've had such a long foot in the door here, yet it also feels like they're trying to get purchase, like a foothold, in the new landscape to come. So, in a sense, I look at what they're doing now as getting an early foot planted in a new mixed reality realm, where it looks like a lot of players are coming in, and evolving some of those thoughts at a more affordable price. But I also feel like they're a bit caught with their smart glass vision. I look at that as, you have this simultaneous vision. Everyone's trying to get to that point. Nobody's there, but you see this lateral move where everybody's looking at exploring AR through VR because, for whatever reasons, the glasses have a lot of obstacles on a lot of fronts. And so you can just see that playing out, with Meta just running in parallel. And in the middle, they have AI. AI is like the glue or the peanut butter in the sandwich. They threw out a lot of experimental stuff, which at first seems wild and absurd, but also seems interesting. I mean, they had things like Tom Brady and Snoop Dogg voicing imaginary AI chatbots and then talking about it. You're like, what is going on here? So I think they look like they're throwing ideas out there, too. But I look at it as a product in transition.

[00:05:31.500] Kent Bye: Yeah, it's interesting what Meta's strategy has been. One metaphor I like to use is that they've kind of skipped over the enterprise and tried to go direct to consumer. But to do that, they've had to invest literally billions of dollars. And so they're very used to being a consumer-facing company with all the products they've had. But with emerging technologies, we've just now, at Meta Connect, learned that within the next month or so, they're finally going to bring back their Quest for Business arm. Whereas the Apple Vision Pro, at like $3,500, is priced more for enterprises. And with something that's more consumer-based, you know, I lean a lot upon Simon Wardley, who has this whole evolutionary model where, first, there's an academic idea; then there's a custom, bespoke enterprise application that takes a lot of money to build, something that's very handcrafted, and that's the enterprise space; and then you have the consumer market, where it's ready for primetime and ready to be dispersed out. And so with Meta trying to rush to almost subsidize these mixed reality headsets without having the broader ecosystem or use cases or the why-are-we-doing-this type of thing, it's kind of like we're rushing into this, trying to sell this vision for why mixed reality is so compelling, rather than having the medium itself be more slowly baked within the context of the enterprise, where there are companies that are paying for it. It feels like their strategy is to throw things out and to say, hey, we're going to make this available for consumers, we have the most affordable mixed reality headset. But yet, I'm still left with the question: but why? What's the thing that's gonna make me really want to get excited about this?

[00:07:06.801] Scott Stein: That's what I got the feeling of, too, and they're kinda caught in the middle in that, you know, last year with Quest Pro, they were looking at the business side of things. All the demos for that, what I saw in the first demos at Reality Labs Research, were practical, business-y meetings and productivity things. This time, all the demos were games. And you can tell that they've got a gaming console on the market. And kids use it. And my son uses it, who's a teenager. And they want to know, is this cool enough to get? There's a real mainstream part. At the same time, is it advancing things? Not everyone's going to upgrade to it. But Apple, on the other end, looks like they're starting from the beginning, building out this HoloLens-y, Magic Leap-y sort of business type of conceptual headset where they're going to eventually boil it down into something else down the road. Meta already has a working model for VR. But then you have to figure out how mixed reality should transform it. And right now it's kind of just being added into the mix. So it doesn't feel integral yet, because the OS, the whole landscape, is not built for mixed reality. It's an optional thing.

[00:08:18.441] Kent Bye: Yeah, there have been a number of different pieces on the film festival circuit that are starting to look at mixed reality. There was Eggscape, which picked up a third prize at Venice Immersive 2022. It's more of a tabletop game where you're using the controllers to guide this cute little character; it's similar in some ways to Lucky's Tale. And then you have Monsterama, which was at Tribeca in 2023, which is going into your living room, and it's transformed into a monster museum, and you can, like, fight with werewolves. And I think similarly with the Stranger Things demo that they're showing, you're in this spatial context of, let's say, your living room, and all of a sudden you have these portals with these hand-tracking interactions where you're able to open and close portals, but then grab things and squeeze them and use this psychokinesis type of mechanic. And so, I feel like it's on the cusp of, like, OK, this is starting to get a little bit interesting. But yet, at the same time, there's the hand tracking and the limitations of the technology and the fragile nature of some of the demos. It felt like it's not fully baked. Maybe it's going to get there. And it's just like the hand tracking, where they've been pushing out lots of different updates. So when I take a look at what's happening with these technology platforms, it feels like, just like we've had the Quest 2 on the market for like three years, they've been incrementally improving it. And it's one of those things where you can buy the piece of hardware, and the hardware mostly stays the same, but there's so much that can still be unlocked with the software layer. And it kind of feels like we're in a similar moment, where they have the new foundational technologies from the Qualcomm XR2 Gen 2 that's going to be in there, but on top of that, with all these new sensors and everything with mixed reality, there's a lot of potential. But yet, what the actual use case is and how the software feels, it may take some time for that to actually be fully baked into something that's really going to be a compelling use case and a killer app that's going to cause people to buy it.

[00:10:01.193] Scott Stein: Yeah, fragile nature of the demos is a good way of putting it. I think that's something that struck me when I got to look at it for my hands-on. It was exactly that. Things felt a little rough. And I mean that respectfully. It's that, exactly, not a lot of hand tracking implemented. Some of the demos were OK. And I felt like, with the possibilities for this, the processor and everything else, it seems like there's a lot of powerful potential. But it looks like they're waiting for people to figure out what to do with it more. On an OS level, it's more like an update to a phone, or an update to an iPad, or an update to a PC. On a game console, you have these real next-generation leaps; seven years go by. You don't have that here. So they want a consistency to what Quest already is. But that almost holds it back. It makes me wonder who's going to advance that. Are they waiting for the app developers to do it? That's going to be hard. I expect more change on an interface level. But if you're still using the controllers, not fully leaning on hand tracking, it feels like it should change your mindset, but it's not fully changing it yet.

[00:11:09.337] Kent Bye: Yeah, I love the way that you put that, that the AI is the glue that's tying all these different things together, because that was a big theme in today's keynote that I was particularly interested in, seeing how, with all these different innovations that have been happening in AI, and certainly Meta has been at the forefront of pushing forth a lot of the foundational technologies. They're not always necessarily given credit for some of the consumer-facing stuff, but they have been doing a lot of the research. And I think this is the first time that I've seen them make this pivot towards really trying to think about how to productize some of these different technologies, large language models, generative AI. And there's also been a bit of a disconnect between Meta's next-generation spatial computing efforts with VR, AR, and mixed reality and their traditional family of applications: Instagram, WhatsApp, Facebook. And AI helps bridge that gap a little bit more, because they already have users, and there are the types of applications of these chatbots and these AI personas that are loosely tied to these different celebrities, but with a pseudonym, in some ways to give a little bit more separation from the identities of who these people actually are, which I would certainly want as well if I didn't want to have an AI chatbot of a large language model speaking on behalf of who I am. But anyway, I'd love to hear some of your thoughts making sense of what's happening with AI in the context of Meta, in both their traditional family of products, but also the future where they may be taking things in the context of XR.

[00:12:36.972] Scott Stein: Absolutely. I'll give you sort of a preview from two interviews that I did that I haven't written about yet. But talking to Angela Fan and talking to Andrew Bosworth here, it's interesting, because from what it sounds like on the personality AI side, it sounds like they're really going to be more just to kind of chat with. And I'm like, when can you play a game with them? Or could you do more? And it may not be that yet, but it's the beginning of thinking about these tools in your toolkit. But then on the glasses side, it sounds like they're trying to build that assistant-level thing, which they've talked about for years, that there would be something in AR that would watch everything you're doing and give guidance, like Project Aria. Like, I feel like they're almost starting to begin to potentially use the Ray-Bans as a test bed for that, from what it sounds like they're doing with the AR1 Gen 1. But then I guess the other thing you're asking about, like, where they seem to be going with AI, I was a little let down that we didn't hear more about some of the perceptual AI advances. You know, hand tracking is the big one. I kept thinking, like, well, hand tracking has got to be getting better on Quest 3 with the AI possibilities on the chip, neural processing in the cameras and depth sensing. Bosworth said, I mean, he and Hugo Swart definitely acknowledged that the capability is there for that. I guess, like, I remember the Quest didn't release hand tracking till, like, midway through. Maybe they're going to finesse that as the product's in the wild. But I think that's so important. That's almost more important, because I think, like, how it understands the environment, where it interacts with your hands, what does that mean for a Quest user? How is that fundamentally changing? Right now, they're just being like, oh, we're going to mesh a room, put things in it. But I think it should be an AI layer that begins to recognize objects in the room and have a more intimate relationship with that and what's going on with everything you're sensing. I don't know. It sounds very vague the way I'm describing it. But that's the kind of dream of AI and AR. I don't know if it's so much about the chat part of it. It sounds like they had a happy accident to some degree with generative AI, that this whole movement happened, and then they can integrate that like a lot of companies, which is very exciting. But perceptual AI seems to me to be the most fascinating territory, unless maybe I'm jumping ahead of myself too much.

[00:14:53.775] Kent Bye: Well, yeah, I mean, there's the other component of some of these technologies where, you know, a number of different times they put up a slide that said, build responsibly. But I feel like a lot of their responsible innovation teams that were separate teams have been dispersed and put out into other local teams. And I feel like having a checks and balances on some of the ethical implications of the technology can be difficult. I did some interviews with Applin and Flick about a paper they did on the best practices of responsible innovation. And they were saying that, generally, it's good for these responsible innovation teams to have the capability to push a big red button to say, stop, we should not do this. And when the teams are kind of integrated into the engineering teams, there isn't that kind of dialectical process that allows them to actually question whether or not this is a good thing to be pushing forward, or some of the different harms that could be coming from some of these things. I guess I understand the limitations of some of the large language models and generative AI technologies. And what I wonder is, to what degree are there gaps, and what kind of harm could be done when they just start to push these things out? In their own QA, they've maybe not been able to really understand the full implications of the types of harms that could happen. And when they start to deploy it, then they start to see the gaps of what it can't do and how it can start to bring harm in different ways. Those are the types of things, and I think that they had a slide that said that, but just knowing that they let go of a lot of their policy folks and they've sort of disbanded the organizational structure of their responsible innovation team, I guess I have a little bit more questions as to how that's going to move forward, and if it's just going to be move fast and break things, back to that kind of ethos.

[00:16:30.975] Scott Stein: It totally feels like that. There's a very fast and loose feeling going on here. It's that generative AI, for sure. And then, yeah, Facebook's history with this. That's a big question. Right now, it's very unclear how these personalities, these AI personalities they're doing, really work. I get a sense of it in terms of, like, they have different permissions or sets, but it's still pretty vague to me. And then, how is that going to work for next year, when they want to have people create their own and build that for themselves? How will those be responsibly done by people? It does feel experimental. And then also, there's the question down the road of where these things go with AI. Boz was talking about how the goal for them for AI is a lot different on the glasses. He was bringing up that it's got to become more personal, like intimately personal with everything you do. And that becomes a whole level of trust and privacy. How do they make that happen? If it knows everything you're doing, which seems to be their goal, that you're going to have a deeper personal AI at some point that they're shooting for. Different from the Meta AI, but it's sort of the assistant AI. You have to open it up to all these permissions. And then that goes back to, what are you sharing with Facebook? Things that people had turned off before about connected apps and other things. And now, does it know everywhere you're browsing? Does it know this? To be a good assistant, it seems like there's going to be a desire to have it want to know those things in order to help you better. Are we ready for that? And from who? And how does that all work? That's all down the road. That's a whole other level. Right now, they're looking more at these more outward-facing generative AI things, which, again, some of it they were saying hooks into Bing. Some of it, how are their data sets working with this? Not entirely clear with that.

[00:18:19.697] Kent Bye: Yeah, during the pandemic, there was a big kick that Meta had gotten onto around this idea of contextually-aware AI, where the AI is able to understand all these different contextual dimensions of everything that's happened. And they even had examples like, where did I put this memento from my grandmother? And it knows where it's located within your house, because it saw you put it there. And because of that, it's going to tell you where you put it. So just this idea that it could have this store of memory. And I know that they had this egocentric data capture, this Ego4D, like all these benchmark tests for these different types of memory and other aspects that, when you read it, reads to me at least like this dystopic future of AI that's out of control and has no bounds. And with some of these large language models that have had this layer of alignment, where you have reinforcement learning from human feedback, where you have this tuning that happens, that's a layer that increases the safety. But yet there's been this whole movement of people finding ways to jailbreak and get down to the root of the information that's there. And so if you have these really personal AIs that are holding all this information, how do you ensure that people can't jailbreak that information and get access to it? So it feels like we're kind of sleepwalking into this future that is, I don't know, for me it's a little bit scary. But I don't know if you asked about contextual AI or the privacy implications of it all.

[00:19:40.596] Scott Stein: Yeah, well, I mean, he kept emphasizing, you know, trusting it, and that's an intimate level of stuff. And how do you do that responsibly? Those are big unknowns. What surprised me is the way the Ray-Ban glasses, as they were announcing, are really going to have an AI element next year that, in a sense, sounds like a variation on what they were thinking about with Project Aria, which is that test bed for some of the contextual awareness things. Maybe it's, you know, how do they turn that on next year? How does it turn on awareness of what you're looking at? Is it by cue? I would imagine, power-wise, it's not going to be looking at your stuff all the time, but maybe you trigger that. It sounds like they're beginning to play with exploring inputs and having the AI understand that. That's the beginning of where their aspirations go for AR as a personal assistant, which is a very different path from VR as they've done it before. And that's a whole leap that fundamentally feels pretty different in terms of what you're trusting it for, what you're using it for.

[00:20:46.253] Kent Bye: So you had a chance to talk to someone from Meta. What are some of the takeaways around what's happening with artificial intelligence in the interview that you were able to do?

[00:20:54.218] Scott Stein: I think the takeaway was that there's kind of a, I don't want to say spontaneous, but there's a little bit of, like, oh, we're getting to do this right now. And I think everyone's kind of leaping into the moment. Some of the work, it sounds like, that they were doing on some of this stuff with the personalities had been AI research they were doing before, some of it with translation and some other stuff. It's almost like putting some of these experimental research ideas into a living, ongoing process experiment. It felt the most fast and loose of all the things. And the ability to suddenly have these, like, 26 personalities that they then, it's like an NPC in a game. They seem to be assigning visuals from celebrities to kind of add spice to it. But it's not the same; the celebrities are not informing the AI. But next year, my question is, are you training your own AI to be you? Or is there future work in AI assistants? Does it mean that you are creating a data set that can then work without you? Is that your doppelganger? How do those intertwine? Are you enabling some of this data set down the road to be your outward-facing other self? Is that the goal? Or are they not even ready to think about that yet? Because right now they say, oh, that's your personal AI. The dream down the road is more of your personal AI data set. But they're dealing more with the assistant search part. How do those intertwine? Do those intertwine? What does that even look like? Those are my kind of weird thoughts about that. What are they trying to prove by showing Tom Brady, Snoop Dogg, MrBeast? I think that they're creating test models to say, this could be how you do it next year for something that maybe is not you, but could represent you in certain instances. How does that happen?

[00:22:48.280] Kent Bye: Yeah, they announced an AI studio, which they didn't give a lot of details on, but it sounded like it was for creating these personas. There was a company at Augmented World Expo called Inworld AI. Wol was a piece by Niantic and Keiichi Matsuda's Liquid City that was creating these characters with their tech, and Neal Stephenson's been working with them a little bit. And you have, like, a bounded set of knowledge where those characters are operating with what they know and don't know, but they're able to then use the large language models as more of a conversational interface, rather than tapping into the vastness of all the knowledge of the internet and trying to come up with a statistical average, to be a little bit more domain-specific in some ways. Yeah, I don't know what the technology is behind it, but with these personas, it feels like it's maybe more contained or bounded to these large language models that have a little bit more character and personality.

[00:23:36.157] Scott Stein: Yeah. It all feels like parts of a future product that are, like, yet to materialize. It feels like now we're, like, working with existing products, seeing, like, glimpses of bits and pieces. Like, you have the mixed reality test bed on the Quest 3. The AI part is kind of everywhere. And the perceptual social comfort is in the glasses. And, like, are these going to triangulate into some new thing? I still feel that the fundamental thought process on this should be, like, transformationally different. And that's what Meta says, too. But you don't see it fully taking shape in that form yet. Like, they're still taking existing product paths and going, hey, these are glasses that can look nice and take photos and listen to music. And this is a thing you can play games on. So they haven't taken the full leap themselves. And Apple doesn't seem to have taken the leap. Like, they're doing sort of mixed reality floating screens for work, but is there a fundamental, transformative idea of what even the nature of this is? Are we going to leap beyond apps? Is it going to be more than just browsing screens?

[00:24:46.914] Kent Bye: Yeah, the process of demonstrating a lot of these AI apps in the keynote felt like these walls of text of these chatbot interactions, which was kind of funny to see, just, you know, the different varieties of ways that they're interacting with them. But I also found it quite interesting to see how they're taking things directly from Midjourney, like the /imagine command to create a generative AI prompt, and be able to have that in the context of a chat, or to create a dynamic sticker from generative AI and be able to send that to your friend. And there was a little bit less of the avatar stuff in this keynote and more of things like integrating style transfer generative AI techniques into Instagram, so that we're going to be perhaps flooded with all these AI-generated images on these social platforms like Instagram, and just really leaning into this being a thing that drives engagement and novelty in people. What I was really struck with is just how all these innovations of AI are starting to blend into what are really these mass consumer-scale products hitting the mainstream.

[00:25:44.792] Scott Stein: Yeah, it seems like exactly that. A lot of that stuff, which will probably be the most widely used, frankly, are these things that you can use without a headset or any of their products. And that's stuff people are going to start playing with right away and go, oh, I can create stickers fast. And it's easy to get traction there, because it's probably free. And in a sense, that's the most here and now. I guess the part of me that leaps ahead goes, it reminds me of the thing I also think was missing today, which was generative AI as it applies to a little more of the metaverse-y space. You know, Horizon Worlds was barely mentioned, I think, in the presentation. But looking at what Roblox has been doing lately, which is, they're interesting. They seem to be really fronting some generative AI with the idea of it being an assistant for creation, for, I think, down the road, more spontaneously creating stuff in your world that you wouldn't want to spend time learning how to do. I would think that that's what Meta should be doing for the metaverse. But how does that manifest? Can you start creating things and doing things where you don't even necessarily know what you're doing, but you can help discover it for yourself? Not a chat, but a kind of world-creation, world-awareness type of a thing. It's a big question mark. Because right now, the chats do seem very contained. Though I think it's nice to think of them as an assistive mode in VR. But I'm not really sure, where are the deeper hooks in it? Like, is it just something you're going to chat with? Or is it something that's going to really help you jump to a settings feature or help you do something? It sounds like maybe not so much at first. But on the glasses, the AI assistant might be a little more of an OS layer down the road, like something that will be helping you be aware of what you're looking at or bringing up things.

[00:27:40.055] Kent Bye: And so as you are digesting and trying to take some of these different insights that you had from talking to Meta folks, what are some of the takeaways that you're trying to suss out in terms of what we should also be thinking about?

[00:27:50.939] Scott Stein: I think it's interesting to think, like, who's ready to do what? Like, I was talking to one random developer today. He was talking about working on other platforms and this. And how much is Meta leaning on developers to figure it out? And the relationship between that and App Lab is interesting. App Lab is pretty hidden, as we were talking about. And I almost think that they need to, in a Steam type of way, make a lot of these experiments much more visible, so you can trust people to discover great games and ideas. I think it's going to take a lot for people to make the leap to creatively build in mixed reality. And I'm not sure how much of these will be viable products. But to let that experiment live for people so they can try it and play with it, I think about, like, the arts groups, the Sundance and Tribeca folks. It's kind of a dream to say, I can imagine a lot of these groups in a year or two going, wow, in a cheap headset, you could make, like you said, some mixed reality home experiences. Whereas, like, how many people are going to have a Vision Pro? But a fair amount of people may have a Quest 3. But I think that's all experimental stuff. And I don't know how many people in the moment looking to make a profitable game are going to look at a subset of the Quest 3. And I don't know if that's going to require everyone to be on board, like I keep thinking about next year: between Vision Pro, whatever Samsung's cooking up, and this, will that create a commonality to say there's a reason? Could you commonly make mixed reality apps? I guess that's my question. That's kind of my biggest overarching question: how fast that mixed reality landscape can emerge.

[00:29:32.301] Kent Bye: Yeah, and you just recently went to the Apple event, where they had a watch with, like, pinching gestures that you can start to use, kind of like a click, that can be detected with the watch, and also, like, spatial camera aspects. But, yeah, as we move forward, we have Apple that's coming into the mix. You mentioned Samsung working in collaboration with Google and Qualcomm, so there's this other tier. And we also just had an opportunity to get a little bit of a briefing from Qualcomm, like, last week, where they were talking about all the new features of the XR2 Gen 2, and they announced that they have over 80 XR headsets using the XR2 Gen 1. So you have this whole entire ecosystem that has been developing with XR generally. And now we have all these new AI capabilities that are built into the chips, with both the XR2 Gen 2 from Qualcomm as well as the AR1 Gen 1 that is going to be in these Ray-Ban Meta smart glasses. So these new chip capabilities are going to start to expand what's possible with these platforms. I feel like it's the next iteration of the platforms that we're going to be seeing for the next two or three years, and, for me at least, I get excited to see what kind of capabilities are going to be slowly developed. But right now, it's still so early days that it's hard to know what exactly that looks like.

[00:30:51.594] Scott Stein: Yeah, ahead of Connect, I talked to Hugo Swart at Qualcomm, who's the head of XR, because I was curious exactly, like, what do these chips mean? I mean, they are the player; except for Apple making their own chip, Qualcomm's chips are in all the products. And they debut on the Quest, but you're going to see them proliferate everywhere else. And that's exactly the interesting question. There's a lot of possibilities in AI and mixed reality. But for the glasses part, it does seem like the question of glasses is being spread a bit across things. Like, I was asking about, I mean, last year they had AR2 Gen 1, this other more advanced AR chip platform designed for AR glasses that would work with phones. AR1 Gen 1 sounds like a step back, but it's meant to be AI functions without the full AR. He made a comment, though, about how it sounds like there's been a bit of a holdup, in the sense that some of the bottleneck might be displays and optics. Or, I mean, more optical technology, not to put words in his mouth. You can read the article. But, you know, he said it's a bit of a thing that we still haven't seen any glasses with AR2 Gen 1. And they should be forthcoming, but it's interesting because that was last year. So, it sounds like, in a way, not a pullback, but they're refocusing, maybe, on smart glasses that do some of the main functions before the optics, and then everybody's now pushing for the mixed reality on the VR side. It's like everyone's working out those thoughts, Apple too. All the dreams of AR glasses got pushed into mixed reality passthrough in VR. So I think that's what's interesting: you're going to see a generation of more products figuring that out. We thought we were going to see that last gen with the products, but I feel like we're really going to see it now, maybe toward a generation ahead of that, where it really gets solved.

[00:32:49.212] Kent Bye: Yeah, I think that reminds me of what Hugo said during that briefing, where he was basically like, you know, Qualcomm can make these chips like the AR2 Gen 1, but if no one makes anything with them, then there's only so much that Qualcomm can do to kind of force the market. You have to have the OEMs actually building something that's compelling in order for that ecosystem to grow. And so they're stepping back with the AR1 Gen 1 and having all these other AI features. But also, I was really struck by the spatial audio that's in the Ray-Ban Meta smart glasses. To hear, you know, those virtual haircut types of demos that people have done with binaural audio that goes back and forth, giving you this sense of spatialization. And so it's not doing anything that's super fancy in terms of ambisonic sound fields, and it's not doing any head tracking or detecting where you're looking; it's just recording what you were doing in the moment and then playing it back in the video. But that playback has this deeply immersive quality. And so I think back to the Bose AR Frames as a platform, which had a lot of great potential, but I think it was maybe a little bit ahead of its time in terms of people knowing how to create these different spatial audio experiences. But for me, I think the spatial audio component of this is going to be a real killer app, where, you know, you can put on these glasses and have a conversation and be able to hear things without other people around you being able to hear it. So the way it's able to direct the sound directly into your ear feels like a kind of magical technological innovation. But I feel like it's also one of those things that actually has a use case, of people talking on the phone or listening to music or wanting to have these on, at least from an audio perspective, to have that type of auditory experience.

[00:34:24.623] Scott Stein: Yeah, it's a really cool idea, and it sounded really fun doing the demo of it. I was like, oh wow, I can really hear these things, like an ASMR type of a thing. And it feels like they're figuring out some of those thoughts. You know, Apple's kind of leaping into this, which we haven't even tried: the iPhone doing spatial video for the Vision Pro. Bosworth was even saying that that's been on Meta's mind with the glasses, but they're waiting to figure it out, to kind of find the use, find the purpose. But it looks like this is them acknowledging that they're gonna figure that out, not on these glasses, but, like, for the video, spatial video, but at some point I would imagine they will. Yeah, that's super interesting. And also, to Meta's credit, I've seen so many people wearing these glasses already all throughout the event. It's really hard to tell they're wearing them. I find that these have become really normal-looking. That seems like so much of their mission, which is an interesting mission: can you create glasses that just look normal? And I think they have pretty much achieved that for that look. That's also the step to just be like, I'm buying glasses, and they just happen to be smart glasses. Can they get over that hump? Because that's a big hump. Otherwise, you're getting to this point of, like, am I always carrying something else around that's my thing? Or have I really bought into the idea that my next pair of glasses are going to be, you know, prescription Ray-Ban Meta smart glasses?

[00:35:54.617] Kent Bye: I heard some people talk about how the increased resolution of the 12-megapixel camera is at the level where people can use the smart glasses to take a photo rather than taking out their phone, kind of a hands-free type of thing. But there's also this Instagram live-streaming type of thing, where you can hold a phone and shoot yourself, but then also switch into the first-person perspective and go have another adventure. So to be able to give an introductory context, but then move into this other mode of live streaming, kind of reminds me of going back to the days of justin.tv, or just the ways that people are kind of broadcasting their lives. And I don't know where that's going to go, but yeah, I feel like that's another area that at least some people that are content creators are excited about.

[00:36:36.308] Scott Stein: Yeah, I mean, it's the sort of stuff that makes me think of Snap and all the stuff they did. But Snap felt, by design, to be kind of early-influencer, very over-the-top glasses design, pretty much across the board, you know. To me, this just seems like something that has a different meaning now. Like, yeah, I feel like it feels well-timed for the moment. It seems like a lot of people, when I casually talked about this and I said, oh, these can also instantly go to Instagram, people were like, oh, wow. People understood that in a way that was more intuitive than VR. And so I think that was a smart play. And it's also, like you mentioned, audio has been around in glasses for a while, and then Amazon just had its new Echo Frames. And that's kind of becoming a little more accepted; especially in the AirPods world and earbuds world, people are more accepting of that. And it looks like Meta is taking that and easing the camera part into it, so in a sense, to kind of almost make the camera part not so intimidating. I also made these arguments today when people asked about privacy and the camera. And frankly, there are those issues. I said that when I reviewed Ray-Ban Stories two years ago. But we're frankly in an era where you could put any camera on you. You're using your phone cameras. I don't really understand how it's fundamentally any different other than, yes, you could be surreptitiously filming, but the light would go on. But I feel like maybe we're ready for that now. I don't know. Maybe I'm leaping ahead, but I don't think it's that wild an idea to have cameras on your glasses.

[00:38:22.513] Kent Bye: Well, to push back on that a little bit, there's a whole aspect of bystander privacy, which is, as a bystander, have you consented to have your location be live-streamed in that moment? You know, I think on the issues around bystander privacy, from a policy perspective, Meta kind of kicked the can down the road, saying, oh, you're responsible as a user to make sure that you have consent from everybody that you're putting on the camera. But how often does that functionally happen? So now you have this situation where, just by the way the product's designed, you all of a sudden have all these different potential bystander privacy violations. Maybe in specific contexts, like people's homes or other places, it's a little bit more sensitive, whereas maybe when people are out in public, it's a little bit less of an issue. It's something that they didn't really figure out a good solution for, but yet we're moving forward with it. So I wouldn't say that the issue isn't there anymore; it's just that they ignored it and progressed forward because no one's complained loudly enough. And there's this idea in technology policy circles called the technology pacing gap, which is that technology advances so quickly that the policymakers are put in this position where they have to do all this sensemaking around, like, what are the implications? What are the harms? And there's this dilemma where either you regulate too early and you stifle innovation, or you wait too long and then the genie's out of the bottle. So there's this very small window within which you can actually implement viable technology policy. And so this feels like an issue where, with these different changes, are there going to be some fundamental shifts in terms of our concepts of bystander privacy?

[00:39:59.640] Scott Stein: Yeah, and also the shifting ground of what the product is with updates. I think this is something that Meta does a lot with their products, sometimes in a positive way, where you go, oh, the Quest 2 has had so many software updates, and they open up so many features that are experimental, and the nature of the product is transforming from under you. And it sounds like next year, with some of these AI camera tools, the nature of these glasses is going to shift. So my question, too, is, like, when is it capturing info? Right now, we're talking about just photos and captures. But is it going to persistently be capturing information? It's the same question as the test bed of the Aria glasses. In the future, these things might be always scanning. And then what? Are you covering up the LED? What does it know about your environment to get an AI awareness of whatever it's doing with the camera for intelligent vision analysis? That's a big extra unknown. It's not just capture; it's what data is being taken in of your environment, things around you, text, documents, whatever it is. It sounds like, I mean, starting next year, they're going to start having this thing do some stuff that I don't know about yet, you know. And, talk about moving fast, it's not even the product that we think about right now. It could very well be that in a couple of years the Ray-Ban glasses are really kind of a different product.

[00:41:28.088] Kent Bye: Yeah, when I did a Ray-Ban Meta smart glasses demo today, they gave me an opportunity to have a conversation with Meta AI. So then I was like, OK, what do I ask? And I was like, well, where is the nearest Whole Foods to where I'm at right now? It wasn't feeding in the location at that point. And it wasn't able to do anything with computer vision at that point, so it wasn't able to look at what I was looking at. So these are things that they're going to be adding over time. I think about these different aspects of computer vision and things that have a little bit more complicated privacy implications or whatnot, and that they're maybe taking a little bit safer approach. But as they expand it out, it's going to have these additional capabilities that are built into these assistants. So yeah.

[00:42:08.875] Scott Stein: Yeah, and it brings up the point: we're at a developer conference. And so, exactly, ideally you'd hope that all of these AI tools and their permissions and what they can do, this is the real open doorway for where Meta's going. But that hasn't been fully explained or rolled out here. So, maybe mixed reality is a tool. But how do people, developers, how does anyone understand what Meta is really doing with AI on all fronts on these devices and their platforms? We're definitely in a state where we don't know. And even the developers don't know. And so it's still in a state of uncertainty. It would be nice at a point like this to get a clearer rundown on that, to get a better understanding of what Meta AI, or the AI assistant, as I believe they're calling it, is. What are the terms of that going to be for the glasses? Where can you access the AI on Quest 3? Where are the hooks for that? Right? It's a big unknown. And when I think about all the cameras on the Quest 3, all the possibilities, they're massive. It seems like, you know, whatever you have on the glasses, you're going to have all of those possibilities and more on Quest 3, although you're not wearing it around all the time. But you definitely have a whole bunch more opportunities in environmental understanding and perceptual stuff. It's like a dot, dot, dot. We just don't know.

[00:43:37.584] Kent Bye: There's this whole egocentric data capture, this Ego4D that they talk about, which is the research to be able to do this first-person perspective capture of all this data, and what kind of innovations can happen when it comes to algorithms for artificial intelligence and machine learning being able to digest and process and make sense of the different contexts and information. So it's certainly a research project that we're kind of going into. So yeah, I definitely see this sibling relationship between XR and AI as we move forward. But I'd love to hear if you have any other comments on the actual Meta Quest 3 and some of your other thoughts on that as a technology platform that's going to be launching here soon.

[00:44:14.222] Scott Stein: Yeah, so, that's where I'm at with Quest 3. I'm really curious to know what its, like, limits are, you know? I don't know if we can really know that at the moment. But it seems, conceptually, to be pretty exciting as a platform leap. And when talking to Qualcomm about the kind of bandwidth for the AI, similar to the glasses, because they're promising a similar, like, leap in AI or neural compute on those, it was suggested that, again, it could be AI, it could be simultaneous processing, it could be taking some bandwidth of mixed reality plus your hand tracking to get a deeper, multiple understanding of things. Or it could be fitness sensors, it could be extra sensors, body... I mean, Hugo Swart was talking about, like, you know, external trackers like we have with the Sony Mocopi sensors or Vive Trackers. Will we get into smartwatches? I mean, let's pause a little bit on this. You know, he gave a he-doesn't-want-to-overload-the-system kind of an answer. But I think that's a big opportunity for Meta with fitness, where they kind of corner the market for now, because Apple seems to be really not addressing it for now on Vision Pro. And I don't know if they're going to address it. I don't even know if they're going to address it in this generation of Vision Pro, because it seems very stationary. And it seems very much about exploring, not moving around. And that could change. But those are the questions I have about Quest 3. I'm really excited about seeing how hand tracking evolves. Is this the generation where, as Meta's promised, maybe hand tracking could be the main form of interaction, like Vision Pro? Can they make it good enough? Or how many iterations can they get to for that? Because I feel like I get the VR gaming part of it; that makes sense. You have things like PlayStation VR 2 that just have this hardwired connection and foveated rendering that this can't do. But it's all the other questions, like the interactive environmental understanding, that I think could make the Quest 3 basically AR glasses before the AR glasses arrive. And that seems really exciting, maybe even more exciting in some ways than Vision Pro, in the sense that Vision Pro has all the power, but Quest 3 has the affordability and the mobility. Somewhere between the two, I feel like the ideas will be solved, you know?

[00:46:46.272] Kent Bye: Yeah, there was a little dig that Boz had during the keynote where he said that the Meta Quest 3 is going to be the most affordable spatial computing device. And he very specifically was using the language of Apple in that moment to talk about the affordability. So certainly it's a completely different model, where Meta's trying to, in some ways, democratize these technologies and get them into as many hands as possible. And certainly what's happened from the DK1, DK2, and CV1 on into the first consumer launches of the Rift, and now into the Quest, has driven so much innovation in the field. So I feel like, like you said, this is a device that could be the best mixed reality AR device that we have out there, because it is so affordable, but also it's got all these new technological features that are going to be able to help drive and push forward what's possible as this mashing together of all these things. And maybe, as we're here with all these developers, there's someone here that's going to have that idea. Beat Saber didn't come out until May of 2018, and we first had the dev kits back in 2013, so that's a good five years of people tinkering around with the medium of VR before we had Beat Saber become this killer app. So maybe there needs to be another, like, five years of the mixed reality device before we have the killer app of mixed reality that is going to drive people to say, ah, yeah, I definitely need to get that device for that use case.

[00:48:04.431] Scott Stein: Yeah. Apple's finessing the technological achievement territory. But everyone needs the dreamers and the weird ideas and the potential; where are those going to bubble up from? It's interesting to me, at $500, however they're making it work, they always try to make their stuff very affordable however they can. But that's a vast difference from what the Vision Pro is going to be. Like, there are inevitably not going to be that many people that are using Vision Pro. It's going to be like the HoloLens, Magic Leap, you know, world of hard-to-get-your-hands-on tech. A lot of people, comparatively, are going to have Quest 3, and people can noodle around and play with some ideas, like you said, and think of some things. And it makes you realize that the dreamers in the space are what drives it. And big tech doesn't have all the answers. And you can tell that big tech's a little bit like, you tell us what the answers are. And I kind of feel like that's happening across the board. It feels like for Meta, there's a little bit of that, like, we think this idea is cool. We can't wait to see what the developers do with it. You know, Apple's like, we think this is the future. Can't wait to see what the developers do. Well, it's like, who's going to be that person? I think the ideas start with the art and the great ideas, and then the tech companies get inspired by this. So I think we're at a new phase where, like, the whole idea needs to be kind of rethought and rebooted, mixed around. There are great AR and VR creators out there already that are, you know, indie, and they're dreaming, and they're tinkering. I think this is a more tinkery headset than the Vision Pro by far, just because it's affordable. So I think some people are gonna pick it up and just start going, I got an idea, I wanna get it on its feet. I wanna prototype something, and it needs to be kinda messy now. I don't think it's anywhere near finished. You know, Apple's presenting a very finished, seemingly finished model, but I don't think we're ready for it to be finalized like that.

[00:50:12.005] Kent Bye: And finally, what do you think the ultimate potential of virtual reality, augmented reality, mixed reality, and AI might be, and what it might be able to enable?

[00:50:20.753] Scott Stein: Oh, man, that's a massive question. I mean, I keep thinking about AR in a much more diffuse, perceptual, sensory way. Like, I keep dreaming about it after reading Ed Yong's An Immense World earlier this year, thinking about animal perception. I'm like, what does it mean to have perception? Are we going to change its nature? Are we either enhancing our perception, or are we going to remodel the meaning of perception? This sounds very, like, very trippy, but some of the fundamental things about AR seem to be about that; you know, they always talk about having superpowers or sensory superpowers. And it's not so much to me about, like, navigating an OS. It's like a persistent, wearable, ambient thing. And so if that becomes the future of AR, then VR and mixed reality need to meet it halfway. And then I think some of it's appointment-based, and some of it's ambient. Not only does the whole world have to get mapped for it, but, you know, Niantic talks about reality channels, talks about tuning into the different levels of the world, and I think there's that. But I think it may even be deeper and weirder than that. To me, it's no more weird than what social media turned into, which we didn't anticipate. Mass-scale stuff should be fundamentally weird. But I think of it as perceptual. I guess that's the future to me, is we're very app-based now, and it sounds like the future of this stuff, as they point out with AI, is going to become a layer. And are we going to transcend apps, rethink that? I don't know how you get there. But it sounds like everything in Meta's world is like an AR layer with AI. And I don't know where the apps come in. Maybe it's augments, as they talk about, these little bits and pieces. So I think the future of this sounds much more like a mix, like a weird, hot mix of stuff versus, oh, I'm opening my app, which I think has been the past 15 years.

[00:52:40.813] Kent Bye: Nice. And is there anything else that's left unsaid that you'd like to say to the broader immersive community?

[00:52:45.841] Scott Stein: No. I mean, think outside the box. I'm trying to read, and I'm open to book recommendations, so ping me on Twitter, or X, or Threads, or wherever I may be. But I'm interested in thinking where I'm not thinking. So I'd say to the community, it seems like a lot of the stuff is trying to map out what's here, but I think it's also about what's not here. And I think it's taking models that you may already have, art groups like Meow Wolf or whoever it might be, or people working in AI art. Yeah, it's definitely a think-outside-the-box moment. So I'm excited about that, because I don't know that the models need to be what they currently are. It doesn't need to be locked in like this. So, yeah, excited to see what happens.

[00:53:26.747] Kent Bye: Yeah, well, it feels like we're on the threshold of a whole new paradigm of reality as we move forward with all these technology platforms and devices and the intermixing of AI as the glue to it all. And yeah, I really appreciate all your different reflections on this, since you've been an OG of OGs looking at this stuff for a long time, checking out a lot of these different demos, and trying to get a sense of where things might be going. So yeah, I very much appreciate this opportunity to help break down both what's happening here at Meta Connect, but also where things may be going in the future. So thank you.

[00:53:54.583] Scott Stein: Yeah, I'm glad to talk. And I often feel as confused as anyone else now. I think it's an interesting territory. It becomes less clear by the moment in a good way. Hopefully.

[00:54:07.006] Kent Bye: Awesome. So thanks again for tuning into one of my dozen episodes about Meta Connect. There's lots that I've been unpacking throughout the course of the series, and I'm going to invite folks over to patreon.com to be able to join in to support the work that I've been doing here as an independent journalist trying to sustain this work. Realistically, I need to be at around $4,000 a month to be at a level of financial stability, and I'm at around 30% of that goal. So I'd love for folks to be able to join in, and I'm hoping to expand out different offerings and events over the next year, starting with more unpacking of my coverage from Venice Immersive, where I've just posted 34 different interviews from over 30 hours of coverage. And I've already given a talk this week unpacking a little bit more of my ideas about experiential design and immersive storytelling. And yeah, I feel like there's a need for independent journalism and independent research and just the type of coverage that I'm able to do. And if you're able to join in on the Patreon, $5 a month is a great level to be able to help support and sustain it. But if you can afford more, then $10, $20, $50, or even $100 a month are all great levels as well, and will help me to continue to bring this coverage not only to you but also to the broader XR industry. I now have transcripts on all the different interviews on Voices of VR, and I'm in the process of adding categories as well to the 1,317 interviews that will have been published after this series has concluded. So yeah, join me over on Patreon, and we can start to explore the many different potentialities of virtual and augmented and mixed reality at patreon.com slash Voices of VR. Thanks for listening.
