#985: Facebook HCI Research on AR Neural Inputs, Haptics, Contextually-Aware-AI, & Intelligent Clicks

I participated in a Facebook press event on Tuesday, March 16th that featured some of Facebook's Human-Computer Interaction Research on AR Neural Inputs, Haptics, Contextually-Aware AI, & Intelligent Clicks. It was an on-the-record event for print quotes; however, I was not given permission to use any direct audio quotes, and so I paraphrase, summarize, and analyze the announcements through a lens of XR technology, ethics, and privacy.

I'm generally a big fan of these types of neural inputs because, as CTRL-labs neuroscientist Dan Wetmore told me in 2019, these EMG sensors are able to target individual motor neurons that can be used to control virtual embodiment. They even showed videos of people training themselves to control individual motor neurons without actually moving anything in their bodies. There are a lot of really exciting neural input and haptic innovations on the horizon that will be laying down the foundation for a pretty significant human-computer interaction paradigm shift from 2D to 3D.
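
To make the EMG-decoding idea a bit more concrete, here's a minimal sketch of how a wrist sensor's signals might be turned into discrete gestures. It is purely illustrative: the channel count, window size, gesture set, and nearest-centroid classifier are my own assumptions, not anything Facebook or CTRL-labs has published.

```python
import numpy as np

rng = np.random.default_rng(0)
N_CHANNELS = 16   # hypothetical number of EMG electrodes around the wrist
WINDOW = 200      # samples per decoding window (e.g. ~100 ms of signal)
GESTURES = ["rest", "pinch", "flick"]

def rms_features(window: np.ndarray) -> np.ndarray:
    """Root-mean-square energy per channel: a simple, classic EMG feature."""
    return np.sqrt(np.mean(window ** 2, axis=1))

def synthetic_window(gesture: str) -> np.ndarray:
    """Stand-in for real sensor data: each gesture excites different channels."""
    data = rng.normal(0.0, 0.05, size=(N_CHANNELS, WINDOW))
    active = {"rest": [], "pinch": [2, 3, 4], "flick": [10, 11]}[gesture]
    data[active] += rng.normal(0.0, 0.5, size=(len(active), WINDOW))
    return data

# "Calibration": average the feature vectors of a few labeled windows per
# gesture, giving a per-user nearest-centroid classifier.
centroids = {
    g: np.mean([rms_features(synthetic_window(g)) for _ in range(20)], axis=0)
    for g in GESTURES
}

def decode(window: np.ndarray) -> str:
    """Classify a new window by its distance to each gesture centroid."""
    feats = rms_features(window)
    return min(centroids, key=lambda g: float(np.linalg.norm(feats - centroids[g])))

print(decode(synthetic_window("pinch")))  # usually prints "pinch"
```

The real systems are far more sophisticated, but the basic shape is the same: per-user calibration data turns raw muscle signals into a small vocabulary of input events.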

The biggest thing that gives me pause is that these neural inputs are currently being paired with Facebook's vision of "contextually-aware AI," which is presumably an always-on AI assistant that is constantly capturing & modeling your current context. This is so their "Intelligent Click" process can infer your intentions and give you the right interface, within the right context, at the right time.

I don't think Facebook has really thought through how to opt in or opt out of specific contexts, how third-party bystanders can revoke their consent and opt out, or whether there's even an opt-in process at all. When I asked how Facebook plans to handle consent for bystanders to either opt in or opt out, they pointed me to an external RFP meant to get feedback from the outside community on how to handle this. I hear a lot of rhetoric from Facebook about how being in charge of the platform allows them to "bake in privacy, security, and safety" from the beginning, which sort of implies that they'd be taking a privacy-first architectural approach. Yet at the same time, when asked how they plan on handling bystander consent or an opt-out option for their always-on & omnipresent contextually-aware AI assistant, they're outsourcing these privacy architectures to third parties via their RFP process, which already closed for submissions in October 2020.

They also have been mentioning their four Responsible Innovation principles announced at Facebook Connect 2020: #1 Never surprise people, #2 Provide controls that matter, #3 Consider everyone, and #4 Put people first. My interpretation is that these are stack ranked, because there's language elsewhere indicating that "#3 Consider everyone" specifically refers to non-users and bystanders of their technology (as well as underrepresented minorities). Part of why I say this is that other passages seem to indicate that the people in "#4 Put people first" are actually Facebook's community of hardware and software users: "#4 Put people first: We strive to do what's right for our community, individuals, and our business. When faced with tradeoffs, we prioritize what's best for our community."

LISTEN TO THIS EPISODE OF THE VOICES OF VR PODCAST

Here are some research prototype videos that Facebook has released:
https://twitter.com/kentbye/status/1372556374631051270

This is a listener-supported podcast through the Voices of VR Patreon.

Music: Fatality

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to The Voices of VR Podcast. So today's episode is going to be a little bit of an experiment, just because this past week, Facebook held a press event called Inside the Lab, where they gave access to a number of different researchers at Facebook Reality Labs Research, talking about the neural input devices that they've been working on. They acquired CTRL-labs back in 2019, and they were using this EMG wrist sensor to be able to detect the firings of your muscles from the wrist. And from that, they were able to extrapolate all sorts of really detailed information that could be used as an input device. And as we think about the future of augmented reality, but also VR as well, input is one of the biggest bottlenecks in terms of having something that goes beyond just a 2D mouse. We need something with a lot more flexibility that enables new ways of expressing our embodiment within these virtual environments. And there's actually quite a lot they can do, including getting down to targeting individual motor neurons, which in some sense allows an almost infinite number of different permutations and adaptations. Combined with neuroplasticity, you can use these wrist devices to customize your own input device, and that's the trajectory that we're going down. I actually am very excited about that, because it's really in contrast to a lot of the other brain-computer interface work that has been done, not only with EEG and OpenBCI, but also fMRI. All those approaches are trying to get access to the brainwaves, but there's actually a lot of high-fidelity input that you can get from the EMG at the wrist. So I've previously done an interview with CTRL-labs and have a little bit of context for what they're working on. But this really started on March 8th, when there was an interview that Mark Zuckerberg did with The Information, which is a subscription-only website. I had a chance to talk to Alex Heath last year about the reporting that he's been doing on Facebook, as well as Apple and all these different companies as they're working on augmented and virtual reality. They actually just recently started a daily newsletter where Matt Olson's been doing a really great job of covering a lot of the more business-side aspects of what's happening with all these different companies, based upon a lot of the other reporting that is happening at The Information, but also just digging up a lot of the different news of the day in a daily newsletter way. To kick off that new newsletter from The Information, they had Mark Zuckerberg on, because Zuckerberg reads The Information and is a fan of their work. But he also wanted to kick off what ended up being this public relations campaign around neural interfaces that started last week, and then this week they have the second edition. As part of that, they invited all the press to be able to listen to this. The caveat is that everything was technically on the record; however, because everybody is mostly working as a print journalist, they're not allowing me as a podcaster to use direct audio quotes, which is by far my preference for an episode like this, because I just want to be able to provide as much full context as I can.
So in the absence of that, I'm going to be going over the presentation and paraphrasing things, trying to summarize kind of like how I do with the takeaways at the end of my podcasts. And just for expediency's sake, it's in the middle of South by Southwest, so I'm covering all these other things that are happening right now as well, but I feel like it's important enough of a story to dive into. Plus, there are other things that I have a unique perspective on, especially when it comes to a lot of the different aspects of privacy and Facebook's vision of what they're terming both the intelligent click and contextually aware AI. Previously at Facebook Connect, they talked about this concept of egocentric data capture, which, in essence, is being able to capture everything that's happening from a first-person perspective in order to build out these contextually aware AIs. For me, I'm more skeptical around the need for some of this contextually aware AI, or at least there are enough privacy concerns I have around it that it makes me question whether or not it's worth going to that extent: basically, cameras that are on 24/7, recording everything around you, including all of what other people are saying and doing around you. There were some questions that I asked at the end, which I'll get into, where I didn't necessarily hear a satisfactory answer, despite the fact that Facebook is claiming that because they get to build the platforms from scratch, they have full latitude to bake in security, privacy, and safety from the very start. I admire that as a concept, but what I see in practice is different. Just as an example, they say that they have ethicists and privacy people embedded on the teams, but there was no ethicist or privacy person made available within the context of this presentation, and I asked some follow-up questions and didn't necessarily get answers. I'm hoping, though, that I'll be able to talk to a neuroscientist like Reardon, who was one of the founders of CTRL-labs. I'm hoping that I'll be able to do an interview with him here soon, perhaps within the next couple of weeks, to be able to dig into some of my personal open questions around the privacy concerns. I know that Reardon has talked about this before; in an interview with NPR, he said that you could take 30 seconds' worth of neural data from somebody, and not only would you be able to identify them, but you'd be able to identify them for the rest of their life. So our neural input signatures are so specific to our own identity that they're potentially personally identifiable information. But I guess my question is, what kind of information can you infer from that, especially when you start to tie it into other biometric signals? So that's an overview of where I'm at: very, very excited, generally, about the concept of neural interfaces and neural input, and excited about what they can do with haptics. Generally, I think the neural input device they're showing is going to be a key part of the future of augmented reality input. However, there are other aspects of how they're approaching this contextually aware AI and intelligent click where I have a little bit more caution. So that's sort of the overview. I'm going to just go through some of the different sections.
In the first section, Schrep, Mike Schroepfer, who's the CTO of Facebook, was just kind of introducing everything in terms of how Facebook sees that there's both AI and networking, but also that virtual and augmented reality is going to be this new computing platform. And The Information actually reported that Facebook has about 10,000 of their engineers working on augmented and virtual reality, which ends up being around 17% of their entire company. One of the things that Alex Heath said at the Clubhouse chat, after they posted this interview with Mark Zuckerberg back on March 8th, was that at the beginning of every earnings call, Mark Zuckerberg will get up and say a bunch of stuff about virtual and augmented reality, but none of the financial reporters or anybody is really paying attention to him or reporting on it. It's almost like they don't believe that it's an actual thing. Actually, in that Clubhouse chat, there was a financial reporter who came in and said that this AR and VR stuff is just a tertiary side thing, that it's different from their core business of advertising. I don't think that people really understand how serious Mark is that this is a new computing platform and the future of computing. Facebook really missed the boat with mobile, and they're really trying not to miss the boat this time. They're investing literally billions of dollars into this. They're really diving deep into this and investing in all sorts of cutting-edge research. At this point, there's not really a viable competitor to the standalone VR headset of the Oculus Quest. They're really set up, at this point, to just run away with the mobile standalone VR market. What happens in the PC market, I think, is still yet to be determined. But even the Steam reports are coming back saying that there are a lot of people using the Oculus Quest 2 or Oculus Quest 1 as their primary headset, especially when it's wireless, using something like Virtual Desktop. Anyway, the point is that I think there are a lot of people within the broader press who just don't think that Facebook is all that serious about virtual and augmented reality. Whereas, you know, I've been covering it for almost seven years now, and I certainly see that they're really going all in on this and that there are really not a lot of other competitors aside from probably Apple, but Apple is going to kind of swoop in whenever they're ready. Other than that, there's HTC and there's Valve, and a lot of other players like Niantic and Google, but Google's really focusing on AR and not as much VR, and then there's Microsoft, but they've really been focusing on a lot of the enterprise aspect. I think a lot of the applications for augmented reality are going to be in the enterprise. I do expect that Microsoft is going to have an advantage in that context, but whether or not Microsoft is going to be able to translate that enterprise success into a successful consumer product, I think, is still a huge open question. So, anyway, that's a little bit more of the context as to why Facebook is just really going all in on this, and they see this as a big part of their future, which I think is part of what Schrep was trying to set the context for, but also that input is one of the most challenging aspects of augmented reality: how to get really good input in the future of human-computer interaction and be able to have the evolution of that.
A big reason why I think they wanted to bring this out is to just make some forward-looking statements. Maybe they're using this technological angle to get other tech press excited about this. But I don't know. We'll see how it continues to unfold, since there's certainly starting to be momentum with Quest adoption. All right. So in the second section, they have Sean Keller, who is a director of research at Facebook Reality Labs, and he's really trying to bring in a new era of human-computer interaction. So he's looking at the AI interaction problem, and he said that he's trying to bring in this next great paradigm shift of computing. He's working with computational neuroscientists, engineers, material scientists, and roboticists around this AI interaction problem. He said that they're really focusing on the wrist, just because there's so much fidelity in terms of the input that you can do there, and because people already have watches that they're wearing, which I think is a little bit of an allusion to some of the reporting that Alex Heath has done at The Information. Heath has reported that part of Facebook's plan is to start to introduce some of this wrist-based technology into smartwatches and potentially have different health-related applications for some of this EMG sensor data, to be able to compete with what Apple is doing with the smartwatch, but also to use this watch as a nexus point for a lot of the human-computer interaction of the future. In a lot of the actual demos, people are wearing these wristbands on both hands. So I don't know if people are going to start wearing them on both hands or, as they move forward, eventually have wristbands that have both the EMG detectors for the neural input and the haptic devices as well. The other thing that Sean started to talk about is this concept of the intelligent click, where the intelligent part is the whole AI aspect: the AI is paying attention to your surroundings, and it's trying to create this model of your context. This is what they've also been referring to as this contextually aware AI, which back at Facebook Connect they started to introduce with the concept of egocentric data capture, in order to really build up this contextual information. You basically have to record all this information and throw a lot of machine learning at it in order to create these models and to make these inferences. And what he says this is for is to understand your intentions and your personal boundaries, but also these different contexts, so that it has a little bit more information about what types of tasks may be relevant to whatever your context is. Now, this is where I get a little bit more skeptical about this whole idea of contextually aware AI, just because we don't have artificial general intelligence yet. There are a lot of limitations to machine learning and a lot of gaps, so it ends up having to hoard immense amounts of information without having really robust conceptual models around it, especially when it comes to all the relational dynamics of what context even is. Here's a good example. When Facebook is talking about context, it's all about the spatial context, but there's lots of relational context or meaning context.
And when you talk about context with your romantic partner, you could be talking about the context of your family, or, if you're working on a business together, you could all of a sudden be in the same spatial context but now be talking about something that is work-related, or related to your faith or your religion or your family or your identity or your finances. I mean, each of these are different contexts, and really knowing how to draw the lines between those contexts, I think, goes way beyond just what the spatial context is. So I think they're maybe underestimating what this concept of contextually aware AI is going to need, just because a lot of this contextual information actually lives within the conversational dynamics, which ends up being what in linguistics is referred to as pragmatics. Pragmatics is the contextual dimension of our language. As an example, say the word "script" to a programmer, and they assume that you're talking about JavaScript or coding. But if you say "script" to a film screenwriter, then they're going to be thinking about a whole other concept of a script. So pragmatics is the contextual dimension of linguistics. And I think there are similar levels here; even common-sense reasoning within AI is something that just hasn't necessarily been figured out yet, just because it's such a complicated contextual issue. So just throwing egocentric data capture at this contextual-awareness problem, I think, is vastly underestimating it, and there are so many privacy concerns there that I'm just generally more skeptical about a lot of that stuff. Actually, one of the things that Sean said is that they want to build this responsibly, and they acknowledge that they can't solve all these ethical issues on their own, so they're opening up a discussion with the broader community to get feedback. And so I took that opportunity at the end, and I asked a very pointed question, which was: at Facebook Connect, they talked about the need for egocentric data capture in order to achieve this contextually aware AI. And today, they said that people are actually a part of that context. And implicitly, there's going to be personally identifiable information about these other people, as well as your contextual relationship to these other people. So how does Facebook plan on managing consent or any type of opt-in or opt-out dynamics for third-party bystanders who are part of this contextually aware AI? That was my question, and their answer was basically that, well, that's why they put out an RFP during Facebook Connect, because they don't know and they want to get more insight. So to me, that was not a satisfactory answer, just because if you really want to claim that you're doing privacy-first architectures, or that you're really baking in privacy from the very beginning, and you don't even have any concept for how to manage consent or opt-in or opt-out dynamics for this contextually aware AI that's going to be recording everything around you all the time, then I think there are some fundamental problems there in terms of how you're going to bake in privacy after the fact, once you've already moved down this specific architectural path. Now, I do agree that they probably will need to have certain information about your surroundings. But this is a thorny issue. They're saying, OK, this technology wants to exist. This is going to be very useful.
In order to get there, there are a lot of these ethical and normative standards in society that they have to start to negotiate and deal with. So anyway, that's sort of my rant around the privacy concerns that I personally still have around this concept of contextually aware AI that they've been talking about. So that kind of wraps up the section with Sean Keller. Before they went on to the next section, they actually played a video that is also being released publicly. It's actually better to watch it if you haven't seen it, but if you are just listening to the podcast, then you can also sort of hear what they're saying in it. And this gives a little bit of a sneak peek at some of the other things that they cover for the rest of the presentation, specifically around the neural input, as well as some of the haptic devices, and generally how they see some of this AR input being used within their devices.

[00:16:08.764] SPEAKER_04: With AR devices, we're asking the question, how do we build a computing platform that is truly human-centric?

[00:16:18.208] SPEAKER_03: Facebook Reality Labs is a pillar of Facebook. It's dedicated to bringing AR and VR to people, to us, to consumers of the world.

[00:16:27.054] SPEAKER_02: Every single new computing era requires new input devices, new set of interactions that really make this possible.

[00:16:33.999] SPEAKER_03: With AR glasses, I think the key here is to communicate with our computers in a way that is intuitive at an entirely new level. The wrist is a great starting point for us technologically because it opens up new and dynamic forms of control. This is where some of our core technologies, like EMG, come into play.

[00:16:50.294] SPEAKER_04: Neural interfaces, when they work right, and we still have a lot of work to go here, feel like magic.

[00:16:57.654] SPEAKER_01: So if you send a control to your muscle saying, I want to move my finger, it starts in your brain. It goes down your spine through motor neurons. And this is an electrical signal. So we should be able to grab that electrical signal on the muscle and say, oh, OK, the user wants to move their finger.

[00:17:16.243] SPEAKER_04: What is it like to feel like pushing a button without actually pushing it? That could be as simple as, hey, I just want to move this cursor up or move it left. Well, normally I would do that by actually moving. But here, you're able to move that cursor left. And it's because you and a machine agreed which neurons mean left and which neurons mean right. You're in this constant conversation with the machine.

[00:17:40.866] SPEAKER_05: This new form of control, it requires us to build an interface that adapts to you and your environment.

[00:17:48.148] SPEAKER_03: Everything starts with a click.

[00:17:50.700] SPEAKER_05: The intelligent click is the ability to do these highly contextual actions in a very low friction manner.

[00:17:57.586] SPEAKER_02: It's kind of the purest form of a superpower. You are in control, but the system is exactly inferring the right thing for you to control. All you have to do to operate it is just click.

[00:18:07.714] SPEAKER_05: So for example, if I'm cooking and I'm kind of pulling some noodles out of a box, the interface could ask me, would you like to start boiling the water? The wrist can also be a spot where the technology is communicating back to the user.

[00:18:20.579] SPEAKER_03: The haptics, the sensation of touch around us, this is part of how we learn and use motor control. It's critical to AR and XR.

[00:18:28.063] SPEAKER_00: We wondered, as you pull back the bowstring on a bow, if we tied that not to the tension growing on people's fingers, but rather squeezing on the wrist, would it add to that experience, like you're pulling back the bowstring? The answer is yes.

[00:18:45.393] SPEAKER_02: That future, it really is the computer that is seamlessly integrated into your day-to-day life. The next computing platform is the mixed reality platform, the one that really totally blends your virtual environment and your real environment in a seamless way.

[00:19:00.162] SPEAKER_03: We're in this moment where we can move from personal computing to personalized computing.

[00:19:05.045] SPEAKER_04: What if you and a computer agreed to design a keyboard together, and you type faster on it than anybody else in the world could type on your keyboard?

[00:19:15.162] SPEAKER_03: I think what this enables is the ability to not have to focus on a computer, on a phone, and be able to still interact with other people. It's going to open up a new generation of communication and access and navigation.

[00:19:27.285] SPEAKER_04: It leads to this phenomena of increased agency, of you feeling like a level of control you've never had before. We want computing experiences where the human is the absolute center of the entire experience.

[00:19:44.933] Kent Bye: OK, so the next section was with Reardon, who was one of the founders of CTRL-labs, and actually one of the creators of Internet Explorer, who then went off to get a PhD in neuroscience, and then went in a whole other direction of working with these neural inputs. Really fascinating character. I really hope I get a chance to chat with him. I'd love to get a little bit more of his story and a little bit more context for all these different things. But for me, this is what makes these announcements the most interesting. I do think that these neural inputs are going to be really, really quite compelling. One of the things that Reardon was saying is that the amount of neural real estate dedicated to your hands and wrist is absolutely enormous, one of the largest representations of anywhere in the brain. There's so much potential in what he was saying, that you can actually target individual single motor neurons. With that, you can start to do all sorts of adaptive controls and latent controls. They even showed a demo of controlling a single motor neuron, meaning that the person wasn't actually moving at all; they were firing a motor neuron that wasn't resulting in any movement of the hand, but at the same time it was able to start to move different things within the environment. The idea is that you could potentially go through this training process, which I presume is some sort of multimodal correlation between being able to see some sort of feedback and then doing the actions, and then eventually you're just able to kind of have this thought control to some extent. But it's all going through the EMG and the motor cortex, so it's all around the motion rather than your intentions or thoughts per se. So it was really quite interesting to see some of the different aspects of what they were showing. And they showed a number of different demos of being able to put the CTRL-labs detector on the wrist and reconstruct the full movements of the hands, even if someone wasn't born with all their digits or has some other differences in their hand; as long as they have access to the wrist, then they're able to reconstruct what would presumably be an able-bodied hand. So that was really quite interesting, to also see a sneak peek of some of the accessibility features of the neural inputs. But one of the things he said is that it's faster, has higher bandwidth, is joyful, and is just overall more satisfying. And so you're able to get to the point where you're able to express at the speed of thought without as much friction of even moving your body, which, as we move into this new era of spatial computing, I think these different types of neural interfaces are going to potentially enable something that sounds really quite magical. He was saying that with the number of different motor neurons, and in some ways compositing them together in these different combinations, there is essentially infinite control over all the different computing technologies, really taking out a lot of the intermediary steps and interfaces between human and computer. A lot of the friction starts to erode, and you're able to do things that we could have never imagined before. And there are a lot of different aspects of the personalized impact of some of this data.
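
Before getting into the data side of this, here's a toy sketch of that closed-loop training idea, where the decoder and the user adapt toward each other. Everything in it is an assumption for illustration (the feature dimension, the learning rate, the simulated user); it's not how CTRL-labs' decoder actually works.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8               # hypothetical per-window EMG feature dimension
w = np.zeros(DIM)     # decoder weights: features -> 1-D cursor velocity
LR = 0.05             # learning rate for the online re-fit

def simulated_user(intent: float) -> np.ndarray:
    """Stand-in for a person: a few channels co-vary with the intended direction."""
    feats = rng.normal(0.0, 1.0, DIM)
    feats[:3] += intent   # the channels the user gradually learns to recruit
    return feats

cursor = 0.0
for _ in range(500):
    target = float(rng.choice([-1.0, 1.0]))  # show a left/right target
    feats = simulated_user(target)           # the user tries to move toward it
    velocity = float(w @ feats)              # the decoder's current guess
    cursor += 0.1 * velocity
    # Closed-loop calibration: nudge the decoder toward the known target, so
    # the mapping converges on whatever signals the user can reliably produce.
    w += LR * (target - velocity) * feats

print("learned weights:", np.round(w, 2))
```

The point of the toy is just that the mapping is negotiated: the decoder keeps re-fitting to whatever the user can reliably produce, which is the "you and a machine agreed which neurons mean left" idea from the video.
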
In other videos outside of this talk, he talked about how just taking 30 seconds of data would be enough to personally identify you, not just right now but forever into the future. They said that they're deeply committed to transparency and are trying to find different ways to handle this data responsibly. I'd be very keen to hear a little bit more about some of those challenges, especially, from my perspective, when you start to fuse this together with other biometric information, say eye tracking or galvanic skin response, or other information like emotion detection, or even information from the environment that you're looking at. If you're able to correlate what someone's looking at with the environment, then even though the EMG may be separated into that localized context of your hand, and that EMG signal may not mean anything on its own, in the larger relational context of all the other information they may have access to, especially if you're in a VR environment where they can really see a whole bunch of omnipresent information, they can draw additional correlations. That, I would argue, has the potential to become what Brittan Heller has called biometric psychography, which is to not only identify you, but to get into your personal interests and your needs and your desires. Stuff that's a lot more intimate in terms of your intentions and your values and your actions in the world, trying to predict your behavior in ways that could be used for psychographic profiling. So that, I think, is the concern, and I'd love to hear a lot more about how that is going to be dealt with. That was one of the follow-up questions I was asking, but I was not able to get any specific answer beyond: well, it's just EMG, it's a localized context, and it's not revealing any of this personalized information or anything like that. So yeah, I'd love to hear a little bit more, especially about when it's combined or recorded or fused together with other input data as well. Stay tuned; hopefully I'll be able to get a chance to talk to Reardon and unpack that a little bit more, because I do think it's very exciting to see what's possible, but there are also certainly some privacy concerns that are new. Every new amazing possibility also brings amazing new complications that have to be dealt with as well. And I think that was a big theme that I was getting throughout the course of this discussion: they're aware that there are a lot of these different privacy and ethical issues, and they're trying to, in some ways, come to the press to say, hey, we want to start to have different conversations about that. So I'm hoping that, in the spirit of that, I'll get a chance to talk about some of these issues in a lot more detail. All right, so another aspect of this section is that they started talking about the different ways that you can combine things like the pinch of the four digits, the different types of squeezes or rolls, flicks, or other gestures. And when you combine all those things, then you start to have this composite language that goes way beyond what we have with our existing 2D tablet interface or a mouse or keyboard. What Reardon said is that it's going from a 2D mouse to this 6DoF mouse. They showed a demo of being able to control and manipulate objects at a distance.
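
To make the combinatorics of those gesture chords a bit more concrete, here's a tiny illustrative sketch; the gesture names and the chord-to-symbol table are entirely made up, not anything Facebook demoed.

```python
from itertools import combinations

GESTURES = ["thumb_pinch", "index_pinch", "middle_pinch", "wrist_flick"]

# Four gestures yield 2**4 - 1 = 15 possible non-empty chords.
CHORD_MAP = {
    frozenset({"thumb_pinch"}): "a",
    frozenset({"index_pinch"}): "e",
    frozenset({"thumb_pinch", "index_pinch"}): "th",
    frozenset({"index_pinch", "middle_pinch"}): "ing",
    frozenset({"thumb_pinch", "wrist_flick"}): "<space>",
}

def decode_chord(active_gestures: set) -> str:
    """Map the set of simultaneously detected gestures to an output symbol."""
    return CHORD_MAP.get(frozenset(active_gestures), "?")

print(decode_chord({"thumb_pinch", "index_pinch"}))  # -> "th"
print(sum(1 for n in range(1, 5) for _ in combinations(GESTURES, n)))  # -> 15
```

A few simultaneous inputs multiply into a much larger symbol set, which is the same trick a stenotype machine uses: few keys, many chords.
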
What I think of is the chorded keyboard that Douglas Engelbart also demoed during the Mother of All Demos. It's probably one of the technologies that he demoed back in 1968 that not a lot of people have paid much attention to, but I've certainly watched some videos with Steve Mann, who walks around with a chorded keyboard, which is essentially like what stenographers use. It's like playing a piano chord, where you're pushing multiple keys at the same time. So rather than pushing one key at a time when you're typing, with a chorded keyboard you have fewer keys, but you're doing more combinations. I expect that we're going to eventually see a lot more of these kinds of chorded keyboard interactions. And what Reardon was saying is that you're able to essentially train it to be your own personalized keyboard, so you can start to type at a speed on your personalized keyboard that no one else could even achieve. Those combinations, I think, are key, because as I look at some of the different user interactions, you start to do these 6DoF interactions at a distance. They have this demo called "the force," where you're telekinetically moving objects and shifting them around. And just to have a neural input device that allows you that type of control and world building, I'm so excited to see where this goes, because I think it's going to open up so much possibility, the floodgates of innovation when it comes to input, with people really training what are essentially these shortcut keyboards. When you are really proficient at a piece of software, you're able to use all these shortcuts. Well, this is kind of like doing these different gestures to create your own shortcuts and create your own language to communicate with the computers. To me, it brings back the previous discussions that I had with Jaron Lanier. In his book, he talked about the concept of homuncular flexibility and some research that he did with Andrea Stevenson Won, Jeremy Bailenson, and Jimmy Lee. The homunculus is this place in the brain that does the body mapping, and the flexibility is that you can start to map your body onto non-humanoid virtual entities, whether it's an octopus, or having extra tentacles, extra arms, and extra whatever: being able to map these individual motor neurons onto these things that have virtual representations, but also, like the demo of just setting out an intention, not actually moving anything yet still having additional control. So you're training your body to use these latent potentials of all these motor neurons that are there, and to use them in these virtual embodiments, but also to do tasks. So really wild, wild stuff. The amount of expressivity that people are going to be able to have, I'm just really excited to see where all that goes. Especially when I see stuff like what the Royal Shakespeare Company and Marshmallow Laser Feast were doing, with live theater actors puppeteering these different embodiments of different characters.
There was Cobweb, or a set of moths doing these different flocking behaviors based upon different gestures. So those are different types of gestures, but moving it down to not just moving your body around, but firing individual motor neurons at your wrist to do all sorts of other virtual embodiment. So this is just the very beginning of lots of really exciting stuff. Okay, so that's the end of that section. Next up was Hrvoje Benko, who is a director of research science, and he was talking about the adaptive interface, which is what they're considering to be the combination of augmented reality input with this contextually aware AI. So this is where they started to get into things like intent prediction and real-time contextual user modeling, going through the four steps of understanding the user context, inferring the goals, adapting the interaction, and then taking the user input. So I'm going to include a direct quote here, because I think this is important to flesh out a little bit, because context comes up again and again and again. As you have this contextually aware AI, what does that mean? Well, it means the following: it's an always-available augmented reality interface. It has to be different. Facebook believes that it has to mediate how you perceive the world, and it fundamentally has to be more aware of your environment and your context and be responsive to it. It needs to present the right information at the right time and hide it as soon as it's no longer needed. So this is where you get into this always-on mode, where it's going to be recording everything around you in order to be aware of this context. And so, yeah, when I think about Helen Nissenbaum's contextual integrity theory of privacy, she really emphasizes how privacy is really context dependent. And so my concern is that this always-on, context-free approach of recording everything, independent of whatever context it is, is absent of knowing when to turn that on and off, in order to pay attention and train this AI on all this stuff, because it's such an intimate presupposition. It almost seems like a lack of context to say that we're going to be always aware of your context, because it's just recording everything all the time. Whereas if you're in the doctor's office, or if you're just talking about sensitive information, at some point do you turn off these always-on glasses that are going to be recording all this information? I think there are going to be certain tasks where you want this contextually relevant information: if I'm doing a task at work, turn it on while I'm at work. But if I'm not at work, then do I want that on when I'm having a conversation with my wife? So it's stuff like that where I think it has to be more context-specific rather than context-independent, and more bounded, because, like I said, it's so fluid how you control the boundaries between these different contexts. There's a historical context, too. I mean, common sense in AI research is a huge problem, and I think at the core of it, it relates to a lot of these contextual dimensions where we just have these associative links that we make unconsciously, and we don't really even think about it. And it hasn't necessarily been formalized. And so I think context is in this kind of realm. It's very fuzzy. It's very much in the mind of the user again, like I said before.
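
Just to make that "more bounded" idea concrete, here's a hedged sketch of what a context-gated, default-deny capture policy could look like, in the spirit of contextual integrity. The context fields and the rules are my own hypothetical examples; nothing here reflects how Facebook actually plans to architect this.

```python
from dataclasses import dataclass

@dataclass
class Context:
    location: str             # e.g. "home", "work", "doctor_office"
    activity: str             # e.g. "cooking", "meeting", "conversation"
    bystanders_present: bool  # did the device detect other people nearby?
    user_opted_in: bool       # has the wearer enabled capture at all?

def capture_allowed(ctx: Context) -> bool:
    """Default-deny: capture only in contexts the user explicitly allow-listed."""
    if not ctx.user_opted_in:
        return False
    if ctx.location in {"doctor_office", "bathroom"}:  # sensitive spaces
        return False
    if ctx.bystanders_present:
        return False  # without bystander consent, don't record at all
    return ctx.activity in {"cooking", "workout"}      # allow-listed tasks only

print(capture_allowed(Context("home", "cooking", False, True)))          # True
print(capture_allowed(Context("doctor_office", "meeting", True, True)))  # False
```
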
So I don't know. This is the area that starts to give me a little bit more pause, in terms of going down what is potentially a dark path of striving towards this contextually aware AI, where I'm a little bit more like, well, do we really need that? Or can we maybe put it into a little bit more of a bounded box, rather than just having it record everything all the time? So, yeah, that was the end of the section from Benko. All right, so the next section is from Tanya Jonker, who is a manager of research science and leads research on adaptive interfaces. And so she talked about this concept of an intelligent click, which in some ways I think of as a glorified Clippy, where Clippy was coming up saying, oh, do you want to do this? But it's a lot more powerful in trying to give the right suggestion at the right time. In order to do that, it has to have three different things: an AI that understands your current context, an adaptive user interface that is able to shift around to whatever your existing context is and give the right information at the right time, as well as the lightweight input that is coming from your wrist. So again, there have to be lots of different solutions for the interfaces you interact with, but also these inference solutions to extrapolate what that context is and what your intentions might be, to give you the right interaction at the right time. And she also took a little bit of a peek into the future, where right now they're starting with essentially this binary, single-bit input of a contextual assistant giving you this intelligent click as you're cooking. It asks, okay, do you want to start a timer? In the example they showed, you just click the timer, and it's reading that information based upon the recipe that you're doing, but also kind of putting that timer over the bowl that you need to pay attention to as you go off and do other things. So it's being able to do that type of contextually relevant overlaying of information in different interfaces. In the future, she said, and in the next section that we'll get into, there are the haptic interfaces: ways in which it's not just text that you're reading, but also ways that you could start to have this communication with haptic devices that are giving you different feedback on the wrist to simulate different aspects of haptics, which seemed pretty cool, that they were able to do as much as they were. I was kind of surprised to see that. I can't wait to, at some point, check out some of those different haptic devices on the wrist. Going back to the intelligent click and the new AR UI shell, moving forward they're looking toward this concept of the interaction superpower: having contextual assistance, but with higher-bandwidth input and proactive optimal assistance, the AI that really understands what you really need, almost like a mind-reading type of thing. It gets into the question of how much we really want the Facebook AI to be giving us proactive optimal assistance all the time.
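
Here's a toy sketch of that loop (sense the context, infer a likely goal, adapt the interface, and take a one-bit confirmation), using the cooking example. The rules and function names are purely illustrative assumptions, not Facebook's actual system.

```python
from typing import Optional

def infer_goal(context: dict) -> Optional[str]:
    """Very crude intent inference from an observed context."""
    if context.get("activity") == "cooking" and "noodles" in context.get("objects", []):
        return "start_8_minute_timer"
    return None

def adapt_interface(goal: str) -> str:
    """Render a minimal prompt for the inferred goal."""
    return "Suggested action: " + goal.replace("_", " ")

def intelligent_click(context: dict, user_clicked: bool) -> str:
    """Sense -> infer -> adapt -> take a single-bit confirmation."""
    goal = infer_goal(context)
    if goal is None:
        return "no suggestion"
    prompt = adapt_interface(goal)
    return prompt + (" -> executed" if user_clicked else " -> dismissed")

ctx = {"activity": "cooking", "objects": ["pot", "noodles"]}
print(intelligent_click(ctx, user_clicked=True))
# -> "Suggested action: start 8 minute timer -> executed"
```

The hard part, of course, is everything hidden inside that first function, which is exactly where the contextually aware AI and all of its data capture comes in.
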
But eventually there's this concept of what they're calling the adaptive interface: having this hyper-personalization that's very unique to you and your own situation, where you're able to basically remix all of these different things. And the final quote that I'll read as a verbatim quote, she said, "Ultimately, this is a vision of a fully adaptive system that directly blends the digital into your real world, allowing you to move seamlessly between the two with minimal effort," which I think is pretty accurate. Like it or not, this is sort of the trajectory of the technology. And I think for me, it's just a matter of: what are the auditing tools? What kind of algorithmic transparency is there into these AI assistants? Can you edit them? I don't know. I guess for me personally, I'm very interested in the VR aspects, but for some of these aspects of AR, I'm not quite sure I'm fully bought into having these always-on devices. I can certainly see potentially having them within specific contexts that are bounded, but yeah, that's just my gut reaction so far. So that's Tanya Jonker talking about the intelligent click and the adaptive interface. And in the final section, they talked a little bit about some of the research they've been doing in haptics, adding different haptic devices on the wrist. So Nicholas Colonnese, he's a manager of research science there at Facebook Reality Labs. And the haptic device is interesting just because, when you're touching virtual buttons, how do you give some sort of feedback? What they found works is the phenomenon of sensory substitution, which I originally talked to David Eagleman about when he was doing the Neosensory vest; he was talking about turning your torso into an ear. What he was telling me was that as long as you get the right signals up into your brain, then you can start to do all sorts of remixing and mashups of different information. The brain is like this pattern recognition machine, and it's taking all this multimodal sensory input, and if there's synchrony between the signals that you're seeing, then the brain kind of extrapolates it out. So even though you're touching these virtual buttons, if you feel the haptic feedback at your wrist, then your brain kind of interprets that as if you're touching something. Really weird. I really look forward to seeing what that actually feels like, my own direct experience of having that type of sensory substitution. But it sounds like it's pretty compelling, and they're able to do quite a lot of different stuff. They showed some more experimental prototypes that they're not actually releasing a lot of the videos of, but the Tasbi and another actuator required a really big compressor that would need to be miniaturized in some way, finding different ways to give pressure or to squeeze the wrist. And they're able to do things like climbing a ladder, or feeling like you're pulling back a bow and arrow and giving you resistance. They even showed a picture of someone putting their finger over some bumps and having the haptic device respond to that, giving you the feeling that you're actually feeling the textures of that.
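
As a rough illustration of that remapping idea (fingertip events substituted by wrist actuation), here's a small sketch. The actuator names, parameter ranges, and event hooks are hypothetical; none of this is a real API for any of the FRL prototypes.

```python
from dataclasses import dataclass

@dataclass
class WristHapticCommand:
    actuator: str      # "vibrotactile" or "squeeze" (invented names)
    intensity: float   # normalized 0.0 .. 1.0
    duration_ms: int   # 0 means "hold until updated"

def on_virtual_button_press() -> WristHapticCommand:
    # A crisp, short buzz stands in for the click you'd feel on a real button.
    return WristHapticCommand("vibrotactile", intensity=0.8, duration_ms=30)

def on_bowstring_draw(draw_fraction: float) -> WristHapticCommand:
    # Growing squeeze pressure substitutes for string tension on the fingers.
    return WristHapticCommand("squeeze", intensity=min(1.0, draw_fraction), duration_ms=0)

print(on_virtual_button_press())
print(on_bowstring_draw(0.6))
```
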
Now, there's SynTouch, which gets at really fine-grained nuances of textures, and you really need your fingertips to be able to sense that, so I don't know if they're able to simulate the degree of sensitivity that your fingertips have just from wrist input. I guess I'm skeptical that that's possible, but who knows? Maybe it is able to get there. But my understanding, at least with the fingertips, is that it's such a high-frequency input, and the fingerprints actually are pretty key to determining some of that. So I don't know if you'll be able to get the same type of fidelity by doing this kind of sensory substitution at the wrist. But there are the other aspects of the haptic interfaces: the vibration, squeeze, push, turn, and pull of virtual objects in space. And they're still just getting started with all this stuff. Nicholas said that they've been doing research for about four years now on this type of wrist-based haptics. It sounds like a really promising area. Like I said, the miniaturization and the commercialization of some of that is probably one of the bigger hurdles. But if you already have all these other aspects of the EMG detector on the wrist, then what kind of battery life would you need to have an always-on device that you're wearing all day? So anyway, that's some of the haptics information. And then they moved into a Q&A section to kind of wrap things up. I'll just briefly go over some of the questions and answers. So there was a question around whether or not there was a timeline in terms of when this was going to end up in any products. That was actually asked twice. The first time, they said it's just early research; there's no technological roadmap that's been announced yet. And then Alex Heath later asked whether or not the wristwatch would be a way to start to introduce some of this technology, and they also refused to answer that, because it came back to a timeline issue. But based upon some of Alex's reporting, that does seem to be a good potential path: having a wristwatch that includes some of this sensing that could be used for medical purposes, to start to introduce this at more of a mass consumer scale, and that eventually becomes a fully-fledged EMG neural input device for augmented reality, though they would need to do some more iterations to miniaturize it and productize it and everything else. To me, that makes sense, but they were not committing to any of that at this discussion. I asked my question, which I talked about earlier, which was: how are you going to deal with the bystanders, the people who are not using the technology? How do you allow them to either opt out or not have all their private information wrapped up into your contextually aware AI? And what are the differences between what is your information and what is other people's information? I think it's a really thorny, tricky problem. They kind of punted on this to say, oh, we put out an RFP back in September that actually closed back in October of 2020. And so I'm not sure if they got any feedback or input, or if it's still open, or if they're still taking feedback on that. But that to me was, like I said earlier, just unsatisfying, because this is their responsibility, and in some sense it feels like they're kind of outsourcing it.
Ethics does involve lots of different views and perspectives, and you do need to engage in these different conversations. But I guess I just don't see enough evidence that that conversation is going on, or that there was a more evolved thought other than, hey, we'll let other people figure that out through this RFP process, rather than putting forth their own ideas for how that model of privacy would work, or just having a more evolved vision of their own philosophy of privacy and how these relational dynamics happen. They have done some research with Project Aria where they would blur out different faces, but there are lots of other ways to have personally identifiable information. So anyway, the whole aspect of the bystanders, I think, is a pretty key aspect, not only from a privacy perspective, but also from an information architecture perspective. And it's kind of a paradox that you'd have to almost identify people to know that they don't want to be identified. You'd have to have some way to identify people to say, OK, well, their preference is actually to not be identified. So there's a little bit of a chicken-and-egg paradox there in really implementing that. So I don't know how they would necessarily start to approach that with their vision of this contextually aware AI. What does it mean when you're around other people who don't want to be a part of your context? Or at least they don't mind being in your context, they just don't want to be in the context of the Facebook AI that's trying to make all these other inferences about not only you, but all the relationships of who you're connecting to. So that, to me, is probably the least clear part, and, like I said, I didn't feel like I really got a satisfactory answer to the question that I asked. Maybe we'll have more information as we move forward, and they'll be able to present back whatever they got from the RFP proposals that came in back in September. But again, Thomas Meckenser has talked about this pacing gap, which is that the pace of technology moves so fast that the conceptual frameworks around it lag far behind. And so they are dealing with this issue where they're blazing forward at a really quick pace, but the policy frameworks and conceptual frameworks to be able to understand it don't necessarily keep up, which is a big part of why I'm personally involved in different efforts like the IEEE Global Initiative on the Ethics of Extended Reality, to try to have some sort of consortium for some of these different types of discussions. But still, there's the interface with the companies, and how the companies have their own side of the market dynamic, trying to innovate within all the other existing market dynamics. But what is the tech policy that needs to help constrain it, or at least bound it in some way, especially when it comes to all these different discussions around a federal privacy law? This certainly is an area that feels very nebulous when it comes to some of the boundaries of ethics and privacy, especially when it comes to their vision of this contextually aware AI. Some of the other questions: what about information overload? And obviously that's a huge issue of just peppering people with prompts. It's like Clippy from Microsoft, just persistently coming up and annoying you.
And how do you deal with the overload, or the design, to make sure that you're not overloading people? So that's certainly a challenging issue there. There was also a question around accessibility, and there's certainly a ton of different accessibility potential there. It came back to this concept of ultra-low-friction interactions, and how the latency of these interactions is going to be so low that, as Reardon said, you're going to be able to put people at the center of the interactions with the computer, and all sorts of quick, iterative adaptations have the potential to happen. Once we have these neural interfaces, we could see this rapid evolution of the ways that we're able to interface with computing. And generally, with these devices they're going to be able to rewrite the rulebook in terms of baking in all these different features from the very beginning. So that's all that was covered there in the press conference. And like I said, I'm hoping to have some follow-up discussions, because I did send some follow-up emails, but I didn't get any specific answers, specifically around the fusion of other information, and how, when you have correlated all sorts of other biometric information, you collectively get this idea of biometric psychography, which is that you're able to extrapolate the user's likes, dislikes, preferences, and interests. And yeah, there's having the neural interfaces on the periphery and the edge compute, and whatever other things with differential privacy. Someone had asked a question around what happens when you have this neural data translated into other information, and there was a re-emphasis that this is just EMG motor information, that it's about movement; it's like, if you were moving your hand in a glove, it would only be on that level. But my take, from talking to neuroscientists about the principle of embodied cognition, is that the way your body moves is the way that you're thinking. So you can actually extrapolate a lot of information from motor information and movement, especially when it's in the context of other information within an environment. So I don't necessarily buy that just because it's muscle information or movement data, it doesn't give interesting insights into our thought patterns or other information that could be considered this biometric psychography. So I guess I'm skeptical of their dismissal that it's just EMG, and I think the movement data could actually be correlated back to deeper aspects of embodied cognition. So anyway, I hope to have some more conversations about all this. So stay tuned. We'll see. And yeah, thanks for listening. It's a little bit of an experiment to try to, in some sense, do a recap of a presentation that they gave without being able to actually use any of the direct audio. So anyway, that's all I got for today. I wanted to just send a thanks to all my Patreon supporters for supporting the work that I'm doing here on the Voices of VR podcast. And if you'd like to become a supporting member and help me continue to bring you this coverage, then please do consider becoming a member of the Patreon. This is a listener-supported podcast, and I do rely upon donations from people like yourself in order to continue to bring you this coverage. So you can become a member and donate today at patreon.com slash Voices of VR. Thanks for listening.
