Sean Dougherty and Jeffrey Colon are Access Technology Experts at the LightHouse for the Blind and Visually Impaired nonprofit, and they were very active at the XR Access Symposium during a number of sessions: the NSF grant-funded SocialSense XR: Making the Invisible Visible program, a review of Vision Accessibility with AR+AI Tools, and a group discussion they facilitated on Exploring Accessible VR for Blind Users.
This was the final interview from my trip to the XR Access Symposium, and a fitting end to my 15-part series on XR Accessibility, because we explore both the many challenges of making XR accessible for blind and low-vision users and some of the exciting possibilities for how XR can be used as assistive technology to help solve real problems. Mobile phone-based AR apps that integrate AI and computer vision features are already seeing a lot of early adoption within the blind and low-vision community, and there’s lots of work happening on the VR and virtual meeting front (including Zoom) that has the potential to feed into more AR assistive technology features that help make the physical world more accessible.
You can read a rough transcript of my interview with Sean and Jeffrey down below.
There’s still lots of work yet to be done with XR Accessibility, but hopefully this 15-part, 8-hour series has helped to map out the landscape and contextualize the work that has already been done as well as what is yet to come. Again, here are all 15 episodes of this Voices of VR podcast series on XR Accessibility:
- Shiri Azenkot on founding XR Access
- Christine Hemphill on defining disability through difference
- Reginé Gilbert on her book about Accessibility & XR Heuristics
- Christian Vogler on captions in VR & potential of haptics
- Six interviews from the XR Access Symposium poster session
- Dylan Fox on the journey towards XR Accessibility
- Liz Hyman on the public policy POV on XR Accessibility
- Mark Steelman on accessible XR for career exploration
- W3C’s Michael Cooper on customizable captions in XR
- Joel Ward on challenges with government contracting for accessibility and live captioning with XREAL glasses
- Jazmin Cano & Peter Galbraith on Owlchemy Labs’ pioneering low-vision features for Cosmonious High
- Liv Erickson on intersection between AI & Spatial Computing for Accessibility
- Ohan Oda on upcoming accessibility AR features in Google Maps
- Yvonne Felix on using AR HMDs as an assistive technology for blind and low-vision users
- Sean Dougherty & Jeffrey Colon on the challenges and opportunities in making XR accessible for blind & low-vision users
This is a listener-supported podcast through the Voices of VR Patreon.
[00:00:05.412] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to the Voices of VR podcast. It's a podcast about the future of spatial computing. You can support the podcast at patreon.com slash Voices of VR. So this is episode 15 of 15. It's the last of my series of looking at XR accessibility. And today's episode is with Sean Dougherty and Jeffrey Colon. They both work at Lighthouse, and they're looking at low vision and blindness in the context of immersive technologies. Yeah, just a really fascinating exploration for what they're doing with the National Science Foundation grant and just generally how to make virtual and augmented reality, which are both very visual mediums, how to make them more and more accessible. So they walk through some of their different experiences with screen readers and existing tools and some of the different initiatives and projects they have, and generally the excitement of both themselves and the blind and low vision community for the potential for using virtual and augmented reality as these assistive technologies. So that's what we're covering on today's episode of the Voices of VR podcast. So this interview with Sean and Jeffrey happened on Friday, June 16th, 2023 at the XR Access Symposium in New York City, New York. So with that, let's go ahead and dive right in.
[00:01:25.502] Sean Dougherty: My name is Sean Dougherty and I work at Lighthouse San Francisco. I work as the manager of corporate relationships within access technology on our team.
[00:01:35.011] Jeffrey Colon: Hi, my name is Jeffrey Colon. I'm director of access technology at Lighthouse, and within access technology I oversee three areas: DeafBlind training; individual and group training on everything access technology, so smartphones, iPads, computers, smart speakers, wearables; and, in the third area, corporate partnerships. So we partner with different organizations, stakeholders, community partners, startups, big players in the field to work on accessibility. That could be working on their website, it could be providing corporate training, it could be designing a new concept. So we work in multiple areas on that.
[00:02:23.200] Kent Bye: OK. And yeah, I'd love if each of you could give a bit more context as to your background and your journey into doing this type of work with the intersection of technology and accessibility.
[00:02:32.740] Sean Dougherty: Yeah, so for me, I've spent a lot of my career working in the education technology space at a couple of different nonprofits. One of those nonprofits was focused on helping reduce student loan debt through kind of a unique model: it was like a customer service model to help college students gain work experience, and we were able to pay them tuition assistance. And so that kind of got me started early in my career in the education space. I then moved into the education technology side, working with app developers focused on K-12 education. And that included apps all across the K-12 tech ecosystem, as well as apps focused on some AR and VR applications, such as Expeditions VR on Chromebooks in classrooms and areas like that. I kind of pivoted into the accessibility space. I'm a low vision user of assistive technology; I have an eye condition called cone dystrophy, so I use a lot of screen readers, spoken content tools, and zoom magnification tools. So as a user, I really felt like there was an opportunity to help other people with assistive technology and accessibility, and I pivoted into this space. A lot of these tools are used in EdTech and in K-12 classrooms, so it was kind of a natural pivot. I was already familiar with some of the software working on dyslexia and text-to-speech. And I came to Lighthouse about two years ago to work on the corporate partnership side: like we were mentioning, doing app and website work with developers to make their experiences more accessible, and working on user testing. And given our location in San Francisco, we get access to a lot of the Silicon Valley startups and tech companies that are working on things like autonomous driving, and now generative AI tools.
So we get a lot of early exposure to these technologies, and we want to make sure that our community has an opportunity from the beginning to be involved in the user research, the testing, to help inform features and functionalities to make sure that they're accessible from the start instead of waiting till later down the road.
[00:04:37.092] Jeffrey Colon: For me, wow, I've been working in the field like 23, 24 years. I am originally from Puerto Rico, so I started in the field working in Spanish, helping consumers. At that time I started providing training to people working on Windows with screen readers, then started to work as an access tech specialist, providing tech training to people with multiple disabilities. Even more than blind and low vision, I started to work with people that are deaf-blind, people that have cognitive issues. So then I started to move more into the accessibility part of it, and then more focused on blind and low vision, and then started to work at a company called Freedom Scientific, which is now called Vispero. They are one of the major companies in the access technology field. They developed a screen reader called JAWS. I worked with them for eight years. Then I moved to the States and started to work with different organizations, like checking accessibility on multiple pages, and also with the government of my country, making sure that some of their websites are accessible to blind and low vision users. I started to work at Lighthouse in 2020 as an access tech specialist and then moved into the director of access technology position in late 2021.
[00:06:06.134] Kent Bye: And I'd love to hear a little bit more context for how technologies like virtual reality or augmented reality are intersecting with accessible technologies, and where that started to come onto each of your radars, for where either VR or AR started to come into the picture for the type of work that you do.
[00:06:22.844] Jeffrey Colon: Wow. For us, I can tell you that we started to see it because we have been working with AI for a while, like AI tools for reading that solve a problem that we have: we cannot read printed materials. So we started to see that incorporation of AI, and then we started to see more applications integrating augmented reality into it. So we're starting to see things like object detection, where you can use your fingers to explore pictures, or headsets to get to know your environment. So we're starting to see that problem-solving moving to other experiences, like accessible games. And then we're starting to see more collaborations with partners on developing accessible environments that include VR. And then we got approached by NSF to work on this partnership to make sure that VR environments are inclusive. So yeah, it was like a natural convergence for us, from the AI tools that we were already working with, to now starting to move into AR and being at the beginning of this exciting technology, making sure that it's accessible.
[00:07:37.317] Sean Dougherty: Yeah, we've been seeing this incorporated in a lot of the tools that we've been using for quite some time. There are Optical Character Recognition, or OCR, apps that can detect text and objects and give audio output to blind or low vision users to help them explore their environment and better understand it. And we're now starting to see with some of those tools, such as Seeing AI, which is developed by Microsoft, that they're incorporating an AR functionality where you can take a picture of the environment and explore it with your finger, and every time you interact with an object in that environment, it will identify what that object is, even transpose a text box above it that will tell you what the object is, and when you touch it, it will explain it to you with speech output. They're also incorporating haptics when you're feeling your way around the screen, and vibrations, and also sound: with immersive audio, when you touch those objects you're hearing a sound notification. So we've been seeing it in those areas primarily, but now, with some of the new launches in the space around hardware such as Apple's Vision Pro, focusing on AR as well as VR, we're starting to see some possibilities for that technology to be more accessible. We're trying to work on a couple of projects in that area. I think developers are going to need some time, but hopefully by using things like Apple's ARKit and other standards, they can basically transition that visual content into non-visual information. And that will be primarily through haptics as well as immersive and spatial audio.
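The touch-to-explore flow Sean describes can be sketched roughly in code. This is a hypothetical Python sketch, not Seeing AI's actual API; all class, method, and event names here are illustrative. The idea is simply that a touch location is matched against detected objects, and the match triggers speech, haptic, and sound feedback:

```python
# Hypothetical sketch of touch-to-explore feedback, in the spirit of
# OCR/scene-description apps like Seeing AI. All names are illustrative.

from dataclasses import dataclass

@dataclass
class DetectedObject:
    label: str    # e.g. "coffee mug", as named by the recognizer
    bbox: tuple   # (x, y, width, height) in screen points

class SceneExplorer:
    """Maps a touch location to multimodal feedback for the object under it."""

    def __init__(self, objects):
        self.objects = objects

    def object_at(self, x, y):
        # Return the first detected object whose bounding box contains (x, y).
        for obj in self.objects:
            ox, oy, w, h = obj.bbox
            if ox <= x <= ox + w and oy <= y <= oy + h:
                return obj
        return None

    def on_touch(self, x, y):
        """Return the feedback events a real app would dispatch:
        speech output, a haptic tap, and an audio earcon."""
        obj = self.object_at(x, y)
        if obj is None:
            return []
        return [
            ("speech", obj.label),        # screen-reader announcement
            ("haptic", "tap"),            # confirms the touch landed on something
            ("sound", "object_earcon"),   # spatialized sound cue in a real app
        ]

scene = SceneExplorer([
    DetectedObject("coffee mug", (100, 200, 50, 60)),
    DetectedObject("laptop", (300, 150, 200, 120)),
])
events = scene.on_touch(120, 230)  # finger lands inside the mug's box
```

The point of the sketch is the pairing: every visual hit test has a non-visual output attached, rather than audio or haptics being bolted on afterward.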
[00:09:08.121] Jeffrey Colon: And also, I think that standards will help us. They're working on developing some kind of guidelines, and standards, usually when these technologies are so new, take time to develop; they develop kind of on their own until they're ready, and then multiple organizations start converging on them and accepting them. So right now, that process has started. We're seeing more guidelines that allow developers to follow along and start creating those accessible environments. So this is the time where we are right now. That part will take time, but we're seeing more developers trying to make sure to follow those initial guidelines, to use common sense and work together in implementing those guidelines that allow us to create more accessible environments: a multiple experience like extended reality, or XR, that allows us to immerse in different areas like augmented reality, virtual reality, and mixed reality.
[00:10:09.087] Sean Dougherty: We're also seeing excitement from the blind and low vision community for these technologies. The users that we work with and support in the community really want the opportunity to demo these technologies and give feedback. That's not always available, and sometimes organizations don't know who to reach out to. So we try to support there with user testing, but there are other organizations that can help as well, to really make sure that blind and low vision individuals and users with other types of disabilities are involved in the conversation and can really give that feedback. Because some developers try to create filters and experiences where maybe they can experience some vision loss or other types of impairments when evaluating VR or AR technology, but there's no better way than working with the users themselves to really get their direct experiences and feedback. And ultimately that will inform an accessible experience. And usually we see, when these types of users are considered, that the design that comes out of it is more universal and really benefits everyone. And so we're excited about that.
[00:11:12.807] Kent Bye: Yeah, I was talking to a representative from Google, and he was showing me some of the guidance features being integrated into Google Maps to do the Visual Positioning System, give directions, and show where different landmarks were. And one of the things he said was that folks who have low vision or blindness might be getting a little overwhelmed with all the different technologies that are emerging right now. I'd love to hear some of your reflections on that. Like, you're talking about how the phone plus AI has been a part of your community for a while, but have you seen an acceleration of the different types of applications and products and companies that are coming to folks who have low vision or are blind, asking for feedback on their technology? Have you seen an acceleration of these different types of developments so far?
[00:11:56.820] Jeffrey Colon: That's a great question. I think that we see that with multiple organizations. And yes, definitely it's a problem, because sometimes users are still trying to understand, and then the next day they see ChatGPT, GPT-3, GPT-4, Bard, and then they say, oh, I will be able to use all of these applications, this is too much for me. And that could be overwhelming. And sometimes, like Sean said, some organizations want to get feedback, but they don't know where to reach out. And sometimes they reach out to a person that is still figuring out what they themselves need, and maybe the feedback they receive will not be complete, or will be lacking, because the person is still learning. So yes, we have encountered situations like that, or developers that are still trying to understand whether they want to immerse themselves in this. So yes, we're seeing that situation, and what we try to say is: collaborate with the developers, collaborate with the organizations in multiple areas, making sure that accessibility is considered from the beginning, and making sure that they challenge themselves on how we can create an inclusive design. Because I think that's more important, thinking about how, if I made my environment accessible, then I will get as many users as I can. And we recognize that, and we're seeing that with some applications as they try to integrate accessibility. But the way that they do it is that they describe everything, and within two or three minutes even me, as a power user, would probably walk away, because maybe I'm hearing multiple audio sources describing, like, 80 gestures: Sean just smiled, Charles just gave a thumbs up, and Jack just moved to the right. So imagine all of those 80 announcements plus my screen reader talking. So that's when we try to tell them, hey, let's try to rethink this in a way that is more inclusive, more natural. And that's why we talk about natural audio and haptic feedback placement.
So it is easier to navigate, especially because we need to take into consideration that maybe the person that will be using this is a 60-year-old person that is just starting to get into technology, and we don't want to overwhelm some users with that. So among the things that we're seeing and sharing, one we can elaborate on is customization. We're recommending developers work on customizing their products so users have the opportunity to decide which areas they want to focus on: if they want more audio feedback, if they want more haptic feedback, and if it's audio description, how descriptive the application will be, and, on the low vision side, other customizations that you can fine-tune.
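The verbosity problem Jeffrey describes, dozens of gesture announcements drowning out the screen reader, suggests summarizing cues rather than reading each one aloud. Here is a minimal hypothetical sketch of that idea (the function name, threshold, and phrasing are mine, not from any shipping product): cues arriving in one time window are announced individually when there are few, and collapsed into per-gesture counts when there are many.

```python
# Hypothetical sketch of a verbosity filter for social-cue announcements.
# Instead of announcing all N gestures individually, cues from one time
# window are summarized once the count crosses a user-set threshold.

from collections import Counter

def summarize_cues(cues, verbosity_threshold=3):
    """cues: list of (person, gesture) events from one time window.
    Returns the announcement strings to speak."""
    if len(cues) <= verbosity_threshold:
        # Few enough to announce individually.
        return [f"{person} {gesture}" for person, gesture in cues]
    # Otherwise collapse into one summary per gesture type.
    counts = Counter(gesture for _, gesture in cues)
    return [f"{n} people {gesture}" for gesture, n in counts.items()]

# A quiet moment: each cue is announced on its own.
quiet = [("Sean", "smiled"), ("Charles", "gave a thumbs up")]
print(summarize_cues(quiet))

# A busy moment: five cues collapse into per-gesture summaries.
busy = [("A", "smiled"), ("B", "smiled"), ("C", "smiled"),
        ("D", "gave a thumbs up"), ("E", "gave a thumbs up")]
print(summarize_cues(busy))
```

The threshold is exactly the kind of per-user customization knob discussed above: a power user might raise it, while someone new to the technology might want only summaries.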
[00:14:52.118] Sean Dougherty: Yeah, I think the big tech developers have done a really good job of integrating accessibility into mainstream devices now. And so there are a lot of assistive technology tools out there, but users now have a lot of the features that they need built directly into those devices. And I think that's been incredibly important for the industry. It's also helped hold developers accountable, because basically, if the big players like Google, Apple, and Microsoft are building these tools and capabilities into their products, in many cases they're holding their partners to those same standards. So if you want to incorporate into an app marketplace or ecosystem with one of the big tech companies, and have your app or tool available on those devices, you have to adhere to some of those optimizations and standards as well. So I think that's really helped move the industry forward. Sometimes, though, it can be challenging for users that rely on multiple assistive technologies and maybe don't fit one kind of given use case or box, and sometimes we see some challenges there. An example I can give is for myself as a low vision user: I rely a lot on the spoken content tools, which are kind of like the lighter text-to-speech tools, and I also use the zoom and magnification tools. But sometimes I also like to turn on VoiceOver, which is the full screen reader on iPhone. And sometimes those technologies are not the easiest to use together. Like, if you basically turn on VoiceOver and you're using it for audio feedback, and then you want to zoom in, it's a little bit tricky to navigate some of those zoom tools on an iPhone or iPad, for example, if you're also using VoiceOver.
Sometimes these tools are kind of built like there's a certain screen reader for blind users, and there's a certain screen reader for low vision users. But some users want to use both and incorporate them, and sometimes challenges can arise. So, just using that as an example, sometimes we see that with other assistive technologies as well: depending on the user's preferences, there can be some competing pieces to the tools, and sometimes they don't always work well together. So we're trying to encourage developers to think about that, getting back to the user preferences and allowing the user to have multiple options that they can work with that meet their needs at the end of the day.
[00:17:05.683] Kent Bye: Yeah, and Sean, earlier today you were on a panel discussion talking about some of the work you're doing in collaboration with other academics as part of this larger National Science Foundation funded program, where my impression is that you're trying to take things that are happening in these virtual spaces (maybe even Zoom calls), identify different gestures or emotions, and make things that are generally invisible somehow more visible. So maybe you could give a bit more context for what you're doing on this project to try to give a little bit more contextual and relational information in these virtual spaces, using some of these assistive technologies, with either blind or low vision populations.
[00:17:45.028] Sean Dougherty: Sure. Yeah, we're working on an NSF grant project that we've been involved with for several months. We're partnering with XR Access, which is part of Cornell Tech, and we're also partnering with Benetech, which is a nonprofit that focuses on document and content accessibility; they have products like Bookshare that are used widely by the community. So our organization, Lighthouse San Francisco, is collaborating with XR Access and Benetech on this grant. And there's a couple of different things we're focused on. Right now there is not a set of standards and guidelines that are fully defined for XR accessibility. There are the Web Content Accessibility Guidelines, and the W3C sets those guidelines in place. They have thought about XR accessibility and have some best practices, and a couple of working groups that are thinking about this technology, but they haven't fully laid out those standards. And so, essentially, we're putting together some prototypes that can be tested by users, and we've conducted some user research, and we have focused quite a bit on current meeting environments and use cases: Zoom experiences, virtual conferences, presentations as the current technology. But we're also thinking about the future of this technology and how we can transform these experiences and effectively deliver them in an XR environment. And so we're looking at things like social cues, and how to basically identify those cues, which are primarily visual information, and communicate them to users in a non-visual way. So this will be delivered through haptics, through audio cues, different types of sounds. And so basically what we're trying to test is: what are the most effective sounds, and how do we deliver them in a way that's informative to the user for kind of scanning the room, or getting information on who's engaging with them, who's paying attention, what are those reactions.
But it can't distract from, maybe, people that are talking in that virtual space. Or maybe the user is actually giving a presentation and they're a screen reader user: they might be listening to their screen reader for their presentation notes, while also trying to listen to maybe the chat that's taking place in the room, or users that are entering or exiting that space. And so you have a lot of competing audio information that can be a little bit overwhelming to the user. So even though we find audio to be incredibly useful for the accessibility of XR, we're trying to think strategically about the right sounds for these types of cues, and then also things like audio ducking: how do you layer in this audio in a way that is effective for the user and not overwhelming, and not kind of stacking on top of itself?
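Audio ducking, which Sean mentions, means attenuating lower-priority streams while a higher-priority one (like the screen reader) is active. A minimal sketch of the idea follows; the priority ordering, source names, and duck gain here are illustrative choices of mine, not a standard or any product's actual behavior:

```python
# Minimal sketch of priority-based audio ducking. While a higher-priority
# source (e.g. the screen reader) is active, everything below it is
# attenuated. Priorities and the duck gain are illustrative, not a standard.

PRIORITY = {"screen_reader": 3, "chat_notification": 2, "ambient_cue": 1}

def mixer_gains(active_sources, duck_gain=0.2):
    """Return a gain (0.0 to 1.0) per active source: full volume for the
    highest-priority source, ducked volume for everyone else."""
    if not active_sources:
        return {}
    top = max(active_sources, key=lambda s: PRIORITY[s])
    return {s: (1.0 if s == top else duck_gain) for s in active_sources}

# Screen reader speaking over a chat ping and an ambient cue:
print(mixer_gains(["ambient_cue", "screen_reader", "chat_notification"]))

# Screen reader silent: the chat notification takes the foreground.
print(mixer_gains(["ambient_cue", "chat_notification"]))
```

The design choice worth noting is that nothing is fully muted: ducked cues stay audible at low volume, so the user keeps their sense of the room while the most important stream stays intelligible.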
[00:20:23.198] Jeffrey Colon: I think that, like Sean explained, we wanted to make sure that we can provide those options, that we can partner with organizations, and that we can give the users the opportunity to identify the areas that they want to work on. So with that National Science Foundation project, and also partnering with companies like Benetech, we're making sure that we can cover all of the areas, like Sean explained: gestures, what's happening in that environment, and making sure it is naturally accessible and not overwhelming. And also working with other partners so that the other areas of that environment are accessible; like, if the person encounters an object, that object is described. So it's an immersive experience that works with recognition of gestures, recognition of those visual aspects, but also making sure the other part of the platform, the objects, what's happening out there, is also accessible. So that's the challenge, and it's incredible to partner with these organizations and make sure that we can cover all of the areas.
[00:21:37.698] Kent Bye: And it sounds like there's a common theme here that the phones are a pretty crucial assistive technology for lots of folks who are disabled, whether it's low vision or blind or deaf or hard of hearing. And so we have a lot of mobile phone users. And now as we move into the head-mounted, augmented, and virtual reality headsets, there's a couple of things. One is that the Apple Vision Pro was just announced, and there's a lot of excitement. I think that a lot of those accessibility features that are baked into the core operating system of iOS are going to potentially be integrated into something like a head-mounted display like the Apple Vision Pro. The Android accessibility features haven't really been transferred over into the virtual reality experiences on the Meta Quest Pro, because a lot of things are in Unity or Unreal Engine; it's a little bit of a black box in that sense. And so there's also this other dilemma: it's essentially a bit of a chicken-and-egg problem, where a lot of the virtual reality headsets aren't really accessible for folks who are blind or have low vision, and so there aren't a lot of users, and because of that there's not a lot of feedback, and then there's not a lot of accessibility being developed. But at the same time, you mentioned on stage that at least a third of your low vision folks had at least tried it in some capacity. And so I'd love to hear some of your initial thoughts on virtual reality as a medium and the barriers to accessibility, and how you see that continuing to evolve and grow and develop as other players come in, like Apple with the Apple Vision Pro.
[00:22:59.708] Sean Dougherty: Sure. Virtual reality is a highly visual technology, so that poses a huge challenge; it's very visually dependent. And so, yeah, I think there is an opportunity with things like the Vision Pro. It does appear that Apple plans to bring VoiceOver and some of those other built-in accessibility features that are available on a Mac and on an iPhone into that environment, into that AR environment. But one of the challenges is around the haptic aspect: we're a little bit unsure about how haptics are going to be incorporated in the current form factor. We haven't had a chance yet to get our hands on a Vision Pro, so we're hoping to do some testing with it at some point. But I do think that there is an opportunity to hopefully incorporate haptics into that device. But yeah, in our research and in our community, we do have some users that are trying out some VR gaming and are enjoying it. I think right now the technology is a little bit more accessible for low vision users as compared to blind users, depending on the form factor, the headset you're using, and the experience. In some cases, for blind users, there's not a lot of feedback that they're able to get: if that VR application is not using a lot of audio or spatial audio, or is not using any haptic feedback, it might be hard for them to understand what exactly is happening in that experience, and it might not be the best user experience. But some low-vision users in our community that we've worked with actually find the experience to be quite engaging. And the reason why is because, with the screens being so close to their eyes, in some cases it's actually easier to see that type of screen in a headset than it is to look at a large TV across the room. And so, by having the screens closer (and this is very situational; it depends on the user and their specific eye condition), in some cases it can kind of compensate for, or offset, a little bit of that vision loss.
And they can actually see the screen relatively well and engage with that experience. But I think the gap is still with blind users, and I think that's the area, at least in our community, where there's the biggest opportunity to further adapt this technology.
[00:25:13.540] Kent Bye: Yeah, Jeffrey, have any thoughts on virtual reality?
[00:25:16.442] Jeffrey Colon: Yes. I think that, like Sean said, this is still the beginning. And we're seeing at this conference, for example, Owlchemy Labs and some of the things that they did with converting their game and giving an accessible experience. We also see some other accessible games, for example, on the PlayStation 5 platform; we're seeing some good accessible games on that. And yes, it's just great. But how can we make sure we're being inclusive? Like Sean said, how can we make sure that blind users can benefit from that experience? How can we make sure to integrate audio from the beginning? Because if I'm using a headset and I am working in an application, if I need to activate my screen reader from the menu, I cannot access the menu, so I will be stuck until somebody helps me. So how can we make sure that that is happening from the beginning, in a way that is accessible? So that's why, maybe with the Vision Pro, how can we import those accessibility settings that we already have in the Apple ecosystem, in the Google ecosystem, and make sure that we import those from the beginning, so the experience is more accessible for everybody.
[00:26:34.613] Kent Bye: I wanted to get your thoughts on the development of the technology, where you're able to potentially prototype some of these different accessibility features in a virtual world like Cosmonious High, where you have a fixed virtual environment and you know what everything is, so they're able to have these different accessibility features, and how some of those user interface patterns may eventually come over to augmented reality with head-mounted devices, where, as you walk around physical reality, you're able to maybe have some of the same type of functionality for either low vision or blind users. So I'd love to hear any reflections on this workflow. We already have a lot of mobile phones that people are using, but as things move onto these head-mounted devices, do you foresee these user interface patterns for accessibility in VR eventually coming over to augmented reality and head-mounted devices?
[00:27:23.143] Jeffrey Colon: I think it will happen. I definitely think it will happen. And like the Owlchemy Labs people were mentioning yesterday, it will depend on the accessibility model, and, like some of the things that Sean and I mentioned, on how we can make sure that those guidelines are there, so developers can have the accessibility framework. For example, on the Swift platform on Apple, those accessibility APIs are very well defined, so developers can take advantage of them; likewise, on the Google platform, or another platform like the PS5, that accessibility model is well defined. In combination with the guidelines that developers follow, plus education for the developers (creating educational materials that developers can learn from), and with advocacy, blindness organizations and other disability organizations can make sure that developers understand how important this is for the entire community. So yes, it's a combination of the accessibility framework, the model, working with the developers to use the model, education, and advocacy from the users, for that to happen. We are at the beginning, and we are seeing that. We're seeing organizations involved in this conference that we just participated in, people from the FCC, people from the W3C, and this is very important, because if we do it together, and developers see the value of this, I think that there is a good opportunity to make things right from the beginning. Whereas if we wait too long, then, in accessibility work, we call it retrofitting: we will have to basically retrofit for accessibility, and that will be complicated. So that's the way that I perceive it. It's a combination of making sure that the accessibility framework is in place, from the developers to the users, and knowing how to advocate in order to make this correct.
[00:29:24.187] Sean Dougherty: I just wanted to add, I think it's really important to focus on the current tools that are being used, what's working well with those tools, and then what the VR or XR equivalent of those tools would be. So for example, blind and low-vision users are really familiar with screen readers. If they're on mobile devices, they're using swiping gestures on the screen, they're using double taps to interact with things, they're getting audio feedback, and there's a little bit of haptic feedback involved as well with those gestures. If those users are on a computer, they're also using a screen reader, but they're primarily using keyboard navigation, and they're getting audio feedback. So those are the standard experiences with those types of mainstream technologies today. I think the challenge is that blind and low-vision users need a screen reader experience that works well in XR. But what does that look like? Does it involve gestures with your hand? Does it involve a controller that has haptics? What is the form factor that makes sense? And so I think that's where the user research and feedback comes into play. Because one challenge in the industry is what we call overlay tools, which is essentially when a developer creates a suite of accessibility tools that are overlaid onto the platform and made available to the user. The challenge is that those tools are usually not based on what users are accustomed to using and what they prefer to use. And so it's perceived as a helpful tool set, but then when users go to use them, those tools don't really work well, or they're not what users were expecting. And so we're trying to avoid that type of experience in XR. We're trying to help build tools that are based on what users are familiar with on the AT side, with screen readers, with zoom magnification tools, and an XR equivalent.
And so I think it will take some time through research for that standardization to kind of take place to find something that's natural and works well.
[00:31:18.155] Jeffrey Colon: Yeah, I think that component is very important, adding that user feedback model. So I need to advocate for myself, but they also need to hear me, understanding what I used before, in order to make this work. We don't need them to reinvent the wheel. We just need them to follow what we have and transport that into VR, into the XR reality, so everybody can benefit from what we have. And maybe it will be, like Sean said, form factors, controllers. It will be a combination of types of feedback. So give the user that opportunity, because one of the accessibility principles is to make sure that we can use multiple tools in order to navigate.
[00:32:04.601] Kent Bye: So you just gave a whole presentation going through a lot of these artificial intelligence and machine learning-driven tools that are more on the phone-based augmented reality side, where you can take a picture of a scene and get a description of the objects in there, but also this natural language input with ChatGPT. There's been an explosion of what's happening with both computer vision and large language models, and the ability to have this conversational interface. And you also just mentioned looking at what works. It seems like there's a lot of stuff that's crossing past a critical threshold of utility, where you're able to have all sorts of new capabilities with the confluence of all these tools. So yeah, I'd love to hear any reflections on what's happening with this intersection of AI, machine learning, computer vision, and AR with phone-based tools right now.
[00:32:50.774] Sean Dougherty: Sure. I think a lot of these tools work really well for current use cases. So, like, optical character recognition, or OCR, apps use cameras on smartphones to be able to process text and objects within environments. They're using AI to be able to assess what those objects are, and some of these applications have incorporated some AR aspects as well. It seems to be working well for that use case, but the challenge is bringing this into the VR or XR space and basically creating an equivalent. Because the thing is, OCR works pretty well, but it's kind of hard right now to use it in real time. A lot of applications, such as Seeing AI, do a really good job of this, but they work best when you actually capture a picture of something and then explore that image with your finger to understand what the objects in that image are. The challenge is to do it in real time in an XR space. So I think we'll see this technology hopefully continue to evolve, to be able to have an OCR-equivalent experience that in real time can assess what the different graphics and visual elements in an XR experience are and communicate that to the user, beyond just a basic description. So I think we need something a little bit more advanced in that way. I think it's an area of opportunity. It's just also a little bit of a challenge right now.
[00:34:16.645] Jeffrey Colon: Yeah, definitely. I think that we're seeing that movement happening. We're seeing the evolution of technology. For me, being 24, 25 years in the field, I have seen, in the beginning, for example, everything was computers, computers, computers, and if you were lucky, you had a computer. So you didn't have too many options there. And the learning curve was huge. You would see people waiting for months in order to get their training, because the instructors were occupied just trying to teach them the keyboard, trying to teach them how to use the computer, navigate, all of the commands, Alt-F4 to close. Then we transferred, like we talked about this morning in the conversation, we transferred to the phone. And now we're seeing the learning curve is less. We can teach a blind user how to swipe left and right to move between elements, double tap an element, and they don't need to worry about all of the computer keyboard commands that they had to learn in the past. They will use those if they go on the computer side, but now the learning curve is easier on the iPhone. We're seeing that also on computers with low-vision users, where the learning curve was a lot: learning all of the commands, working with the mouse, enlarging the mouse, making sure that the focus is retained, and then doing that balancing act of changing the font size so the fonts do not get disturbed and you don't see pixelated letters. And now we're seeing applications that naturally allow you to use large text by default. You don't need to do anything. We're seeing dark mode integrated into applications. So we're seeing how the technology is evolving, and we're seeing that beginning, like Sean said, in the virtual experiences. And we are excited to see what's happening. Like Sean mentioned, those challenges in real time are very important.
So if we can work on that, we're seeing now all of these applications working on making sure that the real-time experience is easier. It's not like before, where it just reads right away, and then you have to move, and every time that you move it reads again, and then you get confused. So they're working on that. So we are excited about all of these opportunities that are coming. And we want to make sure that users are there, because this is for me as a user, but it's even more important for our community to make sure that they can take advantage, and don't find that when they want to use it, it's inaccessible. That's the most important thing for us, to make sure that we can help.
[00:36:47.062] Kent Bye: Great. And finally, what do you each think is the ultimate potential of all of these immersive technologies from augmented reality, virtual reality, AI, with accessibility in mind and what it might be able to enable?
[00:37:01.311] Sean Dougherty: Particularly with AR, I'm really excited about the ability for a user to get more feedback on their environment, particularly for things like orientation and mobility. I see use cases where, as the current technology advances, the headsets get slimmer and slimmer, or ultimately become just a really integrated pair of glasses, so that a user can explore their environment and get real-time information on objects, on locations, on navigational directions, and then can even adjust their environment. Like for myself, I'm a low-vision user who has a really high sensitivity to light, and so I rely on really dark lenses, dark sunglasses. I'm always experimenting with what are the darkest lenses I can find, and those are usually sportswear glasses or ice climbing glasses and things like that. But maybe in the future, with AR incorporated into different types of headsets, I might be able to adjust the lighting in my environment automatically, even when I'm outside. I might be able to get it to a level that's even darker than any available lenses on the market. And so I think that ability for users to adapt their environment to their own needs at any given time could be incredibly useful. And this might be adding things to your environment, or it might be taking away things that are maybe a distraction or that you want to block out. So I see that as a really big opportunity. I mean, there's a lot of use cases for AR and VR, but particularly for the disability community, for us to be able to get more information on our environment and also adapt our environment when we're on the go for our needs, I think, is an incredible opportunity.
[00:38:41.028] Jeffrey Colon: For me, it's the same. Going back to what Sean said, I'm very excited about what I'm seeing in technology. I've seen developers working on making sure their environments are accessible. I see my community coming to us: hey, what about ChatGPT? This is something that I can incorporate, this is something that I can work with. So I'm very excited about that. I'm very excited about having that opportunity to naturally move from the computer, to a phone, to glasses, to having that environment with me at all times, so I can decide how I want to use it, if I want to use my wearables. Maybe for me, it could be a headset that I will have, and I will hear that immersive experience when I am walking and I am moving, and then when I point to the right, it will tell me what is out there with the glasses. Or maybe when I am moving to the right, it will give me haptic feedback that I'm getting closer to the curb. Or a user that is working with a guide dog as a companion could use this as a complement to that navigation. I'm very excited about that. I'm very excited to see, like Sean said, maybe it will be an opportunity for me to darken my environment. Or maybe the contrary: if the environment is dark, maybe I want it to be lighter, for other users that have conditions that need bright lights. So, having that opportunity to work with that as a blind user to get the audio that I want. As a deafblind user, maybe we will use ProTactile to let the user know, with some feedback that they can feel in their body, what's happening when they're moving. So I'm very excited about what the future holds for technology. I think that XR, in combination, all the XR experiences, is where the technology is heading. And we saw that at the CSUN conference, the assistive technology conference in California.
We saw even the developers telling us, hey, I think that what I'm presenting this year will be very different from what I will be presenting in the next two years, when we think about AI, VR, AR, and the XR experience. So I'm very excited about what the future holds.
[00:40:58.646] Kent Bye: Is there anything else that's left unsaid that you'd like to say to the broader Immersive community?
[00:41:03.228] Jeffrey Colon: No, at least for me, I just want to thank you. I just want to thank you for giving us the opportunity to share this with the community, and to know that we are working. Like XR Access has said, we want more people to enjoy this experience. So find our channels, find LightHouse San Francisco, find XR Access. And let's encourage other companies to do the same, like Owlchemy Labs; we need that advocacy. And also, developers, we want to work with you. We want to make sure that the experience is easier. This is not something difficult. This is something that we can work out together. And if we collaborate from the beginning, the experience will be great, and users will take even more advantage of it. So I just want to say thank you.
[00:41:54.382] Sean Dougherty: I would encourage developers in the community to just be bold and seek out different types of users and really look for that feedback. And even if it means that the users are critical of experiences or certain aspects of the tool are not working well today, that will help inform the opportunity for what needs to be fixed and updated to make these experiences more accessible. And you don't really know that until you engage with users of many different backgrounds, experiences, disabilities, use cases. So really seek out different user types and don't try to put your solution in a box to say, oh, it's only an XR application for a particular use case. Maybe try to think of other use cases that you haven't even considered and other users that might be able to take advantage of those use cases. And it will ultimately create a better XR experience that works for everyone.
[00:42:44.065] Kent Bye: Awesome. Well, Sean and Jeffrey, thanks so much for joining me here on the podcast to help share a bit of your own journey into working in this intersection of technology and accessibility and all the really amazing work that you're doing with XR and accessibility with Lighthouse and lots of exciting tools yet to come, but also quite a lot of work still yet to be done. And yeah, thanks for joining me to share your perspectives and insights. So thank you.
[00:43:05.272] Sean Dougherty: Thanks for having us. We really appreciate it. Thank you so much.
[00:43:09.158] Kent Bye: So that was Sean Dougherty, the manager of corporate relations within access technology at LightHouse in San Francisco, as well as Jeffrey Colon, the director of access technology at LightHouse. So I have a number of takeaways about this interview. First of all, I think the overall vibe that I get is that there's a lot of excitement about the potential for where this can go for both virtual and augmented reality as an assistive technology, but also that there's quite a lot of work that still needs to be done in order to make VR and AR accessible technologies for folks who are blind or low vision. This is probably the one area of accessibility with XR that has the most work that needs to be done. I think they were both excited by some of the pioneering work that was happening with Cosmonius High and what Owlchemy Labs is doing with starting to create a prototype of what a screen reader might be like if you translate that from 2D into more of a 3D spatial environment with different objects: how do you move your hand around, choose, and see the different objects that are in the room, some of their properties, some of their states, and how to interact with them in different ways. So, yeah, some of the cutting edge of what is going to be possible with these technologies. Again, as I played through Cosmonius High, I don't think it was necessarily at the point where someone who is completely blind would be able to play through the entirety of the game. I think the feature that they implemented is more for folks with low vision, because there's still a lot of the spatial context that you get from looking at the scene that isn't all being translated into the audio just yet. So I still think there's some work to be done there, to continue to playtest it and to see what some of the other features are that would really close that gap.
Again, I wanted to shout out this project called the SeeingVR toolkit by Yuhang Zhao, which was mentioned by Shiri Azenkot as a Unity toolkit that adds different things like a magnification lens, bifocal lens, brightness lens, contrast lens, edge enhancement, peripheral remapping, text augmentation, a text-to-speech tool, and depth measurement. So these are all things that were being added on top of these immersive experiences. And Sean, as a low-vision user, was saying that he uses a variety of different tools, including the magnification tools as well as the screen reader tools. He likes to mix and match them, and sometimes they don't necessarily play nice together. So I think the idea he's trying to emphasize is that there are a number of different tools and technologies and a number of different needs. Sometimes, given the context, he prefers to use one or the other, or sometimes both at the same time, but sometimes they conflict with each other. Just the same, I wanted to shout out the SeeingVR toolkit because there are different things like the magnification lens that I would have found really quite helpful as someone who uses glasses. It definitely impairs my vision if I use VR without my glasses. And as I played through Cosmonius High, I noticed that I was able to get a gist of what was happening in the world, but I would have benefited from having a magnifying glass where I could get a close-up of some of the different text, because it was quite blurry, and I had to kind of put my face right up to different objects. It would have been nice just to have a magnifying glass within the context of the environment to use. So yeah, these different types of tools, and just thinking about how to translate from 2D into 3D, and figuring out how to deal with the information overload and use haptics.
They were both concerned that the Apple Vision Pro has no haptics, because that would be an additional modality to help handle some of the information overload that can happen if you start to have all of these different types of audio cues. So yeah, they're doing a number of different types of research, trying to see how you can do this kind of audio sonification of different gestures in these different meetings, just to get more of a sense of what's happening in the social dynamics of a room, and to get the different emotional feedback that they would normally be getting if they were able to see everybody and their body language. And, yeah, just trying to balance this information intake of having these symbolic translations, trying to encode with spatial audio some of what's happening in the scene around them, so that as they're in this world, they can decode what those sounds are. This was one of the demos being shown in the context of XR Access, where they do this type of sonification of different gestures and movements. It also reminds me of the talk Lucy Jiang did on video descriptions, which is being able to have this narration describing what's happening, kind of like the alt tag for images, but within videos. Right now the platforms don't really allow for an audio description channel, but if they would allow multiple channels, then creators could start to upload multiple channels to be able to have these types of audio descriptions as well. So there's stuff like that, where the technology infrastructure and the platforms are not necessarily supporting these things, and anything that would happen would have to be kind of a custom middleware to inject something in there.
It also often requires the content creators to provide all this additional context as well, describing all the objects in the scene. Just like you would have to write image descriptions for the alt tags for images on the web, this is doing the equivalent of that, but for all the objects that are in the immersive experience, to be able to describe them for someone who can't see them. One of the presentations that Jeffrey and Sean did on the last day was to walk through a lot of the different artificial intelligence phone apps that they've been using. They demonstrated some of them, the ways that they work, and sometimes the ways that they don't work, but, yeah, just being able to snap an image and use this kind of computer vision with augmented reality types of applications to identify different objects in a photo, as they take a picture of a desk just to see what the objects on the desk are. Just the same, within the context of virtual reality, this is something that Cosmonius High has started to implement: the ability to scan your hand around the room to see what kind of objects there are, or, if you're teleporting, to give more context as to where you're teleporting to. And, yeah, there are lots of different specific text options, contrast options, and options for colorblindness. But the existing Virtual Reality Checks, the VRCs for if you want to get something into the Quest Store, don't have any requirements for people who are low vision or blind. Most of the stuff around vision is around color blindness, but not necessarily around either blindness or low vision. So this is an area that's ripe for innovation, to start to see what a screen reader for 3D looks like and what works and what doesn't work. Christian Vogler, again, made this caution that you can't just take all the user design patterns in 2D, throw them into VR, and expect them to work the same.
So they're both very keen to see the approach of what Apple has been doing with the Apple Vision Pro, with the assistive technologies that they have within iOS starting to be put out into the context of virtual and augmented reality for folks who are blind or low vision. There's a lot of excitement that they have, but they also want to test it out, because if there's something that isn't quite working, they want to try to address that as soon as possible. And because they're both located within San Francisco, they're at the nexus of interfacing with and being early adopters of all these different technologies, and they help to spread the word, do trainings, and teach other folks who are blind or low vision how to use the technology, but also work to make the technology more accessible for folks who are blind or low vision. So yeah, on this National Science Foundation grant, there was a whole panel discussion about that, which will be a video as part of XR Access, where they dig into some more of the details of some of the specific things that they're doing there. But yeah, I just really enjoyed being able to sit down with both Sean and Jeffrey, just to get a sense of their own experiences around what their struggles are when it comes to the technology. They both genuinely were very excited about the potential for where this could go, because it does seem like they could use some of these different technologies as assistive technologies that allow them to have more freedom and autonomy as they move about the world, and perhaps increase their amount of agency and interactivity as well, as there are different ways that augmented reality can start to have that type of impact on how they engage with the physical world.
But, you know, virtual reality serves as this experimental playground to push the edge of what's even possible technologically, because the constraints are a little bit more bounded within the virtual world, where the creator of the virtual world has a deep understanding of everything that's there, whereas that's not always the case with physical reality. So there are just a lot of opportunities to push the edge of what's possible technologically. And yeah, they're generally trying to understand how to bring in these types of optical character recognition tools or other AI tools. You know, there are a lot of apps that they're using on their phones as a real assistive technology. And then, is it going to be some type of middleware device that is going to also have some type of screen reader that's able to plug into all this data? Again, the data is either extrapolated using some type of computer vision AI, or it's drawn directly from the metadata of these different experiences. One of the things that came up with Cosmonius High was that there was no platform-level option within the operating system to signify that this was a user who was blind or low vision. If that were a choice and a flag that was made at the operating system level, then that is something that could potentially be passed along into these different applications, so that when you launched into them, they could directly launch into these different modes without you having to turn anything on. So, stuff like that, where it's able to track your preferences at the deepest level of the operating system, but also have the special features that are being revealed. So, yeah, I think it's an area that's ripe for innovation, to push forward what's even possible and to integrate all these different modalities.
And I think, like I said, generally they're both excited, but they also want to make sure that as these applications are being developed, developers and the entire industry really figure out how to make XR more accessible with this type of screen reader for these 3D spatial environments, and what that's going to look like at a systemic level, not a one-off level like Cosmonius High. What are the different types of tool sets that are going to be implemented, either at the operating system level of the devices themselves or as some type of middleware or other application that's able to add some of these tools into these existing experiences? Ideally, I think it would be at the core operating system level. Just like you're able to do magnification in the context of a 2D app, would there be an equivalent type of functionality that could be inserted in? So yeah, the architecture of all this is still yet to be determined. But like I said, it was just really great to be able to hear from both Sean and Jeffrey about some of the challenges that still need to be overcome, but also the possibilities that both they and their community are really excited about for where this could go in the future. So this is the end of my deep dive into XR accessibility. Going to the XR Access Symposium was a catalyst here, along with digging into a few conversations from my backlog. And like I've mentioned in other podcasts, I went back and added transcripts to all of my previous interviews. All 1,236 interviews now live on my site have rough transcripts. They're not verified, they're not clean, and there are likely inevitably going to be errors in each of them, but there are at least these little buttons that you can click that go directly to that point in the audio, where you can start to listen to it and verify the transcript based upon what you hear and what's being said.
And yeah, I just feel like accessibility is one of the biggest open challenges and problems in the context of virtual reality. It's something that is going to require a lot of design innovation and a lot of creativity. As Shiri Azenkot said, you know, this is something that can be really fun and really push the edge of what's possible. It's kind of like a green field, in the sense that you can start to push limits and do things that haven't been done before, but also look at the existing legacy, the things that actually really do work, and start to figure out how to create the equivalent in some of these different immersive experiences, which is a point that both Sean and Jeffrey made a couple of times in the context of this conversation. So that's all that I have for today, and I just wanted to thank you for listening to the Voices of VR podcast. If you enjoyed the podcast, then please do spread the word, tell your friends, and consider becoming a member of the Patreon. This is a listener-supported podcast, and I do rely upon donations from people like yourself in order to continue to bring you this coverage. So you can become a member and donate today at patreon.com. Thanks for listening.