Team training scenarios are often difficult to schedule due to the logistics involved in coordinating many different people’s schedules. One solution has been to use virtual humans as stand-ins for actual humans in team training scenarios, with the conversation mediated by Wizard of Oz interactors who puppet the virtual humans. The goal is to recreate a sense of social presence so that the person being trained forgets that they are interacting with virtual humans instead of actual humans.
There still needs to be a human in the loop to interpret and respond to the primary person being trained, but the human interactor operating the virtual human can respond by selecting from a number of pre-recorded, scripted responses. Even when real humans are available, using virtual humans can bring more consistency and repeatability to the training scenario, and provide similarly strong results with more efficiency.
Andrew Robb is a post-doc at Clemson University, and he’s been researching how to use virtual humans in these types of team training scenarios. Specifically, he talks about training nurses to stand up to surgeons who want to proceed with a surgery despite replacement blood not being ready yet, which would put the patient’s life in danger if a complication during surgery required a transfusion. These are complicated social dynamics, and if a nurse isn’t comfortable speaking up, it could result in a patient dying. So Andrew has been focused on how to recreate a sense of social presence using virtual humans in order to create a team social dynamic that allows nurses to get practice and training so that they have the confidence to speak up against someone on their team who wants to violate safety protocol.
Andrew mentions a paper by Frank Biocca and Chad Harms titled “Defining and Measuring Social Presence: Contribution to the Networked Minds Theory and Measure,” which sets out some definitions for social presence and a networked minds theory for understanding the mechanics of social presence in a virtual environment. They say that: “Most succinctly defined as a ‘sense of being with another in a mediated environment’, social presence is the moment-to-moment awareness of co-presence of a mediated body and the sense of accessibility of the other being’s psychological, emotional, and intentional states.”
Networked Minds Social Presence theory describes ingredients for co-presence with other ppl: https://t.co/gkvUNNotOM pic.twitter.com/v99b9XgEIt
— Kent Bye (Voices of VR) (@kentbye) August 11, 2016
I had a chance to catch up with Andrew at the IEEE VR conference, where he talked about his experiments in using virtual humans within team training scenarios, some of the research on how humans self-disclose more information to virtual humans, how gaze behavior could provide an objective measure for social presence, and more details about other theories of social presence and co-presence that provide frameworks for how we create models of people’s minds, feelings, and motivations.
Support Voices of VR
- Subscribe on iTunes
- Donate to the Voices of VR Podcast Patreon
Music: Fatality & Summer Trip
Rough Transcript
[00:00:05.452] Kent Bye: The Voices of VR Podcast. My name is Kent Bye, and welcome to The Voices of VR Podcast. So when I was at the IEEE VR conference this year, I had a number of different conversations talking about this idea of social presence. It's essentially this feeling that you get of being co-present with other people, and you kind of forget that the interaction is being mediated by virtual reality technologies. And so today I'll be talking to Andrew Robb, who's a postdoc at Clemson University, and he's been researching social presence and specifically applying these ideas and concepts to virtual humans and being able to do training situations where you're able to recreate entire social dynamics with virtual humans. So he's essentially trying to train nurses to be able to stand up to surgeons when they're trying to do something that would be putting the patient's life in danger. And he's actually found that in these training exercises, standing up to a virtual human is of comparable difficulty to standing up to a real human. So that's what we'll be covering on today's episode of the Voices of VR podcast. But first, a quick word from our sponsor. This is a paid sponsored ad by the Intel Core i7 processor. If you're going to be playing the best VR experiences, then you're going to need a high-end PC. So Intel asked me to talk about my process for why I decided to go with the Intel Core i7 processor. I figured that the computational resources needed for VR are only going to get bigger. I researched online, compared CPU benchmark scores, and read reviews over at Amazon and Newegg. What I found is that the i7 is the best of what's out there today. So future proof your VR PC and go with the Intel Core i7 processor. So this interview with Andrew happened at the IEEE VR academic conference that was happening March 19th to 23rd in Greenville, South Carolina. So with that, let's go ahead and dive right in.
[00:02:01.362] Andrew Robb: I'm Andrew Robb, I'm a postdoc at Clemson University. A lot of my research focuses on virtual humans and team training. One of my specific research interests is how agency affects behaviors, where when I say agency I'm talking about if someone's real or virtual. And so team training is a really interesting and important area to do agency research because you have so many potential variations in agency. And it's a point where if there are differences, that really makes a big difference because we're training in often complex and very important situations like medical training or military training. And so if agency is changing the experience people have while they're getting training, then that's going to lead to inconsistent behavior in the real world.
[00:02:48.659] Kent Bye: So maybe we can take a step back and talk about the role of virtual humans in training. Like, why is it helpful or useful to use a virtual human in a training scenario?
[00:02:58.021] Andrew Robb: That's a great question. There's actually a lot of reasons why. It's been interesting in my research, our thinking about this has evolved over time. Originally, the approach was the only time you'd want to use a virtual human is when a real human isn't available. And that is a big part when we're talking about team training. One of the issues you get in team training is in any real environment everyone's extremely busy and one key person who's missing can derail the entire exercise. So with virtual humans it's easy to plug in a virtual human and let the training go forward. But additionally, as we've been seeing more interactions and doing more training, we've realized there actually are a lot of times when you would want to use a virtual human even if a real human is available. One particular reason is issues with consistency. Virtual humans will be very consistent. They'll do exactly what you want them to do every time, or at least they'll do exactly the same thing every time. They often don't do what you want them to do because they're still just computer programs. But you would get the same mistakes every time, so you'd still have that consistency. And that's not something you'll get with real people. You'll have people sort of go off on tangents. People will just behave inconsistently from session to session. You also lose consistency when you have just different people. Even if they do exactly the same thing, they'll look different, you'll have different relationships, and all of that can play a large role in why we behave the way we behave. And that's something you can really easily control with virtual humans. Another instance is when you want to do something that people can't actually do. One big example of this that we've done at the University of Florida is what we call cranial nerve patients. So what we're trying to simulate here is neurological damage so that med students can see it in practice before they actually encounter it in the real world. And the problem with neurological damage is it's impossible for a real person to fake it. In this case, we're talking about eye movement patterns where one eye gets kind of stuck and the other eye moves properly. And try as hard as you want, you can't fake that unless you actually have this neurological damage. And it's thankfully a rare event, you don't find many people with it, but that also means it's hard to find someone who actually has it to practice with. But that's something that's really easy to do with a virtual human. It's just a simple change in the code and now all of a sudden, med students can get hands-on experience with it.
[00:05:17.584] Kent Bye: And it also seems like there are things that you're presenting today about how sometimes people interact with virtual humans differently. Like one thing you mentioned was that they disclose more than they would to a real human. So what are some of those behaviors where we act differently with virtual humans rather than actual real humans?
[00:05:35.574] Andrew Robb: One of the really interesting things that I didn't get to talk about today is while we do see differences in the way people interact with virtual humans and real humans, the overwhelming sort of theme is that people will treat virtual humans like they're real people. And so when we see differences, it's almost always a difference of degree rather than kind. So one example I like to give is if you have someone who's biased towards, say, people with dark skin as opposed to light skin, they will automatically exhibit the same biases towards virtual humans who also have dark skin as opposed to light skin. You can see a difference in degree where the bias may not be exhibited as strongly, but it's still there and present. So you've got differences in degree, though not in kind. So I can talk a little bit more about the one example you mentioned, which I refer to as self-disclosure. So this is some research out of, I believe, ICT in California, that looked at essentially a virtual counselor. And so people would talk to this virtual counselor or talk to a real counselor, and they'd look at how much personal information they self-disclosed and sort of the sensitivity of that information. And they found that people are willing to disclose more information and more personal information to virtual humans than real humans. This is actually another good example, going back to your previous question, of when you might want to use a virtual human instead of a real human. What's most likely going on here is that you don't have this issue of sort of being concerned that this other person is sort of judging you and that there's going to be sort of consequences for sharing this highly personal information. That's not as much of a factor with a virtual human.
[00:07:15.818] Kent Bye: It just kind of makes me think of like we're creating these AI Spocks in some way that have no real emotional affect. It's just the facts, you know, and it seems like when creating these virtual humans, it's eliciting different behaviors that we don't normally display when we're talking to intelligent humans that have emotions.
[00:07:35.491] Andrew Robb: So, and some of the differences that we do see, there are very obvious differences, like we were just talking about self-disclosure, where it's caused by just the agency itself. It's this person knows that this entity I'm talking to is not real, so I'm going to be able to disclose more information to it. But a lot of the differences that we see are probably due more just to the limitations in current technology in virtual humans. And as we get virtual humans that are more able to closely replicate human behavior, just doing that better and better, I suspect what we'll see is those differences in degree are going to start to shrink. Specifically, the paper I was talking about today was looking at gaze behavior with virtual humans and real humans. And one of the findings we talked about is that people spent more time looking at virtual humans than they do real humans. We don't know why yet, though one hypothesis we're considering is that it could be caused just because the virtual humans don't convey as much information through nonverbal behavior. And because those signals are missing, people are spending more time looking at the virtual human to try and get that information, which they would instead be able to get very quickly from a human face, and so they can look away. And so if that is a large cause here, then what we'll see is as virtual humans' expressiveness increases, that difference in behavior will start to decrease, and then people will start looking at virtual humans for less time, and that'll more closely mirror looking at real people.
[00:08:59.953] Kent Bye: Well, I guess the thing that I would wonder is whether or not the virtual human was actually mimicking, you know, natural eye gaze, because there's a lot of different algorithms that I know have come out through SIGGRAPH that have propagated into the game development community, through Coffee Without Words or with Technolust, trying to use these kinds of algorithms to mimic how we actually look around and use our eyes, because we actually can emphasize things by when we look at each other. And if the virtual human is just kind of blankly staring out, it kind of gives a little bit of a feeling of someone on the autistic spectrum who may not, you know, be fully aware of the different social cues. And so was there any integration of, like, eye gaze behaviors within these virtual humans?
[00:09:39.425] Andrew Robb: We did have some. There were definitely improvements that we can make and will make in the future. So I can just describe our gaze model. It was a simple model, but it did capture a lot of what you're talking about with cues and sort of the important social cues. Basically what we had the virtual humans doing is they would look at either whoever was talking or who was being spoken to. So we've got three people in the room. So if the nurse is talking, the surgeon and the anesthesiologist, or virtual in this case, will look at the nurse. And then if the nurse is talking to the surgeon, the surgeon will look at the nurse, and then the anesthesiologist will then sort of look back and forth between the surgeon and the nurse. Similarly, we've got a patient in the room, so if the patient's under discussion, the surgeon and the anesthesiologist can look at the patient. In this case, we're basically able to just sort of pre-code all that because we're working with a very specific training scenario that was very fortunate for this exercise because there was a great deal of sort of consistency or predictability in it. Just because of the scenario, we always know essentially what's going to happen next. And so we're able to predict a lot of that information instead of having to infer it, which is one of the sort of problems with doing good gaze is inferring a lot of those social cues.
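As an illustrative aside, here is a minimal sketch of how a rule-based gaze model like the one Andrew describes could be structured: look at the patient if the patient is under discussion, look at whoever is addressing you, and as a bystander shift between the speaker and the addressee. The role names, function, and event fields are hypothetical, not the actual code used in the study.

```python
import random

def gaze_target(observer, speaker, addressee, topic=None):
    """Pick where a virtual human should look, using simple conversational rules."""
    if topic == "patient":
        return "patient"          # everyone glances at the patient under discussion
    if observer == addressee:
        return speaker            # look at whoever is talking to you
    if observer == speaker:
        return addressee          # the speaker looks at the person being addressed
    # Bystander: shift gaze back and forth between speaker and addressee.
    return random.choice([speaker, addressee])

# Example: the nurse addresses the surgeon; the anesthesiologist is a bystander.
print(gaze_target("surgeon", speaker="nurse", addressee="surgeon"))           # nurse
print(gaze_target("anesthesiologist", speaker="nurse", addressee="surgeon"))  # nurse or surgeon
```

Because the training scenario is so predictable, rules like these can be pre-scripted per scene rather than inferred on the fly, which is the point Andrew makes above.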
[00:10:50.206] Kent Bye: And so what is it that you're measuring? Like, how are you able to deduce, like, what questions are you asking and then how are you measuring those results?
[00:10:59.300] Andrew Robb: So since the focus here in this study was gaze, that's something we're able to do very objectively. Ideally, we would use what's called an eye tracker, which is automatically able to detect where the eyes are pointing. However, in this case, we couldn't use an eye tracker just because of the limitations in the technology. Eye trackers usually have to be sort of very stationary, seated in one place. And this is a scenario where people are moving all around the room. So what we did instead is we have video recorded of the person's face from the virtual human's perspective. And then we can have multiple people go through and sort of do basically close observation of the video, slowly going through, looking at where the people's eyes are pointed, and coding that using a program called Anvil, which is built for video coding like this. And, of course, you want to be able to say if this coding is sort of working or not, if you're actually capturing gaze behavior accurately. So what we do is we have multiple people code the videos, and then we can use various statistical tests to look at what we call inter-rater reliability. And we found in this case that we did have good inter-rater reliability, which means that at the very least we know that the different coders are coding in the same way. And because they're coding in the same way, it seems likely that they're actually capturing that gaze behavior.
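As a rough illustration of the kind of statistical check Andrew mentions, the sketch below computes Cohen's kappa, one common inter-rater reliability measure, for two coders labeling the same gaze frames. The data and labels are made up for the example; this is not the specific test or data from the study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement estimated from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                   for label in set(coder_a) | set(coder_b))
    return (observed - expected) / (1 - expected)

# Toy example: two coders labeling where the nurse is looking in each video frame.
coder_a = ["surgeon", "surgeon", "patient", "away", "surgeon", "patient"]
coder_b = ["surgeon", "surgeon", "patient", "surgeon", "surgeon", "patient"]
print(round(cohens_kappa(coder_a, coder_b), 2))  # 0.7; values near 1.0 indicate strong agreement
```

Higher kappa means the coders agree more than chance alone would predict, which is the sense in which the coding is capturing gaze behavior consistently.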
[00:12:15.225] Kent Bye: If somebody is going through these training situations and they're failing, what does that success look like and what does failure look like in this specific example?
[00:12:23.354] Andrew Robb: So in this particular session, what we were doing, the training goal was to help prepare nurses how to speak up about patient safety issues. So what's actually happening in the training part is this team is preparing a simulated patient for surgery. And towards the end of the preparation, the surgeon learns that the anesthesiologist had forgotten to send blood samples down to the lab, which means replacement blood is not ready for the surgery. And that's a big issue because if the patient needs replacement blood and they don't have it, he's probably going to die. If he doesn't, he'll be in extremely critical condition. So this is a big issue, not having blood available. And it also can take a while to actually get blood processed. It takes a while to actually process the samples. There can be unexpected complications, which drastically increase the amount of time it's going to take. So it's very important that you don't start surgeries that need replacement blood until you actually have it available. But in this case the surgeon is frustrated, he's had a long day, he's just kind of angry and decides that he's going to start the surgery anyways because he thinks that there will be enough time to get the replacement blood, send the samples down, get it processed, get the blood before he's actually going to need it. And so he sort of says, all right, whatever, just send the samples down now, and we'll go ahead and get started. And then you sort of have a pause where you have the nurse has an opportunity to actually speak up and challenge the surgeon. In this case, we had sort of if they don't speak up, the surgeon will explicitly say, all right, are there any objections, or we're going to go ahead and get started. So, failure in this case looks like letting the surgeon actually begin the surgery. One of the interesting things, I'm glad you asked about failure versus success here, is there's actually a lot of sort of degrees of failure that all still in the end constitute failure that we weren't expecting. When we sort of designed this, we're thinking, okay, everyone's going to stop the surgeon or they're just not going to say anything. And what we actually found is there's a lot of sort of degrees between not saying anything and stopping the surgeon. Some of the sort of the alternative failure behaviors we saw were they'd just speak up originally and then give up and let the surgeon start anyways. Some people would shift responsibility to the surgeon where they'd say, after sort of defending why we needed the blood, they'd say something along the lines of, well, you're the surgeon, it's your decision, but I still think this is a bad idea. And that's actually not true. The whole team is responsible for protecting the patient's safety. And so the nurse should feel capable and empowered to actually go and stop the surgery if something like this is happening. You'd also have people who would sort of sanction the surgeon by saying, all right, we can start, but I'm going to file a report about this afterwards. So the surgeon would sort of hopefully get punished in some capacity afterwards. And then finally, at the very end, you only get to the point where people would, what we'd call, stop the line. And the surgeon in this case is refusing to listen. He's not going to stop. The only way to stop him is to get someone in management basically come in the room and say, no, you can't do this. So their nurses would call the charge nurse usually. 
And then that's when the simulation ended. But then you'd have, in the real world, the charge nurse would come in and stop the surgeon. And this also sort of illustrates why this training is so important. When we actually look at who did how many of these different things, we found that about a fourth of the nurses actually stopped the line. And I always want to strongly emphasize this is not a reflection of the ability of the hospital that we're working with. Because when we talk to nurses, talk to people from various hospitals, they often guess that fewer than 25% would actually stop the line. We've heard 5% a lot of the time. So the fact that we're not seeing many people actually stop the line is more a reflection of the difficulty of this task. There's lots of reasons why it's hard for a nurse to actually challenge a surgeon here. Institutional reasons, sort of personal cultural reasons. This is just a challenging situation and that's why training here is so important, because it's not something that people are going to learn how to do in the real world, but instead they're able to practice it with virtual entities or just in a controlled setting so they can get that experience and do it better in the real world.
[00:16:22.821] Kent Bye: Yeah, the thing that I'm really taking away is that there's a lot of power dynamics that exist within these different types of situations, but also a lot of, like, social groupthink behavior that can happen, because, you know, there may be someone who is taking charge, and are you gonna stand up to what could be seen as a superior in whatever hierarchical power structure is in the room? But so let me just get this clear. There's a virtual anesthesiologist, and then there's an actor who's playing the role of the surgeon, who is just kind of brute-forcing trying to go ahead, and the person who's getting trained is the nurse, who is standing to the side interacting with this virtual anesthesiologist and this real human actor, and the goal is for them to intervene and stop this situation.
[00:17:06.977] Andrew Robb: So we actually had three conditions. So in general what you're describing is accurate, but we had three conditions instead of just the one condition like you described, with a human surgeon and a virtual anesthesiologist. We also had a condition with a virtual surgeon and a human anesthesiologist, and another one where everyone was virtual except for the nurse. We didn't have sort of the fourth condition where they're both human, actually, because we ran into the same logistical problems that you do doing real training, in that it was hard to actually get the surgeon and the anesthesiologist together at the same time to do this training with the nurse. Also, we just had a limited participant pool, so we chose to focus on these three conditions.
[00:17:43.623] Kent Bye: And so what were the differences, or what did you find?
[00:17:46.055] Andrew Robb: So it's actually amazing how few differences we saw in behavior. The primary difference we saw is this difference in gaze behavior that I referenced earlier, where people are looking at virtual humans more than real humans. When we actually looked at the breakdown in the rate at which people spoke up, we saw basically no differences. We did see that people may have been speaking up at slightly higher rates when there was a human anesthesiologist, but the agency of the surgeon apparently had no effect on how likely people were to speak up. We've actually done a second follow-up study to this that was working with a similar situation involving speaking up. Instead we were changing the agency of sort of a fourth teammate, so we unfortunately can't use it to directly back this up. But there again we saw that the agency of this fourth teammate, in this case a surgical technician, who's sort of a support person, didn't really have much of an effect on speaking up. And so one of the big takeaways from this is that we're seeing that agency here is not the driving factor influencing how people speak up. Instead, it appears to be more about their own sort of mental models of what you do in this situation, where if someone has experience speaking up and they're comfortable with it, they're going to speak up here. If people don't do that in the real world, they're not really doing it here either.
[00:19:05.921] Kent Bye: So you're kind of creating these social situations virtually to give them an opportunity to learn how to speak up, it sounds like.
[00:19:11.946] Andrew Robb: Right, exactly. That was one of the big research questions with that first study is just, can we recreate sort of the same social difficulties with virtual humans? Because if 90% of people stop the line with a virtual human and 10% stop the line with a human, practicing with the virtual human may not really carry over to any improvements in the real world because it's so easy in comparison. So finding that it's of comparable difficulty is a really important finding for this type of training.
[00:19:38.888] Kent Bye: And so how dynamic and interactive are these virtual humans? Are they taking natural language input and being able to process that and somehow give the proper response?
[00:19:49.080] Andrew Robb: So that sort of natural language input system is still one of the big problems with this type of training. Ideally we want to get there, and we will eventually, but we're not there yet. So what we've done in our studies is we use what's called a Wizard of Oz, where you have someone who's actually sitting behind a curtain, that's the reference to the movie, controlling the virtual humans. So in this case we have 100 to 200 things that the virtual human knows how to say. And the person behind the curtain is listening to what's going on and telling the virtual human how to respond. So we've got basically just a giant set of menus that's organized and categorized. And there is some predictive stuff going on where responses are suggested to the wizard based on what's happened previously. But because we can do this Wizard of Oz style for this type of interaction, we can get very high accuracy rates. We rarely ran into anything that the virtual humans couldn't answer, and we also rarely had sort of the wrong thing spoken. Every now and then you'd have sort of a fat-fingered response where the wrong thing gets selected, but you can get very high accuracy using Wizard of Oz, as long as you've taken the time to really explore sort of the solution space of what people are going to want to say, and you've prepared for that in advance.
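To make the menu-plus-suggestions idea concrete, here is a minimal sketch of how a Wizard of Oz control panel could organize pre-recorded responses and surface likely follow-ups based on the last line played. The categories, lines, and transition table are hypothetical; the interview does not describe the actual system at this level of detail.

```python
# Hypothetical pre-recorded lines the virtual surgeon can play, grouped by category.
RESPONSES = {
    "greeting":   ["Good morning, let's get the patient prepped."],
    "pressure":   ["We don't have time to wait for the lab.",
                   "I've had a long day, let's just get started."],
    "challenge":  ["Are there any objections, or are we getting started?"],
    "concession": ["Fine, we'll wait until the blood is ready."],
}

# After a line from one category, which categories the wizard is most likely
# to need next. Used only to reorder the menu, never to pick automatically.
LIKELY_NEXT = {
    "greeting":  ["pressure", "challenge"],
    "pressure":  ["challenge", "concession"],
    "challenge": ["concession", "pressure"],
}

def menu_order(last_category):
    """Return menu categories with likely follow-ups listed first."""
    preferred = LIKELY_NEXT.get(last_category, [])
    return preferred + [c for c in RESPONSES if c not in preferred]

# The wizard just played a "pressure" line; the menu now leads with likely follow-ups.
for category in menu_order("pressure"):
    print(category, "->", RESPONSES[category][0])
```

The human wizard still makes the final selection; the suggestions only reduce search time across the hundred-plus available lines.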
[00:21:03.914] Kent Bye: So it sounds like normally you would get like actual real surgeons or people that, would you get actors or real surgeons?
[00:21:10.884] Andrew Robb: So when you're building these scripts you want to get real people as much as possible because someone who's sort of just an actor isn't going to be able to anticipate as well sort of the edge cases. What we've usually done is we'd have basically focus groups where we would bring in sort of the nursing management that we're working with and have them sort of role play and try and brainstorm about every possible thing someone could ask, even if it's not important to ask it, someone still might. And then we'll also pilot it with some people. So we've built our script, we bring in some nurses, some actual nurses who would be participants from our pool of participants and have them sort of go through it and ask them, what are some other things you might have done here that, was there anything that was inaccurate or incorrect? So it's a very iterative process.
[00:21:54.715] Kent Bye: Yeah, I guess the larger question I was getting at in some way is that there still needs to be a human behind the scenes operating it, that it's not completely automated with AI and everything so that you can just sort of send people off and recreate these social situations. And do you foresee, like, in the future being able to kind of recreate these social dynamics automatically? Or do you think that we'll always kind of need, like, the Wizard of Oz person behind the curtain, you know, operating the actual social dynamics?
[00:22:19.413] Andrew Robb: So I do think we will be able to get rid of the Wizard of Oz eventually. One of the big reasons you need a wizard in this type of training is just converting speech into text. So we have other training scenarios, like I mentioned some virtual patients earlier, where you can do sort of a chat-style interaction, and those we do automatically. You do get more errors, because there are sort of two sources of errors. There's speech recognition errors, like speech-to-text. There's also speech understanding, of sort of what did this person say? How do I respond? So we can get rid of the speech recognition errors by using text input, like instant message style chatting, but you still can't get rid of the speech understanding errors. Though we are seeing that the error rate goes down over time as we do a better job figuring out how to create scripts and how to understand speech. That being said, I think we can get rid of the wizard. I don't think we'll ever get rid of the script designer. So I think you will essentially always need to have that human in the loop creating the virtual characters in the first place. It'll become easier to make scripts as we can use AI to sort of harvest interactions and sort of make suggestions of how to refine the process. But I'm personally skeptical that we'll ever have AI that can sort of create itself, if that makes sense.
[00:23:36.481] Kent Bye: Yeah, yeah totally. So in terms of like measuring the sense of social presence, is that something that you are also trying to measure and how do you measure that?
[00:23:45.288] Andrew Robb: Yes, that's actually one of my big research interests as well, social presence. Still the predominant way to measure social presence is through surveys. And there are definitely a lot of issues with surveys. Mel Slater especially has talked about this, in that you really want a physiological or a more objective metric. And there are better solutions to this with presence, where it's just sort of feeling like you're in a place, where you can do a lot of physiological readings, like you have someone standing on a cliff and you're collecting heart rate or galvanic skin response, and if those go up compared to if they don't, you know, someone's experiencing more presence. However, it's really hard to come up with good sort of physiological objective metrics for social presence that are consistent in all situations, because physiological responses are much less consistent from social situation to social situation. So that is still sort of one of the issues, I think, with social presence research, is we need to do a better job figuring out how to measure social presence beyond surveys. That's actually one of my interests in gaze, is exploring if gaze can serve as a more objective metric, because gaze behavior is something that can be held sort of more constant between various social situations. Again, there are plenty of variations that occur, but I think there's some potential there, but that's a whole area for future research.
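As a sketch of what a gaze-based metric might look like, the snippet below turns coded gaze intervals into the share of time a trainee spends looking at each target, which is one plausible objective proxy of the kind Andrew is describing. The interval format and target names are hypothetical, not the analysis pipeline from the study.

```python
from collections import defaultdict

# Hypothetical coded gaze intervals for one trainee: (start_sec, end_sec, target).
intervals = [
    (0.0, 4.2, "virtual_surgeon"),
    (4.2, 5.0, "patient"),
    (5.0, 9.5, "virtual_surgeon"),
    (9.5, 11.0, "away"),
]

def gaze_proportions(intervals):
    """Fraction of total coded time spent looking at each target."""
    totals = defaultdict(float)
    for start, end, target in intervals:
        totals[target] += end - start
    total_time = sum(totals.values())
    return {target: seconds / total_time for target, seconds in totals.items()}

for target, share in gaze_proportions(intervals).items():
    print(f"{target}: {share:.0%}")  # e.g. virtual_surgeon: 79%
```

Comparing such proportions across real-human and virtual-human conditions is one way a gaze difference like the one reported earlier could be quantified.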
[00:25:07.335] Kent Bye: Yeah, and I know that Mel Slater talks about the two big components that he sees of presence as the place illusion and the plausibility illusion, which is creating a sense of place and then also having a coherent, like, believable world where it all makes sense. But when I did the Toy Box demo at Oculus Connect 2, where you have this highly dynamic environment, you have your hands, and so you have this real sense of presence, but there's also another person there that's interacting with you, like, in real time. And to me, that's like the highest level of presence that I've ever felt in an experience. And so I'm wondering from your perspective, if your definition of social presence is something that is an additional illusion that is within Slater's two components, or if it's something that's different. And I'm just wondering how you define it.
[00:25:52.604] Andrew Robb: Sure. So I usually think of sort of, if you have like the umbrella concept of presence, I usually think of it as there's three types. There's what you'd call place presence, which is what we usually refer to as just presence. It's feeling like you're in an environment and you can break it down further like Mel Slater does. You can have sense of self presence, which is related more to your sort of body and your feeling of sort of ownership over that. So like if you don't have an avatar at all, you probably have fairly low self presence. If you have a very responsive avatar, you'll have probably very high self presence. And then you have social presence as well, or you could call it co-presence to some degree. You can argue about if they're the same thing or different, but they'd sort of fit under this umbrella of just feeling like the social interaction or being with other people is either sort of real and actually there, or it's kind of fake and non-present.
[00:26:39.183] Kent Bye: Yeah, and it seems like the virtual body ownership illusion is kind of like invoking that sense of self-presence, and is that kind of like how you think of it as well?
[00:26:48.282] Andrew Robb: Right, yeah, self-presence is very much related to this, that I have a body and I'm controlling it accurately, that it's actually mine.
[00:26:55.205] Kent Bye: And so how do you, what are the components of social presence, like how do you kind of break that down then?
[00:26:59.541] Andrew Robb: That's a good question. There isn't really a well-defined taxonomy of social presence. Harms and Biocca have a good taxonomy of social presence. They call it the networked minds questionnaire, which breaks social presence down into multiple characteristics. I can't remember all of them off the cuff, but it's things like co-presence, attentional awareness, so feeling like there's sort of shared attention or mutual awareness. Other things sort of that get at that connection between people in terms of sort of attention, behavior, just sort of mere presence and understanding of each other.
[00:27:32.266] Kent Bye: So what are some of the biggest open questions around social presence that you feel is kind of driving your research?
[00:27:39.095] Andrew Robb: So one of my big questions is driven by some observations I've had actually in this study involving speaking up to surgeons. I think one limitation with social presence right now is we tend to view two concepts that are actually different as the same. We tend to think that as long as you're perceiving that the signals you're getting from a virtual agent are the same signals you get from a real person, so you're sort of sensing that this person is real, then sensing that something is real is the same thing as believing that it's real. And I'm seeing a lot of interesting things coming out of talking with participants that indicate this isn't always true. There are some people who are just like, wow, this was like amazingly realistic. I was just talking to this person who didn't exist. And you'll have other people on the other end of the spectrum like, this was just really creepy and awkward and I don't want to do it again. Thankfully there aren't many of those, but you do get them occasionally. But then you'll have other people who say really interesting things like, my mind was telling me that this guy was real, but my eyes were telling me that he's virtual. And so there's this disconnection between the sensation that something's real and the belief that it's real. And this is not something that has really been explored, and I think it could be really important.
[00:28:52.563] Kent Bye: Is there any personal stories or memories that you have that you have kind of like the highest level of social presence?
[00:28:59.240] Andrew Robb: That's a good question. Nothing's coming to mind. Partially, I think, because we were so sort of, you could say, almost even lucky with this last scenario. Almost everyone had really high social presence. It was one of those instances where the scenario just sort of clicks. So one of the, I think, big sources of presence and social presence is sort of behavioral contingency where you do something and you get the appropriate response back. That's what really makes you feel like you're actually there. And this scenario, we were able to just predict really well what people would do in all the circumstances. So people generally always got a good response back. Part of that as sort of an aside is that we were working with an argument here, where the surgeon is just sort of mad and angry and sort of yelling at you. And so when you're in an argument, you're often somewhat irrational. And so you can just sort of say anything off the cuff, and it doesn't really have to relate to what was said before, and it still feels natural and realistic. So that very much worked in our favor. One interesting point I could refer to is that we would sometimes see people in sort of arguing with the surgeon, talking about things that the surgeon, because he's virtual, obviously doesn't have. Like, so one particular instance is this nurse said, well, would you do this with your son? Like, would you go ahead and proceed the surgery with your son if he was on the table? And obviously he's a virtual surgeon, he doesn't have a son. So it's interesting to see sort of those automatic responses where people just refer to real things that virtual people obviously don't have, and it's just natural and instinctive. And no one sort of pauses to stop, or at least many people don't pause to stop and say, wait, that didn't make any sense. Let me go to a different argument that would actually work with the virtual human.
[00:30:37.009] Kent Bye: Yeah, and for people who are interested in learning more about social presence, I know that social multiplayer is a huge trend within the consumer VR and something that a lot of people are really interested in. And it seems like there's a lot of really interesting things that have been researched along this line of social presence. So if people wanted to get more information about it, is there any researchers or places that you would point them to?
[00:30:59.160] Andrew Robb: Two of the big researchers that have been very inspirational for me are Mel Slater and Jeremy Bailenson. They've both done a lot of research, Mel Slater mainly with more presence and Jeremy Bailenson has done a lot with social presence. And both of them honestly are just inspirational for the types of research they design. They are able to come up with very interesting scenarios and interesting metrics that are able to really get down to some really interesting findings. They're sort of able to tease things out by asking questions in really unusual ways.
[00:31:30.160] Kent Bye: And finally, what do you see as kind of the ultimate potential of virtual reality and what it might be able to enable?
[00:31:37.121] Andrew Robb: Most of the focus of my research is on training, so I'll answer this from a training perspective. We all sort of have a need to improve in a lot of different areas of our life, and most of those improvements only come with practice. And one of the difficulties with actually getting better at these things is finding the opportunities to practice. This is especially true, I'm thinking here, of interpersonal skills. That could look like challenging surgeons when they're making mistakes, or it could just be small talk with your neighbor. There are plenty of people who find that awkward. There's also been work done with giving presentations and sort of that social anxiety. And I think as virtual reality gets better, we're going to be able to sort of improve ourselves a lot more often in a lot more areas, because we'll have those opportunities for practice and for reinforcement that just don't exist right now.
[00:32:31.584] Kent Bye: Great, well thank you so much. My pleasure. So that was Andrew Robb. He's a postdoc at Clemson University, and he's been studying how to recreate convincing social dynamics using virtual humans and researching this concept of social presence. So there's a number of different takeaways that I had from this interview. First of all, wow, it is really interesting to see how you can use virtual humans to be able to kind of recreate these different types of training scenarios. Now, in this case, they're actually kind of doing more of a mixed reality situation where they're actually in a physical operating room and they may have this big LCD screen that has these virtual digital avatars of doctors or anesthesiologists that are being puppeted by a Wizard of Oz interactor in the background. So at this point, the artificial intelligence natural language processing isn't quite sophisticated enough to be able to have it fully automated. So there's a human that's essentially listening to it and then giving a response from a fixed set of different available responses. And so they're able to kind of standardize the different reactions. One thing that I thought was really interesting is that there's a certain level of irrational argumentation that's happening from the doctor, and that going into kind of like these emotional, irrational arguments actually made it more believable, to the point where some of the nurses were starting to debate with these surgeons, saying things like, well, what would you do for your son? Which is interesting, because obviously it's a virtual avatar who doesn't have any children. It reminds me of the little anecdote that Charlie Hughes shared back in episode 409, where there was a teacher that had been talking about one of the students within the experience who wasn't really real, and so it kind of created this moment where she found herself talking to somebody else as if Sean was real, but yet it was just kind of an imaginary virtual character that she had all these different interactions with. And so the really interesting thing that I see emerging that's new within interacting with virtual humans is that I think that if you're able to create this convincing enough sense of social presence, then there really are no boundaries in our minds determining whether or not something is an interaction with a real or fake character. And so I think as time goes on, we're going to continue to have these lines blurred into what's real and what's created synthetically from these technologies, these virtual humans as well as artificial intelligent non-player characters. And I think that it's really interesting that they've found that it's a comparable difficulty to be able to stand up to a virtual human, even if it's virtual and completely fake in a certain sense. It's still kind of calling forth enough of a convincing social presence that the nurses still have to overcome the difficulties that they face when trying to actually stop the line and stand up to what may be kind of perceived as a hierarchical power differential between the nurse and the surgeon. And just thinking about interactions with virtual humans, I think that there's a couple of interesting points that have been brought up briefly here in this interview, just that people, when they're interacting with a virtual human, tend to self-disclose more information about themselves.
It's just that they feel more safe to talk intimately with a character who they know is not going to judge them in any way, and so it allows them to open up a little bit more. Also, I think that what Andrew is finding is that people tend to look longer at virtual humans than they do at real humans. Maybe it's because we're not getting a lot of the other subtle non-verbal body language cues from a virtual avatar. And so Andrew thinks that people may be looking at these virtual humans longer because they're not getting as much of that information that's usually being conveyed and contextualizing the communication that's happening. So on that note, Andrew thinks that one objective indicator for how much social presence you have is to be able to look at the different gaze behavior and somehow look at the length and the duration and where people are actually looking, and see if that's some sort of indicator for the level of social presence that people have. I think generally when people talk about different levels of presence, there's surveys that you take after the fact, but there's also other more objective biometric data that you might be able to get in real time that may be a little bit more reliable, which I think people are trying to figure out for presence in general, but Andrew's also trying to look at that for social presence. So Andrew also mentioned this paper from Biocca and Harms called Defining and Measuring Social Presence, a Contribution to the Networked Minds Theory and Measure. So this networked minds theory has, like, three different levels of social presence and essentially has this model that when you have these social interactions, you kind of have a predictive model of what is happening with the hidden intentions of the person that you're talking to. And so humans are very complex, and there's a lot of things that people say that are inferred from the context and the intonation and tonality, nonverbal body language, all these things that we kind of fuse together when we're interacting with other humans. And so I think having a clear idea of what all these different ingredients are will help to create these social environments that are sure to include all those nonverbal social cues, but also have a way to kind of understand what is actually happening and why we may not feel full social presence with an AI NPC character that we know is fake, and where we may have more social presence with an AI character that's being puppeted and controlled by a Wizard of Oz human that you know is understanding you but perhaps has a more constrained and limited set of responses they can give you, versus interacting with other real humans. Right now I think it's pretty clear what's a human and what's an AI bot, but that in the future may change as there's more and more sophisticated ways to be able to mimic the human voice and add the emotion and tonality and the fluctuations. Just like there's a lot of CAPTCHAs that are happening on the internet to detect spam bots, I think that we're going to be interacting with a lot of different characters within virtual reality where it's going to be a little bit unclear whether or not they're an AI or whether or not they're really human. And so Biocca and Harms have defined broadly what they see as social presence, which is this sense of being with another. And they don't say that it's another human specifically, but it could be another human or artificial intelligence.
But you have the sense of being with another within the mediated environment. And that mediated environment most often is through a virtual reality technology. So that social presence is that moment-to-moment awareness of co-presence of a mediated body. So it's either a digital avatar or an artificial intelligence, but some sort of entity where you're getting some sort of access to this being's psychological, emotional, and intentional states. So if you get the sense that this other being has a psychological life, an emotional life, and also has some sort of intentions for who they are, why they're saying what they're doing, these intentional states, I think, are one of the really interesting things to see as such a big part of the co-presence model, because you're trying to figure out, like, who is this person? Why are they saying what they're saying? What is really motivating them? What's driving them? All of this is information that's informing someone's intentional state. And in order for you to feel co-present with other beings within a virtual environment, I think that you have to kind of believe that they have some sort of intelligence. And if they're intelligent, then they're going to have some sort of intentional states. They're going to have an emotional life and perhaps some sort of psychological personality profile as well. And so this networked minds theory really starts to talk about how you have this mental conceptualization of other beings that you're interacting with in virtual environments, whether they're human or artificial intelligence. So that's all that I have for today. I just wanted to thank you for listening. And if you'd like to support the podcast, then become a donor. You can send just a few dollars a month, and all of the donations add up and make a huge difference in allowing me to continue to bring you the Voices of VR podcast. So send me a tip. Go to patreon.com slash Voices of VR.