#293: AI & the Future of Interactive Drama

Andrew Stern doesn’t enjoy most AAA video games because he wants to be able to say anything at any moment within a social simulation and participate in an interesting story. About once a week, he’d like to engage with a high-agency interactive drama with artificially intelligent NPCs. Rather than long, extended play sessions, he’d prefer a short 20-30 minute experience that he can play over and over again, trying different strategies with characters who feel real and plausible.

This isn’t just a pipe dream because in 2005 Andrew was a co-creator of Façade, which is one of the only interactive drama games that has natural language input and offers both local and global agency to the player. For the past couple of years, Andrew has been working with his Façade collaborator Michael Mateas as well as with Larry LeBron on a DARPA-funded AI program. IMMERSE is a gestural-based, virtual training simulation for soldiers to learn de-escalation social skills in non-English speaking environments. I had a chance to catch up with Andrew and Larry to learn more about using AI to create plausible characters, IMMERSE, as well as their new company called Playabl where they’re continuing to develop their Unity AI toolkit for creating fully interactive dramas.

LISTEN TO THE VOICES OF VR PODCAST

When I saw a demo of IMMERSE at the Portland Virtual Reality Meetup, I felt like I was seeing the future of what’s possible in creating plausible NPC characters. Andrew, Larry, and Michael created a number of different layers of social behavior modules that could be turned on and off. The combination of these modules started to yield emergent behaviors that transcend what would be possible in trying to hard code all of the hundreds of potential branches.

In thinking about presence in VR, according to Mel Slater’s theory, the two major ingredients are the Place Illusion and the Plausibility Illusion. Creating plausibility within a VR experience means that you have to create an environment that feels coherent and believable. Rob Morgan says that this means that NPCs in VR need to be responsive to your actions even more so than in a 2D game. Ross Mead talks about all of the different body language and social behaviors that are important to this. And in terms of natural language input, Façade was a pioneer in allowing you to say anything at any time and have the story adapt and respond to your character.

I think that this type of AI work and these kinds of characters are going to be a huge part of creating experiences where you’re a “Character with Impact.” For anyone interested in learning more about how to architect and create an interactive drama with both local and global agency, I’d highly suggest spending $5 to buy the “Behind the Façade” Guide, which is an amazing document that talks about the architecture of a highly dynamic story that makes the user feel like they’re an active participant in creating it.

Andrew, Larry, and Michael are still in the early phases of Playabl, but I look forward to seeing what types of tools and stories they end up creating. For people interested in moving beyond passive experiences, their AI technology and framework is one of the most compelling approaches to high-agency interactive drama that I’ve seen so far.


Theme music: “Fatality” by Tigoolio

Subscribe to the Voices of VR podcast.

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast.

[00:00:12.118] Andrew Stern: I'm Andrew Stern. I'm a designer and a programmer and a writer of interactive characters and narratives and social games. I've been in the game industry since 1992. Real quick background, I majored in computer engineering and filmmaking, double major, and was always interested in games, but at that time, when I was younger, wasn't sure how to combine my interest in computer science and story, because back then, in the late 80s, early 90s, I mean, there was interactive fiction, but there really weren't interactive stories yet. But I worked on virtual pets throughout the 90s at a startup called PF Magic. So those are actually the world's first virtual pets. This is before Tamagotchi or Nintendogs. And I was the AI programmer and one of the designers of these characters, and they came out in like 1995 and were a whole franchise until the late 90s. And then I met a guy named Michael Mateas, who was a grad student at Carnegie Mellon, and he and I decided to collaborate on an indie game called Facade, which is an interactive drama, a one-act interactive drama, where in your own words, in natural language, in real time, you improvise a short play with this married couple, these two characters, Grace and Trip. The game was inspired by the play Who's Afraid of Virginia Woolf? And in this interactive drama, they try to make you take sides in their relationship, they're having a big fight, and what you say or do will determine if they stay together or not. And so we released that in 2005 as kind of a free game, to try to launch this vision of interactive drama. And following that, I worked on iPhone. I had an iPhone game studio here in Portland called Stumptown Game Machine and built more virtual pets for the iPhone. And then after that, I rejoined Michael's team. He was now a tenured professor at UC Santa Cruz, and I worked on this big research project called IMMERSE to build social skills training systems for the government. And that just finished.

[00:02:14.079] Larry LeBron: So my name's Larry LeBron. My background's actually not in computer science, though I got into it a little bit later in life. So I actually majored in anthropology undergrad, and from there went into music, music education and performance for quite a few years. But then in looking for a change of pace, I was, you know, I've been a lifelong fan of games and see a ton of potential in interactive media in general and started dabbling in programming and just basically went really deep really quickly and really fell in love with it. I was surprised by how creative I found programming immediately, even from like the introductory levels. Anyway, ended up doing a deep dive into programming, which over a little bit of time actually led me to UC Santa Cruz. And one of the things that led me there actually, so in thinking about what kinds of interactive media I wanted to create, I got really excited at the potential of what I was seeing a lot in the independent game space, really narrative-based games, things that were kind of pushing the limits of expressivity, both for characters and environments, in interactive media. So, you know, some of the things I wasn't really seeing in sort of the more high-end games. And one of the works that I encountered during that exploration was Facade. So, when I learned that one of the people who co-created that, Michael Mateas, who Andrew mentioned, was right there in my town, you know, I went and paid him a visit, and that kind of started our working together, and I ended up being a graduate student in his lab. And after graduating from UC Santa Cruz with my comp sci degree, I joined the project that Andrew mentioned, the IMMERSE project, which was this really massive government-funded effort to look into training good stranger and social skill behavior through interactions with virtual actors. And got into that team, worked a lot on building that kind of content and on expanding our authoring tools. And then since then, I've moved to Portland and am starting a new endeavor with Andrew and Michael.

[00:03:56.889] Kent Bye: Awesome, yeah. To me, just to kind of set the context for where this fits into the virtual reality space, there's a couple of key components to presence. One is to create a sense of place, but that's something that's taken care of with the head tracking and creating virtual worlds and making those believable. There's another element of creating a sense of plausibility, that you live in a coherent world, and not only that, but that you have some sort of agency where you can have some sort of impact on the story. And so, Andrew, you passed along your Behind the Facade PDF, which is an amazing PDF breaking down the structure for how you went about creating Facade. And one of the things that really stuck out to me was this balance between local agency and global agency. And maybe you could kind of describe to me how Facade was able to combine those two.

[00:04:47.815] Andrew Stern: Right. Agency is the holy grail of what we're trying to create with our interactive characters and stories. I mean, it's the hardest. It's giving the player control. When you take action, you actually have true effects on the world. Traditional games do a good job of that in the physical space. With guns and driving vehicles and so on, you have a lot of control, of course, over where your bullets are going and maybe where you're locomoting and bits of object interaction. But, you know, in general games have low agency when it comes to conversation or any kind of communication with NPCs, as well as, even on a higher level, how the plot of a narrative unfolds over time. You know, almost all games are very linear, or you can diverge for a short amount of time from the overarching linear narrative, and then you come back to the main linear thread. So when we were developing Facade, we realized it made sense to distinguish between local and global agency. So local agency, which I guess is in a sense more common in traditional games, is when you do a moment-by-moment interaction, whether it's in the physical sense of moving around or using objects, or conversationally, by speaking something. Do you get some sort of immediate, at least short-term reaction from the world and the other characters? It may not have any impact over the long term, but you can at least get that moment-by-moment change. We call that local agency. And the harder problem is global agency, where all these bits of moment-by-moment interactions that you're doing add up over time and have a major effect on the direction of the entire experience. And so it helped to distinguish those two because there are various degrees of difficulty to implement, and it helps us analyze our own experiences that we're building and how they compare to other games and interactive experiences others are building.

[00:06:43.238] Larry LeBron: I just wanted to add one thing on to, you know, talking about agency is something that's fascinated me as well for a long time. It's something I actually learned when studying with Michael is research into this concept. I think there's a common misconception that what agency means means I can do whatever I want. Like, you've given me this game, now I can do whatever the hell I want and it'll behave exactly as I want it to. Like, essentially, I mean, which that would almost be like being a god in whatever virtual world you're playing in. But really, the holy grail of agency is to strike that balance between supporting everything that you've suggested should be possible. So this definitely gets into this plausibility. So if you're in a world, yeah, if you're placed behind a racing wheel, you should expect to race a vehicle. You shouldn't expect to be able to play a viola or something. So basically, it's striking this balance that if I'm the player, every affordance that you've suggested, that you've kind of implied through your world building that I should have, I should have. If there's something that it seems clear that I should be able to touch or I should be able to interact with and I can't, then there's immediately a lack of agency. But it's possible to achieve what you'd call high agency with that kind of limited suggested affordance. So yeah, that car example or any other kind of scenario you'd imagine where there's a lot of things you just might not expect to be able to do. You can still, you know, feel like you have high agency as long as everything you think you should be able to do, you can do. So it's one of the issues, as soon as there's a character in a world, immediately, you know, because we're used to interacting with people, you know, day in and day out, we have a huge set of expectations for what kinds of things we should be able to do with another character, with another person in a world. 
So as soon as you throw up a menu and only give someone, you know, three choices of what they can say, you've immediately, you know, drastically limited the agency relative to the suggested affordances and expectations.

[00:08:20.152] Andrew Stern: Yeah, I agree with everything you just said there, Larry. The design of the affordances plays so much into, are you pulling off plausibility or not? But plausibility, of course, is bigger than VR, right? I mean, that's applicable to no matter the medium you're presenting your interactive story, whether it's on a tablet or text-based or a traditional console game. So this idea of plausibility, it's something that we think about all the time. And VR has its particular flavor of what those requirements are.

[00:08:50.880] Kent Bye: Yeah, and, you know, the thing that was really striking to me about, you know, looking at Behind the Facade and reading about the structure was that there was this series of things that were seemingly like cosmetic or superficial, and then other things that would be more branching in terms of setting the direction of the story into a completely different branch or path. You know, and also having a chance to have you come to the Portland Virtual Reality Meetup and present on IMMERSE, seeing what you're actually doing there. There's also this preservation of context, you know, from moment to moment, of being able to track that in terms of a human interaction. And I think that the thing that I found fascinating is that in a lot of ways, you're breaking down human behavior, things that are happening naturally. And you're trying to break these down into these sub-modules and then layer them on top of each other. And so maybe we could just at a high level talk about that process of trying to break down human behavior into these computer programs and then add them together contextually.

[00:09:46.736] Andrew Stern: Right. So Facade, which Michael and I built over a five-year period from 2000 to 2005, it was a research project. In fact, Michael finished his PhD at Carnegie Mellon during the project. Facade was part of the work that went towards that. So it was research, and we published a bunch of papers together. It was also an art project. It was a personal art project for myself and Michael, and it was a prototype for a potential new commercial product in a sense. So the research side of it, what we're doing, like you say, Kent, is we're trying to figure out how do you divide and conquer this problem? How do you break up all of the richness of what's going on with interacting with people? If you're gonna make artificial people, or AIs, how do you break it down? Like what are the components, how do you break up all that content into manageable pieces? And so the fundamental content piece we came up with in Facade was the beat, the story beat. And there's a writer named Robert McKee who wrote a book called Story about Hollywood screenplays and how they're structured. And he talked about story beats. And in any one beat, which might last a few seconds, there's always some sort of value change happening. Something's moving. Some state is changing in any one beat. That's how stories move and how they advance. And so in Facade, essentially a beat is sort of like the equivalent of a line of dialogue or maybe an action-reaction pair of dialogue. It might be something that an NPC said and the player needs to react to, or vice versa. That's like a unit. And then we developed other kinds of units like that in the subsequent research project, IMMERSE, with the government, called social games, which are collections of rules that operate over some kind of value. So for example, there's an affinity game, where it's a small set of behaviors that are involved when characters are maybe becoming friends or breaking up friendships, or there's another game called the authority game, or another one called the threat game, and so on. So in other words, when any two characters are interacting, with all the different kinds of things that could be happening, all the kinds of little games and negotiations and communications that we're having, we have to try to break them up in some way. And affinity and authority and threat are examples of what were useful for this training of social skills in a military or police context. But Facade, for example, has other variations on those social games. And you can imagine any one particular story having a particular set of social games it needs. And ideally, these are built up over time and there's a massive library of them, so that when you're building a new story, you can reuse these social games.
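As a rough illustration of the idea of a social game as "a collection of rules that operate over some kind of value," here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Rule` and `SocialGame` classes, the action names, and the numeric deltas are hypothetical and are not taken from Facade or IMMERSE.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Rule:
    """If a character performs `action`, shift the tracked value by `delta`."""
    action: str
    delta: float


@dataclass
class SocialGame:
    """A named collection of rules operating over one value per character pair."""
    name: str
    rules: List[Rule]
    # (observer, actor) -> how the observer currently scores the actor
    values: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def observe(self, actor: str, observer: str, action: str) -> float:
        """Apply every matching rule and return the observer's updated value."""
        key = (observer, actor)
        for rule in self.rules:
            if rule.action == action:
                self.values[key] = self.values.get(key, 0.0) + rule.delta
        return self.values.get(key, 0.0)


def make_affinity_game() -> SocialGame:
    # Hypothetical rule set; the real rules are not given in the interview.
    return SocialGame("affinity", [
        Rule("greet", +1.0),
        Rule("give_gift", +2.0),
        Rule("insult", -3.0),
    ])


def make_authority_game() -> SocialGame:
    # Hypothetical rule set, for illustration only.
    return SocialGame("authority", [
        Rule("ask_permission", +1.0),
        Rule("take_without_asking", -2.0),
    ])


# A small reusable "library" that a new story could pick a subset from.
LIBRARY = {"affinity": make_affinity_game, "authority": make_authority_game}
```

A story would instantiate only the games it needs, for example `LIBRARY["affinity"]()`, and feed it each action as it happens. The point of the sketch is the shape: small rule sets, one tracked value each, reusable across stories.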

[00:12:24.773] Larry LeBron: Yeah. I mean, the way I like to think of it is basically moving towards the social character AI equivalent of our modern day physics and graphics engines. I mean, right now, amazing things are possible there because there's essentially a nice framework in physics where you can outline the way rules work, the way an environment should behave, and then give things certain properties, and then you have physics. And obviously, that's still not perfect, but there's been tremendous headway in that realm. Whereas, when you look at character reactions and character decision-making for AI, for NPCs, essentially, the tradition has been to just basically hand-write long sequences of branching decision-making, because modeling, it's a really hard problem to model all of human social decision-making. I'm certainly not claiming that we've solved that, but we're definitely pushing towards that direction, because if you stick with that kind of hard-coded, you know, hand-authored, branching path sort of decision-making, eventually the authoring challenge just becomes way too great, where if you want to have an experience where a player is interacting with your NPC and feeling like they have some agency and like the character's actually behaving in a plausible, believable way, you'd need, you know, just a ridiculous team of authors to, you know, handwrite every single possible repercussion of every single player choice, especially even compounded by the fact, you know, if you're giving the player, like, natural language input like Facade does. So that's where there's kind of no choice but to move towards this more procedural solution where things are in little modules that then can combine and, like you say, hopefully create this emergent behavior and sense of life and social presence that the characters have.

[00:13:54.297] Andrew Stern: Emergent behavior is important, but at the same time, we are building simulations, no doubt about it. These are machines we're building. You know, these social games that I was mentioning, or these beats, or the drama manager that sequences these beats, are little software machines. And that allows some degree of generativity, which helps solve the authoring problem that Larry was just mentioning. But at the same time, as authors, as creators of these experiences, we can't just rely completely on emergent behavior, because there's things that we want to happen as authors. You know, like The Sims is a super successful game, and it really was in many ways pretty unstructured and could meander, and people loved that about it, that it was so open-ended. I mean, in later versions they started adding a little more goal-directed kind of behavior, more like narratives on top. Still, I found it compelling in the sense of how rich the simulation was, but also pretty fragmented and kind of meandering and sometimes boring. The idea of drama, if you're making an interactive drama, is that drama is efficient and has pacing. And, you know, I would love to play an experience where I don't have to play it for 20, 30, 40 hours. I want to play something for 30 minutes, 15 to 30 minutes, something like that, that's high agency and intense. And it's done and it's finished and I get this great experience out of it. And if I want, it's a rich enough simulation, I can play it again a bunch of times and try different things or whatever. In other words, we're trying to find this balance of building these machines that are simulations and allow for emergent effects, but still have a focus and efficiency and convey an experience, where we as authors are somehow trying to define the kind of experience we want players to have.

[00:15:40.772] Kent Bye: And so I took a look at a lot of different video playthroughs of people, you know, trying different things in Facade. And there's a range, from people who are just kind of trying to mess with the characters to people actually trying to legitimately come to one of the five endings that are in Facade. But as I was reading through the Behind the Facade document, I got a couple impressions. Well, first of all, that this is way more like a nuanced mapping of an entire relationship, dynamics of backstories, and this non-linear web of tension that's speaking to the façade of the relationship, of things that are spoken and unspoken, and things that are hidden and that are revealed, just anger and tensions that are building throughout the whole arc. And that it's less about a linear story, but you're kind of constructing it in a way where you're able to jump around to all those different dimensions of that. And so, you know, I'm just also struck that, if I was thinking about trying to create and write my own interactive fiction, this seems like a process of trying to map this out, like on a whiteboard. I'm just really curious, when you're approaching these problems, how you even start to break it down and then actually go about trying to construct these stories.

[00:16:59.285] Andrew Stern: Yeah, I mean, the quick answer, Larry, is we're creating sort of like a bag of behaviors and bits of content. It's a collection. Database isn't the greatest word, but it's a big collection of small behaviors, little machines, any of which potentially could happen at any time. I mean, the system is open, and again, it's not like a tree or a branch where you can only get to this bit of dialogue after you've traversed a long tree and managed to get there. Theoretically, any line of dialogue in Facade, for example, or any other interactive drama we might build, could be sequenced at any time, but there are sort of preconditions and effects that we author on top of them so that the system knows when it is appropriate or not to play that particular bit of content. It's that dynamic. It's like a huge collection of things that could happen. We're writing these drama managers and narrative sequencers to figure out a good way to both make a story happen that's well-formed and be as reactive as possible to what the player wants to do.
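To make the "bag of beats with preconditions and effects" idea concrete, here is a minimal Python sketch of a sequencer choosing from a flat pool instead of walking a tree. It is only an illustration under stated assumptions: the `Beat` fields, the single `tension` value, the priorities, and the beat names are all hypothetical, and the actual Facade drama manager is certainly more elaborate than a highest-priority pick.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

State = Dict[str, float]


@dataclass
class Beat:
    name: str
    precondition: Callable[[State], bool]  # is this beat appropriate right now?
    effects: Dict[str, float]              # value changes applied when it plays
    priority: int = 0                      # author's preference among candidates


def pick_next_beat(beats: List[Beat], state: State, used: Set[str]) -> Optional[Beat]:
    """Return the highest-priority unused beat whose precondition holds."""
    candidates = [b for b in beats if b.name not in used and b.precondition(state)]
    return max(candidates, key=lambda b: b.priority) if candidates else None


def play(beat: Beat, state: State, used: Set[str]) -> None:
    """Apply the beat's effects to the story state and mark it as used."""
    for key, delta in beat.effects.items():
        state[key] = state.get(key, 0.0) + delta
    used.add(beat.name)


def run(beats: List[Beat], state: State) -> List[str]:
    """Sequence beats until none is eligible; return the names in play order."""
    used: Set[str] = set()
    sequence: List[str] = []
    while True:
        beat = pick_next_beat(beats, state, used)
        if beat is None:
            return sequence
        play(beat, state, used)
        sequence.append(beat.name)


# Hypothetical pool: a flat list, so any beat can play whenever it is eligible.
BEATS = [
    Beat("small_talk", lambda s: True, {}, priority=0),
    Beat("greet", lambda s: s["tension"] < 1, {"tension": 1}, priority=1),
    Beat("argue", lambda s: s["tension"] >= 1, {"tension": 2}, priority=1),
    Beat("revelation", lambda s: s["tension"] >= 3, {"tension": -1}, priority=2),
]
```

Starting from `{"tension": 0.0}`, `run(BEATS, ...)` plays greet, argue, revelation, small_talk. Starting from `{"tension": 3.0}`, as if the player had already provoked the characters, the revelation beat is eligible immediately, which is the sense in which nothing is locked behind a branch.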

[00:18:04.198] Larry LeBron: Yeah, and relating to what you're talking about there, Kent, I mean, I totally agree, like, sitting down and trying to just jump headfirst into that kind of mapping would be really, really challenging. So actually, our traditional flow has been to start from thinking about the kinds of stories and the kinds of dynamics we'd like to tell, almost like, you know, writer's room screenplay kind of style. And we've also done, you know, improv acting exercises and role play. And that kind of helps us get a sense of the space, the space of possibilities. You know, obviously, like Andrew's talking about, no specific playthrough is ever going to, there's no hard-coded playthroughs, but there is a general possibility space and kind of social simulation area that, you know, we know we want to support. And that also comes out from, you know, watching players and seeing what kinds of things players try to do in the space. And because of the way the system is built, you can continually add content. And as long as you add those rules, then that content becomes available under the proper context. So it's kind of like an iterative process, too. You don't need to just sit down and define this tremendous map before you get to hit play the first time. You can definitely build this up slowly. And then you'll see your new bits of content that you've been adding and the new story possibilities you've been iterating on introduce and kind of help you build upon the scenario you're building.

[00:19:10.356] Andrew Stern: But in terms of, like, what is some of the top-down design work that can happen? Some of it is, what are these social games going to be? Like if you're building a particular interactive scenario, what social state is the most important? A lot of it is like you sort of alluded to earlier, Kent. If you take a certain situation, how do you boil it down to the simplest, in a way, set of bits of state? Like, what's actually happening? Like, if two characters are trying to become friends or not, is there a value essentially for your degree of friendship? Or is there a value for how many slights or praises? You know, what do you need to keep track of? So how do you model these social dynamics in a simple enough way? There's a real art to that. You could make it overly simple, and then you don't have much richness. But if you make that model too complicated, now it's too hard to control. There's too many pieces in it, because you're writing some machines that operate over those values. So that's the design challenge, and those are the techniques and the idioms that we've been practicing for a few years now, that we're getting, you know, okay at. And we're in the process of building tools for ourselves that, in theory, in the future, hopefully others could have access to, to author those things. And examples of these social games, a base library of them, would be what we want to get to.

[00:20:30.934] Kent Bye: Yeah, and I think that, you know, when I saw your demo, I got the demo of Immerse at the Portland Virtual Reality Meetup, got this distinct aha moment where seeing how to be able to take this technology and be able to apply it to soldiers who need to learn how to interact with different cultures and what are the different cultural norms, how do they build rapport, how do they do a lot of the soft skills that, you know, may be very expensive to hire actors to be able to run through all these things, so.

[00:21:00.550] Andrew Stern: Yeah, and I mean, another application the government was interested in as well, which seems even more important right now, is they also potentially wanted these trainers for the police on how to de-escalate situations. We didn't demo that one at the VR meetup, but there's another scenario we built where characters are getting angry and you have to calm them down, and that seems pretty relevant these days, based on current events.

[00:21:24.040] Kent Bye: Yeah, but there's this moment where you had gone through these different iterations where you started to turn off different social behaviors and programs and kind of break down a scene. And to me, that was one of the more insightful points of seeing what you had done because you can see the kind of emergent behaviors, but then once you start to break it down, then it becomes so much more clear about how unnatural it may feel if someone is just running one of these programs rather than how a natural human may interact. So maybe you could talk a bit about that process of constructing those combinations of behaviors.

[00:21:59.537] Larry LeBron: Yeah, sure. So obviously, yes, like Andrew's talking about, I mean, basically, what's happening in the program is there's a state space. So there's a bunch of tracking of what has happened. I mean, eventually, it's a computer program, everything's numbers, there's no magic. So at some point, everything is represented by some value, or basically by the presence of some kind of flag in memory, like, oh, you pushed me, or you yelled at me within the last minute or something. So then the trick becomes basically figuring out, like you're talking about, how to get that combinatorial effect. So like Andrew was talking about, these different games, like, you know, affinity, building friendship, or asserting authority, those can somewhat exist in their own bubbles, and within one of these social games you'll have the characters forming intentions purely within that game and then deciding, oh, hey, I want to make friends with you, so I'll say hello to you or I'll try to give you a gift or something like that. And if you run that on its own, then that will kind of just happen and you'll see those interactions on their own. Once you get to the point of wanting to combine that with other effects and get this kind of emergence, so like, yeah, there's an example I think we showed at the meetup, where if you add this authority modeling game, all of a sudden you have a whole separate state of ideas, like for example, over ownership and permission. So like, if I'm in your home, I might be allowed to touch certain things, but other things might rub you the wrong way. So there's a whole game there. And if I touch something that I'm not supposed to have and didn't have permission, then there are certain reactions that you should be motivated to basically act on back towards me. So where we see the cool effects is once we turn both of those on and we see something like, oh, OK, one person wants to make friends and wants to give a gift. Hey, there's something that looks like a gift over there. And he goes ahead and grabs it. If that thing actually turned out to be owned by someone else, then it triggers something in this other neighboring social modeling system around authority, and all of a sudden you get this reaction like, hey, I was trying to just be friendly to this one person, and I actually ended up maybe inadvertently pissing off this other person and starting an altercation. And that's where you get these more rich, complex social dynamics, which are great, because, like we were talking about, that wasn't a hand-authored scenario. It came from the combination of these effects on state overlapping with each other.
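The gift example Larry describes can be sketched in a few lines of Python. This is a toy reconstruction under assumptions, not the IMMERSE implementation: the `World` fields, the event tuples, the two game functions, and the characters Alice, Bob, and Carol are all hypothetical. The one idea it tries to capture is that each game only reads and writes shared state, so switching a second game on changes the outcome without anyone scripting the altercation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Event = Tuple[str, str, str]  # (actor, verb, target)


@dataclass
class World:
    owner: Dict[str, str]         # object -> who owns it
    wants_friend: Dict[str, str]  # character -> who they want to befriend
    nearby_objects: List[str]
    events: List[Event] = field(default_factory=list)  # shared memory of what happened


def affinity_game(world: World) -> List[Event]:
    """A character who wants a friend grabs a nearby object and gives it as a gift."""
    new: List[Event] = []
    for actor, friend in world.wants_friend.items():
        already_gave = any(a == actor and v == "give" for a, v, _ in world.events)
        if not already_gave and world.nearby_objects:
            gift = world.nearby_objects[0]
            new.append((actor, "take", gift))
            new.append((actor, "give", friend))
    return new


def authority_game(world: World) -> List[Event]:
    """An owner confronts anyone who took their property without permission."""
    new: List[Event] = []
    for actor, verb, target in world.events:
        owner = world.owner.get(target)
        reaction = (owner, "confront", actor)
        if (verb == "take" and owner is not None and owner != actor
                and reaction not in world.events and reaction not in new):
            new.append(reaction)
    return new


def simulate(world: World, games, ticks: int = 3) -> List[Event]:
    """Run every enabled game each tick; games interact only via world.events."""
    for _ in range(ticks):
        for game in games:
            world.events.extend(game(world))
    return world.events


def make_world() -> World:
    # Hypothetical scene: Alice wants to befriend Bob; the only nearby object is Carol's.
    return World(owner={"vase": "Carol"}, wants_friend={"Alice": "Bob"},
                 nearby_objects=["vase"])
```

With only `affinity_game` enabled, Alice takes the vase and gives it to Bob and nothing else happens. With only `authority_game` enabled, nothing happens at all. With both enabled, Carol confronts Alice, a reaction that neither game contains on its own.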

[00:24:03.087] Andrew Stern: Yeah, I'm just reminded of the scene in 2001 at the end when the main character starts deactivating Hal and starts turning off his modules one by one and pulling out the little, and Hal starts to slow down. It's really, it's not too far from that. We're building up this intelligence as a collection of competencies that they each have the smarts to mix together and sort of compete with each other to some degree for resources. I mean, a lot of what infrastructure we've had to build is how, when a character has ten things mentally that they could do or want to do in any one moment, how does the character's mind, who gets to control what body part, how do you mix things together naturalistically? I mean, those are all, you know, as humans, as we grow up from babies to toddlers to, kids where our brains are learning how to Walk and chew gum at the same time all the nuances of how do you move your body and do all these things simultaneously? So we have to build those competencies into these ideally they're completely procedurally animated so that we have the Full control in the end of the day. What we're really doing is like layering animations together So, you know in facade or in the immerse project a character could be talking about one topic with their mouth their facial expression might be expressing an emotion that was left over from the previous interaction their eyes and maybe a player or some other things happening when their eyes are looking towards that and They're in the middle of walking somewhere for some other reason. So all these things can be layered together So you have to build an animation engine? to pull off this kind of combination so that characters can express all these things. So there's so many components. There's the understanding and the communication, being able to understand what the player is doing and richly so that the player can have agency. You know, there's the reasoning and all these social games. 
And then there's the need to be able to express all of it. So all this stuff has to get built.
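
One simple way to picture the "who gets to control what body part" problem is priority-based arbitration over body channels. The Python sketch below is only an assumption about how such layering could work; the `Behavior` class, the channel names, and the priorities are all made up for illustration and are not taken from the Facade or Immerse engines.

```python
from dataclasses import dataclass


@dataclass
class Behavior:
    name: str
    priority: int          # higher wins (hypothetical scale)
    channels: frozenset    # body parts this behavior needs, e.g. {"mouth"}


def arbitrate(behaviors):
    """Grant each body channel to the highest-priority behavior that wants it.

    A behavior only runs if every channel it needs is still free, so
    behaviors on different channels are layered together in the same moment.
    """
    claimed = {}
    running = []
    for b in sorted(behaviors, key=lambda b: -b.priority):
        if all(ch not in claimed for ch in b.channels):
            for ch in b.channels:
                claimed[ch] = b.name
            running.append(b.name)
    return claimed, running


# The moment described above: talking, a leftover emotion, looking, walking,
# plus one extra behavior that loses the contest for the eyes.
moment = [
    Behavior("talk_about_topic", 5, frozenset({"mouth"})),
    Behavior("lingering_emotion", 2, frozenset({"face"})),
    Behavior("look_at_player", 4, frozenset({"eyes"})),
    Behavior("walk_to_door", 3, frozenset({"legs"})),
    Behavior("glance_at_noise", 1, frozenset({"eyes"})),
]
claimed, running = arbitrate(moment)
```

Here four behaviors run at once on separate channels, while `glance_at_noise` is dropped because the eyes are already taken by a higher-priority behavior.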

[00:25:52.352] Kent Bye: Yeah, the thing that was really striking to me is that as I was reading through this Facade document, I was just starting to think about my own relationships, starting to break down my own affinity and different things, and how this would be mapped out in a computer program. Or, you know, in the case of this IMMERSE program, these different behaviors of building rapport and affinity and trust and authority. And, you know, the question that comes to mind is, where does that research come from for these kind of fundamental building blocks of our social behaviors? And then how do you take that and then build characters that feel real and have emergent behaviors that also feel real?

[00:26:30.013] Andrew Stern: So, for example, on the IMMERSE project, that was DARPA-funded, and it was like a 50-person team, a nationwide team from universities and industrial research labs around the country. So it included psychologists, subject matter experts, you know, people that know folks in the military or police that have been on the ground that they can talk to and understand what other dynamics need to happen, as well as tons of computer scientists and so on. I mean, part of the psychology team's job was to figure out what do we need to train exactly? What exact social skills, you know? So there's some information there coming from people who do this for a living. But also at the same time, like in Facade, it was just me and Michael, and we were strongly influenced by good drama and just good storytelling in general. And some of it's obvious. I mean, this is still so early days. It's still so much Kitty Hawk, you know, trying to get something off the ground here that it's pretty obvious, like, yeah, if you're trying to become friends or if you're trying to reveal secrets or if you're trying to threaten someone, the simplest way to model those things, it's not that. You know, that said, for any one particular story, you don't need to build a full mind. I mean, ideally you have a lot of capability, but you can just choose a subset of what your particular drama is mostly going to be about. So it's some balance between general purpose and specifically what your drama needs.

[00:27:55.675] Larry LeBron: Yeah, I'll add to that. I mean, as you start delving into this topic, people can get pretty concerned, because it seems like you're trying to tackle what's classically, you know, a hard AI, AI-hard problem, meaning essentially unsolvable. Like, if you were actually trying to model a perfectly complex and fully fleshed out living human being, I mean, that's an incredibly challenging problem that we couldn't possibly hope to solve. And this is, you know, channeling Michael a little bit here. But, you know, one point he tries to drum in with this kind of research is that in order to make interactive drama, you don't need to solve that problem. You don't need to model a full person. Instead, what you're trying to do is model a dramatic actor. And like Andrew's saying, that drastically reduces the scope of the problem. So even if, theoretically, we look at a social game like affinity building or friendship and think, wow, I could sit here for the rest of my life and just talk about different ways to make friends, within that, just for your specific story, you can build a much smaller bit of content that, you know, will still give you a rich possibility space that players will be able to see in your work. But again, so it's like two levels down. So even within friendship, which is huge, you don't need to solve that whole space of problems. And then friendship within full human-level social simulation, you don't need to solve that full scope of problem. So those layers make this tractable, basically.

[00:29:04.980] Kent Bye: Yeah, that's really interesting to bound the problem down to something that's achievable within the context of you're trying to train people.

[00:29:12.042] Andrew Stern: Yeah, well, it actually goes to a larger point, which goes into plausibility and even presence, is we're trying to abstract to some degree. I mean, we're not going for realism, per se. We're going for drama, which it's reality with the boring bits taken out. So that's on the story level. Visually too, to avoid the uncanny valley, like in Facade, Facade had a kind of a cartoon or a graphic novel rendering style. And as we all know that if you don't go for photorealism, you're now reducing the burden of what you need to author and so on. And similarly, the AIs, we're talking about this whole process of modeling these psychological processes, these social games, we're trying to abstract a simplified but still compelling version of them. So that's what makes it doable.

[00:30:02.048] Kent Bye: Yeah, and when I did the Immerse demo, I didn't know anything about what the goal was and what I was trying to do. And after a few people went through, I sort of saw how the goal was to find somebody. And depending on when you show someone the photo, you get different reactions. So if you try to immediately go up to somebody and say, hey, where is this person? Then the reaction was basically you fail. They're not going to tell you. So part of what you're trying to train these soldiers in is, well, how do you get what you want in the end, which is to find where this person is? But before you do that, you need to build rapport.

[00:30:38.186] Larry LeBron: Yeah, yeah, absolutely. And some of the joy of that is, again, it's a different form of sort of interactive exploration. It's creating experiences that are shorter, but through replaying and through trying different options you can explore the social simulation space, versus exploring, you know, a broad physical space like you see in a lot of other experiences.
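
The rapport-before-request dynamic described in this exchange can be pictured with a very small Python sketch. The action names, rapport gains, and threshold below are invented purely for illustration; the real Immerse demo is not documented in this interview at that level of detail.

```python
# Hypothetical rapport changes for a handful of player actions.
RAPPORT_GAINS = {"greet": 1, "bow": 1, "small_talk": 2, "yell": -2}


def play(actions, threshold=3):
    """Replay a short session and report how the character reacts to the photo.

    Showing the photo before enough rapport has built up fails, which is why
    replaying with a different ordering of actions changes the outcome.
    """
    rapport = 0
    for action in actions:
        if action == "show_photo":
            return "helps" if rapport >= threshold else "refuses"
        rapport += RAPPORT_GAINS.get(action, 0)
    return "no_request"
```

For example, `play(["show_photo"])` ends in a refusal, while `play(["greet", "small_talk", "show_photo"])` reaches the threshold first and gets help.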

[00:30:56.122] Andrew Stern: Yeah, I mean there's still a lot of the art to this that game designers are well aware of. Even though the tech we're trying to build here will give us more flexibility and freedom than maybe historically game designers have gotten to have, still, no matter what, as a designer it's a very constrained design space. I mean, you're trying to figure out how to choose a premise for your interactive drama or story that somehow can be contained in a short amount of time, where the affordances that we can offer are, you know, finite, but still feel satisfying. We're always trying to find that design sweet spot. So for example, virtual pets, they were a true sweet spot in the design space of living characters that can be immediately reactive, but have less intelligence than people. You know, so they were perfect. I've shipped like eight or nine virtual pet products in my life because of that, because it's just such a great sweet spot there. Moving on to humans, you know, in Facade, those characters were specifically designed to be somewhat self-involved, so that it's believable that they're not going to be able to do, you know, too many things. They're wrapped up in their world, but they're open enough to allow you as the player to have some degree of agency. And I should say, in the Behind the Facade document that you read, as well as other papers we published, you know, we say we didn't achieve as much agency as we wanted in Facade. It's still pretty limited, so it still has a long way to go. But yeah, those characters are designed specifically to make this problem tractable. Likewise, in the Immerse demo with the social training system for the government, we, for example, picked characters that speak broken English, which makes it easier. You can still communicate with them, but it's, you know, more limited and it's believable.

[00:32:41.841] Kent Bye: Yeah, talk a bit about some of the inputs that you have, because you are both recording your voice and having some degree of voice input into Immerse, but also gestural controls. Talk a bit about what type of inputs you can actually have into these different training simulations.

[00:32:58.017] Larry LeBron: Sure. So specifically for Immerse, yeah, it deviated a bit from Facade. I mean, the government was really interested in a fully embodied interface, which was something new for us. So we should mention also that the Immerse project that we're talking about was a large collaboration. So we were working on the character AI, but there were folks working, for example, on a gestural recognition system at SRI Princeton. That basically employed Kinect 2, and they have a whole system that runs in parallel to our AI and the game engine that's looking at the gestures of the player and trying to analyze them. So it basically has a finite set of discrete gestures that it can recognize, like, oh, you waved, you bowed, you nodded your head, and, you know, a decent vocabulary built up there. And then there's yet another level of sort of raw tracking, like, oh, I see some part of the player's body is extending towards one character or another, and recognizing where the player is in space for things like, oh, you just ran towards me. You know, in a lot of games you run towards someone and there's no reaction, and that can be frustrating. So in this experience, if you run towards someone, they flinch and say, what the hell? And while I'm talking about the collaborators, I should also just mention that, in conjunction, the game engine and a lot of the animation engine stuff was built by BBN over in Cambridge. So there's really three pushing forward on this effort.
So yeah, so it's basically gestural, and then yes, there's voice input also. We experimented a lot over the course of that project with how to use voice. And like Andrew said, since there's this broken English, the government actually explicitly didn't want you to be able to just communicate. Like, there was never a translator offered. I mean, the idea was you're supposed to be interacting with people who are foreign to you and you're in a strange place, since again the idea is training this set of social skills in an unknown environment. So voice was used more for reaction to tone, like if you yell, it'll frustrate them, or you can call and get someone's attention. And we experimented a bit with different ways to use voice in the application. We looked at some pitch changes and things like that to try to detect if a player was excited or engaged. And those became more side experiments. I mean, in the end, the main input was based around gesture and based around head tracking, so figuring out where the players are directing their gaze to work out who they're addressing, since, again, you're just a person in space interacting with characters on a screen. And then the voice became a little bit more of a side feature, and sometimes a programmer backdoor, too.
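
To make the input layers described above more concrete, here is a minimal Python sketch that picks an addressee from head direction and maps a few input signals to reactions. It is an assumption-laden illustration: the 2D geometry, the 20-degree cone, the speed threshold, and the function names are all hypothetical and do not describe the actual SRI, BBN, or Immerse systems.

```python
import math


def addressee(head_pos, gaze_dir, characters, max_angle_deg=20.0):
    """Return the character closest to the player's gaze direction.

    `characters` maps a name to an (x, y) position. Returns None when no
    character falls within `max_angle_deg` of the gaze direction.
    """
    gx, gy = gaze_dir
    gaze_len = math.hypot(gx, gy)
    best, best_angle = None, max_angle_deg
    for name, (cx, cy) in characters.items():
        dx, dy = cx - head_pos[0], cy - head_pos[1]
        dist = math.hypot(dx, dy)
        if gaze_len == 0 or dist == 0:
            continue
        cos_angle = (gx * dx + gy * dy) / (gaze_len * dist)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        angle = math.degrees(math.acos(cos_angle))
        if angle <= best_angle:
            best, best_angle = name, angle
    return best


def reaction(gesture=None, approach_speed=0.0, loud=False):
    """Map discrete gestures, raw movement, and voice tone to a reaction."""
    if approach_speed > 2.0:      # running at the character (made-up threshold)
        return "flinch"
    if loud:                      # yelling frustrates the character
        return "frustrated"
    if gesture in {"wave", "bow", "nod"}:
        return "acknowledge"
    return "idle"
```

In this sketch, raw movement takes precedence over tone and gesture, so a player who runs at a character while waving still gets the flinch. That ordering is a design choice made for the example only.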

[00:35:09.755] Andrew Stern: In general, in the long term, Facade's primary input was natural language. And in fact, it still remains, as far as we know, the only game where you can say anything you want at any time. I mean, interactive fiction, you know, you can enter text, but it's mostly command-based. You know, move here, pick up the axe, and drink the potion, or whatever. There's few to no games where you can just speak naturally at any time. So I mean, Siri and things like Siri are the first real wave of agents that you can talk to with your voice. But even Siri, it's not conversational exactly. It's mostly command-based. So, Facade, you know, it's 10 years old now and there still isn't something out there quite like it. Like Larry mentioned, with Immerse, the government specifically didn't want that kind of interaction, which was good because it allowed us to focus on other things like the gestural input. But going forward, we want to bring the natural language interface back and continue improving it. I should say about that interface, we spent about a year on it during Facade, building our own rules and our forward-chaining rule engine to parse and interpret natural language. You know, when you play Facade, and anyone who's played it will speak to this, maybe half the time Facade does a pretty good job, with Grace and Trip understanding what you're saying. 25% of the time, it partially got it, and 25% of the time, it really didn't get it. And that's with us limiting the length of any one utterance. When you type to speak, if you go too many characters, it starts to cut you off and you have to sort of hit return before you can say anything too long, like six or seven words, essentially. We plan to get back to adding that interface into our future projects.
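
For readers unfamiliar with the term, a forward-chaining rule engine starts from known facts and keeps firing rules until nothing new can be derived. The toy Python sketch below applies that idea to short typed utterances. It is not Facade's parser: the rules, the fact names such as `act:praise_grace`, and the seven-word cap (a stand-in for the character limit mentioned above) are all hypothetical.

```python
import re

MAX_WORDS = 7  # stand-in for the utterance length limit; value is an assumption

# Each rule is (set of required facts, fact to conclude). All rules are made up.
RULES = [
    (frozenset({"word:love"}), "sentiment:positive"),
    (frozenset({"word:great"}), "sentiment:positive"),
    (frozenset({"word:hate"}), "sentiment:negative"),
    (frozenset({"word:grace"}), "about:grace"),
    (frozenset({"word:trip"}), "about:trip"),
    (frozenset({"sentiment:positive", "about:grace"}), "act:praise_grace"),
    (frozenset({"sentiment:negative", "about:trip"}), "act:criticize_trip"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no rule adds a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


def interpret(utterance):
    """Turn a typed utterance into a sorted list of recognized discourse acts."""
    words = re.findall(r"[a-z']+", utterance.lower())[:MAX_WORDS]
    facts = forward_chain({f"word:{w}" for w in words}, RULES)
    return sorted(f for f in facts if f.startswith("act:"))
```

An input such as `interpret("I love your decorating, Grace!")` chains from word facts to a sentiment and a topic, and from those to a single discourse act. An utterance that matches no rules yields an empty list, which loosely corresponds to the "really didn't get it" case.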

[00:36:53.923] Kent Bye: And so in terms of the deployment and, you know, how the government's using Immerse, maybe you could talk a bit more about some of the use cases that they're using it to train people with.

[00:37:05.550] Andrew Stern: Well, so DARPA has these various phases of development for their projects. This was a 6.1 project, which means basic research. It was an actually, it's like the earliest phase research project that they have. So it was never really meant to get to more than a prototype, which is basically what we got to. And, you know, we're hoping that they'll continue funding it and start to build, you know, more usable versions of it and use it. But that hasn't happened yet.

[00:37:32.088] Larry LeBron: Just to give a little extra background there, one of the main motivating factors is, so it's interesting, their current method for training these kinds of skills, or their current most effective method is to actually do live role plays, like to build an entire, so VR enthusiasts will appreciate this, so basically building out your entire false space, like a village, something like that, and then having actors, human actors with headsets who are walking around the space and interacting with trainees, and being essentially guided by an on-site director who's watching this and trying to make certain, essentially controlling and being the puppet master and controlling the simulation, that's very effective, but it just doesn't scale well. So that's where the potential for this technology is really exciting. Because again, that kind of training is really prohibitively expensive. So if it could be handled in a virtual environment, essentially, then you could imagine that being accessible to far greater numbers of people.

[00:38:21.627] Kent Bye: And what is it about this interactive drama, interactive fiction that makes you really passionate about it?

[00:38:28.405] Larry LeBron: Oh, wow. Um, boy, I mean, I just feel like since I was a little kid, watching movies and thinking I was part of it, or, you know, playing games and having that illusion of having more control than I really had. And I love reading. I love watching good drama, but I just really enjoy that feeling of agency, of being able to make choices and have an impact on something. I mean, definitely I'm a big Star Trek fan and the holodeck is also a huge influence on me. And I'm not claiming we're quite there, but I mean, that idea. I often joke that if someone handed us the holodeck right now, we could show some really cool physics simulations, but the characters would still be, you know, pretty infantile. Like, you'd walk up to someone in the holodeck and you'd have a menu of four things to select. So I'm just really, I mean, I want to see us get there. I want to see us have those kinds of experiences where, you know, you get home after a long day and you have a certain feeling for a certain kind of experience, and you could just enter that space and have a story, I mean, ideally completely procedurally, unfold before you. I mean, it's essentially total escapism, total pleasure. And I think there's a lot of other purposes that are really profound and useful too. I mean, on some level, I think I'm chasing that completely immersive fantasy vision personally.

[00:39:39.264] Andrew Stern: Yeah, I'm more interested in not being fully immersed. I mean, actually, AR to me has more of an appeal than VR. Of course, the two media overlap quite a bit. I don't really want to completely escape the real world. I'm very interested in dramas and stories that are about real life. I love contemporary dramas that are set in the real world versus fantasy or sci-fi, personally. But at the end of the day, yeah, I love theater, and if I have 20 or 30 minutes on maybe one night a week, I mean, the kind of experiences I would love to play, I wouldn't want to play every day. It'd be like once in a while, like about the same frequency maybe that I want to go see a play or a film, you know, a movie in the theater. I could, for 30 minutes, in my living room or whatever, maybe even get up if it's AR, maybe my living room becomes the set for this one room drama that I'm in, like Facade. Imagine Grace and Trip in your living room. Either they've come to your house and they're hanging out with you and all this shit happens, or your living room becomes their apartment, whatever. Yeah, and I get to basically act in a play. That'd be really, really, really fun. I think it's really interesting to think about forming relationships with virtual characters. That's something I've thought a lot about, especially building the virtual pets, which I started working on in the mid-90s. That was our core mission. How do you become attached and want to take care of these characters? Now, at the same time, it's pretty creepy and maybe dysfunctional to do this. So there's a balance here, and people are already addicted enough to TV and games. Some people, of course, with World of Warcraft, they've spent way too much time with them in terms of a good life balance. So I'm very aware that the seductive power of all this, in theory, can be dangerous. And in fact, talk about danger, I mean, we just worked for three years on a DARPA-funded project. 
However, it was the most pacifist DARPA project you could imagine, teaching social skills to trainees on how to de-escalate situations and be good strangers. So it felt okay. So there's that aspect of wanting as a consumer to get to have that kind of experience. Because honestly, I don't play most games. I'm not interested. I think they're amazing spectacles, and I'm really impressed at the fidelity and just what goes into making today's AAA games, but I don't play them. I'm not interested. I don't care to shoot and drive vehicles. If I have some free time, I want to watch an awesome episode of Breaking Bad or Louie or something like that, or go to a play or a great film. So that's from the consumer side. From the developer side of it all, it's super interesting to develop this stuff. I mean, you know, like we've been talking about, how do you break it all down and build this stuff? You know, with all the virtual pets, they were really, really fun to build. I didn't really play them. They were kind of for kids in a way, you know, for adults too to some degree, but they weren't really for me. But they're incredibly fun to build. So even if the stuff we end up working on isn't something that I would play for some reason, making it is a wonderful pursuit.

[00:42:56.074] Kent Bye: Yeah. As I was reading through the Behind the Facade PDF, the thought that went through my mind is that I wish I could see all these values, like in a debugger, as I'm playing through, because in some ways you're modeling this affinity and where you stand and, you know, whose side are you on? And I can just imagine a future where you could have some sort of relationship simulator where you may have a relationship with somebody and you could see exactly where you are at with them in terms of your standing with them and, you know, all these values that are behind the scenes, just putting them up front, but then being able to use it as actual training for how to foster and develop relationships.

[00:43:34.178] Andrew Stern: Yeah, I mean, for sure, the products we hope to build. So as Larry alluded to at the beginning of this interview, he and I and Michael Mateas have formed a new company and we're starting to build the tools and tech to make products. And in these products, right, it will make sense at some point to expose the AI. There's some way you can play it and turn on and show everything that's happening under the hood. That would be really fun for some players for sure. But yeah, like you say about sort of having the simulation parallel in some way your real life, you know, for example, some Sims players would build their lives in the Sims. They would create their family and sort of act out and roleplay dynamics. Yeah, you can imagine all kinds of interesting crossover products in AR and/or VR, I suppose, where some sort of crazy weird product would be something that's always running, that's monitoring, that's listening to everything that's going on in your life. And it's building a model of it, and then you can play out stuff that happened in your day again. Or you're dating somebody and you want to experiment with what it would be like if I said this. The future, it's insane to think about where this eventually will go over the decades. Now that said, I've been working on this since the early 90s. Michael Mateas and I met in the late 90s. We thought in about 10 years from then, about 2008, we would have a lot of stuff happening, and now it's almost 2016 and it's still taking forever. I mean, we're doing our best to make progress ourselves on our team, but it's insane how long it's really going to take to build all this stuff.

[00:45:14.334] Larry LeBron: Yeah, there was just one point to make. I mean, it covers a few things we were just talking about. But basically, you know, one thing that excites me is that as you give players the potential to express themselves and feel like they're being creative in an interactive experience, then you get this kind of interesting back and forth between the player and the software and the experience, where they start to feel ownership and start to feel like it's their creation, because there's actually, you know, the procedural simulation behind it, essentially backing that up and letting their experience be personalized so that, you know, it's truly theirs and something they own. That's something that really interests me and something that, you know, excites me about this direction.

[00:45:48.718] Andrew Stern: Right. I mean, obviously everyone either wants to have tools like this to build their own stories and or the stories, the interactive stories themselves are so high agency that you're essentially authoring your own story as you go. So. Right. We understand that and hope that either our games or interactive stories will offer that high agency and or we'll release the tools to allow others to at some point build their own.

[00:46:12.617] Kent Bye: Yeah, and just to emphasize that point, if you go on YouTube and do a search for Facade, you'll see, you know, thousands of videos with millions and millions of hits of people doing just that, you know, having their own experience. And they're all unique enough that you can watch a whole lot of them and see a whole range of different experiences that people have. And that's the thing that to me really points to the future of these interactive stories, you know, giving people that level of control. And just to kind of wrap things up here, the thing that I like to ask everybody is, what do you think the ultimate potential of virtual reality is and what it might be able to enable?

[00:46:48.955] Larry LeBron: Oh, man, handing that to me. I should have known this was coming, too, because I've heard you ask this on every interview. I should have had something in the can. Like Andrew said, it's both tempting and dangerous, this idea of complete immersion. I mean, there's definitely a part of me that feels like it's seducing me. I mean, the ability to truly leave, essentially divorce yourself from your physical surroundings and absolutely enter another world, you know, Ready Player One style. I'd be lying if I said there wasn't a significant part of me that's very attracted by that. I mean, also kind of terrified of it, I guess, on some level. But I also think that doesn't necessarily have to mean you're completely leaving the real world. It could just mean that it's kind of like exploring, you know, parallel dimensions, but they don't have to be unreal just because they're virtual, I guess would be one way of putting it. I mean, you could still be deeply social in a virtual experience. You could still be exploring something that has its own notion of truth. Like, just because it's not made of atoms and is made of bits, I don't think that makes it less, you know, legitimate. So, I think it just comes down to the kinds of experiences that are created and sort of how they respect people's time, is one way of putting that. So, I'm very excited about it. I mean, I'd like to see that potential at least be reached and I'd like to see people also use it responsibly, I guess.

[00:48:00.887] Andrew Stern: Yeah, I see these technologies as part of a natural lineage of media for creating character and story. We've gone from drama to the novel, to television and cinema. And I like to use the word storytelling, by the way, in the case of what we're doing here. I feel like this is story making. If it's a real simulation, it's dynamic enough that we're not telling a story, we're creating one as we go. But, you know, there's naysayers out there. They're like anti-technology. You know, in my daily life, if I tell people what I do, some people feel like, oh god, you're making stuff that, now who's gonna want to interact with the real world anymore, that kind of thing. But at least the way I feel is it's not that much different than a good novel. I mean, some people are addicted to reading, you know, or television or whatever. They don't want to deal with the real world and they just immerse themselves in artificial stories. But it's going to be a balance in someone's life. They're going to get to play these things, and then they should turn it off and have regular real-life interactions, too. But it's very seductive. My daughter wants to play Minecraft all the time. I have to limit her time on it. It is sort of dangerous stuff. So like any medium, it's going to have its good uses and not-so-good uses.

[00:49:17.810] Kent Bye: Great. Is there anything else that's left unsaid that you guys would like to say?

[00:49:21.866] Larry LeBron: Oh, yeah, I want to say thanks again and just let folks know that, as Andrew mentioned, the two of us with Michael Mateas have started a new company. We're calling it Playabl. It's P-L-A-Y-A-B-L. And it's based here in Portland. We're definitely going to be moving forth and building more interactive stories and story experiences like this, dramatic experiences. And if you want to read a little bit and also get some information on Facade and Immerse, you can check out our site at playabl.ai.

[00:49:47.712] Kent Bye: Awesome. Well, thank you.

[00:49:49.093] Larry LeBron: Yeah, thanks so much.

[00:49:50.386] Andrew Stern: Thank you, Kent. And thank you for listening. If you'd like to support the Voices of VR podcast, then please consider becoming a patron at patreon.com slash Voices of VR.
