I interviewed Matt Romein about MILKMAN ZERO: The First Delivery on Monday, November 17, 2025 at IDFA DocLab in Amsterdam, Netherlands.
This is a listener-supported podcast through the Voices of VR Patreon.
Music: Fatality
Rough Transcript
[00:00:05.458] Kent Bye: The Voices of VR Podcast. Hello, my name is Kent Bye, and welcome to the Voices of VR Podcast. It's a podcast that looks at the structures and forms of immersive storytelling and the future of spatial computing. You can support the podcast at patreon.com slash voicesofvr. So continuing my series of looking at different immersive nonfiction and digital storytelling pieces from IDFA DocLab 2025, today's episode is with Matt Romein, who's got his latest theater piece where he's performing but also using video game technologies to kind of do a performance. In this case, it's a text adventure game that he's kind of scripted and constructed to give you this kind of adventure for what he's calling MILKMAN ZERO: The First Delivery. So in this piece, Matt is basically up on stage and he is typing into a computer and the audience is watching him. And a lot of this experience is like you're kind of in this Twitch streaming type of context where you're watching someone else play a video game, but it's mostly you're interfacing by reading what he's writing and seeing how he's using his imagination to do things that you wouldn't quite expect to see how he's going to solve these variety of different problems. And also the whole text adventure game takes a turn on his overall adventure. And so this is something that he was sort of prototyping and trying to expand out and to see if it would merit attention as he's sitting there typing at a computer for around a half hour. And he's going to continue to expand it out to maybe 45 minutes to the full hour. So that's what we're covering on today's episode of the Voices of VR podcast. So this interview with Matt happened on Monday, November 17th, 2025 at IDFA DocLab in Amsterdam, Netherlands. So with that, let's go ahead and dive right in.
[00:01:51.921] Matt Romein: My name is Matt Romein. I make mostly performance-based projects using XR tools, a lot of game engines, a lot of interactive technologies, but mostly coming from the approach of a more theatrical kind of mindset. Hmm.
[00:02:08.018] Kent Bye: Great. And maybe you could give a bit more context as to your background and your journey into the space.
[00:02:12.581] Matt Romein: Yeah, I studied acting when I was an undergrad. And from there, I got into sound design, video design, and was also a minor in art technology. And I started working in dance and theater a lot, kind of as a designer, as a technician. And then I went to grad school at NYU with this program called ITP and started learning to code more, and then had a somewhat brief foray into more installation art. And now in the last five years, I've come together to join my theater practice and background with the more XR installation background.
[00:02:48.149] Kent Bye: So how do you identify as like what type of genre or art or what's the name that you call yourself?
[00:02:54.796] Matt Romein: I mean, I usually just call myself an artist, but I have been trying to position myself in the context of theater work specifically, mostly just because I enjoy the... setup of a theater that people will buy a ticket, they'll sit down, you get to play with lights and sound and really you can kind of count on people's focus and the people watching the show will expect you to do something or stop focusing. But as opposed to more of like a gallery situation or a museum or installation stuff, I kind of like that there's a narrative format and a time-based format that everyone will experience sequentially.
[00:03:34.690] Kent Bye: Yeah, and the Bag of Worms project I think of as sort of a mix between technology and performance and lecture. And so you're using the theatrical staging to do different types of performance, but mediated through technology in different ways. And so maybe you could talk a bit about where that idea started and then how it's continued into the project that you're showing here this year.
[00:03:55.996] Matt Romein: Yeah, so I did a project in 2022, and it was called Bag of Worms, and it was at IDFA DocLab as well, and we talked about it actually. But it was a motion capture theater project where two people were in motion capture suits. You're seeing them projected on the wall behind them in a digital world with digital bodies. And it's kind of a meditation on video game violence and control and domination. And it's very funny. It's very bloody. And that was purely just an experiment I had with some theater friends of mine that I had designed for. And I was just like, I've seen a lot of dance in motion capture, but I haven't seen like... more dramaturgical, theatrical approaches to motion capture that also play with the actual people that are on stage as characters as well as their digital counterparts. But after I did that project, I realized that that one wasn't explicitly a video game per se, but I play a lot of video games and just the medium of video games really has an interesting narrative format to it that we experience solo most often or watching someone else play. And I was curious to explore playing games that I make as a form of theater. So I had another project called God Mode, where it was a kind of multi-dialogue story-based game or like multi-choice dialogue where I would be playing the game and selecting responses. And it was about a little boy's birthday and it gets violent and crazy and all the stuff I usually do. But all the avatars in the show were live facial motion captured by actors. So there was a real kind of still physical presence, and that was entirely scripted, but it had the feeling of being like, oh, is this actually a game that could go in different directions? And that brought me to this project, Milkman Zero, which is a text-based video game about delivering milk. And there's no images, there's no computer graphics, it's just text, almost like an old-school Zork kind of game. But I play it in front of an audience, and... Yeah, it goes to some interesting places. It's about 30 minutes at the moment. Nice.
[00:06:09.992] Kent Bye: And just the other day you were giving a talk around this and you said you had read a book around the text adventure games. Maybe you could talk a bit about some of the inspirations for digging into this format of the text adventure games.
[00:06:22.034] Matt Romein: Yeah, there's a book called 50 Years of Text Video Games. I think that's the name of it. I apologize, I don't remember the author's name, but I think it was a blog series that then became published as a book. And I think from 1975 to 2025, each chapter is one game from each year and kind of a deep dive and... I just found it really interesting. I never played very many text-based video games. And specifically in like the 2000s and even the 90s, it started to get really experimental. I mean, there's a whole offshoot of the interactive fiction kind of a scene because now that video games were pretty graphic heavy, you had to get like kind of experimental. Why would you do a text-based game? And I guess I was curious to make a piece that wasn't so reliant on computer graphics. That had been most of my work so far and to kind of strip something down to purely text and almost as a kind of a performance constraint to have to make something using only text felt like a new life challenge for me. And also being totally honest, like it's just harder to tour things and get funding for making things now. And I kind of wanted to make a show that I could just have only my laptop and me and I could bring it anywhere. And so far it's been pretty successful. Yeah, to just keep, stay lean and stay mean.
[00:07:45.726] Kent Bye: Yeah. And it's interesting watching your performance to see how you're essentially sitting at a keyboard and you're playing the game and you're typing out different phrases and different words and whatnot. And so like, how do you think about your performance? Cause it's pretty stilted or a lot of the humor and the communication's coming through the text. And so you're not necessarily like the center of it, but you are there. And I'm just trying to get a sense of like, as a performer and an actor, how you think about what this performance is as you're like sitting there, and what's coming out of your body versus what's on the screen.
[00:08:18.108] Matt Romein: That's probably what I've been struggling with the most in this piece. At the moment, my choice has been to just be pretty neutral with like maybe one or two moments of showing just a little bit of like emotion, like a smile or something. And it's a little tricky. I mean, like a spoiler, I guess, is that when you're watching a performance, I'm typing, I'm interacting. It feels like maybe it could go in different directions, but the entire thing is fully scripted. And you kind of realize that by the end, that this is a scripted performance. But I didn't want to do the thing where I act surprised or I act shocked, because no one's going to believe that I... No one's going to think I actually have never played this game before. So to act a certain way feels disingenuous. But... I guess I'm trying to be a kind of neutral slate to kind of allow people to sink into the text and just treat me, the player, as a bit of a cipher for themselves. But what does become interesting is I do have moments where I'm typing things that are not the obvious thing to do and some of the kind of bratty way we all play games of like, I'm going to try and make this character do this, and the game's like, you really shouldn't do that. So there is some character that starts to come through. And I guess actually a funny story is this piece was actually originally the ending to Bag of Worms. We had done it as a 40-minute version. I was going to do a 70-minute version. And it made a lot more sense because in that context, you knew who the person playing the game was. It's a character from that show. You know the guy who made the game. It was my character. And all that context of where the game comes from, who's the person playing it was answered. But in this solo version I'm developing right now, I have not quite figured out how to bring the player into the context of who is this person and why are they playing this game.
[00:10:13.382] Kent Bye: Yeah, I recently had a chance to see Asses Masses in Portland, Oregon, where there is like different text inputs and text prompts and dialogue choices. And what was interesting around that version was that the audience ends up shouting what they want. And it's a way for the audience to kind of express their agency or their identity, because ultimately the person that is playing is kind of the center. And so I'm just curious to hear some of your thoughts of the use of like video games as a way of kind of creating this narrative structure for a performance. We have Twitch streaming and other ways that people watch people play games, but in a way that has more of a theatrical setting, you get an opportunity to have certain expectations be built up and then perverted throughout to be a part of the performances for it to go in new and novel ways that people don't quite expect.
[00:10:57.805] Matt Romein: Yeah, I mean, it's funny. The most common question I get with this project is, is this a game I can play? And it's not. As I was saying, it's entirely scripted. But I think for me, the thing I've discovered when I think about this stuff of things like Twitch, the act of watching someone else play a game... It's pretty ambient. I think Twitch is primarily a kind of second screen kind of thing for a lot of people. I'm actually not sure. I don't watch that much Twitch, but it's not something that sustains focus. It's very durational. You can jump in and out. You're also giving attention to the chat screen as well as the actual content. And for me, my favorite theater is really intentional with the time that's being used on stage. And there's a kind of taut rope kind of feeling, like a tension that you want to maintain that could kind of snap, that you're always trying to stay like a step ahead of the audience. So I guess to say when I'm thinking about video games as a performance, a lot of times the scripted version is most important to me because it lets me keep that theatrical tension. And I'm trying to write the script in a way that feels true to how someone would play a video game, like what choices they might make, but also cut out all the stuff of like the cul-de-sacs of, well, that wasn't the right thing to do. And that was boring to watch someone spend 10 minutes trying to figure out this mechanic. So I think there's a kind of way in which the audience will suspend their disbelief to kind of go along with me that I would figure out how to do something so quickly or know what to do next and just kind of move the story forward.
[00:12:38.430] Kent Bye: Yeah, when you were talking about this project the other day, it was really interesting just to see how you're using Google Sheets and a spreadsheet and TouchDesigner. Would you be willing to talk a bit around the pipeline for where the data are coming from? And how are you using something like TouchDesigner to create this appearance as if there is an interactive game that's happening?
[00:12:57.395] Matt Romein: Yeah, to paint a little bit of a picture, there's text that comes across the screen, and it's kind of the narrative part. And then there are points where there's a character input field where I type stuff. And it's kind of toggling back and forth between those two modes. I just start with a Google Doc. And I write the script of what the game will say and then what I will say and what the correct thing to type is. And that's all just like a movie script you would essentially see. And I'll write notes for myself of like, oh, and then the inventory screen comes up or whatnot. From there, I put everything into Google Sheets, and each cell is basically like a slide in the game, so to speak. The text that comes up and then clears the screen, and the next text that comes up. And then also in other columns in the Google Sheet are play a sound effect, have a pause before moving on, so I can kind of build out how it goes. Then I download all that as a CSV file. I feed it into TouchDesigner. That parses all the information. It sends it to the sound system. It sends it to the animation system. It toggles back and forth between character input or just rendering text. And then the last element of all of that is it's all done in HTML, CSS, JavaScript. So TouchDesigner is rendering static web pages and just using the text renderer part of the HTML because it's so much better than TouchDesigner's text rendering system. Yeah.
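The spreadsheet-to-cue step described above can be sketched in a few lines of Python. This is only an illustration, not Matt's actual TouchDesigner project: the column names (`text`, `mode`, `expected_input`, `sound`, `pause_seconds`), the `Cue` class, the `load_cues` helper, and the sample rows are all hypothetical, since the interview does not give the real sheet layout.

```python
import csv
import io
from dataclasses import dataclass

# Made-up sample of a sheet exported as CSV. The column names and the
# row contents are assumptions for illustration only.
SAMPLE_CSV = """text,mode,expected_input,sound,pause_seconds
You stand outside a house holding a crate of milk.,narration,,door_creak.wav,1.5
What do you do?,input,open door,,0
The door swings open.,narration,,,0.5
"""

@dataclass
class Cue:
    text: str             # text shown on screen for this "slide"
    mode: str             # "narration" or "input"
    expected_input: str   # only meaningful when mode == "input"
    sound: str            # empty string means no sound effect
    pause_seconds: float  # pause before moving to the next cue

def load_cues(csv_text: str) -> list[Cue]:
    """Parse the exported CSV into an ordered list of cues."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Cue(
            text=row["text"],
            mode=row["mode"],
            expected_input=row["expected_input"].strip().lower(),
            sound=row["sound"],
            pause_seconds=float(row["pause_seconds"] or 0),
        )
        for row in reader
    ]

cues = load_cues(SAMPLE_CSV)
for cue in cues:
    print(cue.mode, "|", cue.text)
```

Keeping one row per screen of text means the writing stays in the spreadsheet, and the playback code only has to walk the list in order and dispatch each cue to sound, animation, or input handling.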
[00:14:26.331] Kent Bye: So is it progressing each time you hit enter or you have to enter in the right text and then it's sort of matching and seeing if it matches then it progresses?
[00:14:33.158] Matt Romein: Yeah, so I have to write the right text. If I don't write it correctly, it says, I'm sorry, I don't understand, which is actually part of the script, too. There are points where I want it to say, I'm sorry, I don't understand. But if I make a mistake, there's kind of a dramaturgically sound kind of fail safe where I'm like, oh, shit, I got to write that the right way. But I actually have a different view screen than what's on the stage that everybody's watching. And I can see things like how long is the next sound effect that's going to play? What's the thing I have to type? What's coming up in the next slide? At this point, I have it all memorized. But there are moments where I'm like, wait, was it open door or was it enter room? And that can be kind of a failsafe for me.
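A minimal sketch of the matching behavior described here, assuming a simple exact-match rule after normalizing case and whitespace. The `ScriptedGame` class, the `normalize` helper, and the sample responses are hypothetical; only the fallback line and the "open door" and "enter room" commands come from the interview.

```python
FALLBACK = "I'm sorry, I don't understand."

def normalize(command: str) -> str:
    """Lowercase and collapse whitespace so small typing slips still match."""
    return " ".join(command.lower().split())

class ScriptedGame:
    """Advances through a fixed script only when the expected text is typed."""

    def __init__(self, steps):
        # steps: list of (expected_input, response) pairs, in script order
        self.steps = steps
        self.index = 0

    def submit(self, typed: str) -> str:
        if self.index >= len(self.steps):
            return FALLBACK  # script is finished; nothing left to match
        expected, response = self.steps[self.index]
        if normalize(typed) == normalize(expected):
            self.index += 1
            return response
        # Wrong input: stay on the same step and show the fallback line.
        return FALLBACK

game = ScriptedGame([
    ("open door", "The door swings open."),   # made-up response text
    ("enter room", "You step inside."),       # made-up response text
])
print(game.submit("enter room"))   # wrong step, so the fallback is shown
print(game.submit("Open  Door"))   # matches after normalization
```

Because a failed match leaves the script position unchanged, a typing mistake on stage reads as an ordinary in-game response rather than a crash, which fits the failsafe Matt describes.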
[00:15:18.277] Kent Bye: Yeah, a lot of projects this year at DocLab are going back to another era of the early beginnings of the internet. And we're kind of going back even to the early beginnings of some of these video games that are just text-based. And so I'm just curious to hear any comments on the use of imagination in terms of theaters are already asking people to kind of project out and to imagine things. But here, using text as the primary mode to encourage people to use their imagination to help create a sense of plausibility, but also expectations.
[00:15:47.956] Matt Romein: Yeah, I guess it's fun because I should say the text is really kinetic. It comes out character by character. There's lots of pauses. So it's not just like describing like I'm reading a book. The very way that the actual words render on screen and the timing of them feels very cinematic and feels very time-based to me. So in that sense, I think it also allows for the audience to have a somewhat... similar timing of each moment that comes up. You're kind of reading all at the same speed, I hope. Maybe I move a little too fast at times, but that allows the jokes to land at the same time. If all the text just showed up immediately, maybe someone else would get to the end and laugh at that joke and another, you know, it's like all just kind of disjointed. So there's elements, and I should say that this is not set up like a typical text adventure. I think the old school ones show you all the text at once. You see the stuff that it had rendered before. It's like a full screen of information that you can kind of refer back to. And I was like, this is going to be too distracting for the audience. But I don't know. I think it is also just a curiosity for me of like, can I take something at its simplest form, which is just words on a screen and make that like a collective imagination space or a theater space? And it's been a fun challenge for me. And I think it's been more successful than I might have imagined before I built my first prototype of it.
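The character-by-character reveal with pauses can be sketched as a timing schedule. This is a guess at one way to do it, not the show's implementation (which runs in HTML, CSS, and JavaScript inside TouchDesigner): the `reveal_schedule` function, the default delay, and the punctuation pauses are all assumed values.

```python
def reveal_schedule(text, char_delay=0.04, pause_after=None):
    """Return (time_in_seconds, character) pairs for a typewriter-style reveal.

    char_delay is the base gap between characters; pause_after maps a
    character to the extra pause that follows it. Both are assumptions.
    """
    if pause_after is None:
        pause_after = {".": 0.6, ",": 0.25}
    schedule = []
    t = 0.0
    for ch in text:
        schedule.append((round(t, 3), ch))
        t += char_delay + pause_after.get(ch, 0.0)
    return schedule

for when, ch in reveal_schedule("Hi. Milk?", char_delay=0.1, pause_after={".": 0.5}):
    print(f"{when:>5.2f}s  {ch!r}")
```

Since every audience member sees each character at the same moment, a fixed schedule like this is what lets a punchline land for the whole room at once, which is the effect Matt describes.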
[00:17:16.556] Kent Bye: And it's fun to see it in a crowd because you have collective humor moments where people are also laughing and you get a sense of a shared experience in that way. And Caspar Sonnen had mentioned that this is part of the spotlight, so you're kind of prototyping and experimenting. So what were some of the biggest lessons that you got from your showing here at DocLab?
[00:17:34.818] Matt Romein: I think the biggest lesson I got is that it can sustain a 30-minute version. I'd only done it as a 10-minute version, and I was like, at what point are people's eyes going to get tired, or are they just going to get bored of the format? And I haven't had anyone express any, like, I started to wane at the end there. I would have wanted something a little different. So I'm actually building this towards about a 60-minute piece, a kind of evening-length theater piece that'll, I think, have some more theatrical elements that are actually physically on the stage, even. But it's given me some confidence that I could build a 60-minute show in this format and I can sustain audience focus and attention. I think the other thing I've learned too is just that there's some comedic timing stuff. I've made a few changes while I've been here after performances, where it's always like, just get to the point faster. Like, sometimes I think I'm supposed to set up a joke by doing a more believable video game thing. Like I need to examine this thing before I interact with it. And people will buy that I just get to the meat of the thing I'm trying to do there. I don't mess with all the inventory system stuff.
[00:18:44.721] Kent Bye: Yeah, all the leaps and things that work that you don't expect to work, I think, is a real pleasure in the experience of this whole adventure. And also kind of harkening back to another era of how we enjoy these different types of games and really paring it down to the essence. And you're about to go see Handle With Care, which I think is also another theatrical piece that's trying to boil things down to the core essence. So I'd be curious to hear some of your thoughts on that at some point. But yeah, I guess as we start to wrap up, I'd love to hear what you think the ultimate potential of this type of intersection of immersive art and performance and what it might be able to enable.
[00:19:15.281] Matt Romein: Um, I guess for me as a theater person, I guess I really do think of myself that way these days. I just think there's so many new narrative formats that come from interaction that we all kind of hold on to as an individual thing, like usually in our rooms and whatnot, or just our own sense of play. And I think there's a whole generation of artists that kind of grew up playing video games, grew up interacting with technologies, and also know how to use those technologies as well as having practices in installation and performance and theater and club and drag. And I think that it's just a really ripe space for exploring a new kind of performance language.
[00:20:02.477] Kent Bye: Nice. And is there anything else that's left unsaid that you'd like to share with the broader immersive community?
[00:20:06.895] Matt Romein: No, I'm always excited to see the stuff that comes out of other artists, and especially at this year's DocLab, I think there's been some really incredible pieces, and I'm especially excited for the next one we're about to see. It's Handle With Care is the name, right?
[00:20:19.482] Kent Bye: Yeah, Handle With Care.
[00:20:20.503] Matt Romein: Yeah, so there's a lot of people exploring these spaces, and I want to encourage more people to make this kind of work.
[00:20:28.723] Kent Bye: Awesome. Well, thanks so much, Matt, for joining me today to explore this intersection of different ways of using interactive technology in the context of performance and theater. And, yeah, I think it's a compelling experience that can work when you don't necessarily expect it that it can work, you know. But we have lots of different other prior examples of people with Twitch and other kind of formats. But yeah, having this kind of shared experience of collective imagination is real interesting. New areas of exploration that I haven't seen quite as much so far. So yeah, thanks again for joining me here on the podcast to help break it all down.
[00:20:59.447] Matt Romein: Cool. Thank you so much.
[00:21:01.548] Kent Bye: That's all that we have for today, and I just wanted to thank you for listening to the Voices of VR podcast. If you enjoyed the podcast, then please do spread the word, tell your friends, and consider becoming a member of the Patreon. This is a listener-supported podcast, so I do rely upon donations from people like yourself in order to continue to bring you this coverage. You can become a member and donate today at patreon.com slash voicesofvr. Thanks for listening.

