#912: Virtual Conference Lessons Learned from IEEE VR 2020 + Experiential Design Tradeoffs

On March 6th, 2020, IEEE VR announced on its website a venue change from Atlanta, Georgia to online in virtual reality for the academic conference scheduled for March 22-26, 2020. The organizers of the IEEE VR conference had two weeks to translate their existing, co-located schedule 1-to-1 into a virtual conference. Luckily, Blair MacIntyre had already been planning a supplementary online VR portion of the conference using a developmental release of Mozilla Hubs Cloud Enterprise that would allow people to use their phones, tablets, laptops, PCs, or virtual reality devices to have immersive, virtual reality experiences.

I had a chance to attend all five days of the IEEE VR conference while immersed within virtual reality providing real-time feedback and commentary on Twitter, and exploring what virtual reality has to offer the art of gathering in the pandemic age of virtual conferencing. This essay will unpack some of my deeper thoughts, experiences, & reflections from IEEE VR based upon a 90-minute conversation that I just had with a couple of the general chairs of the conference.

The majority of the IEEE VR conference schedule was talks in three different tracks, and they primarily used Zoom meetings that were rebroadcast out to three concurrent Twitch streams, and then were later archived onto their IEEE VR YouTube channel.

They primarily focused on creating virtual spaces to co-watch the live Twitch streams, but this was not as successful or useful as they originally thought it was going to be due to limited time to interact, the lack of clear architectural affordances showing how audio was transmitted, and the violation of normative standards of presentations that comes from having multiple, concurrent audio streams happening at once. We can really only pay attention to one audio stream at a time, and so users didn’t have as many social interactions in presentation spaces, especially in the absence of clear audio-isolated areas.

The poster sessions were particularly successful, and they were able to generate some open-ended social VR spaces for gathering, but overall it was difficult for the IEEE VR conference to get a critical mass of participation in the online portion or to foster emergent conversations.

I ended up spending a significant amount of time at the IEEE VR conference immersed within the virtual reality experience, experimenting with and exploring different immersive architectures that would facilitate emergent conversations. I realized that a shared context and shared intention were key components. There were a number of Birds of a Feather gatherings that happened throughout the conference, and it turned out that setting a specific time to meet with a specific question to be answered or problem to be solved was a great catalyst for gathering within a VR space in Mozilla Hubs in order to facilitate a group conversation.

I had a chance to sit down with two of the general chairs of the IEEE VR conference, Kyle Johnsen and Blair MacIntyre, to unpack some of the lessons learned from the IEEE VR conference. It’s possible to do a 1:1 translation of a co-located schedule into a virtual conference, but the tools to best transmit the content of the talks end up being the existing and well-established 2D video teleconferencing tools of Zoom rebroadcast to a Twitch or YouTube livestream. But what about all of the embodied conversations and in-between spaces? How can virtual reality start to replicate some of these more emergent hallway conversations? IEEE VR’s implementation using Mozilla Hubs is a start, and there were additional insights and innovations that came out of Laval Virtual’s online conference a month later that used the Virbella platform.


There are a lot of tradeoffs and optimizations that need to be made for an online conference, and they start with the underlying intentions and purpose of gathering. I’m going to try to break down some of the tradeoffs of virtual conferences through the lens of my experiential design framework that breaks phenomenological experiences down into Social & Mental Presence, Active Presence, Embodied Presence, and Emotional Presence.


For me, I tend to break down the essence of a conference gathering between the content of what’s being discussed as well as the relational dynamics that can be cultivated with the people who are there. Most virtual conferences so far have focused 85-90% of their attention on translating and transmitting the content through livestreamed talks, but less effort on how to replicate the relational and social dynamics that naturally emerge when people are sharing physical space with each other with their full and focused attention and intention to be open for new connections and new possibilities for collaboration. So for an academic conference, there are multiple tiers of peer-reviewed content to be shared with the wider community ranging from accepted talks, posters, panels, demos, and workshops.

The poster sessions are able to work particularly well in VR because the poster sets a specific context that is spread out over a space. There were also demo areas, but it’s really impossible to demo emerging VR hardware solutions through the lens of another piece of VR hardware. You have to try it out natively on that original piece of VR hardware, and so demos are one of the hardest things to translate into a virtual conference. The best that you can do is to record and share a 2D representation of the experience, which can’t ever really fully replicate a direct embodied experience.

In terms of transmitting the schedule and information about the gathering, there was 100% reliance upon the 2D website, and if you were only seeing the experience while immersed within VR, then it was difficult to know when and where things were happening. So I’d like to see more information architecture innovations for thinking about how to communicate when and where events are happening. The Wave VR has a series of upcoming screenings, AltSpaceVR has a curated listing of upcoming events that are happening that day, and VR Education’s ENGAGE platform has a scheduled listing of upcoming events. So I’d like to see more thought in terms of translating a schedule into an embodied, immersive space, taking inspiration from how SXSW and GDC create large-scale schedules that can be interpreted at a glance for what’s happening at any moment and where.


In terms of social presence, there’s a challenge for how to discover where emergent social gatherings are happening. IEEE VR had a natural discovery mechanism for showing which rooms had clusters of conversations, but Johnsen said that there needs to be better ways of signalling that you’re open to conversation at specific times or when you’re in a specific location or context. I noticed that it takes a lot of dedication to hang out in empty social VR space by yourself at a conference waiting for someone to collide with you in order to seed social conversations. At least once I was able to help seed a conversation that started with just two other people, but soon expanded out to over 20 people in a room.

The sponsor rooms that had videos showing their latest research ended up being a great catalyst for gathering organically emergent conversational clusters that allowed for meeting new people who were one degree of separation away or for lurkers to edge their way into existing conversations. The ghost lobby feature in Mozilla Hubs was activated, which facilitated some new dynamics for lurkers to do conversational discovery, but it also has some privacy tradeoffs as people could potentially be listening in to conversations in a disembodied way. We’re not used to letting people listen in to our conversations in an omniscient way, and this could help facilitate new lurking dynamics for introverts, but there are also risks and new normative standards for the line between public and private conversations within virtual spaces.

IEEE VR used Slack for backchannel communications, which helped provide a persistent way to send direct messages to people. They didn’t use Discord, and so there isn’t a way to friend people on the communications medium beyond the specific context of this gathering like there is on Discord. I ended up following many people on Twitter in order to use Twitter DMs. But Slack does a great job of creating a context-dependent, disposable architecture for backchannel conversations.

Slack ended up providing some interesting, ephemeral direct messaging conversations and coordination since it was difficult to run into people within an embodied virtual environment. My preference is to optimize for in-world, embodied exchanges, and so I avoided using the 2D interfaces and language abstractions as I spent most of my time immersed within the virtual reality experience and I didn’t use the Slack backchannel very much. So while there was a fair amount of social interaction and engagement on Slack, this wasn’t interesting or particularly new for me as a VR journalist and I tended to avoid it. But I couldn’t avoid it completely as the organizers relied heavily on Slack. They said that about half of the attendees never logged on to Slack, and so it was easy to miss some of the more emergent aspects of the conference, especially announcements about the Birds of a Feather gatherings that were being dynamically planned throughout the conference.

Virbella has a system wide announcement tool that ends up being a pretty blunt tool that is like a pop-up ad on your 2D experience. Finding ways to send broadcast messages to everyone is an open design problem yet to be fully figured out. With both Mozilla Hubs and Virbella, there were no broadcast messages that were sent when I was in virtual reality.

The discovery of social happenings was a challenge. You had to commit yourself to hanging out in an empty room in order to attract other people to come talk to you. But since most of the rooms were too boring to hang out in by yourself, people would quickly drop in and immediately drop out, as there would be no one taking the leap to make themselves available for social interactions. There was also a bug that required a hard refresh in order for your presence to show up in the master listing of room capacities, which meant that people would have to do a hard refresh every time they went to the master navigation page. I didn’t realize this until many days into the conference, and many people never realized that a hard refresh was required to get an accurate tally of who was in a room, and so the rooms were reporting that they were completely empty even when they weren’t, which only made it harder to cultivate clusters of people.

I’d like to see more consideration given to dynamically indicating where emergent clusters of people are gathering by tying an API call to the server that returns the attendance count of rooms to a spatialized object that scales up in size relative to how many people are in a room. This will require some work in making it easier to navigate from room to room, but imagine seeing a hallway of portal doors to different rooms within VR and being able to determine how popular they are just by looking at them in VR. VRChat solves this with their 2D menu interface, but it’d be nice to have some of this endemic to the virtual architecture surrounding the portals to navigate between virtual worlds. But there are current limitations of even creating these navigation portals within WebXR.
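As a rough illustration of that idea, here is a minimal Python sketch that maps a room's head count to a scale factor for its portal marker. Everything in it is an assumption for illustration: the function name, the scale range, the square-root easing, and the sample attendance numbers are hypothetical, and it does not use any real Mozilla Hubs API. The 25-person capacity is the room limit discussed in this essay.

```python
import math

ROOM_CAPACITY = 25  # per-room limit discussed in this essay


def portal_scale(occupancy: int, capacity: int = ROOM_CAPACITY,
                 min_scale: float = 1.0, max_scale: float = 3.0) -> float:
    """Map a room's head count to a scale factor for its portal marker.

    Hypothetical helper: the scale range and easing are illustrative choices.
    """
    # Clamp so bad data (negative or over-capacity counts) stays in range.
    clamped = max(0, min(occupancy, capacity))
    # Square-root easing makes the first few arrivals clearly visible,
    # since the jump from empty to occupied matters most for discovery.
    fraction = math.sqrt(clamped / capacity)
    return min_scale + (max_scale - min_scale) * fraction


# Hypothetical attendance counts, as a server endpoint might report them.
attendance = {"posters": 18, "sponsor-hall": 4, "hallway": 0}
scales = {room: round(portal_scale(count), 2) for room, count in attendance.items()}
```

The easing curve is a design choice rather than a requirement; a linear mapping would also work, but it would make a room with two or three people look nearly as empty as one with nobody in it.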

WebXR Navigation is an open problem at the spec level, and so traversing from page to page is still yet to be fully solved. But there’s a WebXR Navigation proposal that can be implemented, which will make traversing from WebXR room to room easier. Once this is figured out on platforms like Mozilla Hubs, then it should be easier to create portals that link different rooms and help with dealing with the native limitation of only being able to have 25 people in the same virtual space at the same time. Hubs does have the ability to traverse from room to room, but the session data isn’t transmitted, and so you have to go back to the browser interface to reconfirm that you’re actually going to the correct location. It’s not an optimal user experience flow, but it works for now to enable a primitive way to traverse from room to room.

It’s still a big open problem for how to best facilitate emergent conversations in virtual spaces. Johnsen suggested that there needs to be a better way to indicate that you’d be open and available for conversation. But this also requires being cognizant of when you shift from a phase of being open to possibility to a phase of being unavailable and closed. There are already chat dynamics for marking your status as available or not within VRChat, as an example, but also in 2D chat applications like Discord and Slack.

I suspect that a key to facilitating these types of emergent conversations will be in the establishment of a deeper context in the architectural design of spaces. Imagine that you want to have a conversation about 3DUI, then what would a theme park of 3DUI experiments look like in VR? You could create an entire interactive experience exploring different 3DUI tradeoffs to provide people with a direct experience of some of the 3DUI work that would actually best use the VR affordances.

Johnsen said that most of the content at IEEE VR was focused on 2D posters, images, and videos, and that there was actually very little VR content to directly engage with. What would a conference look like where there was a more participatory engagement with the latest 3DUI research and experiments? Could there be a way to export some of the Unity code into a WebXR experience in order to create some social dynamics around it? Or perhaps future 3DUI research will be conducted within WebXR itself to make it easier to archive and share to the rest of the research community, not only from an experiential perspective but also the perspective of sharing the source code in an open source fashion.

There needs to be more thought about how research work can be shared in a collaborative WebXR environment like Mozilla Hubs, so that these immersive experiences could become hotspots for emergent collisions at conferences that help to establish a deeper context and more meaningful and focused conversations. How can the ongoing research of VR by academics be used to create spatialized experiences that use the medium of VR to communicate their ideas?


When I think about active presence, then I’m focused on how my agency is being expressed, received, and reflected by the experience. The IDFA DocLab’s Caspar Sonnen told me that “the fundamental character of an interactive piece of work is that the more you put in, then the more you should get out.”

So if I dedicate myself to having a real-time, live, embodied experience at IEEE VR, then what do I get out of it? Ideally the more I put in, then the more that I get out. But the design of virtual conferences still has a long way to go in order to achieve this. It was hard to get out as much as I was putting in because the quick translation of the conference into a virtualized experience was primarily focused on the experience of collaboratively watching Twitch streams, which didn’t naturally cultivate high-agency interactions. The times that I did feel like I was seeing this type of direct return of getting out what I put in were when I was interacting in real-time with other people.

The creation tools in Mozilla Hubs have the possibility of allowing you to express dynamic agency as you’re able to alter the room by bringing in objects from Sketchfab or Google Poly, drawing, or bringing in images. But there were larger performance and security concerns that led to disabling these creative features.

The Mozilla Hubs Cloud architecture allows you to create your own spaces, and it’s this creation of new spaces that could be included within the overall conference experience that got me really excited. There’s an opportunity for a Burning Man build-out type of dynamic where attendees have the possibility to create customized spaces that set a specific context for social interactions, embodied exploration, communication of ideas, or the delivery of an immersive experience to be received. I personally wanted to have more interstitial threshold spaces to recreate those hallway conversations, and so I experimented with creating a room with the schedule and a hallway to go from room to room. But then I found that I didn’t want to hang out in those rooms by myself, and I didn’t set a specific time for people to meet me. So it ended up not being a compelling architecture to facilitate serendipitous collisions.

It’s these chance encounters at gatherings that make it feel like the agency of your embodied movements is able to facilitate new connections that ultimately help you either solve problems you’re trying to solve or accomplish a deeper intention with what you’re trying to create and manifest into the world. I thrive off of these types of serendipitous encounters at physical conferences, and it’s one of the most difficult things to abstract into component parts in order to replicate within virtual spaces. The emergent Birds of a Feather gatherings seemed to accomplish this in the best way.

Lara Lesmes and Fredrik Hellberg of Space Popular held an amazing Birds of a Feather gathering exploring how to recreate public spaces within immersive architecture, and it ended up facilitating some of the most engaged and interesting conversations of the conference for me because it was able to abstract out some really interesting open problems of what makes a virtual space a “public space” in the sense that there are government-supported places that have a persistence and permanence of serving the larger public interest. Is permanence the key here? Or is it more about matching the emergent intentions of the collective with a dynamic architecture that is able to facilitate cooperation and collaboration?

A big part of the underlying character of active presence is the live elements. What is it about a live gathering that makes it live? Johnsen brought in the analogy of a live sports event where the action is emergent, and that it just isn’t the same if you watch it later. He was surprised by his reaction of not finding the pre-recorded talks as compelling as the ones that were live. I totally agree and have been trying to unpack why that might be.

From a process philosophy perspective, it’s as if there’s an emerging concrescence of many things coming together to create an occurrence or happening, and in that live moment you feel as if you’re actively participating in that concrescence. It’s a participatory dynamic where the engagement of many people combines to create something that’s beyond the contributions of an individual. To me, a pre-recorded talk feels stilted and dead. There isn’t the excitement of something being at stake, as with a live demo or a presentation where something could go wrong. Or like Johnsen said, there might be an offhand comment that is very fitting for whatever is emerging at that moment. There’s a giving and a receiving that happens during a talk, and the watching of a live presentation with a live audience can actually change the dynamics of a presentation. It’s a subtle effect, but for me there’s a huge difference between the energy of a live performance in front of 1000 people, and something that was recorded in the privacy of my home office.

But what’s the essence of these live elements? Since many or most of the talks at IEEE VR were pre-recorded, my phenomenological experience was that it wasn’t compelling enough to be in VR to watch a pre-recorded talk. For me, there’s a participatory, experiential element when the audience is watching something live. Part of it is having collective focused attention, and some of this can still be replicated within virtual talks with a live audience. It’s almost as if these collectively shared moments also help to cultivate a deeper context for the conference that can help to facilitate deeper conversations later in the gathering. Does it matter whether the talk is live or the audience watches live? I can feel a big difference, and I really prefer the live version when possible (though time zones or language barriers can make a live transmission more difficult).

But the other big aspect of the “liveness” of live talks is being able to directly engage and interact with the speaker through a question and answer session. MacIntyre talked about speaking on a panel at Laval Virtual, and then having people come up to him afterwards to discuss what was talked about. Because IEEE VR was abstracted into 2D livestreams, there was actually a pretty significant barrier to facilitating those embodied interactions after a talk. Some of the IEEE VR sessions encouraged attendees to enter into a Mozilla Hubs room after a session to facilitate a group discussion or an interactive Q&A, which worked as long as there were fewer than 25 people who wanted to do that.

I tried to attend the Q&A after Mozilla Chief R&D Officer Sean White’s closing keynote, but the room was flooded with more than 25 people. There were audio connection issues where people would join but couldn’t hear everyone in the room, and then I had to rejoin and the room had exceeded its capacity and so I couldn’t get back in. Mozilla Hubs was originally architected for private contexts and to only facilitate groups of 25 people or fewer, and so these large-scale conference gathering use cases are going to require some fundamentally new architectures in order to facilitate the gathering of a large number of people into a shared space.

AltSpaceVR solved this very early with the ability to re-broadcast presenters into a series of sharded instances in some of their live performances featuring Reggie Watts, and it’s something that Fortnite can achieve at the scale of tens of millions of people at the same virtual event. But it’s an architecture that still needs to make its way to other social VR experiences like VRChat and potentially eventually Mozilla Hubs if they decide it’s on their product roadmap.
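To make the sharding idea a bit more concrete, here is a minimal Python sketch that splits an audience into capped instances, with the presenter's stream assumed to be rebroadcast into each one. The function name and the fill-in-order policy are hypothetical, and this is not how AltSpaceVR, Fortnite, or Mozilla Hubs actually implement it; the 25-person cap is simply the Hubs room limit described above.

```python
from typing import Dict, List


def assign_shards(attendees: List[str], capacity: int = 25) -> Dict[int, List[str]]:
    """Split an audience into instances of at most `capacity` people.

    Hypothetical policy: fill each shard in arrival order before opening
    the next. The presenter is assumed to be rebroadcast into every shard
    and is not counted against the cap here.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    shards: Dict[int, List[str]] = {}
    for index, name in enumerate(attendees):
        # Integer division gives the shard number for this arrival.
        shards.setdefault(index // capacity, []).append(name)
    return shards


audience = [f"attendee-{i}" for i in range(60)]
shards = assign_shards(audience)
shard_sizes = [len(members) for members in shards.values()]
```

Filling in arrival order is the simplest possible policy. A real system would probably also want to keep friends or research groups together, which is a harder matching problem than this sketch attempts.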

The final point in terms of interaction is that there isn’t currently any way to easily create scriptable or interactive content within Mozilla Hubs. It’s easier with the Mozilla Hubs Cloud instance to be able to add dynamic scripts or customized A-Frame components, but making scriptable and interactive content in Hubs is non-trivial and it’s not intuitively obvious how to bring in levels of interactive content.

There are other social VR experiences like VRChat that allow you to upload an entire Unity scene with their Udon scripting language, or experiences like Rec Room or NeosVR where there’s an entire visual scripting language that facilitates the creation of interactive content. But it’s still very early days for creating interactive social experiences within Hubs, especially due to the many security and safety risks that are introduced with this interactive content. I’d recommend checking out some of what’s happening in VRChat and Rec Room to see some of the frontiers of creating immersive experiences that are focused on games, interaction, and the expression of individual and collective agency.


The IEEE VR organizers wanted to optimize for performance across a broad range of different devices, which meant that the architectural components of the space had to be fairly limited. VRChat has a way to have creators upload different versions of a world that are optimized for the PC and for the Oculus Quest, and then there’s a progressive enhancement of fidelity and interactive components that differ based upon how powerful a device you’re on. So users can share the same virtual social world across multiple devices where the PC user would have a higher fidelity experience than the Quest user, with different versions of the world that are optimized for whatever your system can handle.
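A minimal Python sketch of that kind of progressive enhancement might look like the following: given the variants a creator has uploaded, pick the highest-fidelity one that the visitor's device can handle. The tier names, their ordering, and the asset names are illustrative assumptions, and this is not VRChat's actual selection mechanism.

```python
from typing import Dict

# Hypothetical device tiers, ordered from least to most powerful.
TIERS: Dict[str, int] = {"mobile": 0, "quest": 1, "pc": 2}


def pick_world_variant(device: str, available: Dict[str, str]) -> str:
    """Return the richest uploaded variant that the device tier can run.

    `available` maps a tier name to the asset built for that tier.
    """
    device_tier = TIERS[device]
    # Keep only variants built for this tier or a weaker one.
    candidates = [tier for tier in available if TIERS[tier] <= device_tier]
    if not candidates:
        raise LookupError(f"no variant light enough for {device!r}")
    best = max(candidates, key=lambda tier: TIERS[tier])
    return available[best]


uploaded = {"quest": "world_quest.bundle", "pc": "world_pc.bundle"}
pc_choice = pick_world_variant("pc", uploaded)
quest_choice = pick_world_variant("quest", uploaded)
```

With this policy, a PC user in a world that only has a Quest build still gets in, just at lower fidelity, while a phone user is told up front that no compatible version exists rather than being dropped into something that will not run.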

This type of progressive enhancement isn’t implemented in Mozilla Hubs or WebXR yet, and so it either ends up being a lowest common denominator experience or an experience that is higher fidelity, but accessibility starts to break down if you don’t have a device powerful enough to handle it. I experienced a lot of IEEE VR in the Oculus Quest, and once you start to have 20-25 continuous audio streams within an experience, then the audio starts to get really choppy, frames start to drop, and it becomes more of a fragmented experience.

But there was hardly anyone using VR headsets throughout IEEE VR, and so most of my embodied interactions ended up being with what felt like stilted, disembodied zombie representations of folks who were interacting through a 2D computer interface. I was able to have a number of really high-quality conversations despite this, but I find that I have a much higher bandwidth of communication when everyone is immersed in VR and there are more subtle body language cues available for me to get a richer embodied social experience.

It’s hard to always know how the audio falloff settings are configured, and MacIntyre talked about how there isn’t an intuitive sense for how sound travels in virtual spaces. We have a lot of intuitions about how sound propagates in real life, which actually helps to establish social dynamics in different spaces. Having proper falloff and audio isolation enables clusters of people to organically gather. I appreciate a fairly exponential falloff that allows me to quickly get distance from other people, but yet still hear a slight mumble of other people talking.

The absence of good audio spatialization settings in Virbella during Laval Virtual meant that a lot of the emergent social dynamics were clustered around audio isolation bubbles that were indicated by dotted circles on the ground. These types of audio isolation bubbles in Virbella allowed for a type of dynamic sharding, which allowed up to 1200 people to be in the same social space. But the problem with this approach is that the audio was either too loud or too soft. The closing night parties of Laval Virtual had a soccer field with 100 people in it and no audio spatialization, which sounded like 100 people screaming in your ear at the same time at the same audio level. It was impossible to have a conversation unless you got a good distance away from this huge crowd.

The flip side of being too loud is being too silent. In Virbella, if you were in a room with 40 people who were spread out to different audio isolation bubbles with 100% falloff, then there was a distinct uncanniness of feeling like you were in a completely silent room even though there were dozens of people in it having conversations. I found myself preferring the underlying hum of a cocktail party where you can get a sense of the social presence of other people, but with an exponential falloff where it’s difficult to make out the specifics of what people are saying. I’ve been in many situations when I thought that the underlying hum was too loud and that I thought I’d prefer silence, but after experiencing the silent version I realized that I actually like hearing a slight background hum to give me the sense that I’m present with other people.
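To make the difference between these two falloff styles concrete, here is a minimal Python sketch. The function names, the 1-meter reference distance, the 2-meter halving distance, and the 3-meter bubble radius are all illustrative assumptions; this is a simple half-distance curve and a hard cutoff, not the audio model of Mozilla Hubs, Virbella, or any particular engine.

```python
def exponential_gain(distance: float, ref_distance: float = 1.0,
                     half_distance: float = 2.0) -> float:
    """Volume multiplier that halves every `half_distance` meters.

    Inside `ref_distance` a voice plays at full volume. Beyond it the gain
    decays smoothly and never quite reaches zero, which leaves the faint
    background hum of a cocktail party.
    """
    if distance <= ref_distance:
        return 1.0
    return 0.5 ** ((distance - ref_distance) / half_distance)


def bubble_gain(distance: float, radius: float = 3.0) -> float:
    """Hard cutoff standing in for an isolation bubble with 100% falloff."""
    return 1.0 if distance <= radius else 0.0


# Compare the two models at a few distances (in meters).
distances = [1.0, 3.0, 5.0, 9.0]
exponential = [exponential_gain(d) for d in distances]
bubble = [bubble_gain(d) for d in distances]
```

At 9 meters the exponential curve still leaves a voice at about 6% of full volume, which is roughly the "slight mumble" I'm describing, while the bubble model drops straight to silence the moment you step outside the circle.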

Proper audio spatialization and falloff is probably one of the hardest open problems in social VR spaces, and there are still many different contexts where it works and doesn’t work. Mozilla Hubs uses a more peer-to-peer WebRTC audio stream architecture that can only really handle 20-30 concurrent, spatialized audio streams within the same room. The audio spatialization falloff settings could be tweaked in the version of Mozilla Hubs used during IEEE VR, but this is something that will need a lot of work and further exploration to see how to set the proper falloff settings, find ways to potentially scale up to more people, and design different architectures of spaces that make it clearer how the sound is or isn’t isolated through the entire environment.

Another aspect of embodied presence is the avatars and avatar representations. The range of default avatars in Mozilla Hubs is really quite limited, though there are tools like TryQuilt.io where you can customize the skin of your avatar. The current bar for avatar representation is being set by VRChat, and so all of the other solutions are going to be a subset of the options you have in VRChat for embodied avatar representation.

I appreciated the ability to customize my avatar, but generally only about 5-10% of the people will do any type of customization, and so the default settings of avatars will end up dictating the range of identity expression that’s possible. I generally find this range to be quite limited, but I know that it’ll continue to grow and evolve over time.

Being able to add dynamic lip movements to avatar representations will make a huge difference, as right now the entire robot head bounces up and down to indicate someone is talking. Sometimes some of the more humanoid avatars in the IEEE VR Hubs rooms would have their entire body bounce up and down while talking. I imagine that in the future, we’ll have a sort of early days of social VR nostalgia for these primitive avatar expressions just as we look back on the animated GIFs and Geocities type of web design of the early days of the web.

Another aspect of embodiment is the opportunity for embodied journeys through virtual spaces that allow for embodied collisions. Because most of the locomotion from room to room involved abstracted teleportation that primarily used the 2D interface, there were very few opportunities for embodied collisions. These tended to happen the most at the sponsor areas during specific break times that were announced on Slack, during the poster and demo sessions, or while hanging out within the dedicated social VR spaces. These hallway conversations are a critical part of the conference, and there needs to be more of a deliberate effort to architect spaces that facilitate these embodied interactions, but also time within the schedule where people are expected to have an embodied virtual presence.

The 1:1 translation of the schedule meant that the one-hour breaks were too short. Most people were attending to their lives, email, bio breaks, and resting from Zoom talks, and so there wasn’t a proper rhythm for people to come into the VR portion of the conference. The poster sessions were during lunch and only an hour or two long, but again this was following a long block of 2-3 hours of talks. I think there needs to be longer times reflected within the schedule that alternate between the talks and different social events in VR that really use the affordances of embodied conversations, whether that’s Birds of a Feather discussions, the poster sessions, the sponsor areas, or architectural spaces that are more reflective of contexts and topics of interest for that community.

The sponsor areas ended up doubling as social hang-out spaces, but many of these spaces were not designed for social gatherings. How can the sponsor area create content that’s endemically interesting on its own to experience within VR, while at the same time facilitating emergent conversations? This requires a fundamental understanding of the architecture of emergent conversations, and how to do some deeper experiential design that uses the spatial affordances of the medium to communicate in a way that’s deeper than text, photos, or video. There’s a long way to go to accomplish this, but the Virtual Market in VRChat starts to really explore some of these virtual expo dynamics.

The final point that I’d make about embodied presence for now is the relationship between the underlying architecture of a space, and the deeper context that this can help to set. The architecture of most conferences and gatherings is completely generic, and structured around time. Conference rooms are empty boxes that are valuable because they gather people together at a specific time to talk about specific topics. But how can the architecture of a space better serve as being able to set a deeper context across a broad range of different topics?

Rather than being structured around time, what if the conference was structured around space? It’d be more like an interactive theme park, where you’d go to the 3DUI portion of the park to play with the latest 3DUI innovations. Perhaps the social VR portion would have a variety of different social VR experiments to try out. Maybe the collaboration portion would facilitate people actually collaborating on some topic. So what would a theme park or immersive theater version of IEEE VR look like? How could the architecture of that version of a space start to facilitate a deeper context for more meaningful interactions? How do you translate abstract ideas into embodied architectures?

With VR, you have the capability to be a lot more like Burning Man, which creates specific art and architecture that serves the needs of that specific gathering. So dynamic and emergent structures can be built to help to focus attention and intention to help answer the emerging questions and solve the pressing problems.


Emotional presence is the hardest thing to cultivate in the absence of a physical, embodied presence. These are intimate moments of being vulnerable with each other, but also cultivating deeper relationships. MacIntyre talks about the late-night karaoke ventures, or mundane conversations that start by getting coffee, but that can lead to meaningful emotional exchange and commitment that could lead to future collaboration or deeper friendships.

A lot of the cultivation of emotional intimacy has to do with communicating to others about your shared experiences of something. The most common shared experience is the tools and your experiences with the virtual world itself. There are many self-referential conversations that are all about how the conversations and experiences are impacted by that particular platform. I had a lot of conversations about Mozilla Hubs at IEEE VR, and a ton of conversations about VirBELA at Laval Virtual.

This seems to be like the equivalent of talking about the weather with strangers when you travel to a place. It seems to serve as a way of accessing the simplest, surface conditions of an environment that are impacting your emotional experience. It’s joyful when things are working well, or it can be cathartic to share your frustrations when the technological limitations are preventing deeper emotional engagement.

These types of interactions often take place in the in-between, threshold spaces of the hallways. But there’s usually little to no consideration for how to cultivate these interstitial contexts in virtual conferences to facilitate these types of intimate interactions.

The opening night parties or gatherings at the end of the day at conferences often serve this function, but it’s difficult for people to fully commit entire days for multiple days in order to facilitate these types of interactions, especially after a long day of consuming talks and suffering from Zoom fatigue.

Working from home during IEEE VR was a relatively new dynamic for many people, who were exploring remote work for the first time as the shelter-at-home quarantines were just starting to begin. Co-located conferences get people to travel to a new location, which serves to enable a co-located, embodied presence where it’s easy to give your full attention. There are always meetings and virtual distractions that are impossible for some to avoid, but conferences served to gather people in a shared environment where they more or less have their full and complete attention available. People also usually have a certain degree of openness to the possibility of emergent and unplanned interactions. How should virtual conferences be structured in order to really optimize for having someone’s complete attention?

Cultivating a context that rewards your full presence and emotional investment with meaningful interactions that cultivate emotional intimacy is something that takes a lot of deliberate design and a broader desire from a community who wants that. This is usually an unintended side effect that organically emerges from the investment and commitment that happens with traveling to a location. But these emergent dynamics are also the most difficult to design for and re-create virtually.

The social tools and friending tools within VRChat seem to do the best job at being able to cultivate a broad range of private contexts that facilitate increasing levels of intimacy. When creating private instances, there are rules that dictate how the social dynamics will unfold in that space. There are Public spaces, hidden Friends+ spaces where friends can bring friends, friends-only spaces limited to friends of the room owner, private Invite+ rooms that require a specific invitation from someone at the party, and private instances where the owner of the room controls who is there. These facilitate a broad range of ways that emergent collisions can happen in virtual spaces. They also enable the ability to drop portals from room to room, and to create different contexts that assume different levels of intimate connection with differing abilities to join different rooms.

As an example, the Friends+ dynamic is the equivalent of being at a conference party and joining a friend who is immersed in a conversation with people you don’t know, but they invite you into the conversation and you get introduced to a lot of new people. These types of emergent dynamics can be formalized within the rules of a private instance in VRChat, but it’s more difficult for them to emerge within the social dynamics of spaces like Hubs. There is the ability to have private rooms that aren’t publicly listed, and then friends can invite other friends. But once the URL is out, then it’s out. It’s functionally like the Friends+ rooms of VRChat, but with limited ability to make a true invite-only room or friends-only room where you’d have to be directly connected to the room owner. The ability to navigate between the architecture of these room privacy settings in VRChat cultivates differing levels of intimate contexts for people to connect to each other.
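To make these join rules a bit more concrete, here is a minimal Python sketch of how I understand the different instance types. It is only a toy model of the description above, not VRChat’s actual API or implementation: the names `InstanceType`, `can_join`, and `are_friends` are hypothetical, friendships are just a set of unordered pairs, and the room owner is assumed to be one of the people present.

```python
from enum import Enum


class InstanceType(Enum):
    PUBLIC = "Public"
    FRIENDS_PLUS = "Friends+"
    FRIENDS = "Friends"
    INVITE_PLUS = "Invite+"
    INVITE = "Invite"


def are_friends(friendships, a, b):
    """Friendships are stored as a set of unordered pairs (frozensets)."""
    return frozenset((a, b)) in friendships


def can_join(kind, user, owner, present, friendships, invited_by=None):
    """Toy join rule for each instance type (an assumption, not VRChat's API)."""
    if kind is InstanceType.PUBLIC:
        return True  # anyone can drop in
    if kind is InstanceType.FRIENDS_PLUS:
        # friends can bring friends: being a friend of anyone present is enough
        return any(are_friends(friendships, user, p) for p in present)
    if kind is InstanceType.FRIENDS:
        # you need a direct connection to the room owner
        return are_friends(friendships, user, owner)
    if kind is InstanceType.INVITE_PLUS:
        # an invitation from anyone already at the party is enough
        return invited_by in present
    # Invite: only the owner controls who is there
    return invited_by == owner


# Hypothetical party: alice owns the room, bob is already there,
# and carol only knows bob.
friendships = {frozenset(("alice", "bob")), frozenset(("bob", "carol"))}
present = {"alice", "bob"}

for kind in InstanceType:
    allowed = can_join(kind, "carol", "alice", present, friendships, invited_by="bob")
    print(f"{kind.value}: {allowed}")
```

In this toy model, Carol gets into the Friends+ and Invite+ rooms through Bob, but not into the Friends or Invite rooms, since she has no direct connection to Alice. A Hubs room behaves more like a sixth case where the only check is whether you have the URL.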

I see a number of virtual gatherings choosing different social VR platforms for different contexts. So maybe the conference portion is in Mozilla Hubs, but there’s an after party in AltSpace or VRChat or TheWave or SomniumSpace. The Education Summit did this for their after parties. But this fragments user identities and friends networks, which don’t transfer between platforms, and it also diffuses the private and professional context into worlds that are more of a public context. So it’d be like going out for a night on the town in Las Vegas with a small group of friends, rather than going to a specific conference party where there will be the opportunity to make new connections.

I tend to prefer maintaining the professional context of a conference after party, as it quickly gets fragmented into public spaces, but it’s also really difficult to create persistent URLs that easily allow anyone who has the URL to join. It often requires a backchannel conversation of friending people on the right program, and then receiving a real-time invite. Or the loading screens break up the real-time communications channels, which requires something like PlutoVR or Discord audio chat or some other live audio channel outside of VR completely. It’s still very fragmented, and difficult to go from platform to platform and still stay connected.

Conferences like GDC or SXSW have a ton of parties happening after the conference in big fancy locations, while academic conferences tend to use the default hotel party or lobby spaces. But I imagine that the future afterparty scene in VR will start to pull in popular underground locations on a variety of different social VR platforms, and perhaps getting after party invitations will start to feel like a whole game of its own, like at Sundance or SXSW.

The other aspect of immersion into a space is the time zone that you’re in. If you have to timeshift your native time zone, then this will limit your ability to be fully emotionally present for the full range of events. Traveling to a physical location forces people to adjust to that time zone, and it can be really disruptive to fully live into another time zone while you’re still at home.

During Laval Virtual, I ended up shifting my schedule to invoke a sort of virtual jet lag. I stayed up all night the first day, and then slept during the late afternoon so that I could start getting up at 1 a.m. Pacific time in order to make the 10 a.m. CEST start in the European time zone. I felt like I was traveling to Laval, France, and I was 100% dedicated to providing my full attention. But that’s a huge commitment that doesn’t always work between different time zones, and so the concurrency dilemma between multiple time zones is a huge open problem. How can you schedule a conference in order to optimize the critical mass of attendees across multiple time zones?
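One rough way to think about this concurrency dilemma is to count how many attendees would be awake at each possible start hour. The Python sketch below does that with plain UTC offsets. Everything in it is an assumption for illustration: the function names are hypothetical, the attendee mix is made up, the "awake" window of 8 a.m. to 11 p.m. local time is arbitrary, and it ignores daylight saving changes and half-hour offsets.

```python
def awake(utc_hour, utc_offset, wake=8, sleep=23):
    """True if utc_hour falls within [wake, sleep) in the attendee's local time."""
    local_hour = (utc_hour + utc_offset) % 24
    return wake <= local_hour < sleep


def best_start_hours(offsets, wake=8, sleep=23):
    """Return (max attendees awake, list of UTC hours that reach that maximum)."""
    counts = {
        hour: sum(awake(hour, offset, wake, sleep) for offset in offsets)
        for hour in range(24)
    }
    best = max(counts.values())
    return best, [hour for hour in range(24) if counts[hour] == best]


# Hypothetical attendee mix by UTC offset:
# two on Pacific daylight time (-7), three on Eastern daylight time (-4),
# two on CEST (+2), and one on Japan time (+9).
attendees = [-7, -7, -4, -4, -4, 2, 2, 9]

best, hours = best_start_hours(attendees)
print(f"At most {best} of {len(attendees)} attendees are awake, at UTC hours {hours}")

# A 10 a.m. CEST start is 08:00 UTC, which is 1 a.m. for a Pacific attendee.
print(awake(8, 2), awake(8, -7))
```

With this made-up mix there is no single hour where everyone is awake, which is the dilemma in miniature: someone always ends up with the virtual jet lag, and the real question is how to rotate or split sessions so it isn’t always the same people.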

The fact that it was free and available to more people increases the range of diversity and inclusion that’s possible. There is still a lot of outreach to be done, as well as more consideration for how to create architectures that allow for a broader diversity of participation and inclusion. This not only applies to race, class, and gender, but also to degrees of introversion and extroversion as well as the amount of power and privilege that people enjoy.

MacIntyre talks about how part of the intention of gathering at conferences like IEEE VR is to promote the “upcoming stars” in the academic research field, but the amount of resources required to travel to conferences limited attendance to established universities and to people who had enough time and money available to travel. Now that virtual conferences are opening up to a more inclusive and diverse range of participants, what other architectural and cultural dynamics need to be implemented to fully realize this vision of opening up these resources and social dynamics to a broader range of people?

If it becomes difficult to dedicate your full attention to a virtual conference over the course of 2-5 days, then why not have more frequent gatherings for some of these communities? MacIntyre suggested there could be IEEE VR Mondays that facilitate more focused groups either weekly or once a month. If you distribute that focused attention across the year, then it creates a dynamic where people could commit their full attention for 1-2 hours at a time over a much longer period.

There’s still the forcing function of a deadline when the publication of the IEEE VR Proceedings happens, which serves the peer review academic function of creating archival quality material that advances the knowledge of a community. What Johnsen said is that the conference serves as a re-calibration opportunity to tune into the deeper trends of what’s working and what’s not working so that he could change the direction of future research for the upcoming year. This tuning into the zeitgeist happens for each of these research fields and requires a broad range and diversity of talks to help dictate what these trends are for the overall community. It’s not a rational deduction so much as an emotional, intuitive gut reaction based upon the deeper patterns and trends that you’re feeling are emerging in your field.

For me, this process emerges out of the series of serendipitous collisions I have, where I’m able to see these repeating patterns and themes. So attending conferences like IEEE VR helps me to determine some of these emerging themes of where things are going. Right now, a hot topic for me and the rest of the VR community is how VR can help with the art of gathering together. That includes filling in the gaps when the abstracted 2D Zoom interfaces aren’t satisfactory, understanding how embodied representations in virtual spaces help to facilitate the organic clusters of conversation that happen when meeting in physical spaces, and figuring out how to architect for and cultivate the serendipitous collisions that match people’s deeper intentions and help them find potential collaborators or answers to their theoretical questions.

I’ll be continuing to talk with different social VR programs and event architects who are exploring these questions. If you want to see some of my embedded reporting with more detailed thoughts and reflections, then be sure to check out my epic 100+ tweet IEEE VR Twitter Thread and Laval Virtual Twitter Thread.


The direct one-to-one translation of the IEEE VR schedule was really informative for seeing what works and what doesn’t work. There will continue to be different experiments and iterations of virtual conferences in order to find what combination of tradeoffs is required to get down to the essence of the art of gathering.

  • Why do we gather?
  • What’s the worth of coming together?
  • How do we best share our work and connect with each other?
  • How can we formalize our knowledge in rigorous ways?
  • How do we facilitate new connections and foster future collaborations?
  • How can we form a community to help each other solve our problems?
  • How can we facilitate mentorship and deeper professional guidance?
  • And how can we explore how embodiment and virtual spaces can facilitate the types of emergent conversations that are difficult to orchestrate through the lens of 2D Zoom technologies?

These are vital open questions for not only the virtual reality research community, but really everyone who is trying to figure out how to restore the glimmers of what it means to share space with each other in an age of physical distancing. There are some valuable lessons from IEEE VR, but we’re going to need many more iterations and experiments, both at the level of conferences and at the level of architectural experiments that explore different dynamics. I’m looking forward to continuing to explore the latest experiments in social gathering, and to exploring the future of virtual conferences in an ongoing series with platform providers and conference curators.

More podcasts and coverage coming soon. If you’ve read this far and would like to continue the conversation about virtual conferences, then feel free to send me a note at kent@kentbye.com using a subject that includes “Virtual Conferences.”


This is a listener-supported podcast through the Voices of VR Patreon.

Music: Fatality