#192: Bernd Froehlich on Information Visualization & Multi-User CAVEs

I catch up with Bernd Froehlich at IEEE VR 2015 to talk about Information Visualization in VR & Multi-User CAVE systems.

Bernd Froehlich is a full professor with the Media Faculty at Bauhaus-Universität Weimar. His research interests include real-time rendering, visualization, 2D and 3D interfaces, multiviewer display technology, and support for tight collaboration in colocated and distributed virtual environments. Froehlich has a PhD in computer science from the Technical University of Braunschweig. He is a cofounder of the IEEE Symposium on 3D User Interfaces and received the 2008 Virtual Reality Technical Achievement Award.

Become a Patron! Support the Voices of VR Podcast on Patreon.

Theme music: “Fatality” by Tigoolio

Subscribe to the Voices of VR podcast.

Rough Transcript

[00:00:05.452] Kent Bye: The Voices of VR Podcast.

[00:00:12.039] Bernd Froehlich: My name is Bernd Fröhlich. I'm a professor at the Bauhaus-Universität Weimar in the media informatics department, where I hold the chair for virtual reality and visualization research. I've been in this field for almost 20 years. I started in 1993 at the GMD in Bonn, working on a 3D tabletop display called the Responsive Workbench, and since that time I've been a great fan of virtual reality and virtual environments. Then I moved on to Stanford as a research associate for two years, in Pat Hanrahan's group, working on two-handed interaction, two-user displays, and so on. In 2001 I got the professorship at Weimar, and I've been there since.

My group in Weimar focuses on two areas: visualization and graphics research, and virtual reality and 3D user interfaces. In the visualization area, we have two different departments. One works on information visualization, so visualization of graphs and large data sets in the sense of abstract data. The other deals with large models: large seismic data, for example, large volumes, large images, large scanning data, large triangle models, and so on. We integrate support for all these large data types into our rendering framework, which is called Guacamole. That's the basic scene graph and deferred rendering framework we use, and on top of that we have a VR framework called Avango that we use to develop our applications.

So we've developed the basic graphics and visualization algorithms to display advanced content in virtual reality, but we also do the VR research, and here we focus on collaboration in virtual environments. Support for collaboration depends on a few things. One is actually providing a shared space for multiple users in virtual reality. In our setups, this shared space is provided by giving multiple people individual stereoscopic views of a virtual environment. A regular virtual environment typically supports only two different images, a left-eye image and a right-eye image projected on a screen, so only one person can be given a stereoscopically and perspectively correct view; the others perceive distorted images. In our systems we can provide up to six users with individual stereoscopic images, so each of them can walk around in front of a display and see the virtual content from their own perspective. That way they can actually point at features and talk about features using their bare hands, trace features with their hands, and so on. That's not possible in regular virtual environments. So that's the basic support for providing a shared space for multiple users.

But we go even further, because we also want to include remote participants, people who are not at the same location, and enable a shared space between two groups of people at different locations. We do that by capturing the people in front of a display in 3D, using Kinect-style cameras that capture not only a picture of these people but their actual 3D geometry. Their 3D geometry and, of course, their appearance are transmitted over the internet to the remote location, where their 3D appearance is reconstructed, and you can see a 3D avatar of that person that looks like the real person.
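To make the six-user setup concrete, here is a minimal sketch of the core rendering step: each tracked eye gets its own time slot and its own head-tracked off-axis (asymmetric) frustum relative to the fixed screen. The screen dimensions, eye positions, and function names below are invented for illustration; this is not code from the Weimar system.

```python
import numpy as np

# Hypothetical projection wall: 4 m x 2.5 m, centered at the origin in the
# z = 0 plane; viewers stand at z > 0 looking toward -z.  Units are meters.
SCREEN_W, SCREEN_H = 4.0, 2.5
NEAR = 0.1

def off_axis_frustum(eye):
    """Asymmetric frustum bounds (left, right, bottom, top at the near plane)
    for a head-tracked eye position relative to the fixed screen."""
    ex, ey, ez = eye            # ez is the eye's distance from the screen plane
    s = NEAR / ez               # project the screen edges onto the near plane
    return ((-SCREEN_W / 2 - ex) * s, ( SCREEN_W / 2 - ex) * s,
            (-SCREEN_H / 2 - ey) * s, ( SCREEN_H / 2 - ey) * s)

# Six tracked users -> 12 views per display refresh cycle (two eyes each).
IPD = 0.065                     # assumed interpupillary distance, meters
heads = [np.array([x, 1.7, 2.0]) for x in np.linspace(-1.5, 1.5, 6)]

for slot, head in enumerate(heads):
    for eye_name, dx in (("L", -IPD / 2), ("R", +IPD / 2)):
        l, r, b, t = off_axis_frustum(head + np.array([dx, 0.0, 0.0]))
        print(f"user {slot} eye {eye_name}: l={l:+.4f} r={r:+.4f} "
              f"b={b:+.4f} t={t:+.4f}")
```

Each of the twelve frusta would then be rendered into its own time slot (or slot-plus-polarization channel, as described later in the interview), so every viewer sees a perspectively correct image.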
This avatar is displayed in 3D, such that the virtual person can extend their hand out of the screen and virtually shake your hand, even though that person is at a completely different site, a different location. So now we have a shared space among co-located people who can point at virtual objects, trace features, and talk about virtual objects. But they can also do that with the remote people, who see where the local people are pointing in the virtual environment. So we provide this shared space.

To support multiple people in a virtual environment, you need even more facilities, because they also need to work on the data and really collaborate, not just point. They need input devices, and they need spaces for personal work. We now have six people on one side, and maybe six people on the other side, and they don't want to talk to each other all the time. Subgroups might form, or people might individually prepare something to show to the others later. So you need techniques that give them a private territory where they can also work. Then they need a storage space where they can keep intermediate models they are preparing, and so on. And there needs to be the shared space provided by the multi-user virtual reality system and by the telepresence system.

We developed a technique that supports this, which we call photo portals. With photo portals, you have a camera in your hand and you can take a picture of the virtual environment. But it's not just a picture; it's actually a portal to the location you took the picture of. You can look into this picture as if you're looking through a window into the place where you took it. Later on you can collect these pictures and revisit them. Or you can change something in this window, this picture frame basically, prepare something for the others, then show it to them, or even enter this recorded place later on.

And the interesting part is that you can not only record a picture, you can also record the actions of the users with this virtual video camera. For example, if some remote person is showing you something in the virtual environment and explaining it to you, you can actually record that and play it back. When you play it back, you see the virtual person, the representation of that real person, doing what they were doing in the past. And that same person can watch what they were doing at the time the video was recorded. So it's kind of like time travel: you can observe yourself doing what you did in the past. Usually you can only do that with video; now you can do it in a 3D representation, in a 3D world. You can walk around yourself while you are acting out something in the past. So it's a very, very cool environment.
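Stripped down to a data structure, a photo portal is roughly a stored viewpoint plus an optional recording of tracked user poses for later replay. The sketch below is a guess at the shape of such a structure; none of these names come from the actual Avango/Guacamole code.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Pose:
    position: tuple       # (x, y, z) in world coordinates
    orientation: tuple    # quaternion (x, y, z, w)

@dataclass
class PhotoPortal:
    """A 'picture' that is really a window: a frozen viewpoint into the
    scene, plus an optional clip of tracked user poses for replay."""
    view: Pose                                       # where the shot was taken
    taken_at: float = field(default_factory=time.time)
    clip: list = field(default_factory=list)         # [(t, user_id, Pose)]

    def record(self, user_id, pose):
        """Append one tracked pose, timestamped relative to the snapshot."""
        self.clip.append((time.time() - self.taken_at, user_id, pose))

    def replay_until(self, t):
        """Poses up to time t, so avatars can re-enact the recorded actions
        while live users, including the recorded person, walk around them."""
        return [(uid, p) for ts, uid, p in self.clip if ts <= t]

# Taking a 'picture': freeze the hand-held virtual camera's current pose.
portal = PhotoPortal(view=Pose((0.0, 1.6, 3.0), (0.0, 0.0, 0.0, 1.0)))
portal.record("remote_user_1", Pose((1.0, 1.7, 2.5), (0.0, 0.0, 0.0, 1.0)))
print(len(portal.replay_until(60.0)), "pose(s) to replay")
```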
What we also do is develop the input devices and interaction techniques so that people can have complementary input, which is another feature we consider important. If everybody in this environment, say 12 people, 6 local and 6 remote, had a pointing ray, a laser-pointer kind of pointer, you would have 12 laser pointers in the environment, and that would not work at all. So typically one person has this video camera, the photo portal interface as we call it. Then maybe one person has a pointing device, and then we have a central device that is used to navigate through the environment. It's a stationary device that we call a Spheron. The Spheron is a large trackball on top of a base, like a pole basically. When you rotate this large sphere in any direction, it makes you look in that direction. And there's a handle on the pole: pull it towards you and you move forward, push it away and you move backward, for example. So you can actually navigate through the environment.

But there's only one such device, and it's purposely designed to be large so that everybody can see the actions of the person who is steering. So you can very easily observe what is happening. If you had a small controller in your hand and navigated the other people through the environment, they would not know what's going to happen. Here they can see what you're doing, that you're rotating the big sphere to change the view, for example, so all of this is observable.

When two sites are connected, each site has one of these devices. When you align the devices on top of each other, people stand face to face across from each other in the virtual environment. But the devices can also be decoupled, so one party moves in front of you and you see their avatar representation. You can follow them through an environment while they show you something, and then you stop somewhere, they turn around, and you can watch each other and discuss, for example, a certain building in the city or whatever you're visiting. Yeah, so much for our work, maybe.
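As a rough illustration of the Spheron's mapping (the device is custom hardware, so the axes and gains here are entirely invented): rotating the ball turns the shared view, and pulling or pushing the handle translates it along the viewing direction.

```python
import numpy as np

def spheron_step(pos, yaw, ball_dx, handle, dt, turn_gain=1.0, speed=2.0):
    """One navigation update for a shared, single-driver device.

    ball_dx : horizontal trackball rotation since the last frame (radians);
              a full version would also map the vertical ball axis to pitch.
    handle  : handle deflection in [-1, 1]; pulling (+) moves the group
              forward along the view direction, pushing (-) moves it back.
    """
    yaw += turn_gain * ball_dx                      # ball steers the view
    forward = np.array([np.sin(yaw), 0.0, -np.cos(yaw)])
    pos = pos + speed * handle * dt * forward       # handle translates it
    return pos, yaw

# Example: one sixtieth of a second of steering by the single driver.
pos, yaw = np.zeros(3), 0.0
pos, yaw = spheron_step(pos, yaw, ball_dx=0.05, handle=0.8, dt=1 / 60)
print(pos, yaw)
```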

[00:09:33.565] Kent Bye: Wow. The thing that comes to mind is that I was talking to one of your students earlier, and he was saying that this CAVE environment runs at a frequency of around 720 hertz in order to give six people with shutter glasses all these different perspectives within the CAVE, which usually supports only one, but you're doing it in a way that allows up to six. And that brings up a couple of questions. One is, what type of insights have you been able to come up with now that you have that sort of multi-user environment in virtual reality? And the second part is, what are still the big open problems when it comes to collaboration in virtual environments?

[00:10:06.329] Bernd Froehlich: Yes, so the system that we have actually uses 360 hertz, which has only six time slots, and that alone would give only six different views, because you need at least 60 hertz per eye. So what we do is add a second projector, or stack of projectors, that uses polarization to separate left and right eyes, and then we actually have 12 different views on a single display. That gives us the six users who are provided with active stereo images.

So, what have we observed? It's very evident that everybody who has seen such a system, one that provides multiple people with individual stereoscopic views, doesn't want to go back to a single-user system. Never ever. Because when you go back, all except one user will have a distorted view. You tolerated that before because you didn't know a better system exists, but once you have seen how such a multi-user system works, you cannot go back. That's a clear observation. People very naturally walk around, trace features with their fingers, point at things and talk about them. It's a very intensive discussion, and it's no longer just a demo; it's really a place where a group of people can work together.

The open problems are really about the tools to facilitate group collaboration, among the local people and among groups of people at different locations. We developed some tools, like these photo portals, and these are a first step. But I think many more such tools are needed for different purposes. In the car industry, for example, people standing around a virtual car model might need other tools to interact with the system, like physics-based interaction, which is not easy to synchronize across the internet if you're in two or more different locations.

Then the other problem is certainly scaling beyond two groups collaborating across distance. That's one area of scaling: having three parties, four parties, five parties. I don't think we need 1,000 parties or groups collaborating; it's usually two, three, four sites that want to collaborate. But even then, you can no longer be directly face to face in a meaningful way. You'd probably be around a larger virtual table or something like that.

The other direction is scaling towards more than six users. There are some technical issues there, of course, and there are different approaches. There are so-called computational displays, light field displays, that might enable this in the future, so that you have a 3D image that is glasses-free for an arbitrary number of users. But these displays are not yet available, and I haven't seen one larger than maybe 55 inches or so. With the projection-based technology we are using, you could actually double the number of users by making the time slots shorter. That's possible in principle. If you run at 720 hertz, you could have up to 12 users. You could even halve the slots again, at 1,440 hertz, for up to 24 users. In principle that's possible; it's just a matter of how much light you have available. If you want to run this with a single projector, 12 users from a single projector means 20,000 lumens divided by 12, and that's the number of lumens per user. So at some point you won't have enough brightness.
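The time-slot and brightness arithmetic quoted above works out as follows; only the 20,000-lumen projector figure and the 60 Hz-per-eye minimum come from the interview, the rest is derived.

```python
PER_EYE_HZ = 60            # minimum refresh rate per eye (from the interview)
PROJECTOR_LUMENS = 20_000  # single-projector brightness quoted above

for shutter_hz in (360, 720, 1440):
    slots = shutter_hz // PER_EYE_HZ  # sequential time slots via active shutter
    views = slots * 2                 # polarization doubles the views per slot
    users = views // 2                # two eyes per user
    # Interview's simplification: one projector's light shared by all users.
    print(f"{shutter_hz:4d} Hz: {slots:2d} slots x 2 (polarization) = "
          f"{views:2d} views = {users:2d} users, "
          f"~{PROJECTOR_LUMENS / users:,.0f} lm per user")
```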
But in principle you could stack a number of projectors to get more brightness. That doesn't seem ideal, though, needing 60,000 lumens to support 12 or 24 people; I'm not sure. On the other hand, I'm also not sure how often you really need a group of more than 10 or 12 people working together on something. Only rarely are there decision-making meetings, for example in a car manufacturing company, where more than 10 to 15 people are involved. So maybe that's not such a huge problem.

The other interesting part is head-mounted displays, which are now getting better and better. I think collaboration in head-mounted displays is also a direction that will happen, and it is not yet well developed. That will also be a direction we follow: not only collaborating via a projection-based display, but also via head-mounted displays, and maybe even mixed environments where on one side you have three or four people wearing head-mounted displays collaborating with a group of people in front of a projection-based display.

[00:14:58.907] Kent Bye: Cool. In terms of information visualization, I've talked to Oliver Kreylos, who does a lot of data visualization connected to a 3D geometric space, looking at earthquake data that's tied to a 3D location with inherent depth. And then there's information visualization that's not connected to a 3D space, where you're using the depth dimension to get additional insights. So I'm curious, since you've been researching this, what types of data sets lend themselves to getting that much more insight, and what kinds of industry applications have you seen using VR for information visualization in an abstract way?

[00:15:35.061] Bernd Froehlich: I haven't seen any industrial applications that use information visualization, information visualization really in the sense of abstract data, like product data or something that does not have any geographical attributes, right? I have not seen any information visualization application in VR so far, especially not in the industrial sector.

First of all, I think if you consider that people will meet in virtual environments in the future, and that will of course happen if you think about social networks where people already connect, in the future they will be able to connect wearing a head-mounted display and meet each other in a virtual environment. Then I think people will also want visualizations of information, including abstract information, available in the virtual environment. That does not necessarily mean the information is displayed as a 3D information visualization. It can be a flat visualization, but it will be embedded into the virtual environment in a convenient way so that you can intuitively interact with it.

But when does the third dimension make sense, right? I have a few ideas in this direction. There's a technique in information visualization called focus and context. A focus-and-context display means that some things in your data are in the focus, but all the other data is shown as well, at some reduced level of detail, so that the focus is always shown within its context. That's why it's called focus and context. And if you think about it, perspective is a natural focus-and-context display. At some point, for example, we had a display that was a ring, and that ring was slightly tilted in depth, so you could read and see in detail what was at the front of the ring, while at the back of the ring the perspective made things smaller and condensed, in some sense only hinting at the information that was there. That was perspective as a natural focus-and-context display. So if you need a focus-and-context display where you can use that natural perspective, it may make sense to make it available in a virtual reality application, because there you naturally have perspective involved.

And then there are also some displays that naturally lend themselves to a 3D interpretation, I would say. I can give you another example: so-called parallel coordinates. Parallel coordinates displays are displays where you have many data items, and each data item has multiple attributes. For example, a notebook has multiple attributes: it has a processor from a certain company, a processor of a certain type, a certain amount of main memory, a certain size of disk, a certain size of screen, and all these are attributes of a notebook. If you have many notebooks, you can display them by having a vertical axis for each attribute, and when you draw a notebook, you basically connect the attribute values for that particular notebook across the axes. So one line crossing all the axes, which are manufacturer of the CPU, type of the CPU, memory size, and so on, represents one notebook. And then you can have multiple of these lines representing multiple notebooks. That's a parallel coordinates display, and it's a 2D display. But if you think about it, some attributes are time-dependent. For example, the price changes over time.
Such an attribute has a natural additional third dimension, and that's the depth behind the display. It's natural to move in time by slicing forward and backwards through this plot. So you have a 2D display, and time is often a natural third coordinate that you can make use of. And then again it makes sense to use such a technique in a virtual environment, because it's very easy to look at it from the top or from the side. When you look along this third axis from the side, you see a regular time-series plot of the price for all the notebooks, for example, that are in your display. So whenever you have a natural third coordinate, it totally makes sense. When you have a focus-and-context display, it makes sense. And there are probably a few other cases where you could make good use of the third dimension.
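As a toy version of the notebook example, here is a conventional 2D parallel-coordinates plot; the data is invented, and the time-as-depth extension discussed above is only noted in the title rather than implemented.

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented notebooks: one row per item, one column per attribute.
attrs = ["CPU vendor", "CPU type", "RAM (GB)", "Disk (GB)", "Screen (in)", "Price"]
data = np.array([
    [0, 2, 16,  512, 14.0,  999],
    [1, 1,  8,  256, 13.3,  799],
    [0, 3, 32, 1024, 16.0, 1899],
    [1, 0,  8,  512, 15.6,  649],
], dtype=float)

# Normalize each attribute axis to [0, 1] so the axes are comparable.
lo, hi = data.min(axis=0), data.max(axis=0)
norm = (data - lo) / np.where(hi > lo, hi - lo, 1.0)

xs = np.arange(len(attrs))
for row in norm:
    plt.plot(xs, row, marker="o")   # one polyline = one notebook

plt.xticks(xs, attrs, rotation=20)
plt.title("Parallel coordinates (2D); a time-varying attribute like price\n"
          "could extend into a depth axis in a VR version")
plt.tight_layout()
plt.show()
```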

[00:20:21.380] Kent Bye: And finally, you've been involved with virtual reality for over 20 years, so you clearly see something compelling that motivates you to stay involved with it. And we seem to be on the cusp of crossing the chasm into a big consumer VR revolution, perhaps. So I'm curious, from your own perspective, what you see as the ultimate potential of what virtual reality may be able to enable.

[00:20:43.610] Bernd Froehlich: Yeah, I think for me it's quite obvious that the next wave is social networks, social platforms in a virtual environment, where people really meet each other in the virtual environment with their real body representations, like real-time-scanned 3D body representations. You can sit across from somebody you might know or not know and see their real body rendered, at continuously improving quality. I think there's such an amazing presence of that other person when a real-time-scanned version of them is rendered. So I am very, very convinced that this kind of telepresence, tele-immersion, will really be the next wave. It's not going to be only games. It's really involving people and bringing people into the virtual world that pushes things forward. Not only in applications where it's about jobs and work, but also for leisure and social events where you just want to meet. There you want to see the body and the behavior and the facial expressions of other people. And I'm convinced that's going to be a big step forward, a real wave, and actually a hype for VR at some point.

[00:22:07.365] Kent Bye: Okay, great. Well, thank you so much. Thank you. And thank you for listening. If you'd like to support the Voices of VR podcast, then please consider becoming a patron at patreon.com slash Voices of VR.
