Collaboration in Tele-Immersive Environments

Email: m.slater@cs.ucl.ac.uk

Tel: 020 7679 3709

Fax: 020 7387 1397

 

Mel Slater, Anthony Steed, Yiorgos Chrysanthou, Jon Crowcroft

Department of Computer Science   

Dave Chapman

Department of Geomatic Engineering

University College London
Gower Street
London WC1E 6BT

 

Mary C. Whitton

Prof. Frederick P. Brooks Jr.

Department of Computer Science
University of North Carolina

 

Mandayam A. Srinivasan

Sensory Communications Group

Research Laboratory of Electronics

Massachusetts Institute of Technology

 

Executive Summary

This project will allow people at remote sites to be in the same 3D virtual space together, reach out and touch one another, and feel the force of that contact. The objective is to understand the extent to which this is possible in highly immersive, highly collaborative applications within virtual environments. This will be studied through three types of application, embodied in three experiments: first, a safety scenario in which collaborators need to understand the spatial characteristics of a large industrial setting and the behavioural consequences for people who live or work within it; second, an application where designers meet to consider and interact with clothing designs; finally, a person-to-person physical contact scenario where people jointly carry out a simple building task that involves haptic, specifically force-feedback, contact between them. Throughout, we will study the emergent social behaviour and task performance in these applications. The benefits for the community will be an extended DIVE software system for Internet2, and the results and knowledge gained from the experiments. The experiments will be documented through reports, the web, and documentary video.

 

This image shows the virtual characters (avatars) used to represent people in a current Wellcome-funded project in the Department of Computer Science at UCL. These will be adapted for use in this project.


Introduction

It is 5:55pm in the UK and you are planning to attend a meeting at 10:00am the same day in Los Angeles. You are responsible for creating the computer graphics scenario for a new movie and there is to be a conference with the director and producer, who are based at Universal Studios. You could have taken the 11-hour flight to LA, spent a day or two adjusting to the time change, paid a great deal for the hotel and flight, and been away from your UK base office for several days. Instead, at 6:00pm you step straight into the LA studio and meet the director, producer and other people working on the movie. You shake hands, talk for a while, and then you all walk together through the set that your company has designed. The director suggests various modifications, and you prototype these changes there and then.

You are a UK surgeon, and you have been discussing a particularly difficult tumour operation with a colleague at BIDMC in Boston, MA. Today you meet your colleague inside the brain of the patient, and talk through the likely operating procedures. At various points you push your finger into the brain material in order to get a better understanding of its physical resistance; your colleague does the same, and you compare notes. A procedure is planned.

You are in the UK and your partner is away on a 12-month project in North Carolina. Unfortunately you have been unable to meet physically for several months. But at the same moment each week you both step into a shared room to be together and talk for a while: your regular weekly meeting. The sense of being together is enhanced by fingertip-to-fingertip physical contact.

The above scenarios can all be realised, in an elementary form, today. People can step into virtual environments (VEs), meet and talk with other people at remote locations, and even share some physical contact with them. The purpose of this proposal is to carry out research exploring the use of such highly immersive, real-time, body-tracked environments for day-to-day shared contact between people. This project looks towards a future where such highly immersive telepresence will be commonplace, rather than restricted to a few laboratories and companies as it is today.

Key Facets of the Proposal

Background

When people meet in shared virtual environments, what determines their ability to cooperate effectively? If the purpose of their meeting is rehearsal or preparation for a later meeting in real life, what is the relationship between what happens in the virtual meeting and what happens later? We have been studying these problems in two previous collaborative projects: COVEN (Collaborative Virtual Environments), a European ACTS project, and ‘Virtual Rehearsal for Acting’, a Digital VCE (LINK) project (Virtual Centre of Excellence on Broadcast Multimedia).

In [11] we carried out an experiment where 10 groups of 3 people each met first in a virtual environment to try to solve a set of riddles posted on the walls of a virtual room. After meeting virtually for a while they then met in the corresponding real room and continued the same task. The point was to see how their social interaction changed from the virtual to real experience. The participants were represented by simple block-like humanoid avatars, which had no possibility of exhibiting facial expression. Arm movements were only possible for the one immersed participant in each group. The major findings that relate to this proposal were that the avatars took on personal and social significance, notwithstanding their extreme simplicity, reproducing an earlier result [2]. In two follow-up studies [12][15] we found similar results.

In the second project, Virtual Rehearsal for Acting, we carried out an experiment in which each of three pairs of professional actors, together with a director, met in a shared non-immersive virtual reality system over a two-week period to rehearse a short play. The actors and director never met one another physically until shortly before a live rehearsal in front of an audience. The actors were represented to each other by avatars (virtual characters) which could be controlled to make a range of facial expressions and some body movements, including navigation through the space. The study examined the extent to which virtual reality could be used by the actors and director to rehearse their later live performance. The results showed that over the period of the study their sense of presence in the virtual rehearsal space, their co-presence with the other actor, and their degree of cooperation all increased. Moreover, their evaluation of the extent to which the virtual rehearsal was similar to a real rehearsal also increased [10]. Debriefing sessions with the actors and director suggested that a level of performance was reached in the virtual rehearsal that formed the basis of a successful live performance. The work showed that, despite the limitations of the system, the actors could achieve a significant level of acting performance during the virtual rehearsal, which carried over to the physical rehearsal.

These earlier experiments relied on visual and auditory communication between the participants, but involved no haptic interaction, that is, no sense of touch. A conceptual framework for communication in shared VEs and the role of haptics was recently discussed in [5]. This followed an experiment undertaken by one of the authors as part of a team in the Research Laboratory of Electronics at MIT, in which people carried out a task together with haptic feedback between them. Even though they were in remote places and could not see representations of one another, they jointly manipulated a computer graphics image of a ring on a wire, and had to move the ring together so that it would not touch the wire. Each used a PHANTOM device (http://www.sensable.com/), which gave a direct sensation of the force applied by the other. The experiment compared a control group that had only visual feedback with an experimental group that had visual plus haptic feedback. Not only was task performance significantly better with the addition of haptic feedback, but the sense of being together, the sense of co-presence between the individuals, was also enhanced. That experiment was carried out purely to understand the task-performance and human-factors implications of haptic communication between people. In this proposal we wish to repeat the same kind of experiment, and also begin to understand the feasibility of haptic communication in the context of Internet delays. There is also an important usability question here: in spite of the possible distortions induced by Internet delays, does the presence of any haptic feedback at all, however inaccurate, add to the sense of co-presence between individuals?

Description of the Research

The purpose of this proposal is to obtain resources to create shared virtual environments that are simultaneously inhabited by people on both sides of the Atlantic. Using these environments we will conduct a series of experiments on collaborative performance. These experiments will be carried out in highly immersive environments, including UCL's 4-sided CAVE and UNC's wide-area tracked environment [17].

First, something about the environment. In the 1998-99 JREI equipment funding round we were awarded £900,000 for a 'CAVE'. 'CAVE' is a recursive acronym for 'CAVE Automatic Virtual Environment', a name that also evokes Plato's Cave; the concept and first implementation are due to Carolina Cruz-Neira and colleagues at the University of Illinois at Chicago in 1992 [4]. Ideally a CAVE is a room in which all six walls are projection screens, onto which a virtual environment (VE) is projected. A participant wears lightweight stereo glasses with a head-tracking device mounted on them. The tracker gives the wearer's head position and orientation, from which, using the inter-ocular distance, left- and right-eye projections are computed for each wall. Typically shutter technology is used to present alternate left/right eye frames to the viewer, who is then immersed in a completely surrounding VE. The projection system and software are organised so that participants are typically not aware of the corners of the physical room. Several people can be in the CAVE at the same time, though the display is perspectively correct only for the tracked participant. The CAVE can also be multi-participant in the sense that virtual characters representing people at physically remote sites can be present in it, so that people at several locations can simultaneously inhabit the same shared space.
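To make the per-wall view computation concrete, the following is a minimal sketch of the calculation described above: eye positions are offset from the tracked head pose by half an assumed inter-ocular distance, and an off-axis (asymmetric) frustum is derived for one wall from its corner positions. The function names, the 65 mm inter-ocular distance and the 3 m wall in the example are illustrative assumptions, not the CAVE library's actual interface.

```python
import numpy as np

IOD = 0.065  # assumed inter-ocular distance in metres

def eye_positions(head_pos, head_right, iod=IOD):
    """Left/right eye positions from the tracked head pose."""
    head_pos = np.asarray(head_pos, dtype=float)
    right = np.asarray(head_right, dtype=float)
    right /= np.linalg.norm(right)
    return head_pos - right * iod / 2.0, head_pos + right * iod / 2.0

def off_axis_frustum(eye, lower_left, lower_right, upper_left, near=0.1, far=100.0):
    """Asymmetric frustum (left, right, bottom, top at the near plane)
    for one projection wall, given its corners and an eye position."""
    ll, lr, ul = (np.asarray(p, dtype=float) for p in (lower_left, lower_right, upper_left))
    # Orthonormal basis of the screen: right, up, and normal towards the eye.
    vr = lr - ll; vr /= np.linalg.norm(vr)
    vu = ul - ll; vu /= np.linalg.norm(vu)
    vn = np.cross(vr, vu); vn /= np.linalg.norm(vn)
    d = -np.dot(vn, ll - eye)        # distance from the eye to the screen plane
    scale = near / d
    left   = np.dot(vr, ll - eye) * scale
    right  = np.dot(vr, lr - eye) * scale
    bottom = np.dot(vu, ll - eye) * scale
    top    = np.dot(vu, ul - eye) * scale
    return left, right, bottom, top, near, far

# Example: front wall of a 3 m cube, head at its centre.
l_eye, r_eye = eye_positions(head_pos=[1.5, 1.5, 1.5], head_right=[1, 0, 0])
print(off_axis_frustum(l_eye, [0, 0, 0], [3, 0, 0], [0, 3, 0]))
```

The same computation is repeated per eye and per wall every frame, which is why accurate, low-latency head tracking matters for the illusion described above.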

We recently tested the multi-participant aspect of the CAVE by linking our CAVE with one at Chalmers University in Sweden over the Internet. In this pilot, two people, one in London and the other in Gothenburg, were able to inhabit the same shared space and could see and interact with avatars representing each other.

We have carried out joint research with UNC's Computer Science Department over several years. There was a joint experiment on presence and interaction in virtual environments in the summer of 1998 [16], with a UCL researcher, Dr Martin Usoh, spending three months working at UNC. Professor Brooks was a Visiting Researcher at UCL for one semester during 1998. We are continuing to collaborate on presence in virtual environments, and this proposal aims to strengthen that collaboration by providing a shared experimental platform. We have also carried out joint research with MIT as described above, including a sabbatical visit by M. Slater to MIT during the first semester of 1998.

The project will aim to carry out three experiments, illustrating the three different examples of collaboration mentioned in the introduction – walking through a complex scenario, interaction between people in relation to some other object, and personal contact. We discuss each of these applications in turn.

Safety Evacuation Planning in Complex Industrial Settings

UCL [3] and UNC [1] have each developed systems for detailed modelling and visualisation of the complex industrial assets typically encountered in the process and power industries. In particular, UCL has developed massive image databases and accurate 3D models of a number of such facilities using digital photogrammetric technologies. Whilst such 3D plant databases are now routinely used in a wide range of engineering design decision-support roles, they are most commonly deployed in desktop or auditorium-style VR environments. We propose to investigate the utility of such models for shared virtual site visits in support of site familiarisation and mission rehearsal within complex process-plant environments. We envisage that such scenarios would include a virtual guided tour and evacuation training (equivalent to that currently required for all visitors to an offshore facility), alongside the rehearsal of a virtual engineering or maintenance task (as might be required before entry into a hazardous or radioactive environment). These experiments would seek to establish the level of detail required for these activities, together with the degree to which the immersion of both the trainer and the trainee facilitates effective knowledge transfer between geographically discrete locations.

Interactive Clothing Design

There are currently two (LINK) funded projects at UCL concerned with the simulation of clothing. The first is the Centre for 3D Electronic Commerce (http://www.3dcentre.co.uk/), which is concerned with the simulation of clothing on virtual humans for retail purposes. The goal of this project is to take 3D models of actual people, captured via a Hamamatsu 3D body scanner at UCL that produces a digitised 3D model of a person, and then to simulate clothing on those models (http://www.cs.ucl.ac.uk/research/vr/Projects/3DCentre/). The second project, PROMETHEUS, is led by the BBC and is concerned with the creation of a complete path from video-tracked actors through to MPEG broadcast (http://www.bbc.co.uk/rd/projects/prometheus/index.html). Real actors’ behaviour is turned into 3D avatar behaviour. The work at UCL concerns believable, rapid clothing modelling of the actors.

We will use this work as an example of shared collaboration between designers in remote places inhabiting the same virtual environment linked via Internet2. The designers, some at UCL and others at UNC, will meet in a shared virtual space. One of them will play the role of a model for the clothing. All will be represented to each other by avatars. They will be able to move around each other, including of course the model, and interact with the model at a number of levels:

·         They will be able to change the colour and texture of the clothing;

·         The clothing will simultaneously be represented by pattern pieces on a workbench. By interactively modifying the shape of the patterns they can immediately observe the effect of the clothes on the model.

·         They can ask the model to adopt various poses, and again see the impact on the lay of the cloth.

·         The clothing model is based on a number of physical properties of the cloth, such as its elasticity. These physical properties can be represented as force-feedback delivered through the PHANTOM device. Hence the designers will be able to experience aspects of the physical properties of the cloth, and again interactively change this by manipulation of a suitable representation shown on the workbench.
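The last point is the most directly technical one, so here is a minimal sketch, assuming a simple penalty-based rendering scheme, of how a cloth stiffness value could be turned into a force commanded on the PHANTOM: a spring-damper response proportional to how far the stylus tip has pressed through the cloth surface. The function and the stiffness and damping values are illustrative assumptions, not the project's actual haptic code.

```python
import numpy as np

def cloth_contact_force(tip_pos, surface_point, surface_normal,
                        stiffness=200.0, damping=2.0, tip_vel=None):
    """Spring-damper contact force for a haptic stylus pressing on cloth.

    stiffness (N/m) stands in for the measured cloth elasticity;
    damping (N*s/m) keeps the rendered contact stable.
    Returns the force (N) to command on the haptic device.
    """
    n = np.asarray(surface_normal, dtype=float)
    n /= np.linalg.norm(n)
    # Penetration depth: how far the tip has pushed past the cloth surface.
    depth = np.dot(np.asarray(surface_point, dtype=float) -
                   np.asarray(tip_pos, dtype=float), n)
    if depth <= 0.0:
        return np.zeros(3)           # not in contact
    force = stiffness * depth * n    # push the stylus back out
    if tip_vel is not None:
        force -= damping * np.dot(np.asarray(tip_vel, dtype=float), n) * n
    return force

# Example: tip 2 mm below a horizontal cloth patch.
print(cloth_contact_force([0, -0.002, 0], [0, 0, 0], [0, 1, 0]))
```

In the application itself the stiffness would be taken from the cloth simulation's material parameters, so that changing the cloth on the workbench also changes what the designers feel.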

Overall, then, the goal is to place designers, together with a model and interactive tools, in a highly immersive shared environment, so that they can discuss and interactively manipulate the clothing design.

Haptic Communication

In the clothing experiment haptics is used as part of understanding the characteristics of an object – essentially as an aid to visualisation. In this experiment we will explore the thesis put forward in [5] that touch communication is an essential component of effective person-to-person contact. We will repeat the ‘move a ring along a wire’ type of experiment referred to above and described in [7], with two fundamental additions. First, and most important, the earlier experiment avoided the issue of network delays by running the same scenario on two remote monitors driven by the same computer. At that time we wished to see the effect of haptic communication in itself, abstracting away any difficulties that might be caused by time delays. In this experiment we will monitor Internet delays, and attempt to build a predictive model of how variations in delay affect task performance and the sense of co-presence between the participants. As before, we will run the experiment with two conditions: visual feedback only, and visual plus haptic feedback.
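As a concrete illustration of what monitoring delays and building a predictive model could look like in its simplest form, the sketch below timestamps echoed messages to measure round-trip delay and fits a straight line relating mean delay to task completion time. The transport callables, variable names and example numbers are placeholders for illustration only; the real analysis would also consider delay jitter and the co-presence measures.

```python
import time
import numpy as np

def round_trip_delay(send, receive):
    """Measure one round-trip delay by timestamping an echoed message.
    `send` and `receive` stand in for whatever transport the haptic
    channel uses (placeholders, not a real API)."""
    t0 = time.monotonic()
    send(b"ping")
    receive()                         # blocks until the echo arrives
    return time.monotonic() - t0

def fit_delay_model(delays_ms, completion_times_s):
    """Least-squares fit: completion_time ~ a + b * delay.
    Purely illustrative of the kind of predictive model intended."""
    b, a = np.polyfit(delays_ms, completion_times_s, 1)
    return a, b

# Example with made-up numbers, only to show the call:
a, b = fit_delay_model([20, 50, 100, 150], [30.0, 33.5, 41.0, 47.5])
print(a + b * 80)   # predicted completion time at 80 ms delay
```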

Second, we will make the task itself more sophisticated. The subjects will see avatar representations of one another’s bodies, and thus will be in the same virtual space. Irrespective of whether or not haptic feedback is enabled, they will be able to reach out and touch one another in a visual sense (recall the ‘shaking hands’ of the scenario in which the graphics designer met the movie director). The task will therefore make use of the space – requiring the subjects to jointly build a structure out of a pile of virtual bricks. This will be accomplished, as before, by each using the PHANTOM device to push on an object; when they push together they will be able to lift it and then place it down somewhere else. Each will feel the force exerted by the other, thus establishing a physical contact between them, even though they will be thousands of miles apart. The activity of building a structure out of bricks is therefore highly collaborative: it requires negotiating an agreement about what the structure is to be, and then continuing agreement about how and where to place the bricks. If one of them does not apply sufficient force the brick will fall; if one applies too much force, the other will have to push back. They will be very aware of each other’s movements through this physical contact.
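To make the brick-lifting mechanics concrete, here is a minimal sketch of one plausible force model, assuming each stylus is attached to the brick by a virtual spring: the brick accelerates upward only when the combined spring force exceeds its weight, and each user feels the reaction force, which includes the effect of the other's push. The class, parameter values and 1 ms step are illustrative assumptions (table contact and delay compensation are omitted), not the project's actual implementation.

```python
import numpy as np

GRAVITY = np.array([0.0, -9.81, 0.0])

class SharedBrick:
    """Two haptic styluses coupled to one rigid brick by virtual springs.
    Each user feels the spring force at their own stylus, which transmits
    the effect of the other user's push through the brick."""

    def __init__(self, mass=0.5, stiffness=400.0, damping=5.0):
        self.mass = mass
        self.k = stiffness
        self.c = damping
        self.pos = np.zeros(3)
        self.vel = np.zeros(3)

    def step(self, stylus_a, stylus_b, dt=0.001):
        """Advance the brick one haptic frame (dt ~ 1 ms) and return the
        forces to command on each PHANTOM (reactions to the springs).
        Note: contact with the table/floor is deliberately omitted."""
        f_a = self.k * (np.asarray(stylus_a, dtype=float) - self.pos)  # spring from user A
        f_b = self.k * (np.asarray(stylus_b, dtype=float) - self.pos)  # spring from user B
        net = f_a + f_b + self.mass * GRAVITY - self.c * self.vel
        self.vel += (net / self.mass) * dt
        self.pos += self.vel * dt
        return -f_a, -f_b   # equal-and-opposite forces felt by each user

# Example: both users push up on the brick from slightly above it.
brick = SharedBrick()
force_a, force_b = brick.step(stylus_a=[0.0, 0.05, 0.0], stylus_b=[0.0, 0.05, 0.0])
```

With a model of this kind, Internet delay shows up directly as stale stylus positions on the remote side, which is exactly the interaction between delay and haptic fidelity that the experiment is designed to measure.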

Thus in this experiment we will be exploring interaction between people carrying out highly collaborative tasks, and, crucially, the possibility of haptic communication over the Internet. It will also be a chance to explore how social relations develop in such a scenario.

Benefits to the JANET Community

The Distributed Interactive Virtual Environment (DIVE) system, as extended within the COVEN project, is the experimental platform for this project. DIVE is especially tuned to support multi-participant virtual environments over the Internet. At the networking level, DIVE is based on a peer-to-peer approach, where peers communicate by reliable and non-reliable multicast based on IP multicast. Conceptually, all peers share a common state that can be seen as a memory shared over the network. Processes interact by making concurrent accesses to that memory [6].
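DIVE's actual protocol and data model are described in [6]; purely to illustrate the idea of peers keeping a replicated state up to date by exchanging unreliable updates over IP multicast, here is a toy sketch using standard sockets. The group address, port and message format are arbitrary choices for the illustration and bear no relation to DIVE's wire format.

```python
import json
import socket
import struct

GROUP, PORT = "239.192.0.1", 5007   # arbitrary multicast group for the sketch

def make_peer():
    """One peer: sends its own state changes to the group and applies
    changes heard from other peers to a local replica of the shared state."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the multicast group on all interfaces.
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock, {}                  # socket plus local replica of shared state

def publish(sock, object_id, position):
    """Announce an unreliable state update (e.g. an avatar moved)."""
    msg = json.dumps({"id": object_id, "pos": position}).encode()
    sock.sendto(msg, (GROUP, PORT))

def apply_next_update(sock, state):
    """Apply one incoming update to the local replica."""
    data, _ = sock.recvfrom(1500)
    update = json.loads(data)
    state[update["id"]] = update["pos"]
```

The point of the illustration is simply that every peer holds its own copy of the world and hears about changes as they happen, which is what allows avatars at several sites to appear in the same shared space.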

DIVE is fully integrated with the World Wide Web. Any file or document necessary to a DIVE session can be accessed using the http or ftp protocols. DIVE supports 3D formats such as VRML, and 2D image formats such as GIF, JPEG and PNG. In addition DIVE can visualise web documents using MIME compliant mechanisms. DIVE also supports live audio and video communication between participants. Sounds are spatialised and video streams can be texture mapped on to objects in the virtual scene.

In a typical DIVE world, a number of avatars, i.e. the representations of human users, can enter and leave dynamically. Additionally, any number of applications can exist within a world. Such applications typically build their user interfaces by creating and introducing the necessary graphical objects. Thereafter, they ‘listen’ to events in the world, so that when an event occurs the application reacts according to some control logic.

At UCL our contribution to DIVE has been at the fundamental computer graphics level, in particular the ability to cope with very large models – such as a model of London developed for a travel application within the COVEN project [13]. Our second major contribution has been to incorporate into DIVE relatively sophisticated virtual human characters (avatars) that have been used to represent the actors in the Virtual Rehearsal for Acting project [10] and also in a Wellcome-funded project on the use of virtual reality in psychotherapy for fear of public speaking [9]. We will use these avatars to represent the participants within the immersive applications described in the previous section.

We have concentrated on the development of DIVE mainly for use in an immersive context, but of course exactly the same program can be used for desktop or other single-screen displays, such as the large-screen display available at the MIT laboratory. A typical DIVE session involves some participants in immersive displays and others experiencing the same shared virtual space from a desktop – indeed our original 3-person experiments always used this setup. In the Virtual Rehearsal for Acting project only desktop displays were used. In this new project we will concentrate on a high degree of immersion, since we believe that this offers substantial benefits where spatial organisation is important and where person-to-person contact is required. Of course the same applications can also be used non-immersively: users do not require a million-pound CAVE or even a head-mounted display system in order to benefit from the experience gained in the collaborative applications that we will explore.

The first benefit, then, is the development of the DIVE software in the Internet2 context, for the three types of application described earlier. This software, and the lessons learned from deploying it, will be available to the wider community.

The second benefit is the knowledge gained from the experiments themselves. We will consider how successful such remote collaboration can be, and what factors enhance or limit that success. Success will be measured both in terms of task performance and in terms of the social interaction that develops between the participants. Here we will expand on our previous work on the emergence of leadership and its relationship to computational power (immersed participants tended to emerge as leaders irrespective of other factors), on the avatar expressiveness required for useful social interaction, and on the impact of network performance on social interaction and task performance. We will also examine haptic communication and interaction across Internet2, and the question of whether any degree of haptic communication is preferable in terms of task performance and social interaction, even if network delays do not allow it to be accurate.

The third benefit may be the most informative, although it will not be possible to express it as a formal experiment. In the very early stages of the project, essentially from week 1, it will be possible to establish a shared virtual environment between the three sites, exploiting the highly immersive systems (CAVE and HMD systems at the various sites) and desktop systems. We will use this shared environment to conduct project meetings and planning wherever possible. In a three-way experiment between London, Nottingham and Athens under the COVEN project, our most significant finding was that the highly complex logistics of the experiment were managed entirely within the shared virtual environment itself: an experiment in which three people at three remote sites came together to carry out a joint task was itself conducted and managed by the experimenters using exactly the same shared virtual environment. We believe that the same will occur in this new project, and it will be a convincing demonstration of how people thousands of miles apart can cooperate effectively in real time using this system.

 

References

1.        Aliaga, D. and Lastra A., Automatic Image Placement to Provide a Guaranteed Frame Rate, Proceedings of SIGGRAPH99, Los Angeles, August 11-13, 1999.

2.        Bowers, J., Pycock, J., O’Brien, J. (1996) Talk and Embodiment in Collaborative Virtual Environments, Electronic Proceedings, http://www.acm.org/sigchi/chi96/proceedings/papers/Bowers/jb_txt.htm.

3.        Chapman, D.P. and Deacon, A. 1999. Virtual environments from panoramic images. Videometrics VI. Proceedings of SPIE 1999. Vol 3641, 118-126.

4.        Cruz-Neira, C., Sandin, D.J., DeFanti, T.A. (1993) Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE, Computer Graphics (SIGGRAPH) Annual Conference Series, 135-142.

5.        Durlach, N. and Slater, M. (2000) Presence in Shared Virtual Environments and Virtual Togetherness, Presence: Teleoperators and Virtual Environments 9(2), 119-136.

6.        Frecon, E. and Stenius, M. (1998) DIVE: A scaleable network architecture for distributed virtual environments, Distributed Systems Engineering Journal, 5(3), 91-100.

7.        Ho, C., Basdogan, C., Slater, M., Durlach, N., Srinivasan, M. A. (1998) An Experiment on the Influence of Haptic Communication on the Sense of Being Together, BT Workshop on Presence, http://www.cs.ucl.ac.uk/staff/m.slater/BTWorkshop. A revised and expanded version of this paper is to appear in ACM Transactions on Computer-Human Interaction (TOCHI), in press.

8.        Koltun, V., Chrysanthou, Y., Cohen-Or, D. (2000) Virtual Occluders: An Efficient Intermediate PVS Representation, Rendering Techniques 2000, Proceedings of the Eurographics Workshop, 59-70.

9.        Slater, M., Pertaub, D-P., Steed, A. (1999) Public Speaking in Virtual Reality: Facing an Audience of Avatars, IEEE Computer Graphics and Applications, 19(2), March/April 1999, 6-9.

10.     Slater, M., Howell, J., Steed, A, Pertaub, D-P., Garau, M., Springel, S. (2000) Acting in Virtual Reality, ACM Collaborative Virtual Environments, CVE'2000, in press. Also presented as a SIGGRAPH 2000 Sketch.

11.     Slater, M., Sadagic, A., Usoh, M., Schroeder, R. (2000) Small Group Behavior in a Virtual and Real Environment: A Comparative Study, Presence: Teleoperators and Virtual Environments 9(1), 37-51.

12.     Steed, A., Slater, M., Sadagic, A., Tromp, J., Bullock, A. (1999) Leadership and collaboration in virtual environments, IEEE Virtual Reality, Houston, March 1999, 58-63.

13.     Steed, A., Frecon, E., Avatare, A., Pemberton, D., Smith, G. (1999) The London Travel Demonstrator, VRST’99, Proceedings of the ACM Symposium on Virtual Reality Software and Technology, 50-57.

14.     Tecchia, F., Chrysanthou, Y. (2000) Real-Time Rendering of Densely Populated Urban Environments, Rendering Techniques 2000, Proceedings of the Eurographics Workshop, 83-88.

15.     Tromp, J., Steed, A., Frecon, E. Bullock, A., Sadagic, A., Slater, M. (1998) Small Group Behavior in the COVEN Project, IEEE CG&A, 18(6), 53-63

16.     Usoh, M., Arthur, K., Whitton, M.C., Bastos, R., Steed, A., Slater, M., Brooks, F. (1999) Walking > Walking-in-Place > Flying, in Virtual Environments, Computer Graphics (SIGGRAPH) Annual Conference Series, 359-364.

17.     Welch, G., Bishop, G.,  Vicci, L., Brumback, S.,  Keller, K. (1999) The HiBall Tracker: High-Performance Wide-Area Tracking for Virtual and Augmented Environments, VRST’99, Proceedings of the ACM Symposium on Virtual Reality Software and Technology.

18.     Sonnenwald, D., Berquist, R., Maglaughlin, K., Kupstas Soo, E., Whitton, M. (2000) Designing to Support Scientific Research Across Distances: the nanoManipulator Environment, in Collaborative Virtual Environments, E. Churchill, D. Snowdon, and A. Munro (eds), London: Springer Verlag, in press.

19.     Aliaga, D., Cohen, J., Wilson, A., Baker, E., Zhang, H., Erikson, C., Hoff, K., Hudson, T., Stuerzlinger, W., Bastos, R., Whitton, M., Brooks, F., Manocha, D. (1999) MMR: An Interactive Massive Model Rendering System Using Geometric and Image-Based Acceleration, Proceedings of the 1999 ACM Symposium on Interactive 3D Graphics (Atlanta, GA, April 26-28, 1999), 199-206, 237.