Voice Communications in Virtual Reality Environments


Carlos Sterling

Software Engineering Technology

University of Southern Mississippi

Box 5137- Hattiesburg, MS, US

E-mail: Kableman26@yahoo.com

Phone: 1 601 544 7184

Fax: 1 601 266 5717

Tulio Sulbaran

Construction Engineering Technology

University of Southern Mississippi

Box 5137- Hattiesburg, MS, US

E-mail: Tulio.Sulbaran@usm.edu

Phone: 1 601 266 6419

Fax: 1 601 266 5717



Abstract:  Current users of Virtual Reality (VR) Environments distributed over the Internet communicate among themselves using text-chat. Text-chat is a slow communication process that distracts users from their main task in the VR environment.  The primary goal of this paper is to describe a project that focused on the creation of an interface that allows voice communication among VR users. The resulting interface allows anyone using a VR Environment over the Internet to log on to a voice communications server from anywhere in the world.  Users are able to communicate without having to install new software on their computers. This type of communication is expected to foster people’s ability to share ideas and solutions to problems in research, education, and/or professional environments anywhere there is a computer with Internet access.


Keywords: Virtual Reality, Voice Communications, Voice Chat, Voice Communications Server. 


1- Background of Communication Problems in VR Environments

Virtual Reality (VR) Environments such as Active Worlds, Cybernet Worlds, and Chat in 3D Worlds allow users to communicate through text-chat [Active Worlds 2003, Cybernet Worlds 2003, Chat in 3D Worlds 2002].  During text-chat, the average person is able to type about 30 to 60 words per minute, while someone speaking can say 300 to 400 words per minute [Environmental Data Systems 1995].  Text-chat is therefore much slower and more time-consuming than voice communication.  Text-chat also makes it harder to express an idea, because end-users do not take the time to type every detail of the message they are trying to convey.  Additionally, during text-chat users cannot express degrees of emphasis through inflection or intonation, since voice is not present. Therefore, text-chat has the potential to reduce the amount of interaction between users.
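
To make the rate comparison concrete, the short sketch below computes how long a message would take to type versus to speak at the rates quoted above. The 150-word message length and the function name are assumptions chosen only for illustration.

```python
def minutes_to_convey(words, words_per_minute):
    """Return the time in minutes needed to convey `words` at the given rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return words / words_per_minute


# Rates quoted in the text, in words per minute (slowest, fastest).
TYPING_WPM = (30, 60)
SPEAKING_WPM = (300, 400)

# Hypothetical message length, chosen only for illustration.
MESSAGE_WORDS = 150

typing_minutes = [minutes_to_convey(MESSAGE_WORDS, rate) for rate in TYPING_WPM]
speaking_minutes = [minutes_to_convey(MESSAGE_WORDS, rate) for rate in SPEAKING_WPM]

print("typing  :", typing_minutes, "minutes")
print("speaking:", speaking_minutes, "minutes")
```

At the quoted rates, the same message takes several minutes to type but half a minute or less to speak.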


Additionally, most of the interaction between users and VR Environments takes place through devices (such as a mouse, keyboard, joystick, or pen) operated with the users’ hands. Therefore, text-chat limits the users’ capability to communicate their ideas while performing other activities within the VR Environment, such as interacting with objects, navigating, and/or pointing out particular aspects of the VR Environment to other people. For example, text-chat limits the ability of architects, engineers, and/or constructors to describe the main components of a building during a Virtual Reality walk-through. Equipped with a microphone headset, they would be able to communicate by voice while using their hands to focus on the interaction in the VR Environment.


2- Primary Objective of Voice Communications in a VR Environment

The primary goal of this paper is to describe a project that focused on the creation of an interface that allows voice communication among VR users. This project augments the existing text-chat capabilities in VR environments by making the interaction among users more natural.  The migration from text-chat to a more natural form of communication such as voice is a developing trend on the Internet. Jack Woodall supports this with the statement that “Sound & image are the natural means of human communication, not writing. Therefore voice should eventually replace text on the Internet” [Healthnet Medical Discussion 1999].  Additionally, the author believes that this interface will foster interaction among VR users.  Voice communication in VR can also serve as a training tool in everyday situations, because it eliminates the users’ need to convert their thoughts into text.


3- Approaches to Establish Voice Communications in VR Environments

During the experimental stages of this project, several problems were encountered while implementing voice communications in VR Environments. Two unsuccessful approaches are briefly described in this section. They are not presented to discourage other researchers from exploring them further, but as background to support the final approach used by the author to establish voice communication in VR Environments.


The first approach explored the development of scripts in a language such as Perl or Java. Clicking on an object within the VR environment would activate these scripts. Upon activation, the scripts would display a list of commonly available voice communicators such as Yahoo Instant Messenger [Yahoo Messenger 2003], AOL Instant Messenger [AOL Instant Messenger 2003], and Microsoft Instant Messenger [MSN Messenger 2003]. The user would then select one of the voice communicators from a drop-down list box. Finally, another script would automatically launch the selected voice communicator, which is installed on each individual’s machine.  Although this approach would take advantage of commonly available voice communication software, it was found that, for security reasons, operating systems do not allow web scripts to run applications located on the users’ computers. Technical experts in programming and networking were consulted regarding this issue, and they concurred that this approach was a security threat because it could be used to activate viruses on the users’ computers [Experts Exchange 2003].


The second approach was to gather on the server (where the VR environments are located) most of the commonly available voice communicators (or to provide direct links to the voice communicators’ web sites).  The links to the communicators would be provided within the VR environment. The user would click on the link, which would give the option to select one of the communicators, and would then have to install the communicator on his/her computer. As with the previous approach, this approach would take advantage of commonly available voice communication software, but it would require the user to go through the setup process every time a voice communicator was selected. This repetitive setup and installation process would have defeated the purpose of reducing users’ downloading and installation of software applications. Additionally, with this approach the administrator of the VR environment would not have any control over who uses the voice communicators.


The approach used was to manage a server-side voice communication application.  The voice communication application was linked to an object within the VR environment. Clicking on that object activated the communication application. Upon activation, a logon screen appears that lets the user enter login information and establish voice communication with any of the authenticated users. All of these operations are performed by web-based programs that run the communications application on the server side. This approach provides easier accessibility and control for both the user and the administrator. Although this was the approach finally implemented, it did not come without difficulties. The main difficulty was the application’s incompatibility with newer versions of PHP.  The communication application was originally written to use global variables instead of the newer superglobal variables, which were introduced for security reasons [Foxserv 2002]. The default PHP installation and configuration sets up an INI file with global variables turned off (the register_globals directive), so the application did not work as installed.  There were two possible solutions to this problem. The first was to go through all of the communication application’s PHP code and change it from global variables to superglobal variables. The second (the one implemented by the author) was to change the INI file in the “System32 Registry” of the Windows operating system and turn the global variables on [Foxserv 2002].
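
The difference between the two variable styles can be illustrated with a small analogy. The Python below is not PHP and is not the communication application's code; it is only a sketch of the idea. The first function mimics the older behaviour, in which request parameters are copied into the script's variables before the script runs, and the second mimics reading request data only from an explicit request array. All function and field names are hypothetical.

```python
def handle_with_injected_globals(request_params, password_ok):
    """Mimic the older style: request data becomes script variables."""
    variables = dict(request_params)      # injected before the script runs
    if password_ok:
        variables["is_admin"] = True      # the script sets the flag on success
    # The script never initialised is_admin, so a forged value survives.
    return bool(variables.get("is_admin", False))


def handle_with_explicit_lookup(request_params, password_ok):
    """Mimic the newer style: request data is read only where asked for."""
    is_admin = False                      # script-owned value
    username = request_params.get("username", "")  # explicit, named lookup
    if password_ok and username:
        is_admin = True
    return is_admin


# A forged request that tries to smuggle in an is_admin value.
forged = {"username": "guest", "is_admin": "1"}

print(handle_with_injected_globals(forged, password_ok=False))
print(handle_with_explicit_lookup(forged, password_ok=False))
```

In the first function the forged value is accepted even though the password check failed, which is the kind of exposure that motivated the move to explicit request arrays.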


It can be observed that voice communication for VR environments could be accomplished using different approaches after solving the challenges presented in each approach. However, it was not the objective of this project to explore solutions for each of the approaches. The objective of this project was to identify and implement one successful approach, which will be described in the following section.


4- Solutions to Embed Voice Communication Links in the VR Environments

This project allows voice communications among users of VR Environments distributed over the Internet in a user-friendly environment.  Users select a VR object such as a headset or a phone; clicking it calls up the voice communications application located on the server.  Once the users are linked to the voice communications server, they are prompted to provide login information.  A system verification process then authenticates the users’ information and grants access to establish voice communications with other users currently interacting in the VR Environment.
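
The login-then-connect flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the behaviour of the actual voice communications application: the class name, its methods, and the use of salted password hashing are all hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class VoiceServer:
    """Hypothetical stand-in for the server-side voice application."""

    def __init__(self):
        self._accounts = {}   # username -> (salt, password hash)
        self.online = set()   # users currently granted voice access

    def register(self, username, password):
        """Create an account (done by the administrator in the paper)."""
        salt = os.urandom(16)
        self._accounts[username] = (salt, hash_password(password, salt))

    def login(self, username, password):
        """Verify the login information; on success grant voice access."""
        record = self._accounts.get(username)
        if record is None:
            return False
        salt, stored = record
        if not hmac.compare_digest(stored, hash_password(password, salt)):
            return False
        self.online.add(username)
        return True

    def peers_of(self, username):
        """Return the other authenticated users this user could talk to."""
        if username not in self.online:
            return set()
        return self.online - {username}
```

A user who fails verification is never added to the set of online users and therefore sees no peers.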


To establish voice communication in Virtual Reality environments distributed over the Internet, the following components were considered: 1- Server hardware set-up and specifications, 2- Server software set-up and specifications, 3- Integration of the communications software and VR Environment. A brief description of these three components follows.


4.1 Server Hardware Set-up and Specifications

The communication server (hardware) used during this project consisted of an Intel Pentium 4 processor running at 2.66 GHz.  The system has 512 MB of RAM running at 333 MHz along with a 60 GB Ultra ATA/100 7200 RPM hard drive.  The computer is equipped with a SoundBlaster Live 5.1 Digital sound card and an integrated Intel Pro 100 M PCI Ethernet network card. This configuration provides more than enough processing speed and memory to run a voice communications application and a VR Environment from a web server.


4.2 Server Software Set-up and Specifications

The operating system used for this project was Microsoft Windows XP Professional with IIS 5.0 (Internet Information Services 5.0) as the web server software.  IIS 5.0 configures the server and creates a root directory on the hard drive whose contents are made available to the Internet through port 80. The voice communication application files are stored in this root directory. After the IIS web server was set up, PHP (PHP: Hypertext Preprocessor), a server-side scripting language that is embedded into web pages, was installed.  PHP is required because the voice communications software is written in PHP. The PHP INI file was changed to turn global variables on [Foxserv 2002].
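
In PHP's configuration file, the global variables setting mentioned above corresponds to the register_globals directive. As a hedged illustration, the sketch below checks an INI-style text for that directive. The sample file contents and the function name are assumptions; the actual INI file from the project server is not reproduced in this paper.

```python
import configparser

# Hypothetical excerpt of a PHP INI file, not the project's actual file.
SAMPLE_INI = """\
[PHP]
; Setting changed so that the voice application could run.
register_globals = On
"""


def register_globals_enabled(ini_text):
    """Return True if the INI text turns register_globals on."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    # A missing section or option is treated as "off".
    return parser.getboolean("PHP", "register_globals", fallback=False)


print(register_globals_enabled(SAMPLE_INI))
```

A check of this kind could be run after installation to confirm that the server configuration matches what the application expects.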


The voice communication application implemented in this project (Voiceweaver by StreamComm) is based on Microsoft’s technology. Voiceweaver software [StreamComm 2003] allows members to access the communications server through the use of PHP, which is a scripting language that produces Dynamic HTML web pages. One current limitation of this software is that it only works on Windows 95 and later operating systems.  Future work could be geared towards testing the functionality of the voice application software [StreamComm 2003] on other types of operating systems such as Linux.


4.3 Integration of Communications Software and VR Environment

With the hardware and software installed on the communication server, the next step was to integrate the voice communication software into the VR environments. The integration is explained from two perspectives: 1- Voice Communications and 2- VR environment.


From the voice communications software perspective, the equivalent of a hyperlink to the communications server was embedded into the VR environment. This allowed users to connect to the voice communications software on the server while interacting in the VR environment.  When users first visit the VR environment, they are asked to install a PHP plug-in on their computer to keep the voice communications between users secure from intrusion.  The installation takes only a few seconds and is not repeated as long as the plug-in is not deleted from the user’s computer.  Once the plug-in is installed, a login window opens so that the user can enter login information for verification.  Users of the voice communications application are assigned a username and password by the administrator of the web server. Figure 1 shows the web-based login screen for the VR users. Another function of the voice communications application is online administration, which allows the administrator to log on from anywhere with Internet access. Administrators can add, change, or delete user accounts without having to be physically at the server. Figure 2 shows the web-based login and menus for the administrator.  Both the administrator and the users therefore have remote access.
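
The account operations available to the administrator (add, change, delete) can be pictured as a small table of accounts. The sketch below is an assumption-laden illustration, not the application's actual data model: the class and method names are hypothetical, and passwords are kept in plain form only to keep the example short.

```python
class AccountStore:
    """Hypothetical in-memory account table managed by an administrator."""

    def __init__(self):
        # username -> password (a real system would store a hash instead)
        self._accounts = {}

    def add(self, username, password):
        """Create a new account; duplicate usernames are rejected."""
        if username in self._accounts:
            raise ValueError(f"account {username!r} already exists")
        self._accounts[username] = password

    def change_password(self, username, new_password):
        """Replace the password of an existing account."""
        if username not in self._accounts:
            raise KeyError(username)
        self._accounts[username] = new_password

    def delete(self, username):
        """Remove an existing account."""
        if username not in self._accounts:
            raise KeyError(username)
        del self._accounts[username]

    def check(self, username, password):
        """Return True only for an existing account with a matching password."""
        return username in self._accounts and self._accounts[username] == password

    def usernames(self):
        """Return the account names in alphabetical order."""
        return sorted(self._accounts)
```

Because the table lives on the server, the same operations are available from any machine that can reach the administration page.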


Figure 1.  Users’ Web-Based Login Screen to the Voice Communication Server


Figure 2.  Administrator Web-Based Login Screen and Account Management


From the VR Environment perspective, the Virtual Reality Environments could have been created using any of the many available languages, such as VRML [Virtual Reality Modeling Language 1995] and/or the Java 3D API [Java 3D API 2003].  The VR Environments could also be created with software applications such as 3D Studio Viz R3 produced by Autodesk [3D Studio Viz R3 n.d.], WorldToolKit produced by Sense8 [Sense8 2000], and/or Internet Scene Assembler (ISA) produced by Parallel Graphics [Parallel Graphics 2003].  The VR environment used for this project was a simple building created with ISA. One of the objects in the building served as the link to the voice communications server. Figure 3 provides a snapshot of the VR Environment used in this project.


The process of making an object serve as a link to the communication server is simple. The first step was to use Internet Scene Assembler (ISA) [Parallel Graphics 2003] to import a building or world. The second step was to place objects inside this building or world, such as a table, phone, chair, and any other items deemed appropriate for the environment setting.   Each object has characteristics and properties that can be changed, one of which is the ability to embed addresses and parameters. Thus, the third step was to specify, in the address section of an object, the location of the file (the voice communication software) to execute on a mouse click. In the parameters section a target value could be specified.  With these settings, the voice communications software could then be executed from the server.
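
In VRML, a clickable link of this kind is commonly expressed with an Anchor node, whose url and parameter fields play the role of the address and parameters described above. Whether ISA writes exactly this node for the object is an assumption. The sketch below generates such a node as text; the URL, the target value, and the box standing in for the phone object are hypothetical placeholders.

```python
def vrml_string(text):
    """Quote a Python string as a VRML string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def anchor_node(url, target, description,
                child="Shape { geometry Box { size 0.2 0.1 0.3 } }"):
    """Build the text of a VRML Anchor node linking `child` to `url`."""
    lines = [
        "Anchor {",
        f"  url [ {vrml_string(url)} ]",
        f"  parameter [ {vrml_string('target=' + target)} ]",
        f"  description {vrml_string(description)}",
        f"  children [ {child} ]",
        "}",
    ]
    return "\n".join(lines)


def world_with_link(url, target, description):
    """Return a minimal VRML world containing one clickable object."""
    return "#VRML V2.0 utf8\n" + anchor_node(url, target, description) + "\n"


# Hypothetical address of the voice application's login page.
print(world_with_link("http://example.com/voice/login.php", "_blank", "Voice chat"))
```

The target value in the parameter field tells the browser where to open the linked page, so the login screen can appear in a separate window while the VR world stays visible.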


Another key aspect is that the end-user must be able to see the ‘virtual world’, which requires downloading a Virtual Reality plug-in such as Cortona [Parallel Graphics 2003].  Once this plug-in is installed in the browser, it allows navigation of the VR world. An alternative is Cortona Jet, which uses Java to embed the ‘virtual world’ in an HTML web page, so no VR plug-in needs to be installed to see and navigate the ‘virtual world’.


Figure 3. Snapshot of the VR environment used for voice communication



5- Contributions of Voice Communications in VR Environments

This type of project offers many benefits, chief among them a more productive working environment that is easily accessible from practically anywhere there is a computer with an Internet connection. Each user has voice access in a Virtual Reality Environment to discuss problems or solutions pertaining to his or her area of expertise.  The VR world serves as an actual meeting place where users gather and communicate with more interaction than in a standard text-chat environment.  Everyone communicates through a single, remotely accessed voice software application, without having to download different voice communications software.  Embedding the software in the ‘virtual world’ is the key component. Linking an object in the Virtual Reality Environment to the software applications on the server is what makes this possible.


Voice communications in a Virtual Reality Environment have the potential to enhance users’ online experience. Today’s youth spend hours playing video games set in virtual worlds. A learning environment created in VR, with voice communications and other education-related activities, could lead to greater comprehension of educational materials among young people. Voice communications in a VR environment could thus contribute to more productive work, school, and research environments.


Another contribution is that the approach used in this project allows a person with basic computer knowledge and equipment to set up a web server and create ‘virtual worlds’ embedded in HTML, with a voice communications application embedded into the VR environment.  This could open the door to more creative and more extensive use of Virtual Reality environments and to the discovery of new ways in which Virtual Reality can be used.  The Internet itself could take on a new look and a new form of user interaction.


6- Summary

Currently, text-chat is the main communication avenue for users of Virtual Reality (VR) Environments distributed over the Internet. This paper presents a successful approach to including voice communications in VR environments. The approach is independent of the software or language used to develop the VR environments. One of its main advantages is that it requires the user to download hardly any software in order to communicate by voice. Additionally, a centrally located server provides remote access to a user-friendly VR environment that allows voice communications for people all over the world while giving appropriate control to the administrator.


7- Future Extensions

In the near future the idea of voice communications in VR environments could go one step further with ‘avatars’ interacting with the ‘virtual world’ and opening up voice sessions with other ‘avatars’.  This would take a great deal more research but would be something interesting to pursue.


8- Acknowledgments

Thanks are in order for the people who helped steer this research project, especially Professor Doris Kemp at the University of Southern Mississippi.  Thank you for your time and patience.  It is greatly appreciated.


9- References

3D Studio Viz R3. (n.d.).  Retrieved April 13, 2003 from the AutoCAD Pipex Dial Web site: http://www.autocad.dial.pipex.com/3D%20Viz%20Features.htm


Active Worlds. (April 2003). Retrieved April 2003, from Active Worlds Web site:



AOL Instant Messenger, AIM Home Page. (2003). Retrieved March 20, 2003 from the AOL Instant Messenger Web site: http://www.aim.com/


Chat in 3D Worlds. (2002). Retrieved May 9, 2003, from PC Pursuits Web site: http://www.pcpursuits.com/3DWorlds.htm


Cybernet Worlds 3D Chat Community. (2003). Retrieved May 9, 2003, from Cybernet Worlds Web site: http://www.cybernetworlds.com


Environmental Data Systems. (1995). Retrieved May 9, 2003, from REU Web site: http://www.reu.com/edsys/medifile2.html


Experts Exchange, Providing IT Information That Drives The Enterprise. (2003). Retrieved March 6, 2003 from The Experts Exchange Web site:



Foxserv, The Foxserv Project. (2002). Retrieved July 13, 2003 from the Foxserv Web site:



HealthNet Medical Discussion. (1999). Retrieved May 12, 2003 from the HealthNet Web site: http://www.healthnet.sk/discussion/messages/122.html


Java 3D API. (April 2003). Retrieved April 13, 2003 from the Java.Sun.Com Web site:  http://java.sun.com/products/java-media/3D/


MSN Messenger. (2003). Retrieved March 20, 2003 from the MSN Messenger Web site:



Parallel Graphics, Internet Scene Assembler. (July 2003). Retrieved July 15, 2003 from the Parallel Graphics web site:  http://www.parallelgraphics.com/products/downloads


Parallel Graphics, Cortona VRML Client. (July 2003). Retrieved July 15, 2003 from the Parallel Graphics web site: http://www.parallelgraphics.com/products/cortona/


Sense8. (2000). Retrieved May 9, 2003 from Sense8 Web site:



StreamComm. (2003). Retrieved March 27, 2003 from the StreamComm Web site:  http://www.voiceweaver.com/?affid=speakfreely


Web Wiz Guide, The Web Development Site. (2001-2003). Retrieved March 12, 2003 from the Web Wiz Guide Web site: http://www.webwizguide.com


W3C, Virtual Reality Modeling Language. (April 1995).  Retrieved April 13, 2003 from the W3C Web site: http://www.w3.org/MarkUp/VRML/


Yahoo! Messenger. (2003). Retrieved March 20, 2003 from the Yahoo Web site: