Web-Based Speech-to-Text and Text-to-Speech Application for Providing Online Academic Tests for Blind Students


Submitted by:  Carlos Sterling – Carlos.Sterling@usm.edu


Advisors: Dr. Tulio Sulbaran - Tulio.Sulbaran@usm.edu - (601) 266 6419

          Professor Doris Kemp - Doris.kemp@usm.edu - (601) 266 5673

          Professor Desmond Fletcher - Desmond.Fletcher@usm.edu - (601) 266 5185


June 3, 2004


0 - Abstract


Technology is key to providing alternatives for blind students in postsecondary education. Existing technology at the university level does not provide a web-based speech-to-text and text-to-speech academic testing application. The software evaluated for this project performed poorly: each package had some useful qualities, but none produced the desired result. The priority is therefore to develop an application that addresses these issues, which blind students face every day. Current technology will be used to create the speech-to-text and text-to-speech web-based testing application. The user, in this case a blind student, will hear the questions and answer choices read aloud and will give an answer by speech. All of the users’ answers will be stored in a database, along with the professors’ questions and answers. The desired result is fully operational web-based software for blind students and their teachers. Overall, this project should benefit blind students who want to experience as much of the normal classroom setting as possible.


1 - Problem Statement


As more blind or visually impaired students enter postsecondary education to improve their lives, the need for web-based speech-to-text and text-to-speech applications grows. According to the National Center for Educational Statistics, “The number of students with disabilities attending higher education has increased. In a recent study, the number of postsecondary undergraduate students identified as having disabilities in the United States was found to be 428,280, representing 6% of the student body.” The same report states that blindness and visual impairments account for 4.4% of the 428,280 students with disabilities attending a postsecondary institution [National Center for Educational Statistics, 1999]. Another study, from the University of Washington, claims that advancement in technology and increased job specialization have resulted in career opportunities in fields that were once considered unsuitable for individuals with disabilities [Burgstahler, 2000]. With the growing number of blind people attending college, more advanced speech-to-text and text-to-speech web-based applications for providing online academic tests are needed to help teachers work with blind or visually impaired students in their classrooms.


2 - Objective

The objective of this project is to design a testing application for blind or visually impaired students, based on text-to-speech and speech-to-text technology. This application will promote a better teacher/student relationship in the classroom. The text-to-speech and speech-to-text functions should be web based and easily accessible. None of the software evaluated at The University of Southern Mississippi Library offered a web-based test built on text-to-speech and speech-to-text technology. There is a need for technology that enhances the overall experience of a blind person attending a university by giving them a means to take tests as traditional students do. According to The National Association of Blind Students (NABS), “Through advocacy and collective action, we work to maintain high standards and expectations for the education of blind students across the country, as we address relevant issues that arise.  Such issues include how to effectively deal with the Disabled Student Services offices at colleges and universities, how to build positive and productive relationships between consumers and state rehabilitation agencies,  how to use alternative techniques to successfully accomplish educational goals, and insuring the validation of standardized gateway tests, such as the GRE and the LSAT” [NABS, n.d.]. The text-to-speech and speech-to-text web-based academic testing software would enhance the educational experience of blind students by giving them a tool to succeed at the college level.

3 - Project Approach

3.1 - Methodology

The development of a voice application that will enhance blind students’ interaction in a traditional classroom setting requires several components to come together, such as multimodal code, web servers, databases, and web editor software. The project will use Microsoft products, such as its operating system and database software. Future work on the speech-to-text and text-to-speech project could test it with other operating systems and databases.

Stage one of the project requires purchasing or gaining access to a web server, which will be hosting the web based application.

Stage two requires that the server hardware be set up if a server has to be purchased; otherwise, the settings already in place on the accessible server will be used.

Stage three requires that all software needed to complete the project be installed. This software includes web design software, Speech Application Language Tags (SALT) editors and applications, database software, and other miscellaneous software.

Stage four will be the beginning of the writing process. The thesis will be an ongoing stage and will progress chapter by chapter throughout the project.

Stages of Thesis:

1. February 14, 2005 - Last day to contact the Graduate Reader regarding manuscript production. This is done online.

2. March 18, 2005 - Last day to have the title page to the Graduate Reader for approval. The title page must be approved by the Graduate Reader before it is signed by the committee.

3. March 28, 2005 - Last day for thesis defense. Get approval from the graduate committee with or without provisions.

4. April 4, 2005 - Last day to submit the thesis to the Graduate Reader for proofing and final approval.

5. May 13, 2005 - Last day the thesis may be deposited in the office of Graduate Studies for graduation.

Stage five is where the actual project begins. A web-based user interface will be designed to handle the display and the user inputs and outputs. The interface will give blind people access to academic tests through the use of Speech Application Language Tags (SALT), a standard being designed to “extend existing markup languages such as HTML, XHTML, and XML. Multimodal access will enable users to interact with an application in a variety of ways: they will be able to input data using speech, a keyboard, keypad, mouse and/or stylus, and produce data as synthesized speech, audio, plain text, motion video, and/or graphics. Each of these modes will be able to be used independently or concurrently” [Cover Pages Hosted by Oasis, 2004].

Stage six will be a learning phase devoted to technologies such as .NET, ASP, and SALT, which are needed to accomplish the goal of providing a voice-interactive application.

Stages seven and eight will require establishing a link between the interface and the database. All questions and answers, along with usernames and passwords, will be stored in and retrieved from a database. Once this is done, blind students will log in to the web-based application, choose a test, and hear each question read aloud. The student will then give a multiple-choice response such as A, B, C, or D, and each response will be stored in the database. After the test is completed, it will be graded and the student will be told the score. Other options will be added to give the blind student more control over the test, such as repeating a question, submitting the test, and logging in by voice.
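The storing and grading steps of stages seven and eight can be illustrated with a minimal sketch. It is written in Python with the standard sqlite3 module purely for illustration; the project itself plans to use Microsoft database software together with ASP and SALT. The table names, column names, and function names below are assumptions made for this example and are not part of the proposed design.

```python
import sqlite3

VALID_CHOICES = ("A", "B", "C", "D")


def create_schema(conn):
    """Create hypothetical tables for questions and student answers."""
    conn.execute(
        "CREATE TABLE questions ("
        "id INTEGER PRIMARY KEY, test_id INTEGER, prompt TEXT, correct TEXT)"
    )
    conn.execute(
        "CREATE TABLE answers ("
        "student TEXT, question_id INTEGER, choice TEXT, "
        "PRIMARY KEY (student, question_id))"
    )


def record_answer(conn, student, question_id, spoken_choice):
    """Store one recognized answer; reject anything other than A-D."""
    choice = spoken_choice.strip().upper()
    if choice not in VALID_CHOICES:
        raise ValueError(f"unrecognized choice: {spoken_choice!r}")
    # INSERT OR REPLACE lets a student change an answer before submitting.
    conn.execute(
        "INSERT OR REPLACE INTO answers VALUES (?, ?, ?)",
        (student, question_id, choice),
    )


def grade_test(conn, student, test_id):
    """Return (number correct, number of questions) for one student."""
    total = conn.execute(
        "SELECT COUNT(*) FROM questions WHERE test_id = ?", (test_id,)
    ).fetchone()[0]
    correct = conn.execute(
        "SELECT COUNT(*) FROM questions q JOIN answers a "
        "ON a.question_id = q.id "
        "WHERE q.test_id = ? AND a.student = ? AND a.choice = q.correct",
        (test_id, student),
    ).fetchone()[0]
    return correct, total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    create_schema(conn)
    conn.executemany(
        "INSERT INTO questions VALUES (?, ?, ?, ?)",
        [(1, 1, "Question one text", "B"), (2, 1, "Question two text", "D")],
    )
    record_answer(conn, "student1", 1, "b")
    record_answer(conn, "student1", 2, "a")
    print(grade_test(conn, "student1", 1))
```

In this sketch, an unanswered question simply counts as incorrect, and the speech recognizer is assumed to have already turned the spoken response into text before record_answer is called.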

Stage nine will focus on making the web pages respond to voice commands, so that the pages can be used entirely without sight or touch.
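One way to picture the voice-command handling is a small routine that turns the text returned by a speech recognizer into either an answer or a command. The sketch below is in Python for illustration only; in the project this role would fall to SALT grammars and page script. The phrase list, the command names, and the interpret function are hypothetical.

```python
# Hypothetical mapping from recognized phrases to application commands.
COMMANDS = {
    "repeat": "REPEAT_QUESTION",
    "repeat question": "REPEAT_QUESTION",
    "submit": "SUBMIT_TEST",
    "submit test": "SUBMIT_TEST",
}
ANSWER_LETTERS = {"a", "b", "c", "d"}


def interpret(transcript):
    """Turn recognizer text into a (kind, value) pair, or ("UNKNOWN", None)."""
    # Lowercase the text and drop simple punctuation before matching.
    words = transcript.lower().replace(".", " ").replace(",", " ").split()
    phrase = " ".join(words)
    if phrase in COMMANDS:
        return ("COMMAND", COMMANDS[phrase])
    # Accept "b", "answer b", or "option b" as a multiple-choice response.
    if words and words[-1] in ANSWER_LETTERS and len(words) <= 2:
        if len(words) == 1 or words[0] in ("answer", "option"):
            return ("ANSWER", words[-1].upper())
    return ("UNKNOWN", None)
```

Returning an explicit UNKNOWN result, rather than guessing, would let the application ask the student to repeat the command, which matters when no visual feedback is available.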

Stage ten requires a thorough testing of the application to eliminate all problems that may occur when a user is interacting with the software.

Stage eleven will require a PowerPoint presentation and a demonstration of the project to be given to the Graduate Committee. This will conclude the project work; only the thesis submission remains.

Stage twelve requires that a thesis be submitted to a Graduate Reader.  The Graduate Reader will approve or disapprove of the thesis.


3.2 - Project Timeline


Task: Start Date to End Date

Acquire Server: June 15, 2004 to Friday, June 25, 2004
Setup Server (Hardware): Monday, June 28, 2004 to Friday, July 9, 2004
Install Software (Web Design Software, SALT Editors and Applications, Database Software, and other Miscellaneous Software): Monday, July 12, 2004 to Friday, July 30, 2004
Begin Thesis: Monday, August 2, 2004 to Monday, March 28, 2005
Design web based interface: Monday, August 2, 2004 to Friday, October 29, 2004
Research the .NET Framework. Learn ASP and SALT languages: Monday, August 30, 2004 to Monday, November 29, 2004
Establish a link to the database from the SALT application: Monday, September 6, 2004 to Monday, November 29, 2004
Test Phase 1: Saving and retrieving from database: Monday, November 29, 2004 to Friday, February 11, 2005
Test Phase 2: Voice interaction with web pages: Monday, December 20, 2004 to Friday, February 11, 2005
Test for accuracy and reliability: Friday, February 11, 2005 to Monday, March 28, 2005
Present final Project to Graduate Committee: Monday, March 21, 2005 to Friday, March 25, 2005
Submit Thesis to Graduate Reader: Monday, August 2, 2004 to Friday, May 13, 2005


3.3 - Resources Required

In addition to the stages, several key resources are needed to accomplish the project. They are as follows:

            Voice Server (such as: Pentium 4, 1 GB RAM, 120 GB hard drive)

            Operating System (such as: Windows Server 2003)

            Web Page Software (such as: Macromedia Dreamweaver MX)

            Programming Languages (such as: VoiceXML, SALT, ASP)

            Communication Devices (such as: microphone headsets)


4 - Expected Results

This project is expected to produce a web-based speech-to-text and text-to-speech application for providing online academic tests for blind students. The application will be tested thoroughly to provide a working product that will prove invaluable to blind students and professors in traditional classroom settings. It will improve the quality of education that a blind person can receive and promote a level playing field for the blind.

5 - Impact of the Results

The major impact of this type of application will be more blind students participating at the college level alongside sighted students in a traditional academic setting. According to The National Federation of the Blind (NFB), “Whether one is blind or sighted, the ability to access and manipulate electronic information and control the technology through which such information is obtained are crucial elements which help to determine whether an individual is able to compete with his or her peers” [The National Federation of the Blind, 2002]. This impact should encourage institutions of higher learning that want to attract as many students as possible, regardless of their disability status, to pursue new technology.

6 - Possible Extensions

Future work on the project will be geared toward providing discussion-type questions for academic tests, or tests that contain both multiple-choice and discussion questions. Test questions could also be drawn at random from the database, which would provide a different test for each student. Another possible avenue would be to allow teachers to submit their tests by voice instead of manually entering each question. Future technologists who pursue this type of technology could add numerous other capabilities.
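The random selection of test questions could work along the following lines. This is a minimal Python sketch offered under stated assumptions, not a design decision: the draw_test function, its parameters, and the idea of seeding the draw with the student and test identifiers are all hypothetical.

```python
import random


def draw_test(question_ids, num_questions, student_id, test_id):
    """Pick a reproducible random subset of questions for one student."""
    if num_questions > len(question_ids):
        raise ValueError("not enough questions in the pool")
    # Seeding with the student and test makes the draw repeatable, so the
    # same student gets the same questions if the test page is reloaded.
    rng = random.Random(f"{student_id}:{test_id}")
    # Sorting first makes the result independent of the input order.
    return rng.sample(sorted(question_ids), num_questions)
```

Different students would usually, though not necessarily, receive different question sets; how much the sets differ depends on the size of the question pool relative to the length of the test.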

7 - Summary

In an educational environment, we look for ways to improve students’ lives regardless of their race, age, sex, or disabilities. With a growing number of blind students entering postsecondary education, there is a growing need for web-based voice applications. Speech-to-text and text-to-speech educational testing software will provide the means to a better education and better interaction between teachers and blind students in the classroom. The application will initially be limited to multiple-choice questions and answers, but this will open up future work on the project. Such work would include supporting discussion questions, which would greatly broaden the kinds of tests that can be offered.

8 - References

Burgstahler, S. (2000). Building the Team: Faculty, Staff, and Students Working Together. Retrieved March 15, 2004 from The University of Washington Web site: http://www.washington.edu/doit/Faculty/Rights/Background/statistics.html

Cover Pages Hosted by Oasis. (March 2004). Speech Application Language Tags (SALT). Retrieved September 10, 2004 from the Oasis Web Site: http://xml.coverpages.org/salt.html

Nic, M. (2000). VoiceXML Reference. Retrieved April 26, 2004 from the Zvon Web Site: http://www.zvon.org/xxl/VoiceXMLReference/Output/

National Center for Educational Statistics. (1999). An institutional perspective on students with disabilities in postsecondary education. Washington DC: U.S. Department of Education.  Retrieved March 15, 2004 from The University of Washington Web site: http://www.washington.edu/doit/Faculty/Rights/Background/statistics.html

The National Association of Blind Students. (n.d.). National Association of Blind Students. Retrieved April 13, 2004 from The National Association of Blind Students Web Site: http://www.nfbstudents.org/

The National Federation of the Blind. (2002). Technology, Blindness and the NFB. Retrieved April 26, 2004 from The National Federation of the Blind Web Site: http://www.nfb.org/tech/ibtc2.htm

W3C. (2004). Voice Extensible Markup Language (VoiceXML) Version 2.0. Retrieved April 26, 2004 from the W3C Web Site: http://www.w3.org/TR/voicexml20/