DESIGN AND IMPLEMENTATION OF A MOBILE APP FOR THE VISUALLY IMPAIRED

Project topic for Computer Science department.

CHAPTER ONE

INTRODUCTION

1.1       Background of the Study

This work describes the development of a prototype mobile application that allows a visually impaired user to capture words as images with the smartphone camera and then hear those words as audio. The system uses existing Optical Character Recognition (OCR) and Text-to-Speech (TTS) frameworks, combining them so that, together, they produce the desired result.

OCR is the mechanical or electronic translation of images of handwritten or printed text into machine-editable text (Sagar, Shobha & Ramakanth Kumar, 2008). It can be used for a variety of applications, including:

Scanning printed documents into versions that can be edited with word processors, like Microsoft Word or Google Docs.

Indexing print material for search engines.

Automating data entry, extraction and processing.

Deciphering documents into text that can be read aloud to visually-impaired users.

Archiving historic information, such as newspapers, magazines or phonebooks, into searchable formats.

Electronically depositing checks without the need for a bank teller.

Placing important, signed legal documents into an electronic database.

Recognizing text, such as license plates, with a camera or software.

Sorting letters for mail delivery.

Translating words within an image into a specified language.

OCR can recognize both handwritten and printed text, but its performance depends directly on the quality of the input documents. It is designed to process images, such as pictures captured by a mobile camera, that consist almost entirely of text with very little non-text clutter (Mithe, Indalkar & Divekar, 2013).
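
To illustrate why input quality matters, the following minimal Python sketch rejects a washed-out picture before it is sent to OCR. It is not part of the original prototype: the function name has_enough_contrast, the grayscale grid representation and the threshold of 80 are all assumptions made for illustration, and a real application would likely use a more robust quality measure.

```python
from typing import List


def has_enough_contrast(gray: List[List[int]], min_spread: int = 80) -> bool:
    """Return True when the darkest and brightest pixels differ by at least min_spread.

    gray is a grid of 0-255 grayscale values (hypothetical input format);
    min_spread is an assumed threshold, not a value taken from the source.
    """
    pixels = [p for row in gray for p in row]
    if not pixels:
        # An empty image cannot contain readable text.
        return False
    return max(pixels) - min(pixels) >= min_spread


# Dark text on a light page versus a nearly uniform grey picture.
sharp = [[0, 255], [10, 240]]
washed_out = [[120, 130], [125, 128]]

print(has_enough_contrast(sharp))
print(has_enough_contrast(washed_out))
```

When the check fails, the application could ask the user, by voice, to retake the picture instead of reading out unreliable text.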

In TTS, the text recognized by OCR is the input and speech is the output. Speech synthesis is the artificial production of human speech (Sasirekha & Chandra, 2012), and a computer system used for this purpose is called a speech synthesizer. A speech synthesizer can be implemented in hardware or in software. A text-to-speech (TTS) system converts normal language text into speech, so in this application it automatically reads aloud the text extracted by OCR. A synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output.
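
The way the two stages fit together can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the prototype's actual code: the source does not name the OCR or TTS frameworks used, so the class PhotoToSpeech, the type aliases and the stub engines fake_ocr and fake_tts are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical type aliases: the source does not name concrete frameworks.
OcrEngine = Callable[[bytes], str]   # image bytes -> recognized text
TtsEngine = Callable[[str], bytes]   # text -> audio bytes


@dataclass
class PhotoToSpeech:
    """Chains an OCR engine and a TTS engine: image in, audio out."""

    ocr: OcrEngine
    tts: TtsEngine

    def read_aloud(self, image: bytes) -> Tuple[str, bytes]:
        # Stage 1: OCR turns the captured image into text.
        text = self.ocr(image).strip()
        if not text:
            # Nothing recognized: speak a fallback so the user is not left in silence.
            text = "No text was found in the picture."
        # Stage 2: TTS turns the text into audio.
        return text, self.tts(text)


# Stub engines standing in for real frameworks, for illustration only.
def fake_ocr(image: bytes) -> str:
    return image.decode("utf-8")


def fake_tts(text: str) -> bytes:
    return f"<audio:{text}>".encode("utf-8")


app = PhotoToSpeech(ocr=fake_ocr, tts=fake_tts)
text, audio = app.read_aloud(b"  EXIT  ")
print(text)
print(audio)
```

Keeping the two engines behind simple function types means either one could be swapped for a different framework without changing the rest of the application.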

1.2       Statement of the Problem

Visually impaired people find it difficult, if not impossible, to perform visual tasks. For instance, reading text requires a braille reading system or a digital speech synthesizer (if the text is available in digital format). The majority of published printed works do not include braille or audio versions, and digital versions are still a minority. Visually impaired people are also unable to read the simple warnings on walls or the signs that surround us. Thus, a mobile application that can perform image-to-speech conversion, whether the text is written on a wall, a sheet of paper or another medium, has great potential and utility.

1.3       Motivation

Visual impairment is a very serious problem in our society. Most visually impaired people read using braille, but our day-to-day information (such as signposts, instructions on walls, etc.) is not in braille. This makes it difficult for them to cope with everyday issues. The researcher was thus motivated to develop this mobile app to help visually impaired people perform visual tasks and cope with everyday activities with as little difficulty as possible.

1.4       Aim and Objectives

The aim of this study is the design and implementation of a mobile app for the visually impaired. This is to be achieved through the following objectives:

  • To design a user interface for receiving and processing visual information.
  • To implement an Optical Character Recognition (OCR) system to convert text images into digital information.
  • To implement a Text-to-Speech (TTS) system to convert the processed text into audio information.
  • To design a user interface for the output of text and audio information.

1.5       Purpose of the Study

Visually impaired people find it difficult to perform visual tasks. For instance, reading text requires a braille reading system or a digital speech synthesizer (if the text is available in digital format). The majority of published printed works do not include braille or audio versions, and digital versions are still a minority. Thus, a mobile application that can perform image-to-speech conversion, whether the text is written on a wall, a sheet of paper or another medium, has great potential and utility.

Visual impairment makes life rather difficult for the people it affects, but technology can help with some day-to-day tasks. In this context, the present work focuses on the development of a photo-to-speech application for the visually impaired, and its ultimate purpose is a mobile application that allows a user to "read" text. To achieve that, Optical Character Recognition (OCR) and Text-to-Speech (TTS) frameworks are integrated, enabling the user to take a picture with a smartphone and hear the text that appears in the picture.

1.6       Significance of the Study

Optical Character Recognition (OCR) technology enables the recognition of text from image data. It has been widely applied to scanned or photographed documents, converting them into electronic copies that one can edit, search, play back and easily carry (Elmore & Martonos, 2008). Text-to-Speech (TTS) technology enables text in digital format to be synthesized into a human voice and played through an audio system. The objective of TTS is the automatic conversion of unrestricted sentences into spoken discourse in a natural language, resembling the way a native speaker of the language would read the same text. This technology has made significant progress over the last decade, with many systems able to generate synthetic speech very close to the natural voice. Research in the area of speech synthesis has grown as a result of its increasing importance in many new applications (Thomas, 2007).

1.7     Organization of the Work

This project is organized into five chapters. Chapter 1 introduces the project. Chapter 2 reviews the literature on this subject. Chapter 3 discusses the research methodology. Chapter 4 explains the implementation and evaluation of the design, and Chapter 5 presents the summary of results, conclusion, recommendations and suggestions for further research.

1.8    Definition of Terms

Optical Character Recognition (OCR)

The process of recognizing the characters present in an image of written material and automatically converting them into text format, which can then be used in various applications.

Text-to-Speech (TTS)

A computer system that should be able to read aloud any text, regardless of its origin.

Framework

A framework, or software framework, is a platform for developing software applications. It provides a foundation on which software developers can build programs.

Software Development Kit (SDK)

A programming package that enables a programmer to develop applications for a specific platform.

