
CONVERTING REAL TIME IMAGES INTO
NATURAL SOUND THROUGH A CUTTING
EDGE SYSTEM

Ajay Chobey, 1st Affiliation (Author)
Information Technology
Vel Tech Multi Tech Dr. Rangarajan Dr. Sakunthala Engineering College
42, Alamathi Road, Avadi, Chennai – 600062
[email protected]

Abstract—The prevalence of visual impairment is a very sensitive issue around the world. Blind people face a number of challenges in ordinary daily activities, including the difficulty of moving about in complete independence and of searching for and recognizing objects. Until ten years ago, the only aids a blind person could use were canes and guide dogs for accompaniment and mobility. In the last decade, electronic devices have been introduced into the world of the blind with the aim of making these people's lives easier. We have built a system that gives a blind person the ability to perceive the surrounding environment through sound.

I. INTRODUCTION
Blind people face several problems throughout their lives; one of the most important is detecting obstacles while walking. In this research, we propose a system with two cameras mounted on a blind person's glasses, whose task is to take pictures from different sides. By comparing these two pictures, we are able to find obstacles. In this method we first estimate the likelihood that an object is present using special points that we call "proportional points"; we then use a binary method and standardized, normalized cross-correlation to check this likelihood. The system was tested under three different conditions, and the

Pavan S, 2nd Affiliation (Author)
Information Technology
Vel Tech Multi Tech Dr. Rangarajan Dr. Sakunthala Engineering College
42, Alamathi Road, Avadi, Chennai – 600062
[email protected]

estimated error was within the acceptable range. Blindness is a condition of lacking visual perception due to neurological or physiological factors. Partial blindness reflects a failure of integration in the growth of the optic nerve or the visual center of the eye, and total blindness is the complete absence of visual light perception. In this work, a cheap, simple, user-friendly, smart blind-guidance system is designed and implemented to improve the mobility of both blind and visually impaired people in a specific area. The proposed work includes wearable equipment consisting of a lightweight blind stick and a sensor-based obstacle-detection circuit, developed to help the blind person navigate alone safely and to avoid any obstacles that may be encountered, whether fixed or mobile, and thus prevent possible accidents. The principal part of this system is the infrared sensor, which is used to scan a predetermined zone around the blind person by emitting reflecting waves. The primary objective of this project is to develop an application for blind people that detects objects in various directions and identifies pits and manholes on the ground so that the path is free to walk. Detecting objects using image processing can be used in many industrial as well as social applications. This project proposes to use object detection for blind people and give them audio/vocal information about their surroundings. We detect an object using the mobile camera and give voice instructions about the direction of the object. The user must first train the system on the object in question. We then perform feature extraction to look for objects in the camera view. We use the position of the object to give directions about it.
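The cross-correlation check described above can be illustrated with a minimal sketch. The patch values, window size, and function name below are illustrative assumptions, not the implementation used in this work:

```python
import math

def normalized_cross_correlation(patch_a, patch_b):
    """Compare two equally sized grayscale patches (flat lists of pixels).

    Returns a value in [-1, 1]; values near 1 indicate the patches match,
    which here suggests the same object is seen by both cameras.
    """
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(patch_a, patch_b))
    den = math.sqrt(sum((a - mean_a) ** 2 for a in patch_a) *
                    sum((b - mean_b) ** 2 for b in patch_b))
    return num / den if den else 0.0

# Two hypothetical patches around a candidate "proportional point" in the
# left and right camera images; the right patch is shifted by disparity.
left = [10, 12, 14, 40, 42, 44, 80, 82, 84]
right = [11, 13, 15, 41, 43, 45, 81, 83, 85]
score = normalized_cross_correlation(left, right)
```

A score near 1 confirms the candidate point; a low score rejects it.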
II. BACKGROUND AND RELATED WORKS:
Globally, an estimated 40 to 45 million people are totally blind, 135 million have low vision, and 314 million have some form of visual impairment. The incidence and demographics of blindness vary considerably across different parts of the world. In most industrialized countries approximately 0.4% of the population is blind, while in developing countries the figure rises to 1%. The World Health Organization (WHO) estimates that 87% of the world's blind live in developing countries. This paper summarizes recent developments in audio- and tactile-feedback-based assistive technologies targeting the blind community. Current technology allows applications to be efficiently distributed and run on mobile and handheld devices, even where the computational requirements are significant. Consequently, electronic travel aids, navigational assistance modules, text-to-speech applications, as well as virtual audio displays that combine audio with haptic channels, are becoming integrated into mainstream mobile phones. This trend, combined with the appearance of increasingly user-friendly interfaces and modes of interaction, has opened up a variety of new perspectives for the rehabilitation and training of users with visual impairments. The goal of this paper is to give an overview of these developments in light of recent advances in basic research and application development. Using this overview as a foundation, an agenda is outlined for future research in mobile interaction design with respect to users with special needs, and ultimately in relation to sensor-bridging applications in general.
EXISTING SYSTEM
A. Dedicated Systems:
In the existing framework, blind people usually use the red-and-white cane, which is only able to detect whether or not there is an obstacle. The person scans the area with the cane, and the radar conveys information to the user as vibrational feedback.
B. Disadvantages Of Existing System:
• He or she always depends on others for assistance.
• The person will not know what kind of obstacle it is.
• It is not always reliable.

III. Abbreviations and Acronyms
YOLO – You Only Look Once
TTS – Text To Speech
ETA – Electronic Travel Aid

IV. PROPOSED SYSTEM:
A. SYSTEM OVERVIEW:
The person hears all the objects around him. The accuracy of the recognition is 70–90 percent. The system waits for the user's feedback and gets more accurate by learning on its own.
B. PROPOSED SYSTEM ADVANTAGES:
• Nearly accurate with each prediction.
• Also improves the person's hearing ability.

C. PROPOSED SYSTEM ARCHITECTURE:

Fig: 1


V. METHODOLOGIES
A. MODULE NAMES:
There are four modules in the project:
• Capturing The Surrounding
• Image Processing
• Convert Gathered Data Into Text
• Convert Text To Speech
1) Capturing The Surrounding
This module captures the image in real time with electronic image stabilization; it is the interface between the system and the surroundings. The captured image depends on the type of camera used to take the photo, and a minimum megapixel size is required for detecting the image reliably.
2) Image Processing
The image is converted to grayscale, Canny edge detection is applied, and the result is compared with stored templates; the YOLO algorithm is then used to find the content in the image.
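The grayscale step can be sketched in pure Python using integer ITU-R BT.601 luminosity weights; in practice a vision library such as OpenCV would handle this stage (and the Canny and YOLO stages, which are not shown here):

```python
def to_grayscale(rgb_image):
    """Convert an RGB image (nested lists of (r, g, b) tuples, 0-255) to
    grayscale using integer BT.601 weights (0.299, 0.587, 0.114)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_image]

# Tiny 2x2 test image: red, green / blue, white
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
gray = to_grayscale(image)
```

Integer arithmetic keeps the result deterministic and avoids floating-point rounding differences across platforms.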
3) Convert Gathered Data Into Text
This module takes only the required data and produces the appropriate text from it. Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten, or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards), or from subtitle text superimposed on an image (for example, from a television broadcast). It is widely used as a form of data entry from printed paper records, whether passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any suitable documentation. It is a common method of digitizing printed text so that it can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data extraction, and text mining. OCR is a field of research in pattern recognition, artificial intelligence, and computer vision.
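Apart from OCR, the module's core task of producing suitable text from the gathered data can be sketched as follows; the labels, frame width, and left/ahead/right split below are illustrative assumptions, not the actual module:

```python
def detections_to_text(detections, frame_width):
    """Turn detector output into a short spoken-style description.

    `detections` is a list of (label, x_center) pairs, where x_center is the
    horizontal center of the detected object's bounding box in pixels.
    """
    phrases = []
    for label, x_center in detections:
        third = frame_width / 3
        if x_center < third:
            side = "on your left"
        elif x_center < 2 * third:
            side = "ahead"
        else:
            side = "on your right"
        phrases.append(f"{label} {side}")
    return ", ".join(phrases) if phrases else "no objects detected"

# Hypothetical detections in a 640-pixel-wide frame
text = detections_to_text([("chair", 100), ("person", 320), ("door", 600)], 640)
```

The resulting string is what the final module would hand to the text-to-speech engine.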
4) Convert Text To Speech
The gathered data is converted into speech using a text-to-speech engine. Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer and can be implemented in software or hardware. A text-to-speech (TTS) system converts normal-language text into speech; other systems render symbolic linguistic representations, such as phonetic transcriptions, into speech. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units: a system that stores phones or diphones provides the largest output range but may lack clarity, while for specific usage domains the storage of entire words or sentences allows high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output. The quality of a speech synthesizer is judged by its similarity to the human voice and by its intelligibility. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer. Many computer operating systems have included speech synthesizers since the early 1980s.
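The concatenative approach can be illustrated with a toy sketch that assumes a word-level unit database; the words and placeholder waveforms below are invented for illustration:

```python
def synthesize(text, unit_db):
    """Concatenative synthesis sketch: look up each word in a database of
    recorded units and join the stored waveforms (here, placeholder lists
    of samples) into one output waveform."""
    waveform = []
    for word in text.lower().split():
        unit = unit_db.get(word)
        if unit is None:
            raise KeyError(f"no recorded unit for {word!r}")
        waveform.extend(unit)
    return waveform

# Toy "database": word -> pretend waveform samples
unit_db = {"door": [0.1, 0.2], "ahead": [0.3, 0.4, 0.5]}
audio = synthesize("door ahead", unit_db)
```

A real engine would store phones or diphones rather than whole words, and would smooth the joins between units.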
VI. WORKING
After switching on the device, the algorithm starts to capture images from the environment; the system is independent of the camera used to retrieve the information. It functions more efficiently when the hardware configuration is enhanced. It can also be trained on new things, which the algorithm can then easily identify. Currently, the system is capable of identifying more than 3,000 different objects, and it is not limited to this: support for further objects and features can be added. The working process is presented below.


Fig: 2
The text-to-speech engine has three phases: text analysis, linguistic analysis, and wave generation. Text analysis extracts the characters in each word, while linguistic analysis helps determine the word and thus how the phrase in the sentence should be pronounced.
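The three phases can be sketched as a small pipeline; the tiny lexicon and phoneme strings below are illustrative assumptions, not a real pronunciation dictionary:

```python
def text_analysis(sentence):
    """Phase 1: break the sentence into normalized words."""
    return [w.strip(".,!?").lower() for w in sentence.split()]

def linguistic_analysis(words, lexicon):
    """Phase 2: map each word to its phonetic transcription; unknown
    words fall back to their spelling."""
    return [lexicon.get(w, w) for w in words]

def wave_generation(phonemes):
    """Phase 3: stand-in for the audio stage; a real engine would emit
    waveform samples here instead of a string."""
    return " ".join(phonemes)

lexicon = {"door": "D AO R", "ahead": "AH HH EH D"}
phrase = wave_generation(linguistic_analysis(text_analysis("Door ahead."), lexicon))
```

Each phase consumes the previous phase's output, mirroring the flow in Fig. 3.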

Fig: 3


VIII. CONCLUSION:
In this project, an aid for the visually impaired is provided so that they can interact with their surroundings with ease. In comparison, other aids for blind people are limited to a specific function and are very restricted. The proposed system, however, can not only detect objects but is also able to determine what each object is, and it gives live, real-time feedback to the person.

ACKNOWLEDGMENT

We thank God Almighty for giving us such
wonderful blessing and we express our gratitude
and sincere thanks to our beloved Founder
President Col. Prof. Dr. Vel. Shri.
R.RANGARAJAN B.E.(Elec.), B.E.(Mech.),
M.S.(Auto.), Ph.D. and Vice Chairman
Dr. (Mrs.) SAKUNTHALA
RANGARAJAN, M.B.B.S., for their support
through the institution.
We thank our Principal Dr.V.RAJAMANI
Ph.D., for offering us all the facilities to do the
project.
We also express our sincere thanks to
Mr.R.PRABU, M.Tech., Head of The
Department, Department of Information
Technology for his support to do this project
work.
Special thanks must be mentioned to our
Project Coordinator, Mr.P.VIJAY ANAND
M.Tech., Assistant Professor, Department of
Information Technology, for his advice and
valuable guidance.
We also express our profound gratitude and
Special thanks to our Internal Guide,
Mr.C.Sarvanan, M.Tech., Assistant Professor,
Department of Information Technology, for his
suggestions and valuable guidance at every
stage of the project.
We thank all the Faculty Members in our
department for their guidance to finish this
project successfully. We also like to thank all
our friends for their willing assistance and to
our beloved Parents for everything that they
have done for us.
Further, this acknowledgement would be incomplete if we did not mention a word of thanks to our most beloved Parents, whose continuous support and encouragement throughout the course led us to pursue the degree and confidently complete the project work.
