About this sample
Words: 2318 | Pages: 5 | 12 min read
Published: Jul 15, 2020
Visual impairment and blindness caused by contagious diseases have declined sharply, but growing numbers of people are at risk of age-related visual impairment. Today, 285 million people worldwide live with a vision impairment. A visual impairment is the loss of part or all of one's ability to see; the impairment persists even with the use of eyeglasses, contact lenses, medical assistance, or surgery.
Visual information is the basis for most navigational tasks, so individuals with visual impairment are at a disadvantage because relevant information about the surrounding environment is unavailable to them. Assisting the visually impaired along their navigation path to a destination is a challenging task that has drawn the attention of several researchers. With recent progress in assistive technology, it is possible to augment the support given to people with visual impairment during their mobility. This literature review organizes the current findings into distinct concept categories, such as stereo vision in blind navigation, to enable deeper analysis of the area.
According to WHO fact sheets, approximately 1.3 billion people are estimated to live with some form of distance or near visual impairment. With regard to distance vision, 188.5 million have mild vision impairment, 217 million have moderate to severe vision impairment, and 36 million people are blind.
The visual system is a remarkable human faculty that lets a person perceive the world around them. Humans depend on this ambient information to perform their daily activities and lead a comfortable life; without it, they face many problems in day-to-day living. There is therefore a vital need to understand the loss of human sight, and restoring vision to the visually impaired is the subject of intensive research in both the engineering and medical professions.
One of the major problems faced by the visually impaired is navigating through the environment without colliding with obstacles. To cope with this problem, blind people have used long canes and guide dogs for many years. However, these aids only convey information about nearby obstacles within a short range, failing to capture the wider environment. Obstacle detection is the process of detecting an object or barrier around an autonomous system that can affect the system's movement.
Obstacle detection is used to avoid collisions and to ensure the safety of vehicles, robots, and similar systems, and it is of great importance in blind navigation. Most blind people use a cane to detect nearby obstacles and to navigate, but a cane is not accurate enough for navigation over longer distances. An obstacle detector can aid blind people in path navigation and help them avoid collisions. In this literature review, we present the current findings on stereo vision in blind navigation to enable deeper analysis of the area.
Computer Vision
Computer vision is a branch of computer science that enables computers to see, classify, and process images in a way similar to human vision, and then produce useful output. It is like conveying human judgement and instinct to a machine. Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, and for extracting high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions.
Stereo vision
Computer stereo vision is the extraction of 3D information from digital images, such as those obtained by a CCD camera. By comparing information about a scene from two vantage points, 3D information can be extracted by examining the relative positions of objects in the two views. This is similar to the biological process of stereopsis. Stereoscopic images are often stored as MPO (multi picture object) files, and researchers have recently pushed to reduce the storage these files need while maintaining the high quality of the stereo image. With stereo cameras, objects in the field of view appear at slightly different locations within the two images because of the cameras' different perspectives on the scene.
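The positional shift between the two views (the disparity) is what carries the 3D information: under the standard pinhole stereo model, depth is Z = f·B/d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity. A minimal sketch (the focal length and baseline values below are illustrative assumptions, not from any particular camera):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulate depth from stereo disparity: Z = f * B / d.

    focal_px    : focal length in pixels
    baseline_m  : separation between the two cameras in metres
    disparity_px: horizontal pixel shift of a feature between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A nearer object produces a larger disparity (example values assumed):
near = depth_from_disparity(700.0, 0.12, 42.0)   # 2.0 m away
far = depth_from_disparity(700.0, 0.12, 10.5)    # 8.0 m away
```

Note the inverse relationship: halving the distance doubles the disparity, which is why disparity maps brighten towards nearby objects in the systems described later.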
Estimation of depth map using single image
A single image is used: Canny edge detection combined with morphological operations finds the obstacles, and a local depth hypothesis provides the depth information. Canny edge detection is one of the optimal edge detectors. Noise is first removed by blurring the image with a Gaussian filter. Edge detection is then performed on the blurred image by examining each pixel's gradient magnitude, and sharp edges are isolated by suppressing everything except the local maxima.
Edge linking is then performed: broken edges are joined to recover a meaningful image using the morphological operation known as dilation. The obstacles are then filled with white, i.e., all pixels inside closed boundaries are coloured white using a flood fill operation.
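The edge-linking step can be illustrated with a plain 3x3 binary dilation. This toy implementation (pure Python, no image library assumed) shows how a one-pixel gap in a broken edge is closed by a single dilation pass:

```python
def dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if it or any 8-neighbour is 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))))
    return out

# A broken horizontal edge (gap at column 2) is closed after one pass:
edge = [[0, 0, 0, 0, 0],
        [1, 1, 0, 1, 1],
        [0, 0, 0, 0, 0]]
closed = dilate(edge)
assert closed[1] == [1, 1, 1, 1, 1]
```

Real pipelines apply this on the Canny output (e.g. OpenCV's `cv2.dilate`) before the flood fill that whitens the closed obstacle regions.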
The farthest point from the user is the vanishing point. Pixel brightness increases from the vanishing point towards the pixels near the user. There are four depth hypotheses: top to bottom, bottom to top, left to right, and right to left, and one is selected according to the vanishing point. When an obstacle has no vanishing point, the default hypothesis, bottom to top, is used. The final depth of an obstacle is estimated by comparing the depth map obtained from the vanishing-point hypothesis with the default depth map of a flat surface without obstacles. The resulting spatial information about the obstacles is conveyed to the visually impaired person.
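One plausible way to pick among the four hypotheses is from the vanishing point's position in the frame; this sketch is an illustrative assumption, not the authors' exact rule, and it reproduces only the stated default when no vanishing point exists:

```python
def choose_hypothesis(vanishing_point, width, height):
    """Pick a depth-gradient hypothesis from the vanishing point location.

    vanishing_point: (x, y) in pixel coordinates, or None if none was found.
    The border nearest the vanishing point indicates the direction in which
    depth increases (assumed heuristic for illustration).
    """
    if vanishing_point is None:
        return "bottom_to_top"          # stated default hypothesis
    x, y = vanishing_point
    borders = {
        "top_to_bottom": y,             # vanishing point near the top edge
        "bottom_to_top": height - y,    # near the bottom edge
        "left_to_right": x,             # near the left edge
        "right_to_left": width - x,     # near the right edge
    }
    return min(borders, key=borders.get)

assert choose_hypothesis(None, 640, 480) == "bottom_to_top"
assert choose_hypothesis((320, 20), 640, 480) == "top_to_bottom"
```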
Depth Map generation
In 3D computer graphics, a depth map is an image or image channel that contains information about the distance of scene surfaces from a viewpoint. The term is related to, and may be analogous to, depth buffer, Z-buffer, Z-buffering, and Z-depth. The "Z" in these terms refers to the convention that the camera's central axis of view lies along its Z axis, not to the absolute Z axis of the scene. Depth map generation finds corresponding matches between the two images; the epipolar constraint makes this search faster and more accurate. Once matches are found, the disparity is computed. Adjusting the numDisparities and blockSize parameters yields better results.
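The matching-and-disparity step (what OpenCV's StereoBM does on full images, tuned via numDisparities and blockSize) can be illustrated on a single scanline with sum-of-absolute-differences (SAD) block matching. This is an illustrative sketch of the technique, not a production matcher:

```python
def scanline_disparity(left, right, block_size=3, num_disparities=4):
    """SAD block matching on one scanline; returns per-pixel disparity.

    For each pixel in the left scanline, slide a small block over the right
    scanline (up to num_disparities shifts) and keep the shift with the
    lowest sum of absolute differences.
    """
    half = block_size // 2
    disp = [0] * len(left)
    for x in range(half, len(left) - half):
        block = left[x - half:x + half + 1]
        best_d, best_cost = 0, float("inf")
        for d in range(num_disparities):
            lo = x - half - d
            if lo < 0:
                break
            cost = sum(abs(a - b) for a, b in zip(block, right[lo:lo + block_size]))
            if cost < best_cost:
                best_d, best_cost = d, cost
        disp[x] = best_d
    return disp

# A feature centred at x=4 in the left view appears shifted 2 px in the right:
left = [0, 0, 0, 5, 9, 5, 0, 0, 0, 0]
right = [0, 5, 9, 5, 0, 0, 0, 0, 0, 0]
assert scanline_disparity(left, right)[4] == 2
```

Larger blocks smooth out noise at the cost of detail, and a larger disparity range lets the matcher see closer objects; this is the trade-off behind tuning numDisparities and blockSize.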
Acoustic Feedback
The acoustic feedback is responsible for informing the user about the presence of different types of obstacles, either static or dynamic. The main goal is to transmit warning messages fast enough that the person can walk normally while avoiding dangerous situations. At the hardware level, Bluetooth bone-conduction headphones are used; the user's ears are not covered, so the user can continue to hear other external sounds from the surrounding environment.
Acoustic alerts are preferable to and easier to use than tactile stimulation. The vibrations used in tactile stimulation are insufficient to capture the overall dynamics of the environment. Furthermore, systems adopting tactile stimulation are considered invasive because they require physical contact with the skin. Detecting the obstacle distribution in the scene, semantically interpreting the identified objects, and effectively transmitting this information to the user are the key elements of any effective ETA (electronic travel aid). However, the manner of delivering the acoustic feedback plays a central role in users' acceptance of the device.
There are many approaches to obstacle detection. One uses a smartphone: a set of interest points is extracted from the image and tracked with the Lucas-Kanade algorithm. The camera and background motion are then estimated through homographic transforms. Obstacles are marked as urgent or normal based on their distance to the subject and the associated vector orientation, and each detected obstacle is sent to an object classifier that determines the degree of danger. Another technique uses color segmentation and stereo-based color homography. Yet another uses a camera together with ultrasonic sensors: the obstacle is detected by the ultrasonic sensor, while the object's size is calculated from the camera image.
Blavigator
According to the Blavigator model, the system is composed of three main components: the position and orientation unit, which supplies the navigation system with the user's location as local and/or global coordinates; the Geographic Information System (GIS), which contains geo-referenced data stored in a database; and the user interface. It assists visually impaired people by acting as a substitute for natural vision and interacting via audio interfaces.
Voice
In Voice, a single video camera captures the image, which is scanned and converted into sound. Loudness is proportional to the brightness of each pixel; high frequencies encode the top portion of the image and low frequencies the lower portion.
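The per-pixel mapping described above can be sketched as follows; the frequency range and image height are illustrative assumptions, not values taken from the Voice system, and the column-by-column playback is left out:

```python
def pixel_to_tone(row, brightness, height, f_low=500.0, f_high=5000.0):
    """Map one pixel to (frequency_hz, amplitude).

    row 0 is the top of the image and maps to the highest frequency;
    brightness in [0, 255] scales amplitude linearly into [0, 1].
    """
    frac = row / (height - 1)                  # 0.0 at the top, 1.0 at the bottom
    freq = f_high - frac * (f_high - f_low)    # high pitch for the top portion
    amp = brightness / 255.0                   # louder for brighter pixels
    return freq, amp

# Top row -> high frequency, full loudness; bottom row -> low frequency, silent:
assert pixel_to_tone(0, 255, 64) == (5000.0, 1.0)
assert pixel_to_tone(63, 0, 64) == (500.0, 0.0)
```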
NAVI
In NAVI, image processing techniques separate objects from the background, with brighter pixels representing closer objects. The image is then converted into stereo sound, with loudness proportional to pixel brightness.
Optophone
In Optophone, a depth map is created using an edge detection technique and converted into sound with a technique similar to the one used in the Voice system.
Yoshihiro Kawai and Fumiaki Tomita's prototype system
In the prototype system, there are a computer, a headset with a microphone, headphones, three small cameras, and a sound processor. The three images are used to build a 3D structure and perform object recognition, which is then converted into 3D virtual sound. In pioneering work by Zelek et al., images captured by two video cameras are used to obtain disparity information, which provides information about objects through tactile feedback.
Ifukube et al. navigation system
Ifukube et al. developed a navigation system that uses two ultrasonic sensors and interacts with the user through sound. The idea behind this model is the reflection of ultrasonic waves by nearby objects.
Smartvision
In SmartVision, Geographic Information System (GIS) data is used to make geographic decisions, indoors or outdoors, and a stereo vision system is used for safe and accurate navigation and orientation. It uses a Global Positioning System (GPS) unit for outdoor navigation and Wi-Fi for indoor navigation; in addition, Radio Frequency Identification (RFID) and computer vision are used both indoors and outdoors for landmark and obstacle detection.
The GIS and location data are used to provide orientation and navigation instructions to the user about nearby obstacles and the appropriate direction of movement. The disparity between the two captured images is used to create a depth map giving the distance between the user and a landmark or obstacle. Disparity is inversely proportional to distance, so objects closer to the user are represented with brighter pixels than objects farther away. Feature detection is also performed, and to improve performance only a region of interest in the image, usually the portion nearest the user, is selected.
The example given is of a circle: the image is converted to a binary image with the circle, considered as the landmark, in white and the background in black. The nearest circle is chosen as the immediate destination point, and the positions of the user and the circle are used to compute a correction angle with which the user corrects their trajectory. The instruction given to the user is not an absolute value; instead, the range of angles is divided into zones, the first zone calling for minimal correction and the third for maximum correction.
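The zone-based correction described above might be computed as follows; the zone thresholds, the heading convention (0 degrees along the +x axis), and the left/right labels are all illustrative assumptions, not taken from the system itself:

```python
import math

def correction_zone(user_xy, heading_deg, landmark_xy):
    """Return a coarse correction zone instead of an absolute angle."""
    ux, uy = user_xy
    lx, ly = landmark_xy
    # Bearing from the user to the landmark, in degrees.
    bearing = math.degrees(math.atan2(ly - uy, lx - ux))
    # Signed heading error, wrapped into (-180, 180].
    error = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    mag = abs(error)
    side = "left" if error > 0 else "right"
    if mag < 10:
        return "zone 1: minimal correction"
    if mag < 45:
        return "zone 2: moderate correction " + side
    return "zone 3: maximum correction " + side

# A landmark almost straight ahead needs little correction; one far off-axis
# falls into the maximum-correction zone:
assert correction_zone((0, 0), 0.0, (10, 1)).startswith("zone 1")
assert correction_zone((0, 0), 0.0, (1, 10)).startswith("zone 3")
```

Reporting a zone rather than an exact angle keeps the audio instruction short, which matches the requirement that warnings arrive fast enough for the user to keep walking normally.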
Navbelt
Navbelt, developed by Shoval et al., helps the user navigate using beep sounds and employs ultrasonic sensors for collecting data; the model was found to be bulky and difficult to carry. Many rangefinding techniques exist, such as ultrasonic and laser sensors, but they usually only measure the distance to an obstacle and cannot detect changes in the surface. James Coughlan and Huiying Shen proposed a method to detect curbs based on stereo vision, but it was unsuccessful because the alignment of the stereo camera was not stable. Aniket Murarka et al. used two calibrated laser sensors and a stereo camera for color segmentation and motion-based obstacle detection; the approach is quite expensive and complicated and still suffers from the alignment problem.
Assisting the visually impaired along their navigation path is a challenging task that has drawn the attention of several researchers. Many techniques based on RFID, GPS, and computer vision modules are available for blind navigation assistance. A depth estimation technique that requires no user intervention, working from images alone, was used in the development of many navigation systems, and its application to assisting the visually impaired was investigated. The main objective of the proposed works was to create a system that detects specific landmarks in the environment, provides orientation and navigation instructions, and gives distance information to the blind user. The proposed models have proven able to provide valid and simple instructions to the blind user, assisting their navigation in a non-intrusive manner.
At the moment, the disparity information (depth maps) is not being used to provide distance information together with the correction indications, and this aspect remains unexplored. Since each pixel of the depth map represents the distance from the user to the object at the corresponding pixel of the captured image, adding distance information to the correction outputs is a natural next step.