Sign language is a language that uses visually exhibited sign patterns, simultaneously combining hand shapes, the orientation and movement of the hands, arms, or body, and facial expressions to express one's thoughts and communicate with others; it is used mainly by hearing- and speech-impaired people. An automatic sign language recognition system needs fast and accurate methods for identifying static signs, or a sequence of produced signs, in order to interpret their meaning. Hand gestures are a major component of sign languages. In this paper, a robust approach for the recognition of static, bare-handed sign language is presented, using a novel combination of features: Local Binary Pattern histogram features based on color and depth information, together with geometric features of the hand.
Linear binary Support Vector Machine classifiers are used for recognition, coupled with template matching in the case of multiple matches. The research aims at hand gesture recognition for sign language interpretation as a Human-Computer Interaction application. Sign language is used by hearing- and speech-impaired persons. It conveys meaning through hand gestures, as opposed to acoustically conveyed sound patterns. It is analogous to spoken languages, which is why linguists consider it one of the natural languages, though there are also some notable differences from spoken languages. Although sign language is used across the globe, it is not universal. Several hundred sign languages are in use, varying from place to place, and they are at the core of local deaf cultures. Some sign languages have achieved legal recognition, while others have no official status. American Sign Language, German Sign Language, French Sign Language, British Sign Language, Indian Sign Language, and others have evolved regionally. Indian Sign Language is one of the oldest known sign languages and is considered extremely important historically, but it is rarely used nowadays. In linguistic terms, despite the common misconception that they are not real languages, sign languages are as rich and complex as any spoken language.
Studies of these languages by professional linguists have found that many sign languages exhibit the basic properties of all spoken languages. The elements of a sign are Hand shape, Orientation (of the palm), Location, Movement, and facial Expression, summarized in the acronym HOLME. The core idea behind the proposed method is to exploit a novel combination of color, depth, and geometric information about the hand sign to increase recognition performance, whereas most approaches attempt to use a combination of only two or fewer of these. This enables the system to recognize a wide range of signs, even ones that appear very similar. Researchers face the challenging problem of building a vision-based human-computer interaction system for interpreting sign languages. This survey provides the theoretical and literature foundation: research on sign languages and the challenges faced are reviewed. One of the problems is that the spoken and written language of a country differs from those of other countries. The syntax and semantics of a language vary from one region to another even when the same language is used by several countries. For instance, English is the official language of many nations, including the UK and the USA.
The usage of English differs at the country level, and sign language likewise varies from one country to another. The focus of this survey is on sign language interpretation at the global level. Earlier, to obtain data for SLI, data gloves and accelerometers were used to capture the hand configuration. Orientation and velocity, in addition to location, were measured using trackers and/or data gloves. These methods gave exact positions, but they had the disadvantages of high cost and restricted movement, which altered the signs. These disadvantages brought vision-based systems into the picture and helped them gain popularity. In vision-based systems, the input is a sequence of images captured from a combination of cameras; monocular, stereo, and/or orthogonal cameras are used. Feris and team used external light sources to illuminate the scene and multi-view geometry to construct a depth image.
Advances in hybrid classification architectures considering both hand gesture and face recognition were proposed by Xiaolong Zhu and team. They built the hybrid architecture using an ensemble of connectionist networks (radial basis functions) and inductive decision trees, which combines the merits of holistic template matching with abstractive matching using discrete properties, subject to both positive and negative learning. C. Huang and team investigated expressive body gestures in video sequences beyond facial reactions, and proposed fusing body gestures and facial expressions at the feature level using Canonical Correlation Analysis. An integration of hand gesture and face recognition was proposed by Z. Ren and team, who argued that the face recognition rate could be improved by recognizing hand gestures; they proposed a security-lift scenario. They made it clear that the combination of the two search engines they proposed is generic and is not restricted to face and hand gesture recognition alone. In a sign language, a sign consists of three main parts: manual features, non-manual features, and finger spelling. To interpret the meaning of a sign, all of these parameters must be analyzed simultaneously. Sign language thus poses the important challenge of being multichannel: every channel in the system is built and analyzed separately, and the corresponding outputs are combined at the final level to reach a conclusion. Research in Sign Language Interpretation started with Hand Gesture Recognition. Hand gestures are the form of non-verbal communication most commonly used by hearing-impaired and speech-impaired persons; sometimes hearing people too use sign languages to communicate. Still, sign language is not universal: sign languages exist wherever hearing-impaired people live.
To make communication between them and hearing people simple and effective, it is essential that this process be automated. A number of methodologies have been developed for automating HGR. The overall process of a Hand Gesture Recognition system is shown as a block diagram in figure 2. There are three main steps in HGR: 1. Hand acquisition, which deals with hand extraction from a given static image, and tracking and hand extraction from a video. 2. Feature extraction, which produces a compressed representation of the data that enables recognition of the hand gesture. 3. Classification/recognition of the hand gesture following some rules. Two different datasets are made use of in the ISL recognition system in this survey: ISL digits (0-9) and single-handed ISL alphabets (A-Z). For dataset acquisition, a dark background is preferred, for uniformity and ease of manipulating the images during feature extraction and segmentation. A digital camera, Cyber-shot H70, is used for capturing images. All images are captured with a flashlight in intelligent auto mode and stored in the usual JPEG file format. Each original image is 4608×3456 pixels and requires roughly 5.5 MB of storage space. To create an efficient dataset of reasonable size, the images are cropped to 200×300 RGB pixels, so barely 25 KB of memory is required per image. The dataset is collected from 100 signers, of whom 69 are male and 31 are female, with an average age of 27. The average height of a signer is about 66 inches. The dataset contains isolated ISL numerical signs (0-9); five images per ISL digit sign are captured from each signer, so a total of 5,000 images are available. Sample images from the dataset are shown in figure 3. In the alphabet dataset, a total of 2,600 images cropped to 200×300 RGB pixels are available, collected from four males and six females.
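The three-step HGR process above can be sketched as a simple composition of stages. The sketch below is a hypothetical illustration only: the function names and the toy stand-ins for each stage are assumptions, not the system's actual implementations.

```python
# Minimal sketch of the three-step HGR pipeline: hand acquisition,
# feature extraction, then classification. All stage functions here
# are hypothetical placeholders passed in as callbacks.
def recognize_gesture(image, extract_hand, extract_features, classify):
    hand = extract_hand(image)            # step 1: hand acquisition
    features = extract_features(hand)     # step 2: feature extraction
    return classify(features)             # step 3: classification

# Toy stand-ins for each stage, for illustration only.
label = recognize_gesture(
    "raw-image",
    extract_hand=lambda img: img.upper(),
    extract_features=lambda hand: len(hand),
    classify=lambda f: "sign-A" if f > 0 else "unknown",
)
```

Passing the stages as callbacks mirrors the modularity of the block diagram: each step can be swapped out (e.g. a different segmentation or classifier) without touching the rest of the pipeline.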
The backgrounds of the sign images are dark, as only hand orientations are required for the feature extraction process. The images are stored in JPEG format because it can be easily exported and manipulated in various software and hardware environments. Each preprocessed ISL sign image requires nearly 25 KB of storage at 72 dpi, with an image size of 200×300 pixels. The skin tones in these images are neither very dark nor very light, because the application is proposed with only the Indian subcontinent in consideration. Colors corresponding to human skin are mainly used in capturing the sign images. The sample dataset is shown in figure 4. To detect the hand against the background, segmentation is used.
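Because the background is uniformly dark, the hand can be separated with a simple intensity threshold. The sketch below is a hedged illustration under that assumption; the function name and the mean-plus-standard-deviation threshold rule are inventions for this example, not the paper's method (Otsu or skin-color segmentation would be natural alternatives on real data).

```python
import numpy as np

def segment_hand(gray):
    """Binary mask separating the bright hand from the dark background.

    Uses a crude mean + std threshold as an illustration; for real
    images a learned or Otsu threshold would be more robust.
    """
    threshold = gray.mean() + gray.std()
    return (gray > threshold).astype(np.uint8)

# Toy 4x4 "image": dark background with a bright 2x2 "hand" patch.
img = np.zeros((4, 4))
img[1:3, 1:3] = 200.0
mask = segment_hand(img)
```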
The experimentation in this work is carried out using two datasets conveying hand gestures performed with one hand for alphabets A to Z in Indian Sign Language. The images of this dataset before and after the preprocessing stage are shown in figure 5. Linear Discriminant Analysis (LDA): The Linear Discriminant Analysis (LDA) is used to perform class-specific dimension reduction. It finds the linear combination that best separates the different classes. To achieve class separation, LDA maximizes the between-class scatter while minimizing the within-class scatter, instead of maximizing the overall scatter. As a result, members of the same class group together and members of different classes stay far apart in the lower-dimensional space. Let X be a set of samples drawn from c classes. The between-class and within-class scatter matrices, SB and SW, are calculated from the class means and the overall mean. The rank of SW is at most (N − c), where c is the number of classes and N is the number of samples. Most of the time the number of samples is less than the dimension of the image data in pixels, so Principal Component Analysis (PCA) is first performed on the image data, projecting it onto an (N − c)-dimensional space, and LDA is then performed on this reduced data.
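As a concrete illustration of the scatter computation, the sketch below builds SB and SW from the class means and solves the resulting eigenproblem. This is a minimal hypothetical implementation, not the paper's code; the PCA pre-projection is omitted, and a pseudo-inverse is used since SW (rank at most N − c) can be singular.

```python
import numpy as np

def lda_projection(X, y, n_components=1):
    """LDA: find W maximizing between-class scatter SB relative to
    within-class scatter SW."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    SB = np.zeros((d, d))
    SW = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        SB += len(Xc) * np.outer(mc - overall_mean, mc - overall_mean)
        SW += (Xc - mc).T @ (Xc - mc)
    # Eigenvectors of pinv(SW) @ SB, sorted by decreasing eigenvalue;
    # pinv guards against a singular SW.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(SW) @ SB)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order[:n_components]].real

# Two well-separated 2-D classes, for illustration.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 5.1]])
y = np.array([0, 0, 1, 1])
W = lda_projection(X, y)
proj = X @ W   # class means stay far apart after projection
```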
The transformation matrix W projects the samples into this lower-dimensional space. Ong and team proposed the Local Binary Patterns (LBP). LBP performs local operations on the neighborhood of an image pixel, the neighborhood being the pixels adjacent to a particular pixel. In LBP, an 8-bit binary code is computed for each 3×3-pixel neighborhood of an image I. The Local Binary Pattern (LBP) has proved to be a very efficient means of image representation and has been applied in many kinds of analysis. LBPs are tolerant of monotonic illumination changes and are able to detect various texture primitives such as corners, line ends, spots, and edges. The most popular and efficient version of LBP, i.e. Block LBP (figure 8) with uniform/non-uniform patterns, is used as the first methodology for the extraction of hand features. Feature extraction approaches in image processing acquire the valuable information present in an image by converting a high-dimensional data space into a lower-dimensional one. The lower-dimensional data extracted from the images should contain accurate and precise information representative of the actual image, so that the image can be reconstructed from the lower-dimensional data space. The lower-dimensional data is required as input to any classification methodology, since it is not possible to process high-dimensional data with accuracy and speed. The inputs to an automatic sign language recognition system are either static signs (images) or dynamic signs (video frames). To classify the input signs in an automatic sign language recognition system, valuable features must be acquired from the signs.
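The 3×3 LBP coding can be sketched as follows. This is a minimal illustration: the bit ordering (clockwise from the top-left neighbour, least significant bit first) and the greater-or-equal thresholding convention are assumptions, and the block-histogram step of Block LBP is omitted.

```python
import numpy as np

def lbp_code(patch):
    """8-bit LBP code for the centre pixel of a 3x3 patch: each neighbour
    greater than or equal to the centre contributes one bit, taken
    clockwise from the top-left neighbour, least significant bit first."""
    center = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, n in enumerate(neighbors):
        code |= int(n >= center) << bit
    return code

patch = np.array([[9, 9, 1],
                  [1, 5, 1],
                  [9, 9, 9]])
code = lbp_code(patch)  # bits 0,1,4,5,6 set -> 1 + 2 + 16 + 32 + 64 = 115
```

Because the code depends only on sign comparisons against the centre pixel, adding a constant to every intensity leaves the code unchanged, which is exactly the tolerance to monotonic illumination changes noted above.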
All the algorithms used for facial feature extraction can be used for hand feature extraction as well. Classification is an essential part of machine learning. The technique is used to classify each item in a dataset into one of a predefined set of groups. Classification methods use mathematical models including decision trees, linear programming, neural networks, and statistics for pattern classification. In classification, a software module is created that learns to divide the data items into different groups. Initial experimentation with multiclass SVM and decision trees identified a huge number of misfits during classification, so these classifiers were not used in the final recognition experiments. During SVM classification, if more than one sign returns a positive match for a test image pair, the template matching process is executed. First, the test image pair is checked against all the signs that returned a positive match, to see whether it falls within the range of height-to-width ratios of each sign, defined by min and max. If the range of ratios of a sign does not cover that of the test image pair, the sign is not considered a positive match in the subsequent template matching steps. The cosine distance d_cosine is then calculated between the feature vector f of the test image pair and the average feature vector f_avg of each sign that returned a positive match. An edge template similarity metric s_edge is also calculated: a bitwise operation is performed between the edge template X_test of the test image pair and the edge template X_sign of each sign that returned a positive match.
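The two components of this template-matching step can be sketched as below. The helper names are hypothetical, and the combination into the total metric s_tot per equation (4) is deliberately not reproduced, since its exact form is defined in the paper.

```python
import numpy as np

def cosine_distance(f, f_avg):
    """d_cosine between the test feature vector and a sign's average
    feature vector (0 for identical directions, up to 2 for opposite)."""
    return 1.0 - float(np.dot(f, f_avg) /
                       (np.linalg.norm(f) * np.linalg.norm(f_avg)))

def edge_similarity(edge_test, edge_sign):
    """s_edge: bitwise AND of two binary edge templates (already resized
    to a common size), counting the overlapping white pixels."""
    return int(np.logical_and(edge_test, edge_sign).sum())

f = np.array([1.0, 2.0, 3.0])
d = cosine_distance(f, f)          # identical vectors -> distance ~0
e1 = np.array([[1, 1], [0, 1]])
e2 = np.array([[1, 0], [0, 1]])
s = edge_similarity(e1, e2)        # two overlapping white pixels
```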
The number of white pixels in the resulting image is taken as s_edge. Although the image pairs may be of different sizes, resizing the edge templates to a standard size allows a direct bitwise AND operation to be performed. The total similarity metric s_tot is then defined according to (4). Here α = 0.001 and β = 1.2 were chosen, as they produced optimum results. The sign for which the similarity metric s_tot is maximal is taken as the final output sign. Although 26 classes are present in the ISL single-handed alphabet, the system is able to predict single-handed characters with more than 95% accuracy. This is possible with the LBP feature extraction technique and SVM classification. A sample output is shown for the single-handed ISL sign ‘B’ in figure 9. The input sign image is processed through the system and a prediction is shown on the right-hand side of the output screen; the sign is interpreted as single-handed ‘B’, which is the correct prediction. For sign language interpretation, the N-fold cross-validation method was used with N = 5. For a single hand (left or right), each fold consists of 200 images; the system is trained using 800 images from four of the five folds and tested against the remaining fold of 200 images. For both hands, each fold has 400 images; the system is trained using 1,600 images from four of the five folds and tested against the remaining fold of 400 images. The system was tested under three criteria: sign gestures performed by the left hand, the right hand, and both hands. The accuracy for each criterion is measured as NC / N, where NC is the number of correctly classified sign gestures and N is the total number of test sign gestures.
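The 5-fold protocol and the NC / N accuracy measure can be sketched generically as below. This is a hypothetical illustration: the classifier is passed in as a callback, and the toy run uses a trivial constant predictor rather than the paper's SVM.

```python
def kfold_accuracies(samples, labels, fit_predict, n_folds=5):
    """Split the data into n_folds folds; for each fold, train on the
    remaining folds and test on the held-out one. Per-fold accuracy is
    NC / N: correctly classified test signs over all test signs."""
    fold_size = len(samples) // n_folds
    accuracies = []
    for k in range(n_folds):
        test_idx = set(range(k * fold_size, (k + 1) * fold_size))
        train = [(x, y) for i, (x, y) in enumerate(zip(samples, labels))
                 if i not in test_idx]
        test = [(x, y) for i, (x, y) in enumerate(zip(samples, labels))
                if i in test_idx]
        correct = sum(1 for x, y in test if fit_predict(train, x) == y)
        accuracies.append(correct / len(test))
    return accuracies

# Toy run: 10 samples, all labelled 0, with a constant predictor.
accs = kfold_accuracies(list(range(10)), [0] * 10,
                        fit_predict=lambda train, x: 0)
```

With 1,000 single-hand images this splitting reproduces the 800/200 train/test split per fold described above.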
An overall accuracy of 92.14% was obtained with a relatively small training dataset. The system managed well with the variation of individual signs caused by different users, as well as with the similarity that exists among different signs. A vision-based automatic sign language recognition system able to recognize sentences in Indian Sign Language was presented in this work. Several features and different methods of combining them were investigated and experimentally evaluated. Tracking algorithms with applications to hand and head tracking were presented, and experiments were carried out to determine the parameters of these algorithms. An emphasis was put on appearance-based features that use the images themselves to represent signs; other systems for automatic sign language recognition usually require a segmentation of the input images to calculate features for the segmented image parts. The algorithm was designed to run in real time without requiring excessive computational power. The results reveal that it is possible to train the system to recognize more static Indian Sign Language hand signs while maintaining high accuracy, and it is also feasible to build on the framework to recognize dynamic sign language. Future depth sensor technology with higher depth and color resolution and more accurate skeletal tracking has the potential to further improve the results of the proposed algorithm. The results conveyed in this work show that the use of appearance-based features yields promising recognition performance.