Published: Mar 14, 2019
With the advent of computers and Internet technology, the scope for collecting data and using it for various purposes has exploded. The possibilities are especially alluring when it comes to textual data. Converting the vast amount of text that has accumulated over the course of human history into digital format is vital for preservation, data mining, sentiment analysis and similar tasks, all of which contribute to the advancement of our society. The tool used for this purpose is called OCR.
Like many other languages, Bangla can benefit from OCR technology, all the more so since it is the seventh most-spoken language in the world, with about 300 million speakers. The Bangla-speaking population is mostly found in Bangladesh, the Indian states and territories of West Bengal, Assam, Tripura and the Andaman & Nicobar Islands, and the ever-growing diaspora in the United Kingdom (UK), United States (US), Canada, the Middle East, Australia, Malaysia and elsewhere. Progress in the digital use of the Bangla language is therefore of interest to many countries.
OCR is short for Optical Character Recognition. It is a technology for converting images of printed or handwritten text into a machine-readable, i.e. digital, format. Although OCR systems these days are mostly focused on digitizing text, the earliest ones were analogue. The first OCR device is generally credited to the American inventor Charles R. Carey, whose image transmission system used a mosaic of photocells.
Later inventions focused on scanning documents to produce more copies or to convert them into telegraph code, and digital formats gradually became more popular. In 1966, the IBM Rochester lab developed the IBM 1287, the first scanner that could read handwritten numbers. The first commercial OCR was introduced in 1977 by Caere Corporation. OCR became available online as a service (WebOCR) in 2000, across a variety of platforms through cloud computing.
Based on its method, OCR can be divided into two types: on-line and off-line.
On-line OCR can only process text as it is being written in real time, whereas off-line OCR can process images of both handwritten and printed text and needs no special device.
Most of the successful research in Bangla OCR has so far been done on printed text, although researchers are gradually venturing into handwritten text recognition.
Sanchez and Pal proposed a classic line-based approach for continuous Bangla handwriting recognition based on hidden Markov models and n-gram models. They used both a word-based LM (language model) and a character-based LM in their experiment and found better results with the word-based LM.
Garain, Mioulet, Chaudhuri, Chatelain and Paquet developed a recurrent neural network model for recognizing unconstrained Bangla handwriting at character level. They used a BLSTM-CTC based recognizer on a dataset of 2338 unconstrained Bangla handwritten lines, about 21000 words in total. Instead of horizontal segmentation, they chose vertical segmentation, classifying the words into “semi-ortho syllables”. Their experiment yielded an accuracy of 75.40% without any post-processing.
Hasnat, Chowdhury and Khan developed a Tesseract-based OCR for Bangla script, which they applied to printed documents. They achieved a maximum accuracy of 93% on clean printed documents and a minimum accuracy of 70% on screen-print images. This approach is evidently very sensitive to variations in letter forms and is therefore poorly suited to Bangla handwritten character recognition.
Chowdhury and Rahman proposed an optimal neural network setting for recognizing Bangla handwritten numerals, consisting of two convolution layers with Tanh activation, one hidden layer with Tanh activation and one output layer with softmax activation. For recognizing the 10 Bangla numeric characters, they used a dataset of 70000 samples and reported an error rate of 1.22% to 1.33%.
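To make the two activation functions named above concrete, here is a minimal pure-Python sketch of a Tanh hidden layer followed by a softmax output. It is an illustration only, not the authors' implementation: the function names, weights and scores are hypothetical, and the convolution layers are left out.

```python
import math

def tanh_layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, squashed by tanh."""
    return [
        math.tanh(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical hidden layer with two units over three input features.
hidden = tanh_layer([0.5, -1.0, 2.0],
                    [[0.1, 0.2, 0.3], [-0.4, 0.0, 0.2]],
                    [0.0, 0.1])

# Hypothetical raw output scores, one per Bangla digit 0-9.
scores = [0.2, -1.0, 0.5, 0.0, 1.1, -0.3, 0.4, 3.0, 0.9, -0.6]
probabilities = softmax(scores)

# The predicted numeral is the class with the highest probability.
predicted_digit = max(range(len(probabilities)), key=probabilities.__getitem__)
```

Tanh keeps each hidden activation between -1 and 1, while softmax lets the output layer be read as a probability for each numeral class.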
Purkayastha, Datta and Islam also used a convolutional neural network for Bangla handwritten character recognition. They were the first to work on compound Bangla handwritten characters. Their recognition experiment also included numerals and basic letters of the alphabet. They achieved 98.66% accuracy on numerals and 89.93% accuracy on almost all Bengali characters (80 classes).
Some projects have been developed for Bangla OCR, though it should be noted that none of them works on handwritten text:
Deep CNN stands for Deep Convolutional Neural Network. First, let us try to understand what a convolutional neural network (CNN) is. Neural networks are machine learning tools inspired by the architecture of the human brain. The most basic artificial neuron is called a perceptron, which makes a decision by comparing a weighted sum of its inputs against a threshold value. A neural network consists of interconnected perceptrons whose connectivity differs from one configuration to another. The simplest topology is the feed-forward network consisting of three layers: an input layer, a hidden layer and an output layer.
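The perceptron and the three-layer feed-forward topology described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the weights and thresholds are hand-picked (hypothetical) so that the small network computes XOR, whereas in practice they would be learned from data.

```python
def perceptron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0

def feed_forward(inputs, hidden_layer, output_unit):
    """Input layer -> hidden layer of perceptrons -> single output perceptron."""
    hidden = [perceptron(inputs, w, t) for w, t in hidden_layer]
    out_weights, out_threshold = output_unit
    return perceptron(hidden, out_weights, out_threshold)

# Hypothetical hand-set parameters: the hidden units act as OR and NAND,
# and the output unit ANDs them together, which yields XOR.
XOR_HIDDEN = [
    ([1.0, 1.0], 0.5),     # OR
    ([-1.0, -1.0], -1.5),  # NAND
]
XOR_OUTPUT = ([1.0, 1.0], 1.5)  # AND

results = {
    (a, b): feed_forward([a, b], XOR_HIDDEN, XOR_OUTPUT)
    for a in (0, 1)
    for b in (0, 1)
}
```

A single perceptron cannot compute XOR, so the example also shows why a hidden layer is needed between input and output.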
Deep neural networks have more than one hidden layer. So, a deep CNN is a convolutional neural network with more than one hidden layer.
Now we come to the matter of the convolutional neural network. While neural networks in general are inspired by the human brain, CNNs take this further by also drawing on the visual cortex of animals. Since CNNs are influenced by research in receptive field theory and the neocognitron model, they are better suited to learning multilevel hierarchies of visual features from images than other computer vision techniques. CNNs have achieved significant results in AI and computer vision in recent years.
The main difference between a convolutional neural network and other neural networks is that a neuron in a hidden layer is connected only to a subset of the neurons (perceptrons) in the previous layer. As a result of this sparse connectivity, CNNs are able to learn features implicitly, i.e. they do not need predefined features for training.
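The sparse connectivity described above can be illustrated with a small pure-Python sketch in which each output value depends only on a 2x2 patch of the input rather than on every pixel. The image and kernel values are hypothetical and hand-set for illustration; in a trained CNN the kernel weights would be learned. As in most CNN libraries, the operation shown is technically cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each output sees only one local patch."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            # Only the kh x kw neighbourhood at (r, c) feeds this output unit.
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 4x4 image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, vertical_edge)

# Each output unit uses 4 of the 16 input pixels.
connections_per_unit = len(vertical_edge) * len(vertical_edge[0])
```

The feature map is non-zero only in the column where the edge sits, and the same four weights are reused at every position, which is what keeps the number of parameters small.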
A CNN consists of several layers such as: