The author of this article is Mayur Kanojiya, an Andro-Geek (Android Developer) on our team “iView Crafters”, presenting his R&D on “Machine Learning”.
The first time I heard the words “Machine Learning”, I imagined myself as a teacher with 50-60 machines coming to my class to learn.
Jokes apart, Machine Learning is defined as follows:
Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed – Arthur Samuel (1959).
Machine Learning was never on my mind, but at iView you need to be extraordinarily skillful to push your performance curve ahead of the average programmer's. Challenges drive me! I always wanted to work with algorithms. After a discussion with our CTO, we decided to get hands-on with Machine Learning. So I eagerly started my journey, wondering whether it would be as cool as what I imagined in the very first line.
The biggest question was how to kick-start. I did exactly what all techies love to do: ask “Google Baba”. Google Baba has a lot to say about Machine Learning. Someone said it so truly: THE WORLD IS ROUND, and so is Google. Every piece of Machine Learning theory I read ended at one term, and that was “Gradient Descent”. While searching the Internet, I noticed that everybody referred to Prof. Andrew Ng’s examples for ML. There are several online portals offering courses in Machine Learning, such as Coursera, Udemy, Udacity, Pyimagesearch and Pythonprogramming.net. So I decided to go for the Coursera ML course by Andrew Ng. Alongside it, Pyimagesearch and Pythonprogramming.net improved my knowledge of ML from a coding perspective.
Prof. Andrew Ng introduced the idea of using math (vectorization) in code to optimise it and make algorithms run faster. Wow!! That amazed me. Here is a sample example.
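To make the vectorization idea concrete, here is a minimal sketch in Python with NumPy. It is not taken from the course material: the function names, the random data and the sizes are all made up for illustration. It computes the same linear hypothesis h(x) = θ · x twice, once with explicit loops and once as a single matrix-vector product.

```python
import time

import numpy as np


def predict_loop(X, theta):
    """Compute theta . x for every row of X with explicit Python loops."""
    m, n = X.shape
    predictions = np.zeros(m)
    for i in range(m):
        total = 0.0
        for j in range(n):
            total += X[i, j] * theta[j]
        predictions[i] = total
    return predictions


def predict_vectorized(X, theta):
    """Same computation expressed as one matrix-vector product."""
    return X @ theta


# Hypothetical random data, only used to compare the two versions.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 50))
theta = rng.normal(size=50)

start = time.perf_counter()
slow = predict_loop(X, theta)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = predict_vectorized(X, theta)
vectorized_seconds = time.perf_counter() - start

print("same result:", np.allclose(slow, fast))
print(f"loop: {loop_seconds:.4f}s, vectorized: {vectorized_seconds:.6f}s")
```

Both versions give the same predictions. The exact timings depend on the machine, so none are quoted here, but the vectorized call hands the whole loop to NumPy's compiled routines instead of running it in the Python interpreter.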
After learning the basics of Linear Regression and Logistic Regression, I came across Neural Networks, and Forward Propagation and Backward Propagation confused me a lot. It felt like I had to do FW BW FW FW BW BW BW FW FW FW BW.
I was tired of doing FW and BW, and then my life saviour was YouTube, my entertainment buddy. One day I accidentally searched for how a Neural Network works 😛 and YouTube took me into a deep ocean of amazing Neural Network examples, like a Mario game completed by ML O.o. It also took me back to all the childhood hours I spent on the Mario game, which was completed within minutes in front of my eyes. In future, we the “iView Crafters” are keen to turn this R&D into reality by training a Neural Network to drive a car in GTA V.
These are the links:
So I thought we could train a Neural Network only to play games like this one, which are outdated nowadays. But this video proved me wrong. Its author trained a neural network to drive a car in GTA V, and his material helped me a lot in understanding neural networks.
If you are a beginner like me, you will come across the following words in your ML journey: cost function, Gradient Descent, Linear Regression, Logistic Regression, Regularisation, KNN, Neural Network, training data, testing data, cross-validation, pipeline, loss, epoch and Blah Blah Blah.
At iView Labs, we don’t believe in theories; we believe in results.
Knowledge is of no value unless you put it into practice.
– Anton Chekhov
So to convert my knowledge into a practical solution, I first started learning Python from pythonprogramming.net, and trust me, you will learn the basics of Python by watching just 15-16 videos if you know programming concepts well. After learning Python, I started developing my first ML program by converting my Coursera exercise into Python code. (You can check this blog to see your Coursera basics translated into Python.)
This exercise is about Linear Regression, where I have to predict Boston house prices from the data given to me. I created one method that returns the total value of the cost function.
J(θ) = (1/2m) * Σ (h(xi) – yi)^2, summed over the m training examples
I also wrote a Gradient Descent algorithm, which reduces my cost function.
θj = θj – α * (∂J(θ)/∂θj)
where α is the learning rate.
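The cost function and the update rule above can be sketched in a few lines of Python with NumPy. This is not my Coursera exercise code. It is a minimal illustration that assumes the course-style squared-error cost with a 1/(2m) factor, and the names compute_cost and gradient_descent and the toy data are made up for this example.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Squared-error cost J(theta) with the 1/(2m) factor (an assumption)."""
    m = len(y)
    errors = X @ theta - y
    return float(errors @ errors) / (2 * m)


def gradient_descent(X, y, theta, alpha, iterations):
    """Repeat theta_j = theta_j - alpha * dJ/dtheta_j for all j at once."""
    m = len(y)
    history = []
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m  # partial derivatives of J
        theta = theta - alpha * gradient      # simultaneous update
        history.append(compute_cost(X, y, theta))
    return theta, history


# Toy data generated from y = 1 + 2x, only for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept term

theta, history = gradient_descent(X, y, np.zeros(2), alpha=0.1, iterations=2000)
print("theta:", theta)
print("first cost:", history[0], "last cost:", history[-1])
```

On this toy data the learned theta should end up close to the intercept 1 and slope 2 used to generate it, and the cost history should keep shrinking. If the cost grows instead, the learning rate α is usually too high.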
In Linear Regression, you have to plot your data first to see what it actually looks like. From the curve you can decide whether you are suffering from high bias (underfitting) or high variance (overfitting), and accordingly whether your model needs more training data, a higher or lower learning rate, and so on. You can find my code for practice here.
This way I created a basic Linear Regression Machine Learning program; I had also written the same code in Octave. The doubt that may strike you is: why did I choose Python? One of my teammates at iView pointed out that Python has some awesome libraries for machine learning, where you don’t have to worry about these basics. You just have to remember the name of the algorithm you want to use in your program. Some of the best-known ML libraries are TensorFlow, sklearn and Keras. I coded one program in Python to predict stock prices using the sklearn library. The example is here.
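To give a feel for how little code a library needs, here is a minimal sklearn sketch. It is not my stock price program: the price series is synthetic (a made-up upward trend plus random noise), the single "day number" feature is an assumption, and real stock data would need far more care.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic "prices": a made-up linear trend plus noise, NOT real market data.
rng = np.random.default_rng(42)
days = np.arange(200)
prices = 100.0 + 0.5 * days + rng.normal(scale=2.0, size=days.size)

X = days.reshape(-1, 1)  # sklearn expects a 2-D feature matrix

# shuffle=False keeps time order: train on the first 80%, test on the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X, prices, test_size=0.2, shuffle=False
)

model = LinearRegression()
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)              # R^2 on the held-out days
next_day = model.predict(np.array([[200]]))[0]

print("slope per day:", model.coef_[0])
print("R^2 on test data:", r2)
print("predicted price for day 200:", next_day)
```

The cost function and Gradient Descent are hidden inside fit, which is exactly the "just remember the algorithm's name" convenience described above.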
Meanwhile, our CTO was super enthusiastic about turning my learning into real-world execution. So we decided to implement the Pipeline concept from my ML theory in one of our live projects. We buckled up our shoes and started R&D on “Bill Scanner”, an OCR that recognizes characters from bill images.
The irony is that during ML training, handwritten digit recognition with the MNIST dataset is said to be the “Hello World” program of Machine Learning. So we started the task super excitedly, but as the days went by I realised my assumption was wrong. Before making predictions, we have to do lots of other tasks, such as image processing.
Pipeline:
Object Detection
Text Extraction
Character Segmentation
Character Prediction
1. Object Detection
To detect objects successfully, we used the TensorFlow Object Detection API. To use this API, we have to build a model according to our number of classes. For object detection, we collected batches of images to train our model. After days of training, we got 99% results.
2. Text Extraction
To extract text from the bill images, we used the computer vision API of OpenCV: we processed each image with lots of filters and extracted the text.
3. Character Segmentation
After extracting text data we got this output for each text area.
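One simple way to split a text area into characters is a vertical projection: scan the columns of a binarised image and cut wherever a column has no ink. The sketch below is only an illustration of that classic idea and not necessarily what Bill Scanner does. The function name segment_characters and the tiny hand-made image are hypothetical, and it assumes the characters do not touch each other.

```python
import numpy as np


def segment_characters(binary_line):
    """Return (start, end) column ranges, end exclusive, one per character.

    binary_line is a 2-D boolean array where True marks an ink pixel.
    """
    has_ink = binary_line.any(axis=0)  # True for columns containing ink
    segments = []
    start = None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                     # a character begins here
        elif not ink and start is not None:
            segments.append((start, col))   # blank column ends the character
            start = None
    if start is not None:                   # character touching the right edge
        segments.append((start, len(has_ink)))
    return segments


# Tiny hand-made "text area" with three blobs standing in for characters.
line = np.array(
    [
        [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
        [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
        [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    ],
    dtype=bool,
)

segments = segment_characters(line)
crops = [line[:, start:end] for start, end in segments]

print(segments)
print([crop.shape for crop in crops])
```

Each crop can then be resized to a fixed size and passed on to the prediction step. Touching or overlapping characters need something smarter, such as contour analysis.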
4. Character Prediction
After all the steps above, you will have separate characters; now you have to apply a Neural Network to predict the character in each image.
This is the 83rd day of our 120-day Machine Learning project, and the details of how we implemented the Pipeline in our project “Bill Scanner” will be covered in Volume II.