With the active development of e-commerce, the number of online credit card transactions is growing rapidly, and financial companies that receive most of their payments by card face many risks. Financial fraud is a growing problem with major consequences for online payment systems. Although fraudulent credit card transactions make up only about 0.1% of the total, they still cost a great deal of money, potentially billions. Many techniques have been developed to limit credit card fraud. E-commerce increases sales exposure, but it also exposes cardholders to criminals, who use methods such as Trojans and phishing to steal other people's credit card information. With AI we can validate a transaction against many learned parameters rather than a handful of simple rules, so effective credit card fraud detection becomes possible: machine learning and deep learning techniques can learn fraud-behavior features from historical transaction data containing both normal and fraudulent transactions, and detect fraud in time.
In today's growing world of e-commerce, many people pay their bills online by credit card, and companies invest heavily in preventing the risks that arise from credit card fraud carried out by hackers. With just the 16-digit card number and the expiration date, an attacker could use a stolen card to pay bills and make purchases from one place while the card is issued somewhere else entirely. Detecting such fraud is not easy, because fraudulent transactions are rare, around 0.1% of the total, which sounds small but can cost some companies billions, a very large amount, and so companies invest a lot in prevention. To prevent and detect such fraud we need a safe and secure system, and for such a system to exist we can use AI, which lets the system learn from historical datasets and their patterns using the various machine learning and deep learning algorithms in existence.
Our system will consist of several phases. The first phase is data preprocessing, where we build a proper dataset containing as few errors as possible. Here we use the NumPy library, which provides the mathematical tools needed to perform calculations, and the Pandas library, which is widely used because of its features for working with datasets: for example, it makes importing and managing datasets simple. This entire phase deals with the dataset alone, normalizing it so that our algorithms can later produce correct results.
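As a minimal sketch of this preprocessing phase, the snippet below builds a tiny in-memory table with Pandas, drops rows with missing values, and normalizes the amount column with NumPy-backed arithmetic. The column names `Amount` and `Class` are illustrative stand-ins for whatever the real transaction dataset uses, not names taken from this essay.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for real transaction records;
# 'Amount' and 'Class' (1 = fraud) are assumed column names.
df = pd.DataFrame({
    "Amount": [10.0, 250.0, np.nan, 99.9],
    "Class":  [0, 0, 0, 1],
})

# Drop rows with missing values, then normalize 'Amount' to zero mean
# and unit variance so it sits on a comparable scale to other features.
df = df.dropna()
df["Amount"] = (df["Amount"] - df["Amount"].mean()) / df["Amount"].std()
print(df)
```

The same two calls, `dropna` and a mean/standard-deviation rescale, apply unchanged to a full CSV loaded with `pd.read_csv`.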
The next phase splits the dataset into three parts: a training dataset, a validation dataset, and a testing dataset. The training dataset is used first, while the machine is trained until it learns to predict the correct values; the validation dataset is used to tune the model during training; and the testing dataset is used to evaluate the final model once training is over, after which the system is ready to perform the actual tests. It is important that the class distribution is preserved across all three sets: for example, if 0.1% of the transactions in the entire dataset are fraudulent, each of the three sets should contain roughly that same proportion of fraud. The training dataset is always larger than the validation and testing datasets.
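The three-way split with a preserved fraud ratio can be sketched with scikit-learn's `train_test_split`, whose `stratify` argument keeps the class proportions equal in every part. The 1% fraud rate and the 60/20/20 split sizes below are illustrative choices, not figures from the essay.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy labels: 10 "fraud" transactions (class 1) among 1000.
X = np.arange(1000).reshape(-1, 1)
y = np.array([1] * 10 + [0] * 990)

# Carve out a 20% test set, then split the rest into train (60%)
# and validation (20%); stratify keeps the fraud ratio identical.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, stratify=y_tmp, random_state=0)

print(y_train.mean(), y_val.mean(), y_test.mean())  # each is 0.01
```

Without `stratify`, a rare class like fraud could by chance end up entirely inside one split, which is exactly what the paragraph above warns against.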
The next phase deals with deep learning and neural networks. We use the supervised model of deep learning, training the machine so that it accepts input and performs a specific calculation that gives the desired output; such a model is called a supervised model. The main step here is feature extraction, which means we fetch only the important data that will be used in the calculation. Suppose there are two features, x1 and x2; we design the system so that it produces an output from these features. The perceptron is essentially a formula or equation that gives us an output, so we need to find solutions for such equations. One equation is not enough, so we use more than one perceptron: the output of one or more perceptrons can be the input to another, giving many such equations feeding into one another. The features are numerical and are multiplied by weights, say w1 and w2 corresponding to x1 and x2. To find the correct solution we need optimal values for all such weights across the entire dataset, and this is exactly why a neural network is used.
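A single perceptron of the kind just described, two features x1 and x2 multiplied by weights w1 and w2 and passed through a step activation, can be written in a few lines. The particular numbers are made up for illustration; in a trained network they would be found by the optimization the text goes on to describe.

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of inputs plus a bias, followed by a step activation."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Two features x1, x2 with weights w1, w2 (values chosen for illustration).
x = np.array([0.5, 0.8])
w = np.array([1.0, -2.0])
b = 0.2
print(perceptron(x, w, b))  # 0.5 - 1.6 + 0.2 = -0.9, so output is 0
```

Stacking such units, feeding the outputs of one layer in as the inputs of the next, gives the network of interlinked equations the paragraph describes.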
Now we need to compute an output for this entire network for a given input; passing the input forward through the layers toward the desired solution is called feedforward. Training then uses backpropagation, which works backward from the error, the deviation from the desired output, to find which weights caused it and adjust them. Waiting for the machine to train on the full dataset is a long, slow process, so we train for a fixed number of epochs. This matters because the data must be trained just enough, neither too little nor too much: too little training causes wrong predictions because learning is incomplete, while too much makes the model complex and unable to predict new data correctly, and once overtraining has happened it cannot be undone, making it difficult to retrain the model properly. To prevent this, random units are dropped during training with a probability between 0 and 1 so that the network trains in a proper manner; this is called the dropout layer method. The question arising now is, "how are these perceptrons made?" The answer is simple: they use activation functions, for example a step function that takes either 0 or 1 as its value. Similar functions exist: the sigmoid function, which squashes its input into values between 0 and 1, and the ReLU function, which outputs zero for negative inputs and the input itself otherwise. The property of these functions is that their change is not drastic but gradual, and we use such functions to find the solution.
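The activation functions and the dropout idea can be made concrete in a few lines of NumPy. The dropout shown is the common "inverted" variant, which rescales the surviving activations so their expected value is unchanged; that rescaling detail is an assumption of this sketch, not something the essay specifies.

```python
import numpy as np

def sigmoid(z):
    # Smoothly maps any real input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Zero for negative inputs, identity for positive ones.
    return np.maximum(0.0, z)

def dropout(a, rate, rng):
    # Randomly zero a fraction `rate` of activations during training,
    # scaling survivors by 1/(1-rate) to keep the expected value.
    mask = rng.random(a.shape) >= rate
    return a * mask / (1.0 - rate)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # gradual values between 0 and 1
print(relu(z))      # [0. 0. 2.]
h = dropout(np.ones(10), rate=0.5, rng=np.random.default_rng(0))
print(h)            # each entry is either 0.0 or 2.0
```

At test time dropout is switched off; only training uses the random mask, which is what discourages the overfitting described above.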
Decision Tree is a supervised machine learning algorithm used for both classification and regression. It works with both categorical and continuous variables. It uses a tree-like graph or model of decisions to predict the output; the model behaves like a chain of "if this then that" conditions that finally gives a particular result. Splitting is the process of dividing a node into sub-nodes. A branch is a subsection of the entire tree. A parent node is one that gets divided into sub-nodes, and those sub-nodes are called the children of that parent. The root node represents the entire sample and is the first node to be split. Leaves are the terminal nodes that are not split further; these nodes determine the outcome of the model. Nodes of the tree split on a value of a certain attribute, and edges indicate the outcome of a split leading to the next node. Tree depth is an important concept: it indicates how many questions are asked before the final prediction is made.
The entropy of every attribute is calculated from the dataset at hand. Entropy controls how the data is split up in a decision tree and how its boundaries are drawn. Information gain denotes how much information a feature gives about the class, and it needs to be maximized. The dataset is divided into subsets using the attribute for which the entropy is minimum, equivalently, for which the gain is maximum; this determines the attribute that best classifies the training data, which becomes the root of the tree. This process is repeated at each branch. Decision trees perform well on large datasets and are extremely fast, but they are prone to overfitting, especially when a tree is particularly deep; pruning can be used to avoid overfitting. The counts of true negatives, false positives, false negatives, and true positives in the confusion matrix were 284,292, 23, 37, and 445 respectively.
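The entropy and information-gain calculation driving these splits can be worked through on a toy node. The tiny label arrays below are invented purely for illustration; for the fraud data itself the same formula would run over the `Class` column of each candidate split.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Parent node: 4 normal (0) and 4 fraud (1) transactions.
parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left   = np.array([0, 0, 0, 1])   # one side of a candidate split
right  = np.array([0, 1, 1, 1])   # the other side

# Information gain = parent entropy minus the size-weighted
# average entropy of the children; bigger is a better split.
gain = entropy(parent) \
    - (len(left) / len(parent)) * entropy(left) \
    - (len(right) / len(parent)) * entropy(right)
print(round(gain, 3))  # 0.189
```

The tree-building loop simply evaluates this gain for every candidate attribute and threshold and keeps the maximum, then recurses on each branch.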
Random Forest is a supervised machine learning algorithm used for both classification and regression purposes. It is flexible, easy to use, and provides high accuracy. It is a collection of decision tree classifiers in which each tree is trained independently of the others, and it has almost the same parameters as a bagging classifier or a decision tree. While growing the nodes of each tree, additional randomness is added to the model: the best feature among a random subset of features is selected for splitting a node. This selection generates wide diversity, thereby building a better model.
Initially, a set of N random data points is selected from the training set, and a decision tree is built on those N points. The number of trees in the forest is decided in advance, and the above steps are repeated until that number of trees has been built. For a new data point, each tree predicts the category to which the point belongs, and the new point is finally assigned to the category with the majority vote. So basically we create one tree, then another, and then another, each built from a randomly selected subset of the training dataset. Even though no single one of these trees may be ideal, on average the ensemble can perform very well.
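The build-many-trees-then-vote procedure is exactly what scikit-learn's `RandomForestClassifier` implements, so the steps above can be sketched as follows. The two-feature synthetic data stands in for real transaction records, and `n_estimators=50` is an arbitrary illustrative choice for the number of trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy two-feature dataset standing in for transaction records;
# the label is 1 when the features sum to a positive value.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# n_estimators sets how many bootstrapped trees vote on each point;
# each tree also considers only a random subset of features per split.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([[2.0, 2.0], [-2.0, -2.0]]))  # majority vote: [1 0]
```

Increasing `n_estimators` generally improves accuracy at the cost of slower prediction, which is the trade-off discussed in the next paragraph.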
One of the advantages of Random Forest is that it can be used for both regression and classification. It is considered a very handy and easy-to-use algorithm because its default hyperparameters often produce good prediction results; the number of hyperparameters is also not high, and they are easy to understand. Overfitting is one of the big problems in machine learning, but generally it does not happen easily to a random forest classifier: if there are enough trees in the forest, the classifier will not overfit the model. The main limitation of Random Forest is that a large number of trees can make the algorithm slow and ineffective for real-time predictions. The algorithm generally trains quickly but is quite slow to produce predictions once trained, and a more accurate prediction requires more trees, which results in a slower model. So Random Forest lags behind in applications where run-time performance is a key element.
Random Forest provides higher accuracy in detecting credit card fraud. Decision trees are convenient and easily implemented, but they lack accuracy: overfitting takes place when we use decision trees, especially when the tree is particularly deep, and an overfit tree is not flexible enough to classify new samples. Pruning can be used to mitigate this, but Random Forest prevents overfitting directly by creating random subsets of the features, building smaller trees from these subsets, and then combining the subtrees. Thus Random Forest performs better than decision trees on the available dataset.