About this sample
Words: 1467 |
Pages: 3|
8 min read
Published: Apr 11, 2019
This paper shows how sales opportunities can be forecast using a data mining technique. For any organization that relies on Customer Relationship Management (CRM), it is important to analyze customer behavior towards the product, since that behavior creates the sales opportunity. The target variable, sales opportunity status, is forecast using the C5.0 algorithm. Many independent variables serve as rule conditions at each segment of the decision tree. An accurate prediction helps sellers plan their sales strategy according to customer behavior.
In real life, business organizations pay close attention to Customer Relationship Management (CRM) in order to create opportunities in their markets and stay ahead of their competitors. Data mining techniques are used to predict outcomes, and with these predictions an organization can plan its strategy for a particular product. The main task in sales is to present the product to a client and convert the contact into a lead; offering the right product to the client creates the sales opportunity. Multiple sellers work together on different products, so each seller must know the particular product and its opportunities. Analyzing the strength of competitors and planning marketing and offers wisely gives the organization a competitive advantage. Through marketing campaigns, marketers encourage customers to buy different products according to their account history and predict whether a customer is ready to allocate a budget for a particular product. After analyzing customer behavior based on budget allocation, the organization can consider cross-selling and up-selling the product. Products are sold according to customer needs, which the organization gathers through marketing campaigns. Attention to customer behavior helps the organization plan its sales and marketing strategy more effectively. The performance and coordination of the sales team determine whether the opportunity status is a win or a loss. Many factors can influence the opportunity status, such as client, competitors, cross-sale, up-sale, product and seller. Eventually, the targeted opportunity status can be predicted as either Won or Lost.
One research work implemented several machine learning predictive models, namely Random forest, Decision tree, Naive Bayes, Support vector machine (SVM) and Artificial neural network (ANN), to forecast the status of new sales opportunities and the associated error rate. The performance of each model is compared using Classification Accuracy (CA) and Area Under the Curve (AUC). The research showed how the performance of the models is affected by the quality and quantity of the data. The accuracy of Random forest (77.6%) is higher than that of the other models. However, the accuracy of C5.0 is not evaluated.
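As a rough illustration of the two comparison measures named above, the sketch below computes CA and AUC in plain Python. The labels and scores are invented toy values, not results from the cited research, and the function names are hypothetical.

```python
def classification_accuracy(y_true, y_pred):
    """Share of cases whose predicted label equals the actual label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores, positive="Won"):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption): actual status and a model's predicted probability of "Won".
actual = ["Won", "Lost", "Won", "Lost", "Won", "Lost"]
scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]
predicted = ["Won" if s >= 0.5 else "Lost" for s in scores]

ca = classification_accuracy(actual, predicted)
area = auc(actual, scores)
```

CA depends on the chosen cut-off (0.5 here), whereas AUC summarizes how well the scores rank won cases above lost cases across all cut-offs, which is why the two are usually reported together.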
K-means is used to cluster the data, and Random forest is used to reduce the dimensions and select the important attributes. Finally, the C5.0 algorithm is used as the main classifier to predict customer churn for a period of two to three months.
A decision tree is an important classification scheme. Classification is a supervised data mining operation in which similar data items are grouped together, and the tree splits the dataset into segments. The C5.0 algorithm is used for its low memory usage, higher accuracy and increased speed with small decision trees. The accuracy of the improved C5.0 is much better than that of the traditional C5.0 [5].
Here a machine learning algorithm is used to classify and predict stock manipulation. The accuracy of C5.0 is very close to that of the other models [6].
The performance of CART and C5.0 is measured using sampling techniques. CART uses the Gini index to construct trees, whereas C5.0 uses information gain. The accuracy of C5.0 is greater than that of CART.
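To make the two splitting criteria concrete, the following sketch scores one candidate split with both the Gini index and entropy-based information gain. The counts are made up for illustration, and the helper names are hypothetical rather than taken from any CART or C5.0 implementation.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


# Toy split (assumption): 10 opportunities divided into two segments.
parent = ["Won"] * 5 + ["Lost"] * 5
left = ["Won"] * 4 + ["Lost"] * 1
right = ["Won"] * 1 + ["Lost"] * 4

gini_gain = split_gain(parent, [left, right], gini)     # Gini-based reduction
info_gain = split_gain(parent, [left, right], entropy)  # information gain
```

Both measures reward splits that produce purer segments; they differ only in how impurity is scored, so they often, but not always, rank candidate splits in the same order.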
The methodology explains how the dataset was extracted from the source, the procedure used to clean and transform it, and the techniques used to forecast the opportunity status.
A. Data Acquisition
The dataset is downloaded from the Salvirt website as raw data in comma-separated values (.csv) format. It contains 448 cases with 23 attributes; 51 percent of the cases are won and 49 percent are lost.
B. Data preprocessing
Data preprocessing, also known as the cleaning and transformation phase, is an important step in which the raw data collected from the source is processed to suit the implementation. Using the R programming language, duplicate values, unwanted special characters and noisy data are cleaned. Missing values are generated based on the other attributes in the dataset. The attributes are encoded for easy understanding.
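The cleaning steps described above can be sketched as follows. This is a minimal Python illustration rather than the R code used in the paper: the column names other than Status are illustrative, and filling a missing value with the most frequent value of its column is an assumption, since the text does not say how missing values were generated.

```python
import re
from collections import Counter


def clean_text(value):
    """Drop characters other than letters, digits, spaces and hyphens,
    then trim surrounding whitespace."""
    return re.sub(r"[^A-Za-z0-9 \-]", "", value).strip()


def preprocess(rows):
    """Clean every field, remove exact duplicate rows, and fill missing values."""
    # 1. Remove unwanted special characters and stray whitespace.
    cleaned = [{k: clean_text(v) for k, v in row.items()} for row in rows]

    # 2. Drop exact duplicates while keeping the original order.
    seen, unique = set(), []
    for row in cleaned:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # 3. Fill empty fields with the column's most frequent value
    #    (assumption: the paper's own imputation rule is not specified).
    columns = list(unique[0].keys()) if unique else []
    for col in columns:
        present = [r[col] for r in unique if r[col] != ""]
        if not present:
            continue
        mode = Counter(present).most_common(1)[0][0]
        for r in unique:
            if r[col] == "":
                r[col] = mode
    return unique


# Toy rows (assumption), standing in for records read from the .csv file.
raw = [
    {"Client": "Current", "Budget": "Yes", "Status": "Won"},
    {"Client": "Current", "Budget": "Yes", "Status": "Won"},   # exact duplicate
    {"Client": "Past##", "Budget": "", "Status": "Lost"},      # noise and a missing value
    {"Client": "New", "Budget": "Yes ", "Status": "Lost"},     # trailing space
]
clean_rows = preprocess(raw)
```

Cleaning before de-duplication matters here: rows that differ only by stray characters or whitespace become identical after cleaning and are then removed together.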
C. Dataset
The attributes comprise many independent variables and one dependent variable.
Target variable:
The target variable is ‘Status’, which records whether the opportunity was won or lost.
Attribute Description
Status Outcome of sales opportunity
Predictors:
The dataset contains 22 predictor variables. These predictors influence the dependent variable, and using them the outcome of the dependent variable can be predicted as either won or lost.
D. Technique:
The problem is modelled as a classification task. Classification offers various techniques; here a decision tree built with the C5.0 implementation is used to predict the dependent variable ‘Status’. A decision tree consists of a sequence of decision conditions, where each node of the tree holds a condition used for the classification. The most decisive variable is placed at the root node of the tree. The C5.0 algorithm has become an important implementation method for classification problems in industry.
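The idea of a tree as a sequence of decision conditions can be sketched with a small hand-written structure. The tree below is hypothetical: only the placement of Client at the root follows the text, while the branches, the Budget attribute and the leaf labels are invented for illustration and are not the tree fitted by C5.0.

```python
# Hypothetical tree: internal nodes test one attribute, leaves hold a predicted Status.
TREE = {
    "attribute": "Client",
    "branches": {
        "Current": {"leaf": "Won"},
        "New": {
            "attribute": "Budget",
            "branches": {"Yes": {"leaf": "Won"}, "No": {"leaf": "Lost"}},
        },
        "Past": {"leaf": "Lost"},
    },
}


def predict(node, case, default="Lost"):
    """Follow the conditions from the root until a leaf is reached.
    Unknown attribute values fall back to the default label."""
    while "leaf" not in node:
        child = node["branches"].get(case.get(node["attribute"]))
        if child is None:
            return default
        node = child
    return node["leaf"]


def rules(node, path=()):
    """Flatten the tree into one if-then rule per leaf."""
    if "leaf" in node:
        condition = " and ".join(f"{a} = {v}" for a, v in path) or "always"
        return [f"if {condition} then Status = {node['leaf']}"]
    collected = []
    for value, child in node["branches"].items():
        collected.extend(rules(child, path + ((node["attribute"], value),)))
    return collected


prediction = predict(TREE, {"Client": "New", "Budget": "No"})
rule_set = rules(TREE)
```

Each path from the root to a leaf reads as one rule, which is what makes a small tree easy for a sales team to interpret.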
Tools used:
Rapid Miner
R Studio
In the decision tree below, red indicates the chance of the opportunity being won and blue indicates the chance of the opportunity being lost across different client segments. Client is an important factor and is placed at the root node for decision making.
Focusing on current clients gives the organization a better chance of winning opportunities than focusing on new or past clients. If the organization focuses on past clients, its chances of losing the opportunity are higher.
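A comparison of this kind amounts to computing the share of won opportunities within each client segment. The sketch below does this on invented rows; the counts are toy values chosen only to mirror the direction of the finding, not figures from the dataset.

```python
from collections import defaultdict


def win_rate_by(rows, attribute, target="Status", positive="Won"):
    """Return the share of positive outcomes for each value of an attribute."""
    counts = defaultdict(lambda: [0, 0])  # value -> [won, total]
    for row in rows:
        tally = counts[row[attribute]]
        tally[1] += 1
        if row[target] == positive:
            tally[0] += 1
    return {value: won / total for value, (won, total) in counts.items()}


# Toy rows (assumption): (client segment, opportunity status).
pairs = (
    [("Current", "Won")] * 3 + [("Current", "Lost")]
    + [("New", "Won"), ("New", "Lost")]
    + [("Past", "Won")] + [("Past", "Lost")] * 3
)
rows = [{"Client": c, "Status": s} for c, s in pairs]

rates = win_rate_by(rows, "Client")
```

The same function can be applied to any other categorical predictor, such as client growth or budget allocation, to see how the won share varies across its values.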
Several factors can cause the opportunity to be won or lost. The seller and company authority should look closely at where they can create more opportunities. The picture below indicates the factors the organization must focus on before it executes its process.
Client is one of the most important factors. The organization should turn opportunities into successes by focusing on current clients, the growth of the client, and whether the client has requested information or a proposal, and it should create more opportunities for clients to buy the product.
The graph below describes how the growth of the client influences the dependent variable Status.
The size of the company matters, because organization size affects whether the client trusts the product. The bar chart shows that the probability of a successful opportunity is higher with a big organization.
Using marketing campaigns, marketers can analyze customer behavior and explain the strategy to clients, encouraging them to allocate a budget for the product. In this graph, the chance of the budget being allocated or not is similar for both won and lost opportunities. However, when the organization is not sure about the budget allocation, the chance of losing is higher.
Less attention should be given to the purchasing department than to the other factors.
Planning the strategy around the important factors can create more opportunities and lead to a won opportunity status.
Using the important factors and analyzing both the customer's relationship with the organization and the customer's behavior towards the product helps the organization make the opportunity a success. The picture below shows that focusing on the most important factors from the previous picture makes an opportunity more likely to be won (71%) than lost (29%).
The C5.0 algorithm produced an accuracy of 79%. It works with low memory usage, high accuracy and small decision trees, which helps the organization make decisions faster. The confusion matrix yields the values of Kappa, Sensitivity and Specificity. Sensitivity measures the proportion of actual positive cases that the algorithm predicts correctly. Specificity measures the proportion of actual negative cases that it predicts correctly; the remaining negative cases are the ones wrongly classified. Kappa measures how far the agreement between predicted and actual status exceeds the agreement expected by chance.
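The measures derived from a confusion matrix can be worked through as below. The counts are hypothetical and do not reproduce the paper's confusion matrix, and treating Won as the positive class is an assumption.

```python
def confusion_counts(actual, predicted, positive="Won"):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p == positive:
            fp += 1
        elif a != positive and p != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from the four counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # actual positives predicted correctly
    specificity = tn / (tn + fp)  # actual negatives predicted correctly
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts (assumption): 100 test cases with "Won" as the positive class.
result = metrics(tp=40, fp=15, tn=35, fn=10)
```

With these toy counts the accuracy is 0.75 while kappa is 0.5, which shows why kappa is reported alongside accuracy: on a nearly balanced dataset, about half of the agreement would be expected by chance alone.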
Different sellers working together on various products must have proper knowledge of the products they handle; based on that knowledge they can plan more strategies to gain a competitive advantage. They then know where to cross-sell and up-sell the product. The sales team must analyze the strength of competitors and the client's behavior towards the product. Client is the most important factor for the success of a sales opportunity, and the details of the different client factors that influence the opportunity status have been presented. Thus, using a data mining technique to forecast future sales opportunities helps to focus on the important factors that make a sales opportunity successful.