Published: May 7, 2019
The aim of this project is to determine how the independent variables influence the dependent variable. A restaurant’s rating depends on many attributes, such as food quality, price, ambience, and whether the restaurant offers online delivery or table booking. All of these factors affect business profit, because customers weigh them when choosing where to dine. Customer Relationship Management therefore plays an important role in improving the profit of any organization.
Keywords—CRM, Hypothesis, Sentiment Analysis, Support Vector Machine
Customer Relationship Management (CRM) plays a vital role in an organization. To succeed in business, an organization needs a good relationship with its customers. It also needs loyal, long-term customers so that its business value and profit increase.
The main purpose of CRM is to gather all necessary customer-related data and analyze it using data analytics and machine learning techniques. Because the data is customer related, the outcome of the analysis has several uses: it can improve product quality, and it helps to manage customer data, customer interactions and customer accounts, to find new customers and to retain existing ones. The analysis also reveals what customers expect, so the organization can improve the quality of its products. This in turn increases customer satisfaction and the organization’s profit. CRM can thus be used to improve both the value of the customer and the relationship with the customer.
With the advent of the internet, e-commerce, social media and restaurant searches have increased tremendously. Online reviews of products, places and restaurants have a great impact on business profit, because customers look at reviews and ratings online before buying a product or dining in a restaurant. Customer ratings therefore play an important role in business profit.
Online customer reviews and ratings can help a restaurant improve its quality and standards, and thus its profit. The rating matters to online users because it gives an overall score that reflects multiple factors, such as food quality, ambience, price range, table booking, online delivery, cuisines and location.
There are multiple restaurant search websites from which data can be taken to predict customer ratings. I chose “Zomato”, which is one of the most popular restaurant search websites. These ratings are useful for users who log in to the Zomato site to look for the best restaurants in a city. The customer rating can be classified from several other parameters in the dataset, which are explained in the next section. In this project, the restaurant rating given by customers is classified, so that customer ratings and business profitability can be predicted.
From the Zomato dataset, the following hypothesis can be formed:
The dataset contains multiple attributes, and the hypothesis is that the independent attributes influence the dependent variable, which is the restaurant rating. To be precise, the restaurant’s ‘Location’, ‘Cuisine’, ‘Cost’, ‘Has table booking’ and ‘Has online delivery’ attributes have an effect on the ‘Rating text’.
The dataset has the following attributes: Restaurant Name, Restaurant ID, City, Address, Cuisines, Cost for two people, Has table booking, Has online delivery, Is delivering now, Switch to order menu, Price range, Aggregate rating, Rating color, Rating text and Votes.
The restaurant name identifies each restaurant, and the restaurant ID is unique to each restaurant. The city is used to list all the restaurants in a city, and the address locates the restaurant within an area. The cuisines attribute lists what is served in the restaurant, and the cost for two gives the total cost of a meal for two people.
The other attributes in the dataset describe different features of the restaurant, such as whether it offers table booking or online delivery. All these attributes are expected to correlate highly with the dependent variable, which is the rating. If a restaurant has all the features and the quality of the food is very good, the rating of the restaurant is likely to be high. Alternatively, a restaurant may lack some of the features but offer good food at a low overall price, so customers may still prefer it and rate it highly. All these attributes together therefore determine the rating that customers give the restaurant. The rating helps other users who log in to the Zomato website, so the better the rating of the restaurant, the better its business profit.
The aggregate rating is a numeric value on a scale of one to five, one being the lowest and five the highest. The rating text encodes it as excellent, very good, good or okay. For example, a restaurant with an aggregate rating of 4.8 is encoded as excellent in the rating text. The rating text is the dependent variable because it is categorical, whereas the aggregate rating is a continuous numeric value.
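The encoding from aggregate rating to rating text can be sketched as a simple threshold function. This is a minimal Python sketch, not the dataset’s actual rule: the function name `rating_text` and the cut-off values are assumptions made for illustration, since the text only states that 4.8 maps to excellent.

```python
def rating_text(aggregate_rating: float) -> str:
    """Map a 1-5 aggregate rating to a rating-text label.

    The cut-offs are illustrative assumptions; the essay only states
    that an aggregate rating of 4.8 is encoded as "excellent".
    """
    if not 1.0 <= aggregate_rating <= 5.0:
        raise ValueError("aggregate rating must be between 1 and 5")
    if aggregate_rating >= 4.5:  # assumed threshold
        return "excellent"
    if aggregate_rating >= 4.0:  # assumed threshold
        return "very good"
    if aggregate_rating >= 3.5:  # assumed threshold
        return "good"
    return "okay"
```

The point of the sketch is the distinction drawn in the paragraph: the input is continuous, while the output is one of a small set of categories that a classifier can predict.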
There are various methods for finding restaurant ratings using machine learning techniques. These ratings help users of the Zomato website choose the best restaurant to dine in. Sentiment analysis has been used to find the restaurant rating: the sentiment score automatically classifies the restaurant rating to help users or customers choose the best restaurant. The sentiment score is calculated from user reviews, in which keywords have associated ratings that are given a sentiment score. This is useful for finding the tone behind the user’s review. The process can be explained as follows. The dataset is extracted from the Yelp site, and around 100,324 reviews for 2000 restaurants are taken. The reviews contain many words such as good, bad, excellent, marvelous, amazing, wonderful, awful and terrible, and the sentiment score is calculated from all these words.
The process for the sentiment analysis is described below:
First, the reviews are broken into separate sentiment words. A text file lists positive and negative words, and each word has a corresponding sentiment score, as seen in the table above, so that the final sentiment score can be calculated.
After the sentiment words are scored, emoticons and emphasis are identified. As an example, compare the two sentences “food was GREAT” and “food was great”: the first sentence receives a higher score than the second, because the user has stated the positive review emphatically.
The positive and negative scores are averaged, and a neutral score is also calculated. Combining these scores gives the final sentiment score.
Finally, the rating is calculated.
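The scoring steps above can be sketched in a few lines of Python. This is one possible reading of the process, not the implementation used in the study: the lexicon scores, the `EMPHASIS_BOOST` multiplier and the function names are all assumptions, and the sketch simply averages the scores of the sentiment words it finds, treating a review with no sentiment words as neutral.

```python
import re

# Hypothetical lexicon: the words appear in the essay, the scores are assumed.
LEXICON = {
    "good": 1.0, "great": 1.5, "excellent": 2.0, "marvelous": 2.0,
    "amazing": 2.0, "wonderful": 2.0,
    "bad": -1.0, "awful": -2.0, "terrible": -2.0,
}
EMPHASIS_BOOST = 1.5  # assumed multiplier for words written in capitals


def word_scores(review: str) -> list[float]:
    """Return one score per lexicon word found in the review."""
    scores = []
    for token in re.findall(r"[A-Za-z']+", review):
        base = LEXICON.get(token.lower())
        if base is None:
            continue  # the word carries no sentiment in this lexicon
        emphasized = token.isupper() and len(token) > 1
        scores.append(base * EMPHASIS_BOOST if emphasized else base)
    return scores


def sentiment_score(review: str) -> float:
    """Average the scores of the sentiment words; 0.0 means neutral."""
    scores = word_scores(review)
    return sum(scores) / len(scores) if scores else 0.0


print(sentiment_score("food was GREAT") > sentiment_score("food was great"))
```

With these assumed values, the capitalized review scores higher than the lower-case one, which mirrors the “food was GREAT” example. A real system would also need to handle negation and emoticons, which this sketch leaves out.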
The following hypothesis can be assumed:
There are other techniques for implementing sentiment analysis, such as Naïve Bayes, Support Vector Machines, decision trees, the K-Nearest Neighbor classifier, the Winnow classifier and the AdaBoost classifier.
The dataset, which was downloaded from Kaggle, should be preprocessed before any machine learning algorithm is applied to it, because it contains unwanted noise, missing values, null values and special characters.
The dataset contained many unwanted rows and columns, such as Country code, Locality verbose, Latitude, Longitude and Currency. These attributes were removed before applying machine learning because they have little effect on the dependent variable. Other missing and null values in the dataset were cleaned in R using the gsub function and removed manually in Excel.
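The cleaning described above was done in R and Excel; the following is only a rough Python analogue using the standard library. The function name `clean_rows`, the exact header spellings in `DROP_COLUMNS` and the set of characters kept by the regular expression are assumptions for illustration.

```python
import csv
import io
import re

# Columns the essay lists as removed; the exact header spelling is assumed.
DROP_COLUMNS = {"Country code", "Locality verbose", "Latitude", "Longitude", "Currency"}


def clean_rows(csv_text: str) -> list[dict[str, str]]:
    """Drop unused columns, strip special characters, skip incomplete rows."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        kept = {}
        for column, value in row.items():
            if column is None or column in DROP_COLUMNS:
                continue  # extra cells without a header, or an unwanted column
            # Rough analogue of R's gsub: keep word characters, spaces
            # and a little basic punctuation, remove everything else.
            kept[column] = re.sub(r"[^\w\s.,&'-]", "", value or "").strip()
        if all(kept.values()):  # skip rows with a missing or empty value
            cleaned.append(kept)
    return cleaned
```

Rows with any empty field are dropped here for simplicity; depending on the data, imputing the missing value may be preferable to discarding the row.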
The R Studio output below shows the correlation between different attributes of the dataset.
Correlation can be classified into different types: strong correlation, no correlation and neutral correlation.
From the above, take ‘Price range’ and ‘Has table booking’ as an example of strongly correlated attributes. ‘Has online delivery’ and ‘Has table booking’ have a neutral correlation with each other.
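To make the idea of correlation between two attributes concrete, here is a minimal Python sketch of the Pearson correlation coefficient. The essay does not state which coefficient R reported, so Pearson is an assumption, as are the helper names `pearson` and `yes_no` and the toy values, which are made up and not taken from the Zomato dataset.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two columns of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def yes_no(values: list[str]) -> list[float]:
    """Encode a Yes/No column as 1.0/0.0 so it can be correlated."""
    return [1.0 if value == "Yes" else 0.0 for value in values]


# Toy values, invented for illustration only.
price_range = [1.0, 2.0, 3.0, 4.0]
has_table_booking = yes_no(["No", "No", "Yes", "Yes"])
print(round(pearson(price_range, has_table_booking), 3))
```

A coefficient close to 1 or -1 indicates a strong relationship and a value close to 0 indicates none; where exactly to draw the line between the categories named above is a judgment call that the sketch does not fix.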
There are multiple techniques for classifying the restaurant rating using machine learning algorithms. The rating scores can be calculated using sentiment analysis, as seen earlier. In this project, a Support Vector Machine classifier is implemented for the restaurant rating, and the method uses additional attributes to improve the accuracy of the classification. The additional attributes are whether the restaurant has table booking, has online delivery, is delivering now and has the switch to order menu option, together with the price of the food, the total number of votes by customers and the location of the restaurant. These are the independent variables that have an effect on the dependent variable, the restaurant rating.
Now let’s see why a Support Vector Machine, rather than another machine learning technique, is used to classify the restaurant rating:
These are the reasons for using Support Vector Machine.
The Support Vector Machine algorithm is implemented in R, and the prediction accuracy is 99.6%. The dataset is first read and stored in a local variable in R. It is then split into training and test sets so that future restaurant ratings can be classified. After the split, the Support Vector Machine algorithm is applied, and running the R code on the dataset gives an accuracy of 99.6%.
By implementing the Support Vector Machine algorithm, we have found the accuracy with which the restaurant rating can be classified. The customer rating plays an important role in any organization, because the business value and business profit depend on the rating of the restaurant. A higher rating also brings new customers to the restaurant. Customer relationships therefore play a very important role in the success and profit of a business.
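The split, train and evaluate workflow can be sketched as follows. This is not the R code used in the project and does not reproduce the reported accuracy: it is a small linear Support Vector Machine written from scratch in Python, trained by subgradient descent on the regularised hinge loss. It handles only two classes, whereas the rating text has several, and the toy rows, function names and hyperparameters are all assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split off a test portion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_linear_svm(data, epochs=200, lam=0.01, lr=0.1):
    """Fit weights w and bias b on (features, label) pairs, label in {-1, +1}."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regulariser always shrinks w slightly.
            w = [wi - lr * lam * wi for wi in w]
            # The hinge term only acts when the point is inside the margin.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating line."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, data):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(w, b, x) == y for x, y in data) / len(data)


# Toy rows, made up for illustration: (has_table_booking, has_online_delivery)
# encoded as +1 for yes and -1 for no; label +1 means "highly rated".
TOY_ROWS = [([1, 1], 1), ([1, -1], 1), ([-1, 1], -1), ([-1, -1], -1)] * 5

train, test = train_test_split(TOY_ROWS)
w, b = train_linear_svm(train)
print(f"held-out accuracy: {accuracy(w, b, test):.3f}")
```

Accuracy is measured on the held-out test rows rather than on the training rows, since a very high figure on data the model has already seen would say little about how well future ratings are classified.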