Popularity Prediction of Stackoverflow Posts


Published: Apr 2, 2020


The Internet is a huge resource of knowledge and information that can provide almost anything you want to know. Very often, however, there are situations where you cannot find the answers to your questions. A question may require trivial information, analytic thinking, or particular expertise that only people, not computers, can provide. Without those, your question may go unanswered.


Fortunately, there are websites dedicated to finding experts in various fields, where you can read people's thoughts and opinions on a particular subject. Such websites are called Community Question Answering websites, or CQA for short.

In recent years, Community Question Answering (CQA) websites have gained much popularity as a way of providing and searching for information. They give users a direct and rapid way to find the information they want, along with the thoughts and opinions of people who are experts in their fields.

StackOverflow is one such Community Question Answering website. It is a privately held website and the flagship site of the Stack Exchange network. On this website, people can ask questions and get answers on a wide range of topics in computer science and computer programming.

Very often, though, people do not get an answer to their question, not because nobody knows the answer, but because the question was not asked or structured properly enough for other users to understand it and answer correctly. Most of these unpopular questions are asked by new users who have just started using StackOverflow, and most of the time these new users are students who need help with their studies. Asking unstructured questions would not be a problem if users could simply ask again in a different way, but StackOverflow has a system that bans users who build a bad reputation by frequently posting bad questions. This is problematic for new users, as they do not know how to write a good post, and almost all of the banned users are new users.

Prediction means giving an approximate result of an event before the event actually occurs. In technical terms, “prediction means to determine result purely on the description of another related data or another related set of data”. Predicting the popularity of a StackOverflow post means predicting whether the post will get likes or dislikes and whether people will answer the question. On this basis, the problem can be turned into a classification problem with two categories: the post will be popular or it will not. Almost all of the good StackOverflow posts that get the most likes have a similar structure. For example, a good post includes some code and has a good title. We will define features of this kind and apply classification algorithms to them.
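As a rough illustration of how a post could be turned into features and a two-class label, here is a minimal Python sketch. The `Post` fields, the feature names, and the score threshold are all assumptions made for this example; they are not taken from StackOverflow's data schema or from any specific study.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical fields chosen for illustration only.
    title: str
    body: str
    score: int  # likes minus dislikes


def extract_features(post):
    """Return a few hand-crafted structural features of a post."""
    return {
        # 1 if the body appears to contain a code snippet, else 0.
        "has_code": int(bool(re.search(r"<code>|```", post.body))),
        "title_words": len(post.title.split()),
        "title_is_question": int(post.title.strip().endswith("?")),
        "body_words": len(post.body.split()),
    }


def label(post, threshold=1):
    """Return 1 (popular) or 0 (not popular); the threshold is an arbitrary assumption."""
    return int(post.score >= threshold)


example = Post(
    title="How do I reverse a list in Python?",
    body="I tried <code>lst.reverse()</code> but it returns None.",
    score=5,
)
features = extract_features(example)
```

The resulting feature dictionary and label form one training example; a real system would likely need many more features than the handful shown here.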

Though a post's popularity depends on its content, many other factors determine how successful a post becomes. These include features such as Title, Domain, Author, Thumbnail, and Self-Text. Good classification performance has been achieved using statistical classification algorithms with varying numbers and kinds of features. The more features affecting a post's popularity that are taken into consideration, the more accurate the prediction will be. Despite StackOverflow's encouragement, many questions on the site are not answered.

As StackOverflow grew in popularity, the number of questions and the number of new users increased, and with them the number of answered questions. According to statistics from 2012, close to 45 percent of questions remained unanswered. A decision layer text classification model works very well, but it does not outperform statistical models. Other classification algorithms were used in previous work, such as Decision Trees, Random Forest, Neural Networks, and Nearest Neighbor. Though Neural Networks show slightly better prediction results, the computational power and cost needed to implement them are much higher than for the other algorithms.
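To make one of the algorithms named above concrete, the following is a minimal Nearest Neighbor sketch in plain Python. The feature layout `[has_code, title_words]` and the toy training rows are invented for illustration; they are not real StackOverflow data and do not reproduce any result from previous work.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Pair each training row's Euclidean distance to x with its label.
    distances = sorted(
        (math.dist(row, x), y) for row, y in zip(train_X, train_y)
    )
    nearest_labels = [y for _, y in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up toy data: each row is [has_code, title_words]; 1 = popular, 0 = not.
train_X = [[1, 8], [1, 10], [1, 7], [0, 2], [0, 3], [0, 1]]
train_y = [1, 1, 1, 0, 0, 0]

prediction_a = knn_predict(train_X, train_y, [1, 9])
prediction_b = knn_predict(train_X, train_y, [0, 2])
```

Nearest Neighbor needs no training step, which keeps the sketch short, but every prediction scans the whole training set, so it would scale poorly to a large collection of posts.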


A number of feature extraction methods exist, such as TF-IDF, doc2vec, CountVectorizer, and text ranking. Among them, TF-IDF and CountVectorizer perform very well in text classification and prediction. Features play a major role in classification algorithms. TF-IDF is a very good algorithm for weighting how frequent a word is in one document relative to the other documents in the corpus. In TF-IDF, the filtered text content is segmented into words and stop words are removed. Word frequencies are counted and the TF-IDF values are computed according to the corpus. Candidate words are identified by their TF-IDF values and word similarities are computed. Keywords are then extracted from the candidate words according to their TF-IDF values.
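The steps described above (segment into words, remove stop words, count frequencies, compute TF-IDF, pick keywords) can be sketched in plain Python as follows. This assumes one common TF-IDF variant, term frequency times log(N / document frequency); the stop-word list and sample titles are illustrative only, and the word-similarity step is left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "in", "to", "of", "and", "how", "do", "i"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9+#]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def tf_idf(docs):
    """Return one {word: tf-idf score} dictionary per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        scores.append(
            {w: (c / total) * math.log(n_docs / df[w]) for w, c in counts.items()}
        )
    return scores


def top_keywords(score, n=2):
    """Return the n highest-scoring words, breaking ties alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


docs = [
    "How to reverse a list in Python",
    "How to sort a list in Java",
    "Python list comprehension syntax",
]
scores = tf_idf(docs)
keywords = top_keywords(scores[0], n=1)
```

In this toy corpus the word "list" appears in every title, so its score is zero, while a word unique to one title ranks highest for that title.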

This essay was reviewed by
Alex Wood
