Do-it-yourself knowledge

The Stack Exchange websites make their data available via a data dump. This post reflects some exploratory research done on the April 2011 data from http://diy.stackexchange.com/, a knowledge exchange site for home improvement tips:

The Home Improvement Stack Exchange homepage
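For anyone wanting to repeat this kind of exploration, here is a minimal sketch of reading the dump. It assumes the dump is a set of XML files whose records are `<row …/>` elements with the fields stored as attributes (for example `Id`, `PostTypeId` and `OwnerUserId`). The function name `iter_rows` and the tiny sample document are mine, for illustration only.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source):
    """Yield each <row> element's attributes as a plain dict.

    `source` is a filename or a binary file object. Rows are streamed,
    so a large dump file does not have to fit in memory at once.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield dict(elem.attrib)
            elem.clear()  # free the element once it has been read


# Invented two-row sample standing in for a real posts file.
sample = io.BytesIO(
    b'<posts>'
    b'<row Id="1" PostTypeId="1" OwnerUserId="7" />'
    b'<row Id="2" PostTypeId="2" ParentId="1" OwnerUserId="9" />'
    b'</posts>'
)
rows = list(iter_rows(sample))
```

All attribute values come back as strings, so counts and dates need converting before any arithmetic.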

Here you see a question page, showing how a question can receive several answers and how both questions and answers are rated by the community. Users receive “reputation” points for asking and answering questions, as well as for having their answers accepted as the best answer.

A question-answer page

From the user profile information, it looks like typical users of the site are American men in their 30s who work in technology and have an interest in DIY as a hobby. I would guess that many were introduced to the site after using Stack Overflow, a question-and-answer site for programming problems which uses the same underlying software.

A typical user?

Looking at the kinds of question asked, the diagram below gives an approximate representation of the proportion of question types on the site. Given the practical nature of the topic, there is clearly a focus on procedural questions and recommendations. It is interesting, though, that “Why”-type questions are also quite common. Here questioners are seeking to fill gaps in their knowledge relating to observed symptoms, effects or approaches, and perhaps need more hypothetical or diagnostic answers.

Types of question

Users fall into three groups: those that only ask questions (441), those that only answer (219), and those that do both (237). This seems like quite a healthy distribution of information seekers and providers. Those that do both have the highest average reputation (534), compared with 162 for those who only answer and 94 for those who only ask. Again this makes some sense, showing that the more involved members are being rewarded for their efforts.

Community members – types of contribution
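The grouping above can be sketched as follows. This is not the code used for the post. It assumes each post has been reduced to an (owner id, post type) pair, with type 1 for a question and type 2 for an answer, and that reputation is available as a dict keyed by user id. The names `classify_users` and `mean_reputation` are hypothetical.

```python
def classify_users(posts):
    """Split users into ask-only, answer-only and both.

    `posts` is an iterable of (owner_user_id, post_type) pairs, where
    post_type 1 is assumed to mean question and 2 to mean answer.
    """
    asked, answered = set(), set()
    for user_id, post_type in posts:
        if post_type == 1:
            asked.add(user_id)
        elif post_type == 2:
            answered.add(user_id)
    return {
        "ask_only": asked - answered,
        "answer_only": answered - asked,
        "both": asked & answered,
    }


def mean_reputation(groups, reputation):
    """Average reputation per group; an empty group maps to None."""
    return {
        name: (sum(reputation[u] for u in users) / len(users)) if users else None
        for name, users in groups.items()
    }


# Invented example: user 1 only asks, user 2 only answers, user 3 does both.
groups = classify_users([(1, 1), (2, 2), (3, 1), (3, 2)])
averages = mean_reputation(groups, {1: 50, 2: 150, 3: 500})
```

Users who have never posted appear in none of the three sets, so the group sizes will not add up to the total number of registered accounts.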

Users seem to receive satisfactory answers quickly: of the questions that have an accepted answer, some 51% had that answer received and accepted within an hour, with a further 15% in the second hour.
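One way to produce this kind of breakdown is to bucket the delay between a question and its accepted answer into whole hours. The sketch below assumes the delay is measured from the time the question was posted to the time the eventually accepted answer was posted, which may differ from how the figures above were computed. The function name and the timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def share_by_hour(pairs):
    """Return {hours_elapsed: share_of_questions}.

    `pairs` is an iterable of (question_time, accepted_answer_time)
    datetimes. Hour 0 means "within the first hour", hour 1 means
    "in the second hour", and so on.
    """
    buckets = Counter()
    for asked, answered in pairs:
        delay_hours = int((answered - asked).total_seconds() // 3600)
        buckets[delay_hours] += 1
    total = sum(buckets.values())
    return {hour: count / total for hour, count in sorted(buckets.items())}


# Invented sample: four questions, all asked at noon.
noon = datetime(2011, 4, 1, 12, 0)
shares = share_by_hour([
    (noon, datetime(2011, 4, 1, 12, 20)),
    (noon, datetime(2011, 4, 1, 12, 55)),
    (noon, datetime(2011, 4, 1, 13, 30)),
    (noon, datetime(2011, 4, 1, 17, 5)),
])
# shares == {0: 0.5, 1: 0.25, 5: 0.25}
```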

There is a small but statistically significant correlation between total question views and the number of answers received (r² = 0.27, F = 561, p < 0.01). It would be interesting to untangle this and see whether views accumulate while the question is "active" (receiving answers) or after the answers have accumulated. The latter could indicate that 1) a greater range of answers makes the question-answer thread pages richer and more generally useful to third parties and/or 2) these topics are just generally more interesting, receiving answers and views accordingly.
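For readers who want to see where statistics like these come from, here is a minimal sketch of a one-predictor least-squares fit in plain Python. It assumes the relationship was tested with a simple linear regression of one variable on the other, which the post does not confirm. The function name and the data are hypothetical.

```python
def r_squared_and_f(xs, ys):
    """Return (r_squared, f_statistic) for a simple linear regression.

    With one predictor, r^2 = Sxy^2 / (Sxx * Syy) and
    F = r^2 * (n - 2) / (1 - r^2), on 1 and n - 2 degrees of freedom.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples of at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    r_squared = sxy ** 2 / (sxx * syy)
    if r_squared >= 1.0:
        return r_squared, float("inf")  # perfect fit, F is unbounded
    f_stat = r_squared * (n - 2) / (1 - r_squared)
    return r_squared, f_stat


# Invented data: answer counts against view counts in arbitrary units.
r2, f = r_squared_and_f([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

Because F grows with the sample size for a fixed r², a modest r² can still be highly significant on a large sample. If the reported figures do come from a one-predictor fit, they would imply a sample of roughly 1,500 questions. That is an inference from the formula, not a number taken from the data.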

If we draw the graph of users and question-answer ties, where an edge links a user who asked a question to a user who answered it, we can see that users asking questions with many answers have a strong cohesive influence on the community. The one at the centre of this portion of the graph asked the most popular question on the site: “What are the tools that every DIYer should own?”

Question-Answerer ties
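A graph like this can be built from two pieces of information: who asked each question, and who answered it. The sketch below is an illustration rather than the method behind the figure. It assumes an undirected tie between asker and answerer, ignores self-answers, and uses hypothetical names throughout.

```python
from collections import defaultdict


def build_ties(questions, answers):
    """Build an undirected asker-answerer graph as {user: set_of_neighbours}.

    `questions` maps question_id -> asker_id.
    `answers` is an iterable of (question_id, answerer_id) pairs.
    Answers to unknown questions and self-answers are skipped.
    """
    neighbours = defaultdict(set)
    for question_id, answerer in answers:
        asker = questions.get(question_id)
        if asker is None or asker == answerer:
            continue
        neighbours[asker].add(answerer)
        neighbours[answerer].add(asker)
    return neighbours


def most_connected(neighbours):
    """Return the user tied to the most distinct users, or None if empty."""
    return max(neighbours, key=lambda user: len(neighbours[user]), default=None)


# Invented example: user "a" asks a question that draws three answerers.
ties = build_ties(
    {10: "a", 11: "b"},
    [(10, "b"), (10, "c"), (10, "d"), (11, "c"), (10, "a"), (99, "z")],
)
hub = most_connected(ties)  # "a"
```

Counting distinct neighbours is only the simplest measure of centrality, but it is enough to pick out a user whose single question attracted answers from many others.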

Here we see his profile page, which shows that much of his reputation came from asking this question.

It’s interesting that asking the most basic, open-ended – and perhaps contentious – question can engage the community and attract a flood of information and advice, boosting a user’s reputation accordingly.
