The hugely popular Stack Overflow site provides answers to questions on a range of programming topics and has become a trusted resource for both students and professionals needing to improve their code, solve a problem, or get advice on an implementation. The owners also provide a regular dump of the entire site, allowing analysis of users and their interactions through posing and responding to questions, and voting on others’ posts. Below are three recent studies, each investigating aspects of this Q&A site.
The Mamykina et al. paper uses a nice mixed-methods approach to look at both the interaction patterns on the site and the motivation of users and community owners. Some key design aspects highlighted are that the community managers are themselves respected domain experts, and that they have taken an evolutionary, collaborative approach to the development of the platform. In terms of site activity, the authors note that a high proportion of questions receive at least one answer (92%) and that answers are received and accepted quickly (median 11 minutes to be received, 21 minutes for an answer to be accepted). Whereas some users only ask (23%) and others only answer (20%), a good number do both (21%). They go on to recognise distinct user profiles in the community: activists, shooting stars, low-profile users and lurkers/visitors. Activists and shooting stars – active for a short period – together provide a large proportion of the answers despite representing a low proportion of the community (figure).
Mamykina et al. also note how the gaming aspects of the site – reputation scores and badges – clearly add to the site’s stickiness and appeal, though these are not without potential problems, including the tendency for answers to be provided quickly in a rush for the accompanying reputation points.
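As a rough illustration of how activity figures like the answer rate and the median time to first answer might be derived, here is a minimal Python sketch. It is not the authors' method: the `QUESTIONS` records, their field names and the helper functions are hypothetical toy stand-ins, and the real data dump has a different, richer schema.

```python
from datetime import datetime
from statistics import median

# Hypothetical toy records (assumption): not the real data dump schema.
QUESTIONS = [
    {"id": 1, "asked": "2010-01-01T10:00:00",
     "answers": ["2010-01-01T10:07:00", "2010-01-01T10:30:00"]},
    {"id": 2, "asked": "2010-01-01T11:00:00",
     "answers": ["2010-01-01T11:15:00"]},
    {"id": 3, "asked": "2010-01-01T12:00:00", "answers": []},
    {"id": 4, "asked": "2010-01-01T13:00:00",
     "answers": ["2010-01-01T13:11:00"]},
]


def minutes_to_first_answer(question):
    """Return minutes from asking to the earliest answer, or None if unanswered."""
    if not question["answers"]:
        return None
    asked = datetime.fromisoformat(question["asked"])
    first = min(datetime.fromisoformat(t) for t in question["answers"])
    return (first - asked).total_seconds() / 60


def summarise(questions):
    """Return (share of questions answered, median minutes to first answer)."""
    delays = [d for d in map(minutes_to_first_answer, questions) if d is not None]
    answered_share = len(delays) / len(questions)
    # The median is undefined when no question has an answer.
    return answered_share, (median(delays) if delays else None)


share, median_minutes = summarise(QUESTIONS)
print(f"{share:.0%} answered, median {median_minutes:.0f} minutes to first answer")
```

The median is used rather than the mean because a handful of questions that wait days for a reply would otherwise dominate the figure.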
Treude et al.’s study is qualitative, looking at the types of questions that are asked and the tags applied to them. They propose coding tags into five categories (programming language, framework, environment, domain and non-functional) and questions into types including how-tos, discrepancies, environments, errors, decision and conceptual. They found code review questions to be the most frequently answered satisfactorily (i.e. with an “Accepted” answer in 92% of cases studied), proposing that this is the easiest category to answer, being the most concrete and self-contained and usually containing a code snippet for answerers to critique and correct.
Kumar et al. looked at the Stack Overflow platform as an example of a “two-sided market”, where there are two distinct user groups – questioners and answerers – and a network effect between them that helps to determine its attractiveness to new adopters. Using a data extract, they derived “attachment curves” for the platform, defined as the probability that a user of one type or another will join the platform. They found a strong “cross-side” network effect, where the rate of questioners joining the network was strongly influenced by the presence of answerers, but this effect was asymmetric: the number of answerers grew more slowly in the presence of many questioners.
These papers together provide some good insights into how this online community has grown and become a great success – there is a symbiotic and sustainable balance between questioners and providers, with both groups being served by the interactions, to the huge benefit of a third group, the search engine visitor. We also see how some question types and categories work more successfully than others, and how rewards might encourage speed and quantity but may also compromise quality.
KUMAR, R., LIFSHITS, Y. and TOMKINS, A., 2010. Evolution of Two-Sided Markets, Third ACM International Conference on Web Search and Data Mining, February 3-6 2010
MAMYKINA, L., MANOIM, B., MITTAL, M., HRIPCSAK, G. and HARTMANN, B., 2011. Design lessons from the fastest q&a site in the west, Proceedings of the 2011 annual conference on Human factors in computing systems, 2011, ACM pp2857-2866
TREUDE, C., BARZILAY, O. and STOREY, M.-A., 2011. How do programmers ask and answer questions on the web?: NIER track, Software Engineering (ICSE), 2011 33rd International Conference on, 2011, pp804-807