Crowd-sourced knowledge is heavily used in software development. For example, thousands of software developers use the Stack Overflow website to ask questions and find solutions to the problems they encounter. This project focuses on analyzing content on technical question-and-answer (QA) websites, specifically Stack Overflow.
The Ph.D. student will focus on several important questions related to Stack Overflow. Examples include the evolution of the Python community over time, the problems associated with question migration and the early prediction of migrated questions, mining rule violations in Python code, and identifying API misuses. Software development is a complex process that involves many steps. This project will provide an opportunity to pursue research in the areas of data science and software engineering.
The Stack Overflow dataset is openly available and can be downloaded from https://archive.org/details/stackexchange. (Note: the data will need to be pre-processed according to the project's requirements.)
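As an illustration of the kind of pre-processing involved, the following is a minimal sketch rather than the project's actual pipeline. It assumes the dump provides a Posts.xml file made of row elements with PostTypeId, CreationDate and Tags attributes, that PostTypeId "1" marks a question, and that tags are delimited by angle brackets or pipes. The function name python_questions_per_year and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def python_questions_per_year(posts_xml, tag="python"):
    """Count questions carrying `tag`, grouped by creation year."""
    counts = Counter()
    # iterparse streams the file instead of loading the whole dump at once.
    for _, elem in ET.iterparse(posts_xml, events=("end",)):
        # Assumption: rows with PostTypeId "1" are questions.
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            # Assumption: tags look like "<python><pandas>" or "|python|pandas|".
            raw = elem.get("Tags", "")
            for delimiter in "<>|":
                raw = raw.replace(delimiter, " ")
            if tag in raw.split():
                # CreationDate is assumed to start with a four-digit year.
                counts[elem.get("CreationDate", "")[:4]] += 1
        elem.clear()  # drop attributes and children of the processed element
    return counts


# Tiny hypothetical sample in the assumed layout, used in place of the real dump.
sample = b"""<posts>
<row Id="1" PostTypeId="1" CreationDate="2020-01-05T10:00:00.000" Tags="&lt;python&gt;&lt;pandas&gt;" />
<row Id="2" PostTypeId="2" CreationDate="2020-01-06T10:00:00.000" />
<row Id="3" PostTypeId="1" CreationDate="2021-03-01T09:30:00.000" Tags="&lt;java&gt;" />
</posts>"""

print(python_questions_per_year(io.BytesIO(sample)))
```

The same function accepts a file path in place of the in-memory sample. Counts of this kind could serve as a starting point for studying how the Python community has evolved over time.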
The Ph.D. student will be required to take into account ethical considerations in data collection and potential biases in the subsequent analyses. Some links to related publications:
Please quote FNS_SS_Sept2022