TY - GEN
T1 - Identifying Trends in Technologies and Programming Languages Using Topic Modeling
AU - Johri, Vishal
AU - Bansal, Srividya
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/4/9
Y1 - 2018/4/9
N2 - Technology question and answer websites are a great source of technical knowledge. Users of these websites raise various types of technical questions and answer them. These questions cover a wide range of domains in Computer Science, such as Networks, Data Mining, Multimedia, Multi-threading, Web Development, and Mobile App Development. Analyzing the textual content of these websites can help the computer science and software engineering community better understand the needs of developers and learn about current trends in technology. In this project, textual data from the popular question and answer website StackOverflow is analyzed using the Latent Dirichlet Allocation (LDA) topic modeling algorithm. The results show that this technique helps discover dominant topics in developer discussions. These topics are analyzed to yield a number of interesting observations, such as popular technologies and languages, the impact of a technology, technology trends over time, the relationship of a technology or language with other technologies, and comparisons of technologies addressing an area of computer science or software engineering.
AB - Technology question and answer websites are a great source of technical knowledge. Users of these websites raise various types of technical questions and answer them. These questions cover a wide range of domains in Computer Science, such as Networks, Data Mining, Multimedia, Multi-threading, Web Development, and Mobile App Development. Analyzing the textual content of these websites can help the computer science and software engineering community better understand the needs of developers and learn about current trends in technology. In this project, textual data from the popular question and answer website StackOverflow is analyzed using the Latent Dirichlet Allocation (LDA) topic modeling algorithm. The results show that this technique helps discover dominant topics in developer discussions. These topics are analyzed to yield a number of interesting observations, such as popular technologies and languages, the impact of a technology, technology trends over time, the relationship of a technology or language with other technologies, and comparisons of technologies addressing an area of computer science or software engineering.
KW - Latent Dirichlet Allocation (LDA)
KW - Machine Learning
KW - Natural Language Processing
KW - Topic modeling
UR - http://www.scopus.com/inward/record.url?scp=85048384043&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048384043&partnerID=8YFLogxK
U2 - 10.1109/ICSC.2018.00078
DO - 10.1109/ICSC.2018.00078
M3 - Conference contribution
AN - SCOPUS:85048384043
T3 - Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018
SP - 391
EP - 396
BT - Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th IEEE International Conference on Semantic Computing, ICSC 2018
Y2 - 31 January 2018 through 2 February 2018
ER -