This REU program is hosted by the School of Electrical Engineering and Computer Science at Washington State University. The program consists of 10 weeks of work with top research professors and graduate students at WSU. I worked alongside Dr. Haipeng Cai and conducted research on multiple-language use in open-source software projects on GitHub. Our research included topics such as artificial intelligence, machine learning, and data mining.
These are the fundamental questions driving our research and data collection efforts.
- How many languages, on average, are used by the sampled projects?
- What is the distribution of the languages used across all these projects, measured in mentions?
- What is the distribution of the languages used across all these projects, measured in bytes?
- What are the top combinations of languages across all these projects?
- What are the average and standard deviation of each language’s percentage use across all the projects?
- What are the application/functionality domains of the projects (e.g., accounting software, personal health apps, games, ML)?
- Can we identify a potential correlation between these domains and the multiple-language use patterns?
- How are different kinds of components of a project associated with the varied languages in the project?
- How have projects with a reasonably long version history, spanning multiple years, evolved in their multiple-language use?
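
To make the first few questions concrete, here is a minimal sketch of how the counts could be computed. It is not the pipeline used in this research. It assumes each project is represented as a mapping from language name to bytes of code (the shape of data that GitHub's repository languages endpoint returns), and the project names, numbers, and the `summarize` helper are hypothetical. It also assumes that a "mention" means one project using a language at all, and that a project not using a language counts as a 0% share for that language.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical sample data: project name -> {language: bytes of code}.
projects = {
    "proj-a": {"Python": 8000, "C": 2000},
    "proj-b": {"Python": 5000, "JavaScript": 5000},
    "proj-c": {"Python": 3000, "C": 6000, "Shell": 1000},
}


def summarize(projects):
    """Compute simple multi-language usage statistics over a project sample."""
    n = len(projects)
    mentions = Counter()     # number of projects that use each language
    byte_totals = Counter()  # total bytes per language across all projects
    combos = Counter()       # how often each exact language set occurs
    shares = {}              # language -> list of per-project percentages

    for langs in projects.values():
        total = sum(langs.values())
        if total == 0:
            continue  # skip projects with no measurable code
        mentions.update(langs.keys())
        byte_totals.update(langs)
        combos[tuple(sorted(langs))] += 1
        for lang, size in langs.items():
            shares.setdefault(lang, []).append(100.0 * size / total)

    # Assumption: a project that does not use a language contributes 0%.
    share_stats = {}
    for lang, values in shares.items():
        padded = values + [0.0] * (n - len(values))
        share_stats[lang] = (mean(padded), pstdev(padded))

    return {
        "avg_languages": mean(len(langs) for langs in projects.values()),
        "mentions": mentions,
        "bytes": byte_totals,
        "top_combinations": combos.most_common(),
        "share_stats": share_stats,
    }


result = summarize(projects)
print(result["avg_languages"])
print(result["mentions"].most_common())
print(result["bytes"].most_common())
```

Counting by mentions and by bytes can rank languages differently: a language that appears in many projects as a small build script scores high on mentions but low on bytes, which is why the two distributions are asked about separately.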