Title: Application Domain of 5,000 GitHub Repositories

Type: Dataset

Borges, Hudson Silva; Valente, Marco Tulio (2017): Application Domain of 5,000 GitHub Repositories. Zenodo. Dataset.

Authors: Borges, Hudson Silva (Federal University of Minas Gerais); Valente, Marco Tulio (Federal University of Minas Gerais)



We provide a manual classification of the application domain of 5,000 GitHub repositories (the most popular ones by number of stars as of January 2017). We classified each system into one of the following application domains:

  • Application software: systems that provide functionalities to end-users, like browsers and text editors (e.g., WordPress/WordPress and adobe/brackets).
  • System software: systems that provide services and infrastructure to other systems, like operating systems, middleware, and databases (e.g., torvalds/linux and mongodb/mongo).
  • Web libraries and frameworks (e.g., twbs/bootstrap and angular/angular.js).
  • Non-web libraries and frameworks (e.g., google/guava and facebook/fresco).
  • Software tools: systems that support development tasks, like IDEs, package managers, and compilers (e.g., Homebrew/homebrew and git/git).
  • Documentation: repositories with documentation, tutorials, source code examples, etc. (e.g., iluwatar/java-design-patterns).
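
For readers who want to work with the labels programmatically, the six domains can be sketched as a small Python enumeration. This is only an illustration: the record does not describe the dataset's file layout, so the names `Domain`, `EXAMPLES`, and `count_by_domain` are hypothetical, and the only data used are the example repositories named in the description.

```python
from collections import Counter
from enum import Enum


class Domain(Enum):
    """The six application domains listed in the dataset description."""
    APPLICATION = "Application software"
    SYSTEM = "System software"
    WEB_LIBRARY = "Web libraries and frameworks"
    NON_WEB_LIBRARY = "Non-web libraries and frameworks"
    TOOL = "Software tools"
    DOCUMENTATION = "Documentation"


# Example repositories taken from the description; this mapping is an
# illustrative stand-in, not the dataset's actual file format.
EXAMPLES = {
    "WordPress/WordPress": Domain.APPLICATION,
    "adobe/brackets": Domain.APPLICATION,
    "torvalds/linux": Domain.SYSTEM,
    "mongodb/mongo": Domain.SYSTEM,
    "twbs/bootstrap": Domain.WEB_LIBRARY,
    "angular/angular.js": Domain.WEB_LIBRARY,
    "google/guava": Domain.NON_WEB_LIBRARY,
    "facebook/fresco": Domain.NON_WEB_LIBRARY,
    "Homebrew/homebrew": Domain.TOOL,
    "git/git": Domain.TOOL,
    "iluwatar/java-design-patterns": Domain.DOCUMENTATION,
}


def count_by_domain(labels):
    """Return the number of repositories per domain, including zero counts."""
    counts = Counter(labels.values())
    return {domain: counts.get(domain, 0) for domain in Domain}


if __name__ == "__main__":
    for domain, n in count_by_domain(EXAMPLES).items():
        print(f"{domain.value}: {n}")
```

Using an enumeration rather than free-text labels keeps the six categories closed, so a misspelled domain name fails loudly instead of silently creating a seventh category.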

To cite the dataset, please use the following paper (which proposes and uses a first dataset version):

Hudson Borges, Andre Hora, Marco Tulio Valente. Understanding the Factors that Impact the Popularity of GitHub Repositories. In 32nd IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 334-344, 2016.

More information

  • DOI: 10.5281/zenodo.804474


  • Keywords: github, domains


  • Publication date: 2017
  • Issued: June 08, 2017




electronic resource

