Summarising big data: public GitHub dataset for software engineering challenges

Date

2020

Access Rights

info:eu-repo/semantics/openAccess

Abstract

In open-source software development environments, the textual, numerical, and relationship-based data that are generated are of interest to researchers. Various datasets are available for this data, which is frequently used in areas such as software engineering and natural language processing. However, because these datasets contain all of the data in the environment, working with them means processing terabytes of data. For this reason, almost all studies that use GitHub data work with data filtered according to certain criteria. Because each study then uses a different dataset, comparing the accuracy of the studies is quite difficult. To solve this problem, a common dataset was created and shared with researchers, allowing work on many software engineering problems.

Source

Cumhuriyet Science Journal

Volume

41

Issue

3

Citation