Summarising big data: public GitHub dataset for software engineering challenges

dc.contributor.author: Şeker, Abdulkadir
dc.contributor.author: Diri, Banu
dc.contributor.author: Arslan, Halil
dc.contributor.author: Amasyalı, Mehmet Fatih
dc.date.accessioned: 2024-10-26T17:34:07Z
dc.date.available: 2024-10-26T17:34:07Z
dc.date.issued: 2020
dc.department: Sivas Cumhuriyet Üniversitesi
dc.description.abstract: In open-source software development environments, the textual, numerical, and relationship-based data generated are of interest to researchers. Various data sets are available for these data, which are frequently used in areas such as software engineering and natural language processing. However, since these data sets contain all the data in the environment, working with them means processing terabytes of data. For this reason, almost all studies using GitHub data work with data filtered according to certain criteria. In this context, the use of a different data set in each study makes it quite difficult to compare the accuracy of the studies. To solve this problem, a common dataset was created and shared with researchers, allowing work on many software engineering problems.
dc.identifier.doi: 10.17776/csj.728932
dc.identifier.endpage: 724
dc.identifier.issn: 2587-2680
dc.identifier.issn: 2587-246X
dc.identifier.issue: 3
dc.identifier.startpage: 720
dc.identifier.trdizinid: 456642
dc.identifier.uri: https://doi.org/10.17776/csj.728932
dc.identifier.uri: https://search.trdizin.gov.tr/tr/yayin/detay/456642
dc.identifier.uri: https://hdl.handle.net/20.500.12418/23542
dc.identifier.volume: 41
dc.indekslendigikaynak: TR-Dizin
dc.language.iso: en
dc.relation.ispartof: Cumhuriyet Science Journal
dc.relation.publicationcategory: Article - National Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.title: Summarising big data: public GitHub dataset for software engineering challenges
dc.type: Article
