Browsing by Author "Diri, Banu"
Showing 1 - 5 of 5
Item A novel alignment-free DNA sequence similarity analysis approach based on top-k n-gram match-up (Elsevier Science Inc, 2020) Delibas, Emre; Arslan, Ahmet; Seker, Abdulkadir; Diri, Banu
DNA sequence similarity analysis is an essential task in computational biology and bioinformatics. In nearly all research that explores evolutionary relationships, gene function analysis, protein structure prediction and sequence retrieving, it is necessary to perform similarity calculations. As an alternative to alignment-based sequence comparison methods, which result in high computational cost, alignment-free methods have emerged that calculate similarity by digitizing the sequence in a different space. In this paper, we proposed an alignment-free DNA sequence similarity analysis method based on top-k n-gram matches, with the prediction that common repeating DNA subsections indicate high similarity between DNA sequences. In our method, we determined DNA sequence similarities by measuring similarity among feature vectors created according to top-k n-gram match-up scores without the use of similarity functions. We applied the similarity calculation for three different DNA data sets of different lengths. The phylogenetic relationships revealed by our method show that our trees coincide almost completely with the results of the MEGA software, which is based on sequence alignment. Our findings show that a certain number of frequently recurring common sequence patterns have the power to characterize DNA sequences. (C) 2020 Elsevier Inc. All rights reserved.

Item DNA Sequence Compression within Traditional Text Compression Algorithms (IEEE, 2017) Seker, Abdulkadir; Delibas, Emre; Diri, Banu
This study aims to apply traditional text compression methods to the compression of DNA sequences. Random short repeats were found to be important, and their positive impact on compression was examined. A pipelined system with multiple algorithms running sequentially was used for compression. The contribution of each algorithm to the system was investigated, and in particular the effect of the BWT on compression was shown. The results show that the pipeline system could not match the compression performance of Huffman coding alone.

Item Open Source Software Development Challenges: A Systematic Literature Review on GitHub (IGI Global, 2021) Seker, Abdulkadir; Diri, Banu; Arslan, Halil; Amasyalı, Mehmet Fatih
GitHub is the most common code hosting and repository service for open-source software (OSS) projects. Thanks to its great variety of features, researchers use GitHub to address a wide range of OSS development challenges. In this context, the authors considered it important to conduct a literature review of studies that used GitHub data. To reach these studies, they based the review on a GitHub dataset source study instead of a keyword-based search in digital libraries. Since GHTorrent is the most widely known GitHub dataset according to the literature, they considered the studies that cite this dataset for the systematic literature review. In this study, they reviewed 172 selected studies that used the dataset as a data source, according to certain criteria. They classified them within the scope of OSS development challenges using information extracted from the metadata of the studies. They raise some issues about the dataset and present the most studied and attention-grabbing fields, along with open challenges that researchers are encouraged to study. © 2021 by IGI Global. All rights reserved.

Item Summarising big data: public GitHub dataset for software engineering challenges (2020) Şeker, Abdulkadir; Diri, Banu; Arslan, Halil; Amasyalı, Mehmet Fatih
The textual, numerical, and relationship-based data generated in open-source software development environments are of interest to researchers. Various data sets are available for this data, which is frequently used in areas such as software engineering and natural language processing. However, since these data sets contain all the data in the environment, working with them means processing terabytes of data. For this reason, almost all studies using GitHub data filter the data according to certain criteria. In this context, using a different data set in each study makes it quite difficult to compare the accuracy of the studies. To solve this problem, a common dataset was created and shared with researchers, allowing work on many software engineering problems.

Item Summarising big data: public GitHub dataset for software engineering challenges (Sivas Cumhuriyet University, 2020) Şeker, Abdulkadir; Diri, Banu; Arslan, Halil; Amasyalı, Fatih
The textual, numerical, and relationship-based data generated in open-source software development environments are of interest to researchers. Various data sets are available for this data, which is frequently used in areas such as software engineering and natural language processing. However, since these data sets contain all the data in the environment, working with them means processing terabytes of data. For this reason, almost all studies using GitHub data filter the data according to certain criteria. In this context, using a different data set in each study makes it quite difficult to compare the accuracy of the studies. To solve this problem, a common dataset was created and shared with researchers, allowing work on many software engineering problems.