DSpace Repository :: Browsing by Author "Gormez, Yasin"

Browse by Author "Gormez, Yasin"

Showing 1 - 20 of 29

A deep learning-based solution for digitization of invoice images with automatic invoice generation and labelling
(Springer Heidelberg, 2024) Arslan, Halil; Isik, Yunus Emre; Gormez, Yasin
Nowadays, the level of invoice traffic between companies has reached enormous levels. Invoices are crucial financial documents for companies, and they need to extract this information from these documents to access and control them quickly when necessary. While electronic invoices can be easily transferred to the company's ERP system with the help of integrators, information from printed invoices must be entered into the ERP system. Information entry is generally performed manually by company employees, so the probability of error is high. The automatic recognition of information in printed invoices will reduce the possibility of error. It will also save time and money by reducing workforce requirements. This study proposes a deep learning-based solution for detecting fields in image invoices that are in high demand among businesses. The system offers an end-to-end solution, which includes a novel method for generating synthetic invoices and automatic labeling. Three invoice templates were used to evaluate the usability of the system and an adaptive fine-tuning-based solution is proposed for newly coming invoice templates. Furthermore, 6 different object detection models were compared to find the most suitable one for our problem. The system was also tested with 1022 real invoice images that were manually labeled to test real-world usage. The results indicated that the fine-tuned model achieved an accuracy that was 8.4% higher than the baseline models. In tests performed on CPU, TOOD and Cascade-RCNN models were the most successful algorithms, while YOLOv5 was the fastest running algorithm. Depending on the priority of the needs, both algorithms can be preferred for real-time usage in the detection of invoice fields. The synthetic invoice generation code is available at https://github.com/SCU-CENG/Invoice-Generation.
A novel hybrid model for bluetooth low energy-based indoor localization using machine learning in the internet of things
(Pamukkale Univ, 2024) Gormez, Yasin; Arslan, Halil; Isik, Yunus Emre; Tomac, Sercan
Indoor localization involves pinpointing the location of an object in an interior space and has several applications, including navigation, asset tracking, and shift management. However, this technology has not yet been perfected, and many methods, such as triangulation, Kalman filters, and machine learning models, have been proposed to address indoor localization problems. Unfortunately, these methods still have a large degree of error that makes them ill-suited for difficult cases in real time. In this study, we propose a hybrid model for Bluetooth low energy-based indoor localization. In this model, the triangulation method is combined with several machine learning methods (naive Bayes, k-nearest neighbor, logistic regression, support vector machines, and artificial neural networks) that are optimized and tested in three different environments. In the experiment, the proposed model performed similarly to the solo triangulation model in easy and medium cases; however, the proposed model obtained a much smaller degree of error for hard cases than either solo triangulation or machine learning models alone.
Automatic Classification of Natural Stone Tiles with Computer Vision
(IEEE, 2018) Kaynar, Oguz; Torun, Yunis; Temiz, Mustafa; Gormez, Yasin
Classification in the natural stone industry has great importance for enterprises. There are return cases arising from the fact that the ordered granite batches are not the same as the sample batches agreed on at the beginning, which causes significant economic losses for the companies. There is a greater need to classify tiles using computer-aided image processing methods for the development of quality control processes, which have become increasingly important due to the rapidly increasing competition and globalization in the natural stone industry. In automatic systems of this type, the attributes that give information about color and surface are extracted from the images of natural stone tiles with image processing techniques, and then the dataset obtained by using these attributes is classified by various artificial intelligence and data mining techniques. In this study, a classification was made on a dataset consisting of 996 pictures of natural stone tiles from six categories obtained from a natural stone producer (Beta Mermer I. C.) operating in Sivas. Gray level co-occurrence matrix (GLCM) and local binary pattern (LBP) are used to obtain pattern information of granite tiles. Several statistics related to each color channel were used to obtain color information of granites. Various datasets are created using only pattern information and a combination of pattern and color information of tiles. Subsequently, the classification performance of these datasets is compared using several algorithms such as artificial neural networks, support vector machines, and naive Bayes.
Biomarker discovery and development of prognostic prediction model using metabolomic panel in breast cancer patients: a hybrid methodology integrating machine learning and explainable artificial intelligence
(Frontiers Media Sa, 2024) Yagin, Fatma Hilal; Gormez, Yasin; Al-Hashem, Fahaid; Ahmad, Irshad; Ahmad, Fuzail; Ardigo, Luca Paolo
Background: Breast cancer (BC) is a significant cause of morbidity and mortality in women. Although the important role of metabolism in the molecular pathogenesis of BC is known, there is still a need for robust metabolomic biomarkers and predictive models that will enable the detection and prognosis of BC. This study aims to identify targeted metabolomic biomarker candidates based on explainable artificial intelligence (XAI) for the specific detection of BC. Methods: Data obtained after targeted metabolomics analyses using plasma samples from BC patients (n = 102) and healthy controls (n = 99) were used. Machine learning (ML) models based on raw data were developed, then feature selection methods were applied, and the results were compared. SHapley Additive exPlanations (SHAP), an XAI method, was used to clinically explain the decisions of the optimal model in BC prediction. Results: The results revealed that variable selection increased the performance of ML models in BC classification, and the optimal model was obtained with the logistic regression (LR) classifier after support vector machine (SVM)-SHAP-based feature selection. SHAP annotations of the LR model revealed that Leucine, isoleucine, L-alloisoleucine, norleucine, and homoserine acids were the most important potential BC diagnostic biomarkers. Combining the identified metabolite markers provided robust BC classification measures with precision, recall, and specificity of 89.50%, 88.38%, and 83.67%, respectively. Conclusion: In conclusion, this study adds valuable information to the discovery of BC biomarkers and underscores the potential of targeted metabolomics-based diagnostic advances in the management of BC.
Comparison of Graph Based Document Summarization Method
(IEEE, 2017) Kaynar, Oguz; Gormez, Yasin; Isik, Yunus Emre; Demirkoparan, Ferhan
Today, with the development of the internet, documents containing information such as articles, news, and web pages are produced and stored in digital environments. The growing number of media where people can add new content, such as social media, Twitter, and blogs, has increased the amount of information on the internet to an enormous size. As a result, it is very difficult and time-consuming to determine whether the information being searched for has been reached. Automated document summarization systems can reduce the size of a text while keeping its important parts and quickly show whether the text contains the desired information. In this study, graph based document summarization methods are discussed. Besides the LexRank method, the TextRank algorithm is used with 4 different similarity methods. Unlike other studies, Longest Common Subsequence (LCS), a similarity measure method, is used as a measure of similarity between nodes in the TextRank algorithm. Among the similarity measurement methods used, the longest common subsequence measure achieved the best results, with Rouge-1 and Rouge-2 scores of 0.510 and 0.266 on the English dataset. Similarly, the same method yields Rouge-1 and Rouge-2 scores of 0.742 and 0.676 on the Turkish dataset, which are better than those of the other methods.
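The idea of using LCS as the edge weight in TextRank can be illustrated with a minimal sketch. This is not the authors' implementation: whitespace tokenization, normalizing the LCS length by the longer sentence, the damping factor of 0.85, and all function names are assumptions made only for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(s1, s2):
    """LCS length normalized by the longer sentence (an assumed normalization)."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    if not t1 or not t2:
        return 0.0
    return lcs_length(t1, t2) / max(len(t1), len(t2))


def textrank(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over a sentence graph whose edges are LCS similarities."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [
        [lcs_similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Swapping `lcs_similarity` for another pairwise function is enough to compare similarity measures under the same ranking procedure.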
Comparison of Machine Learning Classifiers for Protein Secondary Structure Prediction
(IEEE, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; Isik, Yunus Emre
Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of the most important steps in three-dimensional structure prediction is the estimation of secondary structure. Due to rapidly growing databases and recent feature extraction methods, datasets used for predicting secondary structure can potentially contain a large number of samples and dimensions. For this reason, it is important to use algorithms that are fast and accurate. In this study, various classification algorithms have been optimized for the second phase of a two-stage classifier on the EVAset benchmark, both in the original input space and in the space reduced using the information gain metric. The most accurate classifier is the support vector machine, while the extreme learning machine is significantly faster in model training.
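Reducing an input space with the information gain metric can be sketched as follows. This is an illustrative example only, assuming discrete feature values and a dictionary of named feature columns; the function names and the toy data are hypothetical and do not come from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(columns, labels, k):
    """Names of the k feature columns with the highest information gain."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy data: "H" and "E" stand in for two secondary structure classes.
labels = ["H", "H", "E", "E"]
columns = {
    "informative": [1, 1, 0, 0],  # separates the classes perfectly
    "noise": [0, 1, 0, 1],        # carries no information about the class
}
kept = select_top_k(columns, labels, k=1)
```

Continuous features such as profile scores would need to be discretized first, which this sketch leaves out.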
Comparison of NR and UniClust Databases for Protein Secondary Structure Prediction
(IEEE, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin
Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of the most important steps in the three-dimensional structure prediction is the estimation of secondary structure. Improving the accuracy rate in protein secondary structure prediction depends on computed attributes as well as the classification algorithms. In multiple alignment methods, which are often used to extract an attribute, the calculated values differ according to the database used for the alignment. For this reason, it is important to use a suitable database against which the target proteins are aligned to compute profile feature vectors. In this study, 5 different datasets are generated for the CB513 benchmark with the aid of two different alignment methods and three different databases. The profile features are fed as input to a two-stage hybrid classifier. According to the experimental results, the highest accuracy rate is obtained when UniClust database is used at the first stage of HHBlits alignment to calculate PSSM values and NR database is used at the first stage of HHBlits alignment to calculate structural profile matrices.
Customized deep learning based Turkish automatic speech recognition system supported by language model
(Peerj Inc, 2024) Gormez, Yasin
Background. In today's world, numerous applications integral to various facets of daily life include automatic speech recognition methods. Thus, the development of a successful automatic speech recognition system can significantly augment the convenience of people's daily routines. While many automatic speech recognition systems have been established for widely spoken languages like English, there has been insufficient progress in developing such systems for less common languages such as Turkish. Moreover, due to its agglutinative structure, designing a speech recognition system for Turkish presents greater challenges compared to other language groups. Therefore, our study focused on proposing deep learning models for automatic speech recognition in Turkish, complemented by the integration of a language model. Methods. In our study, deep learning models were formulated by incorporating convolutional neural networks, gated recurrent units, long short-term memories, and transformer layers. The Zemberek library was employed to craft the language model to improve system performance. Furthermore, the Bayesian optimization method was applied to fine-tune the hyper-parameters of the deep learning models. To evaluate the model's performance, standard metrics widely used in automatic speech recognition systems, specifically word error rate and character error rate scores, were employed. Results. With optimal hyper-parameters applied to the models developed with various layers, and without a language model, the Turkish Microphone Speech Corpus dataset yields a word error rate of 22.2 and a character error rate of 14.05, while the Turkish Speech Corpus dataset yields a word error rate of 11.5 and a character error rate of 4.15. Upon incorporating the language model, notable improvements were observed. Specifically, for the Turkish Microphone Speech Corpus dataset, the word error rate score decreased to 9.85, and the character error rate score lowered to 5.35. Similarly, the word error rate score improved to 8.4, and the character error rate score decreased to 2.7 for the Turkish Speech Corpus dataset. These results demonstrate that our model outperforms the studies found in the existing literature.
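Word error rate and character error rate are conventionally computed from the edit distance between a reference transcript and the recognizer output. The sketch below shows that standard definition, assuming the scores are expressed as percentages and that words are split on whitespace; the function names and sample sentences are illustrative and are not taken from the study.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, insertions, and deletions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, relative to the reference length in words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, relative to the reference length in characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# One dropped word out of four reference words.
example_wer = wer("bugün hava çok güzel", "bugün hava güzel")
# One substituted character out of five reference characters.
example_cer = cer("kitap", "kitab")
```

Because both rates are normalized by the reference length, a hypothesis with many insertions can score above 100.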
Detection and Classification of Closed Angle Glaucoma Using Optical Coherence Tomography Images
(Institute of Electrical and Electronics Engineers Inc., 2023) Teke, Fatih; Kaynar, Oguz; Gormez, Yasin
Glaucoma is one of the 3 most important optic nerve diseases that cause vision loss in the world. There are 4 types of glaucoma that develop due to the destruction of the optic nerve, and one of them is closed-angle glaucoma. Closed-angle glaucoma causes an increase in intraocular pressure with the obstruction of drainage channels due to age and triggers glaucoma. In this study, disease classification was made using anterior segment optical coherence tomography (AS-OCT) images of closed-angle glaucoma samples. A total of 1200 AS-OCT images were trained with convolutional networks for classification. It supports the use of peripapillary OCT images for the early diagnosis of glaucoma, with a test accuracy of 97.5%, which gives a very good result in peripapillary layer maps of glaucoma. With the developed method, AS-OCT images are aimed to help doctors in the detection and diagnosis of glaucoma. © 2023 IEEE.
Dimensionality reduction for protein secondary structure and solvent accesibility prediction
(IMPERIAL COLLEGE PRESS, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin
Secondary structure and solvent accessibility prediction provide valuable information for estimating the three dimensional structure of a protein. As new feature extraction methods are developed the dimensionality of the input feature space increases steadily. Reducing the number of dimensions provides several advantages such as faster model training, faster prediction and noise elimination. In this work, several dimensionality reduction techniques have been employed including various feature selection methods, autoencoders and PCA for protein secondary structure and solvent accessibility prediction. The reduced feature set is used to train a support vector machine at the second stage of a hybrid classifier. Cross-validation experiments on two difficult benchmarks demonstrate that the dimension of the input space can be reduced substantially while maintaining the prediction accuracy. This will enable the incorporation of additional informative features derived for predicting the structural properties of proteins without reducing the accuracy due to overfitting.
Efficient and Scalable Broker Design for the Internet of Things Environments
(IEEE, 2020) Gormez, Yasin; Arslan, Halil; Kelek, Omer Faruk
In line with recent requirements, many institutions and organizations have started to need IoT devices, and the number of IoT devices in use has increased accordingly. These devices can collect huge amounts of data from a wide range of sensors. This increase in the amount of data raises the problem of how the data should be collected and processed. Relational databases and standard methods are insufficient for storing and processing such huge volumes of data. Therefore, in this study, a broker is proposed that uses a NoSQL database to store data, indexes short-term data with Elasticsearch to increase instantaneous processing speed, can run as multiple copies thanks to virtualization, and provides a powerful user interface thanks to Kibana. The proposed broker has been tested at one of our airports on Bluetooth low energy data and was able to transmit a maximum of 68,000 data per second with the determined server.
Estimation of Obesity Levels through the Proposed Predictive Approach Based on Physical Activity and Nutritional Habits
(Mdpi, 2023) Gozukara Bag, Harika Gozde; Yagin, Fatma Hilal; Gormez, Yasin; Gonzalez, Pablo Prieto; Colak, Cemil; Gulu, Mehmet; Badicu, Georgian
Obesity is the excessive accumulation of adipose tissue in the body that leads to health risks. The study aimed to classify obesity levels using a tree-based machine-learning approach considering physical activity and nutritional habits. Methods: The current study employed an observational design, collecting data from a public dataset via a web-based survey to assess eating habits and physical activity levels. The data included gender, age, height, weight, family history of being overweight, dietary patterns, physical activity frequency, and more. Data preprocessing involved addressing class imbalance using Synthetic Minority Over-sampling TEchnique-Nominal Continuous (SMOTE-NC) and feature selection using Recursive Feature Elimination (RFE). Three classification algorithms (logistic regression (LR), random forest (RF), and Extreme Gradient Boosting (XGBoost)) were used for obesity level prediction, and Bayesian optimization was employed for hyperparameter tuning. The performance of different models was evaluated using metrics such as accuracy, recall, precision, F1-score, area under the curve (AUC), and precision-recall curve. The LR model showed the best performance across most metrics, followed by RF and XGBoost. Feature selection improved the performance of LR and RF models, while XGBoost's performance was mixed. The study contributes to the understanding of obesity classification using machine-learning techniques based on physical activity and nutritional habits. The LR model demonstrated the most robust performance, and feature selection was shown to enhance model efficiency. The findings underscore the importance of considering both physical activity and nutritional habits in addressing the obesity epidemic.
Estimation of Obesity Levels with a Trained Neural Network Approach optimized by the Bayesian Technique
(Mdpi, 2023) Yagin, Fatma Hilal; Gulu, Mehmet; Gormez, Yasin; Castaneda-Babarro, Arkaitz; Colak, Cemil; Greco, Gianpiero; Fischetti, Francesco
Background: Obesity, which causes physical and mental problems, is a global health problem with serious consequences. The prevalence of obesity is increasing steadily, and therefore, new research is needed that examines the influencing factors of obesity and how to predict the occurrence of the condition according to these factors. This study aimed to predict the level of obesity based on physical activity and eating habits using the trained neural network model. Methods: The chi-square, F-Classify, and mutual information classification algorithms were used to identify the most critical factors associated with obesity. The models' performances were compared using a trained neural network with different feature sets. The hyperparameters of the models were optimized using Bayesian optimization techniques, which are faster and more effective than traditional techniques. Results: The results predicted the level of obesity with average accuracies of 93.06%, 89.04%, 90.32%, and 86.52% for all features using the neural network and for the features selected by the chi-square, F-Classify, and mutual information classification algorithms. The results showed that physical activity, alcohol consumption, use of technological devices, frequent consumption of high-calorie meals, and frequency of vegetable consumption were the most important factors affecting obesity. Conclusions: The F-Classify score algorithm identified the most essential features for obesity level estimation. Furthermore, physical activity and eating habits were the most critical factors for obesity prediction.
Explainable Artificial Intelligence Paves the Way in Precision Diagnostics and Biomarker Discovery for the Subclass of Diabetic Retinopathy in Type 2 Diabetics
(Mdpi, 2023) Yagin, Fatma Hilal; Yasar, Seyma; Gormez, Yasin; Yagin, Burak; Pinar, Abdulvahap; Alkhateeb, Abedalrhman; Ardigo, Luca Paolo
Diabetic retinopathy (DR), a common ocular microvascular complication of diabetes, contributes significantly to diabetes-related vision loss. This study addresses the imperative need for early diagnosis of DR and precise treatment strategies based on the explainable artificial intelligence (XAI) framework. The study integrated clinical, biochemical, and metabolomic biomarkers associated with the following classes: non-DR (NDR), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR) in type 2 diabetes (T2D) patients. To create machine learning (ML) models, 10% of the data was divided into validation sets and 90% into discovery sets. The validation dataset was used for hyperparameter optimization and feature selection stages, while the discovery dataset was used to measure the performance of the models. A 10-fold cross-validation technique was used to evaluate the performance of ML models. Biomarker discovery was performed using minimum redundancy maximum relevance (mRMR), Boruta, and explainable boosting machine (EBM). The predictive proposed framework compares the results of eXtreme Gradient Boosting (XGBoost), natural gradient boosting for probabilistic prediction (NGBoost), and EBM models in determining the DR subclass. The hyperparameters of the models were optimized using Bayesian optimization. Combining EBM feature selection with XGBoost, the optimal model achieved (91.25 +/- 1.88) % accuracy, (89.33 +/- 1.80) % precision, (91.24 +/- 1.67) % recall, (89.37 +/- 1.52) % F1-Score, and (97.00 +/- 0.25) % the area under the ROC curve (AUROC). According to the EBM explanation, the six most important biomarkers in determining the course of DR were tryptophan (Trp), phosphatidylcholine diacyl C42:2 (PC.aa.C42.2), butyrylcarnitine (C4), tyrosine (Tyr), hexadecanoyl carnitine (C16) and total dimethylarginine (DMA). 
The identified biomarkers may provide a better understanding of the progression of DR, paving the way for more precise and cost-effective diagnostic and treatment strategies.
Fabric Defect Detection with LBP-GLMC
(IEEE, 2017) Kaynar, Oguz; Isik, Yunus Emre; Gormez, Yasin; Demirkoparan, Ferhan
Fabric defect detection is vital for fabric quality. In the face of increasing fabric production, manual detection of fabric faults is insufficient in terms of speed and quality, which has forced firms to work with automatic systems in this area. To date, many methods have been developed to automatically detect fabric faults. The common purpose of many of these methods is to find defective parts in the fabric by modifying image processing techniques or using machine learning methods. In this study, datasets obtained by applying local binary pattern and gray level co-occurrence matrix feature extraction methods to the Tilda textile data are used to train artificial neural networks; two different models are created and their success rates are compared.
Feature Selection for Protein Dihedral Angle Prediction
(IEEE, 2017) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; Koyuncu, B; Tomar, GS
Three-dimensional structure prediction has crucial importance for bioinformatics and theoretical chemistry. One of the main steps of three-dimensional structure prediction is dihedral (torsion) angle prediction. As new feature extraction methods are developed, the dimension of the input space increases considerably, yielding longer model training and less accurate models due to noisy or redundant features. In this study, feature selection is employed for dimensionality reduction on one of the established benchmarks of protein 1D structure prediction. Experimental results show that feature selection improves the accuracy of protein dihedral angle class prediction by 2% and can eliminate up to 82% of the features when a random forest classifier is used. Accurate prediction of dihedral angles will eventually contribute to protein structure prediction.
Feature Selection Methods in Sentiment Analysis
(IEEE, 2017) Kaynar, Oguz; Arslan, Halil; Gormez, Yasin; Demirkoparan, Ferhan
In today's technology, people are starting to share their opinions, ideas and feelings through many mediums because the internet is used extensively by every segment. These shares have become an important source of work on sentiment analysis and have led to increased work on this field. The sentiment analysis is simply to determine whether the emotion is included or not, and to determine whether the emotion is positive, negative, or neutral. In this study, chi-square, information gain, gain ratio, gini coefficient, oneR and reliefF methods are applied on the data sets according to the contents of movie comments and the obtained data sets are classified by Support Vector Machines (SVM). As a result of the application, it has been observed that the feature selection methods improve the results of sentiment analysis.
Graph Based Automatic Document Summarization with Different Similarity Methods
(IEEE, 2017) Kaynar, Oguz; Isik, Yunus Emre; Gormez, Yasin
Today, with the rapid increase in the use of the internet, thousands of resources can be reached on any topic of interest. However, it is difficult and time-consuming to determine which of these sources are useful. Automatic document summarization is a dimension reduction process that retains the important parts of the text. In this study, the TextRank algorithm, which is a graph based summarization approach, is used with 4 different similarity methods. The effect of these methods on the automatically generated summaries is examined. Among the similarity methods, the Levenshtein method was more successful than the others, with a Rouge-1 score of 0.506.
Hybrid Explainable Artificial Intelligence Models for Targeted Metabolomics Analysis of Diabetic Retinopathy
(Mdpi, 2024) Yagin, Fatma Hilal; Colak, Cemil; Algarni, Abdulmohsen; Gormez, Yasin; Guldogan, Emek; Ardigo, Luca Paolo
Background: Diabetic retinopathy (DR) is a prevalent microvascular complication of diabetes mellitus, and early detection is crucial for effective management. Metabolomics profiling has emerged as a promising approach for identifying potential biomarkers associated with DR progression. This study aimed to develop a hybrid explainable artificial intelligence (XAI) model for targeted metabolomics analysis of patients with DR, utilizing a focused approach to identify specific metabolites exhibiting varying concentrations among individuals without DR (NDR), those with non-proliferative DR (NPDR), and individuals with proliferative DR (PDR) who have type 2 diabetes mellitus (T2DM). Methods: A total of 317 T2DM patients, including 143 NDR, 123 NPDR, and 51 PDR cases, were included in the study. Serum samples underwent targeted metabolomics analysis using liquid chromatography and mass spectrometry. Several machine learning models, including Support Vector Machines (SVC), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), and Multilayer Perceptrons (MLP), were implemented as solo models and in a two-stage ensemble hybrid approach. The models were trained and validated using 10-fold cross-validation. SHapley Additive exPlanations (SHAP) were employed to interpret the contributions of each feature to the model predictions. Statistical analyses were conducted using the Shapiro-Wilk test for normality, the Kruskal-Wallis H test for group differences, and the Mann-Whitney U test with Bonferroni correction for post-hoc comparisons. Results: The hybrid SVC + MLP model achieved the highest performance, with an accuracy of 89.58%, a precision of 87.18%, an F1-score of 88.20%, and an F-beta score of 87.55%. 
SHAP analysis revealed that glucose, glycine, and age were consistently important features across all DR classes, while creatinine and various phosphatidylcholines exhibited higher importance in the PDR class, suggesting their potential as biomarkers for severe DR. Conclusion: The hybrid XAI models, particularly the SVC + MLP ensemble, demonstrated superior performance in predicting DR progression compared to solo models. The application of SHAP facilitates the interpretation of feature importance, providing valuable insights into the metabolic and physiological markers associated with different stages of DR. These findings highlight the potential of hybrid XAI models combined with explainable techniques for early detection, targeted interventions, and personalized treatment strategies in DR management.
The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behcet's Disease
(IEEE, 2018) Gormez, Yasin; Isik, Yunus Emre; Bakir-Gungor, Burcu
Behcet's disease is a long-term multisystem inflammatory disorder, characterized by recurrent attacks affecting several organs. As genotyping individuals has become cheaper and easier following the developments in genomic technologies, genome-wide association studies (GWAS) have emerged. By this means, via studying large case-control groups for a specific disease, potential genetic variations, namely single nucleotide polymorphisms (SNPs), are identified. Although several genetic risk factors have been identified for Behcet's disease with the help of these studies via scanning around a million SNPs, these variations could only explain up to 20% of the disease's genetic risk. In this study, for Behcet's disease classification, all the SNPs genotyped in GWAS are compared with the SNPs selected using genetic knowledge, gain ratio, and information gain, with the aim of both reducing the feature size and improving the classification accuracy. Also, using different classification algorithms such as random forest, k-nearest neighbour, and logistic regression, their effects on the classification accuracy are investigated. Our results showed that, compared to other feature selection methods, with at least an 81% success rate, the selection of the SNPs using the genetic information (of their GWAS p-values, indicating the significance of the SNP against the disease) provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and result in the worst performance.
