Browsing by Author "Isik, Yunus Emre"
Showing 1 - 11 of 11

Item: A deep learning-based solution for digitization of invoice images with automatic invoice generation and labelling
(Springer Heidelberg, 2024) Arslan, Halil; Isik, Yunus Emre; Gormez, Yasin
Nowadays, the level of invoice traffic between companies has reached enormous levels. Invoices are crucial financial documents for companies, and they need to extract this information from these documents to access and control them quickly when necessary. While electronic invoices can be easily transferred to the company's ERP system with the help of integrators, information from printed invoices must be entered into the ERP system. Information entry is generally performed manually by company employees, so the probability of error is high. The automatic recognition of information in printed invoices will reduce the possibility of error. It will also save time and money by reducing workforce requirements. This study proposes a deep learning-based solution for detecting fields in image invoices that are in high demand among businesses. The system offers an end-to-end solution, which includes a novel method for generating synthetic invoices and automatic labeling. Three invoice templates were used to evaluate the usability of the system and an adaptive fine-tuning-based solution is proposed for newly coming invoice templates. Furthermore, 6 different object detection models were compared to find the most suitable one for our problem. The system was also tested with 1022 real invoice images that were manually labeled to test real-world usage. The results indicated that the fine-tuned model achieved an accuracy that was 8.4% higher than the baseline models. In tests performed on CPU, TOOD and Cascade-RCNN models were the most successful algorithms, while YOLOv5 was the fastest running algorithm. Depending on the priority of the needs, both algorithms can be preferred for real-time usage in the detection of invoice fields.
The synthetic invoice generation code is available at https://github.com/SCU-CENG/Invoice-Generation.

Item: A novel hybrid model for bluetooth low energy-based indoor localization using machine learning in the internet of things
(Pamukkale Univ, 2024) Gormez, Yasin; Arslan, Halil; Isik, Yunus Emre; Tomac, Sercan
Indoor localization involves pinpointing the location of an object in an interior space and has several applications, including navigation, asset tracking, and shift management. However, this technology has not yet been perfected, and many methods, such as triangulation, Kalman filters, and machine learning models have been proposed to address indoor localization problems. Unfortunately, these methods still have a large degree of error that makes them ill-suited for difficult cases in real-time. In this study, we propose a hybrid model for Bluetooth low energy-based indoor localization. In this model, the triangulation method is combined with several machine learning methods (naive Bayes, k-nearest neighbor, logistic regression, support vector machines, and artificial neural networks) that are optimized and tested in three different environments. In the experiment, the proposed model performed similarly to the solo triangulation model in easy and medium cases; however, the proposed model obtained a much smaller degree of error for hard cases than either solo triangulation or machine learning models alone.

Item: Comparative analysis of machine learning approaches for predicting respiratory virus infection and symptom severity
(Peerj Inc, 2023) Isik, Yunus Emre; Aydin, Zafer
Respiratory diseases are among the major health problems causing a burden on hospitals. Diagnosis of infection and rapid prediction of severity without time-consuming clinical tests could be beneficial in preventing the spread and progression of the disease, especially in countries where health systems remain incapable.
Personalized medicine studies involving statistics and computer technologies could help to address this need. In addition to individual studies, competitions are also held, such as the Dialogue for Reverse Engineering Assessment and Methods (DREAM) challenge, which is a community-driven organization with a mission to research biology, bioinformatics, and biomedicine. One of these competitions was the Respiratory Viral DREAM Challenge, which aimed to develop early predictive biomarkers for respiratory virus infections. These efforts are promising; however, the prediction performance of the computational methods developed for detecting respiratory diseases still has room for improvement. In this study, we focused on improving the performance of predicting the infection and symptom severity of individuals infected with various respiratory viruses using gene expression data collected before and after exposure. The publicly available gene expression dataset in the Gene Expression Omnibus, named GSE73072, containing samples exposed to four respiratory viruses (H1N1, H3N2, human rhinovirus (HRV), and respiratory syncytial virus (RSV)) was used as input data. Various preprocessing methods and machine learning algorithms were implemented and compared to achieve the best prediction performance. The experimental results showed that the proposed approaches obtained a prediction performance of 0.9746 area under the precision-recall curve (AUPRC) for infection (i.e., shedding) prediction (SC-1), 0.9182 AUPRC for symptom class prediction (SC-2), and 0.6733 Pearson correlation for symptom score prediction (SC-3), outperforming the best leaderboard scores of the Respiratory Viral DREAM Challenge (a 4.48% improvement for SC-1, a 13.68% improvement for SC-2, and a 13.98% improvement for SC-3).
Additionally, over-representation analysis (ORA), which is a statistical method for objectively determining whether certain genes are more prevalent in pre-defined sets such as pathways, was applied using the most significant genes selected by feature selection methods. The results show that pathways associated with the 'adaptive immune system' and 'immune disease' are strongly linked to pre-infection and symptom development. These findings contribute to our knowledge about predicting respiratory infections and are expected to facilitate the development of future studies that concentrate on predicting not only infections but also the associated symptoms.

Item: Comparison of Graph Based Document Summarization Method
(IEEE, 2017) Kaynar, Oguz; Gormez, Yasin; Isik, Yunus Emre; Demirkoparan, Ferhan
Today, with the development of the internet, documents containing information such as articles, news, and web pages are produced and stored in digital environments. In addition, the growing number of media where people can add new content, such as social media, Twitter, and blogs, has increased the amount of information on the internet to an enormous size. As a result, it is very difficult and time-consuming to determine whether the information being searched for has been reached. Automated document summarization systems can reduce the size of the text while keeping its important parts and quickly show whether the text contains the desired information. In this study, graph based document summarization methods are discussed. Besides the LexRank method, the TextRank algorithm is used with 4 different similarity methods. Unlike other studies, Longest Common Subsequence (LCS) is used as the measure of similarity between nodes in the TextRank algorithm. Among the similarity measurement methods used, LCS achieved the best results, with Rouge-1 and Rouge-2 scores of 0.510 and 0.266 on the English dataset.
Similarly, the same method yields Rouge-1 and Rouge-2 scores of 0.742 and 0.676 on the Turkish dataset, which are better than those of the other methods.

Item: Comparison of Machine Learning Classifiers for Protein Secondary Structure Prediction
(IEEE, 2018) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin; Isik, Yunus Emre
Three-dimensional structure prediction is one of the important problems in bioinformatics and theoretical chemistry. One of the most important steps in the three-dimensional structure prediction is the estimation of secondary structure. Due to rapidly growing databases and recent feature extraction methods, datasets used for predicting secondary structure can potentially contain a large number of samples and dimensions. For this reason, it is important to use algorithms that are fast and accurate. In this study, various classification algorithms have been optimized for the second phase of a two-stage classifier on the EVAset benchmark, both in the original input space and in the space reduced using the information gain metric. The most accurate classifier is the support vector machine, while the extreme learning machine is significantly faster in model training.

Item: Fabric Defect Detection with LBP-GLMC
(IEEE, 2017) Kaynar, Oguz; Isik, Yunus Emre; Gormez, Yasin; Demirkoparan, Ferhan
Fabric defect detection is vital for fabric quality. In the face of increasing fabric production, the fact that the detection of fabric faults by manpower is insufficient in terms of speed and quality has forced firms to work with automatic systems in this area. Until today, many methods have been developed to automatically detect fabric faults. The common purpose of many of these methods is to find defective parts in the fabric by making some changes in image processing techniques or by using machine learning methods.
In this study, data sets obtained by applying local binary pattern and gray level co-occurrence matrix feature extraction methods to the Tilda textile data are used to train artificial neural networks; two different models are created and their success rates are compared.

Item: Graph Based Automatic Document Summarization with Different Similarity Methods
(IEEE, 2017) Kaynar, Oguz; Isik, Yunus Emre; Gormez, Yasin
Today, with the rapid increase in the use of the internet, thousands of resources can be reached about a topic of interest. However, it is difficult and time consuming to determine which of these sources is useful. Automatic document summarization is a dimension reduction process which retains the important parts of the text. In this study, the TextRank algorithm, which is a graph based summarization approach, is used with 4 different similarity methods. The effect of these methods on the automatically generated summaries is examined. Among the similarity methods, the Levenshtein method was more successful than the others, with a Rouge-1 score of 0.506.

Item: The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behcet's Disease
(IEEE, 2018) Gormez, Yasin; Isik, Yunus Emre; Bakir-Gungor, Burcu
Behcet's disease is a long-term multisystem inflammatory disorder, characterized by recurrent attacks affecting several organs. As genotyping individuals has become cheaper and easier following the developments in genomic technologies, genome-wide association studies (GWAS) have emerged. In these studies, potential genetic variations, single nucleotide polymorphisms (SNPs), are identified by studying large case-control groups for a specific disease. Although several genetic risk factors have been identified for Behcet's disease with the help of these studies by scanning around a million SNPs, these variations could only explain up to 20% of the disease's genetic risk.
In this study, for Behcet's disease classification, all the SNPs genotyped in GWAS are compared with the SNPs selected using genetic knowledge, gain ratio, and information gain; the aim is both to reduce the feature size and to improve the classification accuracy. Also, the effects of different classification algorithms such as random forest, k-nearest neighbour, and logistic regression on the classification accuracy are investigated. Our results showed that compared to other feature selection methods, with at least 81% success rate, the selection of the SNPs using the genetic information (of their GWAS p-values, indicating the significance of the SNP against the disease) provides 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and resulted in the worst performance.

Item: Intrusion detection with autoencoder based deep learning machine
(IEEE, 2017) Kaynar, Oguz; Yuksek, Ahmet Gurkan; Gormez, Yasin; Isik, Yunus Emre
In the changing and constantly evolving information age, together with the developments in computer and internet technology, the production, digitization, storage, and sharing of information has become much easier than in the past. The sharing of information via computer networks and the Internet has made information security a vital issue for people, institutions, and organizations with critical data. Various information security policies have been established in order to protect critical data and prevent unauthorized access to this data. Intrusion detection systems, which are one of the indispensable elements of information security policies, constantly monitor the network and the system to detect possible unauthorized access and infiltrations.
So far, many machine learning methods such as artificial neural networks, support vector machines, and decision trees have been used in intrusion detection systems. In this study, differently from other studies, autoencoder based deep learning machines are proposed for intrusion detection. The KDDcup99 data set containing 22 attack types has been used in the study, and a success rate of 99.42% has been achieved.

Item: Mobil Application for Automatic Document Summarization Systems
(IEEE, 2018) Kaynar, Oguz; Isik, Yunus Emre; Gormez, Yasin; Kus, Emre
Today, with the rapid development of internet and technology, the amount of information in the internet environment is increasing exponentially. Although the increasing number of informative texts is beneficial in terms of diversity, it also brings problems such as missed research information and the inability to access information quickly. To solve this problem, computer-aided automatic systems that quickly summarize the subject and topic of documents can be useful. However, studies on automatic summarization in the literature are generally theoretical and aim at increasing summary success. Except for a few web applications, the number of applications that produce summaries is extremely low. In this study, an easy-to-use mobile application for automatic document summarization methods was developed. The summary of a document can be quickly and easily obtained at the desired time and place thanks to the application, thus user needs can be met. In addition, the same document can be summarized in different forms according to need, with 4 different summarization methods left to the user's preference. As a result of the experiments on a Turkish dataset, the application algorithms achieved a good Rouge-1 score of 0.59 on average.
In addition, the application can extract the system summary in an acceptable time, such as an average of 18 seconds.

Item: The Determination of Distinctive Single Nucleotide Polymorphism Sets for the Diagnosis of Behcet's Disease
(IEEE Computer Soc, 2022) Isik, Yunus Emre; Gormez, Yasin; Aydin, Zafer; Bakir-Gungor, Burcu
Behcet's Disease (BD) is a multi-system inflammatory disorder in which the etiology remains unclear. The most probable hypothesis is that genetic tendency and environmental factors play roles in the development of BD. In order to find the essential reasons, genetic changes on thousands of genes should be analyzed. Besides, there is a need for extra analysis to find out which genetic factor affects the disease. Machine learning approaches have high potential for extracting the knowledge from genomics and selecting the representative Single Nucleotide Polymorphisms (SNPs) as the most effective features for the clinical diagnosis process. In this study, we have attempted to identify representative SNPs using feature selection methods, incorporating biological information and aimed to develop a machine-learning model for diagnosing Behcet's disease. By combining biological information and machine learning classifiers, up to 99.64 percent accuracy of disease prediction is achieved using only 13,611 out of 311,459 SNPs. In addition, we revealed the SNPs that are most distinctive by performing repeated feature selection in cross-validation experiments.