Department of Computer Engineering Article Collection

Permanent URI for this collection

https://hdl.handle.net/20.500.12418/589

Showing 1 - 20 of 22

Effects of Neighborhood-based Collaborative Filtering Parameters on Their Blockbuster Bias Performances
(August 2022) Yalcin, Emre
Collaborative filtering algorithms are efficient tools for providing recommendations with reasonable accuracy performances to individuals. However, previous research has found that these algorithms are undesirably biased towards blockbuster items, i.e., both popular and highly-liked items, in their recommendations, resulting in recommendation lists dominated by such blockbuster items. As one of the most prominent types of collaborative filtering approaches, neighborhood-based algorithms aim to produce recommendations based on neighborhoods constructed from similarities between users or items. Therefore, the utilized similarity function and the size of the neighborhoods are critical parameters for their recommendation performances. This study considers three well-known similarity functions, i.e., Pearson, Cosine, and Mean Squared Difference, and varying neighborhood sizes and observes how they affect the algorithms’ blockbuster bias and accuracy performances. The extensive experiments conducted on two benchmark data collections conclude that as the size of neighborhoods decreases, these algorithms generally become more vulnerable to blockbuster bias while their accuracy increases. The experiments also show that the Cosine metric is superior to the other similarity functions in producing recommendations where blockbuster bias is treated more; however, it leads to less accurate recommendations, as these are usually conflicting goals.
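The three similarity functions named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: users are represented as hypothetical dicts mapping item ids to ratings, similarities are computed over co-rated items only, and the Mean Squared Difference is turned into a similarity as 1 / (MSD + 1), which is one common convention.

```python
import math


def _common(u, v):
    """Items rated by both users (each user is a dict: item -> rating)."""
    return sorted(set(u) & set(v))


def cosine_sim(u, v):
    """Cosine similarity over co-rated items."""
    items = _common(u, v)
    if not items:
        return 0.0
    dot = sum(u[i] * v[i] for i in items)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in items))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in items))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson_sim(u, v):
    """Pearson correlation over co-rated items (means taken on those items)."""
    items = _common(u, v)
    if len(items) < 2:
        return 0.0
    mean_u = sum(u[i] for i in items) / len(items)
    mean_v = sum(v[i] for i in items) / len(items)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in items)
    dev_u = math.sqrt(sum((u[i] - mean_u) ** 2 for i in items))
    dev_v = math.sqrt(sum((v[i] - mean_v) ** 2 for i in items))
    return num / (dev_u * dev_v) if dev_u and dev_v else 0.0


def msd_sim(u, v):
    """Similarity derived from the Mean Squared Difference (assumed form: 1 / (MSD + 1))."""
    items = _common(u, v)
    if not items:
        return 0.0
    msd = sum((u[i] - v[i]) ** 2 for i in items) / len(items)
    return 1.0 / (msd + 1.0)


# Hypothetical toy profiles, for illustration only.
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3, "d": 5}
```

In a neighborhood-based recommender, one of these functions would rank candidate neighbors, and the neighborhood size would cap how many of the top-ranked neighbors contribute to a prediction.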
Determination of Photonuclear Reaction Cross-Sections on Stable P-shell Nuclei by Using Deep Neural Networks
(Springer, 13 May 2023) Akkoyun, Serkan; Kaya, Hüseyin; Şeker, Abdulkadir; Yeşilyurt, Saliha
Photonuclear reactions are widely used in investigations of nuclear structure. Thus, the determination of the cross-sections is essential for experimental studies. In the present work, (γ, n) photonuclear reaction cross-sections for stable p-shell nuclei have been estimated by using the neural network method. The main purpose of this study is to find neural network structures that give the best estimations for the cross-sections, and to compare them with the available data. These comparisons indicate the deep neural network structures that are convenient for this task. Through this procedure, we have found that, in shallow NN models, the tanh activation function performs better than ReLU. However, as our models become deeper, the difference between tanh and ReLU decreases considerably. In this context, we think that the crucial hyperparameters are the size of the hidden layer and neuron numbers of each layer.
Zero-shot learning via self-organizing maps
(Springer, 25.01.2023) İsmailoğlu, Fırat
Collecting labeled images from all possible classes related to the task at hand is highly impractical and may even be impossible. At this point, Zero-Shot Learning (ZSL) can enable the classification of new test classes for which there are no labeled images for training. The vast majority of existing ZSL methods aim to learn a projection from the feature space into the semantic space, where all classes are represented by a list of semantic attributes. To this end, they usually try to solve a complex optimization problem. Nevertheless, the semantic features (attributes) may not be suitable to represent the images because they are derived based on human knowledge and are, therefore, abstract. Alternatively, in this study, we introduce a novel ZSL method called SOMZSL, which has its roots in Self-Organizing Maps (SOM), a famous data visualization method. In particular, SOMZSL builds two SOMs of the same size and shape, one for the feature space and one for the attribute space, and then establishes a correspondence between them. Instead of considering a direct projection between the feature space and the attribute space, which are inherently different, SOMZSL connects them through comparable intermediate layers, i.e., SOMs. In terms of performance, SOMZSL can classify novel test classes as well as or even better than existing ZSL methods without dealing with a complex optimization problem, thanks to the heuristic nature of SOM on which it is based. Finally, SOMZSL uses unlabeled test images in the construction of SOMs and can thus mitigate the domain shift problem inherent in ZSL.
LVQ Treatment for Zero-Shot Learning
(Tubitak Academic Journals, 23.01.2023) İsmailoğlu, Fırat
In image classification, there are no labeled training instances for some classes, which are therefore called unseen classes or test classes. To classify these classes, zero-shot learning (ZSL) was developed, which typically attempts to learn a mapping from the (visual) feature space to the semantic space in which the classes are represented by a list of semantically meaningful attributes. However, the fact that this mapping is learned without using instances of the test classes affects the performance of ZSL, which is known as the domain shift problem. In this study, we propose to apply the learning vector quantization (LVQ) algorithm in the semantic space once the mapping is determined. First and foremost, this allows us to refine the prototypes of the test classes with respect to the learned mapping, which reduces the effects of the domain shift problem. Secondly, the LVQ algorithm increases the margin of the 1-NN classifier used in ZSL, resulting in better classification. Moreover, for this work, we consider a range of LVQ algorithms, from initial to advanced variants, and applied them to a number of state-of-the-art ZSL methods, then obtained their LVQ extensions. The experiments based on five ZSL benchmark datasets showed that the LVQ-empowered extensions of the ZSL methods are superior to their original counterparts in almost all settings.
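For readers unfamiliar with LVQ, the basic LVQ1 prototype update can be sketched as below. This is a generic textbook-style sketch, not the variants or the procedure evaluated in the paper: the class names, the two-dimensional vectors, the learning rate, and the availability of a label for each sample (for instance a pseudo-label) are all assumptions made for illustration.

```python
def nearest(prototypes, x):
    """Return the class whose prototype is closest to x (squared Euclidean distance)."""
    return min(
        prototypes,
        key=lambda c: sum((p - xi) ** 2 for p, xi in zip(prototypes[c], x)),
    )


def lvq1_step(prototypes, x, label, lr=0.1):
    """One LVQ1 update: pull the winning prototype toward x if its class
    matches the (assumed) label, otherwise push it away. Updates in place."""
    winner = nearest(prototypes, x)
    sign = 1.0 if winner == label else -1.0
    prototypes[winner] = [
        p + sign * lr * (xi - p) for p, xi in zip(prototypes[winner], x)
    ]
    return winner


# Hypothetical class prototypes in a 2-D semantic space.
prototypes = {"zebra": [0.0, 0.0], "horse": [1.0, 1.0]}
first = lvq1_step(prototypes, [0.2, 0.0], "zebra")   # correct winner: pulled closer
second = lvq1_step(prototypes, [0.9, 1.0], "zebra")  # wrong winner: pushed away
```

Classification then remains a 1-NN lookup via `nearest`, which is where refined prototypes could change the decision boundary.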
A Novel Contour Tracing Algorithm for Object Shape Reconstruction Using Parametric Curves
(Tech Science Press, 2023) Gürkahraman, Kali
Parametric curves such as Bézier and B-splines, originally developed for the design of automobile bodies, are now also used in image processing and computer vision. For example, reconstructing an object shape in an image, including different translations, scales, and orientations, can be performed using these parametric curves. For this, Bézier and B-spline curves can be generated using a point set that belongs to the outer boundary of the object. The resulting object shape can be used in computer vision fields, such as searching and segmentation methods and training machine learning algorithms. The prerequisite for reconstructing the shape with parametric curves is to obtain sequentially the points in the point set. In this study, a novel algorithm has been developed that sequentially obtains the pixel locations constituting the outer boundary of the object. The proposed algorithm, unlike the methods in the literature, is implemented using a filter containing weights and an outer circle surrounding the object. In a binary format image, the starting point of the tracing is determined using the outer circle, and the next tracing movement and the pixel to be labeled as the boundary point is found by the filter weights. Then, control points that define the curve shape are selected by reducing the number of sequential points. Thus, the Bézier and B-spline curve equations describing the shape are obtained using these points. In addition, different translations, scales, and rotations of the object shape are easily provided by changing the positions of the control points. It has also been shown that the missing part of the object can be completed thanks to the parametric curves.
A survey of smart home energy conservation techniques
(Elsevier, March 2023) Fakhar, Muhemmed Zaman; Yalcin, Emre; Bilge, Alper
Smart homes are equipped with easy-to-interact interfaces, providing a more comfortable living environment and less energy consumption. There are currently satisfactory approaches proposed to deliver adequate comfort and ease to smart home inhabitants through infrared sensors, motion sensors, and other similar technologies. However, the goal of reducing energy consumption is always a significant concern for smart home stakeholders. A detailed discussion about energy management techniques might open new leads for advanced research and even introduce more ways to improve existing methods, since a summary of effective energy conservation techniques is helpful to get a quick overview of the state-of-the-art techniques. This review study aims to provide an overview of previously proposed techniques for energy conservation and energy-saving recommendations. We identify various critical features in energy conservation techniques, i.e., user energy profiling, appliance energy profiling, and off-peak load scheduling, to perform a comparative analysis among different techniques. Then, we explain various energy conservation techniques, describe common and rare evaluation metrics, identify several techniques for realizing synthetic smart home energy consumption datasets, and provide a statistical analysis of the existing literature. The survey finally points out possible research directions which might lead to new inquiries in energy conservation research.
Popularity bias in personality perspective: An analysis of how personality traits expose individuals to the unfair recommendation
(Wiley, February 2023) Yalcin, Emre; Bilge, Alper
Recommender systems are subject to well-known popularity bias issues, that is, they expose frequently rated items more in recommendation lists than less-rated ones. Such a problem could also have varying effects on users with different gender, age, or rating behavior, which significantly diminishes the users' overall satisfaction with recommendations. In this paper, we approach the problem from the view of user personalities for the first time and discover how users are inclined toward popular items based on their personality traits. More importantly, we analyze the potential unfairness concerns for users with different personalities, which the popularity bias of the recommenders might cause. To this end, we split users into groups of high, moderate, and low clusters in terms of each personality trait in the big-five factor model and investigate how the popularity bias impacts such groups differently by considering several criteria. The experiments conducted with 10 well-known algorithms of different kinds have concluded that less-extroverted people and users avoiding new experiences are exposed to more unfair recommendations regarding popularity, despite being the most significant contributors to the system. However, discrepancies in other qualities of the recommendations for these user characteristics, such as accuracy, diversity, and novelty, vary depending on the utilized algorithm.
A novel classification-based shilling attack detection approach for multi-criteria recommender systems
(Wiley, May 2023) Turkoglu Kaya, Tugba; Yalcin, Emre; Kaleli, Cihan
Recommender systems are emerging techniques guiding individuals with provided referrals by considering their past rating behaviors. By collecting multi-criteria preferences concentrating on distinguishing perspectives of the items, a new extension of traditional recommenders, multi-criteria recommender systems reveal how much a user likes an item and why the user likes it; thus, they can improve predictive accuracy. However, these systems might be more vulnerable to malicious attacks than traditional ones, as they expose multiple dimensions of user opinions on items. Attackers might try to inject fake profiles into these systems to skew the recommendation results in favor of some particular items or to bring the system into discredit. Although several methods exist to defend systems against such attacks for traditional recommenders, achieving robust systems by capturing shill profiles remains elusive for multi-criteria rating-based ones. Therefore, in this study, we first consider a prominent and novel attack type, that is, the power-item attack model, and introduce its four distinct variants adapted for multi-criteria data collections. Then, we propose a classification method detecting shill profiles based on various generic and model-based user attributes, most of which are new features usually related to item popularity and distribution of rating values. The experiments conducted on three benchmark datasets conclude that the proposed method successfully detects attack profiles from genuine users even with a small selected size and attack size. The empirical outcomes also demonstrate that item popularity and user characteristics based on their rating profiles are highly beneficial features in capturing shilling attack profiles.
Robustness of privacy-preserving collaborative recommenders against popularity bias problem
(PeerJ, July 2023) Gulsoy, Mert; Yalcin, Emre; Bilge, Alper
Recommender systems have become increasingly important in today’s digital age, but they are not without their challenges. One of the most significant challenges is that users are not always willing to share their preferences due to privacy concerns, yet they still require decent recommendations. Privacy-preserving collaborative recommenders remedy such concerns by letting users set their privacy preferences before submitting to the recommendation provider. Another recently discussed challenge is the problem of popularity bias, where the system tends to recommend popular items more often than less popular ones, limiting the diversity of recommendations and preventing users from discovering new and interesting items. In this article, we comprehensively analyze the randomized perturbation-based data disguising procedure of privacy-preserving collaborative recommender algorithms against the popularity bias problem. For this purpose, we construct user personas of varying privacy protection levels and scrutinize the performance of ten recommendation algorithms on these user personas regarding the accuracy and beyond-accuracy perspectives. We also investigate how well-known popularity-debiasing strategies combat the issue in privacy-preserving environments. In experiments, we employ three well-known real-world datasets. The key findings of our analysis reveal that privacy-sensitive users receive unbiased and fairer recommendations that are qualified in diversity, novelty, and catalogue coverage perspectives in exchange for tolerable sacrifice from accuracy. Also, prominent popularity-debiasing strategies fall considerably short as provided privacy level improves.
Aggregating user preferences in group recommender systems: A crowdsourcing approach
(Elsevier, 2022) Ismailoglu, Firat
We argue that group recommendation is similar to crowdsourcing, where the responses of different crowd workers are aggregated in the absence of ground truth. With this in mind, we mimic the use of the EM algorithm as in crowdsourcing to aggregate the preferences of group members to estimate group ratings and the expertise levels of the group members. Moreover, for the first time in the literature, we cast the problem of estimating group rating as an ordinal classification problem relying on the natural ordering between the ratings, which allows us to define the expertise levels of the members in terms of sensitivity and specificity. In fact, we impose priors on the sensitivity and the specificity scores corresponding to the members, taking a Bayesian approach. We validate the effectiveness of the proposed aggregation method using the CAMRa2011 dataset, which consists of small and established groups, and the MovieLens dataset, which consists of large and random groups.
End to End Invoice Processing Application Based on Key Fields Extraction
(IEEE, 2022) Arslan, Halil
In this paper, an automatic invoice processing system, which is in great demand among private and public companies, is proposed. The proposed system supports all invoice file types that can be submitted by companies. Companies can easily submit invoices to the system via the web interface or email, and all invoices submitted to the system are queued and processed sequentially. If the invoice is a text file, the invoice information is extracted from the text by using template matching. If the invoice is an image, the text and table areas are detected and extracted. For table detection, we used both an image-processing-based method and a YOLOv5-based deep learning method. Cell extraction was then performed on the extracted table images. As a result of these processes, all text and table cells were obtained as images and these images were converted into machine-readable text using the open-source software Tesseract OCR. Tesseract already provides trained models for English and Turkish. However, these models do not provide successful results for invoices submitted by companies in Turkish. Therefore, a new fine-tuned model trained with invoices in Turkish was used for OCR. The experimental results showed that the trained Turkish model was more accurate than the Turkish and English models provided by Tesseract. In addition, the YOLOv5-based table detection model was more accurate than the image-processing-based table detection method.
Exploring potential biases towards blockbuster items in ranking-based recommendations
(Springer, 2022) Yalcin, Emre
Popularity bias is defined as the intrinsic tendency of recommendation algorithms to feature popular items more than unpopular ones in the ranked lists they produce. When investigating the adverse effects of popularity bias, the literature has usually focused on the most frequently rated items only. However, an item's popularity does not always indicate that it is highly-liked by individuals; in fact, the degree of liking may even introduce biases that are more extreme than the famous popularity bias in terms of beyond-accuracy evaluations. Therefore, in the present study, we attempt to consider items that are both popular and highly-liked, which we refer to as blockbuster items, and to investigate whether the recommendation algorithms impose a considerable bias in favor of the blockbuster items in their ranking-based recommendations. To this end, we first present a practical formulation that measures the degree of the blockbuster level of the items by combining their liking-degree and popularity effectively. Then, based on this formulation, we perform a comprehensive set of experiments with ten different algorithms on five datasets with different characteristics to explore the potential biases towards blockbuster items in recommendations. The experimental outcomes demonstrate that most recommenders propagate an undesirable bias in their recommendations towards the blockbuster items, and such a bias is, in fact, not caused by the item popularity. Moreover, the observed biases to blockbuster items are more harmful and persistent than those to popular ones in terms of beyond-accuracy aspects such as diversity, catalog coverage, and novelty. The obtained results also suggest that conventional popularity-debiasing strategies are not very effective at treating the adverse effects of the observed blockbuster bias in recommendations.
Evaluating unfairness of popularity bias in recommender systems: A comprehensive user-centric analysis
(Elsevier, 2022) Yalcin, Emre; Bilge, Alper
The popularity bias problem is one of the most prominent challenges of recommender systems, i.e., while a few heavily rated items receive much attention in presented recommendation lists, less popular ones are underrepresented even if they would be of close interest to the user. This structural tendency of recommendation algorithms causes several unfairness issues for most of the items in the catalog as they are having trouble finding a place in the top-N lists. In this study, we evaluate the popularity bias problem from users’ viewpoint and discuss how to alleviate it by considering users as one of the major stakeholders. We derive five critical discriminative features based on the following five essential attributes related to users’ rating behavior, (i) the interaction level of users with the system, (ii) the overall liking degree of users, (iii) the degree of anomalous rating behavior of users, (iv) the consistency of users, and (v) the informative level of the user profiles, and analyze their relationships to the original inclinations of users toward item popularity. More importantly, we investigate their associations with possible unfairness concerns for users, which the popularity bias in recommendations might induce. The analysis using ten well-known recommendation algorithms from different families on four real-world preference collections from different domains reveals that the popularity propensities of individuals are significantly correlated with almost all of the investigated features with varying trends, and algorithms are strongly biased towards popular items. Especially, highly interacting, selective, and hard-to-predict users face highly unfair, relatively inaccurate, and primarily unqualified recommendations in terms of beyond-accuracy aspects, although they are major stakeholders of the system. We also analyze how state-of-the-art popularity debiasing strategies act to remedy these problems. Although they are more effective for mistreated groups in alleviating unfairness and improving beyond-accuracy quality, they mostly fail to preserve ranking accuracy.
IESR: Instant Energy Scheduling Recommendations for Cost Saving in Smart Homes
(IEEE, 10.05.2022) Fakhar, Muhammad Zaman; Yalçın, Emre; Bilge, Alper
The exponential increase in energy demands continuously causes high price energy tariffs for domestic and commercial consumers. To overcome this problem, researchers strive to discover effective ways to reduce peak-hour energy demand through off-peak scheduling yielding low price energy tariffs. Efficient off-peak scheduling requires precise appliance profiling to identify a scheduling recommendation for peak load management. We propose a novel off-peak scheduling technique that provides instant energy scheduling recommendations by monitoring appliances in real-time following user-devised criteria. Once an appliance operates during a peak hour and fulfills the user criteria, a real-time scheduling recommendation is presented for users' approval. The proposed technique utilizes appliance energy consumption data, user-devised criteria, and energy price signals to identify the recommendation points. The energy cost-saving performance of the proposed technique is evaluated using two publicly available real-world energy consumption datasets with four price signals. Simulation results show a significant cost-saving performance of up to 84% for the experimented datasets. Moreover, we formulate a novel evaluation metric to compare the performance of various off-peak scheduling techniques on similar criteria. Comparative analysis indicates that the proposed technique outperforms the existing methods.
A multiclass hybrid approach to estimating software vulnerability vectors and severity score
(Elsevier, 2021) Kekül, Hakan; Ergen, Burhan; Arslan, Halil
Classifying detected software vulnerabilities is an important process. However, the metric values of security vectors are manually determined by humans, which takes time and may introduce errors stemming from human nature. These metrics are important because of their role in the calculation of vulnerability severity. It is necessary to use machine learning algorithms and data mining techniques to improve the quality and speed of vulnerability analysis and discovery processes. However, studies in this area are still limited. In this study, vulnerability vectors were estimated using the natural language processing techniques bag of words, term frequency–inverse document frequency, and n-gram for feature extraction together with various multiclass classification algorithms, namely Naïve Bayes, decision tree, k-nearest neighbors, multilayer perceptron, and random forest. Our experiments using a large public dataset facilitate assessment and provide a standard-compliant prediction model for classifying software vulnerability vectors. The results show that the joint use of different techniques and classification algorithms is a promising solution to a multi-probability and difficult-to-predict problem. In addition, our study fills an important gap in its field in terms of the size of the dataset used and because it covers a vulnerability scoring system version that has not yet been extensively studied.
Open Source Software Development Challenges: A Systematic Literature Review on GitHub
(IGI GLOBAL, 01.11.2021) Seker, Abdulkadir; Diri, Banu; Arslan, Halil; Amasyalı, Fatih
GitHub is the most common code hosting and repository service for open-source software (OSS) projects. Thanks to its great variety of features, researchers benefit from GitHub to solve a wide range of OSS development challenges. In this context, the authors thought that it was important to conduct a literature review on studies that used GitHub data. To reach these studies, they conducted this literature review based on a GitHub dataset source study instead of a keyword-based search in digital libraries. Since GHTorrent is the most widely known GitHub dataset according to the literature, they considered the studies that cite this dataset for the systematic literature review. In this study, they reviewed 172 studies, selected according to certain criteria, that used the dataset as a data source. They classified them within the scope of OSS development challenges thanks to the information they extracted from the metadata of the studies. They put forward some issues about the dataset, and they presented the most-studied and attention-grabbing fields as well as open challenges that researchers are encouraged to study.
New Developer Metrics for Open Source Software Development Challenges: An Empirical Study of Project Recommendation Systems
(MDPI, 20.01.2021) Seker, Abdulkadir; Diri, Banu; Arslan, Halil
Software collaboration platforms where millions of developers from diverse locations can contribute to common open source projects have recently become popular. On these platforms, various information is obtained from developer activities that can then be used as developer metrics to solve a variety of challenges. In this study, we proposed new developer metrics extracted from the issue, commit, and pull request activities of developers on GitHub. We created developer metrics from the individual activities and combined certain activities according to some common traits. To evaluate these metrics, we created an item-based project recommendation system. In order to validate this system, we calculated the similarity score using two methods and assessed top-n hit scores using two different approaches. The results for all scores with these methods indicated that the most successful metrics were binary_issue_related, issue_commented, binary_pr_related, and issue_opened. To verify our results, we compared our metrics with another metric generated from a very similar study and found that most of our metrics gave better scores than that metric. In conclusion, the issue feature is more crucial for GitHub compared with other features. Moreover, commenting activity in projects can be equally as valuable as code contributions. Most of the binary metrics that were generated, regardless of the number of activities, also showed remarkable results. In this context, we presented improvable and noteworthy developer metrics that can be used for a wide range of open-source software development challenges, such as user characterization, project recommendation, and code review assignment.
Treating adverse effects of blockbuster bias on beyond-accuracy quality of personalized recommendations
(Elsevier, 2022) Yalcin, Emre; Bilge, Alper
Collaborative filtering recommendation algorithms are vulnerable against the popularity bias, including the most popular items repeatedly into the produced ranked lists. However, the research on popularity bias focuses solely on the number of times items are rated rather than the magnitude of the provided ratings when scrutinizing the adverse effects of such bias. This paper introduces a metric describing the blockbuster items that are popular and highly rated simultaneously and investigates the potential biases of collaborative recommendation algorithms towards such items comprehensively. Then, we develop an algorithmic post-processing debiasing approach for potential blockbuster bias in recommendations. Specifically, this method aims to penalize blockbuster items in produced ranked lists by re-sorting items based on the artificial ranking scores, estimated by considering both the blockbuster degree of the items and the generated predictions for them simultaneously. The experiments conducted on three benchmark real-world datasets demonstrate that four prominent collaborative filtering algorithms lead to an undesirable bias in their recommendations towards blockbuster items. The empirical outcomes also indicate that our mitigation method helps treat the adverse effects of the blockbuster bias in terms of beyond-accuracy evaluations such as catalog coverage, diversity, and novelty, with negligible losses in ranking accuracy.
An entropy empowered hybridized aggregation technique for group recommender systems
(Elsevier, 15.03.2021) Yalcin, Emre; Ismailoglu, Firat; Bilge, Alper
Group recommender systems aim to suggest appropriate products/services to a group of users rather than individuals. These recommendations rely solely on determining group preferences, which is accomplished by an aggregation technique that combines individuals’ preferences. A plethora of aggregation techniques of various types have been developed so far. However, they consider only one particular aspect of the provided ratings in aggregating (e.g., counts, rankings, high averages), which imposes some limitations in capturing group members’ propensities. Besides, maximizing the number of satisfied members with the recommended items is as significant as producing items tailored to the individual users. Therefore, the ratings’ distribution is an essential element for aggregation techniques to discover items on which the majority of the members provided a consensus. This study proposes two novel aggregation techniques by hybridizing additive utilitarian and approval voting methods to feature popular items on which group members provided a consensus. Experiments conducted on three real-world benchmark datasets demonstrate that the proposed hybridized techniques significantly outperform all traditional methods. For the first time in the literature, we offer to use entropy to analyze rating distributions and detect items on which group members have reached no or little consensus. Equipping the proposed hybridized type aggregation techniques with the entropy calculation, we end up with an ultimate enhanced aggregation technique, Agreement without Uncertainty, which was proven to be even better than the hybridized techniques and outperform two recent state-of-the-art techniques.
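The building blocks named in the abstract above (additive utilitarian aggregation, approval voting, and entropy of a rating distribution) can be sketched as follows. This is an illustrative sketch only: it does not reproduce the paper's hybridization or its Agreement without Uncertainty technique, and the approval threshold of 4 and the toy rating lists are assumptions.

```python
import math
from collections import Counter


def additive_utilitarian(ratings):
    """Group score as the sum of the members' ratings for one item."""
    return sum(ratings)


def approval_voting(ratings, threshold=4):
    """Group score as the number of members rating the item at or above
    an approval threshold (the value 4 is an assumed default)."""
    return sum(1 for r in ratings if r >= threshold)


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the rating distribution for one item.
    Zero means full consensus; higher values mean more disagreement."""
    counts = Counter(ratings)
    n = len(ratings)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical group ratings for two items.
consensus_item = [5, 5, 5, 5]
divisive_item = [1, 5, 1, 5]
```

An item such as `divisive_item` has a moderate sum and two approvals yet maximal two-way disagreement, which is the kind of case an entropy check is meant to flag.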
Novel automatic group identification approaches for group recommendation
(Elsevier, 2021) Yalcin, Emre; Bilge, Alper
Group recommender systems are specialized in suggesting preferable products or services to a group of users rather than an individual by aggregating personal preferences of group members. In such expert systems, the initial task is to identify groups of similar users via clustering approaches as user groups are usually not predefined. However, clustering users into groups commonly suffers from sparsity, scalability, and complexity problems as the content in the domain proliferates. Moreover, group homogeneity and size are the critical parameters for organizing group members and enhancing their satisfaction. In this study, we propose novel automatic user grouping approaches by constructing a binary decision tree via bisecting k-means clustering for enhanced group formation and group size restriction. Furthermore, we propose adopting a genre-based mapping of user ratings into a tiny and dense vector to represent users, which both improves computation time for constructing the binary decision tree and enables eliminating adverse effects of sparsity. Finally, since the quality of group formation depends not only on conforming preferences but also on the demographic harmony among members, we further introduce utilizing similarities based on demographic characteristics along with the genre-based similarities. We propose applying two distinct strategies for small and large groups by decorating the genre-based similarities with demographic properties, which leads to a more homogeneous automatic group formation. Experiments performed on real-world benchmark datasets demonstrate that each proposed method outperforms its traditional rival significantly, and the final proposed method achieves significantly more qualified ranked recommendation lists than the state-of-the-art algorithm.

Recent Submissions