Best practices of feature selection in multi-omics data

Ipekten, Funda; Zararsiz, Gözde Ertürk; Doğan, Halef Okan; Eldem, Vahap; Zararsiz, Gökmen

Date

2024

Authors

Ipekten, Funda

Zararsiz, Gözde Ertürk

Doğan, Halef Okan

Eldem, Vahap

Zararsiz, Gökmen

Publisher

IGI Global

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

With recent advances in molecular biology techniques such as next-generation sequencing and mass spectrometry, large volumes of omics data are being produced. Using such data, the expression levels of thousands of molecular features (genes, proteins, metabolites, etc.) can be quantified and associated with diseases. Because multi-omics data combine different data types and a large number of analyzed variables, models built with machine learning methods become more complex. In addition, with so many variables, investigating the molecular variables associated with diseases is very costly. Selecting the informative, disease-related molecular features is therefore an appropriate step before model training and evaluation. This feature selection step is essential for obtaining accurate and generalizable models in minimum time and at minimum cost. Current methods used for feature selection include recursive feature elimination, information gain, minimum redundancy maximum relevance (mRMR), Boruta, Altmann, and LASSO. © 2024, IGI Global. All rights reserved.
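As an illustration of one of the filter methods named in the abstract, the following is a minimal sketch of information-gain ranking for discretized features. It is not taken from the chapter: the function names (`entropy`, `information_gain`, `rank_features`), the feature names, and the toy data are all hypothetical, and it assumes feature values have already been binned into categories such as "high" and "low".

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting samples by a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(features, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    scores = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: six samples, three already-discretized molecular features.
labels = ["disease", "disease", "disease", "healthy", "healthy", "healthy"]
features = {
    "gene_A": ["high", "high", "high", "low", "low", "low"],
    "protein_B": ["high", "low", "high", "low", "high", "low"],
    "metabolite_C": ["low", "low", "low", "low", "low", "low"],
}

ranking = rank_features(features, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy setting the feature that separates the two classes perfectly ranks first and the constant feature ranks last. Real omics measurements are continuous, so a discretization step or a mutual-information estimator for continuous variables would be needed in practice.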

Source

Research Anthology on Bioinformatics, Genomics, and Computational Biology

Link

https://doi.org/10.4018/979-8-3693-3026-5.ch014
https://hdl.handle.net/20.500.12418/26656

Collection

Scopus Indexed Publications Collection
