The Effect of Document Length on Machine Learning Success in Text-Based Data
dc.contributor.author | Polatgil, Mesut | |
dc.contributor.author | Kekul, Hakan | |
dc.date.accessioned | 2024-10-26T17:51:06Z | |
dc.date.available | 2024-10-26T17:51:06Z | |
dc.date.issued | 2023 | |
dc.department | Sivas Cumhuriyet Üniversitesi | |
dc.description | 2023 Innovations in Intelligent Systems and Applications Conference, ASYU 2023 -- 11 October 2023 through 13 October 2023 -- Sivas -- 194153 | |
dc.description.abstract | Natural Language Processing (NLP) is an important research area in artificial intelligence. When processing textual data, feature extraction and the construction of the word-document vector are critical steps. For machine learning algorithms in particular, these numerical vectors play a central role in building the model. Textual data must be preprocessed to generate these vectors; common methods include removing stopwords, converting text to lowercase, and stripping punctuation marks. The effects of these methods on the resulting model have been investigated in the literature. However, how the length of the text affects the resulting model has not been investigated. For example, how does a document or text with fewer than 10 or 20 characters affect the machine learning model? This study was carried out to answer this question and fill that gap in the literature. The effect of text length on text classification models was tested with different feature extraction methods. © 2023 IEEE. | |
dc.identifier.doi | 10.1109/ASYU58738.2023.10296594 | |
dc.identifier.isbn | 979-835030659-0 | |
dc.identifier.scopus | 2-s2.0-85178317821 | |
dc.identifier.uri | https://doi.org/10.1109/ASYU58738.2023.10296594 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12418/26017 | |
dc.indekslendigikaynak | Scopus | |
dc.language.iso | en | |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
dc.relation.ispartof | 2023 Innovations in Intelligent Systems and Applications Conference, ASYU 2023 | |
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | |
dc.rights | info:eu-repo/semantics/closedAccess | |
dc.subject | feature extraction; machine learning; text analysis; text classification; text length | |
dc.title | The Effect of Document Length on Machine Learning Success in Text-Based Data | |
dc.type | Conference Object |
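
The preprocessing steps named in the abstract (lowercasing, punctuation removal, stopword removal) and a minimum-length filter at the 10- and 20-character thresholds it mentions might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the stopword list, the sample documents and labels, the function names `preprocess` and `filter_by_length`, and the choice to measure length on the cleaned text are all assumptions.

```python
import string

# Hypothetical stopword list for illustration; the record does not name one.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text: str) -> str:
    """Lowercase the text, strip ASCII punctuation, and drop stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = [tok for tok in no_punct.split() if tok not in STOPWORDS]
    return " ".join(tokens)


def filter_by_length(docs, labels, min_chars):
    """Keep only (document, label) pairs whose text has at least min_chars characters."""
    kept = [(doc, lab) for doc, lab in zip(docs, labels) if len(doc) >= min_chars]
    return [doc for doc, _ in kept], [lab for _, lab in kept]


# Made-up sample data, used only to exercise the functions.
docs = [
    "Great!",
    "The delivery was late and the box was damaged.",
    "OK",
    "Excellent service, will order again.",
]
labels = [1, 0, 1, 1]

# Assumption: length is measured after cleaning, not on the raw text.
cleaned = [preprocess(doc) for doc in docs]

for threshold in (0, 10, 20):
    kept_docs, kept_labels = filter_by_length(cleaned, labels, threshold)
    print(f"min_chars={threshold}: {len(kept_docs)} documents kept")
```

The filtered documents and labels could then be passed to any feature extraction method and classifier, so that model performance can be compared across thresholds.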