Identifier
ID
71a3d172-5b1d-4e7c-83c3-d947faf82acc
Persistent identifier
DOI: 10.82556/bexb-5283
Title
Spam Mails Dataset
Descriptions
Abstract (English)
Preprocessed data derived from the "spam-mails" dataset, containing email messages labeled as spam or ham. Each record includes a unique identifier from the original dataset and an experiment_id indicating its assignment to a specific data split (training, validation, or test) used in this experiment. The email content has been lemmatized and cleaned to remove noise such as punctuation, special characters, and stopwords, ensuring consistent input for embedding and model training. Original data source: https://www.kaggle.com/datasets/venky73/spam-mails-dataset
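The cleaning step described in the abstract can be illustrated with a minimal sketch. It is not the pipeline used to produce this dataset: the function name `clean_text` and the tiny stopword list are hypothetical, and lemmatization is omitted because the record does not state which tool was used.

```python
import re

# Hypothetical, deliberately small stopword list (assumption);
# the list actually used for this dataset is not documented in the record.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "is", "in", "for", "you", "your"}


def clean_text(text: str) -> str:
    """Lowercase the text, drop punctuation and special characters, remove stopwords."""
    lowered = text.lower()
    # Replace every character that is not a lowercase ASCII letter, digit,
    # or whitespace with a space.
    stripped = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # Split on whitespace and keep only tokens that are not stopwords.
    tokens = [token for token in stripped.split() if token not in STOPWORDS]
    return " ".join(tokens)


print(clean_text("Subject: WIN a FREE prize!!! Claim your reward, today."))
# prints: subject win free prize claim reward today
```

A lemmatizer would normally run on the resulting tokens before they are joined back into a string.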
Publisher
TU Wien
Creators
Publication date
2025
Recommended citation