Identifier
71a3d172-5b1d-4e7c-83c3-d947faf82acc
Persistent identifier
Title

Spam Mails Dataset

Descriptions
Abstract (English)

Preprocessed data derived from the "spam-mails" dataset, containing email messages labeled as spam or ham. Each record includes a unique identifier from the original dataset and an experiment_id indicating its assignment to a specific data split (training, validation, or test) used in this experiment. The email content has been lemmatized and cleaned to remove noise such as punctuation, special characters, and stopwords, ensuring consistent input for embedding and model training. Original data source: https://www.kaggle.com/datasets/venky73/spam-mails-dataset

Publisher
TU Wien
Creators

Publication date
2025
Recommended citation