Earth Resources Data Cloud: Data Resource Details

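The Music Genre Classification dataset is intended for multi-class classification. The data is tabular, consisting mainly of numeric audio features plus artist and track names, and the application is predicting a track's genre.
Problem statement: Optimizing multi-class log loss to generalize well on unseen data.
Task type: multi-class classification.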
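Suggested workflow: clean and tokenize the text fields (artist name and track name), then compare a TF-IDF + linear model baseline against a pre-trained language model. The remaining columns are numeric audio features and need no tokenization.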
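Evaluation advice: use stratified splits or cross-validation. Because the problem statement targets multi-class log loss, track it alongside classification metrics such as F1, recall, and AUC.
Available files: submission.csv, test.csv, train.csv.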
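The evaluation advice above can be sketched in plain Python. This is a minimal illustration, not the competition's official scorer: the helper names `stratified_folds` and `multiclass_log_loss` are hypothetical, and the labels are toy values rather than rows from train.csv.

```python
import math
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign each sample index to a fold so every fold keeps the class mix."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


def multiclass_log_loss(y_true, probabilities, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class ids.
    probabilities: list of per-sample lists, one probability per class.
    """
    total = 0.0
    for label, row in zip(y_true, probabilities):
        # Clip to avoid log(0) when a model is confidently wrong.
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy labels only (not taken from train.csv): 3 classes, imbalanced.
labels = [0] * 10 + [1] * 5 + [2] * 5
folds = stratified_folds(labels, n_splits=5)

# A uniform prediction over 3 classes gives a reference score of ln(3).
uniform = [[1 / 3] * 3 for _ in labels]
baseline = multiclass_log_loss(labels, uniform)
```

A model whose validation log loss is not below the uniform baseline has learned nothing useful about the classes. The round-robin assignment always starts at the first fold, so when class counts are not multiples of `n_splits` the earlier folds end up slightly larger.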
Context
The dataset was acquired from one of the MachineHack hackathons.
Content
Training dataset: 17,996 rows with 17 columns
Column details: artist name; track name; popularity; danceability; energy; key; loudness; mode; speechiness; acousticness; instrumentalness; liveness; valence; tempo; duration in milliseconds; and time_signature.
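As a rough sketch of how the listed columns might be read, the snippet below separates the two text fields from the fourteen numeric ones using only the standard library. The header spellings (`artist_name`, `duration_ms`, and so on) are assumptions and should be checked against the actual train.csv header. The list above names 16 of the 17 columns; the remaining one is not named here, so the sketch leaves it out. The sample row holds placeholder values, not real data.

```python
import csv
import io

# Assumed header spellings; check them against the real train.csv header.
TEXT_COLUMNS = ["artist_name", "track_name"]
NUMERIC_COLUMNS = [
    "popularity", "danceability", "energy", "key", "loudness", "mode",
    "speechiness", "acousticness", "instrumentalness", "liveness",
    "valence", "tempo", "duration_ms", "time_signature",
]


def read_rows(handle):
    """Yield one dict per row with numeric fields parsed to float or None."""
    reader = csv.DictReader(handle)
    missing = set(TEXT_COLUMNS + NUMERIC_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for raw in reader:
        row = {name: raw[name] for name in TEXT_COLUMNS}
        for name in NUMERIC_COLUMNS:
            value = (raw[name] or "").strip()
            # Blank cells become None so imputation can be decided later.
            row[name] = float(value) if value else None
        yield row


# Placeholder values for illustration only, not real rows from the dataset.
header = ",".join(TEXT_COLUMNS + NUMERIC_COLUMNS)
sample = (
    header
    + "\nExample Artist,Example Track,50,0.5,0.7,,-6.0,1,"
    + "0.05,0.1,0.0,0.2,0.6,120.0,200000,4\n"
)
rows = list(read_rows(io.StringIO(sample)))
```

With the real file, `read_rows(open("train.csv", newline="", encoding="utf-8"))` would be the equivalent call, assuming the header matches the names above.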