Earth Resource Data Cloud: Dataset Details

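The dataset "Student Performance in Secondary Education" is primarily used for multi-class classification. The data is tabular, and the application domain is education. Description: Demographic, social & academic factors shaping student grades in Portugal
Task type: tabular multi-class classification.
Suggested workflow: handle missing values and outliers and encode the categorical features first, then compare logistic regression, random forest, and XGBoost.
Evaluation: use a stratified split or cross-validation, and prioritize classification metrics such as F1, Recall, and AUC.
Available files: student.csv.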
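The evaluation advice above can be illustrated with a minimal, standard-library sketch of a stratified hold-out split and a macro-averaged F1 score. The function names `stratified_split` and `macro_f1`, the 20% test fraction, and the fixed seed are assumptions made for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly its share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one test sample per class when the class has 2+ members.
        n_test = max(1, round(len(indices) * test_fraction)) if len(indices) > 1 else 0
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging weights every class equally, which is one reasonable choice when grade classes are imbalanced; a weighted average would instead favor the larger classes.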
This dataset contains information on secondary school student performance collected from two Portuguese schools. It was originally introduced by Cortez & Silva in the paper “Using Data Mining to Predict Secondary School Student Performance.”
The data was gathered through school reports and student questionnaires, covering demographic, social, and academic-related variables. Two separate datasets are provided:
student-mat.csv → Math course performance
student-por.csv → Portuguese language course performance
Number of instances: 395 (Mathematics) + 649 (Portuguese)
Number of features: 30 input variables + 3 grade outputs (G1, G2, G3)
Target variable: G3 (final grade, 0–20 scale)
Missing values: None
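Because G3 is a 0–20 score while the task is framed as multi-class classification, the grade has to be binned into classes first. The sketch below shows one way to do that using only the standard library. The five band boundaries, the band names, the helper names `grade_band` and `load_labels`, and the semicolon delimiter are all assumptions for illustration; the description above does not specify any of them, so check them against the actual files.

```python
import csv
import io

# Assumed five-level banding of the 0-20 final grade: (lower bound, class name).
# These cut points are illustrative, not taken from the dataset description.
GRADE_BANDS = [(16, "A"), (14, "B"), (12, "C"), (10, "D"), (0, "F")]


def grade_band(g3):
    """Map a final grade on the 0-20 scale to a class label."""
    g3 = int(g3)
    if not 0 <= g3 <= 20:
        raise ValueError(f"G3 out of range: {g3}")
    for lower, name in GRADE_BANDS:
        if g3 >= lower:
            return name


def load_labels(handle, delimiter=";"):
    """Read an open CSV file and return one class label per row from column G3."""
    # The ';' delimiter is an assumption; pass delimiter="," if the file uses commas.
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [grade_band(row["G3"]) for row in reader]


# Tiny in-memory stand-in for a real file such as student-mat.csv.
sample = io.StringIO("school;sex;G1;G2;G3\nGP;F;5;6;6\nGP;M;15;14;15\nMS;F;18;19;19\n")
sample_labels = load_labels(sample)
```

With a real file, the same call would look like `load_labels(open("student-mat.csv", newline=""))`, after confirming the delimiter and column names.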