Earth Resources Data Cloud: Data Resource Details

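The "Indian School Education Statistics" dataset is listed for supervised learning tasks. The data is tabular (CSV files), and the subject area is school education. Description: Statistics for Indian schools, including gross enrollment ratio (GER), water facilities, electricity, dropout ratio (DR), etc.
Task type: supervised learning on tabular data.
Suggested workflow: start with data cleaning (missing values, consistent column names), then compare a simple linear model against a more expressive model such as gradient-boosted trees.
Evaluation: use stratified splits or cross-validation, and focus on classification metrics such as F1, recall, and AUC.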
Available files: dropout-ratio-2012-2015.csv, gross-enrollment-ratio-2013-2016.csv, percentage-of-schools-with-comps-2013-2016.csv, and 4 others (7 files in total).
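
A quick way to get oriented is to load every CSV file and check its row count and column names. The sketch below uses only the Python standard library. The folder name `data` and the helper names `load_csv_dir` and `summarize` are assumptions made for illustration; they are not part of the dataset.

```python
import csv
from pathlib import Path


def load_csv_dir(data_dir):
    """Read every .csv file in data_dir into a list of row dicts, keyed by file name."""
    tables = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        # utf-8-sig drops a byte-order mark if the file starts with one
        with path.open(newline="", encoding="utf-8-sig") as f:
            tables[path.name] = list(csv.DictReader(f))
    return tables


def summarize(tables):
    """Return {file name: (row count, column names)} for a quick overview."""
    summary = {}
    for name, rows in tables.items():
        # Column names are taken from the first row; a file with no data rows yields []
        columns = list(rows[0].keys()) if rows else []
        summary[name] = (len(rows), columns)
    return summary


if __name__ == "__main__":
    # "data" is a hypothetical folder holding the downloaded files
    for name, (n_rows, columns) in summarize(load_csv_dir("data")).items():
        print(f"{name}: {n_rows} rows, columns = {columns}")
```

All values come back as strings, so numeric columns need an explicit conversion before modelling. A dataframe library would handle that step as well; the standard library is used here only to keep the sketch dependency-free.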
Context
This dataset contains Indian school education statistics for the years 2013-2014 to 2015-2016. Many public datasets from the Indian Government are scattered, and the goal here is to bring them together under one umbrella so that beginners can easily find important datasets like this one and start their data science journey.
Content
I acquired this dataset from here. Have a look at the website.
This dataset contains 7 files in .csv format. You can find a description for each column. Let me summarize it here too.