Earth Resources Data Cloud (地球资源数据云): Dataset Details

The dataset "Diabetes Dataset 2019" is intended mainly for binary classification, and its data are predominantly textual. Task description: Prediction and classification using Machine Learning
Task type: binary text classification.
Suggested workflow: clean and tokenize the text first, then compare a TF-IDF + linear model baseline against a pretrained language model.
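The TF-IDF half of that workflow can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the column names in the toy rows (`Smoking`, `Stress`) are hypothetical, rendering each record as `column=value` tokens is one possible way to treat a row as text, and the weighting (smoothed IDF with L2 normalization) is a common convention rather than anything this page prescribes. The linear model and the pretrained language model are not shown.

```python
import math
from collections import Counter


def row_to_tokens(row):
    """Render one record as 'column=value' tokens (a hypothetical text view of a row)."""
    return [f"{col}={str(val).strip().lower()}" for col, val in row.items()]


def tfidf(docs):
    """Return one {token: weight} dict per tokenized document.

    Weight = raw term count * (ln((1 + N) / (1 + df)) + 1), then L2-normalized.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n_docs) / (1 + d)) + 1.0 for tok, d in df.items()}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {tok: c * idf[tok] for tok, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in vec.items()})
    return vectors


# Toy records with hypothetical column names, not taken from the real file.
rows = [
    {"Smoking": "no", "Stress": "sometimes"},
    {"Smoking": "no", "Stress": "always"},
    {"Smoking": "yes", "Stress": "sometimes"},
]
vectors = tfidf([row_to_tokens(r) for r in rows])
```

Tokens that occur in fewer records receive a larger weight within a row, which is the property a downstream linear model would rely on.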
Evaluation advice: use a stratified split or cross-validation, and prioritize classification metrics such as F1, recall, and AUC.
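As a rough illustration of that advice, the sketch below builds stratified folds and computes recall, F1, and AUC using only the standard library. The function names (`stratified_folds`, `recall_f1`, `auc`) are made up for this example, and labels are assumed to be encoded as 1 (positive) and 0 (negative); in practice a library implementation would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign every index to one of k folds, keeping class ratios similar per fold."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets its share.
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    return folds


def recall_f1(y_true, y_pred, positive=1):
    """Return (recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1


def auc(y_true, scores, positive=1):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Each fold in turn would serve as the held-out set while a model is fit on the remaining folds, and the three metrics would be averaged across folds.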
Available file: diabetes_dataset__2019.csv.
Context
This dataset was collected by Neha Prerna Tigga and Dr. Shruti Garg of the Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215, for research and non-commercial purposes only. An article using this dataset has also been published. For more information, and to cite this dataset, please refer to:
Tigga, N. P., & Garg, S. (2020). Prediction of Type 2 Diabetes using Machine Learning Classification Methods. Procedia Computer Science, 167, 706-716. DOI: https://doi.org/10.1016/j.procs.2020.03.336
Content
There are 952 instances in total, with 17 independent predictor variables and one binary target (dependent) variable, Diabetes.
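A quick way to check a downloaded copy against these figures is sketched below. It assumes the file has a header row and that the target column is literally named `Diabetes`, as in the description above; the header in the actual file may be spelled differently, so the `target` argument should be adjusted as needed. The helper name `summarize` is made up for this example.

```python
import csv


def summarize(csv_file, target="Diabetes"):
    """Count instances and predictors and list the distinct target values.

    `csv_file` is any open text file object whose first row is a header.
    `target` is an assumed column name and may need adjusting.
    """
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    columns = reader.fieldnames or []
    if target not in columns:
        raise KeyError(f"target column {target!r} not found; columns are {columns}")
    classes = sorted({(row[target] or "").strip().lower() for row in rows})
    return {
        "instances": len(rows),
        "predictors": len(columns) - 1,  # every column except the target
        "target_classes": classes,
    }
```

Calling `summarize(open("diabetes_dataset__2019.csv", newline=""))` on the real file should, if the description holds, report 952 instances, 17 predictors, and two target classes.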
The data format is CSV.
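The dataset can be downloaded after logging in on this page. Suggested citation: Earth Resources Data Cloud (地球资源数据云). Diabetes Dataset 2019. https://www.gis5g.com/dataset/2031261418597552129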