Earth Resources Data Cloud (地球资源数据云): Dataset Details

Diabetes Dataset 2019

Published: 2026-03-17 14:32:17 | Resource ID: 2031261418597552129 | Resource type: Free


Summary

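Diabetes Dataset 2019 is intended mainly for binary classification. Although it is listed as text data, it consists of structured variables (17 predictors and one binary target; see Content). Description: Prediction and classification using Machine Learning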

Task type: binary classification.

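Suggested workflow: if any fields are free text, clean and tokenize them first, then compare a TF-IDF + linear model against a pretrained language model. For the structured predictor variables described under Content, encode them and start from a linear baseline.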

Evaluation: use a stratified split or cross-validation, and give priority to classification metrics such as F1, Recall, and AUC.

Available file: diabetes_dataset__2019.csv.
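The evaluation suggestion above can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset authors' method: the function names `stratified_split`, `recall_and_f1`, and `roc_auc` are hypothetical, and the labels and scores are made-up toy values rather than rows from diabetes_dataset__2019.csv.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class in the same proportion."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)  # shuffle within the class only
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def recall_and_f1(y_true, y_pred, positive=1):
    """Recall and F1 for the positive class, from raw TP/FP/FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def roc_auc(y_true, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels (assumption): 8 negatives and 4 positives.
    toy_labels = [0] * 8 + [1] * 4
    train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.25)
    print("test labels:", [toy_labels[i] for i in test_idx])
    print("recall, F1:", recall_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    print("AUC:", roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice a library such as scikit-learn provides equivalent, better-tested utilities. The sketch only shows what a stratified split preserves (the class ratio in both partitions) and what each metric counts.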

Context

This dataset was collected by Neha Prerna Tigga and Dr. Shruti Garg of the Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215, for research and non-commercial purposes only. An article using this dataset has also been published. For more information, and to cite this dataset, please refer to:

Tigga, N. P., & Garg, S. (2020). Prediction of Type 2 Diabetes using Machine Learning Classification Methods. Procedia Computer Science, 167, 706-716. DOI: https://doi.org/10.1016/j.procs.2020.03.336

Content

The dataset contains 952 instances, each with 17 independent predictor variables and one binary target (dependent) variable, Diabetes.
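A quick way to check a downloaded copy against the figures above is to count rows, predictor columns, and target classes. The sketch below uses only the standard library. The helper name `summarize_table` and the sample rows are hypothetical, and the target header is assumed to be `Diabetes` as written on this page; the actual column name in the file may differ.

```python
import csv
import io
from collections import Counter


def summarize_table(fileobj, target):
    """Return (n_rows, n_predictors, class_counts) for a CSV with a header row."""
    reader = csv.DictReader(fileobj)
    fieldnames = reader.fieldnames or []
    if target not in fieldnames:
        raise KeyError(f"target column {target!r} not found in {fieldnames}")
    n_predictors = len([c for c in fieldnames if c != target])
    counts = Counter()
    n_rows = 0
    for row in reader:
        n_rows += 1
        # Normalize case and whitespace so "Yes" and " yes" count as one class.
        counts[(row[target] or "").strip().lower()] += 1
    return n_rows, n_predictors, dict(counts)


# Made-up sample (assumption): these columns and values are illustrative only.
SAMPLE = """Age,BMI,Diabetes
40-49,24,no
50-59,31,Yes
less than 40,22,no
"""

if __name__ == "__main__":
    print(summarize_table(io.StringIO(SAMPLE), target="Diabetes"))
    # For the real file, compare the result with the 952 instances and
    # 17 predictor variables reported above:
    # with open("diabetes_dataset__2019.csv", newline="") as f:
    #     print(summarize_table(f, target="Diabetes"))
```

If the row or column counts differ from those reported above, the header name or the file version is the first thing to check.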

FAQ

What is Diabetes Dataset 2019?

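Diabetes Dataset 2019 is a dataset intended mainly for binary classification. It contains 952 instances with 17 predictor variables and one binary target, Diabetes.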

What data format does Diabetes Dataset 2019 use? What is its coordinate system?

The data format is CSV. No coordinate system applies, because the dataset is tabular and not geospatial.

How was Diabetes Dataset 2019 produced or processed?

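The dataset was collected by Neha Prerna Tigga and Dr. Shruti Garg of the Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215, for research and non-commercial purposes only.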

How do I obtain and cite Diabetes Dataset 2019?

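Log in on this page to download the dataset. Suggested citation: Earth Resources Data Cloud (地球资源数据云). Diabetes Dataset 2019. https://www.gis5g.com/dataset/2031261418597552129. The dataset's collectors also ask that Tigga & Garg (2020), listed under Context, be cited.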