Earth Resources Data Cloud: Data Resource Details

U.S. Census Dataset: Education, Finance, Industry

Published: 2026-03-17 14:31:30 | Resource ID: 2031994553928945665 | Resource type: Free

Summary Overview

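The dataset "U.S. Census Dataset: Education, Finance, Industry" is intended mainly for regression/prediction tasks. The data is primarily tabular, and the application scenarios lean toward financial risk control. Description: Education, Finance, Industry: Analyzing U.S. District-Level Population 2019-2021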

Task type: tabular regression/prediction.

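Suggested workflow: first handle missing values and outliers and encode the features, then compare logistic regression, random forest, and XGBoost.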
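The preprocessing steps in the suggested workflow can be sketched as follows. This is a minimal illustration using only the Python standard library; the choice of median imputation, IQR-based clipping, and one-hot encoding is an assumption, since the listing does not say which techniques to use, and the function names are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_iqr(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted set of categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Made-up values for illustration only.
population = impute_median([10, 11, None, 13, 14, 15, 16, 17, 1000])
population = clip_iqr(population)
encoded = one_hot(["retail", "mining", "retail"])
```

With pandas and scikit-learn available, the same steps would more commonly be expressed as a preprocessing pipeline placed ahead of each model being compared.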

Evaluation advice: use stratified splitting or cross-validation, and prioritize classification metrics such as F1, Recall, and AUC.
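As a reminder of what two of the recommended metrics measure, the sketch below computes precision, recall, and F1 from true and predicted labels. It assumes the target has been framed as a binary label (1 = positive, 0 = negative), which the listing does not specify given that the task type is stated as regression/prediction. The function name is hypothetical, and AUC is omitted because it needs predicted scores rather than hard labels.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels for illustration only.
precision, recall, f1 = precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

In a cross-validation setting, these values would be computed on each held-out fold and then averaged.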

Available files: Educationv.csv, Finance.csv, Industry.csv.

The dataset for Congressional districts was derived from the zip-code-level dataset, which was aggregated and processed.
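The listing does not describe how the zip-code-level rows were aggregated. One plausible approach, shown here purely as an assumption, is to sum the population estimates of all zip codes that map to the same district. The crosswalk `zip_to_district`, the column names, and the district label are hypothetical, and the sketch assumes each zip code belongs to exactly one district, which is not always true in practice.

```python
from collections import defaultdict


def aggregate_to_district(zip_rows, zip_to_district, value_columns):
    """Sum the given columns over all zip codes assigned to each district."""
    totals = defaultdict(lambda: dict.fromkeys(value_columns, 0))
    for row in zip_rows:
        district = zip_to_district.get(row["zipcode"])
        if district is None:
            continue  # skip zip codes missing from the crosswalk
        for col in value_columns:
            totals[district][col] += row[col]
    return dict(totals)


# Made-up rows and a made-up crosswalk for illustration only.
zip_rows = [
    {"zipcode": "10001", "total_est": 100, "male_est": 55},
    {"zipcode": "10002", "total_est": 50, "male_est": 20},
    {"zipcode": "99999", "total_est": 7, "male_est": 3},
]
zip_to_district = {"10001": "D-01", "10002": "D-01"}
district_totals = aggregate_to_district(zip_rows, zip_to_district, ["total_est", "male_est"])
```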

The details below describe the sources:

Industry Population Data - Tabular Data

Industry file 1: 33,120 rows × 542 columns, Year: 2019
Industry file 2: 33,120 rows × 542 columns, Year: 2020
Industry file 3: 33,120 rows × 542 columns, Year: 2021
Clubbed dataset name: Industry_zipcode

Within each file, we can find information on the estimated population of people belonging to various industrial types at the level of U.S. zip codes for a given year. Additionally, these files include columns that provide details on the total estimated population within a particular industry type, further broken down into estimated male and estimated female populations for that industry.
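Combining the three yearly files into a single clubbed table might look like the sketch below. It stacks the rows and tags each with its year, using only the standard library. The column names and values are made up, since the listing does not give the actual headers, and `club_yearly_files` is a hypothetical helper, not the authors' procedure.

```python
import csv
import io


def club_yearly_files(files_by_year):
    """Stack per-year CSV tables into one list of rows, tagging each row with its year."""
    combined = []
    for year, handle in sorted(files_by_year.items()):
        for row in csv.DictReader(handle):
            row["year"] = year  # record which yearly file the row came from
            combined.append(row)
    return combined


# Tiny in-memory stand-ins for the 2019-2021 files (values are made up).
header = "zipcode,total_est,male_est,female_est\n"
sample = {
    2019: io.StringIO(header + "00601,100,60,40\n"),
    2020: io.StringIO(header + "00601,110,65,45\n"),
    2021: io.StringIO(header + "00601,120,70,50\n"),
}
industry_zipcode = club_yearly_files(sample)
print(len(industry_zipcode), industry_zipcode[0]["year"])
```

Note that `csv.DictReader` returns every value as a string, so numeric columns would need converting before any arithmetic. With real files, the in-memory buffers would be replaced by file handles opened with `newline=""`.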

It is important to note that there are additional data columns that are unrelated to the goals of our research.