Earth Resources Data Cloud: Data Resource Details

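The dataset "Topic Modeling for Research Articles" is listed as a regression/prediction task on primarily text data, with financial risk control given as the application scenario. Task description: NLP Topic Modelling based on Research Articles.
Task type: text regression/prediction.
Suggested workflow: clean and tokenize the text first, then compare TF-IDF + a linear model against a pretrained language model.
Evaluation: use a stratified split or cross-validation, and prioritize classification metrics such as F1, Recall, and AUC.
Available files: test.csv, train.csv.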
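The suggested workflow and evaluation advice above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the portal's or the hackathon's reference pipeline: the function names (`tokenize`, `tfidf_vectors`, `stratified_folds`, `f1_score`) and the toy documents are hypothetical, the TF-IDF weighting is one common smoothed variant, and nothing is assumed about the column layout of train.csv. Fitting a linear model or a pretrained language model on each fold's training portion is left out.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document.

    Weight = term frequency * smoothed inverse document frequency, with
    idf = ln((1 + N) / (1 + df)) + 1. This variant is an assumption made
    for the sketch; other weightings are equally valid.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def stratified_folds(labels, k):
    """Split sample indices into k folds, dealing each class out round-robin.

    No shuffling is done, so the result is deterministic; shuffle the
    indices per class first if the input order is meaningful.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the given positive label (0.0 if no true positives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy data; real text and labels would come from train.csv.
    docs = [
        "Deep learning for image recognition",
        "Image recognition with kernel methods",
        "Pricing models for market risk",
        "Market risk under heavy tails",
    ]
    labels = [1, 1, 0, 0]
    vectors = tfidf_vectors(docs)
    for fold_id, test_idx in enumerate(stratified_folds(labels, k=2)):
        train_idx = [i for i in range(len(docs)) if i not in test_idx]
        print(f"fold {fold_id}: train={train_idx} test={test_idx}")
```

Each fold keeps roughly the same class proportions as the full label list, which is the point of stratification when some classes are rare. In practice a library such as scikit-learn provides tested equivalents of all four helpers.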
Context
Since the lockdown was announced in the country back in March, we started with a 1-day hackathon called Janatahack, inspired by the Janata Curfew, to start our war against the pandemic. Given the amazing response and the demand for more, we continued the hackathons every weekend.
Janatahack today is a phenomenon in which many esteemed members of our community regularly participate to showcase their machine learning skills by sharing their approaches and, more importantly, to learn how to apply machine learning and predictive analytics to new domains such as agriculture, banking, IoT, forecasting, and so on.
This time we bring you a 10-day extravaganza launching on India's Independence Day, 15 August 2020. It is open to all data practitioners, beginners in data science, and data scientists. Register today to test your skills and earn AV Points.
The theme for this hackathon will be announced on Independence Day, along with the problem statement and the dataset. Stay tuned and register today to receive all the updates about this exciting event.