Earth Resources Data Cloud: Data Resource Details

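The dataset "Supreme Court Judgment Prediction" is intended mainly for multi-class classification, and its data is primarily text. Task description: Predict the judgment of a court using NLP.
Task type: multi-class text classification.
Suggested workflow: clean and tokenize the text first, then compare a TF-IDF + linear model baseline against a pretrained language model.
Evaluation: use a stratified split or cross-validation, and focus on classification metrics such as F1, recall, and AUC.
Available file: justice.csv.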
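The cleaning, tokenization, and TF-IDF steps of the suggested workflow can be sketched roughly as follows. This is a minimal from-scratch illustration using only the standard library, not the dataset authors' method; in practice a library implementation such as scikit-learn's `TfidfVectorizer` would normally be used. The function names `tokenize` and `tfidf_vectors`, the smoothed IDF formula, and the sample sentences are all assumptions made for illustration and are not taken from justice.csv.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (inner apostrophes allowed)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, L2-normalised.

    Assumes a smoothed IDF: idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {t: c * idf[t] for t, c in counts.items()}
        # Guard against empty documents, whose norm would be zero.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


# Hypothetical fact snippets, invented for illustration only.
sample_facts = [
    "The petitioner was convicted of fraud.",
    "The respondent appealed the fraud conviction.",
    "The court reviewed a patent dispute.",
]
vectors = tfidf_vectors(sample_facts)
```

Sparse dictionaries like these can be fed to a linear classifier as a baseline; terms that occur in every document (such as "the") receive a lower weight than terms specific to one case.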
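The evaluation advice above (stratified splitting, F1 and recall) can be illustrated with the sketch below. It is a simplified standard-library version under stated assumptions: `stratified_split` and `macro_recall_f1` are hypothetical helper names, labels are assumed to be strings, and macro averaging is one possible choice among several. AUC is left out because it needs predicted scores rather than hard labels.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one test example per class.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def macro_recall_f1(y_true, y_pred):
    """Return (macro recall, macro F1), averaged over the classes in y_true."""
    classes = sorted(set(y_true))
    recalls, f1s = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return sum(recalls) / len(classes), sum(f1s) / len(classes)
```

Macro averaging weights every class equally, which tends to expose poor performance on rare outcomes that plain accuracy would hide.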
Context
Artificial intelligence is being utilized in many domains as of late, and the legal system is no exception. However, as it stands now, the number of well-annotated datasets pertaining to legal documents from the Supreme Court of the United States (SCOTUS) is very limited for public use.
Even though the Supreme Court rulings are public domain knowledge, trying to do meaningful work with them becomes a much greater task due to the need to manually gather and process that data from scratch each time.
Hence, our goal is to create a high-quality dataset of SCOTUS court cases so that they may be readily used in natural language processing (NLP) research and other data-driven applications. Additionally, recent advances in NLP provide us with the tools to build predictive models that can be used to reveal patterns that influence court decisions.
By using advanced NLP algorithms to analyze previous court cases, the trained models are able to predict and classify a court's judgment given the case's facts from the plaintiff and the defendant in textual format; in other words, the model emulates a human jury by generating a final verdict.