NLP text summarization: dataset introduction and preprocessing [New York Times Annotated Corpus]
2022-06-30 02:50:00 【Ninja luantaro】
Description of the New York Times Annotated Corpus:
- 1.8 million articles
- Over 650k manually written article summaries
- Over 1.5 million manually tagged articles; the tags cover people, places, organizations, titles, and topics
- Over 275k articles tagged by algorithms
- Java tools for parsing the XML documents
The corpus contains over 650k manually written article summaries, which can serve as reference summaries when evaluating document summarization algorithms.
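Since the articles are distributed as XML, a preprocessing step typically turns each file into an (article, summary) pair. The following is a minimal Python sketch of that step, not the Java tooling shipped with the corpus. It assumes the NITF-style layout commonly described for this corpus, with the summary in an `abstract` element and the body in a `block` element whose `class` is `full_text`; the element names, the `extract_pair` helper, and the sample document are all assumptions made for illustration and should be checked against the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample imitating the assumed NITF layout.
# It is NOT real corpus data.
SAMPLE_NITF = """<nitf>
  <head><title>Sample headline</title></head>
  <body>
    <body.head>
      <abstract><p>Short human-written summary.</p></abstract>
    </body.head>
    <body.content>
      <block class="lead_paragraph"><p>First paragraph.</p></block>
      <block class="full_text"><p>First paragraph.</p><p>Second paragraph.</p></block>
    </body.content>
  </body>
</nitf>"""


def _paragraphs(element):
    """Return the stripped text of every <p> below element, skipping empty ones."""
    texts = ("".join(p.itertext()).strip() for p in element.iter("p"))
    return [t for t in texts if t]


def extract_pair(xml_text):
    """Return (article, summary); either value is None when the part is missing."""
    root = ET.fromstring(xml_text)

    # Assumed location of the manually written summary.
    abstract = root.find(".//abstract")
    summary = " ".join(_paragraphs(abstract)) if abstract is not None else None

    # Assumed location of the article body: <block class="full_text">.
    body = []
    for block in root.iter("block"):
        if block.get("class") == "full_text":
            body.extend(_paragraphs(block))
    article = "\n".join(body) if body else None

    return article, summary


if __name__ == "__main__":
    article, summary = extract_pair(SAMPLE_NITF)
    print("SUMMARY:", summary)
    print("ARTICLE:", article)
```

Only documents for which both values are present are useful for summarization evaluation, so a caller would normally skip any pair that contains `None`.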
References:
Introduction to the New York Times Corpus (to be continued)
The New York Times Annotated Corpus