NLP text summarization: dataset introduction and preprocessing [New York Times Annotated Corpus]
2022-06-30 02:50:00 【Ninja luantaro】
Description of the New York Times Annotated Corpus:
- 1.8 million articles
- Over 650k manually written article summaries
- Over 1.5 million manually tagged articles; the tags cover people, places, organizations, titles, and topics
- Over 275k articles with algorithmically generated tags
- Java tools for parsing the XML documents
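The corpus ships Java tools for reading its XML files, but the same preprocessing can be sketched in Python. The example below is a minimal sketch and not the official parser: it assumes each article is an NITF-style XML document in which the summary sits in an `abstract` element and the body in a `block` element with `class="full_text"`. The `SAMPLE` document and the function names are hypothetical and exist only for illustration; check the element names against the actual corpus files before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature article. The element names are an assumption
# modeled on NITF-style markup, not copied from the corpus itself.
SAMPLE = """<nitf>
  <body>
    <body.head>
      <hedline><hl1>Sample Headline</hl1></hedline>
      <abstract><p>A short human-written summary.</p></abstract>
    </body.head>
    <body.content>
      <block class="lead_paragraph"><p>First paragraph.</p></block>
      <block class="full_text">
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </block>
    </body.content>
  </body>
</nitf>"""


def _paragraphs(element):
    """Return the whitespace-normalized text of every <p> under element."""
    if element is None:
        return []
    return [" ".join("".join(p.itertext()).split()) for p in element.iter("p")]


def parse_article(xml_text):
    """Extract a (text, summary) pair from one article's XML string."""
    root = ET.fromstring(xml_text)
    # First <abstract> element, if the article has one.
    abstract = next(root.iter("abstract"), None)
    # First <block> whose class attribute marks it as the full article text.
    full_text = next(
        (b for b in root.iter("block") if b.get("class") == "full_text"),
        None,
    )
    return {
        "summary": " ".join(_paragraphs(abstract)),
        "text": "\n".join(_paragraphs(full_text)),
    }


if __name__ == "__main__":
    article = parse_article(SAMPLE)
    print(article["summary"])
    print(article["text"])
```

Articles without a summary come back with an empty `summary` string, so they can be filtered out when building a summarization dataset.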
The corpus contains 650k manually written article summaries, which can be used to evaluate document summarization algorithms.
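The post does not name an evaluation metric. One common choice is unigram overlap between a generated summary and the human-written one (ROUGE-1). The following is a simplified sketch with plain whitespace tokenization and no stemming, so its scores will differ from those of a full ROUGE implementation; the function name `rouge1` is an illustrative assumption.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Return (precision, recall, f1) of unigram overlap.

    Simplified: lowercases and splits on whitespace only.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    generated = "the cat sat"
    human = "the cat sat on the mat"
    print(rouge1(generated, human))
```

Here `reference` would be a manually written summary from the corpus and `candidate` the output of the algorithm under evaluation.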
References:
Introduction to the New York Times Corpus (to be continued)
The New York Times Annotated Corpus