NLP text summarization: dataset introduction and preprocessing [New York Times Annotated Corpus]
2022-06-30 02:50:00 【Ninja luantaro】
Description of the New York Times Annotated Corpus:
- 1.8 million articles
- Over 650,000 manually written article summaries
- Over 1.5 million manually tagged articles, with tags covering people, places, organizations, titles, and topics
- Over 275,000 articles tagged algorithmically
- Java tools for parsing the XML documents
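The corpus ships Java tools for parsing its XML documents. As a rough alternative for preprocessing, the same idea can be sketched in Python with the standard library. This is only a minimal sketch: the sample document is invented, and the element names (`title`, `abstract`, `block` with `class="full_text"`) assume an NITF-style layout that should be checked against the actual corpus files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in an NITF-style layout; real corpus files may differ.
SAMPLE_XML = """<nitf>
  <head>
    <title>Example Headline</title>
  </head>
  <body>
    <body.head>
      <abstract>
        <p>Short human-written summary.</p>
      </abstract>
    </body.head>
    <body.content>
      <block class="lead_paragraph">
        <p>First paragraph.</p>
      </block>
      <block class="full_text">
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </block>
    </body.content>
  </body>
</nitf>"""


def paragraph_text(element):
    """Join all text inside an element, including text of nested tags."""
    return "".join(element.itertext()).strip()


def extract_article(xml_text):
    """Return the title, abstract, and body paragraphs of one article."""
    root = ET.fromstring(xml_text)

    title_el = next(root.iter("title"), None)
    title = paragraph_text(title_el) if title_el is not None else ""

    # The abstract is assumed to hold the manually written summary.
    abstract_parts = []
    for abstract in root.iter("abstract"):
        abstract_parts.extend(paragraph_text(p) for p in abstract.iter("p"))

    # Only the block marked full_text is used, to avoid duplicating the lead.
    body = []
    for block in root.iter("block"):
        if block.get("class") == "full_text":
            body.extend(paragraph_text(p) for p in block.iter("p"))

    return {"title": title, "abstract": " ".join(abstract_parts), "body": body}


if __name__ == "__main__":
    print(extract_article(SAMPLE_XML))
```

Articles without an abstract come back with an empty `abstract` string, so they can be filtered out when building summarization pairs.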
The corpus contains over 650,000 manually written article summaries, which can serve as reference summaries for evaluating document summarization algorithms.
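One way to use the manually written summaries for evaluation is to measure word overlap between a generated summary and the reference. The sketch below computes a ROUGE-1 style unigram precision, recall, and F1. The choice of metric, the simple tokenizer, and the function names are assumptions made for illustration; the original post does not specify an evaluation method.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1(reference, candidate):
    """Unigram-overlap precision, recall, and F1 (a ROUGE-1 style score)."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))

    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())

    ref_total = sum(ref_counts.values())
    cand_total = sum(cand_counts.values())
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    reference = "the cat sat on the mat"  # stands in for a human-written summary
    candidate = "the cat lay on the mat"  # stands in for a system output
    print(rouge1(reference, candidate))
```

For reported results, an established ROUGE implementation with stemming and standard tokenization is preferable to this simplified version.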
References:
- Introduction to the New York Times Corpus (to be continued)
- The New York Times Annotated Corpus