Sogou news - dataset
2022-08-03 13:03:00 【51CTO】
The dataset draws on 2,909,551 news articles from the SogouCA and SogouCS news corpora and covers 5 categories. Each category contains 90,000 training samples and 12,000 test samples. The Chinese characters have been converted into Pinyin.
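As a rough sketch of what these numbers imply, the snippet below computes the split sizes from the per-category counts and tallies labels from a CSV stream. The three-column layout (class index, title, content) and the two sample rows are assumptions for illustration and are not stated in this article, so check them against the files you actually download.

```python
import csv
import io
from collections import Counter

# Sizes taken from the dataset description above.
NUM_CLASSES = 5
TRAIN_PER_CLASS = 90_000
TEST_PER_CLASS = 12_000


def expected_split_sizes():
    """Return (train_total, test_total) implied by the per-category counts."""
    return NUM_CLASSES * TRAIN_PER_CLASS, NUM_CLASSES * TEST_PER_CLASS


def read_samples(handle):
    """Yield (label, title, content) tuples from a CSV stream.

    Assumed layout (not confirmed by the article): class index, title, content.
    """
    for row in csv.reader(handle):
        if len(row) != 3:
            continue  # skip rows that do not match the assumed layout
        label, title, content = row
        yield int(label), title, content


def class_counts(handle):
    """Count how many samples each class index has."""
    return Counter(label for label, _, _ in read_samples(handle))


# Hypothetical two-row sample in the assumed layout, with Pinyin text.
SAMPLE = (
    '"1","ti yu xin wen","zu qiu bi sai jie guo"\n'
    '"2","cai jing","gu shi shang zhang"\n'
)

if __name__ == "__main__":
    print(expected_split_sizes())
    print(class_counts(io.StringIO(SAMPLE)))
```

To use it on real data, pass an open file instead of the in-memory sample, for example `open("train.csv", newline="", encoding="utf-8")`, where the file name is also an assumption.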
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
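Converting the text to Pinyin matters because a character-level model reads a small, fixed alphabet. One common way to feed text to such a model is to one-hot encode each character, sketched below. The alphabet, the `max_length` default, and the function name `quantize` are assumptions for illustration; this article does not give the exact encoding used in the paper.

```python
import string

# Hypothetical alphabet for illustration: lowercase letters, digits, space.
ALPHABET = string.ascii_lowercase + string.digits + " "
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def quantize(text, max_length=16):
    """One-hot encode text character by character.

    Returns max_length rows, each with len(ALPHABET) columns.
    Characters outside the alphabet and padding positions are all-zero rows.
    Text longer than max_length is truncated.
    """
    rows = []
    for ch in text.lower()[:max_length]:
        row = [0] * len(ALPHABET)
        index = CHAR_TO_INDEX.get(ch)
        if index is not None:
            row[index] = 1
        rows.append(row)
    # Pad short inputs so every sample has the same shape.
    while len(rows) < max_length:
        rows.append([0] * len(ALPHABET))
    return rows


if __name__ == "__main__":
    matrix = quantize("sou gou", max_length=10)
    print(len(matrix), len(matrix[0]))
```

Each sample becomes a fixed-size matrix, which is the kind of input a convolutional layer can slide over.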
You can download the dataset from its official website. I have also shared a copy on Baidu Netdisk: follow my official account and reply "2020082502" to get the download link.
As long as I have time, I try to write articles and share them with everyone.










