Sogou news - dataset
2022-08-03 13:03:00 【51CTO】
The source corpora, SogouCA and SogouCS, contain 2,909,551 news articles in total; the dataset selects 5 categories from them. Each category contains 90,000 training samples and 12,000 test samples, giving 450,000 training samples and 60,000 test samples overall. The Chinese characters have been converted into Pinyin.
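As a rough sanity check on those split sizes, the per-category counts can be tallied after download. The sketch below is only an illustration: it assumes each split is a CSV file whose rows are `label, title, body`, which is not stated in this post, and the sample rows are made-up Pinyin used purely to exercise the function.

```python
import csv
import io
from collections import Counter

# Figures quoted in the description above.
NUM_CATEGORIES = 5
TRAIN_PER_CATEGORY = 90_000
TEST_PER_CATEGORY = 12_000
EXPECTED_TRAIN_TOTAL = NUM_CATEGORIES * TRAIN_PER_CATEGORY
EXPECTED_TEST_TOTAL = NUM_CATEGORIES * TEST_PER_CATEGORY


def read_samples(handle):
    """Yield (label, title, body) from a CSV file-like object.

    Assumption: every row has exactly three columns, with an integer
    class label first. Rows of any other shape are skipped.
    """
    for row in csv.reader(handle):
        if len(row) != 3:
            continue
        label, title, body = row
        yield int(label), title, body


def class_counts(handle):
    """Count how many samples each class label has."""
    return Counter(label for label, _, _ in read_samples(handle))


# Made-up rows standing in for a real split file.
sample = io.StringIO(
    '"1","ti3 yu4 xin1 wen2","bi3 sai4 jie2 guo3"\n'
    '"2","cai2 jing1 xin1 wen2","gu3 shi4 shang4 zhang3"\n'
    '"1","zu2 qiu2","lian2 sai4"\n'
)
counts = class_counts(sample)
print(counts)
print(EXPECTED_TRAIN_TOTAL, EXPECTED_TEST_TOTAL)
```

On a real split you would pass an opened file instead of the `io.StringIO` object and compare each class count against `TRAIN_PER_CATEGORY` or `TEST_PER_CATEGORY`. Article bodies can be long, so `csv.field_size_limit` may need to be raised before reading.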
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
Translation:
This paper conducts an empirical study on the application of character-level convolutional networks (ConvNets) in text classification. We construct several large-scale datasets to demonstrate that character-level convolutional networks can achieve state-of-the-art or competitive results. They are compared against traditional models such as bag of words, n-grams and their TFIDF variants, and against deep learning models such as word-based ConvNets and recurrent neural networks.
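To make "character-level" concrete, here is a minimal sketch of how Pinyin text could be turned into a one-hot matrix before being fed to a ConvNet. The alphabet, the fixed length `max_len`, and the helper name `quantize` are assumptions chosen for illustration; the post does not specify the encoding the paper uses.

```python
# Assumed alphabet: lowercase letters, digits (Pinyin tone numbers) and space.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def quantize(text, max_len=16):
    """Encode text as a len(ALPHABET) x max_len one-hot matrix.

    Column j describes the j-th character of the lowercased text.
    Characters outside ALPHABET and positions past the end of the
    text are left as all-zero columns; text beyond max_len is cut off.
    """
    matrix = [[0] * max_len for _ in ALPHABET]
    for pos, ch in enumerate(text.lower()[:max_len]):
        row = CHAR_TO_INDEX.get(ch)
        if row is not None:
            matrix[row][pos] = 1
    return matrix


encoded = quantize("bei3 jing1", max_len=12)
print(len(encoded), len(encoded[0]))  # rows = alphabet size, columns = max_len
```

A character-level ConvNet would then apply one-dimensional convolutions along the column axis of this matrix, so no word segmentation or vocabulary is needed.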
You can download the dataset from the official website; I have also shared a copy on Baidu Netdisk. Follow my official account and reply "2020082502" to get the download link.
As long as I have time, I try to write articles and share them with everyone.









