Sogou news - dataset
2022-08-03 13:03:00 【51CTO】
2,909,551 news articles in 5 categories from the SogouCA and SogouCS news corpora. Each category contains 90,000 training samples and 12,000 test samples. The Chinese characters have been converted into Pinyin.
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
You can download the dataset from the official website, and I have also shared a copy on Baidu Netdisk. Follow my official account and reply "2020082502" to get the download link.
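After downloading, it can be worth checking that the files match the per-category sizes stated above. The following is only a minimal sketch: it assumes each split is a CSV file whose first column is an integer class label followed by title and Pinyin content columns, which this post does not confirm. The function names and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Figures taken from the dataset description above.
NUM_CLASSES = 5
TRAIN_PER_CLASS = 90_000
TEST_PER_CLASS = 12_000


def expected_split_sizes():
    """Return (train_total, test_total) implied by the per-category counts."""
    return NUM_CLASSES * TRAIN_PER_CLASS, NUM_CLASSES * TEST_PER_CLASS


def count_labels(csv_file):
    """Count rows per class label in an open CSV file object.

    Assumption: the first column holds an integer class label.
    """
    counts = Counter()
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        counts[int(row[0])] += 1
    return counts


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real split file (made-up rows).
    sample = io.StringIO(
        '"1","title a","zheng wen"\n'
        '"2","title b","nei rong"\n'
        '"1","title c","xin wen"\n'
    )
    print(expected_split_sizes())
    print(count_labels(sample))
```

For a real file, pass an object opened with `open(path, newline="", encoding="utf-8")`; the path and encoding are also assumptions. Every class in the training split should then show 90,000 rows, and every class in the test split 12,000.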
As long as I have time, I try to write articles and share them with everyone.