Day 8.Developing Simplified Chinese Psychological Linguistic Analysis Dictionary for Microblog
2022-07-27 05:50:00 【Ignorant graduate student】
Title:
Developing Simplified Chinese Psychological Linguistic Analysis Dictionary for Microblog
Developing a Simplified Chinese psycholinguistic analysis dictionary for Weibo (microblog)
Keywords:
LIWC
Traditional Chinese
Simplified Chinese
microblog
text analysis
Abstract:
The words that people use could reveal their emotional states, intentions, thinking styles, individual differences, etc. LIWC (Linguistic Inquiry and Word Count) has been widely used for psychological text analysis, and its dictionary is the core. The Traditional Chinese version of LIWC dictionary has been released, which is a translation of LIWC English dictionary. However, Simplified Chinese which is the world’s most widely used language has subtle differences with Traditional Chinese. Furthermore, both English LIWC dictionary and Traditional Chinese version dictionary were both developed for relatively formal text. Microblog has become more and more popular in China nowadays. Original LIWC dictionaries take less consideration on microblog popular words, which makes it less applicable for text analysis on microblog. In this study, a Simplified Chinese LIWC dictionary is established according to LIWC categories. After translating Traditional Chinese dictionary into Simplified Chinese, five thousand words most frequently used in microblog are added into the dictionary. Four graduate students of psychology rated whether each word belonged in a category. The reliability and validity of Simplified Chinese LIWC dictionary were tested by these four judges. This new dictionary could contribute to all the text analysis on microblog in future.
The words people use can reveal their emotional states, intentions, thinking styles, and individual differences. LIWC (Linguistic Inquiry and Word Count) is widely used for psychological text analysis, and the dictionary is its core. A Traditional Chinese version of the LIWC dictionary has been released; it is a translation of the English LIWC dictionary. However, Simplified Chinese, which is the world's most widely used language, differs subtly from Traditional Chinese. In addition, both the English and the Traditional Chinese LIWC dictionaries were developed for relatively formal text. Microblogging (Weibo) is increasingly popular in China, and the original LIWC dictionaries give little consideration to popular microblog words, which makes them less suitable for analyzing microblog text. This study builds a Simplified Chinese LIWC dictionary following the LIWC categories. After the Traditional Chinese dictionary was translated into Simplified Chinese, the 5,000 words most frequently used on microblogs were added. Four psychology graduate students rated whether each word belonged in a category, and the same four judges were used to test the reliability and validity of the Simplified Chinese LIWC dictionary. The new dictionary is expected to support future text analysis on microblogs.
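The vocabulary-extension step in the abstract (take the most frequent microblog words, then have raters assign categories) can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the function names `top_frequent_words` and `candidates_for_rating` are hypothetical, the posts are assumed to be already segmented into tokens, and it is an assumption that only words missing from the translated dictionary need to go to the raters.

```python
from collections import Counter


def top_frequent_words(tokenized_posts, n=5000):
    """Return the n most frequent tokens across all posts, most frequent first."""
    counts = Counter(tok for post in tokenized_posts for tok in post)
    return [word for word, _ in counts.most_common(n)]


def candidates_for_rating(frequent_words, dictionary_words):
    """Keep the frequent words that the existing dictionary does not contain yet.

    Assumption: these are the words human raters would then assign
    to categories (or reject).
    """
    return [w for w in frequent_words if w not in dictionary_words]


# Toy data; a real corpus would come from a word segmenter run over microblog posts.
posts = [["哈哈", "微博", "哈哈"], ["微博", "哈哈", "给力"]]
existing = {"微博"}  # hypothetical words already in the translated dictionary

frequent = top_frequent_words(posts, n=2)
to_rate = candidates_for_rating(frequent, existing)
print(frequent, to_rate)
```

Keeping the frequency ranking separate from the dictionary lookup makes it easy to change the cutoff (5,000 in the paper) without touching the rating step.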
Conclusion:
Percentage of words captured by the SCLIWC dictionary indicates that words usage in internet environment like Sina microblog are much more diverse compared to formal text materials[9, 14]. Percentage of words captured by the SCMBWC dictionary improves above 10 percent, especially captured more words in category of psychological processes and its sub categories, such as social processes, affective processes, cognitive processes and etc. Internal Reliability and External Validity of those two dictionaries are well guaranteed by four groups of judges. SCLIWC bridges the gap between LIWC software and Simplified Chinese. What is more, SCMBWC suggests a promising approach for further text analysis of Chinese Simplified in various internet environments.
The percentage of words captured by the SCLIWC dictionary indicates that word usage in internet environments such as Sina Weibo is much more diverse than in formal text materials [9, 14]. The percentage of words captured by the SCMBWC dictionary improves by more than 10 percent, with more words captured in the psychological processes category and its subcategories, such as social, affective, and cognitive processes. The internal reliability and external validity of the two dictionaries are supported by four groups of judges. SCLIWC bridges the gap between the LIWC software and Simplified Chinese, and SCMBWC suggests a promising approach for further analysis of Simplified Chinese text in various internet environments.
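The "percentage of words captured" compared in the conclusion can be illustrated with a minimal sketch. It assumes exact-match lookup on pre-segmented tokens and a dictionary that maps each word to a set of category labels. The function name `coverage`, the category labels, and the toy dictionaries are hypothetical, and the real LIWC software may count words differently.

```python
from collections import Counter


def coverage(tokens, dictionary):
    """Return (overall %, {category: %}) of tokens found in the dictionary.

    dictionary maps a word to a set of category labels. Both percentages
    are relative to the total number of tokens in the text.
    """
    total = len(tokens)
    if total == 0:
        return 0.0, {}
    hits = 0
    per_category = Counter()
    for tok in tokens:
        categories = dictionary.get(tok)
        if categories:
            hits += 1
            per_category.update(categories)
    overall = 100 * hits / total
    return overall, {c: 100 * k / total for c, k in per_category.items()}


# Toy example: a base dictionary versus one extended with a popular microblog word.
tokens = ["我", "很", "开心", "给力"]
base = {"我": {"pronoun"}, "开心": {"affect"}}
extended = {**base, "给力": {"affect"}}

print(coverage(tokens, base)[0])      # 50.0
print(coverage(tokens, extended)[0])  # 75.0
```

Running the same text through both dictionaries shows the kind of comparison the conclusion reports: adding microblog vocabulary raises the overall share of captured words and the share in the affected categories.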