FAQs and answers to the imitation Niuke technology blog project (III)
2022-07-06 13:37:00 【Li bohuan】
Continued from the previous part: FAQs and answers to the imitation Niuke technology blog project (II), Li bohuan's blog - CSDN Blog
13 How does Kafka work in the project?
Kafka introduction
Apache Kafka is a distributed streaming platform. A distributed streaming platform should provide three key capabilities:
- Publish and subscribe to streams of data, similar to a message queue or enterprise messaging system
- Store streams of data in a fault-tolerant, durable way
- Process streams of data
- Applications: messaging systems, log collection, user behavior tracking, stream processing
· Kafka characteristics
- High throughput: can handle TB-scale data
- Message persistence: messages are persisted to disk rather than held only in memory. Although reading from disk is much slower than reading from memory, disk efficiency depends on the access pattern: sequential reads and writes are very fast, and Kafka ensures that messages are read from and written to disk sequentially.
- High reliability: Kafka is deployed as a distributed cluster; if one server goes down, others take over, so there is a fault-tolerance mechanism
- High scalability: when the cluster runs short of servers, new ones can be added with only simple configuration
· Kafka terminology
- Broker: a Kafka server; each server in a Kafka cluster is called a Broker
- Zookeeper: software for managing the cluster; Zookeeper can be installed separately or the one bundled with Kafka can be used
Message queue implementations:
Point-to-point model: e.g. BlockingQueue. The producer puts messages on the queue and a consumer takes them off; each message is consumed by exactly one consumer.

The sender puts a message into the message queue, then a receiver takes the message out and consumes it. Once consumed, the message is no longer stored in the queue, so a receiver cannot consume a message that has already been consumed.
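The point-to-point model described above can be sketched with Java's own BlockingQueue (a minimal illustration, not the project's code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PointToPointDemo {
    // run one producer and one consumer over a shared queue and return what was consumed
    public static List<String> run() throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        List<String> consumed = new ArrayList<>();

        Thread producer = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                try {
                    queue.put("message-" + i); // blocks if the queue is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        Thread consumer = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                try {
                    consumed.add(queue.take()); // each message is taken exactly once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

Once a message is taken from the queue it is gone, which is exactly why the point-to-point model cannot deliver the same message to multiple consumers.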

Publish-subscribe model: producers publish messages to a location, multiple consumers can subscribe to that location at the same time, and the same message can be read by multiple consumers.
Kafka uses the publish-subscribe model: the area where producers publish messages is called a topic, which can be understood as a folder.
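The difference from point-to-point can be shown with a tiny in-memory "topic" where every subscriber receives every published message (a sketch only; Kafka's real implementation is of course far more involved):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PubSubDemo {
    // all subscribers registered on this "topic"
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    public void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    // unlike a point-to-point queue, every subscriber receives every message
    public void publish(String message) {
        for (Consumer<String> s : subscribers) {
            s.accept(message);
        }
    }

    public static void main(String[] args) {
        PubSubDemo topic = new PubSubDemo();
        List<String> seenByA = new ArrayList<>();
        List<String> seenByB = new ArrayList<>();
        topic.subscribe(seenByA::add);
        topic.subscribe(seenByB::add);
        topic.publish("hello");
        System.out.println(seenByA + " " + seenByB);
    }
}
```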


- Partition: a topic is divided into partitions

- Offset: the index of a message within its partition
- Leader Replica: Kafka is distributed, so each partition is stored as multiple replicas.
The leader replica handles requests to read messages.
- Follower Replica: a follower replica only backs up data from the leader and does not serve requests; when the leader goes down, one of the followers is elected as the new leader.
Sending system notifications: this happens very frequently and involves many groups of users, so performance must be considered.
· Triggering events
Define three different topics, wrap the different trigger events into messages, and publish each to its corresponding topic. This way the producer thread can keep publishing messages
while consumer threads read them concurrently and store them.

- After a comment, send a notification
- After a like, send a notification
- After a follow, send a notification
· Handling events
- Encapsulate the event object
- Develop the event producer
- Develop the event consumer
Producer: when an event is triggered, encapsulate it as an Event carrying the topic, userId, entity, and so on. When sendMsg is called, extract event.getTopic() and JSONObject.toJSONString(event) and send the latter as the content. (Actively triggered: fired when a comment, follow, or like occurs.)

Consumer: listens on the topics. When a new message arrives it reads it; record.value() holds the JSON string of the event, which is converted back to an Event with JSONObject.parseObject(record.value().toString(), Event.class). The relevant attributes are then packaged into a Message in the form of a private message and saved to the database, for the front-end pages to query and display. (Consumption here means storing data from the message queue into the database; it is passively triggered: Kafka listens on the topic and consumes automatically whenever a message arrives.)
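The event object described above can be sketched as a plain Java class. The field names (topic, userId, entityType, entityId, data) are taken from the description; the project's actual class may differ in details:

```java
import java.util.HashMap;
import java.util.Map;

public class Event {
    private String topic;       // which topic this event is published to (comment / like / follow)
    private int userId;         // the user who triggered the event
    private int entityType;     // what kind of thing was acted on (a post, a comment, a user, ...)
    private int entityId;       // id of that thing
    private Map<String, Object> data = new HashMap<>(); // extra attributes for future needs

    // setters return this so an event can be built fluently before sending
    public Event setTopic(String topic) { this.topic = topic; return this; }
    public Event setUserId(int userId) { this.userId = userId; return this; }
    public Event setEntityType(int entityType) { this.entityType = entityType; return this; }
    public Event setEntityId(int entityId) { this.entityId = entityId; return this; }
    public Event setData(String key, Object value) { this.data.put(key, value); return this; }

    public String getTopic() { return topic; }
    public int getUserId() { return userId; }
    public int getEntityType() { return entityType; }
    public int getEntityId() { return entityId; }
    public Map<String, Object> getData() { return data; }
}
```

The producer would then serialize the whole object (e.g. JSONObject.toJSONString(event)) and send it to event.getTopic(); the consumer reverses this with JSONObject.parseObject.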

14 Do message queues write to memory or disk? Why is the disk so fast?
Kafka stores (or caches) its messages on disk. Generally, reading and writing on disk degrades performance because seeking takes time, and yet one of Kafka's hallmarks is high throughput.
Let's analyze why Kafka is so fast from two angles: writing data and reading data.
Writing data: how fast a disk reads and writes depends on how you use it, i.e. sequential versus random access. With sequential access, a disk's read/write speed can approach that of memory. Because a hard disk is a mechanical device, every read or write means seek -> write, and the seek is a "mechanical action" that costs the most time. Hard disks therefore hate random I/O and love sequential I/O, so to improve disk throughput Kafka uses sequential I/O.
Even with sequential writes, disk access still cannot catch up with memory. So Kafka does not flush data to disk in real time; it makes full use of the modern operating system's page cache to improve I/O efficiency.
Reading data: zero copy.
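Zero copy means the data moves from the file to its destination inside the kernel, without being copied into a user-space buffer first. In Java this is exposed as FileChannel.transferTo, the same call Kafka uses to send log segments to consumers. A minimal sketch, copying between two files for the sake of a self-contained example:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // copy src to dst with transferTo: the kernel moves the bytes directly,
    // so the JVM never reads them into a user-space buffer
    public static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long transferred = 0;
            // transferTo may move fewer bytes than requested, so loop until done
            while (transferred < size) {
                transferred += in.transferTo(transferred, size - transferred, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("zerocopy", ".src");
        Path dst = Files.createTempFile("zerocopy", ".dst");
        Files.write(src, "hello kafka".getBytes(StandardCharsets.UTF_8));
        copy(src, dst);
        System.out.println(new String(Files.readAllBytes(dst), StandardCharsets.UTF_8));
    }
}
```

In Kafka's case the target channel is a socket rather than a file, which skips the extra kernel-to-user and user-to-kernel copies of a conventional read/write loop.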
15 Introduction to the TrieTree (prefix tree)
A prefix tree is a multi-way tree data structure; in this project it is used to filter sensitive words.
Constructing a prefix tree: the first level stores the first character of every sensitive word.
Prefix tree properties: 1. the root node contains no information, and every node except the root contains exactly one character; 2. the characters along the path from the root to a node, concatenated, form the string corresponding to that node; 3. the children of each node all contain different characters.
(Figure: example prefix tree; image not preserved in this extract.)
Sensitive-word filtering algorithm:
Use three pointers: one points into the tree (starting at the root node), and the other two (begin and position) both start at the beginning of the text. position keeps moving forward while begin follows. If the characters scanned do not form a sensitive word, the character at begin cannot start a sensitive word, so append it to the StringBuilder, advance begin, and reset position back to begin. If a sensitive word is found, replace it, move both text pointers past it, and point the tree pointer back to the root node.
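The filter method relies on a TrieNode class exposing getSubNode and isKeywordEnd. A minimal version of that node, plus a method that builds the tree from sensitive words, might look like this (a sketch; the project's actual class may differ in details):

```java
import java.util.HashMap;
import java.util.Map;

public class TrieNode {
    // true if the path from the root to this node spells a complete sensitive word
    private boolean isKeywordEnd = false;
    // children, keyed by character; every child of a node holds a different character
    private Map<Character, TrieNode> subNodes = new HashMap<>();

    public boolean isKeywordEnd() { return isKeywordEnd; }
    public void setKeywordEnd(boolean keywordEnd) { isKeywordEnd = keywordEnd; }
    public void addSubNode(char c, TrieNode node) { subNodes.put(c, node); }
    public TrieNode getSubNode(char c) { return subNodes.get(c); }

    // build the tree: walk down from the root, creating nodes as needed,
    // and mark the last node of each keyword as an end node
    public static void addKeyword(TrieNode root, String keyword) {
        TrieNode temp = root;
        for (int i = 0; i < keyword.length(); i++) {
            char c = keyword.charAt(i);
            TrieNode sub = temp.getSubNode(c);
            if (sub == null) {
                sub = new TrieNode();
                temp.addSubNode(c, sub);
            }
            temp = sub;
            if (i == keyword.length() - 1) {
                temp.setKeywordEnd(true);
            }
        }
    }
}
```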

public String filter(String text) {
    if (StringUtils.isBlank(text)) {
        return null;
    }
    // pointer 1: current node in the trie
    TrieNode tempNode = rootNode;
    // pointer 2: start of the candidate word in the text
    int begin = 0;
    // pointer 3: position currently being examined
    int position = 0;
    // result
    StringBuilder sb = new StringBuilder();
    while (position < text.length()) {
        char c = text.charAt(position);
        // skip symbols
        if (isSymbol(c)) {
            // if pointer 1 is at the root node, count this symbol into the result and advance pointer 2
            if (tempNode == rootNode) {
                sb.append(c);
                begin++;
            }
            // whether the symbol is at the beginning or in the middle, advance pointer 3
            position++;
            continue;
        }
        // check the child node
        tempNode = tempNode.getSubNode(c);
        if (tempNode == null) {
            // the string starting at begin is not a sensitive word
            sb.append(text.charAt(begin));
            // go to the next position
            position = ++begin;
            // point back to the root node
            tempNode = rootNode;
        } else if (tempNode.isKeywordEnd()) {
            // found a sensitive word: replace the substring begin~position
            sb.append(REPLACEMENT);
            // go to the next position
            begin = ++position;
            // point back to the root node
            tempNode = rootNode;
        } else {
            // check the next character
            position++;
        }
    }
    // append the remaining characters to the result
    sb.append(text.substring(begin));
    return sb.toString();
}