当前位置：网站首页>FAQs and answers to the imitation Niuke technology blog project (III)

FAQs and answers to the imitation Niuke technology blog project (III)

2022-07-06 13:37:00 【Li bohuan】

Take the book back ： FAQs and answers of the imitation Niuke technology blog project （ Two ）_ Li bohuan's blog -CSDN Blog

13 In the project kafka How does it work ？

kafka introduction

Apache Kafka It's a distributed streaming platform . A distributed streaming platform should include 3 Key capabilities ：

Publish and subscribe streams data streams , Similar to message queue or enterprise messaging system
Store data streams in a fault-tolerant and persistent manner
Process data flow

- application ： The messaging system 、 Log collection 、 User behavior tracking 、 Streaming

·kafka characteristic

- High throughput ： Handle TB Massive data

- Message persistence ： Persistence , Store data on hard disk , Not just stored in memory , Persistent messages , The reading speed stored in the hard disk is much lower than that of the memory , The efficiency of reading and writing hard disk depends on the way of reading hard disk , The efficiency of sequential reading and writing of hard disk is very high ,kafka Ensure that the reading and writing of hard disk messages are sequential ;

- high reliability ：kafka It's distributed deployment , A server hangs up , There's something else , There's a fault tolerance mechanism

- High expansibility ： When there are not enough servers in the cluster , You can expand the server , Just a simple configuration

·kafka The term

-Broker： Kafka's server , Each server in Kafka cluster is called a Broker

-Zookeeper： Software for managing clusters , When using Kafka, it can be installed separately zookeeper Or built-in zookeeper

Implementation of message queue ：

Point to point implementation ：BlockingQueue, The producer puts the message on the queue , Consumer takes data out of the queue , Every message will only be consumed by one consumer ;

The message sender sends the production message to the message queue , Then the message receiver takes the message from the message queue and consumes the message . After the news was consumed , There is no more storage in the message queue , So it's impossible for the message receiver to consume the consumed message .

Publish subscribe mode ： Producers publish messages to a certain location , Multiple consumers can subscribe to this location at the same time , This message can be read by multiple consumers ,

Kafka uses publish subscribe mode ： The area where producers publish messages is called topic, Can be understood as a folder

-Partition： Partition the theme

-Offset： The index of the message in the partition

-Leader Replica： copy , Kafka is distributed , Therefore, multiple copies of the partition will be repeated

Master copy ： Can handle requests to get messages

-Follower Replica： Just back up the data from the copy , No response , When the primary copy hangs , Distributed will choose one of all the secondary replicas as the new primary replica

Send system notifications ： --- It's very frequent , There are many user groups , Performance issues need to be considered

· Triggering event

Define three different themes , Wrap different trigger events into different messages , Publish to the corresponding topic , In this way, the producer thread can continue to publish messages ,

At this point, the consumer thread can read messages concurrently , For storage

- After comments , Issue notice

- After likes , Issue notice

- After attention , Issue notice

· Handling events

- Encapsulating event objects

- Producer of development events

- Consumers of development events

producer ： Trigger Event, Encapsulates the Topic as well as userId、Entity Etc , call sendMsg when , Extract event.Topic and JSONObject.toJSONString(Event) With content Send as , Call ; （ Active trigger , Adding comments 、 Triggered when you follow and like ）

consumer ： monitor Topic, If there's new news , Just read ,record What we get from it is event Inside json strand , And then return to event that will do JSONObject.paresObject(record.value().toString,Event.class); Then put the relevant attributes , Encapsulated into message The form of private messages , Save to the database , Supply the front-end page to call and display .（ The consumption here is to store data from the message queue into the database , Passive trigger ,kafka Listening topic , Automatically consume when there is news ）

14 Do message queues go to memory or disk ？ Why is the disk so fast ？

Kafka The messages are stored or cached on disk , Generally speaking, reading and writing data on disk will degrade performance , Because addressing takes time , But actually ,Kafka One of the features of is high throughput .

Analyze from two aspects of data writing and reading , Why? Kafka So fast

Write data ： How fast the disk reads and writes depends on how you use it , That is, sequential reading and writing or random reading and writing . In the case of sequential reading and writing , The sequential read and write speed of the disk is the same as that of the memory . Because the hard disk is a mechanical structure , Every read and write will address -> write in , Where addressing is a “ Mechanical action ”, It's the most time consuming . So hard drives hate random I/O, Favorite order I/O. In order to improve the speed of reading and writing hard disk ,Kafka It's the order of use I/O.

Even writing to the hard disk in sequence , Hard disk access speed is still impossible to catch up with memory . therefore Kafka The data is not written to the hard disk in real time , It makes full use of paging storage of modern operating system to improve memory I/O efficiency .

Reading data ： Zero copy

15 TrieTree Prefix tree introduction

Prefix tree It is a tree data structure of multi tree , It is used to filter sensitive words in the project .

Construct a prefix tree ： The first layer stores the first character of all sensitive words

Prefix tree features ：1. The root node does not contain any information Each node except the root node contains only one character ,2. The path from the root node to a node , The string connected by the character is the string corresponding to this node 3. All child nodes of each node contain different characters

Here's the picture ：

Filter sensitive words algorithm ：

Three pointers , One points to the root （node）, The other two pointers （begin and position）, All point to the beginning of the text , One of them keeps moving backwards （begin）, The other follows , Discovery is not a sensitive word , It means that begin The first character cannot form a sensitive word , Deposit it in StringBuilder,begin Move backward , Then go back to begin. If it's a sensitive word , The replacement , And the other two pointers move back , The tree pointer points to the root node .

 public String filter(String text) {
        if (StringUtils.isBlank(text)) {
            return null;
        }
        //  The pointer 1
        TrieNode tempNode = rootNode;
        //  The pointer 2
        int begin = 0;
        //  The pointer 3
        int position = 0;
        //  result 
        StringBuilder sb = new StringBuilder();
        while (position < text.length()) {
            char c = text.charAt(position);
            //  Skip symbols 
            if (isSymbol(c)) {
                //  If pointer 1 At the root node , Count this symbol into the result , Let the pointer 2 Take a step down 
                if (tempNode == rootNode) {
                    sb.append(c);
                    begin++;
                }
                //  Whether the symbol is at the beginning or in the middle , The pointer 3 Take a step down 
                position++;
                continue;
            }
            //  Check the child nodes 
            tempNode = tempNode.getSubNode(c);
            if (tempNode == null) {
                //  With begin The first string is not a sensitive word 
                sb.append(text.charAt(begin));
                //  Go to the next position 
                position = ++begin;
                //  Point back to the root node 
                tempNode = rootNode;
            } else if (tempNode.isKeywordEnd()) {
                //  Find sensitive words , take begin~position Replace the string with 
                sb.append(REPLACEMENT);
                //  Go to the next position 
                begin = ++position;
                //  Point back to the root node 
                tempNode = rootNode;
            } else {
                //  Check the next character 
                position++;
            }
        }
        //  Count the last batch of characters into the result 
        sb.append(text.substring(begin));
        return sb.toString();
    }

原网站

版权声明
本文为[Li bohuan]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/187/202207060916513246.html

当前位置：网站首页>FAQs and answers to the imitation Niuke technology blog project (III)

FAQs and answers to the imitation Niuke technology blog project (III)

kafka introduction

Send system notifications ： --- It's very frequent , There are many user groups , Performance issues need to be considered

边栏推荐

猜你喜欢

随机推荐