30: Simulating JSON data generation and transmission with Kafka
2022-06-13 01:36:00 【Python's path to becoming a God】
A key step in computing PV and UV is cleaning the log data. The same holds for other workloads: in order statistics, for example, we also need to filter out "dirty data".
Dirty data is data that does not match the standard structure we defined, or data we simply do not need. During data cleaning (ETL), records are deserialized, parsed, and mapped to Java classes. Dirty data makes deserialization fail in this mapping step, which causes the task to fail and restart. For large jobs, restarts make the task unstable, and too much dirty data makes the job report errors frequently until it finally fails completely.
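The failure mode described above can be avoided by making deserialization tolerant: instead of raising on a bad record, the parser returns nothing and the record is dropped. The following is a minimal sketch of that idea in Python rather than Java. The `UserClick` schema and its field names (`user_id`, `timestamp`, `action`) are hypothetical and chosen only for illustration; they do not come from the original pipeline.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserClick:
    # Hypothetical log schema; the field names are assumptions for illustration.
    user_id: str
    timestamp: int
    action: str


def parse_record(raw: str) -> Optional[UserClick]:
    """Map one raw log line to a UserClick, or return None if it is dirty."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not valid JSON at all
    if not isinstance(obj, dict):
        return None  # valid JSON, but not an object we can map
    try:
        return UserClick(
            user_id=str(obj["user_id"]),
            timestamp=int(obj["timestamp"]),
            action=str(obj["action"]),
        )
    except (KeyError, TypeError, ValueError):
        return None  # a required field is missing or has the wrong type


if __name__ == "__main__":
    print(parse_record('{"user_id": "u1", "timestamp": 1655084160000, "action": "click"}'))
    print(parse_record("not json"))
```

Because a dirty record yields `None` instead of an exception, the caller can skip it and keep processing, so a single malformed line no longer forces a restart.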
Architecture
The overall PV and UV processing architecture mentioned earlier uses Flume to collect business data and send it to Kafka. Before computing PV and UV, we therefore need to consume the data from Kafka and filter out the dirty data.
In a real business, after consuming and processing the raw log data from Kafka, we also write the detailed data to a query engine such as Elasticsearch, and the summary data to a database such as HBase or Redis for front-end query and display. At the same time, the data is written back to Kafka for other businesses to use.
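To make the consume-then-filter step concrete, here is a small, self-contained simulation. It is only a sketch under stated assumptions: messages are generated in memory instead of being read from a Kafka topic, and the message fields (`userId`, `page`, `ts`) and the helper names `simulate_messages` and `pv_uv` are hypothetical. PV is counted as the number of clean records and UV as the number of distinct `userId` values among them.

```python
import json
import random
from typing import Iterable, Iterator, Tuple


def simulate_messages(n: int, dirty_ratio: float = 0.2, seed: int = 0) -> Iterator[str]:
    """Yield n simulated log messages, a fraction of which are dirty."""
    rng = random.Random(seed)
    for i in range(n):
        if rng.random() < dirty_ratio:
            # Dirty data: empty, non-JSON, truncated JSON, or the wrong JSON type.
            yield rng.choice(["", "not-json", '{"userId": ', "[]"])
        else:
            yield json.dumps({
                "userId": f"user_{rng.randint(1, 5)}",
                "page": "/home",
                "ts": 1655084160000 + i,
            })


def pv_uv(messages: Iterable[str]) -> Tuple[int, int]:
    """Filter out dirty messages, then count page views and unique visitors."""
    pv = 0
    users = set()
    for raw in messages:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip records that are not valid JSON
        if not isinstance(event, dict):
            continue  # skip JSON values that are not objects
        user_id = event.get("userId")
        if not isinstance(user_id, str):
            continue  # skip records without a usable userId
        pv += 1
        users.add(user_id)
    return pv, len(users)


if __name__ == "__main__":
    pv, uv = pv_uv(simulate_messages(1000))
    print(f"PV={pv}, UV={uv}")
```

In a real job the generator would be replaced by a Kafka consumer, and the counts would be kept per time window instead of over the whole stream.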