当前位置:网站首页>Introduction and application of bigfilter global transaction anti duplication component
Introduction and application of bigfilter global transaction anti duplication component
2022-07-04 22:18:00 【Chang'an chain open source community】
We will use three articles to explain the work of Chang'an chain in improving the anti duplication ability of transactions , This is the third part .
All three chapters mainly include the following contents :
1. Chang'an chain trading pool and anti duplication trading optimization ;
2. The cuckoo filter of Chang'an chain transaction ;
3. bigfilter Introduction and application of global transaction anti duplication component .
One 、 background
All new transactions in the blockchain ledger need a transaction id, transaction id It has to be globally unique , Cannot conflict with existing transactions in the ledger ( equally ), Because the transactions in the blockchain ledger are increasing , So there will be more and more transactions accumulated in the ledger .
chart 1
New transactions should be stored in the ledger , It is necessary to check the transactions of new transactions id Whether it already exists in the account book . This check is called transaction filtering ( Anti weight ). The corresponding mathematical model is a new element newTxID Whether there is a collection leadger_txIDs in . The mathematical model is very simple , Combined with blockchain scenario , Demand is :
1. txID Generate rules , Must be decentralized , Generated arbitrarily by the user , No restrictions ;
2. txID Can't repeat .
Seemingly simple requirements , But it's actually very difficult . What is the specific difficulty ?
chart 2
Two 、 Difficulty analysis
difficulty 1: Transaction filtering is required to be fast .
technical , Verify whether an element is in a collection , The fastest way is to use hash sets ; Algorithm , Time complexity O(1), Spatial complexity O(n) Build a collection Set, Each element first determines whether there is , Do a hash operation .
But collections are built in memory , If thousands 、 Tens of thousands txid It is acceptable to store in memory , But if it is 10 One hundred million txid, This magnitude is too large , Memory may not be able to store ; What's more? , Transactions in the blockchain are only increasing , The collection will be bigger and bigger , It will burst the memory .
difficulty 2: There are many transactions .
Transactions in the blockchain are only increasing . If the number of transactions accumulated in the ledger reaches 10 Billion , What do I do ? In memory 10 Billion may not be saved . above-mentioned , The time complexity of hashing is O(1), But the spatial complexity is O(n);10 One hundred million TxID Put it in memory ,10^9*32B=32GB( Suppose a txid from 2 individual uuid, common 32Byte).
difficulty 3: Single machine resources are limited .
Single server memory is limited . If there is 50 Billion 、100 Billion 、 even to the extent that 1000 Billion transactions , A single machine can't be saved ,100 Billion will occupy 320G Memory .
difficulty 4: Transactions are not lost .
Transactions cannot be written only to memory , Because data will be lost in case of power failure , So you need persistence .
3、 ... and 、 Technical solution analysis
1. Algorithm
Through the hash function, we will TxID Mapping to a certain value is a good choice , At the same time, you cannot directly use the original string txID Store in collection .
2. Storage efficiency
If you just take txid In memory , It takes up a lot of space .
Change the way of thinking :
Through a function , hold txid Mapping to 0-n An integer in the integer interval of , such as txid->interN; Then through a function interN Map to a bit of a binary number bit On ; Through the above two steps , You can put some txid Map to a bit of a binary number bit On , This is equivalent to putting a pile of txid, Stored in a binary number , Judge a txid Whether there is , Just operate on two functions + One and operation is enough , This is it. The bloon filter The prototype of the .
such ,100 One hundred million txid, It only needs 10^10 individual bit position , It only needs 1200MB( about ) Storage space ; And direct deposit needs 320000MB, Efficiency comparison 266 times .
3. Node expansion
The storage space and computing resources of a single machine are limited , In order to store more data and make use of more resources in parallel at the same time, we need to add another function , hold txid Map to a server in some servers ; meanwhile , Support batch processing txid Whether there is , Also need to be able to do sharding And aggregation .
4. Persistence
The fastest way to write is WAL. Because the seek time of mechanical disk , Including disk rotation + moving head , The time of random data reading and writing is much longer than that of sequential data reading and writing . If we put txid Written in WAL in , It's the fastest 、 A scheme without losing data ; At the same time, in order to filter faster , You need to use a bloom filter , also bitmap Data should reside in memory .
Four 、 Chang'an chain solution -bigfilter
Based on the above background 、 problem 、 Difficulties and technical analysis we designed and implemented bigfilter.bigfilter The functions are as follows :
1. Filter new transactions faster . The bottom layer uses The bloon filter , Data stored in memory , With redis-bloom Plug in form .
2. Support massive transactions . Manage multiple through one manager redis-bloom node , Complete the data sharding And aggregation .
3. Persistence . adopt wal Write the new deal log in , At the same time redis Memory and aof Persistent storage , Do not lose data 、 It can recover automatically after crash .
4. Batch filtering . Support writing a batch of new transactions to bigfilter in , It also supports batch judgment of whether a batch of new transactions are bigfilter And achieve parallel computing .
chart 3
5、 ... and 、 Realization effect
We start from the storage space 、 Anti weight stability 、 Read and write performance 、 Four dimensions of accuracy are used to evaluate the effect .
Storage space : It refers to the implementation of transaction filtering , Memory used 、 Disk and other space size .
Storage comparison | disk | Memory | remarks | The number of deals |
bigfilter | 10GB | 5G | 100 Billion | |
leveldb | 500GB | txid Direct deposit | 100 Billion | |
Redis set | >1TB | 1TB | With set Mode storage | 100 Billion |
You can see ,bigfilter It takes up the least memory and disk space , Than directly storing the original txid The efficiency of the method is dozens of times 、 Even a hundred times the gap .
Anti weight stability : It refers to the increasing number of transactions accumulated in the account book , Whether the time delay required for a new transaction filtering will increase significantly with the increase of the total transaction volume in the ledger .
chart 4
You can see ,bigfilter As the total volume of transactions increases , The delay time for filtering new transactions is basically the same . And based on db Do transaction filtering , As the total volume of transactions increases, the delay also increases ; Memory based filtering , Due to the limitation of single machine memory, it cannot support too much transaction volume .
Read and write performance : It refers to the delay in writing some new transactions into the ledger and filtering whether some new transactions are in the ledger .
In our tests :
1. In trade id, Completely decentralized , Without restricting the generation of rules ;
2. Already exists in the ledger 50 Billion transactions .
Read and write performance - Batch filtering , Judge 1 Whether 10000 new transactions are 50 There are hundreds of millions of transactions ( Filter ) need 30ms, On average, every transaction 3us.
Read and write performance - Batch write , write in 1 Ten thousand transactions , need 10ms, On average, every transaction 1us.
Accuracy rate : Refers to a new transaction , Not in the account book , But after the anti weight calculation in the filter , Sometimes it is misjudged as This new deal txID In the account book , Get rid of this misjudgment , The accuracy percentage of the transaction weight prevention obtained .
In different configurations , The performance is different , because bigfilter It is a distributed component that supports multiple node storage . Theoretically speaking , When the total trading volume is certain , More nodes , The higher the accuracy of transaction filtering . In our test environment , stay 50 Billion in total trading volume , You can achieve 99% The accuracy of .
6、 ... and 、 summary
test result :
1. Storage space .50 Billion transactions ,redis aof Occupy 10G. Memory footprint 5G.
2. Anti weight stability . Basically stable , Specific view .
3. Read and write performance - Batch filtering . Judge 1 Whether 10000 new transactions are 50 There are hundreds of millions of transactions ( Filter ) need 30ms, On average, every transaction 3us.
4. Read and write performance - Batch write . write in 1 Ten thousand transactions , need 10ms. On average, every transaction 1us.
RECOMMEND
Recommended reading
Changan chain ChainMaker Anti duplication optimization of trading pool transactions
Changan chain ChainMaker v2.2.x Performance pursuit of Vm-Docker-Go Optimize
Tips
More Chang'an chain open source projects QA, You can log in to the open source community 、 Technical document library view .
Download the source code
https://git.chainmaker.org.cn/chainmaker/chainmaker-go
Consult the documentation
https://docs.chainmaker.org.cn/
Changan chain ChainMaker Case collection
http://www.wenjuan.com/s/UZBZJvhFGte/
“ Changan chain ChainMaker” It is the first independent and controllable blockchain software and hardware technology system in China , It is jointly developed by microchip Research Institute, enterprises and universities , With full autonomy 、 High performance 、 Strong privacy 、 The outstanding characteristics of wide cooperation . Chang'an chain is oriented to large-scale node networking 、 High transaction processing performance 、 Strong data security, privacy and other next-generation blockchain technology requirements , Integrate the special acceleration chip hardware of blockchain and the bottom software platform that can be assembled , To build high performance 、 High credibility 、 High security digital infrastructure provides new solutions , Provide strong blockchain technical support for Chang'an chain ecological alliance . The name “ Changan chain ”, Metaphorical meaning “ lasting political stability 、 Create brilliance again 、 Link to the world “.
边栏推荐
- 股票开户流程是什么?使用同花顺手机炒股软件安全吗?
- HBuilder X 常用的快捷键
- El tree combined with El table, tree adding and modifying operations
- Scala download and configuration
- 关系型数据库
- B站大量虚拟主播被集体强制退款:收入蒸发,还倒欠B站;乔布斯被追授美国总统自由勋章;Grafana 9 发布|极客头条
- 可视化任务编排&拖拉拽 | Scaleph 基于 Apache SeaTunnel的数据集成
- Force buckle 2_ 1480. Dynamic sum of one-dimensional array
- ACM Multimedia 2022 | 视觉语言预训练模型中社会偏见的反事实衡量和消除
- Enabling digital economy Fuxin software attends the BRICs high level Forum on Sustainable Development
猜你喜欢
保证接口数据安全的10种方案
湘江鲲鹏加入昇腾万里伙伴计划,与华为续写合作新篇章
Huawei Nova 10 series released Huawei application market to build a solid application security firewall
GTEST from ignorance to proficiency (3) what are test suite and test case
可视化任务编排&拖拉拽 | Scaleph 基于 Apache SeaTunnel的数据集成
Machine learning notes mutual information
Play with grpc - go deep into concepts and principles
QT - plot other problems
Sorting and sharing of selected papers, systems and applications related to the most comprehensive mixed expert (MOE) model in history
Scala下载和配置
随机推荐
HDU - 2859 Phalanx(DP)
Is it safe to open an account in the stock of Caicai college? Can you only open an account by digging money?
Service online governance
Force buckle 2_ 1480. Dynamic sum of one-dimensional array
Cloudcompare & open3d DBSCAN clustering (non plug-in)
玩转gRPC—深入概念与原理
ACM multimedia 2022 | counterfactual measurement and elimination of social prejudice in visual language pre training model
股票开户流程是什么?使用同花顺手机炒股软件安全吗?
短视频系统源码,点击屏幕空白处键盘不自动收起
Nat. Commun.| Machine learning jointly optimizes the affinity and specificity of mutagenic therapeutic antibodies
MongoDB聚合操作总结
Sqlserver encrypts and decrypts data
HUAWEI nova 10系列发布 华为应用市场筑牢应用安全防火墙
From repvgg to mobileone, including mobileone code
机器人相关课程考核材料归档实施细则2022版本
抖音实战~评论数量同步更新
Enabling digital economy Fuxin software attends the BRICs high level Forum on Sustainable Development
283. 移动零-c与语言辅助数组法
文件读取写入
Visual task scheduling & drag and drop | scalph data integration based on Apache seatunnel