OLAP - Druid introduction
2022-06-22 23:47:00 【IT_ one 's mind settles as still water】
Background
Druid is a distributed data store that supports real-time analytics. Put simply, it is a high-performance real-time analytics database. It was created in 2011 by the American advertising technology company MetaMarkets and open-sourced in 2012. The original website was http://druid.io/. Druid is now open source under the Apache License 2.0 and joined the Apache Software Foundation through its incubator; the code is hosted on GitHub. The current official website is https://druid.apache.org/
(Note: Alibaba also open-sourced a project named Druid, which is a database connection pool. It only shares a name with the Druid described here; the two are unrelated.)
Features
1. Fast queries
In-memory data storage speeds up Druid queries, giving it fast aggregation and fast OLAP query capabilities. Its multi-tenant design makes it well suited to user-facing analytics applications. The granularity of data aggregation can be 1 minute, 5 minutes, 1 hour, 1 day, and so on.
2. Real-time data ingestion
Druid supports real-time streaming ingestion and provides event-driven data, keeping events timely and consistent across real-time and offline environments. This is a typical Lambda architecture: historical data is not modified, and newly arrived data becomes queryable in real time.
3. Scalable PB-level storage
With its scalable distributed architecture, a Druid cluster can easily be expanded to petabytes of data and millions of ingested events per second. Queries remain timely even as the data volume grows. Aggregated data in Druid can be partitioned by time range.
4. Cloud-native architecture, high fault tolerance
Druid can run on commodity hardware as well as in the cloud. It can ingest data from a variety of data systems, including Hadoop, Spark, Kafka, Storm and Samza.
Basic concepts
Design principles
1. Fast queries (Fast Query): partial aggregation (Partial Aggregate) + in-memory storage (In-Memory) + indexes (Index)
2. Horizontal scalability (Horizontal Scalability): distributed data (Distributed Data) + parallel queries (Parallelizable Query)
3. Real-time analytics (Realtime Analytics): Immutable Past, Append-Only Future
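The first two principles can be illustrated with a small sketch. It is not Druid's implementation: the segments, the country dimension and the click counts are hypothetical, and threads stand in for the nodes of a cluster. Each immutable segment is aggregated on its own, and the partial results are then merged.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical immutable "segments": rows of (country, clicks).
SEGMENTS = [
    [("cn", 3), ("us", 1)],
    [("cn", 2), ("de", 5)],
    [("us", 4), ("de", 1)],
]

def partial_aggregate(segment):
    """Sum clicks per country inside a single segment."""
    totals = Counter()
    for country, clicks in segment:
        totals[country] += clicks
    return totals

def merge(partials):
    """Combine the per-segment partial sums into one result."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds counts together
    return dict(result)

def query(segments):
    """Aggregate every segment in parallel, then merge the partials."""
    with ThreadPoolExecutor() as pool:
        return merge(pool.map(partial_aggregate, segments))

result = query(SEGMENTS)
print(result)
```

Because a sum can be merged in any order, each segment can be processed independently, which is what lets this kind of query spread across machines.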
Data format
Before ingesting data, Druid requires you to define a data source (Datasource). A Datasource consists of a time column (Timestamp), dimension columns (Dimension) and metric columns (Metric).
Time column: Druid aggregates rows whose timestamps fall close together, and a query specifies a time range.
Dimension columns: identify the dimensions along which statistics are broken down.
Metric columns: the columns used for aggregation and calculation, such as count and sum.
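How the three column types work together can be sketched as a roll-up at one-minute granularity. This is plain Python rather than Druid code, and the column names (page, country, bytes) are assumptions made for illustration: rows that share a truncated timestamp and the same dimension values collapse into a single row of metrics.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events: one time column, two dimensions, one raw value.
RAW_EVENTS = [
    {"timestamp": "2022-06-22T23:47:05", "page": "/home", "country": "cn", "bytes": 120},
    {"timestamp": "2022-06-22T23:47:40", "page": "/home", "country": "cn", "bytes": 80},
    {"timestamp": "2022-06-22T23:47:59", "page": "/cart", "country": "us", "bytes": 50},
    {"timestamp": "2022-06-22T23:48:10", "page": "/home", "country": "cn", "bytes": 30},
]

def rollup_by_minute(events, dimensions=("page", "country")):
    """Group events by (minute, dimensions) and compute count and sum metrics."""
    rolled = defaultdict(lambda: {"count": 0, "bytes_sum": 0})
    for event in events:
        # Truncate the time column to the start of its minute.
        minute = datetime.fromisoformat(event["timestamp"]).replace(second=0, microsecond=0)
        key = (minute.isoformat(),) + tuple(event[d] for d in dimensions)
        rolled[key]["count"] += 1
        rolled[key]["bytes_sum"] += event["bytes"]
    return dict(rolled)

rolled = rollup_by_minute(RAW_EVENTS)
for key, metrics in sorted(rolled.items()):
    print(key, metrics)
```

The four raw events become three stored rows, since the first two share the same minute, page and country.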
Data ingestion
Druid supports two modes of data ingestion: real-time (streaming) and batch.
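As a rough sketch of how the two modes relate, the code below builds two ingestion specs that share one data schema. It assumes a Kafka topic for the real-time path and local JSON files for the batch path. The datasource, topic, broker address, directory and column names are all hypothetical, and the field names follow the Druid ingestion spec as I understand it, so check them against the documentation for your Druid version before use.

```python
import json

def data_schema(datasource):
    """Schema shared by both modes: time column, dimensions and metrics."""
    return {
        "dataSource": datasource,
        "timestampSpec": {"column": "timestamp", "format": "iso"},
        "dimensionsSpec": {"dimensions": ["page", "country"]},
        "metricsSpec": [
            {"type": "count", "name": "count"},
            {"type": "longSum", "name": "bytes_sum", "fieldName": "bytes"},
        ],
        "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": "DAY",
            "queryGranularity": "MINUTE",
            "rollup": True,
        },
    }

def streaming_spec(datasource, topic, brokers):
    """Real-time ingestion sketch: a Kafka supervisor spec (assumed layout)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": data_schema(datasource),
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "consumerProperties": {"bootstrap.servers": brokers},
                "inputFormat": {"type": "json"},
            },
        },
    }

def batch_spec(datasource, base_dir, file_filter):
    """Batch ingestion sketch: a parallel index task reading local files (assumed layout)."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": data_schema(datasource),
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "local", "baseDir": base_dir, "filter": file_filter},
                "inputFormat": {"type": "json"},
            },
        },
    }

streaming = streaming_spec("page_events", "page-events", "localhost:9092")
batch = batch_spec("page_events", "/tmp/page_events", "*.json")
print(json.dumps(streaming, indent=2))
```

Both specs describe the same Datasource; only the ioConfig, that is, where the data comes from, differs between the real-time and batch paths.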

Data query
Druid supports two kinds of queries: native queries (JSON) and SQL.
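To show the difference, the sketch below builds the same hourly aggregation twice: once as a native timeseries query and once as a SQL request body. The datasource page_events and the metric bytes_sum are hypothetical. The payloads are only constructed, not sent. In a typical deployment, native queries are posted to /druid/v2 and SQL to /druid/v2/sql, but verify these endpoints and field names against your Druid version.

```python
import json

def native_hourly_bytes(datasource, interval):
    """Native (JSON) timeseries query: sum a metric per hour over an interval."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "hour",
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": "total_bytes", "fieldName": "bytes_sum"},
        ],
    }

def sql_hourly_bytes(datasource, start, end):
    """Roughly equivalent Druid SQL, wrapped in the JSON body of a SQL request.

    Values are interpolated directly for brevity; do not do this with
    untrusted input.
    """
    sql = (
        f"SELECT TIME_FLOOR(__time, 'PT1H') AS hour_start, "
        f"SUM(bytes_sum) AS total_bytes "
        f'FROM "{datasource}" '
        f"WHERE __time >= TIMESTAMP '{start}' AND __time < TIMESTAMP '{end}' "
        f"GROUP BY 1 ORDER BY 1"
    )
    return {"query": sql}

native = native_hourly_bytes("page_events", "2022-06-22T00:00:00Z/2022-06-23T00:00:00Z")
sql = sql_hourly_bytes("page_events", "2022-06-22 00:00:00", "2022-06-23 00:00:00")
print(json.dumps(native, indent=2))
print(sql["query"])
```

Both forms carry a time range, which matches the earlier point that a query specifies a time range on the time column.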
Applicable scenarios
Based on the characteristics above, Druid fits data scenarios in which:
Queries are frequent and changes are rare
Queries are mainly aggregations or group-bys
Queries must return quickly
Both offline and real-time data sources need to be supported
Specific business scenarios:
User behavior analysis
Real-time monitoring of service performance metrics
Digital marketing
Business intelligence / OLAP