HTAP In-Depth Exploration Guide
2022-06-27 06:35:00 【Tianxiang shop】
This guide describes how to further explore and use the Hybrid Transactional and Analytical Processing (HTAP) features of TiDB.
Note
If you are new to TiDB HTAP and want to try it out quickly, see Quick Start with HTAP.
To quickly understand the architecture of TiDB in HTAP scenarios and the scenarios that HTAP suits, it is recommended to watch the training video first (15 minutes). The video is for reference only. For more HTAP information, refer to the documentation below.
HTAP Applicable Scenarios
TiDB HTAP can meet the growing production demands of enterprises with massive data, reduce the risk and cost of operations and maintenance, and integrate seamlessly with existing big data stacks, so that the value of data assets is realized in real time.
The following are three typical HTAP scenarios:
Hybrid workload scenarios
When TiDB is used for real-time online analytical processing under a hybrid workload, developers only need to provide a single entry point. TiDB automatically selects the processing engine according to the type of business request.
Real-time stream processing scenarios
When TiDB is used for real-time stream processing, it ensures that data flowing continuously into the system can be queried in real time, while also serving highly concurrent data workloads and BI queries.
Data hub scenarios
When TiDB is used as a data hub, it seamlessly connects the data application layer and the data warehouse layer, meeting the needs of different businesses.
For more information about TiDB HTAP scenarios, see the HTAP blogs on the PingCAP official website.
HTAP Architecture
In TiDB, TiKV, the row-based storage engine for online transactional processing, and TiFlash, the columnar storage engine for real-time analytical scenarios, coexist. Data is replicated between them automatically, and strong consistency is maintained.
For more architecture information, see the TiDB HTAP architecture documentation.
HTAP Environment Preparation
Before exploring the TiDB HTAP features in depth, deploy TiDB and the corresponding data analysis engines according to your data scenario. For big data scenarios (100 T), TiFlash MPP is recommended as the primary HTAP solution, with TiSpark as a supplementary solution.
TiFlash
If you have deployed a TiDB cluster but have not deployed any TiFlash node, add TiFlash nodes to the existing TiDB cluster by following the steps in Scale Out a TiFlash Node.
If you have not deployed a TiDB cluster, deploy one using TiUP, and add the TiFlash topology on top of the minimal TiDB topology.
When deciding how many TiFlash nodes to deploy, consider the following business scenarios:
- If the workload is mainly OLTP with lightweight ad hoc OLAP computation, deploying one or several TiFlash nodes usually brings a significant speedup.
- When the OLTP data throughput does not put significant pressure on node I/O, each TiFlash node uses more of its resources for computation, so the TiFlash cluster can scale nearly linearly. Adjust the number of TiFlash nodes according to the expected performance and response time.
- When the OLTP data throughput is high (for example, more than ten million rows written or updated per hour), the limited write capacity of the network and physical disks makes the I/O between TiKV and TiFlash the main bottleneck, and read and write hotspots arise easily. In this case, the number of TiFlash nodes has a complex non-linear relationship with the amount of OLAP computation, so the number of nodes needs to be tuned according to the actual state of the system.
TiSpark
- If your business needs to run analysis based on Spark, deploy TiSpark. For the specific steps, see the TiSpark User Guide.
HTAP Data Preparation
After TiFlash is deployed, data is not replicated to it automatically. You need to specify which tables to replicate to TiFlash. Once a table is specified, TiDB creates the corresponding TiFlash replica.
- If the TiDB cluster contains no data, migrate the data to TiDB first. See Data Migration.
- If the TiDB cluster already contains data replicated from upstream, the data is still not replicated to TiFlash automatically after TiFlash is deployed. You need to manually specify the tables to replicate. See Use TiFlash.
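As a minimal sketch of what "specifying a table" looks like in practice, the Python helpers below only build the SQL text: one `ALTER TABLE ... SET TIFLASH REPLICA` statement and one query against `information_schema.tiflash_replica` to check replication status. The function names, the identifier check, and the sample names `test` and `orders` are assumptions made for illustration and do not come from the original guide. Running the statements against a cluster is left to whichever MySQL-compatible client you use.

```python
import re

# Conservative identifier check so the generated SQL cannot be broken by
# unexpected characters (a restriction of this sketch, not a TiDB rule).
_IDENT = re.compile(r"[A-Za-z0-9_]+")


def _check(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def tiflash_replica_statement(schema: str, table: str, replicas: int = 1) -> str:
    """Build the DDL that asks TiDB to create TiFlash replicas for one table."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    return (
        f"ALTER TABLE `{_check(schema)}`.`{_check(table)}` "
        f"SET TIFLASH REPLICA {replicas};"
    )


def replica_status_query(schema: str, table: str) -> str:
    """Build a query that reports the replication status of one table."""
    return (
        "SELECT TABLE_SCHEMA, TABLE_NAME, AVAILABLE, PROGRESS "
        "FROM information_schema.tiflash_replica "
        f"WHERE TABLE_SCHEMA = '{_check(schema)}' "
        f"AND TABLE_NAME = '{_check(table)}';"
    )


if __name__ == "__main__":
    # Hypothetical schema and table names.
    print(tiflash_replica_statement("test", "orders", 2))
    print(replica_status_query("test", "orders"))
```

In the status query, `AVAILABLE` indicates whether the TiFlash replica can be used and `PROGRESS` reports how far replication has advanced.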
HTAP Data Processing
With TiDB, you simply enter SQL statements to query or write data. For tables that have TiFlash replicas, TiDB relies on the front-end optimizer to choose the optimal execution method automatically.
Note
The MPP mode of TiFlash is enabled by default. When a SQL statement is executed, the TiDB optimizer automatically decides whether to execute it in MPP mode.
- To disable MPP mode, set the system variable tidb_allow_mpp to OFF.
- To force queries to be executed in the MPP mode of TiFlash, set the system variables tidb_allow_mpp and tidb_enforce_mpp to ON.
- To check whether TiDB chose MPP mode for a query, use the EXPLAIN statement to view the query execution plan. If the EXPLAIN output contains the ExchangeSender and ExchangeReceiver operators, MPP mode is in effect.
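The three points above can be sketched in Python as follows. The helpers are illustrative only: `mpp_session_statements` builds the `SET` statements for the two variables named above (using session scope, which is a choice made for this sketch), and `plan_uses_mpp` checks the operator names from the `id` column of an `EXPLAIN` result for both `ExchangeSender` and `ExchangeReceiver`. The function names and the sample plans are hypothetical and were not produced by a real cluster.

```python
from typing import Iterable, List

# Operators whose presence in an execution plan indicates MPP mode.
MPP_OPERATORS = ("ExchangeSender", "ExchangeReceiver")


def mpp_session_statements(mode: str) -> List[str]:
    """Return SET statements for mode 'off' (disable MPP) or 'force' (enforce MPP)."""
    if mode == "off":
        return ["SET SESSION tidb_allow_mpp = OFF;"]
    if mode == "force":
        return [
            "SET SESSION tidb_allow_mpp = ON;",
            "SET SESSION tidb_enforce_mpp = ON;",
        ]
    raise ValueError(f"unknown mode: {mode!r}")


def plan_uses_mpp(operator_ids: Iterable[str]) -> bool:
    """True if both exchange operators appear among the plan's operator ids."""
    ids = list(operator_ids)
    return all(any(op in op_id for op_id in ids) for op in MPP_OPERATORS)


if __name__ == "__main__":
    # Hypothetical 'id' column values from two EXPLAIN results.
    mpp_plan = [
        "TableReader_31",
        "└─ExchangeSender_30",
        "  └─HashAgg_12",
        "    └─ExchangeReceiver_29",
    ]
    non_mpp_plan = ["TableReader_7", "└─TableFullScan_6"]
    print(mpp_session_statements("force"))
    print(plan_uses_mpp(mpp_plan), plan_uses_mpp(non_mpp_plan))
```

The check uses substring matching because operator ids in a plan carry tree-drawing prefixes and numeric suffixes.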
HTAP Performance Monitoring
While using TiDB, you can monitor the running status of the TiDB cluster and view performance data in the following ways:
- TiDB Dashboard: view an overview of the cluster's overall running status, analyze the distribution and trends of the cluster's read and write traffic, and examine detailed execution information for slow SQL statements.
- The monitoring system (Prometheus & Grafana): view the monitoring metrics of the TiDB cluster components (including PD, TiDB, TiKV, TiFlash, TiCDC, and Node_exporter).
To view the alert rules of the TiDB cluster and TiFlash and how to handle them, see TiDB Cluster Alert Rules and TiFlash Alert Rules.
HTAP Troubleshooting
If you encounter problems while using TiDB, refer to the following documents:
- Analyze slow queries
- Identify queries that consume a large amount of system resources
- Troubleshoot hotspot issues in TiDB
- TiDB cluster troubleshooting
- TiFlash common issues
In addition, you can create an issue on GitHub Issues to report the problem, or submit your question on AskTUG.