A look at row storage and column storage in databases
2022-06-23 18:26:00 【Gauss squirrel Club】
Contents
Comparison of advantages and disadvantages
Row storage and column storage experiments
When many people first learn about databases, they start with relational databases, where data is stored in tables and each row represents one record. This is typical row storage (row-based store): tables are stored on disk partitions row by row.
Some databases also support column storage (column-based store), which stores tables on disk partitions column by column.
Comparison of storage methods
The figure below shows the difference between the two:

As the figure shows, with row storage the attribute values of one record are stored in adjacent space, followed by the attribute values of the next record.
With column storage, all values of a single attribute are stored in adjacent space. In other words, the data of one column is stored contiguously, and each attribute has its own space.
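The two layouts can be sketched in a few lines of Python. This is only an illustration of the ordering of values, not how any database actually lays out pages; the three-column table (id, name, age) and both function names are made up for the example.

```python
# Illustrative only: a three-row table with hypothetical columns (id, name, age).
rows = [
    (1, "Ann", 30),
    (2, "Bob", 25),
    (3, "Cid", 41),
]

def row_layout(rows):
    """Row storage: all attribute values of one record sit next to each other."""
    return [value for row in rows for value in row]

def column_layout(rows):
    """Column storage: all values of one attribute sit next to each other."""
    return [value for column in zip(*rows) for value in column]

print(row_layout(rows))     # [1, 'Ann', 30, 2, 'Bob', 25, 3, 'Cid', 41]
print(column_layout(rows))  # [1, 2, 3, 'Ann', 'Bob', 'Cid', 30, 25, 41]
```

The same nine values appear in both lists; only the order differs, and that order decides which values end up next to each other on disk.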
At this point, consider: which of the two is better suited to queries, and which to modifications?
Comparison on data writing:
1) A write to a row store completes in one operation. Writes go through the operating system's file system, which lets the write succeed or fail as a whole, so the integrity of the data can be determined.
2) A column store has to split each record into individual columns before saving, so it performs significantly more writes than a row store. Add the time the disk head spends moving and positioning, and the actual cost is greater still. Row storage therefore has a large advantage for writes.
3) Data modification is in effect also a write, so row storage has the advantage there as well.
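The write comparison above can be illustrated with a toy model that counts how many separate storage locations one new record touches. It is a sketch under simplifying assumptions: plain Python lists stand in for on-disk segments, the function names are invented, and real engines buffer and batch writes, so the count reflects logical write locations only.

```python
def append_row_store(segment, record):
    """Append one record to a row-store segment; return the number of separate writes."""
    segment.extend(record)      # the whole record lands in one contiguous spot
    return 1

def append_column_store(segments, record):
    """Append one record to a column store that keeps one segment per column."""
    writes = 0
    for segment, value in zip(segments, record):
        segment.append(value)   # each value goes to a different segment
        writes += 1
    return writes

row_segment = []
column_segments = [[], [], []]  # hypothetical columns: id, name, age
record = (1, "Ann", 30)

print(append_row_store(row_segment, record))         # 1
print(append_column_store(column_segments, record))  # 3
```

The column-store count grows with the number of columns, which is the reason the text gives for row storage being cheaper for inserts and updates.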
Comparison on data reading:
1) A row store usually reads an entire row. If only a few columns are needed, the redundant columns are read as well. To shorten processing time, these redundant columns are usually eliminated in memory.
2) A column store reads part or all of a column at a time, so there is no redundancy. The data being searched is stored contiguously, which makes column storage especially suitable for projection.
3) Data distribution in the two layouts. Every value in a stored column has the same type, so there is no ambiguity. For example, if a column's data type is integer (int), its data set must consist of integers, which makes parsing very easy. Row storage is much more complicated by comparison: one record holds values of several types, so parsing has to switch between data types frequently. This consumes a lot of CPU and increases parsing time. The parsing process of column storage is therefore better suited to analyzing big data.
4) Data compression for better read performance. Data in the same column has a consistent type, so column storage is well suited to compression, and different columns can use different compression algorithms. Compressed storage improves I/O performance.
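Points 2 and 4 can be made concrete with a small sketch. It assumes a hypothetical two-column table (id, status) and uses run-length encoding purely as an easy-to-follow stand-in for compression; the article does not say which algorithms a given database uses.

```python
from itertools import groupby

# Hypothetical table: (id, status); status has long runs of repeated values.
rows = [(i, "active" if i < 6 else "closed") for i in range(10)]

def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

# Row layout: ids and statuses are interleaved.
row_major = [value for row in rows for value in row]
# Column layout: one homogeneous list per column.
ids, statuses = (list(column) for column in zip(*rows))

# Projection of `status`: how many stored values must be scanned.
print(len(row_major))   # 20, every value of every row
print(len(statuses))    # 10, only the requested column

# Compression: runs survive in the column layout but not in the row layout.
print(run_length_encode(statuses))        # [('active', 6), ('closed', 4)]
print(len(run_length_encode(row_major)))  # 20, no two neighbours are equal
```

In the column layout the status values collapse into two runs, while the interleaved row layout offers nothing to collapse. Other columns, such as the unique ids here, would need a different encoding, which matches the remark that different columns can use different compression algorithms.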
Comparison of advantages and disadvantages
Choosing the storage type of a table is the first step in table definition design, and the customer's business type is the main factor that determines it. Row and column storage models each have advantages and disadvantages, so choose according to the actual situation.
The table below compares the advantages, disadvantages, and applicable scenarios of row and column storage:
| | Row storage | Column storage |
|---|---|---|
| Advantages | Data is kept together. INSERT/UPDATE is easy. | |
| Disadvantages | A selection reads all of the data, even if only a few columns are involved. | |
| Applicable scenarios | | |
Row storage and column storage experiments
openGauss supports hybrid row-column storage, and you can specify the storage method when creating a table. Let's run an experiment.
Experimental environment: Huawei Cloud server + openGauss Enterprise Edition 3.0.0 + openEuler 20.03
Create a row-store table custom1 and a column-store table custom2, and insert 500,000 records into each.

    openGauss=# create table custom1 (id integer,name varchar2(20));
    CREATE TABLE
    openGauss=# create table custom2 (id integer,name varchar2(20)) with (orientation = column);
    CREATE TABLE
    openGauss=# insert into custom1 select n,'testtt'||n from generate_series(1,500000) n;
    INSERT 0 500000
    openGauss=# insert into custom2 select * from custom1;
    INSERT 0 500000

Now look at the storage space of the two tables and compare the Size column. The column-store table takes far less space than the row-store table: 3104 kB versus 24 MB, roughly one-eighth.

    openGauss=# \d+
                                          List of relations
     Schema |  Name   | Type  | Owner |  Size   |               Storage                | Description
    --------+---------+-------+-------+---------+--------------------------------------+-------------
     public | custom1 | table | omm   | 24 MB   | {orientation=row,compression=no}     |
     public | custom2 | table | omm   | 3104 kB | {orientation=column,compression=low} |

Compare the time needed to insert one new record. The column-store table is slightly slower (0.207 ms versus 0.135 ms).

    openGauss=# explain analyze insert into custom1 values(1,'zhang3');
                                              QUERY PLAN
    -----------------------------------------------------------------------------------------------
     [Bypass]
     Insert on custom1  (cost=0.00..0.01 rows=1 width=0) (actual time=0.059..0.060 rows=1 loops=1)
       ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
     Total runtime: 0.135 ms
    (4 rows)

    openGauss=# explain analyze insert into custom2 values(1,'zhang3');
                                              QUERY PLAN
    -----------------------------------------------------------------------------------------------
     Insert on custom2  (cost=0.00..0.01 rows=1 width=0) (actual time=0.119..0.120 rows=1 loops=1)
       ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
     Total runtime: 0.207 ms
    (3 rows)

Finally, delete the test tables.

    openGauss=# drop table custom1;
    DROP TABLE
    openGauss=# drop table custom2;
    DROP TABLE

Interested readers can test more scenarios on their own, for example creating large, wide tables or updating tables.
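For readers who want the two comparisons as numbers, the arithmetic on the figures reported in the session output can be sketched as follows. The variable names are invented, and "24 MB" is a rounded display value, so the size ratio is approximate. A single-row insert timing is also a very small sample.

```python
# Figures copied from the session output; "24 MB" is a rounded display value.
row_size_kb = 24 * 1024
column_size_kb = 3104
row_insert_ms = 0.135
column_insert_ms = 0.207

size_ratio = column_size_kb / row_size_kb
slowdown = column_insert_ms / row_insert_ms

print(f"column/row size ratio: {size_ratio:.3f}")         # 0.126
print(f"column/row insert time ratio: {slowdown:.2f}")    # 1.53
```

On these numbers the column-store table occupies about 13% of the row-store table's space, somewhere between one-seventh and one-eighth, while the single-row insert takes about one and a half times as long.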
Selection recommendations
- Update frequency: if the data is updated frequently, choose a row-store table.
- Insertion frequency: for frequent small inserts, choose a row-store table. For inserting a large amount of data at one time, choose a column-store table.
- Number of columns in the table: in general, if the table has many columns (a wide table) and queries involve only a few of them, column storage is suitable. If the table has few columns and queries touch most of them, row storage is the better choice.
- Number of columns queried: if each query touches only a few columns of the table (fewer than 50% of the total), choose a column-store table. (Don't ask what the remaining columns are for. If the client says they are useful, they are useful.)
- Compression ratio: column-store tables compress better than row-store tables, but a higher compression ratio consumes more CPU resources.
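The rules of thumb above can be written down as a small helper, purely as a reading aid. The function name, its parameters, and especially the priority order (write pattern first, then query width) are assumptions of this sketch; the list itself does not rank the criteria, and it omits the bulk-insert and compression considerations.

```python
def suggest_orientation(frequent_updates, frequent_small_inserts,
                        total_columns, columns_per_query):
    """Return 'row' or 'column' following the rules of thumb in the list.

    The priority order (write pattern first, then query width) is an
    assumption of this sketch, not something the article specifies.
    """
    if frequent_updates or frequent_small_inserts:
        return "row"
    if columns_per_query / total_columns < 0.5:
        return "column"
    return "row"

# A wide, read-mostly table where queries touch 5 of 40 columns.
print(suggest_orientation(False, False, 40, 5))   # column
# The same table, but updated frequently.
print(suggest_orientation(True, False, 40, 5))    # row
# A narrow table where queries read every column.
print(suggest_orientation(False, False, 4, 4))    # row
```

A real decision would also weigh data volume, load pattern, and the constraints of the specific database, so treat the output as a starting point rather than a verdict.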
Notes
Because of its special storage method, column storage comes with many constraints. For example, column-store tables do not support arrays, generated columns, global temporary tables, or foreign keys, and they support fewer data types than row-store tables. Consult the documentation of the database in question for details.