MySQL advanced learning notes (4)
2022-07-02 23:56:00 【The Mutents】
I. Database design specifications
A poor database design can cause the following problems:
- Data redundancy, duplicated information, and wasted storage space
- Update, insertion, and deletion anomalies
- Inability to represent information correctly
- Loss of valid information
- Poor program performance
A good database design has the following advantages:
- It saves storage space
- It helps guarantee data integrity
- It makes database applications easier to develop
In short, table design deserves attention from the moment you start building a database. To end up with a well-structured database with little redundancy, certain rules must be followed during design.
1. Normal forms (Normal Form)
- First normal form (1NF)
First normal form mainly ensures that the value of every field is atomic; that is, each field value in a table is the smallest unit of data and cannot be split further.
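As a minimal sketch of the atomicity rule, the following example moves a comma-separated phone list into one row per value. The table and column names (`user_raw`, `users`, `user_phone`) and the sample data are hypothetical, and the SQL runs on SQLite through Python's `sqlite3` module only so that it is self-contained; MySQL syntax may differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Violates 1NF: `phones` packs several values into a single field (hypothetical data).
conn.execute("CREATE TABLE user_raw (user_id INTEGER PRIMARY KEY, name TEXT, phones TEXT)")
conn.executemany(
    "INSERT INTO user_raw VALUES (?, ?, ?)",
    [(1, "Alice", "555-0100,555-0101"), (2, "Bob", "555-0200")],
)

# 1NF version: every phone number is stored as its own atomic value.
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE user_phone ("
    " user_id INTEGER REFERENCES users(user_id),"
    " phone TEXT,"
    " PRIMARY KEY (user_id, phone))"
)

# Split each packed field and insert one row per phone number.
for user_id, name, phones in conn.execute("SELECT user_id, name, phones FROM user_raw").fetchall():
    conn.execute("INSERT INTO users VALUES (?, ?)", (user_id, name))
    for phone in phones.split(","):
        conn.execute("INSERT INTO user_phone VALUES (?, ?)", (user_id, phone.strip()))

phone_rows = conn.execute(
    "SELECT user_id, phone FROM user_phone ORDER BY user_id, phone"
).fetchall()
print(phone_rows)
```

With atomic values, a query such as "find the user who owns a given phone number" becomes a plain equality match instead of string parsing.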
- Second normal form (2NF)
Second normal form requires that, on top of satisfying first normal form, every record in the table is uniquely identifiable, and that every non-primary-key field depends on the whole primary key rather than on only part of it. In other words, once the values of all primary key attributes are known, any attribute value of the corresponding tuple (row) can be retrieved. (The primary key in this requirement can in fact be generalized to any candidate key.)
Put another way, 2NF requires that the attributes of an entity depend fully on the primary key. If a partial dependency exists, the attribute and the part of the key it depends on should be split out into a new entity, which has a one-to-many relationship with the original entity.
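The partial dependency described above can be illustrated with a small sketch. The helper `holds` and the enrollment data are hypothetical; the helper only checks whether a functional dependency is consistent with the sample rows, which is weaker than proving it for the schema, but it is enough to show why `student_name` belongs in its own table.

```python
def holds(rows, lhs, rhs):
    """Return True if the sample rows are consistent with the dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical table with the composite primary key (student_id, course_id).
rows = [
    {"student_id": 1, "course_id": "DB", "student_name": "Alice", "grade": 90},
    {"student_id": 1, "course_id": "OS", "student_name": "Alice", "grade": 85},
    {"student_id": 2, "course_id": "DB", "student_name": "Bob", "grade": 85},
]

# student_name is determined by student_id alone: a partial dependency on the key.
partial = holds(rows, ["student_id"], ["student_name"])
# grade needs the whole key; student_id alone does not determine it.
grade_needs_full_key = (
    holds(rows, ["student_id", "course_id"], ["grade"])
    and not holds(rows, ["student_id"], ["grade"])
)

# 2NF decomposition: move the partially dependent attribute into its own entity.
students = sorted({(r["student_id"], r["student_name"]) for r in rows})
grades = [(r["student_id"], r["course_id"], r["grade"]) for r in rows]

print(partial, grade_needs_full_key)
print(students)
```

After the split, each student name is stored once, and one student row relates to many grade rows.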
- Third normal form (3NF)
Third normal form builds on second normal form and ensures that every non-primary-key field depends directly on the primary key; that is, no non-primary-key field may depend on another non-primary-key field. (In other words, a non-prime attribute A must not depend on a non-prime attribute B that in turn depends on the primary key C, which would form the transitive chain C → B → A, where X → Y means "X determines Y".) Informally, the rule says that non-primary-key attributes must not depend on one another and must be mutually independent. Here too, the primary key can be generalized to any candidate key.
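A minimal sketch of removing a transitive dependency: instead of storing `dept_name` in the employee table (where it would depend on `dept_id`, which depends on `emp_id`), the department gets its own table. The schema and data are hypothetical, and SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (10, 'Sales'), (20, 'R&D');
INSERT INTO employee VALUES (1, 'Alice', 10), (2, 'Bob', 10), (3, 'Carol', 20);
""")

# Renaming a department touches exactly one row, however many employees it has.
cur = conn.execute("UPDATE department SET dept_name = 'Global Sales' WHERE dept_id = 10")
rows_changed = cur.rowcount

# The department name is recovered through a join when it is needed.
employee_view = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e
    JOIN department d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(rows_changed, employee_view)
```

Had `dept_name` been kept in `employee`, the same rename would have required updating every employee row of that department.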
- BCNF (Boyce-Codd Normal Form)
Boyce-Codd Normal Form (BCNF) was proposed as an improvement on 3NF. BCNF is regarded as adding no new design rule; it only tightens the requirements of third normal form so that the database carries even less redundancy. For this reason it is called the revised or extended third normal form rather than the fourth normal form.
If a relation is in third normal form and has only one candidate key, or every one of its candidate keys is a single attribute, then the relation is automatically in BCNF. Beyond 3NF, BCNF also eliminates partial and transitive dependencies of prime attributes on candidate keys.
Generally speaking, a database design that conforms to 3NF or BCNF is good enough.
- Fourth normal form (4NF)
The concept of multivalued dependency:
A multivalued dependency is a one-to-many relationship between attributes, written K→→A.
A functional dependency is really a single-valued dependency, so it cannot express a one-to-many relationship between attribute values.
Trivial multivalued dependency: the full attribute set is U=K+A, and one K can correspond to several A, i.e. K→→A. The whole table then holds a single one-to-many relationship.
Nontrivial multivalued dependency: the full attribute set is U=K+A+B, one K can correspond to several A and also to several B, and A and B are independent of each other, i.e. K→→A and K→→B. The table then holds several one-to-many relationships in which the "one" side is the same attribute set and the "many" sides are mutually independent attribute sets.
Fourth normal form, on top of satisfying BCNF, eliminates multivalued dependencies that are nontrivial and not functional dependencies (that is, it removes many-to-many relationships stored within the same table).
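The nontrivial case K→→A, K→→B can be sketched with plain Python sets. The teacher, course, and hobby data are hypothetical: courses and hobbies are independent of each other, so the single table has to store every combination, while the two projections store each fact once and can be joined back without loss.

```python
# Hypothetical table (teacher, course, hobby) with teacher ->> course and teacher ->> hobby.
rows = {
    ("Smith", "Math", "Chess"),
    ("Smith", "Math", "Tennis"),
    ("Smith", "Physics", "Chess"),
    ("Smith", "Physics", "Tennis"),
    ("Jones", "History", "Chess"),
}

# 4NF decomposition: one table per independent one-to-many relationship.
teacher_course = {(teacher, course) for teacher, course, _ in rows}
teacher_hobby = {(teacher, hobby) for teacher, _, hobby in rows}

# Joining the two projections on teacher reproduces the original table exactly.
rejoined = {
    (teacher, course, hobby)
    for teacher, course in teacher_course
    for other_teacher, hobby in teacher_hobby
    if teacher == other_teacher
}

print(sorted(teacher_course))
print(sorted(teacher_hobby))
print(rejoined == rows)
```

In the decomposed form, adding a new hobby for a teacher is a single row in `teacher_hobby`, rather than one row per course that teacher gives.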
- Fifth normal form and domain-key normal form
Beyond fourth normal form there are the more advanced fifth normal form (5NF, also called the perfect normal form) and domain-key normal form (DKNF).
Fifth normal form, on top of satisfying 4NF, eliminates join dependencies that are not implied by candidate keys. If every join dependency in a relation schema R is implied by the candidate keys of R, the schema is said to be in fifth normal form.
A functional dependency is a special case of a multivalued dependency, and a multivalued dependency is in turn a special case of a join dependency. Unlike functional and multivalued dependencies, however, a join dependency cannot be derived directly from semantics; it only shows up when relations are joined. A relation schema with join dependencies can still suffer from data redundancy and from insertion, update, and deletion anomalies.
Fifth normal form deals with the lossless-join problem. It has little practical significance, because the join dependencies it targets rarely occur and are hard to detect. Domain-key normal form attempts to define an ultimate normal form that takes every kind of dependency and constraint into account, but it has the least practical value of all and exists only in theoretical research.
- Advantages and disadvantages of normal forms
Advantages: normalizing data helps eliminate redundancy in the database. Third normal form (3NF) is usually considered the best balance between performance, scalability, and data integrity.
Disadvantages: normalization may reduce query efficiency. The higher the normal form, the more tables the design produces and the finer-grained they are; redundancy drops, but a query may have to join several tables, which is costly and can also defeat some indexing strategies.
Normal forms only set out design standards, and a real table design does not have to satisfy all of them. In practice we sometimes deliberately break the rules for the sake of performance and read efficiency (denormalization): adding a small amount of redundant or duplicated data improves read performance and reduces the number of table joins, trading space for time. Actual design work should therefore combine theory with practice and apply the rules flexibly.
2. Denormalization
Denormalization trades space for time to make queries faster, but it also brings new problems:
- More storage space is used
- When a field in one table is modified, the redundant copy in another table must be modified in sync, otherwise the data becomes inconsistent
- If stored procedures are used to carry out the extra update and delete operations, frequent updates consume significant system resources
- With small data volumes, denormalization shows no performance advantage and may only complicate the database design
We adopt denormalization when the redundant information is valuable or greatly improves query efficiency.
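The trade-off can be sketched as follows, assuming a hypothetical `orders` table that keeps a redundant copy of `customer_name`. The read path no longer needs a join, but every rename has to update both tables in one transaction. The function `rename_customer` and all names are illustrative, and SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    customer_name TEXT NOT NULL,  -- redundant copy kept for read performance
    amount        INTEGER NOT NULL
);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (100, 1, 'Alice', 30), (101, 1, 'Alice', 45), (102, 2, 'Bob', 20);
""")

# Normalized read path: a join is needed to show the customer name.
joined = conn.execute("""
    SELECT o.order_id, c.customer_name
    FROM orders o JOIN customer c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# Denormalized read path: the same result from a single table.
no_join = conn.execute(
    "SELECT order_id, customer_name FROM orders ORDER BY order_id"
).fetchall()


def rename_customer(conn, customer_id, new_name):
    """Update the source column and its redundant copy in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE customer SET customer_name = ? WHERE customer_id = ?",
            (new_name, customer_id),
        )
        conn.execute(
            "UPDATE orders SET customer_name = ? WHERE customer_id = ?",
            (new_name, customer_id),
        )


rename_customer(conn, 1, "Alicia")
after = conn.execute(
    "SELECT order_id, customer_name FROM orders ORDER BY order_id"
).fetchall()
print(joined == no_join, after)
```

The second `UPDATE` is the extra write cost of the redundant column; if it were skipped, the two tables would disagree about the customer's name.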
- Suggestions for adding redundant fields
A redundant field should be considered only when both of the following conditions hold:
- The redundant field does not need to be modified frequently
- The redundant field is indispensable to queries
Denormalization is also common in data warehouse design. Data warehouses usually store historical data, have weak real-time requirements for inserts, updates, and deletes, and have strong requirements for analyzing historical data. In this setting a suitable amount of redundancy is acceptable and makes analysis easier.
Differences between a data warehouse and a database in use:
- A database is designed to capture data; a data warehouse is designed to analyze data
- A database has strong real-time requirements for inserts, updates, and deletes and stores online user data; a data warehouse stores historical data
- Database design avoids redundancy as far as possible, although some redundancy is allowed to improve query efficiency; data warehouse design leans toward denormalization
3. ER model
The ER model, also called the entity-relationship model, is a data model that describes real-world things, their properties, and the relationships between them. In the design phase of a database-backed information system, the ER model is commonly used to describe information requirements and characteristics; it helps clarify the business logic and so leads to a better database design.
The ER model has three elements: entities, attributes, and relationships.
Once the ER model has clarified the business logic, it is converted into concrete tables according to the following rules:
- (1) An entity is usually converted into a table;
- (2) A many-to-many relationship is usually converted into a table as well;
- (3) A one-to-one or one-to-many relationship is usually expressed with a foreign key rather than a new table;
- (4) An attribute is converted into a field of the table.
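The four conversion rules can be sketched on a small hypothetical model: entities `school_class`, `student`, and `course`; a one-to-many relationship from class to student; and a many-to-many relationship between student and course. All names and data are illustrative, and SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Rule 1: each entity becomes a table; rule 4: attributes become fields.
CREATE TABLE school_class (
    class_id   INTEGER PRIMARY KEY,
    class_name TEXT NOT NULL
);
-- Rule 3: the one-to-many relationship (class to students) is a foreign key.
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL,
    class_id     INTEGER NOT NULL REFERENCES school_class(class_id)
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
-- Rule 2: the many-to-many relationship (students to courses) gets its own table.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO school_class VALUES (1, 'Class A');
INSERT INTO student VALUES (1, 'Alice', 1), (2, 'Bob', 1);
INSERT INTO course VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)

# Resolve the many-to-many relationship through the junction table.
enrollments = conn.execute("""
    SELECT s.student_name, c.title
    FROM enrollment e
    JOIN student s ON s.student_id = e.student_id
    JOIN course c ON c.course_id = e.course_id
    ORDER BY s.student_id, c.course_id
""").fetchall()
print(tables)
print(enrollments)
```

Three entities and one many-to-many relationship yield four tables, while the one-to-many relationship adds only the `class_id` column to `student`.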
The PowerDesigner tool can be used to draw conceptual and physical data models, convert between the two, and export a physical data model as an executable SQL script.
4. Design principles for tables
- The "three fewer, one more" principle
The fewer tables, the better
The core of an RDBMS is the definition of entities and relationships, i.e. the E-R diagram (Entity Relationship Diagram). Fewer tables mean a simpler design of entities and relationships that is easier to understand and work with.
The fewer fields in a table, the better
The more fields a table has, the more likely it is to contain redundant data. Keeping the number of fields small presupposes that the fields are independent of one another, rather than one field's value being computable from others. "Few" is of course relative; we usually balance data redundancy against retrieval efficiency.
The fewer fields in a composite primary key, the better
A primary key is defined to guarantee uniqueness. When a single field cannot guarantee it, a composite primary key (a primary key defined over several fields) is needed. The more fields a composite primary key has, the larger its index; this makes the table harder to understand and increases both run time and index space, so a composite primary key should contain as few fields as possible.
The more primary and foreign keys are used, the better
Database design is essentially the definition of tables and of the relationships between fields. The more such relationships exist, the lower the redundancy between entities and the higher the reuse. This keeps tables independent of one another while making better use of the associations between them.
The core of the "three fewer, one more" principle is simplicity and reusability. Simplicity means completing the design with fewer tables, fewer fields, and fewer composite-primary-key fields. Reusability means using primary and foreign keys to increase reuse across tables: a primary key can be seen as the representative of its table, so the more keys are defined, the more the tables are reused by one another.