当前位置:网站首页>Chapter 6 relational database theory
Chapter 6 relational database theory
2022-07-05 06:16:00 【Xinghai can sink fish】
Catalog
🥧6.2 Normalization
6.2.1 Function dependency
6.2.2 code
🥩🥩6.2.3 normal form ( Key content )
6.2.4 2NF
🥯🥯6.2.5 3NF
6.2.6 BCNF
6.2.7 Multivalued dependency
6.2.8 4NF
6.2.9 Standardized summary
ps:
The part about theory , Mainly to introduce How to create a good ( High operational efficiency 、 Low data redundancy ) database .
6.1 Problem presentation
1. Representation of relational patterns
The relationship model consists of five parts , It's a quintuple :R(U,D,DOM,F).
(1) Relationship name R Is a symbolic tuple semantics .
(2)U For a set of properties .
(3)D For attribute group U The domain from which the attribute in comes .
(4)DOM For attribute to domain mapping .
(5)F For attribute group U A group on the Data dependency .
ps:
R Is the name of the relationship , For example, in the past Student( ...... )、Course( ...... )、SC( ...... ) etc. .
U:SC( U、D......), This U It stands for :Sno、Cno、Grade These attributes .
D: It stands for Value range , for instance Sno The range of phi is zero Integers wait .
DOM: Specific to which value to take , for instance Sno=100, So it's called 100 yes Sno Attribute to integer value range mapping .
【 explain 】
- because D、DOM It has little to do with pattern design , Therefore, in this chapter, the relational schema is regarded as a triple :R<U、F>.
- If and only if U A relationship on r Satisfy F when ,r It's called the relational model R <U、F> A relationship of .
- As a 2D table ( A table with rows and columns ), A relationship must meet one of the most basic conditions : Each component must be an indivisible data item . The relationship pattern that meets this condition belongs to First normal form (1NF). The one on the left meets the requirements , yes 1NF; The one on the right doesn't meet the requirements , No 1NF( To be converted into 1NF You can put Remove the price , Number 、 Increase the unit price —— Understand everything ).
2. Data dependency (F)
Data dependency It's a Inside the relationship Between attributes A constraint relationship , It's through Between attributes The value is equal to no Reflected Data are interrelated .
ps:
Know the student number Sno, Students' names can be introduced , namely Sno->Sname;
know Cno, Can be launched Cname.
Sname and Cname Depend on Sno and Cno.
The main types of numerical dependencies :
Function dependency ( abbreviation FD)、 Multivalued dependency ( abbreviation MVD)
3. The embodiment of functional dependence in real life
【 example 】 Describe a student relationship , You can have a student ID 、 full name 、 Attributes such as family names .
A student number corresponds to only one student , A student can only study in one department ,“ Student number ” after , The value of the student's name and department is uniquely determined .
【 example 】 Create a description School educational administration The database of . The objects involved include : Student's student number (Sno)、 Department (Sdept)、 The name of the dean (Mname)、 Course no. (Cno)、 achievement (Grade).
Suppose the database mode of school educational administration Use a single relationship pattern Student To express , Then the attribute set of the relationship pattern is :
U={ Sno,Sdept,Mname,Cno,Grade } .
Known facts of the real world ( semantics ):
- There are several students in a department , But a student belongs to only one department ;
- There is only one dean in a department ;
- A student can take multiple courses , There are multiple students taking each course ;
- Every student has a grade in every course .
From this, we can get the attribute group U A set of functional dependencies on F:
F={ Sno->Sdept,Sdept->Mname,(Sno,Cno) ->Grade }
4. Problems with functional dependencies
Relationship model Student<U,F> Problems in :
(1) data redundancy
Waste a lot of storage space , The name of each Dean appears repeatedly , The number of repetitions is the same as the number of occurrences of all course scores of all students in the Department .
(2) Update exception
data redundancy , When updating data , Maintaining data integrity is costly . for instance After changing the dean of a department , Each tuple related to the students of the Department must be modified , Otherwise, data inconsistency exceptions will occur .
(3) Insertion exception
If a department has just been established , There are no students , The information of the Department and its Dean cannot be stored in the database .
(4) Delete exception
If all the students in a department graduate , While deleting the student information of the Department , The information about the Department and its Dean has also been lost .
【 explain 】
- Student The relational model is not a good model . One “ good ” There should be no insert exception in the mode of 、 Delete exception and update exception , Data redundancy should be minimized .
- The reasons for the above problems It's because of something in the pattern Data dependency Caused by the .
- The solution is to use Normalization theory Transforming the relationship model To eliminate inappropriate data dependencies .
5. The solution of functional dependency
Divide this single model into three relational models :
S(Sno,Sdept,Sno->Sdept);
SC(Sno,Cno,Grade,(Sno,Cno)->Grade);
DEPT(Sdept,Mname,Sdept->Mname);
ps:
The above three are relationship patterns R(U,F) Specific examples of .
When we were building the database , Be sure to build a separate table for one kind of thing , Don't let many things get mixed up in one table .
None of these three modes will have insertion exceptions 、 Delete the problem of exception , The redundancy of data is also controlled .
6.2 Normalization
ps:
Because the database was not designed well , There are various functional dependencies , Resulting in large data redundancy , Insertion exception 、 Delete exception 、 Update exception ; Let's start with How to standardize these bad databases .
6.2.1 Function dependency
【 Definition 1】 set up R(U) It's the property set U The pattern of relationships on ,X、Y yes U Subset . If for R(U) Any possible relationship of r,r There can't be two tuples in X Property values on are equal , And in the Y The attribute values on are not equal , said “ X The function determines Y ” or “ Y Function depends on ” X, Write it down as X->Y.
ps:
It can be understood in this way :
X Express Sno,Y Express Sname, Relationship r Means a sheet of Student surface ;
Sno->Sname, Given a student number 100, The result is Zhang San ;
And can't be Now give one 100, The result is Zhang San , Give another student number 100, The result is Li Si .
In a nutshell , Given a Student number , You can uniquely identify a name , This is called functional dependency ( Student number -> full name ).
【 example 】 The previous example Student(Sno,Sname,Ssex,Sage,Sdept).
Suppose duplicate names are not allowed , Then there are :
Sno -> Ssex,Sno -> Sage,Sno -> Sdept,Sno <--> Sname.
Sname -> Ssex,Sname -> Sage,Sname -> Sdept.
but Ssex Undetermined Sage ,Ssex Undetermined Sdept.
ps:
Generally speaking , Student number and other numbers ( Main code ) It must be certain Other attributes in the relationship .
Some terms and symbols :
- X -> Y, but Y⊊X said X -> Y yes Nontrivial functional dependencies .
- X -> Y, but Y⊆X said X -> Y yes Trivial functional dependencies .
- if X -> Y, be X The set of decision attributes called this functional dependency , Also known as determinants .
- if X -> Y, also Y -> X, Remember as X <--> Y.
- if Y No, the function depends on X, Remember as .
ps:
For any relationship pattern , Trivial functional dependencies are inevitable , Unless otherwise stated , We always discuss Dependence of nontrivial functions .
for instance :
for instance Sno and Cno yes Grade The determinants of .
【 Definition 2】 stay R(U) in , If X -> Y, And for the X Any true subset of X', There are X' Y, said Y Yes X Completely function dependent , Write it down as ;
if X->Y, but Y Not entirely dependent on X, said Y Yes X Part of the function depends on , Write it down as .
【 example 】 In relation SC(Sno,Cno,Grade) in ,
because :SnoGrade,CnoGrade,
therefore ,(Sno,Cno)Grade
ps:
Only when the two are combined can they be uniquely determined , namely Give an attribute or attribute group , Only one record can be identified Of It is called complete functional dependency ..
(Sno,Cno)Sno
(Sno,Cno)Cno
ps:
As long as there is one that can be pushed out .
【 Definition 3】 stay R(U) in , If X->Y(Y⊊X),YX,Y->Z(Z⊊Y), said Z Yes X The transfer function depends on , Write it down as .
【 Be careful 】 If Y->X, namely X<-->Y, be Z Depend directly on X, Instead of transferring functional dependencies .
【 example 】 In relation Std(Sno,Sdept,Mname) in , Yes Sno->Sdept,Sdept->Mname,Mname Delivery depends on Sno.
6.2.2 code
ps:
code There are different names in different books , Some are called code , Some are called keyword , Some are called key .
【 Definition 4】 set up K by R<U,F> Medium Attribute or combination of attributes . if KU, be K be called R A candidate code for .
If U Some functions depend on K, namely KU, be K Called super code . The candidate code is the smallest hypercode , namely K Any true subset of is not a candidate code .
If the relational pattern R There are multiple candidate codes , Then select one of them as the main code .
ps:
for instance Sno It's a student form Student Code of ,(Sno,Cno) It's a course schedule SC Code of ;
Studnet Inside the watch ( The student's name is not the same ), Yes Sno,Sname,Ssex...... that Sname and Sno These two attributes can be called Student Candidate code of the table ;
Any candidate code can be regarded as a master code , There can only be one main code in a table ;
Simply speaking Hypercode It is based on the main code or candidate code And then expanded , Add other attributes .
Primary and non primary
Main attribute : Included in any one Candidate code Properties in .
ps:
Above mentioned Student surface Sno There's only one property , Then it is the main attribute ;SC Inside the watch (Sno,Cno) Code has two properties , that Sno,Cno These two attributes are called primary attributes .
Non primary property : Attributes that are not included in any candidate code .
ps:
In addition to the main attribute attribute , Others are called non primary attributes .
Full size : The entire attribute group is code , Called full size .
ps:
The whole attribute is combined , To determine a record is called full code .
【 example 】S(Sno,Sdept,Sage), Single attribute Sno Code .Sno It's the main attribute ,Sdept、Sage Is a non primary property .
SC(Sno,Cno,Grade) in ,(Sno,Cno) Code . Sno、Cno It's the main attribute ,Grade Yes no main genus
sex .
【 example 】 In the relational model R(P,W,A)
P: performer ,W: works ,A: Audience
One player can play multiple works , A piece of music can be played by multiple players , Listeners can enjoy different works of different performers .
Code for (P,W,A), Only when the three attributes are combined can the only record be determined , Full code .
【 Definition 5】 Relationship model R Attribute or attribute group in X Is not R Code of , but X Is the code of another relational pattern , said X yes R External code of , Also known as outer code .
【 example 】SC(Sno,Cno,Grade) in ,Sno Not a code ,Sno yes S(Sno,Sdept,Sage) Code of , be Sno yes SC The outer code of .
The main code and outer code together provide a means to express the relationship between relationships .
ps:
In the 3 Chapter SQL At the time of statement , When connecting , Yes SC.Sno=S.Sno, It is equivalent to making a natural connection , Pass the two tables through Sno Connect .
6.2.3 normal form ( Key content )
ps:
Simply speaking , The paradigm is to standardize the database , Different paradigms have different restrictions and requirements on the functional dependence of databases .
A paradigm is a collection of relational patterns that conform to a certain level . Relationships in relational databases Must meet certain requirements , Different paradigms meet different requirements .
ps:
BC Paradigm is an extended version of the third paradigm , here we are BC The paradigm database is optimized .
The connection between various paradigms :
A relationship pattern R For the first time n normal form , It can be abbreviated as :R∈nNF.
Normalization : It refers to a relational model of a lower paradigm Through pattern decomposition It can be converted into several A collection of relational patterns of a higher-level paradigm , This process is called Normalization .
6.2.4 2NF
【 Definition 6】 If the relational pattern R∈1NF, also Every non primary attribute is completely functionally dependent on any candidate code , be R∈2NF.
ps:
That is to say Given the value of a candidate code , You can deduce a non primary attribute .
【 example 】S-L-C(Sno,Sdept,Sloc,Cno,Grade),Sloc For the student's residence , And the students of each department live in the same place .S-L-C The code of is (Sno,Cno). Then the functional dependency has :
(Sno,Cno) Grade
Sno -> Sdept,(Sno,Cno) Sdept
Sno -> Sloc, (Sno,Cno) Sloc
Sdept - > Sloc
【 explain 】
- The dotted line in the picture shows Part of the function depends on .
- Non primary property Sdept、Sloc It doesn't depend entirely on the code .
- Relationship model S-L-C Do not belong to 2NF.
A relational pattern does not belong to 2NF, The following questions will arise :
(1) Insertion exception
If you insert a new student , But the student did not choose a course , That is, this student has no Cno, Because when inserting tuples , Must give Code value (Sno,Cno), So insert failed .
(2) Delete exception
If S4 Only one course C3, Now he is not taking this course , Delete C3 after , Other information about the entire tuple is also deleted .
(3) The modification is complicated
If a student takes more than one course , be Sdept,Sloc Stored many times . If the province is transferred , You need to modify all relevant Sdept and Sloc, Complicate the modification .
The reason for this problem :
Example classes have two types of non primary properties : One kind is like Grade, It's completely functional dependent on the code , The other is Sdept、Sloc, They are not completely functional dependent on the code .
resolvent :
Use projective decomposition to decompose relational patterns S-L-C Break it down into two relational patterns .
S-L-C(Sno,Sdept,Sloc,Cno,Grade) Decompose into :
SC(Sno,Cno,Grade).
S-L(Sno,Sdept,Sloc).
ps:
S-L You can't Just break down Sdept and Sloc( Partially functionally dependent ), Otherwise, there is no way and SC Connected to , So take out Sno .
SC The code of is (Sno,Cno),SL The code of is Sno, In this way, the non primary attribute is completely functionally dependent on the code .
6.2.5 3NF
【 Definition 7】 Set the relational pattern R<U,F>∈1NF, if R There is no such code in X、 Attribute group Y And non attribute groups Z(Z⊊Y), bring X->Y,Y->Z establish ,Y Z Don't set up , said R<U,F>∈3NF.
【 example 】SC There is no transitive dependency , therefore SC∈3NF.
S-L in Sno->Sdept(SdeptSno),Sdept->Sloc,
Available SnoSloc.
The solution is to S-L(Sno,Sdept,Sloc) Decompose into :
S-D(Sno,Sdept)∈3NF.
D-L(Sdept,Sloc)∈3NF.
6.2.6 BCNF
BCNF from Boyce and Codd Put forward , Than 2NF A step closer , But not yet 4NF The point of . It is commonly believed BCNF It's the revised third paradigm , Sometimes called the extended third normal form .
【 Definition 8】 Set the relational pattern R<U,F>∈1NF, if X->Y And Y⊊ when X Must contain code , said R<U,F>∈BCNF.
In other words , In relation mode R<U,F> in , If every one Determine the attribute set All contain candidate codes , be R<U,F>∈BCNF.
BCNF The nature of the relational pattern of :
- All non primary attributes all Completely dependent on On Each candidate code .
- All primary properties Are completely dependent on Each candidate code that does not contain it .
- No attributes Depend entirely on Any set of attributes of non code .
If all the schemas of a relational database belong to BCNF, Then in the scope of functional dependency , It has implemented a thorough decomposition of patterns , To the highest degree of Standardization , Eliminate insert exception and delete exception .
【 example 】 Look at relational patterns C(Cno,Cname,Pcno), It has only one code Cno, There is no attribute to Cno Partial dependence or transitive dependence , therefore C∈3NF. meanwhile C in Cno Is the only determinant , therefore C∈BCNF.
【 example 】 Relationship model S(Sno,Sname,Sdept,Sage), Assume Sname Also unique , that S Just two yards , These two codes are composed of a single attribute , They don't intersect . Other attributes have no transitive or partial dependency on the code , therefore S∈3NF. meanwhile S Middle Division Sno、Sname There are no other determinants , therefore S∈BCNF.
【 example 】 Relationship model SJP(S,J,P) in ,S It's the students ,J Indicates the course ,P Indicates rank . Each student has a certain ranking in the results of each course , There is only one student in each class ( That is, there is no parallel ranking ).
Functional dependencies can be obtained from semantics :(S,J) -> P ;(J,P)->S, therefore (S,J) And (J,P) Can be used as a candidate code .
In relational mode There is no attribute Rely on code transmission or partial dependence , so SJP∈3NF.
except (S,J) And (J,P) There are no other determinants , therefore SJP∈BCNF.
【 example 】 Relationship model STJ(S,T,J) in ,S Students ,T It means that the teacher ,J Indicates the course . Each teacher teaches only one course . There are several teachers in each school , A student chooses a course , It corresponds to a fixed teacher .
Functional dependencies can be obtained from semantics :(S,J)->T;(S,T)->J;T->J.
Because there are no non primary attributes Right code Transitive dependency or partial dependency , therefore STJ∈3NF.
because T Is the determining factor , and T Does not contain code , therefore STJ∉BCNF.
【 explain 】
(1) For not BCNF Relationship model of , There are still inappropriate places . Not BCNF The relationship pattern can also be decomposed into BCNF. for example ,STJ Can be broken down into ST(S,T) And TJ(T,J), They are all BCNF.
(2)3NF and BCNF Under the condition of functional dependence What schema decomposition can achieve A measure of the degree of separation .
A relational pattern in a pattern If all belong to BCNF, So in the context of functional dependencies , It has achieved complete separation , To eliminate the exceptions of insertion and deletion .
3NF Of “ Not completely ” Sex is manifested in Possible The main attribute depends on the code partially and transitively .
6.2.7 Multivalued dependency
ps:
Multivalued dependency refer to One to many functional relationship ; What is written above is One to one functional relationship .
【 example 】 A certain course in the school is taught by multiple teachers , They use the same set of reference books . Each teacher can also teach multiple courses , Each reference book can be used for multiple courses .
Use relational patterns Teaching(C,T,B) To represent the course C、 Teachers' T And reference books B The relationship between . The denormalization relationship is as follows :
Teaching With unique candidate code (C,T,B), Full code .Teaching∈BCNF .
The above relationship has the following problems :
- Large data redundancy : How many teachers are there , How many times do reference books have to be stored .
- Increase the complexity of operation : When a teacher is added to a course , How many reference books are there in this course , How many tuples must be inserted .
- Delete operation is complex : A reference book should be removed from a course , How many teachers are there in this course , How many tuples must be deleted .
- The modification operation is complex : A reference book needs to be revised for a certain course , How many teachers are there in this course , How many tuples must be modified .
【 Definition 9】 set up R(U) It's the property set U A relationship model on .X、Y、Z yes U Subset , also Z=U-X-Y. Relationship model R(U) Multivalued dependencies in X->->Y establish , If and only if Yes R(U) Any relationship between r, A given pair (x,z) value , There is a group Y Value , This set of values only depends on x It's worth it with z Value independent .
【 example 】 Relationship Teaching(C,T,B) in , about C Every value of ,T There is a set of values corresponding to , Regardless of B What's the value . therefore T Multivalued depends on C, namely C->->T.
ps:
【 Trivial multivalued dependencies 】 if X->->Y, and Z=∅, namely Z It's empty , said X->->Y For trivial multivalued dependencies , Otherwise, it is called X->->Y For nontrivial multivalued dependencies .
【 example 】 Relationship model WSC(W,S,C) in ,W Indicates warehouse ,S It means the storekeeper ,C Represent goods . Suppose there are several custodians in each warehouse , There are several kinds of goods . Each storekeeper keeps all the goods in his warehouse , Each commodity is kept by all custodians .
According to semantics, for W Every value of Wi,S There is a complete set corresponding to it without asking C What's the value . therefore W->->S.
Use the figure below to show WSC Relationship :
【 explain 】
- Corresponding W A certain value of Wi All S Value recorded as {S}wi( Indicates all the custodians working in this warehouse ).
- All C Value recorded as {C}wi( Represents all goods stored in this warehouse ).
- Ought to have {S}wi Each value in and {C}wi Each of them C Value correspondence .
- therefore {S}wi And {C}wi It just forms a complete bipartite graph , so W->->S.
- because C And S Complete symmetry of , There must be W->->C establish .
Properties of multivalued dependencies :
(1) Multivalued dependencies have symmetry : if X->->Y, be X->->Z, among Z=U-X-Y.
It is easy to see from the above example , Because every cleaner keeps all the goods , At the same time, each commodity is kept by all custodians , Obviously if W->->S, There must be W->->C.
(2) Multivalued dependencies are transitive : if X->->Y,Y->->Z, be X->->Z-Y.
(3) Functional dependency is a special case of multivalued dependency : if X->Y, be X->->Y.
(4) if X->->Y,X->->Z, be X->->Y∪Z.
(5) if X->->Y,X->->Z, be X->->Y∩Z.
(6) if X->->Y,X->->Z, be X->->Y-Z,X->->Z-Y.
6.2.8 4NF
【 Definition 10】 Relationship model R<U,F>∈1NF, If for R For each nontrivial multivalued dependency X->->Y(X⊊Y),X They all contain codes , be R<U,F>∈4NF.
【 explain 】
- 4NF It's about limiting relational patterns Between attributes... Is not allowed Nontrivial and non functionally dependent Multivalued dependency .4NF Permitted Nontrivial multivalued dependencies It's actually functional dependency .
- If a relationship pattern is 4NF, Then it must be BCNF.
- In relation WSC in ,W->->S,W->->C, They are all nontrivial multivalued dependencies . and W Not a code , Relationship model WSC The code is (W,S,C), Full code , therefore WSC∉4NF. You can put WSC Decompose into WS(W,S),WC(W,C),WS∈4NF,WC∈4NF.
6.2.9 Standardized summary
(1) In a relational database , The basic requirement of relational model is to meet the first paradigm .
(2) The relationship of too low standardization mode It may not be a good description of the real world . There may be an insertion exception 、 Delete exception 、 Modify exception 、 Data redundancy and so on , The solution is to standardize it , Convert to high-level paradigm .
(3) A relational model of a lower paradigm , Through schema decomposition, it can be transformed into a set of relational schemas of several higher-level paradigms , This process is called the normalization of relational patterns .
(4) The normalization theory of relational database is a tool for database logic design .
(5) Standardization thought
- Gradually eliminate inappropriate parts of data dependence , It is the relationship mode in the mode that reaches a certain degree “ Separate ”.
- use “ One thing at a time ” Pattern design principles . Let a relationship describe a concept 、 An entity or a connection between entities . If there is more than one concept, put it “ Separate ” get out . Therefore, standardization is essentially the simplification of concepts .
(6) It can't be said that the more standardized the relationship model is, the better . It is necessary to further analyze the actual situation of the real world and the application needs of users , Identify a suitable 、 A model that reflects the real world .
ps:
The process of Standardization :
边栏推荐
- 【Rust 笔记】14-集合(上)
- Appium基础 — 使用Appium的第一个Demo
- Dynamic planning solution ideas and summary (30000 words)
- SQLMAP使用教程(一)
- TypeScript 基础讲解
- 1039 Course List for Student
- 【Rust 笔记】13-迭代器(下)
- The connection and solution between the shortest Hamilton path and the traveling salesman problem
- Traditional databases are gradually "difficult to adapt", and cloud native databases stand out
- 数据可视化图表总结(二)
猜你喜欢
Error ora-28547 or ora-03135 when Navicat connects to Oracle Database
Smart construction site "hydropower energy consumption online monitoring system"
1.14 - 流水线
leetcode-6110:网格图中递增路径的数目
LeetCode 0108. Convert an ordered array into a binary search tree - the median of the array is the root, and the left and right of the median are the left and right subtrees respectively
1.15 - 输入输出系统
传统数据库逐渐“难适应”,云原生数据库脱颖而出
Full Permutation Code (recursive writing)
实时时钟 (RTC)
SQLMAP使用教程(二)实战技巧一
随机推荐
Navicat連接Oracle數據庫報錯ORA-28547或ORA-03135
数据可视化图表总结(二)
Full Permutation Code (recursive writing)
leetcode-9:回文数
数据可视化图表总结(一)
[practical skills] technical management of managers with non-technical background
【Rust 笔记】14-集合(上)
Leetcode-22: bracket generation
In depth analysis of for (VaR I = 0; I < 5; i++) {settimeout (() => console.log (I), 1000)}
一些工具的记录2022
leetcode-556:下一个更大元素 III
【Rust 笔记】15-字符串与文本(下)
One question per day 2047 Number of valid words in the sentence
阿里新成员「瓴羊」正式亮相,由阿里副总裁朋新宇带队,集结多个核心部门技术团队
1039 Course List for Student
Daily question 1984 Minimum difference in student scores
【Rust 笔记】13-迭代器(中)
[practical skills] how to do a good job in technical training?
MySQL advanced part 1: triggers
Some common problems in the assessment of network engineers: WLAN, BGP, switch