What are the advantages of using SQL in Excel VBA

2022-07-06 13:05:00 NewJune

   I have mentioned many times in previous posts that you can write SQL inside VBA to operate on Excel files and meet all kinds of data processing and analysis needs. You may well ask: are Excel's native VBA, PivotTables, and data analysis features not enough? Why use SQL at all? What are the advantages of using SQL in Excel VBA? Today we will take a proper look at these questions.

   First, writing SQL in VBA is a fairly advanced VBA technique that not everyone knows, so there are certainly other technical solutions besides SQL. Second, if we are more skilled with another technology, we can solve the problem with the stack we know best. Judged by results, any solution that solves the practical problem is a good one.

   What I want to discuss here is different: assuming we are familiar with native VBA methods, PivotTables, and SQL alike, how should we choose a technology stack for a practical problem, and what advantages does SQL have in that case? I think there are several:

1. Fastest execution

Compared with native VBA methods and Excel formulas, SQL executes much faster. Once a table exceeds 10,000 rows, this advantage becomes more and more obvious. The more data there is, the better suited SQL is to the problem and the larger the efficiency gain.

2. Extremely concise code

①. Data matching: if we have to match on multiple fields, we would normally need several VLOOKUPs, or several dictionaries in VBA. With an SQL outer join, a single line of code may be enough.

②. Grouping and aggregation: if we record a macro in VBA to generate PivotTable code, for example to find the maximum per group, Excel automatically generates a large amount of code in the background, and that code is very hard to read. With SQL's GROUP BY combined with aggregate functions such as MAX, MIN, and AVG, a single SQL statement does the job.
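The two scenarios above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sheets 【Orders】 and 【Prices】 and the columns Product, Region, Amount, and Price are hypothetical names that do not appear in the original post, and the workbook is assumed to be saved and to have the ACE OLE DB provider available.

```vba
Option Explicit

' Sketch only: [Orders$], [Prices$] and their columns are hypothetical examples.
Sub SqlSketches()
    Dim conn As Object, rs As Object, sql As String

    Set conn = CreateObject("ADODB.Connection")
    conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
              "Extended Properties=Excel 12.0;" & _
              "Data Source=" & ThisWorkbook.FullName

    ' Scenario 1: match on two fields at once with a single LEFT JOIN
    sql = "SELECT a.*, b.Price FROM [Orders$] a LEFT JOIN [Prices$] b " & _
          "ON (a.Product = b.Product) AND (a.Region = b.Region)"
    Set rs = conn.Execute(sql)
    If Not rs.EOF Then Debug.Print rs.GetString   ' dump rows to the Immediate window
    rs.Close

    ' Scenario 2: group and aggregate with GROUP BY
    sql = "SELECT Region, MAX(Amount) AS MaxAmount, MIN(Amount) AS MinAmount, " & _
          "AVG(Amount) AS AvgAmount FROM [Orders$] GROUP BY Region"
    Set rs = conn.Execute(sql)
    If Not rs.EOF Then Debug.Print rs.GetString
    rs.Close

    conn.Close
    Set rs = Nothing
    Set conn = Nothing
End Sub
```

In each scenario the work is done by one SQL statement. The rest is connection boilerplate that stays the same from query to query.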

3. Easier to implement

   In some special scenarios, native VBA methods become extremely complex, while SQL may need just one line of code. Take my previous post on how to implement the filter condition "exclude certain values" in VBA as an example. That article spent a lot of space using pure VBA techniques to meet this seemingly simple requirement. For the single scenario of "filter: exclude certain values", we needed macro recording, one-dimensional and two-dimensional arrays, array transposition, the dictionary's Remove method, quickly storing dictionary keys into an array, and so on. With SQL, far fewer techniques are required.

   Let's review the scenario. The 【Source data】 sheet contains three columns: 【Full name】, 【Postal address】, and 【Postal code】. The 【Exception list】 sheet stores the names to exclude. The 【Result】 sheet must return every row of 【Source data】 whose name does not appear in 【Exception list】. We generated 20,000 rows with the Python faker library so that the efficiency of the different methods can be compared on a large sample.

 

 

    With SQL, we essentially left outer join (LEFT JOIN) the 【Source data】 sheet to the 【Exception list】 sheet on the 【Full name】 column, then add a WHERE condition that keeps only the rows with no match in the exception list.

 

 

The code is as follows:

Sub myQuery()
  Dim conn As Object, rs As Object, sht3 As Worksheet
  Dim sql As String, i As Long
  Dim startTime As Double, endTime As Double   ' Timer returns seconds, not a date
  startTime = Timer
  Set conn = CreateObject("ADODB.Connection")
  Set sht3 = ThisWorkbook.Sheets("Result")
  ' Query this workbook itself through the ACE OLE DB provider
  conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties=Excel 12.0;Data Source=" & ThisWorkbook.FullName
  ' Left join on the name column and keep only source rows with no match in the exception list
  sql = "SELECT a.* FROM [Source data$] a LEFT JOIN [Exception list$] b ON a.[Full name] = b.[Full name] WHERE b.[Full name] IS NULL"
  Set rs = conn.Execute(sql)
  For i = 0 To rs.Fields.Count - 1 ' Write the recordset field names to the [Result] sheet
    sht3.Cells(1, i + 1).Value = rs.Fields(i).Name
  Next
  sht3.Cells(2, 1).CopyFromRecordset rs ' Write the recordset rows to the [Result] sheet
  rs.Close
  conn.Close
  Set rs = Nothing
  Set conn = Nothing
  endTime = Timer
  sht3.Activate
  MsgBox "Elapsed time: " & Format(endTime - startTime, "0.00") & " seconds"
End Sub

The run takes about 0.63 seconds.

Now let's revisit the VBA dictionary approach to the same requirement. The code is as follows:

Sub dictWay()
  Dim sht1 As Worksheet, sht2 As Worksheet, sht3 As Worksheet
  Dim myDic As Object
  Dim maxRow1 As Long, maxRow2 As Long        ' Long avoids Integer overflow past 32,767 rows
  Dim i As Long, j As Long, k As Long
  Dim startTime As Double, endTime As Double  ' Timer returns seconds, not a date
  startTime = Timer
  Application.ScreenUpdating = False
  Set sht1 = ThisWorkbook.Sheets("Source data")
  Set sht2 = ThisWorkbook.Sheets("Exception list")
  Set sht3 = ThisWorkbook.Sheets("Result")
  Set myDic = CreateObject("Scripting.Dictionary")
  maxRow1 = sht1.Cells(sht1.Rows.Count, 1).End(xlUp).Row
  maxRow2 = sht2.Cells(sht2.Rows.Count, 1).End(xlUp).Row
  ' Load the names to exclude (row 1 is the header)
  For i = 2 To maxRow2
    myDic(sht2.Cells(i, 1).Value) = ""   ' assignment tolerates duplicate names, unlike .Add
  Next

  ' Copy every source row, header included, whose name is not in the dictionary
  k = 1
  For i = 1 To maxRow1
    If Not myDic.Exists(sht1.Cells(i, 1).Value) Then
        For j = 1 To 3
            sht3.Cells(k, j).Value = sht1.Cells(i, j).Value
        Next
        k = k + 1
    End If
  Next

  endTime = Timer
  sht3.Activate
  Application.ScreenUpdating = True
  MsgBox "Elapsed time: " & Format(endTime - startTime, "0.00") & " seconds"
End Sub

To make this method as fast as possible, we used a dictionary and also turned off Excel's screen updating. Even so, processing the 20,000 rows takes about 4 seconds.

 

As you can see, compared with the earlier VBA approach, a single SQL statement gives logic that is clear and easy to follow and code that is short and simple, and the program runs in less than 1/6 of the time the traditional approach takes (about 0.63 seconds versus about 4 seconds). This is the main reason I strongly recommend using SQL in VBA.


Original site

Copyright notice
This article was written by [NewJune]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202131456071716.html