[Advanced Data Mining Technology] Introduction to Advanced Data Mining Technology
2022-07-24 20:29:00 【Sunny qt01】
Functional classification of data mining techniques
Descriptive data mining (unsupervised learning; no target value is required)
Association Rules
Find out which events often occur together.
Example: Amazon.com

Customers who buy these two books usually go on to buy the books recommended alongside them. The connection is found from past purchase records. Most e-commerce sites make recommendations, and the technique behind them is association rules.
The technique finds events that happen at the same time (for example, two books bought together).
It is generally used in e-commerce and online stores, where recommendations can be made directly during the purchase process (much like placing batteries next to flashlights in a physical store).
Algorithms:
Apriori
FP-Growth
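As a rough illustration of "events that occur together", the sketch below counts how often pairs of items share a basket and how often one item implies another. It covers only the pair-counting idea behind association rules, not the full Apriori or FP-Growth algorithms. The baskets, the function names, and the 0.5 support threshold are all made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets (illustrative data, not from the article).
baskets = [
    {"flashlight", "battery"},
    {"flashlight", "battery", "tent"},
    {"flashlight", "battery"},
    {"tent", "stove"},
    {"battery"},
]

def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs that share a basket often enough."""
    n = len(baskets)
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

pairs = frequent_pairs(baskets, min_support=0.5)
rule_conf = confidence(baskets, "flashlight", "battery")
print(pairs, rule_conf)
```

Support says how common the combination is overall, while confidence says how reliable the rule "flashlight, therefore battery" is. A recommender would typically require both to be high.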
Sequential Patterns
Find out which events often occur one after another.

This technique finds out which products are purchased later, after a certain product has been bought. There is a time-order relationship between the events.
It is generally used in physical stores. In a bookstore, for example, once the customer has paid for a book, the store can predict what they are likely to buy next time and print a discount coupon for that type of product, attracting the customer to return and spend again. Because the purchases do not happen at the same time, this is up-selling: the recommendations follow chronological order.
Algorithm:
AprioriAll
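The difference from association rules is that order matters. A minimal sketch, assuming each customer's history is a list of visits in time order, counts how many customers bought one item in an earlier visit and another in a later visit. This is only the counting idea, not the AprioriAll algorithm itself, and the histories and function name are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories: one list of visits per customer, in time order.
histories = {
    "c1": [{"novel_vol1"}, {"novel_vol2"}],
    "c2": [{"novel_vol1", "bookmark"}, {"novel_vol2"}],
    "c3": [{"novel_vol1"}, {"cookbook"}],
}

def sequence_support(histories):
    """Fraction of customers who bought `first` in one visit and `later` in a later visit."""
    counts = Counter()
    for visits in histories.values():
        seen = set()  # count each (first, later) pair at most once per customer
        for i, earlier_visit in enumerate(visits):
            for later_visit in visits[i + 1:]:
                for first in earlier_visit:
                    for later in later_visit:
                        seen.add((first, later))
        counts.update(seen)
    n = len(histories)
    return {pair: c / n for pair, c in counts.items()}

support = sequence_support(histories)
for pair, value in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 2))
```

With these histories, a shop could print a coupon for `novel_vol2` whenever someone pays for `novel_vol1`, since that ordered pair has the highest support.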
Cluster Analysis
Find the internal structure of the data.
Customers of the same type have similar values in most fields, while customers of different types differ greatly.

In the figure above, a bank wants to cluster consumers' investment propensity by income and age.
Each point represents one customer, and points that are densely concentrated form a group. Three groups can be seen, and a marketing strategy can be designed for each customer group.
The first group has low income and is middle-aged or older (investigation suggests they may be manual workers in high-risk jobs; the strategy is insurance with low premiums).
The second group has high income and is older (wealthy seniors who prefer capital-guaranteed investments).
The third group has medium to high income and is younger (second-generation rich who lean toward high-yield, high-risk investments and have a relatively high risk tolerance).
Algorithms:
Hierarchical clustering: Single Linkage, Average Linkage, Complete Linkage
Partitional clustering: K-means
Kohonen Self-Organizing Maps (SOM)
Two-Step: first determines the number of clusters, then forms the groups (it can produce several groups)
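To make the income-and-age example concrete, here is a minimal K-means sketch in plain Python. The customer points are invented to resemble the three groups described above, and the starting centroids are hand-picked (one point from each group) so the run is deterministic. A real analysis would normally scale the features and try several random starts.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    labels = [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return centroids, labels

# Hypothetical customers as (income in thousands, age).
customers = [
    (20, 50), (25, 55), (22, 52),    # low income, middle-aged or older
    (90, 60), (95, 65), (100, 62),   # high income, older
    (70, 25), (75, 28), (80, 22),    # medium to high income, younger
]
start = [customers[0], customers[3], customers[6]]  # assumed initial centroids

centroids, labels = kmeans(customers, start)
print(labels)
print(centroids)
```

Each label identifies the group a customer falls into, and the centroids summarize the "typical" income and age of each group, which is what a marketing strategy would be built around.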
Predictive data mining (supervised learning)
Classification
Predict the category to which a record belongs.
Input variables are needed, and a target variable is needed as well.
Example: predict whether a customer is interested in a car model.
Each record consists of a customer number, input attributes, and a result attribute.

Input attributes: Car (the type of car purchased), Age, Children (number of children)
New customer: Car=sedan, Age=35. The model predicts whether this customer is likely to be interested.
The result variable must be categorical.
Algorithms:
Bayes Net
Logistic Regression
Decision Tree
Neural Network
Support Vector Machine
K-Nearest Neighbor
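One of the listed algorithms, K-nearest neighbor, is simple enough to sketch for the car example. The training records, the value Children=2 for the new customer, and the ad hoc distance (a car mismatch counts as 1, age is scaled by 10 years) are all assumptions made for illustration; the article's actual table is not available.

```python
from collections import Counter

# Hypothetical training records: (car, age, children) -> interested? ("yes"/"no")
training = [
    (("sedan", 34, 2), "yes"),
    (("sedan", 38, 1), "yes"),
    (("sedan", 36, 3), "yes"),
    (("sports", 24, 0), "no"),
    (("sports", 27, 0), "no"),
    (("truck", 52, 0), "no"),
]

def distance(a, b):
    """Ad hoc mixed distance: car mismatch counts 1, age is scaled by 10 years."""
    car_gap = 0.0 if a[0] == b[0] else 1.0
    return car_gap + abs(a[1] - b[1]) / 10 + abs(a[2] - b[2])

def knn_predict(training, query, k=3):
    """Majority vote among the k training records closest to `query`."""
    nearest = sorted(training, key=lambda record: distance(record[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# The article gives Car=sedan and Age=35; Children=2 is assumed here.
new_customer = ("sedan", 35, 2)
prediction = knn_predict(training, new_customer)
print(prediction)
```

The output is a category ("yes" or "no"), which is what distinguishes classification from the numeric prediction task.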
Prediction
Predict the numeric value that corresponds to a record.
Input variables (the causes) are required, and the result field must be numeric.
For example, predict a customer's annual income or, as in this example, the price of a home.

Location: where the house is; Type: type of house; Miles: distance to the school; SF: size of the house; CM: number of houses in the community
Prediction:
Input fields: Location=Rural, Miles=3, SF=1500
Result field: Home Price
Algorithms (those marked * also appear in the classification list):
Linear Regression
Time Series
Decision Tree*
Neural Network*
Support Vector Machine*
K-Nearest Neighbor*
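For a numeric target such as Home Price, the simplest of the listed methods is linear regression. The sketch below fits a one-variable least-squares line of price against house size (SF). The four houses are invented and deliberately lie on a perfect line so the result is easy to check by hand; real data would be noisy and would use more fields, such as Location and Miles.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical houses: size in square feet (SF) and sale price.
sizes = [1000, 1500, 2000, 2500]
prices = [100_000, 150_000, 200_000, 250_000]

intercept, slope = fit_line(sizes, prices)
predicted = intercept + slope * 1500  # SF=1500, as in the article's input fields
print(round(slope, 2), round(predicted, 2))
```

Here the fitted slope is the estimated price per square foot, and the prediction for a new house is a number rather than a category.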
Data mining websites: KDnuggets & Kaggle
KDnuggets: offers a wide range of resources, including datasets; after you register, the latest news is sent to you.
The UCI datasets are easy to use and cover many industries.
Kaggle: a bridge between enterprises and data scientists. Enterprises can hand their data to data scientists for analysis.
Enterprises post their requirements.

Enterprises submit a problem, and participants compete to solve it. Kaggle can provide large, enterprise-scale datasets, such as the very large Valued Shoppers data, which is closer to real-world data.
DataCastle and Kesci (which offers training), China's counterparts to Kaggle, also provide data.
Positioning of data mining, its prospects, and building a data mining team

The first industrial revolution replaced manual labor with machines.
In the data mining era, middle managers will be replaced by intelligent automation.
Prospects of data mining:
Time magazine listed data mining as one of the five emerging industries of the 21st century; data mining is of great importance in business.
The focus of marketing will shift from products to customers:
- Customers may be won away by competitors offering better service
- Whoever knows the most about customers holds the most capital
- The more we know about customers, the more we can deepen the uniqueness of the brand and the stronger our competitiveness
Only by turning data into knowledge and knowledge into action can action be turned into profit.
Data mining and fortune-telling
Data mining (modern fortune-telling)
Data: attributes
Algorithm: classification, clustering, association, … (various algorithms)
Predicts: future trends
Fortune-telling
Data: the eight characters of one's birth date (BaZi), face, palm, …
Algorithm: Zi Wei Dou Shu, Four Pillars calculation, …
Predicts: future fortune
Data mining can predict the future and reduce fear.
How to carry out data mining:
In the short term, data mining relies on off-the-shelf tools; in the long term, it requires in-house development.
Ideally, the team combines three kinds of people:
a Project Manager;
a CRM (data mining) person, who does the data mining;
an IT (database) person, who provides the data.
Data analysis certification levels:
1. Data Miner: given the fields (attributes) and the data, can use data mining tools to analyze them and obtain mining results.
2. Data Analyzer: given the fields (attributes, i.e. influencing factors), can work with databases, analyze the data with data mining tools, and obtain and interpret the mining results.
3. Business Analyzer: given only the subject, can work with databases, analyze the data with data mining tools, and obtain and interpret the mining results.