Data Mining: Association Analysis Worked Example, Code Implementation (Part 2)
2022-07-29 03:47:00 【泡泡怡】
1. Import the packages
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
import pandas as pd
2. Read the file
my_data=pd.read_excel("D:/棒/数据挖掘/basket.xlsx")
df_data=my_data.iloc[:,7:].copy()
df_data.head()
3. Inspect the data
my_data.describe()
The result is shown in the figure:

4. Data processing:
# Map the 'T'/'F' flags to booleans, the input format apriori expects
dict_data = {'F': False, 'T': True}
item_cols = ['fruitveg', 'freshmeat', 'dairy', 'cannedveg', 'cannedmeat', 'frozenmeal',
             'beer', 'wine', 'softdrink', 'fish', 'confectionery']
for col in item_cols:
    df_data[col] = df_data[col].map(dict_data)
The result is as follows:
5. Set the minimum support and find the frequent itemsets
frequent_itemsets = apriori(df_data,min_support=0.1,use_colnames= True)
frequent_itemsets
The result is as follows:

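To make the `min_support` threshold concrete, here is a minimal pure-Python sketch of how the support of an itemset is counted. The five baskets are made-up toy data, not rows from basket.xlsx, and `support` and `frequent_itemsets` are hypothetical helpers, not mlxtend functions. This version simply tests every combination up to a fixed size, whereas `apriori` prunes candidates level by level.

```python
from itertools import combinations

# Toy transactions (hypothetical data, not taken from basket.xlsx)
transactions = [
    {"beer", "cannedveg", "frozenmeal"},
    {"beer", "frozenmeal"},
    {"wine", "confectionery"},
    {"beer", "cannedveg"},
    {"fish", "fruitveg"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, min_support, max_len=2):
    """Brute force: keep each itemset of up to `max_len` items whose support meets the threshold."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_len + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result

freq = frequent_itemsets(transactions, min_support=0.4)
for itemset, s in freq.items():
    print(sorted(itemset), s)
```

With a threshold of 0.4, an itemset must appear in at least two of the five toy baskets to be kept; lowering the threshold admits rarer itemsets and makes the result grow quickly.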
6. Find the association rules
# Find the association rules with a minimum confidence of 0.15
rules = association_rules(frequent_itemsets,metric = 'confidence',min_threshold = 0.15)
# Set the minimum lift: drop rules whose lift is below 1.0
rules = rules.drop(rules[rules.lift <1.0].index)
# Rename the columns, keep the ones of interest, and display the result
rules.rename(columns = {'antecedents':'from','consequents':'to','support':'sup','confidence':'conf'},inplace = True)
rules = rules[['from','to','sup','conf','lift']]
rules
The result is as follows:
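As a cross-check on the `conf` and `lift` columns, both can be derived from support alone: the confidence of a rule A -> B is support(A and B) / support(A), and the lift is that confidence divided by support(B). A lift below 1.0 means the antecedent makes the consequent less likely than its base rate, which is why such rules are dropped in step 6. The sketch below uses made-up toy baskets, not the basket.xlsx data, and `rule_metrics` is a hypothetical helper rather than part of mlxtend.

```python
# Toy transactions (hypothetical data, not taken from basket.xlsx)
transactions = [
    {"beer", "cannedveg", "frozenmeal"},
    {"beer", "frozenmeal"},
    {"wine", "confectionery"},
    {"beer", "cannedveg"},
    {"fish", "fruitveg"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    sup_both = support(set(antecedent) | set(consequent), transactions)
    sup_a = support(antecedent, transactions)
    sup_c = support(consequent, transactions)
    confidence = sup_both / sup_a   # how often the rule holds when the antecedent occurs
    lift = confidence / sup_c       # confidence relative to the consequent's base rate
    return {"sup": sup_both, "conf": confidence, "lift": lift}

strong = rule_metrics({"beer"}, {"frozenmeal"}, transactions)
weak = rule_metrics({"beer"}, {"wine"}, transactions)
print(strong)
print(weak)
```

In the toy data, beer and frozenmeal co-occur more often than chance would suggest, so that rule has a lift above 1.0; beer and wine never appear together, so that rule would be removed by the lift filter.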