[Qixi Festival: cashing in on the music couples listen to] Does background music affect a couple's choice of wine?
2022-08-05 00:59:00 【sunny day qt01】
Introduction
Qixi Festival is here, and couples are heading to high-end hotels. I used to work in hotel sales, and sales at this time of year are generally good. But how do you stand out? It comes down to the background music in the bar. Sometimes there is no music, sometimes French accordion music is playing, and sometimes Italian accordion music. The wines on sale are French, Italian, and other.
How can we earn more from these couples? This calls for feature selection, a technique from data mining.
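Feature selection methods
Invalid variables
Irrelevant variables, redundant variables
Statistical feature selection
Only a few methods are listed here:
Variance thresholding, chi-square test, ANOVA and t-test, Pearson correlation coefficient
Selecting among highly correlated features (redundant variables)
Model-based feature selection
Decision trees, logistic regression, random forests, XGBoost
The model selects variables automatically.
Recursive feature selection
Features are eliminated step by step until their number falls within a set range.
As the number of inputs grows, the amount of data must grow too; otherwise the model becomes unstable.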
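Invalid variables
Irrelevant variables, redundant variables
Redundancy: when two variables are too highly correlated, they probably represent nearly the same concept, so one of them is a redundant variable. The two can be merged, or one field can even be deleted, because the information they carry overlaps.
Irrelevancy: compare X4 and X3. As X4 increases, the target value changes with it. When X3 changes, the predicted value varies at random, so X3 is an irrelevant variable and carries no information.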
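Statistical feature selection
Variance thresholding (VT): compute the variance of each numeric field. If it falls below a chosen value, the field carries too little information.
Do not standardize the fields beforehand. After z-score standardization, for example, every field has variance 1 and mean 0, so the variances can no longer be compared.
A threshold must be chosen to decide whether a field is dropped.
Binary variables: encode one value as 1 and the other as 0 (do this feature transformation first). The variance is then p(1-p), where p is the proportion of 1s.
The larger the variance, the more important the field is by this criterion. For a binary field the maximum is 0.25, reached at p = 0.5.
Note that this criterion does not look at the target at all.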
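The variance rule above can be sketched in a few lines of standard-library Python. This is only an illustration: the field names, the sample values, and the 0.2 threshold are all hypothetical, and the function name `low_variance_fields` is not from any library.

```python
from statistics import pvariance


def low_variance_fields(columns, threshold):
    """Return the names of fields whose population variance is below threshold."""
    return [name for name, values in columns.items()
            if pvariance(values) < threshold]


def binary_variance(p):
    """Variance of a 0/1 field in which a fraction p of the values are 1."""
    return p * (1 - p)


# Hypothetical, unstandardized sample data (not from the article).
columns = {
    "bill_amount": [120.0, 80.0, 200.0, 150.0, 95.0],
    "floor": [3, 3, 3, 3, 3],            # constant field, variance 0
    "is_member": [1, 1, 1, 1, 0],        # p = 0.8, variance 0.16
    "ordered_wine": [1, 0, 1, 0, 0],     # p = 0.4, variance 0.24
}

# The threshold is an assumption; in practice it must be chosen per dataset.
to_drop = low_variance_fields(columns, threshold=0.2)
print(to_drop)
```

Because the fields are on different scales, a single threshold like this is crude; it mainly serves to catch constant or nearly constant fields.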
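Pearson correlation coefficient:
Selecting among highly correlated features (redundant variables):
Highly correlated fields occur often, and the information they carry is duplicated. Use the Pearson correlation coefficient to check how strongly two fields are related. If it is greater than 0.95, drop one of the two variables.
To decide which one to keep, compute the relationship of variable 1 and of variable 2 with the target.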
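A minimal sketch of this rule, assuming numeric fields and a numeric target, is shown here. The helper names `pearson` and `drop_redundant`, the field names, and the data are hypothetical; using the absolute correlation with the target as the tie-breaker is one possible reading of "relationship with the target".

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_redundant(features, target, cutoff=0.95):
    """For each pair with |r| > cutoff, drop the field less correlated with target."""
    names = list(features)
    kept = list(names)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a not in kept or b not in kept:
                continue
            if abs(pearson(features[a], features[b])) > cutoff:
                r_a = abs(pearson(features[a], target))
                r_b = abs(pearson(features[b], target))
                kept.remove(a if r_a < r_b else b)
    return kept


# Hypothetical data: two price fields that duplicate each other, plus age.
features = {
    "price_eur": [10, 20, 30, 40, 50],
    "price_usd": [11, 22, 33, 44, 56],
    "age": [30, 25, 41, 22, 35],
}
target = [1.0, 2.1, 2.9, 4.2, 5.0]

kept = drop_redundant(features, target)
print(kept)
```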
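Statistical tests:
These test the relationship between an input field and the target field.
Categorical fields: the chi-square test checks the association between the input field and the target field.
Numeric fields: use ANOVA when the target field has more than two values, or the t-test when the target field has only two values (for example yes or no), to test the association between the input field and the target field.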
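For the numeric case, the one-way ANOVA F statistic can be computed by hand as a rough sketch. The function name `anova_f` and the bill amounts grouped by the wine purchased are hypothetical. The code stops at the F value; judging significance still requires an F distribution table or a statistics library.

```python
def anova_f(groups):
    """One-way ANOVA F statistic; groups holds one numeric sample per target class."""
    k = len(groups)                          # number of target classes
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical bill amounts, grouped by the target class (wine purchased).
bills_by_wine = [
    [52, 60, 58, 55],   # French wine
    [40, 42, 45, 41],   # Italian wine
    [30, 33, 29, 32],   # other wine
]
f_value = anova_f(bills_by_wine)
print(round(f_value, 2))
```

With only two groups, this F value equals the square of the equal-variance two-sample t statistic, which is why the t-test covers the two-class case.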
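Case study: does background music affect consumers' mood? We look at the relationship between music (the input field) and wine purchases. Both fields are categorical, so the statistic computed below is the chi-square statistic.
Music: no music, French accordion, Italian accordion
Wine: French, Italian, other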
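Test statistic
For each cell, subtract the expected count E from the actual sales count O, square the difference, divide by the expected count, and sum over all cells: χ² = Σ (O − E)² / E.
The expected counts are obtained by assuming that the two fields are independent: multiply the marginal probability of the music type by the marginal probability of the wine type, then multiply by the total count of 243.
In terms of the two tables, subtract the expected table from the observed table cell by cell, square each difference, divide by the corresponding expected count, and add everything up.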
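The computation described above can be sketched as follows. The article's own table is not reproduced in the text, so the counts below are illustrative placeholders chosen only to sum to 243; the function name `chi_square` is likewise hypothetical.

```python
def chi_square(observed):
    """Return (statistic, expected) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Under independence: P(row) * P(col) * total = row_total * col_total / total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for o_row, e_row in zip(observed, expected)
                    for o, e in zip(o_row, e_row))
    return statistic, expected


# Illustrative counts (assumed, not taken from the article).
# Rows: French, Italian, other wine. Columns: no music, French, Italian accordion.
observed = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]
statistic, expected = chi_square(observed)
print(round(statistic, 2))
```

The expected table always has the same row and column totals as the observed table, which is a quick sanity check on the arithmetic.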
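The larger the resulting value, the stronger the evidence of an association. The values to compare against can be found in a chi-square table.
First compute the chi-square value, then use it to look up the corresponding probability (the p-value). If that probability is below the significance level of 0.05, a value this large would be very unlikely if the two fields were unrelated, so the hypothesis that they are unrelated is rejected.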
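Instead of a printed table, the probability can be computed directly when the degrees of freedom are even, because the chi-square upper-tail probability then has a closed form. A 3×3 table has (3−1)×(3−1) = 4 degrees of freedom, so the sketch below applies. The function name `chi2_sf_even_df` and the statistic value 18.28 are assumptions for illustration.

```python
from math import exp, factorial


def chi2_sf_even_df(statistic, df):
    """Upper-tail probability P(X >= statistic) for chi-square with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = statistic / 2
    # exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


alpha = 0.05
df = (3 - 1) * (3 - 1)            # 3 wine types x 3 music types
statistic = 18.28                 # hypothetical chi-square value
p_value = chi2_sf_even_df(statistic, df)

if p_value < alpha:
    print("reject independence: music and wine choice appear associated")
else:
    print("no evidence of association at the 0.05 level")
```

For odd degrees of freedom this shortcut does not apply, and a table or a statistics library is needed.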
Conclusion
We can conclude that the choice of wine is significantly associated with the music, and we can put that to use. When couples are listening to Italian accordion music we sell Italian wine, and when French accordion music is playing we sell French wine. Matching the offer to the music in this way is how we earn more from them.