Simpson's paradox
2022-08-02 00:16:00 【Zhang Chuncheng】
Statistics has a surprising paradox known as Simpson's paradox.
Put simply: the side that wins every within-group comparison can sometimes be the loser in the overall comparison.
This article tries to explain it with an interactive visualization.
It also tries to show that this paradoxical situation is not an obscure one: with a suitable construction, such a conflict can always be produced.
Simpson's paradox
This is a serious statistical problem. A detailed discussion can be found here:
Simpson’s Paradox (Stanford Encyclopedia of Philosophy)[1]

Interactive chart explanation
The code for this article is available in my Observable notebook:
Interactive Simpson's Paradox[2]

The raw data is given as the two line segments OA and AB. The slope of a segment represents a rate such as a precision or a proportion, so the slope of OB is the overall accuracy.
Usually we want the slope to be as large as possible.
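To make the slope reading concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the x coordinate counts trials and the y coordinate counts successes. The points A and B and all the numbers are made up and are not taken from the notebook.

```python
from fractions import Fraction

def slope(p, q):
    """Slope of the segment from point p to point q, as an exact fraction."""
    return Fraction(q[1] - p[1]) / Fraction(q[0] - p[0])

# Hypothetical points: x = number of trials, y = number of successes.
O = (0, 0)
A = (80, 20)    # group 1: 20 successes in 80 trials
B = (100, 36)   # group 2 adds 16 successes in 20 trials

rate_group1 = slope(O, A)    # rate within group 1 (segment OA)
rate_group2 = slope(A, B)    # rate within group 2 (segment AB)
rate_overall = slope(O, B)   # overall rate (segment OB)

# The overall rate is the size-weighted average of the two group rates,
# so it always lies between them.
weighted = (80 * rate_group1 + 20 * rate_group2) / (80 + 20)

print(rate_group1, rate_group2, rate_overall, weighted)
```

Because OB is the vector sum of OA and AB, its slope is pulled toward whichever group has more trials. That weighting is what the rest of the construction exploits.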
Given the red triangle, it is easy to obtain a "better" method OC, that is, one whose slope is greater than that of OA. We can then always draw CD parallel to AB, so CD and AB have the same slope.
Next, we can always find a CE that is better than CD: its slope only needs to be greater than that of CD.
At this point the ray CE always intersects OB. Pick any point O' on the segment between C and that intersection; the resulting OO' is clearly worse than OB.
But OO' is composed of OC and CE, and in terms of slope:
OC is better than OA, CE is better than AB, but OO' is worse than OB.
This is Simpson's paradox.
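The construction above can be sketched in Python as follows. This is one possible reading of the steps, not the notebook's code. It assumes that A lies below OB, that is, slope(OA) < slope(OB) < slope(AB). The function name `construct_paradox`, the parameters `cx` and `bump`, and the example points are hypothetical.

```python
from fractions import Fraction

O = (0, 0)

def slope(p, q):
    """Slope of the segment from point p to point q, as an exact fraction."""
    return Fraction(q[1] - p[1]) / Fraction(q[0] - p[0])

def construct_paradox(A, B, cx=Fraction(1), bump=Fraction(1, 10)):
    """Return points C and O' such that OC beats OA, CO' beats AB,
    yet OO' is worse than OB.

    Assumption of this sketch: slope(OA) < slope(OB) < slope(AB).
    """
    s_oa, s_ab, s_ob = slope(O, A), slope(A, B), slope(O, B)
    if not s_oa < s_ob < s_ab:
        raise ValueError("sketch assumes slope(OA) < slope(OB) < slope(AB)")

    # Step 1: choose C with a slope above OA but still below OB.
    s_oc = (s_oa + s_ob) / 2
    C = (cx, cx * s_oc)

    # Step 2: choose a direction CE slightly steeper than AB (and so than CD).
    s_ce = s_ab + bump

    # Step 3: find where the ray from C with slope s_ce meets line OB.
    t_hit = (s_ob * C[0] - C[1]) / (s_ce - s_ob)

    # Step 4: take O' halfway between C and that intersection.
    t = t_hit / 2
    O_prime = (C[0] + t, C[1] + s_ce * t)
    return C, O_prime

# Hypothetical data, chosen only to satisfy the assumption above.
A = (80, 20)
B = (100, 36)
C, O_prime = construct_paradox(A, B)

print("OC > OA :", slope(O, C) > slope(O, A))
print("CE > AB :", slope(C, O_prime) > slope(A, B))
print("OO' < OB:", slope(O, O_prime) < slope(O, B))
```

Exact fractions are used so that the three comparisons are not affected by floating-point rounding. Any point strictly between C and the intersection would work in step 4; the midpoint is just a convenient choice.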
Interestingly, the derivation above starts from the red triangle OAB: as long as this triangle exists, a region OCO' in which the paradox occurs can always be derived.
In other words, however the groups in the comparison are divided, we can always "generate" a new set of data that "causes" the paradox to occur.
This shows that Simpson's paradox is not an obscure corner case but a "general case" that can arise whenever grouped comparisons are made.
References
[1]Simpson's Paradox (Stanford Encyclopedia of Philosophy): https://plato.stanford.edu/entries/paradox-simpson/
[2]Interactive Simpson's Paradox: https://observablehq.com/@listenzcc/interactive-simpsons-paradox