[R] Frequency matching on age and sex: selecting samples for a case-control study
2022-07-01 07:18:00 【违规账号247188】
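In a case-control study, cases and controls are frequency-matched on age and sex.
Sample data (in the sex column, 女 = female and 男 = male)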
| ID | group | sex | age |
|---|---|---|---|
| A1 | Control | 女 | 90 |
| A10 | Control | 男 | 89 |
| A100 | Control | 男 | 85 |
| A1000 | Control | 男 | 74 |
| A1001 | Case | 女 | 74 |
| A1002 | Case | 男 | 74 |
| A1003 | Case | 女 | 74 |
| A1004 | Case | 男 | 74 |
| A1005 | Control | 男 | 74 |
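For readers who do not have the CSV file, the rows shown above can be entered by hand. This is only a sketch: it reproduces the nine sample rows from the table and nothing more, and the names `demo` and `cell_counts` are placeholders chosen here, not part of the original script.

```r
# The nine sample rows from the table above, typed in directly.
# `demo` is a placeholder name; the script itself reads the full data from CSV.
demo <- data.frame(
  ID    = c("A1", "A10", "A100", "A1000", "A1001", "A1002", "A1003", "A1004", "A1005"),
  group = c("Control", "Control", "Control", "Control", "Case", "Case", "Case", "Case", "Control"),
  sex   = c("女", "男", "男", "男", "女", "男", "女", "男", "男"),
  age   = c(90, 89, 85, 74, 74, 74, 74, 74, 74),
  stringsAsFactors = FALSE
)

# Cross-tabulate cases and controls within each sex/age cell to see
# in which cells matching is possible at all.
cell_counts <- table(demo$group, interaction(demo$sex, demo$age, drop = TRUE))
print(cell_counts)
```

A cell can contribute matched pairs only if both its Case and Control counts are above zero, which is the condition the loop in the script checks.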
data <- read.csv("年龄性别频数匹配.csv")
## Optionally bin age into strata first
data$agegroup <- cut(data$age, c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, Inf),
  right = FALSE # intervals are closed on the left and open on the right: [a, b)
)
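# Split into a list; each element holds the rows that share the same sex and age
# (substitute data$agegroup for data$age to match on the age bands instead)
dataSplit <- split(data, list(data$sex, data$age))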
# Extract a single list element
A <- as.data.frame(dataSplit[[1]])
data1 <- A[A$group == "Case", ]
data2 <- A[A$group == "Control", ]
num <- nrow(data1) # number of cases in this stratum
set.seed(123) # fix the random seed so repeated runs give the same sample
# Draw as many controls as there are cases (requires nrow(data2) >= num)
data3 <- data2[sample(seq_len(NROW(data2)), num, replace = FALSE), ]
assign(paste0("dataFinal", 1), rbind(data1, data3))
# Extract every stratum in a loop
set.seed(123) # seed once so repeated runs give the same sample (optional)
for (i in seq_along(dataSplit)) {
  A <- as.data.frame(dataSplit[[i]])
  data1 <- A[A$group == "Case", ]
  data2 <- A[A$group == "Control", ]
  numCase <- nrow(data1)
  numControl <- nrow(data2)
  if (numCase > 0 && numControl > 0 && numControl - numCase >= 0) {
    # Enough controls: keep all cases and sample an equal number of controls
    data3 <- data2[sample(seq_len(numControl), numCase, replace = FALSE), ]
    assign(paste0("dataFinal", i), rbind(data1, data3))
  }
  if (numCase > 0 && numControl > 0 && numControl - numCase < 0) {
    # Fewer controls than cases: keep all controls and sample an equal number of cases
    data3 <- data1[sample(seq_len(numCase), numControl, replace = FALSE), ]
    assign(paste0("dataFinal", i), rbind(data2, data3))
  }
  # Strata with no cases or no controls are skipped
}
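# Merge several data frames into one.
# All frames share the same columns, so the full outer join stacks their rows
# (assuming each ID appears only once).
multimerge <- function(dat = list(), ...) {
  if (length(dat) < 2) return(as.data.frame(dat))
  mergedat <- dat[[1]]
  dat[[1]] <- NULL
  for (i in dat) {
    mergedat <- merge(mergedat, i, all = TRUE, ...)
  }
  return(mergedat)
}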
# Collect the dataFinal* data frames into a list
files <- ls(pattern = "^dataFinal")
listALL <- mget(files)
dataALL <- multimerge(listALL)
# Sort (requires the dplyr package)
dataALL <- dplyr::arrange(dataALL, age, sex, group)
# Export
write.csv(dataALL, "dataALL.csv", row.names = FALSE)
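The per-stratum sampling can also be written without `assign()` and a pile of numbered data frames. The following is a minimal sketch, not the original author's code: the toy data frame `toy`, the helper `match_stratum()`, and the choice to match on 10-year age bands rather than exact age are all assumptions made for illustration. It applies the same rule as the loop: within each sex and age-band cell, the larger group is randomly downsampled to the size of the smaller one, and cells that lack either group are dropped.

```r
# Hypothetical toy data: 6 cases and 14 controls (not the author's CSV).
set.seed(123)  # fix the random seed so the sampling is reproducible
toy <- data.frame(
  ID    = paste0("T", 1:20),
  group = rep(c("Case", "Control"), times = c(6, 14)),
  sex   = rep(c("F", "M"), times = 10),
  age   = c(62, 65, 71, 74, 78, 83,
            61, 63, 64, 66, 70, 72, 73, 75, 76, 79, 80, 84, 86, 88),
  stringsAsFactors = FALSE
)

# 10-year age bands, closed on the left and open on the right: [60,70), [70,80), ...
toy$agegroup <- cut(toy$age, breaks = c(seq(0, 90, by = 10), Inf), right = FALSE)

# Hypothetical helper: balance one stratum by downsampling the larger group.
match_stratum <- function(stratum) {
  cases    <- stratum[stratum$group == "Case", ]
  controls <- stratum[stratum$group == "Control", ]
  n <- min(nrow(cases), nrow(controls))
  if (n == 0) return(NULL)  # no case or no control in this cell: drop it
  # sample.int() draws row positions and is safe when a group has one row
  rbind(cases[sample.int(nrow(cases), n), ],
        controls[sample.int(nrow(controls), n), ])
}

# One list element per non-empty sex/age-band cell, then stack the results.
strata  <- split(toy, list(toy$sex, toy$agegroup), drop = TRUE)
matched <- do.call(rbind, lapply(strata, match_stratum))
rownames(matched) <- NULL

# After matching, every sex/age-band cell has equally many cases and controls.
balance <- table(matched$group,
                 interaction(matched$sex, matched$agegroup, drop = TRUE))
print(balance)
```

Because the results are stacked with `rbind()` rather than joined, this version does not depend on IDs being unique, and nothing is left behind in the workspace between runs.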