[R] Frequency matching on age and sex: selecting samples for a case-control study
2022-07-01 07:18:00 【违规账号247188】
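In a case-control study, cases and controls can be frequency-matched on age and sex: within each age-sex stratum, the larger group is randomly sampled down to the size of the smaller one.
Example data (the `sex` column holds 女 for female and 男 for male):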
| ID | group | sex | age |
|---|---|---|---|
| A1 | Control | 女 | 90 |
| A10 | Control | 男 | 89 |
| A100 | Control | 男 | 85 |
| A1000 | Control | 男 | 74 |
| A1001 | Case | 女 | 74 |
| A1002 | Case | 男 | 74 |
| A1003 | Case | 女 | 74 |
| A1004 | Case | 男 | 74 |
| A1005 | Control | 男 | 74 |
data <- read.csv("年龄性别频数匹配.csv")
## Optionally bin age into 10-year groups first
data$agegroup <- cut(data$age, c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, Inf),
  right = FALSE # left-closed, right-open intervals: [a, b)
)
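To see what `right = FALSE` does at the bin edges, here is a small sketch. The ages are made up for illustration and are not taken from the CSV; the breaks are the same as above.

```r
# Hypothetical ages chosen to sit on and around the bin boundaries
ages <- c(9, 10, 74, 80, 89, 90)
breaks <- c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, Inf)

left_closed  <- cut(ages, breaks, right = FALSE) # intervals of the form [a, b)
right_closed <- cut(ages, breaks, right = TRUE)  # intervals of the form (a, b]

# Side-by-side comparison: a boundary age such as 10 or 80 moves to a
# different bin depending on which side of the interval is closed
print(data.frame(age = ages, left_closed, right_closed))
```

With `right = FALSE`, an age of exactly 80 falls into `[80,90)`; with `right = TRUE` it would fall into `(70,80]`.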
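# Split into a list; each element holds the rows that share one sex and one age
# (use data$agegroup in place of data$age to match on the age bins instead)
dataSplit <- split(data, list(data$sex, data$age))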
# Match a single stratum (the first list element)
A <- as.data.frame(dataSplit[[1]])
data1 <- A[A$group == "Case", ]
data2 <- A[A$group == "Control", ]
num <- nrow(data1) # number of cases; this step assumes the stratum has at least as many controls
set.seed(123) # fix the random seed so repeated runs give the same sample
data3 <- data2[sample(seq_len(nrow(data2)), num, replace = FALSE), ]
assign(paste0("dataFinal", 1), rbind(data1, data3))
# Match every stratum in a loop; strata with no cases or no controls are skipped
for (i in seq_along(dataSplit)) {
  A <- as.data.frame(dataSplit[[i]])
  data1 <- A[A$group == "Case", ]
  data2 <- A[A$group == "Control", ]
  numCase <- nrow(data1)
  numControl <- nrow(data2)
  # Enough controls: keep all cases and sample an equal number of controls
  if (numCase > 0 && numControl >= numCase) {
    set.seed(123) # fix the seed for reproducibility (optional; may be commented out)
    data3 <- data2[sample(seq_len(numControl), numCase, replace = FALSE), ]
    assign(paste0("dataFinal", i), rbind(data1, data3))
  }
  # Fewer controls than cases: keep all controls and sample an equal number of cases
  if (numCase > 0 && numControl > 0 && numControl < numCase) {
    set.seed(123) # fix the seed for reproducibility (optional; may be commented out)
    data3 <- data1[sample(seq_len(numCase), numControl, replace = FALSE), ]
    assign(paste0("dataFinal", i), rbind(data2, data3))
  }
}
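The per-stratum step in the loop can also be written as a function and applied with `lapply`, which avoids creating numbered `dataFinal` variables. This is only a sketch: `match_stratum` is a hypothetical helper name, the toy data is made up, and it uses `F`/`M` and pre-binned `agegroup` labels in place of the real CSV columns.

```r
# Hypothetical helper: balance one stratum by downsampling the larger group
match_stratum <- function(stratum) {
  cases    <- stratum[stratum$group == "Case", ]
  controls <- stratum[stratum$group == "Control", ]
  n <- min(nrow(cases), nrow(controls))
  # When n is 0 both samples are empty, so the stratum contributes no rows
  rbind(
    cases[sample.int(nrow(cases), n), ],
    controls[sample.int(nrow(controls), n), ]
  )
}

# Made-up data: three strata with different case/control counts
toy <- data.frame(
  ID       = paste0("T", 1:11),
  group    = c("Case", "Case", "Control", "Control", "Control",
               "Case", "Case", "Case", "Control",
               "Control", "Control"),
  sex      = c(rep("F", 5), rep("M", 6)),
  agegroup = c(rep("[70,80)", 9), rep("[80,90)", 2))
)

set.seed(123) # set once so the whole run is reproducible
strata  <- split(toy, list(toy$sex, toy$agegroup), drop = TRUE)
matched <- do.call(rbind, lapply(strata, match_stratum))

print(table(matched$sex, matched$agegroup, matched$group))
```

In this toy data the female 70-79 stratum keeps 2 cases and 2 controls, the male 70-79 stratum keeps 1 of each, and the male 80-89 stratum, which has no cases, drops out.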
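# Merge several data frames into one
multimerge <- function(dat = list(), ...) {
  if (length(dat) < 2) return(as.data.frame(dat))
  mergedat <- dat[[1]]
  dat[[1]] <- NULL
  for (i in dat) {
    # all = TRUE is a full outer join, so no rows are dropped
    mergedat <- merge(mergedat, i, all = TRUE, ...)
  }
  return(mergedat)
}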
files <- ls(pattern = "^dataFinal[0-9]+$")
listALL <- list()
# Collect the data frames into a list
for (i in seq_along(files)) {
  listALL[[i]] <- get(files[i])
}
dataALL <- multimerge(listALL)
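Because every per-stratum data frame has the same columns and no subject appears in two strata, the full outer merge here acts as a row-wise union. The sketch below, on two made-up data frames, compares it with `do.call(rbind, ...)`, which may be a simpler alternative under that assumption.

```r
# Two made-up data frames with identical columns and no shared rows
a <- data.frame(ID = c("X1", "X2"), group = c("Case", "Control"))
b <- data.frame(ID = c("X3", "X4"), group = c("Case", "Control"))

merged <- merge(a, b, all = TRUE)     # full outer join on all common columns
bound  <- do.call(rbind, list(a, b))  # plain row-binding

# Both results contain the same four subjects
print(merged)
print(bound)
```

The two approaches differ if the same row occurs in more than one data frame: the merge keeps it once, while `rbind` keeps every copy.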
# Sort
dataALL <- dplyr::arrange(dataALL, age, sex, group)
# Export
write.csv(dataALL, "dataALL.csv", row.names = FALSE)
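After matching, it is worth checking that every stratum really contains equal numbers of cases and controls. A minimal sketch follows, using a made-up data frame that stands in for `dataALL`; the column names follow the example data, and the values are invented.

```r
# Made-up matched data standing in for dataALL
matched <- data.frame(
  group = c("Case", "Control", "Case", "Control", "Case", "Control"),
  sex   = c("F", "F", "F", "F", "M", "M"),
  age   = c(74, 74, 74, 74, 85, 85)
)

# Count subjects per sex-age stratum (rows) and group (columns)
strata <- interaction(matched$sex, matched$age, drop = TRUE)
counts <- table(strata, matched$group)
print(counts)

# TRUE when every stratum has as many cases as controls
balanced <- all(counts[, "Case"] == counts[, "Control"])
print(balanced)
```

If `balanced` is `FALSE`, the `counts` table shows which strata are uneven.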