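I'm analyzing data from the public GBD database in R, using an age-period-cohort model. One step requires taking the age-grouped data for 1992-2021 and collapsing it into 5-year periods (1992-1996, 1997-2001, ...), computing the mean over each 5-year period and then assembling the result as a new column. Many bloggers handle this step with a script loaded via source(), but I haven't been able to find a shared version of it. What code should I use for this? Thanks in advance! Here is my code so far: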
y <-vroom::vroom('6apc.csv')
str(y)
# Age groups to extract
age1 <-c("20-24 years","25-29 years",
"30-34 years","35-39 years","40-44 years","45-49 years","50-54 years","55-59 years",
"60-64 years","65-69 years","70-74 years","75-79 years","80-84 years","85-89 years",
"90-94 years") # labels used in the 6apc analysis data
age2 <-c("20 to 24","25 to 29",
"30 to 34","35 to 39","40 to 44","45 to 49","50 to 54","55 to 59",
"60 to 64","65 to 69","70 to 74","75 to 79","80 to 84","85 to 89",
"90 to 94") # labels used in the population data
location <-c("Low SDI","Low-middle SDI","Middle SDI","High-middle SDI","High SDI","Global")
# Age-period-cohort analysis of incidence
# Extract the incidence counts (number of cases)
y1 <- subset(y,
age_name %in% age1 &
sex_name == 'Both' &
location_name %in% location &
metric_name == 'Number' &
measure_name == 'Incidence')
unique(y$age_name)
# Strip " years" and replace "-" with " to " so the labels match age2
y1$age_name<-gsub(" years","",y1$age_name)
unique(y1$age_name)
y1$age_name<-gsub("-"," to ",y1$age_name)
unique(y1$age_name)
# Convert to an ordered-level factor
y1$age_name <- factor(y1$age_name, levels = c("20 to 24","25 to 29","30 to 34",
"35 to 39","40 to 44","45 to 49",
"50 to 54","55 to 59","60 to 64",
"65 to 69","70 to 74","75 to 79",
"80 to 84","85 to 89","90 to 94"))
y1 <- y1[, c("age_id","age_name","year","val")]
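# dcast() is assumed here to come from reshape2 (it is not loaded above).
# Note: y1 still holds all six locations, so fun.aggregate = mean averages across them.
y1_n <- reshape2::dcast(data = y1, age_id + age_name ~ year, value.var = "val", fun.aggregate = mean)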
...
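One possible way to do the 5-year averaging, as a minimal base R sketch rather than the script those bloggers load with source(). It assumes a wide table shaped like y1_n, with one row per age group and one column per year named "1992" through "2021". The function name five_year_means, its arguments, and the toy values are all hypothetical and only for illustration.

```r
# Hypothetical helper: collapse yearly columns into consecutive period means.
# Assumes `wide` has one column per year, named by the year as a string.
five_year_means <- function(wide, id_cols,
                            first_year = 1992, last_year = 2021, width = 5) {
  years <- first_year:last_year
  # The year range must split evenly into periods, and every year must be present.
  stopifnot(length(years) %% width == 0,
            all(as.character(years) %in% names(wide)))

  starts <- seq(first_year, last_year, by = width)

  # For each period, average its yearly columns row by row (one row = one age group).
  period_means <- sapply(starts, function(s) {
    cols <- as.character(s:(s + width - 1))
    rowMeans(wide[, cols, drop = FALSE], na.rm = TRUE)
  })

  # sapply() returns a plain vector when `wide` has a single row; force a matrix.
  period_means <- matrix(period_means, nrow = nrow(wide))
  colnames(period_means) <- paste(starts, starts + width - 1, sep = "-")

  # Keep the identifier columns and append one column per period.
  cbind(wide[, id_cols, drop = FALSE], as.data.frame(period_means))
}

# Toy data (made-up values, not GBD data): the cell for age row i and the
# j-th year is i * j, which makes the period means easy to check by hand.
vals <- outer(1:3, 1:30)
colnames(vals) <- as.character(1992:2021)
wide <- cbind(
  data.frame(age_name = c("20 to 24", "25 to 29", "30 to 34")),
  as.data.frame(vals)
)

res <- five_year_means(wide, id_cols = "age_name")
print(res)
```

With the table from the post, the call would presumably be five_year_means(y1_n, id_cols = c("age_id", "age_name")), provided y1_n has been restricted to a single location first. Using 5-year periods keeps the period width equal to the 5-year age-group width, which is what the usual age-period-cohort setup expects.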