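While working through Chapter 7 of the handbook for the HSAUR package, I ran into a problem: the machine ran for half an hour without producing a result. I have copied the expression and the steps leading up to it below. I hope you can give it a try; ideally, please annotate what to watch out for at each step, so that others can learn from it.

##################################################
# Negative log-likelihood of a two-component normal mixture
# param = (p, mu1, sd1, mu2, sd2)
logL <- function(param, x) {
    d1 <- dnorm(x, mean = param[2], sd = param[3])
    d2 <- dnorm(x, mean = param[4], sd = param[5])
    -sum(log(param[1] * d1 + (1 - param[1]) * d2))
}
startparam <- c(p = 0.5, mu1 = 50, sd1 = 3, mu2 = 80, sd2 = 3)
# Minimize logL subject to box constraints on all five parameters
opp <- optim(startparam, logL, x = faithful$waiting,
             method = "L-BFGS-B",
             lower = c(0.01, rep(1, 4)),
             upper = c(0.99, rep(200, 4)))

data("faithful", package = "datasets")
x <- faithful$waiting
library("mclust")
mc <- Mclust(faithful$waiting)

library("boot")
# Statistic for boot(): refit the two-component mixture on the resample
# x[indx] and report the component with the smaller proportion first
fit <- function(x, indx) {
    a <- Mclust(x[indx], minG = 2, maxG = 2)$parameters
    if (a$pro[1] < 0.5)
        return(c(p = a$pro[1], mu1 = a$mean[1], mu2 = a$mean[2]))
    return(c(p = 1 - a$pro[1], mu1 = a$mean[2], mu2 = a$mean[1]))
}
bootpara <- boot(faithful$waiting, fit, R = 1000)  # this step runs for ages without returning a result
###################################################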
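As an annotation for the first step: logL returns the negative log-likelihood of a two-component normal mixture, which optim() then minimizes. Written out with the parameter order used in startparam (the symbol \phi for the normal density is notation introduced here, not taken from the original post):

```latex
-\ell(p, \mu_1, \sigma_1, \mu_2, \sigma_2)
  = -\sum_{i=1}^{n} \log\Bigl[\, p\,\phi(x_i;\mu_1,\sigma_1)
      + (1-p)\,\phi(x_i;\mu_2,\sigma_2) \Bigr],
\qquad
\phi(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
      \exp\!\Bigl(-\frac{(x-\mu)^2}{2\sigma^2}\Bigr)
```

The lower and upper arguments in the optim() call keep p within [0.01, 0.99] and the other four parameters within [1, 200].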

Which package is the boot function in? When I run the code, R cannot find the boot function.

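[quote] Quoting post #1 by huadeng, posted 2006-11-24 21:36: Which package is the boot function in? When I run the code, R cannot find the boot function.[/quote] It is in the R package "boot", which is roughly 1 MB.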
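A minimal sketch of how one might check for and attach the package. It assumes a standard R installation; the install step needs access to a CRAN mirror and is only reached if the package is missing.

```r
# Install boot only if it is not already available, then attach it.
if (!requireNamespace("boot", quietly = TRUE)) {
    install.packages("boot")
}
library("boot")

# After attaching, boot() is visible on the search path.
exists("boot", mode = "function")
# Name of the namespace the function comes from.
environmentName(environment(boot))
```

Once library("boot") has run without error, the call boot(faithful$waiting, fit, R = 1000) from the first post should at least be found.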

boot(boot)

Bootstrap Resampling

Description

  Generate R bootstrap replicates of a statistic applied to data. Both parametric and nonparametric resampling are possible. For the nonparametric bootstrap, possible resampling methods are the ordinary bootstrap, the balanced bootstrap, antithetic resampling, and permutation. For nonparametric multi-sample problems stratified resampling is used. This is specified by including a vector of strata in the call to boot. Importance resampling weights may be specified.

Usage

  boot(data, statistic, R, sim="ordinary", stype="i",
       strata=rep(1,n), L=NULL, m=0, weights=NULL,
       ran.gen=function(d, p) d, mle=NULL, ...)

Arguments

  data
    The data as a vector, matrix or data frame. If it is a matrix or data frame then each row is considered as one multivariate observation.

  statistic
    A function which when applied to data returns a vector containing the statistic(s) of interest. When sim="parametric", the first argument to statistic must be the data. For each replicate a simulated dataset returned by ran.gen will be passed. In all other cases statistic must take at least two arguments. The first argument passed will always be the original data. The second will be a vector of indices, frequencies or weights which define the bootstrap sample. Further, if predictions are required, then a third argument is required which would be a vector of the random indices used to generate the bootstrap predictions. Any further arguments can be passed to statistic through the ... argument.

  R
    The number of bootstrap replicates. Usually this will be a single positive integer. For importance resampling, some resamples may use one set of weights and others use a different set of weights. In this case R would be a vector of integers where each component gives the number of resamples from each of the rows of weights.

  ……

Details

  The statistic to be bootstrapped can be as simple or complicated as desired as long as its arguments correspond to the dataset and (for a nonparametric bootstrap) a vector of indices, frequencies or weights. statistic is treated as a black box by the boot function and is not checked to ensure that these conditions are met.

  The first order balanced bootstrap is described in Davison, Hinkley and Schechtman (1986). The antithetic bootstrap is described by Hall (1989) and is experimental, particularly when used with strata. The other non-parametric simulation types are the ordinary bootstrap (possibly with unequal probabilities), and permutation which returns random permutations of cases. All of these methods work independently within strata if that argument is supplied.

  For the parametric bootstrap it is necessary for the user to specify how the resampling is to be conducted. The best way of accomplishing this is to specify the function ran.gen which will return a simulated data set from the observed data set and a set of parameter estimates specified in mle.

Value

  The returned value is an object of class "boot", containing the following components

  t0         The observed value of statistic applied to data.
  t          A matrix with R rows each of which is a bootstrap replicate of statistic.
  R          The value of R as passed to boot.
  data       The data as passed to boot.
  seed       The value of .Random.seed when boot was called.
  statistic  The function statistic as passed to boot.

Examples

  # usual bootstrap of the ratio of means using the city data
  ratio <- function(d, w) sum(d$x * w)/sum(d$u * w)
  boot(city, ratio, R=999, stype="w")

  # Stratified resampling for the difference of means. In this
  # example we will look at the difference of means between the final
  # two series in the gravity data.
  diff.means <- function(d, f)
  {
      n <- nrow(d)
      gp1 <- 1:table(as.numeric(d$series))[1]
      m1 <- sum(d[gp1,1] * f[gp1])/sum(f[gp1])
      m2 <- sum(d[-gp1,1] * f[-gp1])/sum(f[-gp1])
      ss1 <- sum(d[gp1,1]^2 * f[gp1]) - (m1 * m1 * sum(f[gp1]))
      ss2 <- sum(d[-gp1,1]^2 * f[-gp1]) - (m2 * m2 * sum(f[-gp1]))
      c(m1-m2, (ss1+ss2)/(sum(f)-2))
  }
  grav1 <- gravity[as.numeric(gravity[,2])>=7,]
  boot(grav1, diff.means, R=999, stype="f", strata=grav1[,2])

  # In this example we show the use of boot in a prediction from
  # regression based on the nuclear data. This example is taken
  # from Example 6.8 of Davison and Hinkley (1997). Notice also
  # that two extra arguments to statistic are passed through boot.
  nuke <- nuclear[,c(1,2,5,7,8,10,11)]
  nuke.lm <- glm(log(cost)~date+log(cap)+ne+ ct+log(cum.n)+pt, data=nuke)
  nuke.diag <- glm.diag(nuke.lm)
  nuke.res <- nuke.diag$res*nuke.diag$sd
  nuke.res <- nuke.res-mean(nuke.res)

  # We set up a new data frame with the data, the standardized
  # residuals and the fitted values for use in the bootstrap.
  nuke.data <- data.frame(nuke,resid=nuke.res,fit=fitted(nuke.lm))

  # Now we want a prediction of plant number 32 but at date 73.00
  new.data <- data.frame(cost=1, date=73.00, cap=886, ne=0,
                         ct=0, cum.n=11, pt=1)
  new.fit <- predict(nuke.lm, new.data)

  nuke.fun <- function(dat, inds, i.pred, fit.pred, x.pred)
  {
      assign(".inds", inds, envir=.GlobalEnv)
      lm.b <- glm(fit+resid[.inds] ~date+log(cap)+ne+ct+
                  log(cum.n)+pt, data=dat)
      pred.b <- predict(lm.b,x.pred)
      remove(".inds", envir=.GlobalEnv)
      c(coef(lm.b), pred.b-(fit.pred+dat$resid[i.pred]))
  }

  nuke.boot <- boot(nuke.data, nuke.fun, R=999, m=1,
                    fit.pred=new.fit, x.pred=new.data)
  # The bootstrap prediction error would then be found by
  mean(nuke.boot$t[,8]^2)
  # Basic bootstrap prediction limits would be
  new.fit-sort(nuke.boot$t[,8])[c(975,25)]

  # Finally a parametric bootstrap. For this example we shall look
  # at the air-conditioning data. In this example our aim is to test
  # the hypothesis that the true value of the index is 1 (i.e. that
  # the data come from an exponential distribution) against the
  # alternative that the data come from a gamma distribution with
  # index not equal to 1.
  air.fun <- function(data)
  {
      ybar <- mean(data$hours)
      para <- c(log(ybar),mean(log(data$hours)))
      ll <- function(k) {
          if (k <= 0) out <- 1e200 # not NA
          else out <- lgamma(k)-k*(log(k)-1-para[1]+para[2])
          out
      }
      khat <- nlm(ll,ybar^2/var(data$hours))$estimate
      c(ybar, khat)
  }

  air.rg <- function(data, mle)
  # Function to generate random exponential variates. mle will contain
  # the mean of the original data
  {
      out <- data
      out$hours <- rexp(nrow(out), 1/mle)
      out
  }

  air.boot <- boot(aircondit, air.fun, R=999, sim="parametric",
                   ran.gen=air.rg, mle=mean(aircondit$hours))

  # The bootstrap p-value can then be approximated by
  sum(abs(air.boot$t[,2]-1) > abs(air.boot$t0[2]-1))/(1+air.boot$R)
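The key requirement in the help text is that, for the nonparametric bootstrap, statistic takes the original data as its first argument and a vector of indices as its second. A minimal sketch using the median of faithful$waiting as a cheap stand-in statistic (the name med_stat, the seed, and the choice R = 200 are illustrative, not from the thread):

```r
library("boot")

# statistic(data, indices): boot() passes the original data plus a vector of
# resampled indices; the function has to apply the indices itself.
med_stat <- function(x, indx) median(x[indx])

set.seed(1)  # arbitrary seed, only so the run can be repeated
med_boot <- boot(faithful$waiting, med_stat, R = 200)

med_boot$t0      # statistic evaluated on the original data
dim(med_boot$t)  # R rows, one column per returned statistic
```

The fit function from the first post follows the same pattern: it subsets with x[indx] before calling Mclust.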

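Heh, my mistake. I did some debugging, and my guess is that the problem lies in the last two steps. I know very little about the bootstrap, so I can't explain it. Sorry about that.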

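The examples in the help page all run fine, but I still haven't found the cause of the problem with the example from the handbook.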

I didn't dare try 1000 replicates; even 50 bootstrap replicates were quite slow:

> system.time(bootpara <- boot(faithful$waiting, fit, R = 50))
[1] 39.66  0.16 45.05    NA    NA

Then again, my computer only has 256 MB of RAM.

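With R = 100 I do get a result, in about a minute. With R = 1000 I have tried several times; the longest run went 45 minutes without producing a result.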
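Since the run time should grow roughly linearly in R, one way to judge whether R = 1000 is feasible is to time a handful of calls to the statistic and extrapolate. A rough sketch in base R; estimate_boot_seconds is a hypothetical helper written for this post, and the cheap median statistic stands in for fit so that the block runs without mclust:

```r
# Hypothetical helper: time `pilot` resampled calls of `statistic` and
# scale linearly to R replicates. Ignores boot()'s own bookkeeping overhead.
estimate_boot_seconds <- function(data, statistic, R, pilot = 5) {
    n <- length(data)
    elapsed <- system.time(
        for (i in seq_len(pilot)) {
            statistic(data, sample(n, n, replace = TRUE))
        }
    )[["elapsed"]]
    elapsed / pilot * R
}

med_stat <- function(x, indx) median(x[indx])  # stand-in for fit
est <- estimate_boot_seconds(faithful$waiting, med_stat, R = 1000)
est  # estimated seconds for 1000 replicates
```

To apply this to the problem in this thread, pass fit instead of med_stat. Going by the timings posted here (about 45 s elapsed for R = 50, 9 minutes for R = 500), linear scaling would put R = 1000 at roughly 15 to 18 minutes on those machines, so a 45-minute run without output is longer than scaling alone would suggest.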

There is nothing wrong with the program. I ran 100 replicates, but it was slow: my computer is a dual-core machine with 512 MB of RAM, and it took a little over a minute. The result is as follows:

> bootpara <- boot(faithful$waiting, fit, R = 100)
> bootpara

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = faithful$waiting, statistic = fit, R = 100)

Bootstrap Statistics :
      original        bias    std. error
t1*  0.3610159  -0.003395616  0.03105552
t2* 54.6191115  -0.014340831  0.77774902
t3* 80.0938427  -0.235098402  1.81724872
>

> bootpara <- boot(faithful$waiting, fit, R = 500)
> bootpara

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = faithful$waiting, statistic = fit, R = 500)

Bootstrap Statistics :
      original        bias    std. error
t1*  0.3610159  -0.007233957  0.03717666
t2* 54.6191115  -0.115011885  0.92811284
t3* 80.0938427  -0.411865915  2.66701469
>
## 500 replicates took 9 minutes
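A note on reading these tables: for an ordinary bootstrap without weights, the bias column is the mean of the replicates minus the original estimate, and std. error is the standard deviation of the replicates. A small sketch that recomputes both from a boot object, using a cheap stand-in statistic (mean_sd_stat, the seed, and R = 200 are illustrative, not from the thread):

```r
library("boot")

# Stand-in statistic returning two values, so that $t has two columns.
mean_sd_stat <- function(x, indx) c(mean(x[indx]), sd(x[indx]))

set.seed(1)  # arbitrary seed, only so the run can be repeated
b <- boot(faithful$waiting, mean_sd_stat, R = 200)

bias <- colMeans(b$t) - b$t0  # the "bias" column
se   <- apply(b$t, 2, sd)     # the "std. error" column
cbind(original = b$t0, bias = bias, std.error = se)
```

The same two lines applied to bootpara$t and bootpara$t0 should reproduce the columns shown above.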

Help! Debugging an R function
