On my crummy Mac (a 2010 laptop, surely much weaker than any of yours):
<br />
#<br />
set.seed(65535)<br />
y <- 1:50000<br />
x <- sample(y,1000)+0.5<br />
re=rep(0,1000);</p>
<p>tic<-proc.time();</p>
<p>x<-sort(x);<br />
y<-sort(y);</p>
<p>for (i in 1:1000){<br />
a=sum(y<=x[i])<br />
if (a==0){<br />
re[i]=NA;<br />
}else{<br />
re[i]=y[a];<br />
}<br />
}</p>
<p>toc<-proc.time()<br />
toc-tic<br />
</p>
user system elapsed
0.555 0.448 1.028
My code has had almost no optimization. sapply is slow because of memory allocation, isn't it?
For example,
<br />
system.time(re <- sapply(x, function(x,y) {y <- y[y <= x]; tail(y,1)}, y))<br />
with an optimized BLAS library, takes:
user system elapsed
1.086 0.825 1.927
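For comparison, here is a rough sketch of a fully vectorized lookup using base R's `findInterval()`, which avoids building a subset of `y` for every element of `x`. This is a different technique from both the loop and the sapply call above, the helper name `floor_lookup` is made up for illustration, and I have not timed it on this machine.

```r
# Hypothetical helper (not from the post): for each x[i], return the largest
# element of y that is <= x[i], or NA if there is none.
floor_lookup <- function(x, y) {
  ys <- sort(y)                # findInterval() needs a non-decreasing vector
  idx <- findInterval(x, ys)   # idx[i] = number of elements of ys that are <= x[i]
  idx[idx == 0] <- NA          # no element of y is <= x[i]
  ys[idx]                      # indexing with NA yields NA
}

set.seed(65535)
y <- 1:50000
x <- sort(sample(y, 1000) + 0.5)
re_vec <- floor_lookup(x, y)

# Reference result from the explicit loop, for checking only
re_loop <- rep(NA_real_, length(x))
for (i in seq_along(x)) {
  a <- sum(y <= x[i])
  if (a > 0) re_loop[i] <- y[a]
}
stopifnot(isTRUE(all.equal(as.numeric(re_vec), re_loop)))
```

The result follows the order of `x` as passed in, so sort `x` first if you want the same ordering as the loop version.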
What puzzles me, though, is that if I set x and y as follows
<br />
set.seed(65535)<br />
y <- runif(50000)*100<br />
x <- sample(y,1000)+0.5<br />
in this odd form, the computation gets faster! Of course, I also used cmpfun:
<br />
#<br />
set.seed(65535)<br />
y <- runif(50000)*100<br />
x <- sample(y,1000)+0.5<br />
library(compiler)</p>
<p>f<-cmpfun(function(x,y){</p>
<p> re=rep(0,1000);</p>
<p> x<-sort(x);<br />
y<-sort(y);</p>
<p> for (i in 1:1000){<br />
a=sum(y<=x[i])<br />
if (a==0){<br />
re[i]=NA;<br />
}else{<br />
re[i]=y[a];<br />
}<br />
}<br />
re<br />
});</p>
<p>tic<-proc.time();<br />
xxx=f(x,y);<br />
toc<-proc.time();<br />
toc-tic<br />
system.time(re <- sapply(x, function(x,y) {y <- y[y <= x]; tail(y,1)}, y))<br />
user system elapsed
0.337 0.151 0.529
The sapply version, of course, is left unchanged:
user system elapsed
1.100 0.551 1.778
</p>
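One thing that might be worth checking (a guess, not something measured here) is the storage type of `y`: `1:50000` is an integer vector while `runif(50000)*100` is double, and `x` is double in both setups, so the first benchmark compares mixed types on every iteration. The sketch below only shows how to inspect the types and time the two comparisons side by side. The names `y_int`, `y_dbl` and `x1` are made up for illustration, and the timings will depend on the R version and machine.

```r
y_int <- sort(1:50000)              # integer vector, as in the first benchmark
y_dbl <- sort(runif(50000) * 100)   # double vector, as in the second benchmark
x1 <- 100.5                         # a single double value to compare against

# Inspect the storage types
types <- c(y_int = typeof(y_int), y_dbl = typeof(y_dbl), x1 = typeof(x1))
print(types)

# Time 1000 comparisons of each kind; no particular outcome is assumed
t_int <- system.time(for (i in 1:1000) sum(y_int <= x1))
t_dbl <- system.time(for (i in 1:1000) sum(y_dbl <= x1))
print(rbind(integer_y = t_int, double_y = t_dbl))
```

If the two rows come out about the same, the type difference is probably not the explanation and the cause lies elsewhere.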