比较cumsum意义不大,R的cumsum是针对vector的,Matlab的cumsum是针对矩阵的,相当于拿解释代码的循环与编译代码的循环在比。
不过R的代码还是可以优化很多的:
version<br />
m=10;n = floor(2^15/m)<br />
A=matrix(sqrt(1:(m*n)),m,n)<br />
B=matrix(1,m,m)<br />
B[lower.tri(B)]=0<br />
col.cumsum=crossprod(B,A)<br />
<br />
gc();<br />
system.time(for(i in 1:1000){ col.cumsum=apply(A,2,cumsum)})<br />
<br />
gc();<br />
system.time(for(i in 1:1000){ B=matrix(1,m,m)<br />
B[upper.tri(B)]=0<br />
col.cumsum1=B%*%A})<br />
gc();<br />
system.time(for(i in 1:1000){ B=matrix(1,m,m)<br />
B[.Internal(row(B))<.Internal(col(B))]=0<br />
col.cumsum2=B%*%A})<br />
<br />
gc();<br />
system.time(for(i in 1:1000){ B=matrix(1,m,m)<br />
B[.Internal(row(B))>.Internal(col(B))]=0<br />
col.cumsum3=crossprod(B,A)})<br />
<br />
gc();<br />
system.time(for(i in 1:1000){ B=sapply(0:(m-1),function(i)rep(0:1,c(i,m-i)))<br />
col.cumsum4=B%*%A})<br />
<br />
gc();<br />
system.time(for(i in 1:1000){ B=numeric(m*m);vec=rep(1,m)<br />
for(j in 1:m-1){B[(j*m+1):(j*m+m)]=vec;vec[j]=0}<br />
dim(B)=c(m,m)<br />
col.cumsum5=B%*%A})<br />
<br />
## if used in a loop, B matrix doesn't need to be generated everytime.<br />
gc();<br />
system.time({<br />
B=matrix(1,m,m)<br />
B[.Internal(row(B))>.Internal(col(B))]=0<br />
for(i in 1:1000)col.cumsum3=crossprod(B,A)<br />
})<br />
all.equal(col.cumsum,col.cumsum3)<br />
结果:注意apply cumsum与crossprod的差距。
<br />
> version<br />
_ <br />
platform i386-pc-mingw32 <br />
arch i386 <br />
os mingw32 <br />
system i386, mingw32 <br />
status Patched <br />
major 2 <br />
minor 5.1 <br />
year 2007 <br />
month 06 <br />
day 29 <br />
svn rev 42094 <br />
language R <br />
version.string R version 2.5.1 Patched (2007-06-29 r42094)<br />
> m=10;n = floor(2^15/m)<br />
> A=matrix(sqrt(1:(m*n)),m,n)<br />
> B=matrix(1,m,m)<br />
> B[lower.tri(B)]=0<br />
> col.cumsum=crossprod(B,A)<br />
> <br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240311 6.5 467875 12.5 467875 12.5<br />
Vcells 21216824 161.9 52174655 398.1 52162618 398.0<br />
> system.time(for(i in 1:1000){col.cumsum=apply(A,2,cumsum)})<br />
user system elapsed <br />
42.39 0.07 49.60 <br />
> <br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240321 6.5 467875 12.5 467875 12.5<br />
Vcells 21216844 161.9 52174655 398.1 52162618 398.0<br />
> system.time(for(i in 1:1000){B=matrix(1,m,m)<br />
+ B[upper.tri(B)]=0<br />
+ col.cumsum1=B%*%A})<br />
user system elapsed <br />
0.89 0.00 1.04 <br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240328 6.5 467875 12.5 467875 12.5<br />
Vcells 21249607 162.2 52174655 398.1 52162618 398.0<br />
> system.time(for(i in 1:1000){B=matrix(1,m,m)<br />
+ B[.Internal(row(B))<.Internal(col(B))]=0<br />
+ col.cumsum2=B%*%A})<br />
user system elapsed <br />
0.88 0.00 1.04 <br />
> <br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240335 6.5 467875 12.5 467875 12.5<br />
Vcells 21282370 162.4 52174655 398.1 52162618 398.0<br />
> system.time(for(i in 1:1000){B=matrix(1,m,m)<br />
+ B[.Internal(row(B))>.Internal(col(B))]=0<br />
+ col.cumsum3=crossprod(B,A)})<br />
user system elapsed <br />
0.59 0.00 0.83 <br />
> <br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240342 6.5 467875 12.5 467875 12.5<br />
Vcells 21315133 162.7 52174655 398.1 52162618 398.0<br />
> system.time(for(i in 1:1000){B=sapply(0:(m-1),function(i)rep(0:1,c(i,m-i)))<br />
+ col.cumsum4=B%*%A})<br />
user system elapsed <br />
1.13 0.00 1.33 <br />
> <br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240349 6.5 467875 12.5 467875 12.5<br />
Vcells 21347846 162.9 52174655 398.1 52162618 398.0<br />
> system.time(for(i in 1:1000){B=numeric(m*m);vec=rep(1,m)<br />
+ for(j in 1:m-1){B[(j*m+1):(j*m+m)]=vec;vec[j]=0}<br />
+ dim(B)=c(m,m)<br />
+ col.cumsum5=B%*%A})<br />
user system elapsed <br />
1.08 0.00 1.15 <br />
> <br />
> ## if used in a loop, B matrix doesn't need to be generated everytime.<br />
> gc();<br />
used (Mb) gc trigger (Mb) max used (Mb)<br />
Ncells 240356 6.5 467875 12.5 467875 12.5<br />
Vcells 21380659 163.2 52174655 398.1 52162618 398.0<br />
> system.time({<br />
+ B=matrix(1,m,m)<br />
+ B[.Internal(row(B))>.Internal(col(B))]=0<br />
+ for(i in 1:1000)col.cumsum3=crossprod(B,A)<br />
+ })<br />
user system elapsed <br />
0.55 0.00 0.59 <br />
> all.equal(col.cumsum,col.cumsum3)<br />
[1] TRUE<br />
MATLAB结果:
<br />
>> version<br />
<br />
ans =<br />
<br />
7.4.0.287 (R2007a)<br />
<br />
>> m=10;n=3276;A=reshape(sqrt(1:(m*n)),m,n);<br />
>> tic;for i=1:1000;cs=cumsum(A);end;toc<br />
Elapsed time is 0.433993 seconds.<br />