三维数组操作---矩阵运算

dingpeng · 2012年7月16日

我有个MCMC的程序，中间有一步涉及到加权最小二乘。需要计算如下的量：

$\widehat{ \beta} = ( \sum_{i=1}^n X_i^\top R^{-1} X_i )^{-1} ( \sum_{i=1}^n X_i^\top R^{-1} Z_i)$

其中， $X_i$ 是一个 $T\times p$ 的矩阵， $R$ 是一个 $T\times T$ 的矩阵， $Z_i$ 是一个 $T\times 1$ 的向量。
问题在于，我现在将 $X_i$ 堆起来放在一个 $N \times T\times p$ 的数组 X 里面， $Z_i$ 堆起来放在 $N\times T$ 的矩阵 $Z$ 里面。
那么每次计算，都需要做一个循环。样本量稍微大一点，计算的代价就太大。尤其是在mcmc中，每次迭代都要涉及这样一个显式的循环。
能否直接化成矩阵或者数组的操作而避免显式的循环呢？
多谢！

micro@ · 2012年7月16日

Compared to the use of an explicit loop, my guess is that some other problems deserve more attention in terms of speed. Examing profiling results might help.

If you want to keep the loop, then one thing to consider is to reorganize your NxTxP matrix to a TxPxN matrix, if your loop iterates over the index i. This will allow consecutive memory access and might make better use of cache.

If you want to remove the loop, then an NT x P matrix is possible, esp. useful when R is diagonal.

Is R diagonal? Will R change over MCMC? Will Z change over MCMC?

enthumelon · 2012年7月16日

回复第1楼的 dingpeng：
对于第一个部分可以这么写（至少看是看不出循环的）

mat<-list(X1=matrix(1:4,2),X2=matrix(5:8,2))
R<-matrix(rnorm(4),2)
f<-function(i) {mat[[i]]%*%R%*%mat[[i]]}
LLL=1:2###2 matrix
part1<-solve(Reduce('+',lapply(LLL,f)));
#or use snow package:
cl <- makeCluster(2, type = "MPI")
mat<-list(X1=matrix(1:4,2),X2=matrix(5:8,2))
R<-matrix(rnorm(4),2)
f<-function(i,mat=mat,R=R) {mat[[i]]%*%R%*%mat[[i]]}
LLL=1:2###2 matrix
Reduce('+',parLapply(cl,LLL,f,mat=mat,R=R))
stopCluster(cl)

nan.xiao · 2012年7月17日

如果 $\mathbf{R}$ 有一定规模，时间估计主要会消耗在矩阵求逆和相乘上吧，换掉默认的 BLAS 提升一下这部分的效率，也许会比改写显式循环收效更明显。

dingpeng · 2012年7月17日

回复第4楼的 nan.xiao：R可以很大，但是一般来说，我考虑小型的R，比如维数小于5*5。复杂度，我觉得还是主要在循环上。

dingpeng · 2012年7月17日

回复第2楼的 micro@: thanks for you comments. R will change during the mcmc iterations and R is not diagonal, because it is a general correlation matrix.

i will think about your suggestions, which are very helpful.

dingpeng · 2012年7月17日

回复第3楼的 enthumelon：谢谢！我去试试看。